Tag Archives: Alfresco

Reading List: Getting Started with Alfresco SURF

Goal for today is to absorb the following:

AlfrescoWiki: Surf Platform

AlfrescoWiki: Deployment Configurations

benh: SURF Part 1 – Getting Started

benh: SURF Part 2 – Pages and Navigation

benh: SURF Part 3: Alfresco WCM Content

Back in February, there was a Code Camp focusing on SURF that Jeff Potts has details of. Once you have the background information and can successfully build Alfresco from SVN, you can follow along with the exercises participants worked on.

A Reading List: High Availability for Alfresco ECM

Start by watching the Alfresco-hosted webinar: High Availability Clustering with Alfresco to get a high-level overview.

Then read through rivetlogic’s comprehensive page, which includes notes on disaster recovery: Deploying HA Alfresco on Linux. This is mirrored on the AlfrescoWiki; I’m not sure which copy is definitive.

Finally, put it into practice by following Jeff Potts’s walk-thru of a simple setup to get a feel for it: Alfresco 3.1 clustering easier with JGroups.

Reference details:

AlfrescoWiki: Configuring JGroups and Alfresco Clusters

AlfrescoWiki: Cluster Configuration V2.1.3 and Later

Alfresco Webinar: High Scalability with Alfresco WCM (great information about various options)

Special thanks to Jeff Potts who answered a related query of mine on Twitter.

I’ll be sure to post progress here once I’ve a few working examples.

Notes on the Alfresco Web Content Management Evaluation Guide

This walk-thru requires 3.1 Enterprise or 3.2 Community. A clean install of Alfresco seems to be a must, and a network connection is required. Sometimes you need to restart your machine (for example, if you see a deploy or preview task ‘freeze up’). Note: this tutorial is far more comprehensive (and usable) than the WCM Forms Quick Tutorial posted to the Wiki. I wouldn’t spend time on that one.

  1. Download the Web Content Management Evaluation Guide from:
    http://www.alfresco.com/products/ecm/enttrial/files/getting_started_with_wcm_for_enterprise3_1.pdf
  2. Add the sample website to your hosts file

    127.0.0.1 admin.alfrescosample.www--sandbox.127-0-0-1.ip.alfrescodemo.net
  3. Start Alfresco and the virtual Alfresco server
    $ ./virtual_start.sh start
    $ ./alf_start.sh start
  4. All files required for the evaluation guide are under ${ALFRESCO INSTALL DIRECTORY}/extras/wcm
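
The hosts file edit in step 2 can be scripted so it is safe to re-run. This is only a rough sketch: add_hosts_entry is a made-up helper name (not something shipped with Alfresco), and it assumes a POSIX shell with awk available. It appends the mapping only when the hostname is not already listed.

```shell
#!/bin/sh
# Hypothetical helper: append "IP HOSTNAME" to a hosts file unless the
# hostname already appears on a non-comment line.
# Usage: add_hosts_entry <hosts-file> <ip> <hostname>
add_hosts_entry() {
    hosts_file=$1
    ip=$2
    name=$3

    # awk exits 0 only if some field after the IP column equals the hostname.
    if awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$hosts_file" 2>/dev/null; then
        echo "already present: $name"
    else
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
        echo "added: $name"
    fi
}

# Example (needs a root shell to write /etc/hosts):
# add_hosts_entry /etc/hosts 127.0.0.1 admin.alfrescosample.www--sandbox.127-0-0-1.ip.alfrescodemo.net
```

Try it against a scratch copy of the hosts file first. Writing to the real /etc/hosts requires root.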

Installing Alfresco on OSX – quick and dirty

Note: These are terrible instructions, with no security and nothing set up to make upgrades easy. But they get you up and running fast.

  1. Prerequisites: JDK 5.x, MySQL 5.x
  2. First, ensure there is no pre-existing alfresco database

    $ mysql -u root -p <ENTER> <ENTER>
    mysql> drop database if exists alfresco;
    mysql> exit
    $ sudo /Library/StartupItems/MySQLCOM/MySQLCOM stop {Enter OSX admin password} <ENTER>
    $ sudo /Library/StartupItems/MySQLCOM/MySQLCOM start
  3. Create the directory you are going to install alfresco into

    $ mkdir /opt/alfresco
  4. Download and extract Alfresco-Community-3.2-MacOSXInstall.tar.gz from Alfresco

    $ tar xzvf Alfresco-Community-3.2-MacOSXInstall.tar.gz
  5. Run the installer

    $ ./Alfresco-Community-3.2-MacOSXInstall
  6. Accept the defaults until you reach the destination folder. Override that and select /opt/alfresco
  7. When the dialog asks for a root password, leave it blank; it is referring to the MySQL
    root password. When you click Next it will inform you that database
    creation was successful.
  8. After the installer finishes, cd in a terminal to the directory Alfresco was installed into:

    $ cd /opt/alfresco
  9. Fire it up:

    $ ./alf_start.sh start
  10. Fire up the virtual server

    $ ./virtual_start.sh
  11. First-time startup can take up to 5 minutes. Give it time. Refresh
    http://localhost:8080/alfresco/ every minute or so and then you should get the
    default dashboard. Username/password: admin/admin.
  12. When finished, shut ‘er down.

    $ ./alf_stop.sh
  13. The virtual server too

    $ ./virtual_stop.sh
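
Rather than refreshing the browser by hand in step 11, you could poll the URL from the shell. A minimal sketch, assuming a POSIX shell and that curl is installed. wait_until is a hypothetical helper name, and the 300-second timeout and 60-second interval simply mirror the "up to 5 minutes, every minute or so" advice above.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds or the timeout passes.
# Usage: wait_until <timeout-seconds> <interval-seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 if the timeout is reached.
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while ! "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Example: wait up to 5 minutes for the Alfresco web client to respond.
# -s silences progress output, -f makes curl fail on HTTP error statuses,
# -o /dev/null discards the page body.
# wait_until 300 60 curl -sf -o /dev/null http://localhost:8080/alfresco/ \
#     && echo "Alfresco is up"
```

The helper only counts the time it spends sleeping, so the real wait can run a little past the timeout if each probe is slow.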

Reading up on ETL (Extract, Transform, Load) processing

Wikipedia: Extract, transform, load

Wikipedia: Talend Open Studio

Talend Open Studio: Tutorials

Manageability: Open Source ETL (Extraction, Transform, Load) Written in Java

richard.gluga.com: Data Migration Done Right

kJube: Vendors and tools – ETL

AlfrescoForge: ETL Connector

Talend job for Job Scheduler implement

High Scalability: How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data

NYTimes: Announcing the Map/Reduce Toolkit

core-user@hadoop.apache.org: Andreas Kostyrka: Re: hadoop in the ETL process