ETL Connector for Alfresco - Alfresco Server Extension

Project page and downloads at http://forge.alfresco.com/projects/etlconnector/

1. About ETL Connector

The ETL Connector extension for Alfresco imports documents into an Alfresco repository using compatible ETL tools (Talend, for now). It also provides an ETL client library that makes the connector easy to integrate into any ETL tool.

Features

  • imports any kind of Alfresco content (not only files and folders but also custom types or aspects, any properties and associations, any document tree)
  • configures permissions on imported content
  • supports create vs update modes on documents and containers
  • provides import result logs
  • works through simple REST HTTP interactions with Alfresco, with content provided as fully compliant ACP (Alfresco Content Package) XML
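
To give an idea of what "ACP XML" means in the last point, here is a minimal sketch that builds an Alfresco view document for a single content node. The two namespaces are the standard Alfresco view and content model namespaces; the class and method names are hypothetical and this is not code from the ETL client library. It assumes plain file names, so it skips the ISO 9075 encoding that names with spaces would need.

```java
class AcpViewSketch {
    // Standard Alfresco namespaces for the repository view and the content model
    static final String VIEW_NS = "http://www.alfresco.org/view/repository/1.0";
    static final String CM_NS = "http://www.alfresco.org/model/content/1.0";

    /** Escapes the characters that are special in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /**
     * Builds a view document describing a single cm:content node.
     * Assumption: the name is a plain file name, so view:childName
     * is written without ISO 9075 encoding.
     */
    static String documentView(String name, String title) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<view:view xmlns:view=\"").append(VIEW_NS)
           .append("\" xmlns:cm=\"").append(CM_NS).append("\">\n");
        xml.append("  <cm:content view:childName=\"cm:").append(escape(name)).append("\">\n");
        // cm:title belongs to the cm:titled aspect, so the aspect is declared first
        xml.append("    <view:aspects>\n");
        xml.append("      <cm:titled/>\n");
        xml.append("    </view:aspects>\n");
        xml.append("    <view:properties>\n");
        xml.append("      <cm:name>").append(escape(name)).append("</cm:name>\n");
        xml.append("      <cm:title>").append(escape(title)).append("</cm:title>\n");
        xml.append("    </view:properties>\n");
        xml.append("  </cm:content>\n");
        xml.append("</view:view>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        System.out.print(documentView("report.txt", "Quarterly report"));
    }
}
```

In a real import job the ETL tool generates this kind of document from its data flow; custom types, aspects and associations would follow the same pattern with their own namespaces.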

Team

License

  • Alfresco Server Extension : GPL
  • ETL client library : LGPL

2. Installation

Compatible Alfresco releases

  • validated with 2.1 Enterprise for Tomcat
  • should work with all 2.x Alfresco releases
  • reported to work on Labs 2.9b

Compatible ETLs

Installation

  • client side (ETL tool) : get a compatible ETL tool release (see above)
  • server side (Alfresco repository) : get the extension from http://forge.alfresco.com/frs/?group_id=206 (alternatively, it may be provided in compatible ETL release bundles)
  • put the etlconnector-alfresco*.jar file in the WEB-INF/lib directory of your Alfresco installation, e.g. $ALF_HOME/tomcat/webapps/alfresco/WEB-INF/lib
  • restart Alfresco. If the extension has been installed correctly, the startup logs (alfresco.log) should contain a line like this one:
19:20:49,635 INFO [org.alfresco.config.source.UrlConfigSource] Found META-INF/web-client-config-custom.xml in file:/C:/dev/workspace/etlconnector-alfresco-deploy/tomcat/webapps/alfresco/WEB-INF/lib/etlconnector-alfresco_1.0.jar
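
If you prefer to check the log programmatically, the following sketch scans a log file for such a line. It only assumes that the line mentions both web-client-config-custom.xml and the etlconnector-alfresco jar, as in the sample line above; the class and method names are hypothetical, and the log file path is passed as the first argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

class StartupLogCheck {
    /**
     * Returns true if any log line reports that the web client config
     * shipped inside the connector jar has been found.
     */
    static boolean connectorLoaded(List<String> logLines) {
        for (String line : logLines) {
            if (line.contains("web-client-config-custom.xml")
                    && line.contains("etlconnector-alfresco")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to alfresco.log (assumed to be readable as UTF-8)
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(connectorLoaded(lines)
                ? "ETL Connector extension found in startup log"
                : "ETL Connector extension NOT found in startup log");
    }
}
```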

Test

You can test the installation with the samples provided in the companion project etlconnector-samples and a compatible ETL such as Talend 3.1 on the client side.

For the Quitus sample, using Talend :

  • put the etlconnector-samples*jar file in the WEB-INF/lib directory of your Alfresco web application
  • start the Alfresco server (after having installed the ETL Connector extension)
  • import the etlconnector-samples/quitus/GED_TECHNIQUE.acp document package into a new "GED TECHNIQUE" folder within the Company Home folder, using the custom action wizard in the Alfresco web interface
  • start Talend
  • import etlconnector-samples/quitus/talend/ALFRESCO_ETLCONNECTOR_QUITUS as a Talend workspace project
  • open the single Talend document import job ("ALFRESCO IMPORT_QUITUS 0.1")
  • in the left panel, click Context > PATHS 0.1 to open the configuration dialog, then set the job's PATH_SOURCE variable to the location of the etlconnector-samples/quitus folder
  • run the job in Talend: the complex document tree is imported into Alfresco, including custom metadata and associations

3. Documentation

ETL Connector

Using ETL Connector with Talend

4. FAQ

What is ETL Connector useful for?

ETL Connector's main benefits stem from the productivity gains inherent to ETL tools: they let you design graphically, and easily, how existing information maps to Alfresco metadata, and they bring an ETL's raw power to bear when accessing data sources in the information system. Moreover, an ETL provides all kinds of tools to first partition data into smaller batches, and afterwards handle errors.

Known problems on Alfresco 3

Performance

  • successfully tested on an import job creating 4000 nodes in 30 minutes (about 2.2 nodes per second), using the Talend ETL to target an Alfresco 2.1 Enterprise server running on Oracle.
  • in some deployment environments, a sustained speed of 12 nodes per second has been observed.

5. For developers

Alfresco Server Extension architecture

  • builds on the existing Alfresco Content Package (ACP) import
  • enriches it with : import of each node in its own transaction, better name path addressing, full error logs, and custom import strategies allowing creation vs update import modes
  • XML REST / HTTP server implemented as Alfresco web Commands (though a Java webscript would be a viable alternative today)
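
The per-node isolation and the create vs update strategies described above can be illustrated with the following sketch. It is not the extension's code: the repository is simulated by an in-memory map, a "transaction" is reduced to a try/catch around each node, and all names (PerNodeImportSketch, Mode, importAll) are hypothetical. It only shows the idea that one failing node is logged without aborting the rest of the import.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PerNodeImportSketch {
    /** Hypothetical import strategies: refuse existing nodes, or overwrite them. */
    enum Mode { CREATE_ONLY, CREATE_OR_UPDATE }

    /** Imports every node independently and returns one log line per node. */
    static List<String> importAll(Map<String, String> repository,
                                  Map<String, String> incoming, Mode mode) {
        List<String> log = new ArrayList<String>();
        for (Map.Entry<String, String> node : incoming.entrySet()) {
            try {
                // Stand-in for "each node in its own transaction"
                log.add(importOne(repository, node.getKey(), node.getValue(), mode));
            } catch (RuntimeException e) {
                // The failure is logged and only affects this node
                log.add("ERROR " + node.getKey() + ": " + e.getMessage());
            }
        }
        return log;
    }

    /** Creates or updates a single node according to the chosen mode. */
    static String importOne(Map<String, String> repository, String path,
                            String content, Mode mode) {
        if (content == null) {
            throw new IllegalArgumentException("missing content");
        }
        boolean exists = repository.containsKey(path);
        if (exists && mode == Mode.CREATE_ONLY) {
            throw new IllegalStateException("node already exists");
        }
        repository.put(path, content);
        return (exists ? "UPDATED " : "CREATED ") + path;
    }

    public static void main(String[] args) {
        Map<String, String> repository = new LinkedHashMap<String, String>();
        repository.put("/docs/a.txt", "old");
        Map<String, String> incoming = new LinkedHashMap<String, String>();
        incoming.put("/docs/a.txt", "new");
        incoming.put("/docs/b.txt", "hello");
        for (String line : importAll(repository, incoming, Mode.CREATE_OR_UPDATE)) {
            System.out.println(line);
        }
    }
}
```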

Building the Alfresco Server Extension

  • provide the etlconnector-alfresco Eclipse project with the Alfresco SDK and Java 1.5 dependencies
  • run Ant on the given build.xml
  • the ETL Connector Server release is in build/export/ , ready to be added to an Alfresco installation

Building the ETL client library

  • provide the etlconnector-client Eclipse project with the Java 1.5 dependencies
  • run Ant on the given build.xml
  • the ETL Connector Client release is in build/export/
  • to integrate it into an ETL, ask questions on the forums at http://forge.alfresco.com/projects/etlconnector/

-- MarcDutoo - 22 May 2009

Topic attachments

  • sample_quitus_alfresco.png (136.2 K, 25 May 2009, MarcDutoo) : Quitus sample document import - imported documents in Alfresco
  • sample_quitus_doc.png (105.9 K, 25 May 2009, MarcDutoo) : Quitus sample document import - opening an imported document
  • sample_quitus_talend.png (117.3 K, 25 May 2009, MarcDutoo) : Quitus sample document import - Talend job design

Topic revision: r9 - 12 Apr 2010 - 13:09:10 - MarcDutoo
 