ETL Connector for Alfresco - Alfresco Server Extension
Project page and downloads at http://forge.alfresco.com/projects/etlconnector/
1. About ETL Connector
The ETL Connector extension for Alfresco lets you import documents into an Alfresco repository using a compatible ETL tool (currently Talend).
It also provides an ETL client library that makes it easy to integrate with any ETL tool.
Features
- imports any kind of Alfresco content (not only files and folders but also custom types or aspects, any properties and associations, any document tree)
- configures permissions on imported content
- supports create vs update modes on documents and containers
- provides import result logs
- works through simple REST HTTP interactions with Alfresco, with content provided as fully compliant ACP (Alfresco Content Package) XML
Team
License
- Alfresco Server Extension : GPL
- ETL client library : LGPL
2. Installation
Compatible Alfresco releases
- validated with 2.1 Enterprise for Tomcat
- should work with all 2.x Alfresco releases
- reported to work on Labs 2.9b
Compatible ETLs
Installation
- client side (ETL tool): get a compatible ETL tool release (see above)
- server side (Alfresco repository): download the extension from http://forge.alfresco.com/frs/?group_id=206 . Alternatively, it may be bundled with compatible ETL releases.
- put the etlconnector-alfresco*.jar file in the WEB-INF/lib directory of your Alfresco installation, e.g. $ALF_HOME/tomcat/webapps/alfresco/WEB-INF/lib
- restart Alfresco. If the extension is correctly installed, the startup log (alfresco.log) should contain a line like this one:
19:20:49,635 INFO [org.alfresco.config.source.UrlConfigSource] Found META-INF/web-client-config-custom.xml in file:/C:/dev/workspace/etlconnector-alfresco-deploy/tomcat/webapps/alfresco/WEB-INF/lib/etlconnector-alfresco_1.0.jar
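The log check described above can be automated. The following is a minimal sketch, not part of the ETL Connector distribution: the function names and the regular expression are assumptions, derived only from the sample log line shown above.

```python
import re

# Assumed pattern, modelled on the sample log line: the web client config
# file is reported as found inside an etlconnector-alfresco*.jar.
MARKER = re.compile(
    r"Found META-INF/web-client-config-custom\.xml in "
    r".*etlconnector-alfresco[^/\\]*\.jar"
)


def connector_loaded(log_lines):
    """Return True if any line reports the ETL Connector jar being picked up."""
    return any(MARKER.search(line) for line in log_lines)


def check_log_file(path):
    """Scan a log file (for instance alfresco.log) line by line."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return connector_loaded(handle)
```

Calling check_log_file with the path to alfresco.log after a restart returns True when a matching line is present. A False result only means that no such line was found in that file, for example because the log was rotated.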
Test
You can test the installation with the samples provided in the companion project etlconnector-samples, using a compatible ETL such as Talend 3.1 on the client side.
For the Quitus sample, using Talend:
- put the etlconnector-samples*.jar file in WEB-INF/lib in your Alfresco web application
- start the Alfresco server (after having installed the ETL Connector extension)
- import the etlconnector-samples/quitus/GED_TECHNIQUE.acp document package into a new "GED TECHNIQUE" folder within the company home folder, using the custom action wizard in the Alfresco web interface
- start Talend
- import the etlconnector-samples/quitus/talend/ALFRESCO_ETLCONNECTOR_QUITUS as a Talend workspace project
- open the single Talend document import job ("ALFRESCO IMPORT_QUITUS 0.1")
- in the left panel, click Context > PATHS 0.1 to open the configuration dialog, and set the job's PATH_SOURCE variable to the location of the etlconnector-samples/quitus folder
- run the job in Talend: the complex document tree should now be imported into Alfresco, including custom metadata and associations
3. Documentation
ETL Connector
Using ETL Connector with Talend
5. FAQ
What is ETL Connector useful for?
ETL Connector's main benefits stem from the productivity gains inherent to ETL tools: you can design graphically how existing information maps to Alfresco metadata, and draw on the ETL's full power to access data sources across the information system. An ETL also provides tools to partition data into smaller batches before import and to handle errors afterwards.
Known problems on Alfresco 3
Performance
- successfully tested on an import job creating 4000 nodes in 30 minutes, using the Talend ETL against an Alfresco 2.1 Enterprise server backed by Oracle.
- in some deployment environments, a sustained rate of 12 nodes per second has been observed.
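To put the two figures above on the same scale, here is a small arithmetic sketch. The helper names are illustrative only, and the second result is an extrapolation that assumes the 12 nodes per second rate would hold for a job of the same size.

```python
def nodes_per_second(nodes, minutes):
    """Average import rate for a job of `nodes` nodes lasting `minutes`."""
    return nodes / (minutes * 60)


def minutes_for(nodes, rate_per_second):
    """Time in minutes to import `nodes` nodes at a steady rate."""
    return nodes / rate_per_second / 60


# 4000 nodes in 30 minutes: about 2.2 nodes per second
tested_rate = nodes_per_second(4000, 30)

# The same 4000 nodes at 12 nodes per second: about 5.6 minutes
fast_run_minutes = minutes_for(4000, 12)

print(round(tested_rate, 1), round(fast_run_minutes, 1))
```

In other words, the faster environments were roughly five times quicker than the validated test run.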
4. For developers
Alfresco Server Extension architecture
- builds on the existing Alfresco Content Package (ACP) import
- enriches it with: import of each node in its own transaction, better name path addressing, full error logs, and custom import strategies allowing create vs update import modes
- XML REST / HTTP server implemented as Alfresco web Commands (though a Java webscript would be a viable alternative today)
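The ideas listed above (one transaction per node, create vs update modes, full error logs) can be illustrated with a small sketch. This is not the extension's actual code: the repository is a plain dictionary, the names ImportResult and import_nodes are hypothetical, and the choice that update mode creates missing nodes is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ImportResult:
    created: list = field(default_factory=list)
    updated: list = field(default_factory=list)
    errors: list = field(default_factory=list)  # (path, message) pairs


def import_nodes(repository, nodes, mode="create"):
    """Import (path, properties) pairs into a dict standing in for a repository.

    mode="create" rejects paths that already exist; mode="update" merges
    properties into existing nodes and (by assumption) creates missing ones.
    """
    if mode not in ("create", "update"):
        raise ValueError("mode must be 'create' or 'update'")
    result = ImportResult()
    for path, properties in nodes:
        try:
            # Each node is handled on its own, so a failure here does not
            # undo nodes already imported (a stand-in for one transaction
            # per node).
            if path in repository:
                if mode == "create":
                    raise ValueError("node already exists")
                repository[path].update(properties)
                result.updated.append(path)
            else:
                repository[path] = dict(properties)
                result.created.append(path)
        except Exception as error:
            # Failures are recorded in the result log instead of aborting.
            result.errors.append((path, str(error)))
    return result
```

For example, importing two nodes in create mode into a repository that already holds the first one records a single error for that path and still creates the second node. Running the same input in update mode changes the existing node instead.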
Building the Alfresco Server Extension
- provide the etlconnector-alfresco Eclipse project with the Alfresco SDK and Java 1.5 dependencies
- run Ant on the given build.xml
- the ETL Connector Server release is in build/export/ , ready to be added to an Alfresco installation
Building the ETL client library
- provide the etlconnector-client Eclipse project with the Java 1.5 dependencies
- run Ant on the given build.xml
- the ETL Connector Client release is in build/export/
- to integrate it into an ETL tool, ask questions on the forums at http://forge.alfresco.com/projects/etlconnector/
--
MarcDutoo - 22 May 2009
Topic revision: r9 - 12 Apr 2010 - 13:09:10 - MarcDutoo