www.coglinmill.com

 

     

ETL

ETL (Extract Transform and Load) sounds simple enough. However, if you’ve been involved in a data warehouse or data integration project you probably know better. Let’s review some of the requirements:

 

Especially in enterprise wide data warehouse implementations, you are forced to deal with many different data sources.  From our IBM i standpoint, local DB2 for i tables are fairly straightforward, but what about remote tables on another system or LPAR?

 

Maybe you also need to pull in some DB2 data from an AIX box, or even SQL Server or Oracle data. Much more challenging! What about text files, delimited files and other ‘legacy’ data not in a database?

 

And you probably don’t want all of it – you need to select today’s data only, or just the records that have changed using Change Data Capture.

 

Then you need to consider data quality. ALL data is dirty – or at least you must assume that it might be. Poor data quality is one of the major causes of failure of business intelligence projects.  So, you need to build cleansing rules. That implies error management and reporting.

 

Transformations can be simple – just map the data out to the target field - or quite complex. In many cases, it will take a few tries to get it right – so you need to be able to easily implement, test, revise and re-run. But more importantly, the final transformations (business rules) need to be visible to other developers and be made available to end users – so they can understand the data. Logic hidden in program code is of no use to anyone. We need metadata!

 

Loading into the target tables is not as simple as it sound either. New data must be inserted. Changes need to update existing rows – but how? Should I accumulate numeric columns or replace the value? More rules!

 

Do you need to consider Compliance. Public companies using their data warehouse for financial reporting need to be able to demonstrate where the data came from, what was done to it and when. This means audit trails, change management and more. Do you have the time or resources to build all of this around your ETL processes?

 

 

 

 

 

  866-RODIN-DW

  (866-763-4639) 

 Request Information
 Request a Quote
 Evaluation Copy
  View FAQs 
 Support

 

Subscribe