Sunday, September 04, 2011

Execution orders for Stage Variables, Constraints and Column Derivations in Transformer stage


For people that work in DataStage ETL tools, using Transformer stage in a Server/Parallel is very common and personally I would say transformer stage will be used at least once most of the time. Have anyone clear about the execution order for Stage variables, Constraints and Column derivations in Transformer? 

Today what I wish to share is about the execution order for stage variables, constraints and column derivations.

In transformer stage, execution order for the three components I mentioned above is stage variables, constraints and column derivations. Basically, DataStage executes these from top to bottom. This is clearly shown when you double click on the Transformer stage and you would see stage variable is located the top, followed by constraints and column derivations.

Please always remember the following characteristics for each component:
  • Stage variables - Executed for every rows that we processed/extracted
  • Constraints - Can be treated as a filter condition which limits the number of rows/records coming from our input based on the business rules we defined. Stage variable can be used in constraints.
  • Column derivations - Used to get or modify our input values, i.e. concatenation of two values from inputs, set the column to constant value, etc.

Saturday, April 02, 2011

Data Warehouse Glossary

I came across this document that contains all the data warehouse related glossary. The document is compiled by Father of Data Warehouse, Bill Inmon. The link to the Data Warehouse Glossary is here. 

Hope the document will be beneficial to everyone especially those that currently working in Data Warehouse and Business Intelligence space. :) Feel free to share with me if you guys have good data warehouse and business intelligence related docs/articles. 

Sunday, February 27, 2011

Default Null in DataStage

In DataStage, the way to default null in transformer stage for Server job and Parallel job are different.

In transformer stage in Server job, @NULL is used.
In transformer stage in Parallel job, SetNull() is used.