Sunday, September 04, 2011

Execution orders for Stage Variables, Constraints and Column Derivations in Transformer stage


For people that work in DataStage ETL tools, using Transformer stage in a Server/Parallel is very common and personally I would say transformer stage will be used at least once most of the time. Have anyone clear about the execution order for Stage variables, Constraints and Column derivations in Transformer? 

Today what I wish to share is about the execution order for stage variables, constraints and column derivations.

In transformer stage, execution order for the three components I mentioned above is stage variables, constraints and column derivations. Basically, DataStage executes these from top to bottom. This is clearly shown when you double click on the Transformer stage and you would see stage variable is located the top, followed by constraints and column derivations.

Please always remember the following characteristics for each component:
  • Stage variables - Executed for every rows that we processed/extracted
  • Constraints - Can be treated as a filter condition which limits the number of rows/records coming from our input based on the business rules we defined. Stage variable can be used in constraints.
  • Column derivations - Used to get or modify our input values, i.e. concatenation of two values from inputs, set the column to constant value, etc.