Monday, April 29, 2013

Optimize Your DataStage Job Performance with Relevant Environment Variables

DataStage has many environment variables that can be tuned to optimize the performance of DataStage jobs. Many others are available to collect additional information and traces when a job crashes.

If a DataStage job runs into a problem, or you want more detail about what it is doing, check the following variables.

$APT_CONFIG_FILE: This lets you specify the configuration file a job uses. You can keep several configuration files with different node counts and assign one to a job dynamically, based on criteria such as the time of day.
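As a rough illustration of assigning a configuration file by time, the sketch below picks a file based on the hour of day. The directory, the file names, and the 20:00 to 06:00 off-peak window are all hypothetical, and how the chosen path reaches the job (for example, as a job parameter) depends on your setup.

```python
from datetime import datetime

# Hypothetical configuration files; real names and locations depend on your installation.
CONFIG_DIR = "/opt/ds/configs"
OFF_PEAK_CONFIG = f"{CONFIG_DIR}/8node.apt"  # more nodes while the server is quiet
PEAK_CONFIG = f"{CONFIG_DIR}/2node.apt"      # fewer nodes during business hours


def choose_config_file(now: datetime) -> str:
    """Return the configuration file path to use at the given time.

    Assumes an off-peak window from 20:00 to 06:00 (an invented example).
    """
    if now.hour >= 20 or now.hour < 6:
        return OFF_PEAK_CONFIG
    return PEAK_CONFIG


if __name__ == "__main__":
    # The chosen path would be supplied to the job as the value of $APT_CONFIG_FILE.
    print(choose_config_file(datetime.now()))
```

The same idea extends to other criteria, such as expected data volume, by swapping the time check for whatever rule suits the job.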

$APT_DUMP_SCORE: It creates a job run report that shows the partitioning used, the degree of parallelism, data buffering, and inserted operators. It is useful for finding out what your high-volume job is doing.

$APT_PM_PLAYER_TIMING: This option lets you see what each operator in a job is doing, especially how much data it is handling and how much CPU it is consuming. It helps in identifying bottlenecks.

Please refer to the following link from IBM for a detailed list of environment variables that can help in various areas. It covers buffering, building custom stages, the compiler, debugging, disk I/O, general job administration, and many other relevant areas.
Here is an extensive list of DataStage_Environment_Variables, with each one categorized and explained, along with how to use it.

Here are some more details:

14 Good design tips in Datastage
Environment Variable for Data Stage Best Practices and Performance Tuning
Tips & Tricks for debugging a DataStage job
Using DataStage 8 Parameter Sets to Tame Environment Variables