
Thursday, July 12, 2012

Tips for debugging a DataStage job


Here are some tips for DataStage beginners. I hope they help you debug your jobs.

Enable the following environment variables in DataStage Administrator:
  • APT_PM_PLAYER_TIMING – shows how much CPU time each stage uses
  • APT_PM_SHOW_PIDS – shows the process ID of each stage
  • APT_RECORD_COUNTS – shows record counts in log
  • APT_CONFIG_FILE – switch configuration file (one node, multiple nodes)
  • OSH_DUMP – shows the OSH code generated for your job, so you can check whether the GUI set anything unexpected.
  • APT_DUMP_SCORE – shows all processes and inserted operators in your job
  • APT_DISABLE_COMBINATION – stops multiple stages from being combined into one process. With combination disabled, it is easier to see where your errors are occurring.

Other tips:
  • Use a Copy stage to send data to intermediate Peek stages or sequential debug files. A Copy stage that only passes data through is optimized out of the job, so it adds no overhead.
  • Use a Row Generator stage to generate sample data.
  • Look at the phantom files for additional error messages: c:\datastage\project_folder\&PH&
  • To catch partitioning problems, run your job with a single-node configuration file and compare the output with your multi-node run. You can just compare the file sizes, or sort the data for a more detailed comparison (Unix sort + diff commands).
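
If sort and diff are not at hand, the single-node versus multi-node comparison can also be sketched in Python. This is only an illustration, assuming both runs wrote plain text files with one record per line; the function name compare_outputs and the file paths are made up for the example. It ignores row order, just like sorting both files before diffing them.

```python
from collections import Counter
from pathlib import Path


def compare_outputs(single_node_path, multi_node_path):
    """Compare two text outputs as unordered collections of rows.

    Returns (only_in_single, only_in_multi). Each is a Counter mapping a
    row to how many more times it appears on that side than on the other.
    Both Counters are empty when the files hold the same rows in any order.
    """
    single_rows = Counter(Path(single_node_path).read_text().splitlines())
    multi_rows = Counter(Path(multi_node_path).read_text().splitlines())
    # Counter subtraction keeps only rows with a positive surplus.
    return single_rows - multi_rows, multi_rows - single_rows
```

Calling compare_outputs("out_1node.txt", "out_4node.txt") (hypothetical file names) and getting two empty Counters back suggests the partitioning did not change the result. Missing or duplicated rows show up in one of the two Counters.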

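Reading the phantom files one by one gets tedious, so here is a rough Python sketch that scans a &PH& folder for suspicious lines. It assumes the phantom files are readable as text, and the keyword list (error, abort, fatal) is my own guess at what is worth flagging, not something DataStage defines. The function name find_error_lines is made up for the example.

```python
from pathlib import Path


def find_error_lines(ph_dir, keywords=("error", "abort", "fatal")):
    """Yield (file name, line number, text) for lines containing a keyword.

    Files are visited newest first, so the latest run is reported first.
    The keywords are an assumption; adjust them to your own log messages.
    """
    files = [p for p in Path(ph_dir).iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    for path in files:
        # Replace undecodable bytes instead of failing on odd encodings.
        text = path.read_text(errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if any(word in line.lower() for word in keywords):
                yield path.name, number, line.strip()
```

Point it at your own project's &PH& folder and loop over the results to print them. The match is case-insensitive, and each line is reported once even if it contains several keywords.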


Enjoy the simplicity.
Atul Singh