DataStage provides a number of
environment variables to control how jobs operate on a UNIX system. In
addition to providing required information, environment variables can be used
to enable or disable various DataStage features, and to tune performance
settings.
Data Stage Environment Variable Settings for All Jobs
DataStage recommends the following environment variable settings for all Enterprise Edition jobs. These settings can be made at the project level, or may be set on an individual basis within the properties for each job.
Environment Variable Settings For All Jobs
Environment Variable
|
Setting
|
Description
|
$APT_CONFIG_FILE
|
filepath
|
Specifies the full pathname to the EE configuration file.
|
$APT_DUMP_SCORE
|
1
|
Outputs EE score dump to the DataStage job log, providing
detailed information about actual job flow including operators, processes,
and datasets. Extremely useful for understanding how a job actually ran in
the environment. (see section 10.1 Reading a Score Dump)
|
$OSH_ECHO
|
1
|
Includes a copy of the generated osh in the job’s DataStage
log. Starting with v7, this option is enabled when “Generated OSH
visible for Parallel jobs in ALL projects” option is enabled in DataStage
Administrator.
|
$APT_RECORD_COUNTS
|
1
|
Outputs record counts to the DataStage job log as each
operator completes processing. The count is per operator per partition.
|
$APT_PM_SHOW_PIDS
|
1
|
Places entries in DataStage job log showing UNIX process ID
(PID) for each process started by a job. Does not report PIDs of DataStage
“phantom” processes started by Server shared containers.
|
$APT_BUFFER_MAXIMUM_TIMEOUT
|
1
|
Maximum buffer delay in seconds
|
$APT_THIN_SCORE
(DataStage 7.0 and earlier)
|
1
|
Only needed for DataStage v7.0 and earlier. Setting this
environment variable significantly reduces memory usage for very large
(>100 operator) jobs.
|
Additional Environment Variable Settings
DataStage recommends setting the
following environment variables on an as-needed basis. These variables can be
used to tune the performance of a particular job flow, to assist in debugging,
and to change the default behavior of specific EE stages.
NOTE: The environment
variable settings in this section are only examples. Set values that are
optimal to your environment.
Sequential File Stage Environment Variables
Environment Variable
|
Setting
|
Description
|
$APT_EXPORT_FLUSH_COUNT
|
[nrows]
|
Specifies how frequently (in rows) that the Sequential File
stage (export operator) flushes its internal buffer to disk. Setting this
value to a low number (such as 1) is useful for realtime applications, but
there is a small performance penalty from increased I/O.
|
$APT_IMPORT_BUFFER_SIZE
$APT_EXPORT_BUFFER_SIZE
|
[Kbytes]
|
Defines size of I/O buffer for Sequential File reads (imports)
and writes (exports) respectively. Default is 128 (128K), with a minimum of
8. Increasing these values on heavily-loaded file servers may improve
performance.
|
$APT_CONSISTENT_BUFFERIO_SIZE
|
[bytes]
|
In some disk array configurations, setting this variable to a
value equal to the read / write size in bytes can improve performance of
Sequential File import/export operations.
|
$APT_DELIMITED_READ_SIZE
|
[bytes]
|
Specifies the number of bytes the Sequential File (import)
stage reads-ahead to get the next delimiter. The default is 500 bytes, but
this can be set as low as 2 bytes.
This setting should be set to a lower value when reading from
streaming inputs (eg. socket, FIFO) to avoid blocking.
|
$APT_MAX_DELIMITED_READ_SIZE
|
[bytes]
|
By default, Sequential File (import) will read ahead 500 bytes
to get the next delimiter. If it is not found the importer looks ahead
4*500=2000 (1500 more) bytes, and so on (4X) up to 100,000 bytes.
This variable controls the upper bound which is by default
100,000 bytes. When more than 500 bytes read-ahead is desired, use this
variable instead of APT_DELIMITED_READ_SIZE.
|
Oracle Environment Variables
Environment Variable
|
Setting
|
Description
|
$ORACLE_HOME
|
[path]
|
Specifies installation directory for current Oracle instance.
Normally set in a user’s environment by Oracle scripts.
|
$ORACLE_SID
|
[sid]
|
Specifies the Oracle service name, corresponding to a TNSNAMES
entry.
|
$APT_ORAUPSERT_COMMIT_ROW_INTERVAL
$APT_ORAUPSERT_COMMIT_TIME_INTERVAL
|
[num]
[seconds]
|
These two environment variables work together to specify how
often target rows are committed for target Oracle stages with Upsert method.
Commits are made whenever the time interval period has passed
or the row interval is reached, whichever comes first. By default, commits
are made every 2 seconds or 5000 rows.
|
$APT_ORACLE_LOAD_OPTIONS
|
[SQL*
Loader options] |
Specifies Oracle SQL*Loader options used in a target Oracle
stage with Load method. By default, this is set to OPTIONS(DIRECT=TRUE,
PARALLEL=TRUE)
|
$APT_ORA_IGNORE_CONFIG_FILE_PARALLELISM
|
1
|
When set, a target Oracle stage with Load method will limit
the number of players to the number of datafiles in the table’s tablespace.
|
$APT_ORA_WRITE_FILES
|
[filepath]
|
Useful in debugging Oracle SQL*Loader issues. When set, the
output of a Target Oracle stage with Load method is written to files instead
of invoking the Oracle SQL*Loader. The filepath specified by this environment
variable specifies the file with the SQL*Loader commands.
|
$DS_ENABLE_RESERVED_CHAR_CONVERT
|
1
|
Allows DataStage to handle Oracle databases which use the
special characters # and $ in column names.
|
Job Monitoring Environment Variables
Environment Variable
|
Setting
|
Description
|
$APT_MONITOR_TIME
|
[seconds]
|
In v7 and later, specifies the time interval (in seconds) for
generating job monitor information at runtime. To enable size-based job
monitoring, unset this environment variable, and set $APT_MONITOR_SIZEbelow.
|
$APT_MONITOR_SIZE
|
[rows]
|
Determines the minimum number of records the job monitor
reports. The default of 5000 records is usually too small. To minimize the
number of messages during large job runs, set this to a higher value (eg.
1000000).
|
$APT_NO_JOBMON
|
1
|
Disables job monitoring completely. In rare instances, this
may improve performance. In general, this should only be set on a per-job
basis when attempting to resolve performance bottlenecks.
|
$APT_RECORD_COUNTS
|
1
|
Prints record counts in the job log as each operator completes
processing. The count is per operator per partition.
|
njoy the simplicity.......
Atul Singh
No comments :
Post a Comment