
Monday, August 05, 2013

Orchadmin Command : DataStage

Orchadmin is a command-line utility provided by DataStage for inspecting and managing data sets.

The general invocation format is: $orchadmin <command> [options] [descriptor file]

Before using orchadmin, make sure that either the working directory or $APT_ORCHHOME/etc contains the file “config.apt”, or that the environment variable $APT_CONFIG_FILE is defined for your session.
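
The check described above can be sketched as a small shell function. This is only an illustration: the function name find_apt_config is hypothetical, and the order in which the three locations are tried is an assumption, not something the orchadmin documentation is quoted on here.

```shell
# Hypothetical helper: report which configuration file a session would pick up.
# Assumption: APT_CONFIG_FILE is checked first, then the working directory,
# then $APT_ORCHHOME/etc. The real lookup order may differ.
find_apt_config() {
    if [ -n "${APT_CONFIG_FILE:-}" ]; then
        printf '%s\n' "$APT_CONFIG_FILE"
    elif [ -f ./config.apt ]; then
        printf '%s\n' "./config.apt"
    elif [ -n "${APT_ORCHHOME:-}" ] && [ -f "$APT_ORCHHOME/etc/config.apt" ]; then
        printf '%s\n' "$APT_ORCHHOME/etc/config.apt"
    else
        echo "no config.apt found and APT_CONFIG_FILE is not set" >&2
        return 1
    fi
}
```

Calling find_apt_config before running any orchadmin command gives an early, readable failure instead of an error from the utility itself.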

Orchadmin commands

The various commands available with orchadmin are:

1. CHECK: $orchadmin check

Validates the contents of the configuration file, such as the accessibility of all nodes defined in it and the scratch disk definitions on those nodes. Throws an error when the config file is not found or is not defined properly.

2. COPY : $orchadmin copy <source.ds> <destination.ds>

Makes a complete copy of the source data set under a new destination descriptor file name. Please note that:
a. You cannot use the UNIX cp command, as it just copies the descriptor file to a new name. The data is not copied.
b. The new data set is laid out according to the config file currently in use, not according to the old config file that was in use when the source was created.
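
A minimal sketch of a guarded copy, assuming orchadmin is on the PATH. The function name copy_dataset and the pre-checks are illustrative additions, not part of orchadmin; only the "orchadmin copy <source.ds> <destination.ds>" form comes from the text above.

```shell
# Hypothetical wrapper around "orchadmin copy".
# Refuses to run if the source descriptor is missing or the destination
# descriptor already exists; these guards are this sketch's own choice.
copy_dataset() {
    src=$1
    dst=$2
    if [ ! -f "$src" ]; then
        echo "source descriptor not found: $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "destination already exists: $dst" >&2
        return 1
    fi
    orchadmin copy "$src" "$dst"
}
```

For example, copy_dataset customers.ds customers_backup.ds (both file names are placeholders) copies the descriptor and the underlying data files together, which plain cp would not do.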

3. DELETE : $orchadmin < delete | del | rm > [-f | -x] descriptorfiles…

The UNIX rm utility cannot be used to delete data sets. Use the orchadmin delete (or del, or rm) command to delete one or more persistent data sets.
-f : Forces the delete. If some nodes are not accessible, -f deletes the data set partitions on the accessible nodes and leaves the partitions on the inaccessible nodes as orphans.
-x : Uses the current config file while deleting, rather than the one stored in the data set.

4. DESCRIBE: $orchadmin describe [options] descriptorfile.ds

This is the single most important command.
Without any option, it lists the number of partitions, the number of segments, the valid segments, and the preserve-partitioning flag of the persistent data set.
-c : Prints the configuration file that is written in the data set, if any.
-p : Lists the partition-level information.
-f : Lists the file-level information in each partition.
-e : Lists the segment-level information.
-s : Lists the metadata schema of the data set.
-v : Lists all segments, valid or otherwise.
-l : Long listing. Equivalent to -f -p -s -v -e.
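
As an illustration, the default summary and the schema listing can be run back to back. The function name describe_dataset is hypothetical; the sketch uses only the option-less form and the -s option listed above.

```shell
# Hypothetical helper: show the default summary, then the schema.
# "$1" is the path to the descriptor file.
describe_dataset() {
    ds=$1
    # Default output: partitions, segments, preserve-partitioning flag.
    orchadmin describe "$ds" || return 1
    # -s adds the metadata schema.
    orchadmin describe -s "$ds"
}
```

When everything is needed at once, orchadmin describe -l on the same descriptor file gives the long listing instead.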

5. DUMP: $orchadmin dump [options] descriptorfile.ds

The dump command is used to dump (extract) the records from the data set.
Without any options, the dump command lists all the records, starting from the first record of the first partition through the last record of the last partition.
-delim '<string>' : Uses the given string as the field delimiter instead of a space.
-field <name> : Lists only the given field instead of all fields.
-name : Lists all the values preceded by the field name and a colon.
-n numrecs : Lists only the given number of records per partition.
-p period : Lists every Nth record from each partition (N being the period), starting from the first record.
-skip N : Skips the first N records from each partition.
-x : Uses the current system configuration file rather than the one stored in the data set.
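
A short sketch of sampling a single field, assuming that -field and -n can be combined in one call (each option is listed above, but their combination is an assumption). The function name dump_sample and its default of 10 records are hypothetical.

```shell
# Hypothetical helper: print one field for the first few records
# of each partition.
#   $1  descriptor file
#   $2  field name
#   $3  records per partition (defaults to 10 in this sketch)
dump_sample() {
    ds=$1
    field=$2
    n=${3:-10}
    orchadmin dump -field "$field" -n "$n" "$ds"
}
```

For instance, dump_sample customers.ds cust_id 5 would ask for the cust_id field of five records from every partition; both the file and the field name are placeholders.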

6. TRUNCATE: $orchadmin truncate [options] descriptorfile.ds

Without options, deletes all the data (i.e. the segments) from the data set.
-f : Forces the truncate. Truncates the accessible segments and leaves the inaccessible ones.
-x : Uses the current system config file rather than the one stored in the data set.
-n N : Leaves the first N segments in each partition and truncates the rest.

7. HELP: $orchadmin -help OR $orchadmin <command> -help

Displays the help text for orchadmin as a whole or for a specific orchadmin command.