
Friday, June 29, 2012

File Stages : DataStage


Data set stage
The Data Set stage is a file stage. It allows you to read data from or write data to a data set. The stage can have a single input link or a single output link. It can be configured to execute in parallel or sequential mode.
What is a data set? Parallel jobs use data sets to manage data within a job. You can think of each link in a
job as carrying a data set. The Data Set stage allows you to store data being operated on in a persistent
form, which can then be used by other InfoSphere DataStage jobs. Data sets are operating system files,
each referred to by a control file, which by convention has the suffix .ds. Using data sets wisely can be
key to good performance in a set of linked jobs. You can also manage data sets independently of a job
using the Data Set Management utility, available from the InfoSphere DataStage Designer or Director.



Sequential file stage
The Sequential File stage is a file stage. It allows you to read data from or write data to one or more flat
files. The stage can have a single input link or a single output link, and a single rejects link.
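
The idea behind a rejects link can be illustrated outside DataStage. The Python sketch below is not DataStage code; it only mimics the behaviour described above, where rows that fit the expected layout go to the output and rows that do not are routed to a separate rejects stream. The function name read_with_rejects and the three-column layout are assumptions made for this example.

```python
import csv
import io


def read_with_rejects(text, expected_columns):
    """Split delimited text into good rows and rejected rows.

    A row is rejected when its column count differs from expected_columns.
    """
    rows, rejects = [], []
    for record in csv.reader(io.StringIO(text)):
        if len(record) == expected_columns:
            rows.append(record)
        else:
            rejects.append(record)
    return rows, rejects


# The second row is missing a column, so it lands in the rejects list.
sample = "1,alice,NY\n2,bob\n3,carol,TX\n"
rows, rejects = read_with_rejects(sample, expected_columns=3)
print(rows)
print(rejects)
```

In a real job the stage applies the column definitions from the link metadata; the column count check here is only a stand-in for that validation.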



File set stage
The File Set stage is a file stage. It allows you to read data from or write data to a file set. The stage can
have a single input link, a single output link, and a single rejects link. It only executes in parallel mode.
What is a file set? InfoSphere DataStage can generate and name exported files, write them to their
destination, and list the files it has generated in a file whose extension is, by convention, .fs. The data
files and the file that lists them are called a file set. This capability is useful because some operating
systems impose a 2 GB limit on the size of a file and you need to distribute files among nodes to prevent
overruns.
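
To make the "data files plus a file that lists them" idea concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the descriptor written here is a plain list of paths, which is not the real .fs format, and the names write_file_set, read_file_set and the .partN suffix are invented for the example.

```python
import os
import tempfile


def write_file_set(records, partitions, directory, name):
    """Write records round-robin into `partitions` data files and list
    those files in a descriptor named <name>.fs (illustrative format)."""
    paths = [os.path.join(directory, f"{name}.part{i}") for i in range(partitions)]
    handles = [open(p, "w") for p in paths]
    try:
        for i, record in enumerate(records):
            handles[i % partitions].write(record + "\n")
    finally:
        for handle in handles:
            handle.close()
    descriptor = os.path.join(directory, f"{name}.fs")
    with open(descriptor, "w") as f:
        f.write("\n".join(paths) + "\n")
    return descriptor


def read_file_set(descriptor):
    """Read every record from the data files listed in the descriptor."""
    with open(descriptor) as f:
        paths = [line.strip() for line in f if line.strip()]
    records = []
    for path in paths:
        with open(path) as data:
            records.extend(line.rstrip("\n") for line in data)
    return records


with tempfile.TemporaryDirectory() as tmp:
    fs = write_file_set(["r1", "r2", "r3", "r4", "r5"], 2, tmp, "orders")
    result = read_file_set(fs)
    print(result)
```

Because the records are spread over several smaller files, no single file has to hold the whole data volume, which is the point the paragraph above makes about file size limits.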



Lookup file set stage
The Lookup File Set stage is a file stage. It allows you to create a lookup file set or reference one for a
lookup. The stage can have a single input link or a single output link. The output link must be a
reference link. The stage can be configured to execute in parallel or sequential mode when used with an
input link.
When creating Lookup file sets, one file will be created for each partition. The individual files are
referenced by a single descriptor file, which by convention has the suffix .fs.



External source stage
The External Source stage is a file stage. It allows you to read data that is output from one or more
source programs. The stage calls the program and passes appropriate arguments. The stage can have a
single output link, and a single rejects link. It can be configured to execute in parallel or sequential mode.
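
Conceptually, reading from a source program means running it and treating its standard output as rows. The Python sketch below shows that pattern only; it is not how the stage is implemented. The helper read_from_program is a hypothetical name, and the "source program" is a stand-in Python one-liner so the example is self-contained.

```python
import subprocess
import sys


def read_from_program(argv):
    """Run a program and return each line of its standard output as a row."""
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return completed.stdout.splitlines()


# Stand-in source program: a Python one-liner that prints two delimited rows.
source = [sys.executable, "-c", "print('1,alice'); print('2,bob')"]
rows = read_from_program(source)
print(rows)
```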


External Target stage
The External Target stage is a file stage. It allows you to write data to one or more target programs. The
stage can have a single input link and a single rejects link. It can be configured to execute in parallel or
sequential mode. There is also an External Source stage, which allows you to read from an external program.
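
Writing to an external program follows the mirror-image pattern: the rows are fed to the program's standard input. The Python sketch below illustrates that pattern and nothing more. The helper write_to_program is a hypothetical name, and the target is a stand-in Python one-liner that counts the lines it receives.

```python
import subprocess
import sys


def write_to_program(argv, rows):
    """Send rows, one per line, to the standard input of a program
    and return whatever the program printed."""
    payload = "".join(row + "\n" for row in rows)
    completed = subprocess.run(
        argv, input=payload, capture_output=True, text=True, check=True
    )
    return completed.stdout


# Stand-in target program: counts the lines it receives on stdin.
target = [sys.executable, "-c", "import sys; print(len(sys.stdin.readlines()))"]
output = write_to_program(target, ["1,alice", "2,bob", "3,carol"])
print(output.strip())
```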



Complex Flat File stage
The Complex Flat File (CFF) stage is a file stage. You can use the stage to read a file or write to a file, but
you cannot use the same stage to do both.
As a source, the CFF stage can have multiple output links and a single reject link. You can read data from
one or more complex flat files, including MVS™ data sets with QSAM and VSAM files. You can also read data from files that contain multiple record types. The source data can contain one or more of the
following clauses:
· GROUP
· REDEFINES
· OCCURS
· OCCURS DEPENDING ON
CFF source stages run in parallel mode when they are used to read multiple files, but you can configure
the stage to run sequentially if it is reading only one file with a single reader.
As a target, the CFF stage can have a single input link and a single reject link. You can write data to one
or more complex flat files. You cannot write to MVS data sets or to files that contain multiple record
types.
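
The OCCURS DEPENDING ON clause is the least obvious of the clauses listed above, so a small worked example may help. The Python sketch below parses a made-up text record laid out as a 4-character customer id, a 2-digit item count, and then that many 3-character item codes. The layout, the field names and the function parse_order are all assumptions for illustration; real complex flat files usually also involve EBCDIC and packed decimal fields, which this sketch ignores.

```python
def parse_order(record):
    """Parse one record: 4-char customer id, 2-digit item count,
    then that many 3-char item codes (the OCCURS DEPENDING ON part)."""
    cust_id = record[0:4]
    item_count = int(record[4:6])
    # The number of repeating groups is taken from the count field itself.
    items = [record[6 + 3 * i: 9 + 3 * i] for i in range(item_count)]
    if any(len(item) != 3 for item in items):
        raise ValueError("record is shorter than its item count implies")
    return {"cust_id": cust_id, "items": items}


parsed = parse_order("C00103AAABBBCCC")
print(parsed)
```

The record length varies with the count field, which is why such files cannot be read as simple fixed-width rows.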

