Data Set stage
The Data Set stage is a file stage. It allows you to read data from or write data to a data set. The stage can have a single input link or a single output link. It can be configured to execute in parallel or sequential mode.
What is a data set?
Parallel jobs use data sets to manage data within a job. You can think of each link in a job as carrying a data set. The Data Set stage allows you to store data being operated on in a persistent form, which can then be used by other InfoSphere DataStage jobs. Data sets are operating system files, each referred to by a control file, which by convention has the suffix .ds. Using data sets wisely can be key to good performance in a set of linked jobs. You can also manage data sets independently of a job using the Data Set Management utility, available from the InfoSphere DataStage Designer or Director.
Sequential File stage
The Sequential File stage is a file stage. It allows you to read data from or write data to one or more flat files. The stage can have a single input link or a single output link, and a single rejects link.
File Set stage
The File Set stage is a file stage. It allows you to read data from or write data to a file set. The stage can have a single input link, a single output link, and a single rejects link. It only executes in parallel mode.
What is a file set?
InfoSphere DataStage can generate and name exported files, write them to their destination, and list the files it has generated in a file whose extension is, by convention, .fs. The data files and the file that lists them are called a file set. This capability is useful because some operating systems impose a 2 GB limit on the size of a file and you need to distribute files among nodes to prevent overruns.
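The idea of a file set, several size-capped data files plus one file that lists them, can be sketched in Python. This is only an illustration of the concept, not the format InfoSphere DataStage actually writes: the function name `write_file_set`, the `_part_NNNN.dat` naming, and the plain-text listing are all assumptions made for the example.

```python
import os
import tempfile


def write_file_set(records, directory, name, max_bytes):
    """Write records to size-capped data files and list them in <name>.fs.

    Hypothetical helper: the layout here is illustrative, not DataStage's.
    """
    data_paths = []
    current = None
    written = 0
    for record in records:
        line = (record + "\n").encode("utf-8")
        # Start a new data file when the next record would exceed the cap.
        if current is None or written + len(line) > max_bytes:
            if current is not None:
                current.close()
            path = os.path.join(directory, f"{name}_part_{len(data_paths):04d}.dat")
            current = open(path, "wb")
            data_paths.append(path)
            written = 0
        current.write(line)
        written += len(line)
    if current is not None:
        current.close()

    # The listing file plays the role of the .fs descriptor (assumed format).
    listing_path = os.path.join(directory, name + ".fs")
    with open(listing_path, "w", encoding="utf-8") as listing:
        for path in data_paths:
            listing.write(path + "\n")
    return listing_path, data_paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        listing, parts = write_file_set(
            ["alpha", "beta", "gamma"], tmp, "demo", max_bytes=12
        )
        print(len(parts), "data files listed in", os.path.basename(listing))
```

A reader of such a set would open the listing first and then read each data file in turn, which is why the data files and the listing have to be kept together.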
Lookup File Set stage
The Lookup File Set stage is a file stage. It allows you to create a lookup file set or reference one for a lookup. The stage can have a single input link or a single output link. The output link must be a reference link. The stage can be configured to execute in parallel or sequential mode when used with an input link.
When creating Lookup file sets, one file will be created for each partition. The individual files are referenced by a single descriptor file, which by convention has the suffix .fs.
External Source stage
The External Source stage is a file stage. It allows you to read data that is output from one or more source programs. The stage calls the program and passes appropriate arguments. The stage can have a single output link, and a single rejects link. It can be configured to execute in parallel or sequential mode.
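Reading the output of a source program can be sketched outside DataStage as follows. This Python sketch is an analogy, not how the stage is implemented: `read_from_programs` is a hypothetical helper, and the two stand-in commands simply invoke the current Python interpreter so that the example runs anywhere.

```python
import subprocess
import sys


def read_from_programs(commands):
    """Run each source program with its arguments and yield its output lines.

    Hypothetical helper for illustration; each command is an argument list.
    """
    for argv in commands:
        # check=True raises CalledProcessError if the program exits non-zero.
        result = subprocess.run(argv, capture_output=True, text=True, check=True)
        for line in result.stdout.splitlines():
            yield line


if __name__ == "__main__":
    # Stand-in source programs: each prints a few comma-separated records.
    sources = [
        [sys.executable, "-c", "print('1,apple'); print('2,pear')"],
        [sys.executable, "-c", "print('3,plum')"],
    ]
    for row in read_from_programs(sources):
        print(row.split(","))
```

The programs are run one after another here for simplicity; nothing in the sketch models the parallel execution mode.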
External Target stage
The External Target stage is a file stage. It allows you to write data to one or more target programs. The stage can have a single input link and a single rejects link. It can be configured to execute in parallel or sequential mode. There is also an External Source stage, which allows you to read from an external program.
Complex Flat File stage
The Complex Flat File (CFF) stage is a file stage. You can use the stage to read a file or write to a file, but you cannot use the same stage to do both.
As a source, the CFF stage can have multiple output links and a single reject link. You can read data from one or more complex flat files, including MVS™ data sets with QSAM and VSAM files. You can also read data from files that contain multiple record types. The source data can contain one or more of the following clauses:
· GROUP
· REDEFINES
· OCCURS
· OCCURS DEPENDING ON
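As an illustration of why OCCURS DEPENDING ON makes a record variable-length, the Python sketch below parses a record in which a count field decides how many times an item repeats. The record layout (a 2-digit count followed by 5-character items), the field names, and the function name `parse_record` are assumptions made for the example, not a layout taken from any real copybook or from the CFF stage itself.

```python
def parse_record(record, count_width=2, item_width=5):
    """Parse a fixed-width record whose item repeats `count` times.

    Assumed layout, loosely equivalent to:
        05 ITEM-COUNT PIC 9(2).
        05 ITEM       PIC X(5) OCCURS 0 TO 99 TIMES DEPENDING ON ITEM-COUNT.
    """
    count = int(record[:count_width])
    expected = count_width + count * item_width
    if len(record) < expected:
        raise ValueError(
            f"record too short: need {expected} characters, got {len(record)}"
        )
    items = []
    for i in range(count):
        start = count_width + i * item_width
        # Trailing spaces are padding in a fixed-width character field.
        items.append(record[start:start + item_width].rstrip())
    return items


if __name__ == "__main__":
    print(parse_record("03AAAAABBB  CCCCC"))
    print(parse_record("00"))
```

The point of the sketch is that the reader cannot know where the record ends until it has read the count field, which is what distinguishes OCCURS DEPENDING ON from a plain OCCURS with a fixed number of repetitions.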
CFF source stages run in parallel mode when they are used to read multiple files, but you can configure the stage to run sequentially if it is reading only one file with a single reader.
As a target, the CFF stage can have a single input link and a single reject link. You can write data to one or more complex flat files. You cannot write to MVS data sets or to files that contain multiple record types.