Nuts & Bolts of DataStage: File Stages : DataStage

Data set stage

The Data Set stage is a file stage. It allows you to read data from or write data to a data set. The stage can have a single input link or a single output link. It can be configured to execute in parallel or sequential mode.

What is a data set? Parallel jobs use data sets to manage data within a job. You can think of each link in a job as carrying a data set. The Data Set stage allows you to store data being operated on in a persistent form, which can then be used by other InfoSphere DataStage jobs. Data sets are operating system files, each referred to by a control file, which by convention has the suffix .ds. Using data sets wisely can be key to good performance in a set of linked jobs. You can also manage data sets independently of a job using the Data Set Management utility, available from the InfoSphere DataStage Designer or Director.

Sequential file stage

The Sequential File stage is a file stage. It allows you to read data from or write data to one or more flat files. The stage can have a single input link or a single output link, and a single rejects link.
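To make the idea of a rejects link concrete, here is a minimal Python sketch. It is not how DataStage implements the stage; it only illustrates the pattern of routing good rows to an output and bad rows to a rejects collection. The function name read_flat_file and the "wrong number of columns" rule are assumptions made for illustration.

```python
import csv
import io


def read_flat_file(lines, n_columns):
    """Split parsed rows into (output, rejects).

    Assumption for this sketch: a row is rejected when it does not have
    exactly n_columns fields. A real stage checks rows against its column
    definitions, which is a richer test than a field count.
    """
    output, rejects = [], []
    for row in csv.reader(lines):
        if len(row) == n_columns:
            output.append(row)
        else:
            rejects.append(row)
    return output, rejects


# An in-memory stand-in for a flat file; the second row is one field short.
sample = io.StringIO("1,alice,30\n2,bob\n3,carol,41\n")
rows, rejected = read_flat_file(sample, n_columns=3)
print("output :", rows)
print("rejects:", rejected)
```

The point of the pattern is that one bad row does not stop the read: it is set aside and can be inspected later.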

File set stage

The File Set stage is a file stage. It allows you to read data from or write data to a file set. The stage can have a single input link, a single output link, and a single rejects link. It only executes in parallel mode.

What is a file set? InfoSphere DataStage can generate and name exported files, write them to their destination, and list the files it has generated in a file whose extension is, by convention, .fs. The data files and the file that lists them are called a file set. This capability is useful because some operating systems impose a 2 GB limit on the size of a file and you need to distribute files among nodes to prevent overruns.
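The file set idea (several size-limited data files plus one file that lists them) can be sketched in a few lines of Python. This is only an illustration of the concept: the real .fs descriptor has its own format, and DataStage spreads the data files across nodes rather than one directory. The names write_file_set, read_file_set and the .partNNNN suffix are hypothetical.

```python
import os
import tempfile


def write_file_set(records, directory, name, max_bytes):
    """Write records to numbered data files capped at max_bytes each,
    then list those files in '<name>.fs' (a plain list in this sketch)."""
    paths = []
    out = None
    written = 0
    for record in records:
        data = (record + "\n").encode("utf-8")
        # Roll over to a new data file when this record would exceed the cap.
        # A single record larger than max_bytes still gets a file of its own.
        if out is None or written + len(data) > max_bytes:
            if out is not None:
                out.close()
            path = os.path.join(directory, f"{name}.part{len(paths):04d}")
            out = open(path, "wb")
            paths.append(path)
            written = 0
        out.write(data)
        written += len(data)
    if out is not None:
        out.close()
    descriptor = os.path.join(directory, f"{name}.fs")
    with open(descriptor, "w", encoding="utf-8") as fh:
        fh.write("\n".join(paths) + "\n")
    return descriptor


def read_file_set(descriptor):
    """Read every data file named in the descriptor, in order."""
    with open(descriptor, encoding="utf-8") as fh:
        paths = [line.strip() for line in fh if line.strip()]
    records = []
    for path in paths:
        with open(path, encoding="utf-8") as data:
            records.extend(line.rstrip("\n") for line in data)
    return records


with tempfile.TemporaryDirectory() as tmp:
    # Tiny cap so the rollover is visible; think "2 GB" in real life.
    desc = write_file_set([f"row-{i}" for i in range(10)], tmp, "export", max_bytes=20)
    part_count = sum(1 for f in os.listdir(tmp) if ".part" in f)
    round_trip = read_file_set(desc)
    print(part_count, "data files;", len(round_trip), "records read back")
```

A reader only needs the descriptor path; it never has to know how many data files were produced.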

Lookup file set stage

The Lookup File Set stage is a file stage. It allows you to create a lookup file set or reference one for a lookup. The stage can have a single input link or a single output link. The output link must be a reference link. The stage can be configured to execute in parallel or sequential mode when used with an input link.

When creating Lookup file sets, one file will be created for each partition. The individual files are referenced by a single descriptor file, which by convention has the suffix .fs.
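The "one file per partition, one descriptor for all" layout can be sketched as below. This is a conceptual sketch under stated assumptions, not the DataStage on-disk format: the hash partitioning on the key, the JSON files, and the names partition_of, create_lookup_file_set and lookup are all choices made for illustration.

```python
import json
import os
import tempfile
import zlib


def partition_of(key, n_partitions):
    # crc32 is stable between runs, unlike Python's built-in hash() on str.
    # Hash partitioning is an assumption of this sketch.
    return zlib.crc32(key.encode("utf-8")) % n_partitions


def create_lookup_file_set(pairs, directory, name, n_partitions):
    """Write one lookup file per partition and a '<name>.fs' descriptor."""
    tables = [dict() for _ in range(n_partitions)]
    for key, value in pairs:
        tables[partition_of(key, n_partitions)][key] = value
    paths = []
    for i, table in enumerate(tables):
        path = os.path.join(directory, f"{name}.p{i}.json")
        with open(path, "w", encoding="utf-8") as fh:
            json.dump(table, fh)
        paths.append(path)
    descriptor = os.path.join(directory, f"{name}.fs")
    with open(descriptor, "w", encoding="utf-8") as fh:
        json.dump(paths, fh)
    return descriptor


def lookup(descriptor, key):
    """Open only the partition file that can hold the key."""
    with open(descriptor, encoding="utf-8") as fh:
        paths = json.load(fh)
    with open(paths[partition_of(key, len(paths))], encoding="utf-8") as fh:
        return json.load(fh).get(key)


with tempfile.TemporaryDirectory() as tmp:
    pairs = [("DE", "Germany"), ("FR", "France"), ("JP", "Japan")]
    desc = create_lookup_file_set(pairs, tmp, "countries", n_partitions=3)
    n_files = len([f for f in os.listdir(tmp) if f.endswith(".json")])
    found = lookup(desc, "FR")
    missing = lookup(desc, "XX")
    print(n_files, found, missing)
```

Empty partitions still get a file here, so the number of files always equals the number of partitions.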

External source stage

The External Source stage is a file stage. It allows you to read data that is output from one or more source programs. The stage calls the program and passes appropriate arguments. The stage can have a single output link, and a single rejects link. It can be configured to execute in parallel or sequential mode.
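The "call a program, read what it prints" behaviour can be imitated in Python with the standard subprocess module. This is a rough sketch of the idea only, not of the stage itself; read_from_program is a hypothetical helper, and the source program here is just a one-line Python command so that the example runs anywhere.

```python
import subprocess
import sys


def read_from_program(argv, n_columns, delimiter=","):
    """Run a source program and split its stdout into (output, rejects).

    Assumption for this sketch: a line is rejected when it does not
    split into exactly n_columns fields.
    """
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    output, rejects = [], []
    for line in result.stdout.splitlines():
        fields = line.split(delimiter)
        if len(fields) == n_columns:
            output.append(fields)
        else:
            rejects.append(fields)
    return output, rejects


# Stand-in source program: prints two good rows and one malformed row.
source_program = [
    sys.executable,
    "-c",
    "print('1,alice'); print('broken'); print('2,bob')",
]
ext_rows, ext_rejects = read_from_program(source_program, n_columns=2)
print("output :", ext_rows)
print("rejects:", ext_rejects)
```

Passing check=True makes a failing source program raise an error instead of silently producing an empty result.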

External Target stage

The External Target stage is a file stage. It allows you to write data to one or more target programs. The stage can have a single input link and a single rejects link. It can be configured to execute in parallel or sequential mode. There is also an External Source stage, which allows you to read from an external program.

Complex Flat File stage

The Complex Flat File (CFF) stage is a file stage. You can use the stage to read a file or write to a file, but you cannot use the same stage to do both.

As a source, the CFF stage can have multiple output links and a single reject link. You can read data from one or more complex flat files, including MVS™ data sets with QSAM and VSAM files. You can also read data from files that contain multiple record types. The source data can contain one or more of the following clauses:

· GROUP

· REDEFINES

· OCCURS

· OCCURS DEPENDING ON
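For readers who have not met these COBOL clauses, the sketch below shows what OCCURS DEPENDING ON means when a record is parsed: a counter field earlier in the record says how many times a repeating group follows. The record layout, field names and the parse_order function are invented for illustration, and the record is plain text; real mainframe files are usually EBCDIC and often hold packed decimals, which this sketch does not handle.

```python
# Hypothetical copybook assumed by this sketch:
#   01 ORDER-REC.
#      05 ORDER-ID    PIC X(4).
#      05 ITEM-COUNT  PIC 9(2).
#      05 ITEM OCCURS 0 TO 10 TIMES DEPENDING ON ITEM-COUNT.
#         10 SKU      PIC X(3).
#         10 QTY      PIC 9(2).

ITEM_WIDTH = 5  # SKU (3 characters) + QTY (2 digits)


def parse_order(record):
    """Parse one variable-length record laid out as in the copybook above."""
    order_id = record[0:4]
    item_count = int(record[4:6])  # the DEPENDING ON field
    items = []
    offset = 6
    for _ in range(item_count):
        sku = record[offset:offset + 3]
        qty = int(record[offset + 3:offset + ITEM_WIDTH])
        items.append({"sku": sku, "qty": qty})  # the ITEM group becomes a dict
        offset += ITEM_WIDTH
    if offset != len(record):
        raise ValueError("record length does not match ITEM-COUNT")
    return {"order_id": order_id, "items": items}


parsed = parse_order("A00102ABC05XYZ12")
print(parsed)
```

Because the record length depends on ITEM-COUNT, each record must be walked field by field; you cannot slice at fixed offsets past the counter.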

CFF source stages run in parallel mode when they are used to read multiple files, but you can configure the stage to run sequentially if it is reading only one file with a single reader.

As a target, the CFF stage can have a single input link and a single reject link. You can write data to one or more complex flat files. You cannot write to MVS data sets or to files that contain multiple record types.

njoy the simplicity.......
Atul Singh

victimizeit.blogspot.com

Friday, June 29, 2012
