In DataStage, schemas are an alternative way to specify column definitions for the data used by parallel jobs. With a schema file, you do not need to define the columns or the data format in the stage itself when reading a file. We use this method when we need a generic job, for example one that reads multiple files that each have different metadata.
Here, I am going to show you how to read a sequential file with the help of a schema file.
A. Design
I am using the design below to demonstrate this functionality. I read the data from one sequential file and write it to another file with a different format.
B. Seq File Stage Properties
In the source stage, I have given the file name as usual. There is one more property available under Options called 'Schema File' (click Options and look in the 'Available properties to add' box). I have set it to the file that contains the schema for the input file.
Input File - /home/atul/ldata
Schema File - /home/atul/schema
The schema for my input file looks like this:
record
{final_delim=end, delim=',', quote=double}
(
No:string[max=2] ;
Name:string[max=15] ;
)
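To make the schema concrete, here is a minimal Python sketch of what it describes: comma-delimited fields, double quotes around values, and two string columns with maximum lengths of 2 and 15. This is only an analogy using Python's csv module, not how DataStage works internally. The sample data is hypothetical (the contents of /home/atul/ldata are not shown here), and the length check is my own illustration; DataStage's handling of overlong values may differ.

```python
import csv
import io

# Field names and maximum lengths copied from the schema above.
FIELDS = [("No", 2), ("Name", 15)]


def parse_records(text):
    """Parse comma-delimited, double-quoted text into dicts, checking max lengths."""
    reader = csv.reader(io.StringIO(text), delimiter=",", quotechar='"')
    records = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} fields, got {len(row)}: {row}")
        record = {}
        for (name, max_len), value in zip(FIELDS, row):
            # Illustrative check only; not a statement about DataStage behaviour.
            if len(value) > max_len:
                raise ValueError(f"{name} exceeds max length {max_len}: {value!r}")
            record[name] = value
        records.append(record)
    return records


# Hypothetical sample data, not the real input file.
sample = '"1","Atul"\n"2","Singh, A"\n'
rows = parse_records(sample)
print(rows)
```

Note that the comma inside "Singh, A" stays part of the value because the field is wrapped in double quotes, which is what quote=double is for.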
Now we do not need to set up the Format and Columns tabs, as these are already covered by the schema file.
Secondly, we have to enable RCP (Runtime Column Propagation) for this stage. To do so, go to the Columns tab and tick the RCP check box at the bottom.
C. Output Seq File Properties
Now set up the output file where we collect the data. This works as usual for a Seq File stage: give the file name and the format details. The Columns tab still has no columns defined, because RCP is enabled on the source, so leave the column definitions blank.
Save, Compile and Run the job now :-)
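Since the job itself holds no column definitions, reusing it for another file mostly comes down to supplying a different schema file. As a rough sketch, a small helper like the one below could generate schema text in the same layout as the example in section B. The function name build_schema and its parameters are my own invention for illustration and are not part of DataStage.

```python
def build_schema(columns, delim=",", quote="double"):
    """Return schema text in the same layout as the example schema file.

    columns is a list of (name, type) pairs, e.g. ("No", "string[max=2]").
    This helper is hypothetical; it only formats text.
    """
    lines = [
        "record",
        f"{{final_delim=end, delim='{delim}', quote={quote}}}",
        "(",
    ]
    for name, col_type in columns:
        lines.append(f"{name}:{col_type} ;")
    lines.append(")")
    return "\n".join(lines) + "\n"


# Rebuild the schema used in this post.
schema_text = build_schema([("No", "string[max=2]"), ("Name", "string[max=15]")])
print(schema_text)
```

The returned text can then be written to whatever path the 'Schema File' property points at, for example with open(path, "w").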