We have moved to www.dataGenX.net. Keep learning with us.

Friday, February 28, 2014

Check whether DataStage Job is Multi-Instance or not with Script

The following script can be used to check whether a given DataStage job is multi-instance or not.

Arguments to the script:

Arg1: DataStage Project Name

Wednesday, February 26, 2014

Get DataStage Job Information without using Director

With the help of this script, you can get the job number, category and other information without opening DataStage Director.

This script needs two arguments:
1. DataStage Project Name
2. DataStage Job Name
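
As a minimal sketch of the same idea, the two arguments can be passed to the standard `dsjob -jobinfo` command. This assumes the `dsjob` client is on the PATH and is not necessarily what the script in this post does; `job_info` and the `DSJOB` override variable are hypothetical names used only for illustration. `dsjob -jobinfo` reports status and run details, so it does not cover the job number or category.

```shell
#!/bin/sh
# Sketch: print job details with the dsjob client instead of opening Director.
# DSJOB is a hypothetical override for the location of the dsjob binary.
DSJOB="${DSJOB:-dsjob}"

job_info() {
    # Both arguments (project name and job name) are required.
    if [ "$#" -ne 2 ]; then
        echo "usage: job_info <project> <job>" >&2
        return 2
    fi
    "$DSJOB" -jobinfo "$1" "$2"
}
```

It would be called as `job_info MyProject MyJob`, where both names are placeholders.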

Script :

Monday, February 24, 2014

How to find the patch History of IBM Infosphere DataStage - Version.xml

From version 8.0.1 onwards, IBM InfoSphere DataStage maintains its version and patch history in the Version.xml file. This file contains all the information about the product version, the installed product modules, installation dates, etc.

The Version.xml file is located in the $ISHOME path, so:

a) cd $ISHOME
b) If this does not work, try the steps below to get the IS home directory
c) Execute the command below to get ..

Sunday, February 23, 2014

DataWareHouse (ETL) Designing Steps

A) Planning and Designing Steps

i) Requirement and Realities Gathering

1. Business Needs
2. Data Profiling and other Data source realities
3. Compliance Requirement
4. Security Requirement
5. Data Integration
6. Data Latency
7. Archiving and Lineage
8. End user delivery interfaces
9. Available Development Skills
10. Available management Skills
11. Legacy licenses

Thursday, February 20, 2014

Dimension Table and Its Type in Data WareHouse

A dimension is a structure that categorizes data in order to enable users to answer business questions. It contains attributes that describe fact records in the fact table. Some of these attributes provide descriptive information; others are used to specify how fact table data should be summarized to provide useful information to the analyst. Dimension tables contain hierarchies of attributes that aid in summarization. Calculations on the fact table are performed through dimensions.

Dimension table fields

Dimension tables should have at least the fields listed below, and should also contain the fields used to group data during the database inquiry process.

There are three types of fields:

Monday, February 17, 2014

Datastage Coding Checklist

  1. Ensure that null handling properties are taken care of for all nullable fields. Do not set the null field value to a value that may be present in the source.
  2. Ensure that all character fields are trimmed before any processing. Extra spaces in the data can lead to errors, such as lookup mismatches, that are hard to detect.
  3. Always save the metadata (for source, target or lookup definitions) in the repository to ensure reusability and consistency.

Friday, February 14, 2014

List of strong points for InterView :-)

Most of the time we get stuck when the interviewer asks about our strong points. There are a lot of reasons behind this ;-) Anyway, here are some words that can help you decide. Pick the ones that suit you, but know them ;-) before using them.

• A activating, adapting, administering, analyzing information, arranging, advising

• B budgeting, building teams, briefing, balancing

• C communicating, controlling, co-ordinating, creating, checking, counseling, compiling, coaching

• D deciding, detailing, developing people, directing, devising, discovering, data input

Thursday, February 13, 2014

DataStage Parallel job: Retrieve sql codes on a failed upsert

When an enterprise database stage such as DB2 or Oracle is set to upsert, it is possible to create a reject link to trap rows that fail any update or insert statement. By default, this reject link holds just the columns written to the stage; it does not show any columns indicating why the row was rejected, and often no warnings or error messages appear in the job log.

Wednesday, February 12, 2014

Interview Questions : DataWareHouse - Part 4

What are the types of Synonyms?
There are two types of synonyms: private and public.

What is a Redo Log?
The set of redo log files for a database is collectively known as the database's redo log.

What is an Index Segment?
Each Index has an Index segment that stores all of its data.

Explain the relationship among Database, Tablespace and Data file?
Each database is logically divided into one or more tablespaces. One or more data files are explicitly created for each tablespace.

Tuesday, February 11, 2014

DataStage Error : Loop final value not numeric - cannot execute it

Generally, you will face this error when you read values from a file or from Unix commands and use them in a loop (either a Transformer loop or a Sequencer loop).
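
A frequent cause, assumed here, is stray whitespace or a trailing newline or carriage return in the value picked up from the file or command. The sketch below shows one way to clean and validate such a value before it is handed to the loop; `clean_loop_value` is a hypothetical helper name, and it accepts only non-negative whole numbers.

```shell
# Print a cleaned loop limit, or fail if it is not a whole number.
clean_loop_value() {
    # Strip spaces, tabs, carriage returns and newlines picked up
    # from files or command output.
    value=$(printf '%s' "$1" | tr -d ' \t\r\n')
    case "$value" in
        ''|*[!0-9]*)
            echo "not numeric: '$1'" >&2
            return 1
            ;;
        *)
            printf '%s\n' "$value"
            ;;
    esac
}
```

For example, a count produced by `wc -l` on some platforms carries leading spaces, which this helper removes before the value reaches the job.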

Solution : 


Friday, February 07, 2014

How to find Agents Ports in IBM InfoSphere Server - Version.xml

The list of installed products can be obtained from the Version.xml file, which is located in the ISHOME directory on the client, engine and domain machines where InfoSphere Information Server is installed.

Default locations
    UNIX: /opt/IBM/InformationServer
    WINDOWS: C:\IBM\InformationServer
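
On UNIX, the lookup described above can be sketched as a small shell function that checks `$ISHOME` first and then falls back to the default location. `find_version_xml` is a hypothetical name, and the sketch assumes `ISHOME`, when set, points at the Information Server install directory.

```shell
# Print the path of Version.xml, checking $ISHOME first and then
# the default UNIX install location.
find_version_xml() {
    for dir in "${ISHOME:-}" /opt/IBM/InformationServer; do
        if [ -n "$dir" ] && [ -f "$dir/Version.xml" ]; then
            printf '%s\n' "$dir/Version.xml"
            return 0
        fi
    done
    echo "Version.xml not found" >&2
    return 1
}
```

With `ISHOME` unset on a default UNIX install, the function falls back to /opt/IBM/InformationServer.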

Below is an example listing of the Version.xml file:

Wednesday, February 05, 2014

How to find the product installed with IBM Infosphere Server - Version.xml

We can get the installed product details from the Version.xml file. This file contains all the information about the server installation, version, etc.

The Version.xml file is located in the $ISHOME path, so:

 a) cd $ISHOME
 b) If this does not work, try the steps below to get the IS home directory

 Execute the command below to get ..
 cd `ps -ef| grep dsrpc | grep -v grep | awk '{print $NF}' | sed 's/.\{26\}$//'`
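
The `sed` expression in the pipeline above removes the last 26 characters of the process path, which is exactly the length of `Server/DSEngine/bin/dsrpcd`. As a hedged alternative, the sketch below strips that suffix by name rather than by character count, so an unexpected path is reported instead of being silently truncated. It assumes the usual `<ISHOME>/Server/DSEngine/bin/dsrpcd` layout; `ishome_from_dsrpcd` is a hypothetical helper name.

```shell
# Derive the Information Server home from the full path of the dsrpcd binary.
# Assumes the usual <ISHOME>/Server/DSEngine/bin/dsrpcd layout.
ishome_from_dsrpcd() {
    suffix=/Server/DSEngine/bin/dsrpcd
    case "$1" in
        *"$suffix")
            # Remove the known suffix, leaving the install directory.
            printf '%s\n' "${1%"$suffix"}"
            ;;
        *)
            echo "unexpected dsrpcd path: $1" >&2
            return 1
            ;;
    esac
}
```

Unlike the `sed` version, the result has no trailing slash.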

Tuesday, February 04, 2014

DataStage Scenario - Design2 - job2

DataStage Scenario Problem -->  DataStage Scenario - Problem2

Solution Design :

a) Job Design :
In the job design, we use the Copy, Aggregator, Filter and Join stages to get the output.

Monday, February 03, 2014

DataStage Scenario - Design 2 - job1

DataStage Scenario Problem -->  DataStage Scenario - Problem2

Solution Design :

a) Job Design :

Below is the design that achieves the required output. Here, we read a sequential file as input, and the data then passes through the Aggregator and Filter stages to produce the output.

DataStage Scenario - Problem16

1st I/P file 
OV ,1 ,RE 
VG ,2 ,RE 
WU ,3, RE 

2nd I/P file 
OV ,4, CX 
VG ,5, CX 
WU ,6, CX