Nuts & Bolts of DataStage: rows

Monday, May 25, 2015

Execution Steps in Transformer Stage - Explanation

You can access Part1 Here - Execution Steps in Transformer Stage

Certain constructs are inefficient if they are included in output column derivations, because they are evaluated once for every output column that uses them. The following examples describe these constructs:

The same part of an expression is used in multiple column derivations.

For example, if you want to use the same substring of an input column in multiple columns on output links, you might use the following test in a number of output column derivations:

IF (DSLINK1.col1[1,3] = "001") THEN ...

In this case, the substring DSLINK1.col1[1,3] is evaluated again for each column that uses it. The evaluation can be made more efficient by moving the substring calculation into a stage variable. The substring is then evaluated only once for every input row. This example has this stage variable definition for StageVar1:
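The effect of moving a shared sub-expression into a stage variable can be sketched outside DataStage. The Python snippet below is only an analogy, not Transformer derivation syntax: `prefix`, the sample rows, and the three output columns are hypothetical, and the call counter exists purely to make the difference visible.

```python
# Analogy only: a Python stand-in for a Transformer stage, not DataStage syntax.
calls = {"n": 0}


def prefix(value):
    """Hypothetical stand-in for the substring DSLINK1.col1[1,3]."""
    calls["n"] += 1
    return value[:3]


rows = ["001-A", "002-B", "001-C"]  # made-up values for col1


def derive_inline(col1):
    # Every output column repeats the substring evaluation.
    out1 = "X" if prefix(col1) == "001" else "Y"
    out2 = 1 if prefix(col1) == "001" else 0
    out3 = prefix(col1) == "001"
    return out1, out2, out3


def derive_with_stage_var(col1):
    # The substring is evaluated once per row, as a stage variable would be.
    stage_var1 = prefix(col1)
    out1 = "X" if stage_var1 == "001" else "Y"
    out2 = 1 if stage_var1 == "001" else 0
    out3 = stage_var1 == "001"
    return out1, out2, out3


calls["n"] = 0
inline_results = [derive_inline(r) for r in rows]
inline_calls = calls["n"]

calls["n"] = 0
staged_results = [derive_with_stage_var(r) for r in rows]
staged_calls = calls["n"]

print(inline_calls, staged_calls)
```

Both versions produce the same output columns, but the inline version evaluates the substring once per output column per row, while the stage-variable version evaluates it once per row.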

Execution Steps in Transformer Stage

I have been asked this question many times, both in interviews and by other practitioners: what are the data processing steps when DataStage processes a Transformer stage? I have tried to compile them here. Have a look -

To write efficient Transformer stage derivations, it helps to understand what items get evaluated and when.

Delete Duplicate Rows in DB2 Database

This question is asked often and in many places :-) How do you delete duplicate rows from a table in different databases? Here, we will see how to do this in DB2.

DataStage Scenario - Problem17

Goal : Count the data in each column

Input :

col1	col2	col3
a	{NULL}	b
f	k	{NULL}
h	{NULL}	n
i	d	{NULL}
{NULL}	s	{NULL}
g	u	m
l	x	o
m	{NULL}	{NULL}
c	d	z
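The excerpt does not show the expected output, so the sketch below assumes that "count the data" means counting the non-NULL values in each column. It is plain Python used to check the expected numbers, not a DataStage job design; `{NULL}` is represented as `None`.

```python
# Assumption: "count the data" = number of non-NULL values per column.
NULL = None

rows = [
    ("a", NULL, "b"),
    ("f", "k", NULL),
    ("h", NULL, "n"),
    ("i", "d", NULL),
    (NULL, "s", NULL),
    ("g", "u", "m"),
    ("l", "x", "o"),
    ("m", NULL, NULL),
    ("c", "d", "z"),
]
columns = ("col1", "col2", "col3")

# Count the rows where the value at each column position is not NULL.
counts = {
    name: sum(1 for row in rows if row[i] is not None)
    for i, name in enumerate(columns)
}
print(counts)
```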

DataStage Scenario - Problem15

Goal : Get the previous row's column value in the current row

Input file :

Sq, No
1,1000
2,2200
3,3030
4,5600

DataStage Scenario - Problem14

Goal : get below outputs

Input : 
dept, emp
----------------------------
20,            R
10,            A
10,            D
20,            P
10,            B
10,            C
20,            Q
20,            S

DataStage Scenario - Problem13

Goal : Repeat each input row as many times as its value

Here we have to repeat each input row in the output file as many times as its value indicates. For example, if the value is 2, write 2 two times; if it is 4, write 4 four times.

Input :
2
3
4
1
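As a quick check of the expected result, here is a small Python sketch of the rule described above (the value n is written n times, in input order). It illustrates the logic only and is not a DataStage job design; `repeat_rows` is a hypothetical helper name.

```python
values = [2, 3, 4, 1]


def repeat_rows(values):
    """Emit each value as many times as the value itself."""
    return [v for v in values for _ in range(v)]


output = repeat_rows(values)
print(output)
```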

DataStage Scenario - Problem12

Goal : Add a comment line with each row of input file

Input1 :

Name, Dept, Salary
CHRISTINE,A00,152750.00
MICHAEL,B01,94250.00
SALLY,C01,98250.00
JOHN,E01,80175.00
IRVING,D11,72250.00
EVA,D21,96170.00
EILEEN,E11,89750.00
THEODORE,E21,86150.00

DataStage Scenario - Problem11

Which stages are needed to achieve the output below?

Input :

col1,col2
1,1
2,rajesh
3,15000
4,2
5,suresh
6,16000
7,3
8,veeru
9,17000

Tail Stage in DataStage

The Tail stage is another stage from the Development/Debug category. It can have a single input link and a single output link.
The Tail Stage selects the last N records from each partition of an input data set and copies the selected records to an output data set.
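The per-partition behaviour is the part that often surprises people, so here is a minimal Python sketch of it. The partitions are modelled as plain lists and `tail_per_partition` is a hypothetical helper, not a DataStage API.

```python
def tail_per_partition(partitions, n):
    """Keep the last n records of every partition independently."""
    # part[-0:] would return the whole list, so n <= 0 is handled explicitly.
    return [list(part[-n:]) if n > 0 else [] for part in partitions]


# Three made-up partitions of an input data set.
partitions = [[1, 2, 3, 4, 5], [6, 7], [8, 9, 10]]
tails = tail_per_partition(partitions, 2)
print(tails)
```

With three partitions and N = 2, the output can therefore hold up to six records, not two.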

a) Job Design :

Head Stage in DataStage

Welcome to the Basic Intro with Stage series. We are going to look into the Head stage (Development/Debug category). It can have a single input link and a single output link.

The Head Stage selects the first N rows from each partition of an input data set and copies the selected rows to an output data set. You determine which rows are copied by setting properties which allow you to specify:

The number of rows to copy
The partition from which the rows are copied
The location of the rows to copy
The number of rows to skip before the copying operation begins.
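The properties listed above can be pictured with a small Python sketch. It is an illustration only: the partitions are plain lists, and the function and parameter names (`head_per_partition`, `n`, `skip`, `only_partitions`) are hypothetical stand-ins for the stage properties, not their real names.

```python
from itertools import islice


def head_per_partition(partitions, n, skip=0, only_partitions=None):
    """Copy the first n rows of each selected partition, after skipping `skip` rows."""
    result = {}
    for index, part in enumerate(partitions):
        if only_partitions is not None and index not in only_partitions:
            continue  # this partition was not selected
        result[index] = list(islice(part, skip, skip + n))
    return result


# Three made-up partitions of an input data set.
partitions = [[1, 2, 3, 4, 5], [6, 7], [8, 9, 10]]

all_heads = head_per_partition(partitions, 2)
some_heads = head_per_partition(partitions, 2, skip=1, only_partitions={0, 2})
print(all_heads)
print(some_heads)
```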

Convert a single row into multiple rows ( horizontally pivoting ) with Pivot stage ?

In this example, the Pivot Enterprise stage is set up to horizontally pivot some data.

You can generate a pivot index that will assign an index number to each row within sets of horizontally pivoted data. The following tables provide examples of data before and after a horizontal pivot operation.

Input Data
REPID,last_name,Jan_sales,Feb_sales,Mar_sales
100,Smith,1234.08,1456.80,1578.00
101,Yamada,1245.20,1765.00,1934.22
102,Xing,2190.89,1287.98,2054.55
103,Anderson,1498.09,1287.23,3298.76
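The "after" table is not shown here, so the following Python sketch only illustrates what a horizontal pivot with a pivot index does to this input: the three monthly sales columns become three rows per sales rep. The output column names `Q1sales` and `Pivot_index`, and the choice to start the index at 0, are assumptions made for the example.

```python
import csv
import io

data = "\n".join([
    "REPID,last_name,Jan_sales,Feb_sales,Mar_sales",
    "100,Smith,1234.08,1456.80,1578.00",
    "101,Yamada,1245.20,1765.00,1934.22",
    "102,Xing,2190.89,1287.98,2054.55",
    "103,Anderson,1498.09,1287.23,3298.76",
])


def horizontal_pivot(records, key_cols, pivot_cols,
                     out_col="Q1sales", index_col="Pivot_index"):
    """Turn each pivoted column of a record into its own output row."""
    for rec in records:
        for i, col in enumerate(pivot_cols):
            row = {k: rec[k] for k in key_cols}  # keys repeat on every row
            row[out_col] = rec[col]
            row[index_col] = i  # position within the pivoted set
            yield row


records = list(csv.DictReader(io.StringIO(data)))
pivoted = list(horizontal_pivot(
    records,
    key_cols=["REPID", "last_name"],
    pivot_cols=["Jan_sales", "Feb_sales", "Mar_sales"],
))
for row in pivoted[:3]:
    print(row)
```

Four input rows with three pivoted columns give twelve output rows, and the pivot index tells you which of the original columns each value came from.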

Pivot stage made easy

Many people have the following misconceptions about the Pivot stage.
1) It converts rows into columns
2) By using a pivot stage, we can convert 10 rows into 100 columns and 100 columns into 10 rows
3) You can add more points here!!

How to find duplicate values in a table?

With the SQL statement below you can find duplicate values in any table. Just replace tablefield with the column you want to search and table with the name of the table you need to search.
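One common way to write such a query is GROUP BY with HAVING COUNT(*) > 1. The sketch below runs that pattern against an in-memory SQLite database from Python so it can be tried without a server; the table name `sample_table`, its contents, and the use of SQLite are assumptions for the demo, and the post's own statement may differ in detail.

```python
import sqlite3

# In-memory SQLite database with a made-up table and column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample_table (tablefield TEXT)")
conn.executemany(
    "INSERT INTO sample_table (tablefield) VALUES (?)",
    [("a",), ("b",), ("a",), ("c",), ("b",), ("a",)],
)

# Group by the column and keep only the values that occur more than once.
duplicates = conn.execute(
    """
    SELECT tablefield, COUNT(*) AS occurrences
    FROM sample_table
    GROUP BY tablefield
    HAVING COUNT(*) > 1
    ORDER BY tablefield
    """
).fetchall()
conn.close()

print(duplicates)
```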

Monday, May 25, 2015

Execution Steps in Transformer Stage - Explanation

Thursday, May 21, 2015

Execution Steps in Transformer Stage

Tuesday, December 30, 2014

Delete Duplicate Rows in DB2 Database

Monday, March 10, 2014

DataStage Scenario - Problem17

Thursday, January 30, 2014

DataStage Scenario - Problem15

DataStage Scenario - Problem14

Wednesday, January 29, 2014

DataStage Scenario - Problem13

Monday, January 27, 2014

DataStage Scenario - Problem12

Sunday, January 26, 2014

DataStage Scenario - Problem11

Thursday, November 28, 2013

Tail Stage in DataStage

Wednesday, November 27, 2013

Head Stage in DataStage

Thursday, July 11, 2013

Convert a single row into multiple rows ( horizontally pivoting ) with Pivot stage ?

Tuesday, January 29, 2013

Pivot stage made easy

Wednesday, December 26, 2012

How to find duplicate values in a table?