We have moved to www.dataGenX.net, Keep Learning with us.
Showing posts with label source. Show all posts

Wednesday, May 27, 2015

Data sources in DataStage


IIS DataStage connectivity options give us wide scope to connect to different sources and targets. It supports RDBMS, ERP, z/OS databases, OLAP systems, and many more.

The data sources listed below are available in IIS v11.3:


Wednesday, June 04, 2014

Surrogate Key Generator - Generate Surrogate Key for Data


In this post, we will see how to generate a surrogate key for data using the Surrogate Key Generator stage.

A) Design:
The design below is a demo job. Here our data source is a Row Generator stage, which generates the rows. In a real-world scenario, the source can be a flat file, a database stage, a passive stage, or an active stage.
In the Row Generator stage, we generate one column, "Name".
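
Outside DataStage, the same idea can be sketched in a few lines of Python. This is only an illustration of what a surrogate key is, not how the Surrogate Key Generator stage works internally; the column name SK_ID, the start value, and the sample names are assumptions made for the example.

```python
from itertools import count


def add_surrogate_keys(rows, start=1, key_column="SK_ID"):
    """Return copies of the rows with a sequential surrogate key added.

    key_column and start are hypothetical defaults chosen for this sketch.
    """
    next_key = count(start)
    return [{key_column: next(next_key), **row} for row in rows]


# Sample rows standing in for the output of the Row Generator ("Name" column).
names = [{"Name": "Alice"}, {"Name": "Bob"}, {"Name": "Carol"}]
keyed = add_surrogate_keys(names)

for row in keyed:
    print(row)
```

The key carries no business meaning; it only has to be unique and stable for each row.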



Sunday, May 04, 2014

ETL Testing : Approach


Testing is undoubtedly an essential part of the DW life cycle, but it has received little attention compared with the other design phases.

DW testing specification:
 – Software testing is predominantly focused on program code, while DW testing is directed at data and information.
 – DW testing focuses on the correctness and usefulness of the information delivered to users.
 – Unlike generic software systems, DW testing involves a huge data volume, which significantly impacts performance and productivity.
 – DW systems are aimed at supporting any view of the data, so the number of possible use scenarios is practically infinite, and only a few of them are known from the beginning.
 – It is almost impossible to predict all the possible types of errors that will be encountered in real operational data.


Wednesday, December 11, 2013

DataStage Scenario - Problem6


Goal : Get the count of Vowels in Columns

Input :

Akash Aggrawal
Priya Awasthi  
Anil chahal    
Diya Singh    
Kashish Patel 
Sunil Verma    
Rashid Patel    
Rashmi Arya   
Gopal Joshi     
Neha Tomar    
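
For reference, the expected counts can be checked outside DataStage with a short Python sketch. This assumes the goal is one vowel count per input row, counts both upper- and lower-case vowels, and ignores trailing spaces; it is not the DataStage solution itself.

```python
VOWELS = set("aeiouAEIOU")


def count_vowels(text):
    """Count the vowels (a, e, i, o, u in either case) in a string."""
    return sum(1 for ch in text if ch in VOWELS)


names = [
    "Akash Aggrawal", "Priya Awasthi", "Anil chahal", "Diya Singh",
    "Kashish Patel", "Sunil Verma", "Rashid Patel", "Rashmi Arya",
    "Gopal Joshi", "Neha Tomar",
]

# Map each cleaned name to its vowel count.
counts = {name.strip(): count_vowels(name) for name in names}

for name, total in counts.items():
    print(f"{name}: {total}")
```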

Friday, November 22, 2013

ETL Job Design Standards - 2



Part 1 --> ETL Job Design Standards - 1



Parameter Management Standards
This section defines standards for managing job parameters across environments. Jobs should use parameters liberally to avoid hard-coding as much as possible. Some categories of parameters include:

  • Environmental parameters, such as directory names, file names, etc.
  • Database connection parameters
  • Notification email addresses
  • Processing options, such as degree of parallelism
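
The principle of keeping such values out of the job itself can be illustrated with a small Python sketch. It is not DataStage syntax; the variable names (ETL_SOURCE_DIR, ETL_DB_CONNECTION, ETL_NOTIFY_EMAIL, ETL_PARALLELISM) and the default values are hypothetical and only mirror the four categories listed above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class JobParameters:
    source_dir: str      # environmental parameter
    db_connection: str   # database connection parameter
    notify_email: str    # notification address
    parallelism: int     # processing option


def load_parameters(env=None):
    """Read job parameters from an environment mapping instead of hard-coding.

    All variable names and defaults here are assumptions for illustration.
    """
    env = os.environ if env is None else env
    return JobParameters(
        source_dir=env.get("ETL_SOURCE_DIR", "/tmp/etl/in"),
        db_connection=env.get("ETL_DB_CONNECTION", "dev_db"),
        notify_email=env.get("ETL_NOTIFY_EMAIL", "etl-dev@example.com"),
        parallelism=int(env.get("ETL_PARALLELISM", "4")),
    )


# The same code runs unchanged in each environment; only the values differ.
params = load_parameters({"ETL_DB_CONNECTION": "prod_db", "ETL_PARALLELISM": "8"})
print(params)
```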

Thursday, October 17, 2013

How to install Linux / UNIX *.tar.gz tarball files


A tar.gz file, also known as a tarball, is an archive format for electronic data and software. Most Linux tarballs contain the source code for a piece of software. If you are new to Linux, I recommend using the apt-get, rpm, and yum commands to install binary packages instead.

A tarball is a group of files packed into one file. Tarball files have the extension .tar.gz, .tgz, or .tar.bz2. Most open source software uses tarballs to distribute programs and source code.
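
As a minimal sketch of the unpacking step, Python's standard tarfile module can open any of these formats. The function name extract_tarball is made up for this example, and building or installing the unpacked software is a separate step that is not shown.

```python
import tarfile
from pathlib import Path


def extract_tarball(archive, dest):
    """Extract a .tar.gz, .tgz or .tar.bz2 archive into dest.

    Returns the list of member names. Members that would land outside
    dest are rejected before anything is extracted.
    """
    dest_path = Path(dest).resolve()
    dest_path.mkdir(parents=True, exist_ok=True)
    # "r:*" lets tarfile detect the compression (gzip, bzip2, ...) itself.
    with tarfile.open(archive, "r:*") as tar:
        for member in tar.getmembers():
            target = (dest_path / member.name).resolve()
            if target != dest_path and dest_path not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest_path)
        return tar.getnames()
```

A call such as extract_tarball("package.tar.gz", "build") unpacks the archive into a build directory, where package.tar.gz is a placeholder file name.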

Thursday, August 22, 2013

How to split source column into multiple target columns ( full name to first and Last)


Approach:

CREATE SET TABLE test12 (
fullname varchar(30)
);


INSERT INTO test12 ('nitin raj');
INSERT INTO test12 ('nitin agarwal');
INSERT INTO test12 ('abhishek gupta');
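
The intended result for the sample rows can be sketched in Python. This is an illustration of the split only, not the SQL approach of the post, and it assumes the first space separates the first name from the last name.

```python
def split_full_name(fullname):
    """Split a full name at the first space into (first, last)."""
    first, _, last = fullname.strip().partition(" ")
    return first, last.strip()


rows = ["nitin raj", "nitin agarwal", "abhishek gupta"]

for fullname in rows:
    first_name, last_name = split_full_name(fullname)
    print(f"{fullname!r} -> first={first_name!r}, last={last_name!r}")
```

A name without a space yields an empty last name, so single-word values do not raise an error.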

Tuesday, August 20, 2013

Oracle Interview Questions - Part-3


51. What is a database instance? Explain.
A database instance (server) is a set of memory structures and background processes that access a set of database files. The processes can be shared by all users. The memory structures store the most frequently queried data from the database, which helps improve database performance by reducing the amount of I/O performed against the data files.

52. What is Parallel Server?
Multiple instances accessing the same database (only in multi-CPU environments)

Friday, May 24, 2013

Issuing commands to a Queue Manager (runmqsc)



Once we have created a Queue Manager, we will want to perform administrative tasks, such as creating queues. To communicate with our Queue Manager, we use the runmqsc command, which opens the MQSC (MQ Script Commands) environment.
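
Besides interactive use, runmqsc also reads MQSC commands from standard input, so a script can be prepared and piped in. The Python sketch below builds such a script with the DEFINE, DISPLAY, and END commands. The helper names, the queue name DEMO.QUEUE, and the queue manager name QM1 are assumptions for illustration, and run_mqsc only works on a machine where WebSphere MQ is installed.

```python
import subprocess


def build_mqsc_script(queue_names):
    """Build an MQSC script that defines and then displays local queues."""
    lines = []
    for name in queue_names:
        lines.append(f"DEFINE QLOCAL('{name}') REPLACE")
        lines.append(f"DISPLAY QLOCAL('{name}')")
    lines.append("END")
    return "\n".join(lines) + "\n"


def run_mqsc(queue_manager, script):
    """Pipe the script to runmqsc for the given queue manager.

    Requires the runmqsc binary on PATH; not called in this sketch.
    """
    return subprocess.run(
        ["runmqsc", queue_manager],
        input=script,
        text=True,
        capture_output=True,
        check=False,
    )


script = build_mqsc_script(["DEMO.QUEUE"])
print(script)
# On a host with WebSphere MQ installed: result = run_mqsc("QM1", script)
```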

After entering the MQSC environment, we can issue one of the following MQSC commands: ALTER, CLEAR, DEFINE, DELETE, DISPLAY, END, PING, REFRESH, RESET, RESOLVE, RESUME, START, STOP, or SUSPEND. Each of these commands has its own options, which are shown in the following table:


Wednesday, April 17, 2013

What is MQ Stage ???

 

The WebSphere MQ stage is a passive stage that offers a message-based solution to customers for whom messaging represents another form of source and target data. The WebSphere MQ stage lets WebSphere DataStage read from and write to WebSphere MQ message queues. You can use this stage as:
  • An intermediary between applications, transforming messages as they are sent between programs
  • A conduit for the transmission of legacy data to a message queue
  • A message queue reader for transmission to a non-messaging target