Something about DataStage, DataStage Administration, Job Designing,Developing, DataStage troubleshooting, DataStage Installation & Configuration, ETL, DataWareHousing, DB2, Teradata, Oracle and Scripting.
Wednesday, February 12, 2014
Interview Questions : DataWareHouse - Part 4
What are the types of Synonyms?
There are two types of Synonyms Private and Public
What is a Redo Log?
The set of Redo Log files YSDATE, UID, USER or USERENV SQL functions, or the pseudo columns LEVEL or ROWNUM.
What is an Index Segment?
Each Index has an Index segment that stores all of its data.
Explain the relationship among Database, Table space and Data file?
Each databases logically divided into one or more table spaces one or more data files are explicitly created for each table space.
What are the different types of Segments?
Data Segment,
Index Segment,
Rollback Segment
and
Temporary Segment
What are Clusters?
Clusters are groups of one or more tables physically stores together to share common columns and are often used together.
What is an Integrity Constrains?
An integrity constraint is a declarative way to define a business rule for a column of a table.
What is an Index?
An Index is an optional structure associated with a table to have direct access to rows, which can be created to increase the performance of data retrieval. Index can be created on one or more columns of a table.
What is an Extent?
An Extent is a specific number of contiguous data blocks, obtained in a single allocation, and used to store a specific type of information.
What is a View?
A view is a virtual table. Every view has a Query attached to it. (The Query is a SELECT statement that identifies the columns and rows of the table(s) the view uses.)
What is Table?
A table is the basic unit of data storage in an ORACLE database. The tables of a database hold all of the user accessible data. Table data is stored in rows and columns.
What is schema?
A schema is collection of database objects of a User.
Describe Referential Integrity?
A rule defined on a column (or set of columns) in one table that allows the insert or update of a row only if the value for the column or set of columns (the dependent value) matches a value in a column of a related table (the referenced value). It also specifies the type of data manipulation allowed on referenced data and the action to be performed on dependent data as a result of any action on referenced data.
What is Meta data?
Metadata is data about data. E.g. if in data mart we are receiving any file. Then metadata will contain information like how many columns, file is fix width/limited, ordering of fields, data types of field etc.
Briefly state different between data ware house & data mart?
Data warehouse is made up of many datamarts. DWH contain many subject areas. However, data mart focuses on one subject area generally. E.g. If there will be DHW of bank then there can be one data mart for accounts, one for Loans etc. This is high-level definitions.
What is galaxy schema?
Galaxy schema is also known as fact constellation scheme. It requires no of fact tables to share dimension tables. In data, wares housing mainly the people are using the conceptual hierarchy.
Suppose you are filtering the rows using a filter transformation only the rows meet the condition pass to the target. Tell me where the rows will go that does not meet the condition.
Informatica filter transformation default value is 1 i.e. true. If you place a break point on filter transformation and run the mapping in a debugger mode, you will find these values 1 or 0 for each row passing through filter. If you change 0 to 1, the particular row will be passed to next stage.
After we create a SCD table, can we use that particular Dimension as a dimension table for Star Schema?
Yes.
What is Core Dimension?
Core Dimension is a Dimension table, which is used dedicated for single fact table or Datamart. Conform Dimension is a Dimension table which is used across fact tables or Data marts.
How much data hold in one universe.
Universe does not hold any data. However, practically the universe is known to have issues when the objects cross 6000.
Can any one explain about Core Dimension, Balanced Dimension, and Dirty Dimension?
Dirty Dimension is nothing but Junk Dimensions. Core Dimensions are dedicated for a fact table or Data mart. Conformed Dimensions are used across fact tables or Data marts.
Can any one explain the Hierarchies level Data warehousing.
In Data warehousing, levels are columns available in dimension table. Levels are having attributes. Hierarchies are used for navigational purpose; there are two types of Hierarchies. You can define hierarchies in top down or bottom up.
1. Natural Hierarchy: Best example is Time Dimension - Year, Month, Day etc. In natural Hierarchy definite relationship exists between each level
2. Navigational Hierarchy: You can have levels like
Ex - Production cost of Product, Sales Cost of Product.
Ex - Lead Time defined to procure, Actual Procurement time,
In this, two levels need not to have relationship. This Hierarchy is created for navigational purpose.
What is data cleaning? How can we do that?
Data cleaning is a self-explanatory term. Most of the data warehouses in the world source data from multiple systems - systems that were created long before data warehousing was well understood, and hence without the vision to consolidate the same in a single repository of information. In such a scenario, the possibilities of the following are there:
► Missing information for a column from one of the data sources;
► Inconsistent information among different data sources;
► Orphan records;
► Outlier data points;
► Different data types for the same information among various data sources, leading to improper conversion;
► Data breaching business rules
In order to ensure that the data warehouse is not infected by any of these discrepancies, it is important to cleanse the data using a set of business rules, before it makes its way into the data warehouse.
What is dimension modeling?
A logical design technique that seeks to present the data in a standard, intuitive framework that allows for high-performance access. There are different data modeling concepts like ER Modeling (Entity Relationship modeling), DM (Dimensional modeling), Hierarchal Modeling, Network modeling. However, popular are ER and DM only.
Where the cache files stored?
Caches are stored in Repository.
How can you import tables from a database?
In Business Objects Universe Designer you can open Table Browser and select the tables needed then insert them to designer.
What is drilling across?
Drill across corresponds to switching from 1 classification in 1 dimension to a different classification in different dimension
Fore More --> CLICK HERE