Data Warehousing Basics Interview Questions


  • 8/7/2019 Data Warehousing Basics Interview Questions


    1. Data Warehousing Basics Interview Questions & Answers

    1. What is hybrid slowly changing dimension?

    Ans:- Hybrid SCDs are a combination of SCD Type 1 and SCD Type 2.

    It may happen that in a table some columns are important and we need to track changes to them, i.e. capture their historical data, whereas for other columns we do not care even if the data changes. For such tables we implement hybrid SCDs, in which some columns are Type 1 and some are Type 2. You can add that the surrogate key is not an intelligent key; it is similar to a sequence number and is typically tied to a timestamp.
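    As a minimal sketch of the idea above, the following Python treats one column as Type 2 and another as Type 1. The column names (city, phone), the row layout and the apply_change helper are assumptions made for illustration only; they do not come from any particular ETL tool.

```python
from datetime import date

# Hypothetical column policy: which attributes keep history (Type 2)
# and which are simply overwritten (Type 1).
TYPE2_COLUMNS = {"city"}
TYPE1_COLUMNS = {"phone"}


def apply_change(rows, customer_id, changes, today):
    """Apply `changes` to the current row of `customer_id` in `rows` (a list of dicts)."""
    current = next(r for r in rows if r["customer_id"] == customer_id and r["is_current"])
    if any(col in TYPE2_COLUMNS and current[col] != val for col, val in changes.items()):
        # Type 2: close the current row and append a new version with a new surrogate key.
        new_row = {**current, **changes}
        new_row["sk"] = max(r["sk"] for r in rows) + 1
        new_row["valid_from"] = today
        current["valid_to"] = today
        current["is_current"] = False
        rows.append(new_row)
    else:
        # Type 1: overwrite the current row in place; no history is kept.
        current.update(changes)
    return rows


dim = [{"sk": 1, "customer_id": "C1", "city": "Pune", "phone": "111",
        "valid_from": date(2019, 1, 1), "valid_to": None, "is_current": True}]
apply_change(dim, "C1", {"phone": "222"}, date(2019, 3, 1))   # Type 1: still one row
apply_change(dim, "C1", {"city": "Delhi"}, date(2019, 6, 1))  # Type 2: second row added
```

    After both calls the dimension holds two rows for C1: the closed 'Pune' version and the current 'Delhi' version, each with its own sequence-like surrogate key.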

    2. Can a dimension table contain numeric values?

    Ans:- Yes. Such columns are often given a character data type, although the values themselves can be numeric or character.

    Yes, dimensions can contain numeric values, because these are descriptive elements of our business. A dimension table contains numerics, but it does not contain measures or facts.

    3. How to create a surrogate key using Ab Initio?

    Ans:- There are many ways to create a surrogate key, and the choice depends on your business logic. You can try these ways:

    1. Use the next_in_sequence() function in your transform.

    2. Use the Assign Key Values component (if your GDE is higher than 1.10).

    3. Write a stored procedure for this and call it wherever you need it.

    4. Differences between star and snowflake schemas?

    Ans:- Star schema

    A single fact table with N dimension tables.

    Snowflake schema

    Any dimension with extended dimensions is known as a snowflake schema.

    STAR MODEL: A radial arrangement with one fact table and several dimension tables.
    SNOWFLAKE MODEL: A dimension table subdivides itself into new dimension tables and, for those tables, starts acting like a fact table.

    A data mart is a subject-oriented database through which we can analyse the business of each department in an organisation.

    Data marts are called high query performance structures.

    5. What is a CUBE in data warehousing?

    Ans:- Cubes are a logical representation of multidimensional data. The edges of the cube contain dimension members and the body of the cube contains data values.

    6. Difference between snowflake and star schema. In which situations is a snowflake schema better to use than a star schema, and when is the opposite true?

    Ans:- Star schema

    It contains the dimension tables mapped around one or more fact tables.

    It is a denormalised model.

    No need to use complicated joins.

    Queries return results quickly.

    Snowflake schema

    It is the normalised form of the star schema.

    It contains in-depth joins, because the tables are split into many pieces. We can easily make modifications directly in the tables.

    We have to use complicated joins, since we have more tables.

    There will be some delay in processing the query.
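    To make the join difference concrete, here is a small sketch using Python's built-in sqlite3 module. The table and column names (dim_product_star, dim_product_snow, dim_category, fact_sales) and the data are invented for illustration. The same question, sales by category, needs one join in the star layout and two in the snowflake layout.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Star: category is stored redundantly inside the product dimension.
CREATE TABLE dim_product_star (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
-- Snowflake: category is normalised out into its own table.
CREATE TABLE dim_category (category_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_product_snow (product_key INTEGER PRIMARY KEY, name TEXT,
                               category_key INTEGER REFERENCES dim_category);
CREATE TABLE fact_sales (product_key INTEGER, amount REAL);

INSERT INTO dim_product_star VALUES (1, 'Pen', 'Stationery'), (2, 'Ink', 'Stationery'), (3, 'Mug', 'Kitchen');
INSERT INTO dim_category VALUES (10, 'Stationery'), (20, 'Kitchen');
INSERT INTO dim_product_snow VALUES (1, 'Pen', 10), (2, 'Ink', 10), (3, 'Mug', 20);
INSERT INTO fact_sales VALUES (1, 5.0), (2, 7.0), (3, 12.0), (1, 3.0);
""")

# Star schema: a single join from the fact table to the dimension.
star_sql = """
SELECT p.category, SUM(f.amount)
FROM fact_sales f JOIN dim_product_star p ON p.product_key = f.product_key
GROUP BY p.category ORDER BY p.category"""

# Snowflake schema: an extra join to reach the normalised category table.
snow_sql = """
SELECT c.category, SUM(f.amount)
FROM fact_sales f
JOIN dim_product_snow p ON p.product_key = f.product_key
JOIN dim_category c ON c.category_key = p.category_key
GROUP BY c.category ORDER BY c.category"""

star_result = con.execute(star_sql).fetchall()
snow_result = con.execute(snow_sql).fetchall()
```

    Both queries return the same totals; the snowflake version pays for its normalisation with the additional join, while renaming a category touches one row in dim_category instead of every matching product row.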

    7. What is an ER Diagram?

    Ans:- The Entity-Relationship (ER) model was originally proposed by Peter Chen in 1976 [Chen76] as a way to unify the network and relational database views.

    Simply stated, the ER model is a conceptual data model that views the real world as entities and relationships. A basic component of the model is the Entity-Relationship diagram, which is used to visually represent data objects.

    Since Chen wrote his paper the model has been extended, and today it is commonly used for database design. For the database designer, the utility of the ER model is:

    It maps well to the relational model. The constructs used in the ER model can easily be transformed into relational tables.

    It is simple and easy to understand with a minimum of training. Therefore, the model can be used by the database designer to communicate the design to the end user.

    In addition, the model can be used as a design plan by the database developer to implement a data model in specific database management software.

    8. What is a degenerate dimension table?

    Ans:- Degenerate dimensions: if a fact table contains values that are neither dimension keys nor measures, they are called degenerate dimensions. Ex: invoice id, empno.

    9. What is VLDB?

    Ans:- The perception of what constitutes a VLDB (very large database) continues to grow. A one-terabyte database would normally be considered to be a VLDB.

    Degenerate dimension: it does not have any link to a dimension table and it will not have any attributes.

    10. What is meant by metadata in the context of a data warehouse, and how is it important?

    Ans:- Metadata is data about data. A business analyst or data modeler usually captures information about the data in a data dictionary, a.k.a. metadata: the source (where and how the data originated), the nature of the data (char, varchar, nullable, existence, valid values, etc.) and the behavior of the data (how it is modified or derived, and its life cycle). Metadata is also presented at the data mart level, for subsets, facts and dimensions, the ODS, etc. For a DW user, metadata provides vital information for analysis / DSS.

    11. 1. What is incremental loading? 2. What is batch processing? 3. What is a cross reference table? 4. What is an aggregate fact table?

    Ans:- Incremental loading means loading only the ongoing changes from the OLTP system.

    An aggregate table contains the [measure] values aggregated, grouped or summed up to some level of the hierarchy.

    12. What are the possible data marts in retail sales?

    Ans:- Product information, sales information.

    13. What is the main difference between a schema in an RDBMS and schemas in a data warehouse?

    Ans:- RDBMS Schema
    * Used for OLTP systems
    * Traditional and old schema
    * Normalized
    * Difficult to understand and navigate
    * Cannot solve extract and complex problems
    * Poorly modelled

    DWH Schema
    * Used for OLAP systems
    * New generation schema
    * Denormalized
    * Easy to understand and navigate
    * Extract and complex problems can be easily solved
    * Very good model

    14. What are the various ETL tools in the market?

    Ans:- Various ETL tools used in the market are:

    1) Informatica
    2) DataStage
    3) Oracle Warehouse Builder
    4) Ab Initio
    5) Data Junction

    Apart from the above: Hummingbird Genio, Business Objects Data Integrator.

    15. What is Dimensional Modelling?

    Ans:- Dimensional modelling is a design concept used by many data warehouse designers to build their data warehouse. In this design model all the data is stored in two types of tables: fact tables and dimension tables. The fact table contains the facts/measurements of the business, and the dimension table contains the context of the measurements, i.e. the dimensions on which the facts are calculated.

    16. What is a data warehousing hierarchy?

    Ans:- Hierarchies

    Hierarchies are logical structures that use ordered levels as a means of organizing data. A hierarchy can be used to define data aggregation. For example, in a time dimension, a hierarchy might aggregate data from the month level to the quarter level to the year level. A hierarchy can also be used to define a navigational drill path and to establish a family structure.

    Within a hierarchy, each level is logically connected to the levels above and below it. Data values at lower levels aggregate into the data values at higher levels. A dimension can be composed of more than one hierarchy. For example, in the product dimension, there might be two hierarchies: one for product categories and one for product suppliers.

    Dimension hierarchies also group levels from general to granular. Query tools use hierarchies to enable you to drill down into your data to view different levels of granularity. This is one of the key benefits of a data warehouse.

    When designing hierarchies, you must consider the relationships in business structures, for example a divisional multilevel sales organization.

    Hierarchies impose a family structure on dimension values. For a particular level value, a value at the next higher level is its parent, and values at the next lower level are its children. These familial relationships enable analysts to access data quickly.
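    The month to quarter to year roll-up described above can be sketched in a few lines of Python. The sales figures and the quarter_of helper are hypothetical and serve only to show how lower-level values aggregate into their parents.

```python
from collections import defaultdict

# Hypothetical monthly sales keyed by (year, month).
monthly_sales = {(2019, 1): 100, (2019, 2): 120, (2019, 3): 90,
                 (2019, 4): 80, (2019, 5): 110, (2019, 12): 200}


def quarter_of(month):
    """Parent of a month in the month -> quarter -> year hierarchy."""
    return (month - 1) // 3 + 1


by_quarter = defaultdict(int)
by_year = defaultdict(int)
for (year, month), amount in monthly_sales.items():
    by_quarter[(year, quarter_of(month))] += amount  # roll up one level
    by_year[year] += amount                           # roll up two levels
```

    Drilling down is the reverse navigation: starting from by_year[2019], an analyst moves to the quarters that are its children and then to the months below each quarter.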

    Levels

    sequence generator, or Oracle sequence, or SQL Server Identity values for the surrogate key.

    It is useful because the natural primary key (e.g. Customer Number in the Customer table) can change, and this makes updates more difficult.

    Some tables have columns such as AIRPORT_NAME or CITY_NAME which are stated as the primary keys (according to the business users). Not only can these change, but indexing on a numerical value is probably better, and you could consider creating a surrogate key called, say, AIRPORT_ID. This would be internal to the system, and as far as the client is concerned you may display only the AIRPORT_NAME.

    2. Adapted from response by Vincent on Thursday, March 13, 2003

    Another benefit you can get from surrogate keys (SID) is:

    Tracking the SCD - Slowly Changing Dimension.

    Let me give you a simple, classical example:

    On the 1st of January 2002, Employee 'E1' belongs to Business Unit 'BU1' (that's what would be in your Employee Dimension). This employee has a turnover allocated to him on the Business Unit 'BU1'. But on the 2nd of June the Employee 'E1' is moved from Business Unit 'BU1' to Business Unit 'BU2'. All the new turnover has to belong to the new Business Unit 'BU2', but the old turnover should belong to the Business Unit 'BU1'.

    If you used the natural business key 'E1' for your employee within your data warehouse, everything would be allocated to Business Unit 'BU2', even what actually belongs to 'BU1'.

    If you use surrogate keys, you could create on the 2nd of June a new record for the Employee 'E1' in your Employee Dimension with a new surrogate key.

    This way, in your fact table, you have your old data (before the 2nd of June) with the SID of the Employee 'E1' + 'BU1'. All new data (after the 2nd of June) would take the SID of the Employee 'E1' + 'BU2'.

    You could consider Slowly Changing Dimension as an enlargement of your natural key: the natural key of the Employee was Employee Code 'E1', but for you it becomes Employee Code + Business Unit: 'E1' + 'BU1' or 'E1' + 'BU2'. But the difference with the natural key enlargement process is that you might not

    have all parts of your new key within your fact table, so you might not be able to do the join on the new enlarged key. So you need another id.
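    The 'E1' / 'BU1' / 'BU2' example can be reproduced with Python's built-in sqlite3 module. This is only a sketch: the table names, column names and turnover amounts are made up for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_employee (
    employee_sid  INTEGER PRIMARY KEY,   -- surrogate key
    employee_code TEXT,                  -- natural key
    business_unit TEXT,
    valid_from    TEXT
);
CREATE TABLE fact_turnover (employee_sid INTEGER, sale_date TEXT, amount REAL);

-- Two versions of the same employee 'E1', one per business unit.
INSERT INTO dim_employee VALUES (1, 'E1', 'BU1', '2002-01-01');
INSERT INTO dim_employee VALUES (2, 'E1', 'BU2', '2002-06-02');

-- Facts carry the SID that was current when the turnover was booked.
INSERT INTO fact_turnover VALUES (1, '2002-03-15', 1000.0);
INSERT INTO fact_turnover VALUES (1, '2002-05-20',  500.0);
INSERT INTO fact_turnover VALUES (2, '2002-07-01',  700.0);
""")

turnover_by_bu = con.execute("""
    SELECT d.business_unit, SUM(f.amount)
    FROM fact_turnover f
    JOIN dim_employee d ON d.employee_sid = f.employee_sid
    GROUP BY d.business_unit
    ORDER BY d.business_unit
""").fetchall()
```

    Because the join goes through employee_sid rather than employee_code, the turnover booked before the 2nd of June stays with 'BU1' and the later turnover goes to 'BU2'.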

    19. What is a linked cube?

    Ans:- A linked cube is a cube in which a subset of the data can be analysed in great detail. The linking ensures that the data in the cubes remains consistent.

    20. What is a source qualifier?

    Ans:- When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session.

    21. What are the steps to build the data warehouse?

    Ans:- As far as I know:
    Gathering business requirements,
    Identifying sources,
    Identifying facts,
    Defining dimensions,
    Defining attributes,
    Redefining dimensions and attributes,
    Organising the attribute hierarchy and defining relationships,
    Assigning unique identifiers,
    Additional conventions: cardinality / adding ratios.

    1. Business modeling, 2. Data modeling, 3. Data from the source databases, 4. Extraction, transformation and loading, 5. Data warehouse (data marts).

    22. What are the different architectures of a data warehouse?

    Ans:- I think there are two main ones:

    1. Top down (Bill Inmon)
    2. Bottom up (Ralph Kimball)

    23. What is the difference between a view and a materialized view?

    Ans:- View: stores the SQL statement in the database and lets you use it as a table. Every time you access the view, the SQL statement executes.

    Materialized view: stores the results of the SQL in table form in the database. The SQL statement executes when the materialized view is created or refreshed, and after that every time you run the query the stored result set is used. Pros include quick query results.

    24. What is the main difference between the Inmon and Kimball philosophies of data warehousing?

    Ans:- They differ in their concept of building the data warehouse. According to Kimball:

    Kimball views data warehousing as a constituency of data marts. Data marts are focused on delivering business objectives for departments in the organization, and the data warehouse is formed from the data marts through their conformed dimensions. Hence a unified view of the enterprise can be obtained from the dimensional modeling done at a local departmental level.

    Inmon believes in creating a data warehouse on a subject-by-subject area basis. Hence the development of the data warehouse can start with data from the online store. Other subject areas can be added to the data warehouse as their needs arise. Point-of-sale (POS) data can be added later if management decides it is necessary.

    i.e.,
    Kimball: first data marts, then combined into the data warehouse.
    Inmon: first the data warehouse, later the data marts.

    25. What is a junk dimension? What is the difference between a junk dimension and a degenerate dimension?

    Ans:- Junk dimension: grouping random flags and text attributes of a dimension and moving them to a separate sub-dimension.

    Degenerate dimension: keeping the control information on the fact table. For example, consider a dimension table with fields like order number and order line number that has a 1:1 relationship with the fact table. In this case the dimension is removed and the order information is stored directly in the fact table, in order to eliminate unnecessary joins while retrieving order information.

    26. What is the definition of normalized and denormalized views, and what are the differences between them?

    Ans:- Normalization is the process of removing redundancies. Denormalization is the process of allowing redundancies.

    27. Why is the fact table in normal form?

    Ans:- Basically the fact table consists of the index keys of the dimension/lookup tables and the measures. So whenever we have the keys in a table, that itself implies that the table is in normal form.

    28. What is the difference between E-R modeling and dimensional modeling?

    Ans:- The basic difference is that E-R modeling has a logical and a physical model, whereas a dimensional model has only a physical model.
    E-R modeling is used for normalizing the OLTP database design.
    Dimensional modeling is used for denormalizing the ROLAP/MOLAP design.

    30. What is a conformed fact?

    Ans:- Conformed dimensions are dimensions that can be used across multiple data marts in combination with multiple fact tables. In the same way, a conformed fact is a measure that is defined identically across those data marts.

    31. What are the methodologies of data warehousing?

    Ans:- Every company has a methodology of its own. But to name a few, the SDLC methodology and the AIM methodology are commonly used. Other methodologies are AMM, World Class methodology and many more.

    32. What is BUS Schema?

    Ans:- A BUS schema is composed of a master suite of conformed dimensions and standardized definitions of facts.

    Data Warehousing Concepts Interview Questions and Answers

    33. What is the difference between a data warehouse and BI?

    Ans:- Simply speaking, BI is the capability of analyzing the data of a data warehouse to the advantage of the business. A BI tool analyzes the data of a data warehouse so that business decisions can be made depending on the result of the analysis.

    34. What is a BO repository?

    Ans:- A repository is a set of database tables in which Business Objects stores security information (e.g. user, group, access permission, user type), universe information (e.g. objects, classes, table names, column names, relationships) and document information.

    35. What is the difference between data warehousing and business intelligence?

    Ans:- Data warehousing deals with all aspects of managing the development, implementation and operation of a data warehouse or data mart, including metadata management, data acquisition, data cleansing, data transformation, storage management, data distribution, data archiving, operational reporting, analytical reporting, security management, backup/recovery planning, etc.

    Business intelligence, on the other hand, is a set of software tools that enable an organization to analyze measurable aspects of their business such as sales performance, profitability, operational efficiency, effectiveness of marketing campaigns, market penetration among certain customer groups, cost trends, anomalies and exceptions, etc. Typically, the term business intelligence is used to encompass OLAP, data visualization, data mining and query/reporting tools.

    Think of the data warehouse as the back office and business intelligence as the entire business including the back office. The business needs the back office on which to function, but the back office without a business to support makes no sense.

    36. Why do we override the execute method in Struts? Please give me the details.

    Ans:- As part of the Struts framework we develop Action classes, ActionForm classes and other servlet classes. (Here an Action class means a class that extends the Action class, and an ActionForm class means a class that extends the ActionForm class.)

    In an ActionForm class we can develop the validate() method, which returns an ActionErrors object. In this method we write the validation code. If it returns null or an ActionErrors object with size 0, the framework calls execute() on the Action class. If it returns an object with size > 0, execute() is not called; instead the JSP, servlet or HTML file given as the value of the input attribute in the struts-config.xml file is executed.

    In an Action class the execute() method returns an ActionForward object. In execute() we can write return mapping.findForward("success"); where mapping is the ActionMapping object. After that the request is forwarded to the "success" JSP file. (Here "success" is a logical forward name; the path of the JSP file is configured in struts-config.xml.)

    37. What is a snapshot?

    Ans:- You can disconnect the report from the catalog to which it is attached by saving the report with a snapshot of the data. However, you must reconnect to the catalog if you want to refresh the data.

    38. Are OLAP databases called decision support systems? True/false?

    Ans:- True.

    38. What is active data warehousing?

    Ans:- An active data warehouse provides information that enables decision-makers within an organization to manage customer relationships nimbly, efficiently and proactively. Active data warehousing is all about integrating advanced decision support with day-to-day, even minute-to-minute, decision making in a way that increases the quality of those customer touches, which encourages customer loyalty and thus secures an organization's bottom line. The marketplace is coming of age as we progress from first-generation "passive" decision-support systems to current- and next-generation "active" data warehouse implementations.

    39. Why is denormalization promoted in universe designing?

    Ans:- In a relational data model, for normalization purposes, some lookup tables are not merged into a single table. In dimensional data modeling (star schema), these tables are merged into a single table called the DIMENSION table, for performance and for slicing data. Because the tables are merged into one large dimension table, complex intermediate joins are avoided and dimension tables are joined directly to fact tables. Though redundancy of data occurs in the DIMENSION table, the size of the DIMENSION table is only 15% of that of the FACT table. That is why denormalization is promoted in universe designing.

    40. Explain in detail Type 1, Type 2 (SCD) and Type 3.

    Ans:- Type 1: most recent value only.
    Type 2 (full history), tracked by i) version number, ii) flag, or iii) date.
    Type 3: current value and one previous value.
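    The three types can be contrasted on a single attribute change. The sketch below is illustrative only: the customer row, its column names (city, version, is_current, valid_from, valid_to, previous_city) and the three helper functions are assumptions, and the Type 2 variant carries a version number, a flag and dates together although a real design might use just one of them.

```python
def scd_type1(row, new_city):
    """Type 1: overwrite, keeping only the most recent value."""
    return [{**row, "city": new_city}]


def scd_type2(row, new_city, change_date):
    """Type 2: keep full history with a version number, a current flag and dates."""
    old = {**row, "is_current": False, "valid_to": change_date}
    new = {**row, "city": new_city, "version": row["version"] + 1,
           "is_current": True, "valid_from": change_date, "valid_to": None}
    return [old, new]


def scd_type3(row, new_city):
    """Type 3: keep the current value and one previous value in an extra column."""
    return [{**row, "previous_city": row["city"], "city": new_city}]


customer = {"customer_id": "C1", "city": "Pune", "version": 1,
            "is_current": True, "valid_from": "2019-01-01", "valid_to": None}

t1 = scd_type1(customer, "Delhi")                  # one row, history lost
t2 = scd_type2(customer, "Delhi", "2019-06-01")    # two rows, full history
t3 = scd_type3(customer, "Delhi")                  # one row, one step of history
```

    Type 1 and Type 3 keep the row count unchanged, while Type 2 grows the dimension by one row per change, which is why it is usually paired with a surrogate key.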

    41. What are the steps to be taken to schedule the report?

    Ans:- You can schedule any report using Business Objects (Reporter).
    1) Open the report in BO.
    2) Select the option "File -> Send To -> BCA".
    3) Select the BCA name to which the report has to be scheduled.
    4) Set other options for report scheduling, like time, any macro, user, etc.

    42. What is an aggregate table and an aggregate fact table? Any examples of both?

    Ans:- An aggregate table contains summarised data. Materialized views are aggregated tables. For example, in sales we have transactions only at the date level. If we want to create a report such as sales by product per year, we aggregate the date values into week_agg, month_agg, quarter_agg and year_agg tables. To retrieve data from these tables we use the @Aggregate_Aware function.

    43. What is the difference between OLAP and a data warehouse?

    Ans:- A data warehouse is the place where the data is stored for analysis, whereas OLAP is the process of analyzing the data, managing aggregations and partitioning information into cubes for in-depth visualization.

    44. What is the difference between ODS and OLTP?

    Ans:- ODS: It is nothing but a collection of tables created in the data warehouse that maintains only current data, whereas OLTP maintains the data only for transactions; OLTP systems are designed for recording the daily operations and transactions of a business.

    An ODS has a broad, enterprise-wide scope but, unlike the real enterprise data warehouse, its data is refreshed in near real time and used for routine business activity, whereas OLTP is online transaction processing, which deals with day-to-day data.

    ODS: Operational Data Source/Store. The source from which we get the data for the DWH is called the ODS.
    OLTP: We have content (front end) associated with the back end.

    A factless fact table is a fact table without facts, i.e. there is no key measure with which to analyse the business.

    General Datawarehousing Interview Questions and Answers

    45. What is real-time data warehousing?

    Ans:- Real-time data warehousing is a combination of two things: 1) real-time activity and 2) data warehousing. Real-time activity is activity that is happening right now. The activity could be anything, such as the sale of widgets. Once the activity is complete, there is data about it.

    Data warehousing captures business activity data. Real-time data warehousing captures business activity data as it occurs. As soon as the business activity is complete and there is data about it, the completed activity data flows into the data warehouse and becomes available instantly.

    In other words, real-time data warehousing is a framework for deriving information from data as the data becomes available.

    46. Name some of the real time data-warehousing tools.

    Ans:- ??

    47. What is ETL?

    Ans:- ETL stands for extraction, transformation and loading.

    ETL tools provide developers with an interface for designing source-to-target mappings, transformations and job control parameters.

    Extraction
    Take data from an external source and move it to the warehouse pre-processor database.

    Transformation
    The transform data task allows point-to-point generating, modifying and transforming of data.

    Loading
    The load data task adds records to a database table in a warehouse.
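    The three stages above can be sketched with the Python standard library alone. Everything named here is hypothetical: the inline CSV stands in for an external source, fact_orders stands in for a warehouse table, and the cleaning rules are arbitrary examples of a transformation.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would be a file or a source-system query.
SOURCE_CSV = "order_id,amount,country\n1, 10.50 ,in\n2,abc,us\n3,7,US\n"


def extract(text):
    """Extraction: read raw rows from the external source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: convert types and standardise codes; skip rows that fail."""
    clean = []
    for row in rows:
        try:
            clean.append((int(row["order_id"]), float(row["amount"]),
                          row["country"].strip().upper()))
        except ValueError:
            continue  # a real job would route rejects to an error table
    return clean


def load(con, rows):
    """Loading: add the cleaned records to a warehouse table."""
    con.execute("CREATE TABLE IF NOT EXISTS fact_orders "
                "(order_id INTEGER, amount REAL, country TEXT)")
    con.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)


con = sqlite3.connect(":memory:")
load(con, transform(extract(SOURCE_CSV)))
loaded = con.execute("SELECT * FROM fact_orders ORDER BY order_id").fetchall()
```

    Order 2 is dropped during transformation because its amount is not numeric, so only two cleaned rows reach the target table.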

    48. What is a Snowflake Schema?

    Ans:- In a snowflake schema, each dimension has a primary dimension table, to which one or more additional dimensions can join. The primary dimension table is the only table that can join to the fact table.

    49. What is a dimension table?

    Ans:- A dimension table is a collection of hierarchies and categories along which the user can drill down and drill up. It contains only the textual attributes.

    50. What are the modeling tools available in the market?

    Ans:- There are a number of data modeling tools:

    Tool Name         Company Name
    Erwin             Computer Associates
    Embarcadero       Embarcadero Technologies
    Rational Rose     IBM Corporation
    Power Designer    Sybase Corporation
    Oracle Designer   Oracle Corporation

    51. What are the various ETL tools in the market?

    Ans:- Various ETL tools used in the market are:

    1. Informatica
    2. DataStage
    3. MS-SQL DTS (Integration Services in SQL Server 2005)
    4. Ab Initio
    5. SQL Loader
    6. Sunopsis
    7. Oracle Warehouse Builder
    8. Data Junction

    52. What is Dimensional Modelling? Why is it important?

    Ans:- Dimensional modelling is a design concept used by many data warehouse designers to build their data warehouse. In this design model all the data is stored in two types of tables: fact tables and dimension tables. The fact table contains the facts/measurements of the business, and the dimension table contains the context of the measurements, i.e. the dimensions on which the facts are calculated.

    Why is Data Modeling Important?

    ---------------------------------------

    Data modeling is probably the most labor-intensive and time-consuming part of the development process. Why bother, especially if you are pressed for time? A common response by practitioners who write on the subject is that you should no more build a database without a model than you should build a house without blueprints.

    The goal of the data model is to make sure that all data objects required by the database are completely and accurately represented. Because the data model uses easily understood notations and natural language, it can be reviewed and verified as correct by the end users.

    The data model is also detailed enough to be used by the database developers as a "blueprint" for building the physical database. The information contained in the data model will be used to define the relational tables, primary and foreign keys, stored procedures, and triggers. A poorly designed database will require more time in the long term. Without careful planning you may create a database that omits data required to create critical reports, produces results that are incorrect or inconsistent, and is unable to accommodate changes in the user's requirements.

    53. What is an ER Diagram?

    Ans:- The Entity-Relationship (ER) model was originally proposed by Peter Chen in 1976 [Chen76] as a way to unify the network and relational database views.

    Simply stated, the ER model is a conceptual data model that views the real world as entities and relationships. A basic component of the model is the Entity-Relationship diagram, which is used to visually represent data objects.

    Since Chen wrote his paper the model has been extended, and today it is commonly used for database design. For the database designer, the utility of the ER model is:

    It maps well to the relational model. The constructs used in the ER model can easily be transformed into relational tables.

    It is simple and easy to understand with a minimum of training. Therefore, the model can be used by the database designer to communicate the design to the end user.

    In addition, the model can be used as a design plan by the database developer to implement a data model in specific database management software.

    54. What is a Star Schema?

    Ans:- A star schema is a way of organising the tables such that we can retrieve results from the database easily and quickly in the warehouse environment. Usually a star schema consists of one or more dimension tables around a fact table, which looks like a star; that is how it got its name.

    55. What are Aggregate tables?

    Ans:- An aggregate table contains a summary of existing warehouse data, grouped to certain levels of the dimensions. Retrieving the required data from the actual table, which may have millions of records, takes more time and also affects server performance. To avoid this we can aggregate the table to the required level and use that instead. These tables reduce the load on the database server, increase the performance of the query and return the result very quickly.
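    A minimal sketch of building such a table with Python's sqlite3 module follows. The fact_sales and agg_sales_month tables and their rows are invented for illustration; a production warehouse would typically refresh the aggregate on a schedule or use a materialized view where the database supports one.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE fact_sales (sale_date TEXT, product TEXT, amount REAL);
INSERT INTO fact_sales VALUES
    ('2019-01-05', 'Pen', 10.0), ('2019-01-20', 'Pen', 15.0),
    ('2019-02-03', 'Pen', 20.0), ('2019-02-11', 'Mug', 30.0);

-- Aggregate table: the detail rows summed up to the (month, product) level.
CREATE TABLE agg_sales_month AS
SELECT substr(sale_date, 1, 7) AS month, product, SUM(amount) AS total_amount
FROM fact_sales
GROUP BY substr(sale_date, 1, 7), product;
""")

# Reports at month level now read the small aggregate instead of the detail table.
monthly = con.execute(
    "SELECT month, product, total_amount FROM agg_sales_month ORDER BY month, product"
).fetchall()
```

    Four detail rows collapse into three aggregate rows here; with millions of detail records the reduction, and therefore the query saving, is far larger.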

    56. What are the various reporting tools in the market?

    Ans:-
    1. MS-Excel
    2. Business Objects (Crystal Reports)
    3. Cognos (Impromptu, Power Play)
    4. Microstrategy
    5. MS Reporting Services
    6. Informatica Power Analyzer
    7. Actuate
    8. Hyperion (BRIO)
    9. Oracle Express OLAP
    10. ProClarity

    57. What is the difference between OLTP and OLAP?

    Ans:- The main differences between OLTP and OLAP are:

    1. User and System Orientation

    OLTP: Customer-oriented, used for transaction and query processing by clerks, clients and IT professionals.

    OLAP: Market-oriented, used for data analysis by knowledge workers (managers, executives, analysts).

    2. Data Contents

    OLTP: Manages current data, very detail-oriented.

    OLAP: Manages large amounts of historical data, provides facilities for summarization and aggregation, and stores information at different levels of granularity to support the decision-making process.

    3. Database Design

    OLTP: Adopts an entity-relationship (ER) model and an application-oriented database design.

    OLAP: Adopts a star, snowflake or fact constellation model and a subject-oriented database design.

    4. View

    OLTP: Focuses on the current data within an enterprise or department.

    OLAP: Spans multiple versions of a database schema due to the evolutionary process of an organization; integrates information from many organizational locations and data stores.

    57. What is a Fact table?

    Ans:- A fact table contains the measurements, metrics or facts of a business process. If your business process is "Sales", then a measurement of this business process such as "monthly sales number" is captured in the fact table. The fact table also contains the foreign keys for the dimension tables.

    58. What are the different methods of loading dimension tables?

    Ans:- Conventional Load:Before loading the data, all the Table constraints will be checked against thedata.

    Direct load:(Faster Loading)

    All the Constraints will be disabled. Data will be loaded directly.Later thedata will be checked against the table constraints and the bad data won't beindexed.

    58. What is a lookup table?

    Ans:- A lookUp table is the one which is used when updating a warehouse.When the lookup is placed on the target table (fact table / warehouse) basedupon the primary key of the target, it just updates the table by allowing onlynew records or updated records based on the lookup condition.

    59. What is a general purpose scheduling tool?

    Ans:- The basic purpose of the scheduling tool in a DW Application is tostream line the flow of data from Source To Target at specific time or basedon some condition.60. Name some of modeling tools available in the Market?

    Ans:- These tools are used for data/dimensional modeling:

    1. Oracle Designer

    2. ERwin (Entity Relationship for Windows)

    3. Informatica (Cubes/Dimensions)

    4. Embarcadero

    5. PowerDesigner (Sybase)

    61. What are Data Marts?

    Ans:- Data marts are designed to help managers make strategic decisions about their business. A data mart is a subset of the corporate-wide data that is of value to a specific group of users. There are two types of data marts:

    1. Independent data marts are sourced from data captured from OLTP systems, from external providers, or from data generated locally within a particular department or geographic area.

    2. Dependent data marts are sourced directly from enterprise data warehouses.

    62. What is data warehousing?

    Ans:- A data warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries over data that originally came from different sources.

    Typical relational databases are designed for on-line transactional processing (OLTP) and do not meet the requirements for effective on-line analytical processing (OLAP). As a result, data warehouses are designed differently from traditional relational databases.

    Data warehousing is about analysing the business. A data warehouse is time-variant, non-volatile, integrated, and subject-oriented, and it supports decision making.

    63. Differences between star and snowflake schemas?

    Ans:- Star schema: all dimensions are linked directly to a fact table.

    Snowflake schema: dimensions may be interlinked or may have one-to-many relationships with other tables.

    64. Which columns go to the fact table and which columns go to the dimension table?

    Ans:- The primary key columns of the tables (entities) go to the dimension tables as foreign keys.

    The primary key columns of the dimension tables go to the fact tables as foreign keys.
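
    As a rough illustration of dimension keys appearing in the fact table as foreign keys, the sketch below builds a tiny star schema in an in-memory SQLite database. All table and column names (dim_product, dim_date, fact_sales, and so on) and the sample rows are hypothetical and chosen only for this example.

```python
import sqlite3


def build_star_schema():
    """Create a minimal star schema and load a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE dim_product (
            product_key  INTEGER PRIMARY KEY,
            product_name TEXT
        );
        CREATE TABLE dim_date (
            date_key      INTEGER PRIMARY KEY,
            calendar_date TEXT
        );
        -- The fact table holds the measure plus one foreign key per dimension.
        CREATE TABLE fact_sales (
            product_key  INTEGER REFERENCES dim_product(product_key),
            date_key     INTEGER REFERENCES dim_date(date_key),
            sales_amount REAL
        );
    """)
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "Widget"), (2, "Gadget")])
    conn.execute("INSERT INTO dim_date VALUES (20190807, '2019-08-07')")
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 20190807, 10.0), (1, 20190807, 5.0), (2, 20190807, 7.0)])
    return conn


def sales_by_product(conn):
    """Join the fact table to a dimension and aggregate the measure."""
    return conn.execute("""
        SELECT p.product_name, SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.product_name
        ORDER BY p.product_name
    """).fetchall()
```

    The descriptive attribute (product_name) lives in the dimension, while the fact table carries only keys and the numeric measure.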

    65. What is ODS?

    Ans:- 1. ODS stands for Operational Data Store.

    2. A collection of operational or base data that is extracted from operational databases and standardized, cleansed, consolidated, transformed, and loaded into an enterprise data architecture. An ODS is used to support data mining of operational data, or as the store for base data that is summarized for a data warehouse. The ODS may also be used to audit the data warehouse to ensure that summarized and derived data is calculated properly. The ODS may further become the enterprise's shared operational database, allowing operational systems that are being reengineered to use the ODS as their operational database.

    Ans:- It is an environment or storage space managed by a relational database management system (RDBMS) consisting of vast quantities of information.

    VLDB doesn't refer to the size of the database or the vast amount of information stored.

    It refers to the window of opportunity for taking a backup of the database.

    The window of opportunity is a time interval; if the DBA is unable to take the backup within the specified time, the database is considered a VLDB.

    71. What is SCD1 , SCD2 , SCD3?

    Ans:- SCD stands for slowly changing dimensions.

    SCD1: Only the updated values are maintained. Ex: when a customer's address is modified, we update the existing record with the new address.

    SCD2: Historical information and current information are both maintained by using A) effective dates, B) versions, C) flags, or a combination of these.

    SCD3: Historical information and current information are maintained by adding new columns to the target table.

    72. What are slowly changing dimensions?

    Ans:- SCD stands for slowly changing dimensions. Slowly changing dimensions are of three types:

    SCD1: Only the updated values are maintained. Ex: when a customer's address is modified, we update the existing record with the new address.

    SCD2: Historical information and current information are both maintained by using A) effective dates, B) versions, C) flags, or a combination of these.

    SCD3: Historical information and current information are maintained by adding new columns to the target table.
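
    The three types can be contrasted with a small sketch. It assumes a customer dimension held as a list of dictionaries; the column names (version, start_date, end_date, is_current, previous_address) and the function names are hypothetical, and a real implementation would run against warehouse tables instead.

```python
from datetime import date


def apply_scd1(rows, customer_id, new_address):
    """Type 1: overwrite the value in place; no history is kept."""
    for row in rows:
        if row["customer_id"] == customer_id:
            row["address"] = new_address


def apply_scd2(rows, customer_id, new_address, change_date):
    """Type 2: expire the current row and append a new version."""
    current = next(r for r in rows
                   if r["customer_id"] == customer_id and r["is_current"])
    current["is_current"] = False        # flag
    current["end_date"] = change_date    # effective date range
    rows.append({
        "customer_id": customer_id,
        "address": new_address,
        "version": current["version"] + 1,  # version number
        "start_date": change_date,
        "end_date": None,
        "is_current": True,
    })


def apply_scd3(row, new_address):
    """Type 3: keep the prior value in an extra column on the same row."""
    row["previous_address"] = row["address"]
    row["address"] = new_address
```

    Type 2 grows the table by one row per change, while Type 1 and Type 3 keep a single row per customer.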

    73. What are semi-additive and factless facts, and in which scenario will you use such kinds of fact tables?

    Ans:- Snapshot facts are semi-additive; when we maintain aggregated facts, we go for semi-additive facts. Ex: average daily balance.

    A fact table without numeric fact columns is called a factless fact table. Ex: promotion facts.

    It is used when maintaining the promotion values of a transaction (ex: product samples), because this table doesn't contain any measures.
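
    To make the semi-additive case concrete, here is a minimal sketch using made-up daily balance snapshots. The account names, dates, and amounts are invented for illustration. Balances can be summed across accounts for a single day, but across days they are averaged instead of summed.

```python
from collections import defaultdict

# Hypothetical daily snapshot rows: (snapshot_date, account, balance)
snapshots = [
    ("2019-08-01", "acct_a", 100.0),
    ("2019-08-01", "acct_b", 50.0),
    ("2019-08-02", "acct_a", 120.0),
    ("2019-08-02", "acct_b", 50.0),
]


def total_balance_by_day(rows):
    """Summing across the account dimension is meaningful."""
    totals = defaultdict(float)
    for day, _account, balance in rows:
        totals[day] += balance
    return dict(totals)


def average_daily_balance(rows, account):
    """Across the time dimension, average the snapshots instead of summing."""
    balances = [balance for _day, acct, balance in rows if acct == account]
    return sum(balances) / len(balances)
```

    Adding the two daily balances of acct_a would give 220.0, a figure the account never held; the average of 110.0 is the meaningful number.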

    74. What are non-additive facts?

    Ans:- Non-additive facts are facts that cannot be summed up across any of the dimensions present in the fact table.
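
    A ratio such as profit margin is a common example of a non-additive fact. The sketch below uses invented figures and hypothetical field names to show why the per-row ratios cannot simply be added, and how the ratio is recomputed from its additive components.

```python
# Hypothetical fact rows; revenue and profit are additive, margin is not.
sales = [
    {"region": "east", "revenue": 200.0, "profit": 50.0},
    {"region": "west", "revenue": 100.0, "profit": 10.0},
]


def margin(profit, revenue):
    """Profit margin as a fraction of revenue."""
    return profit / revenue


def summed_margins(rows):
    """Naively adding the ratios gives a number with no business meaning."""
    return sum(margin(r["profit"], r["revenue"]) for r in rows)


def overall_margin(rows):
    """Sum the additive components first, then take the ratio."""
    total_profit = sum(r["profit"] for r in rows)
    total_revenue = sum(r["revenue"] for r in rows)
    return margin(total_profit, total_revenue)
```

    With these rows the overall margin is 60 / 300, which differs from the sum of the two regional margins.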

    75. What are conformed dimensions?

    Ans:- Conformed dimensions mean the exact same thing with every possible fact table to which they are joined. Ex: a date dimension is connected to all facts, such as sales facts, inventory facts, etc.

    76. How do you load the time dimension?

    Ans:- Time dimensions are usually loaded by a program that loops through all possible dates that may appear in the data. It is not unusual for 100 years to be represented in a time dimension, with one row per day.
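
    A loop of the kind described above might look like the following sketch. The function name build_date_dimension and the chosen columns (date_key, calendar_date, year, month, day, day_of_week) are assumptions for illustration; real date dimensions usually carry many more attributes.

```python
from datetime import date, timedelta


def build_date_dimension(start, end):
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            # Integer key in YYYYMMDD form
            "date_key": current.year * 10000 + current.month * 100 + current.day,
            "calendar_date": current.isoformat(),
            "year": current.year,
            "month": current.month,
            "day": current.day,
            "day_of_week": current.isoweekday(),  # 1 = Monday ... 7 = Sunday
        })
        current += timedelta(days=1)
    return rows
```

    Calling build_date_dimension(date(2000, 1, 1), date(2099, 12, 31)) would produce the 100-year, one-row-per-day table the answer mentions; the rows could then be bulk-inserted into the dimension table.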

    77. Why are OLTP database designs not generally a good idea for a data warehouse?

    Ans:- In OLTP, tables are normalised, so query response will be slow for the end user. Also, OLTP does not contain years of data and hence cannot be used for analysis.

    78. What type of indexing mechanism do we need to use for a typical data warehouse?

    Ans:- On the fact table it is best to use bitmap indexes. Dimension tables can use bitmap and/or the other types of indexes: clustered/non-clustered, unique/non-unique.

    To my knowledge, SQL Server does not support bitmap indexes. Only Oracle supports bitmaps.

    79. Why should you put your data warehouse on a different system than your OLTP system?

    3NF: The table is already in 2NF and contains no transitive dependencies.

    82. What is data mining?

    Ans:- Data mining is a process of extracting hidden trends within a data warehouse. For example, an insurance data warehouse can be used to mine data for the highest-risk people to insure in a certain geographical area.

    *********************** PAGE 1 and PAGE 2 Complete ********************

    http://www.newinterviewquestions.com/cat/General-Datawarehousing-interview-questions