8/9/2019 dmdw-mid 1
1/22
JNTU ONLINE EXAMINATIONS [Mid 1 - DMDW]
1. Which of the following is the most popularly available and rich information
repositories?
a. Temporal databases
b. Relational databases
c. Transactional databases
d. Spatial databases
2. Which of the following databases is used to store time-related data?
a. Spatial databases
b. Text databases
c. Multimedia databases
d. Temporal databases
3. From a DWH perspective, data mining can be viewed as an advanced stage of
a. On-Line Transaction Processing
b. On-Line Data Processing
c. On-Line Analytical Processing
d. On-Line Electronic Processing
4. A _ _ _ _ _ _ is a group of heterogeneous databases.
a. Time series databases
b. Object oriented databases
c. Legacy databases
d. Spatial databases
5. Spatial databases include
a. Legacy databases
b. Time series databases
c. Satellite image databases
d. Temporal databases
6. Many people treat data mining as a synonym for another popularly used term,
a. Knowledge discovery in databases
b. Knowledge inventory in databases
c. Knowledge acceptance in databases
d. Knowledge disposal in databases
7. A database is a collection of
a. Related data
b. Interrelated data
c. Irrelevant data
d. Distributed data
8. A relational database is a collection of
a. tables
b. events
c. attributes
d. values
9. A _ _ _ _ _ _ _ is a repository of information collected from multiple sources, stored under a unified schema, and usually residing at a single site.
a. Data mining
b. Database
c. Data warehouse
d. Legacy databases
10. Which of the following databases is used to store image, audio, and video data?
a. Heterogeneous databases
b. Temporal databases
c. Legacy databases
d. Multimedia databases
11. What is the single-dimensional association rule equivalent to the following rule, written in the predicate notation used for multidimensional association rules? contains(T, "computer") ⇒ contains(T, "software")
a. Computer ⇒ software
b. Software ⇒ computer
c. Software ⇒ computer
d. Computer ⇒ software
12. Which of the following analyses attempts to identify attributes that do not contribute to the classification or prediction process?
a. Cluster analysis
b. Outlier analysis
c. Relevance analysis
d. Evolution analysis
13. Which of the following is a summarization of the general characteristics or features of a target class of data?
a. Data discrimination
b. Data characterization
c. Data compression
d. Meta data
14. _ _ _ _ _ _ _ is a comparison of the general features of target class data objects with the general features of objects from one or a set of contrasting classes.
a. Data characterization
b. Data summarization
c. Data discrimination
d. Meta data
15. _ _ _ _ _ _ _ interestingness measures are based on user beliefs in the data.
a. Objective
b. Descriptive
c. Collective
d. Subjective
16. _ _ _ _ _ _ mining tasks characterize the general properties of the data in the databases.
a. Descriptive
b. Predictive
c. Metadata
d. Data
17. _ _ _ _ _ mining tasks perform inference on the current data in order to make predictions.
a. Descriptive
b. Predictive
c. Data
d. Metadata
18. The derived model may be represented in the form of
a. ER model
b. Flow chart
c. Decision trees
d. DFD
19. Which of the following is the classification of data mining systems?
a. Summarization
b. Visualization
c. Discrimination
d. Characterization
20. _ _ _ _ _ _ _ analysis describes and models regularities or trends for objects whose behavior changes over time.
a. Data evolution
b. Cluster
c. Outlier
d. Summarization
21. Which of the following issues relates to the diversity of database types?
a. Handling noisy or incomplete data
b. Incorporation of background knowledge
c. Handling of relational and complex types of data
d. Efficiency and scalability of data mining algorithms
22. Which of the following is not a major issue in data mining?
a. Mining methodology and user interaction issues
b. Performance issues
c. Issues relating to the diversity of database types
d. Issues relating to the measurement
23. Processing _ _ _ _ _ queries in operational databases would substantially degrade the performance of operational tasks.
a. On-Line Transaction Processing
b. On-Line Electronic Processing
c. On-Line Data Processing
d. On-Line Analytical Processing
24. An _ _ _ _ _ _ system typically adopts either a star or snowflake model and a subject-oriented database design.
a. On-Line Transaction Processing
b. On-Line Electronic Processing
c. On-Line Analytical Processing
d. On-Line Data Processing
25. The access patterns of an _ _ _ _ system consist mainly of short, atomic
transactions.
a. On-Line Analytical Processing
b. On-Line Transaction Processing
c. On-Line Electronic Processing
d. On-Line Data Processing
26. Which of the following approaches requires complex information filtering and integration processes, and competes for resources with processing at local sources?
a. Update-driven approach
b. Integrate-driven approach
c. Query-driven approach
d. Data-driven approach
27. Mining different kinds of knowledge in databases is an issue in
a. Performance issues
b. Mining methodology and user interaction issues
c. Diversity of database types issues
d. Time complexity
28. Pattern evolution is an issue related to
a. Mining methodology and user interaction issues
b. Performance issues
c. Issues relating to the diversity of database types
d. Issues relating to the measurement
29. A DWH is a subject-oriented, integrated, time-variant, and _ _ _ _ _ _ collection of data in support of management's decision-making process.
a. Nonvolatile
b. Volatile
c. Disintegrated
d. Object-oriented
30. An _ _ _ system focuses mainly on the current data within an enterprise or department, without referring to historical data or data in different organizations.
a. On-Line Analytical Processing
b. On-Line Data Processing
c. On-Line Electronic Processing
d. On-Line Transaction Processing
31. The basic characteristic of On-Line Analytical Processing is
a. Informational processing
b. Operational processing
c. Data processing
d. Data cleaning
32. Which of the following cuboids holds the highest level of summarization?
a. Cuboid
b. Base cuboid
c. Non-base cuboid
d. Apex cuboid
33. _ _ _ _ _ _ _ _ _ _ is a visualization operation that rotates the data axes in view in order to provide an alternative presentation of the data.
a. Rollup
b. Drill down
c. Pivot
d. Slice & dice
34. _ _ _ _ _ _ tables can be specified by users or experts, or automatically generated and adjusted based on data distributions.
a. Fact
b. Summarized
c. Dimension
d. Relational
35. _ _ _ _ _ _ _ executes queries involving more than one fact table.
a. Drill-through
b. Drill-across
c. Drill-down
d. Rotate
36. A _ _ _ _ _ allows data to be modeled and viewed in multiple dimensions.
a. Meta data
b. Data cube
c. Database
d. Fact table
37. The major difference between the snowflake and star schema models is that the dimension tables of the snowflake model may be kept in _ _ _ _ form.
a. Standard
b. De-normalized
c. Normalized
d. Multi dimensional
38. Which of the following is not a category of measure based on the kind of aggregate function used?
a. Cumulative
b. Distributed
c. Algebraic
d. Holistic
39. A concept hierarchy that is a total or partial order among attributes in a database schema is called a _ _ _ _ _ _ _ _ _ _ _ hierarchy.
a. Set-grouping
b. Grouping
c. Decision
d. Schema
40. Which of the following focuses on socioeconomic applications?
a. Statistical database systems
b. Online Analytical Processing systems
c. Spatial database systems
d. Temporal database systems
41. A _ _ _ _ _ _ _ _ _ model consists of radial lines emanating from a central point, where each line represents a concept hierarchy for a dimension.
a. Cube net
b. Triangle net
c. Square net
d. Star net
42. Which of the following is constructed where the enterprise warehouse is the sole custodian of all warehouse data, which is then distributed to the various dependent data marts?
a. Enterprise DWH
b. Two-tier DWH
c. Multi-tier DWH
d. Virtual warehouse
43. Which of the following is a Multidimensional Online Analytical Processing server?
a. Essbase
b. Database
c. Swiss base
d. Red brick
44. The _ _ _ _ _ _ view includes fact tables and dimension tables.
a. DWH
b. Top-down
c. Data source
d. Business Query
45. Which of the following is a Hybrid OLAP server?
a. MS SQL server 1.0
b. MS SQL 5.0
c. MS SQL server 7.0
d. MS SQL server 3.0
46. ETL stands for
a. Evaluate, Transport and Link
b. Extract, Transfer and Load
c. Error, Tracking and Load
d. Extract, Transient and Load
47. To architect the DWH, the major driving factor to support is
a. An inability to cope with requirements evolution
b. Not populating the warehouse
c. Day-to-day management of the warehouse
d. Supporting Online Transaction Processing
48. A _ _ _ _ _ _ _ contains a subset of corporate-wide data that is of value to a specific group of users.
a. Enterprise warehouse
b. Virtual warehouse
c. Data warehouse
d. Data mart
49. A _ _ _ _ _ _ _ is a set of views over operational databases.
a. Enterprise warehouse
b. Virtual warehouse
c. Data warehouse
d. Data mart
50. Which kind of intermediate server stands between a relational back-end server and client front-end tools?
a. Hybrid OLAP servers
b. Multidimensional OLAP server
c. Relational OLAP servers
d. Specialized SQL servers
51. Choose the _ _ _ _ _ _ _ _ _ that will populate each fact table record.
a. Measures
b. Dimensions
c. Grain
d. Business Process
52. How many cuboids are there in an n-dimensional data cube?
a.
b.
c.
d.
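Note on Question 52: the answer options above did not survive extraction. As an illustrative sketch only (the function names `cuboids` and `cuboid_count` are hypothetical and not part of the exam), the count can be checked by enumerating every group-by over the dimensions. The sketch assumes the usual textbook setting: with no concept hierarchies an n-dimensional cube has 2^n cuboids, and if dimension i has L_i hierarchy levels (not counting the virtual top level "all") the total is the product of (L_i + 1).

```python
from itertools import combinations


def cuboids(dimensions):
    """Return every group-by (cuboid) over the dimensions, apex cuboid first."""
    dims = list(dimensions)
    return [combo
            for k in range(len(dims) + 1)
            for combo in combinations(dims, k)]


def cuboid_count(levels_per_dimension):
    """Total cuboids when dimension i has L_i levels (excluding 'all')."""
    total = 1
    for levels in levels_per_dimension:
        total *= levels + 1
    return total


all_cuboids = cuboids(["time", "item", "location"])
print(len(all_cuboids))   # 8, i.e. 2**3
print(all_cuboids[0])     # () is the apex cuboid (highest summarization)
print(all_cuboids[-1])    # ('time', 'item', 'location') is the base cuboid
```

The empty group-by is the apex cuboid and the group-by over all dimensions is the base cuboid, which matches the terminology used in Question 32.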
53. Meta data repository contains
a. Operational meta data
b. Data irrelevant to system performance
c. The mapping from the DWH to the operational environment
d. Summarized data
54. Which of the following supports bitmap indices?
a. Sybase IQ
b. Oracle 7
c. COBOL
d. SQL
55. _ _ _ _ _ _ _ are created for the data names and definitions of the given warehouse.
a. Data cube
b. Summarized data
c. Meta data
d. Detailed information
56. The chunking technique involves "overlapping" some of the aggregation computations; it is referred to as _ _ _ _ _ aggregation in data cube computation.
a. Two way array
b. Three way array
c. Multi way array
d. Sparse array
57. The _ _ _ _ _ _ _ operator computes aggregates over all subsets of the dimensions specified in the operation.
a. Data base
b. Compute cube
c. Define cube
d. Group by
58. Which of the following is a subcube that is small enough to fit into the memory available for cube computation?
a. Bulk
b. Array
c. Structure
d. Chunk
59. The bitmapped join index method is an integrated form of
a. Composite join indexing and bitmap indexing
b. Join indexing and composite join indexing
c. Join indexing and bitmap indexing
d. Bitmap indexing and outer join indexing
60. A set of attributes in a relation schema that forms a primary key for another relation schema is called a _ _ _ _ _ _ _
a. Primary key
b. Foreign key
c. Secondary key
d. Composite key
61. Which of the following typically gathers data from multiple, heterogeneous, and external sources?
a. Data cleaning
b. Load
c. Refresh
d. Data extraction
62. OLAM is particularly important for the following reason
a. High quality of data in DWH
b. Data processing
c. OLTP-based exploratory data analysis
d. Online selection of data mining functions
63. Which of the following sets a good example for interactive data analysis and provides the necessary preparations for exploratory data mining?
a. OLP
b. OLAP
c. OLTP
d. OLDP
64. Which of the following is not an exception indicator?
a. Out Exp
b. Self Exp
c. In Exp
d. Path Exp
65. _ _ _ _ _ _ _ _ _ can help business managers find and reach more suitable customers, as well as gain critical business insights that may help to drive market share and raise profits.
a. Data warehouse
b. Data mining
c. Data summarization
d. Data processing
66. _ _ _ _ _ _ _ _ _ _ _ is an alternative approach in which pre-computed measures indicating data exceptions are used to guide the user in the data analysis process at all levels of aggregation.
a. Hypothesis-driven exploration
b. Inventory-driven exploration
c. Discovery-driven exploration
d. Exception-driven exploration
67. Which of the following is an exception indicator that indicates the degree of surprise of the cell value, relative to other cells at the same level of aggregation?
a. Out Exp
b. In Exp
c. Path Exp
d. Self Exp
68. _ _ _ _ _ is a powerful paradigm that integrates OLAP with data mining technology.
a. Online Analytical Modeling
b. Online Analytical Machine
c. Online Analytical Mining
d. Online Analytical Monitoring
69. Data warehouse application is _ _ _ _ _ _ _ _ _
a. Data processing
b. Transaction processing
c. Data cube
d. Data mining
70. _ _ _ _ _ _ _ _ _ cubes compute complex queries involving multiple dependent aggregates at multiple granularities.
a. Multi feature
b. Data
c. Meta
d. Solid
71. Which of the following performs a linear transformation on the original data?
a. Z-score normalization
b. Normalization with decimal scaling
c. Zero-standard deviation
d. Min-max normalization
72. Which of the following is the best method for handling missing values in data cleaning?
a. Fill in the missing value manually
b. Use the most probable value to fill in the missing value
c. Use the attribute mean to fill in the missing value
d. Use a global constant to fill in the missing value
73. The minimum and maximum values in a given bin are identified as the
a. Bin means
b. Bin average
c. Bin medians
d. Bin boundaries
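Note on binning (Questions 73 and 76): the sketch below shows one common way to smooth sorted values by consulting their neighborhood. It assumes equal-depth (equal-frequency) bins, and the function names are hypothetical, chosen only for this illustration. Ties between the two boundaries are resolved toward the lower boundary, which is an arbitrary choice.

```python
def equal_depth_bins(values, depth):
    """Sort the values and cut them into bins holding `depth` values each."""
    ordered = sorted(values)
    return [ordered[i:i + depth] for i in range(0, len(ordered), depth)]


def smooth_by_means(bins):
    """Replace every value in a bin by the mean of that bin."""
    return [[sum(b) / len(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value by the closer of the bin's minimum and maximum."""
    smoothed = []
    for b in bins:
        lo, hi = b[0], b[-1]  # the bin boundaries of a sorted bin
        smoothed.append([lo if v - lo <= hi - v else hi for v in b])
    return smoothed


prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
bins = equal_depth_bins(prices, 3)
print(bins)                        # [[4, 8, 15], [21, 21, 24], [25, 28, 34]]
print(smooth_by_means(bins))       # [[9.0, 9.0, 9.0], [22.0, 22.0, 22.0], [29.0, 29.0, 29.0]]
print(smooth_by_boundaries(bins))  # [[4, 4, 15], [21, 21, 24], [25, 25, 34]]
```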
74. Which of the following is a data transformation operation?
a. Normalization
b. Regression
c. Clustering
d. Binning
75. The correlation between attributes A and B can be measured by
a.
b.
c.
d.
76. _ _ _ _ _ methods smooth a sorted data value by consulting its neighborhood, i.e., the values around it.
a. Clustering
b. Binning
c. Regression
d. Data reduction
77. Z-score normalization is also called
a. Min-max normalization
b. Zero-standard deviation normalization
c. Zero-mean normalization
d. Normalization by decimal scaling
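Note on normalization (Questions 71 and 77): a minimal sketch of the three normalization methods named in the options, with hypothetical function names. It assumes the attribute is not constant (otherwise the divisions below are undefined) and uses the population standard deviation for the z-score, which is one of several conventions.

```python
from statistics import mean, pstdev


def min_max(values, new_min=0.0, new_max=1.0):
    """Linear rescaling of [min, max] onto [new_min, new_max]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min
            for v in values]


def z_score(values):
    """Subtract the mean and divide by the standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def decimal_scaling(values):
    """Divide by 10**j, with j the smallest integer giving max(|v'|) < 1."""
    largest = max(abs(v) for v in values)
    j = 0
    while largest / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


incomes = [12000, 73600, 98000]
print(min_max(incomes))              # the middle value maps to about 0.716
print(z_score(incomes))              # the results sum to (approximately) zero
print(decimal_scaling([-986, 917]))  # [-0.986, 0.917]
```

Min-max normalization preserves the relationships among the original values because it is a single linear map; the z-score output has mean zero, which is where its alternative name comes from.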
78. _ _ _ _ _ _ is a random error or variance in a measured variable.
a. Bin
b. Cluster
c. Noise
d. Regression
79. Consolidating the data into forms appropriate for mining is called
a. Data reduction
b. Data redundancy
c. Data cleaning
d. Data transformation
80. Which of the following is a decision tree algorithm?
a. C3.2
b. ID3
c. PP2
d. DIM
81. If the tuples in D are grouped into M mutually disjoint clusters, then a simple random sample of m clusters can be obtained, where m < M. Which of the following suits the above sentence?
a. Stratified sample
b. SRS without replacement
c. Cluster sample
d. SRS with replacement
82. Multidimensional index trees include
a. A-trees
b. T-trees
c. P-trees
d. R-trees
83. In which of the following data reduction strategies may irrelevant, weakly relevant, or redundant attributes be detected and removed?
a. Data cube aggregation
b. Dimension reduction
c. Data compression
d. Numerosity reduction
84. In database systems, _ _ _ _ _ are primarily used for providing fast data access.
a. Red-black trees
b. Game trees
c. Multidimensional index trees
d. Splay trees
85. If the mining task is classification, and the mining algorithm itself is used to determine the attribute subset, then this is called a _ _ _ _ _ _ approach.
a. Filter
b. Reduction
c. Smoothing
d. Wrapper
86. The discrete wavelet transformation is closely related to the _ _ _ _ _ _ _ transform.
a. Discrete Fourier
b. Fourier
c. Laplace
d. Wavelet
87. Principal components analysis is also called the
a. Karhunen-Loeve method
b. Kinen-liva method
c. Kruskal-learn method
d. Kutni-lara method
88. _ _ _ _ _ _ can be used as a data reduction technique since it allows a large data set to be represented by a much smaller random subset of the data.
a. Clustering
b. Regression
c. Histograms
d. Sampling
89. Log-linear models are
a. Parametric methods
b. Discrete methods
c. Non-parametric methods
d. Non-discrete methods
90. Which of the following is a method for the generation of concept hierarchies for categorical data?
a. Specification of a portion of a hierarchy by implicit data grouping
b. Specification of their partial ordering, but not of a set of attributes
c. Specification of a set of attributes, but not of their partial order
d. Specification of only a partial set of entities
91. Which of the following methods uses class information?
a. Histogram analysis
b. Binning
c. Cluster analysis
d. Entropy-based discretization
92. _ _ _ _ _ _ _ _ _ hierarchies for categorical attributes or dimensions typically involve a group of attributes.
a. Discretization
b. Semantic
c. Index
d. Concept
93. Which of the following is based on the maximal asset values, which may lead to a highly biased hierarchy?
a. Cluster analysis
b. Segmentation
c. Binning
d. Histogram analysis
94. The _ _ _ _ _ can be used to segment numeric data into relatively uniform, "natural" intervals.
a. 1-2-3 rule
b. 2-3-4 rule
c. 3-4-5 rule
d. 4-5-6 rule
95. _ _ _ _ _ _ _ _ hierarchies for numeric attributes can be constructed automatically based on data distribution analysis.
a. Concept
b. Discretization
c. Tree
d. Index
96. _ _ _ _ _ _ _ techniques can be used to reduce the number of values for a given continuous attribute, by dividing the range of the attribute into intervals.
a. Concept hierarchy
b. Discretization
c. Tree-based
d. Index
97. A _ _ _ _ _ _ _ _ _ algorithm can be applied to partition data into groups.
a. Binning
b. Histogram
c. Clustering
d. Entropy-based
98. An information-based measure called _ _ _ _ can be used to recursively partition the values of a numeric attribute A, resulting in a hierarchical discretization.
a. Entropy
b. Cluster
c. Binning
d. Segmentation
99. The kinds of knowledge include
a. Image analysis
b. Query process
c. Association
d. Multimedia analysis
100. Which of the following is a simplicity measure?
a. Rule strength
b. Rule quality
c. Rule reliability
d. Rule length
101. _ _ _ _ _ _ hierarchies can be used to refine or enrich schema-defined hierarchies when the two types of hierarchies are combined.
a. Schema
b. Set-grouping
c. Operation-derived
d. Rule-based
102. _ _ _ _ _ _ _ are those that contribute new information or increased performance to the given pattern set.
a. Utility patterns
b. Certainty patterns
c. Novelty patterns
d. Simplicity patterns
103. Certainty factor is also known as
a. Rule length
b. Noise threshold
c. Minable view
d. Rule strength
104. Which of the following primitives specifies the data mining functions to be performed?
a. Task-relevant data
b. The kind of knowledge to be mined
c. Background knowledge
d. Interestingness measures
105. _ _ _ _ _ _ _ may be used to guide the mining process or, after discovery, to evaluate the discovered patterns.
a. Task-relevant data
b. The kind of knowledge to be mined
c. Background knowledge
d. Interestingness measures
106. A _ _ _ _ _ hierarchy is a total or partial order among attributes in the database schema.
a. Schema
b. Set-grouping
c. Operation-derived
d. Rule-based
107. Given a set of task-relevant data tuples, the confidence of "A ⇒ B" is defined as
a.
b.
c.
d.
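Note on Questions 107 and 109: the formula options for confidence and support did not survive extraction. As a hedged illustration, the sketch below uses the standard definitions for an association rule A ⇒ B, where support is the fraction of transactions containing every item of A and B together, and confidence is the fraction of transactions containing A that also contain B. The function names and the sample transactions are made up for this example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(items <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(A and B together) / support(A) for the rule A => B."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical market-basket data, for illustration only.
baskets = [
    {"computer", "software"},
    {"computer"},
    {"computer", "software", "printer"},
    {"printer"},
]
print(support(baskets, {"computer", "software"}))       # 0.5
print(confidence(baskets, {"computer"}, {"software"}))  # 2 of 3, about 0.667
```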
108. _ _ _ _ _ hierarchies include the decoding of information-encoded strings, information extraction from complex data objects, and data clustering.
a. Rule-based
b. Operation-derived
c. Schema
d. Set grouping
109. For association rules of the form "A ⇒ B", where A and B are sets of items, support is defined as
a.
b.
c.
d.
110. Which of the following clauses is the task-irrelevant data primitive?
a. In relevance to
b. Use for warehouse
c. Analysis
d. Order by
111. Mining with the use of _ _ _ _ allows additional flexibility for ad hoc rule mining.
a. Image patterns
b. Data patterns
c. Information patterns
d. Meta patterns
112. Which of the following clauses lists the attributes or dimensions for exploration?
a. Order by
b. Group by
c. Having
d. In relevance to
113. Which of the following clauses uses the meta pattern?
a. Analyze
b. In relevance to
c. Matching
d. Use data warehouse
114. Which of the following clauses is used for discrimination?
a. Mine characteristics
b. Mine discriminant
c. Mine association
d. Mine comparison
115. The expansion of DMQL is
a. Data Modeling Queue Level
b. Design Modeling Query Language
c. Data Mining Query Language
d. Data & Meta data Query Language
116. The _ _ _ _ _ clause, when used for characterization, specifies aggregate measures, such as count, sum, or count%.
a. Use database
b. Analyze
c. Matching
d. Use hierarchy
117. Which of the following clauses specifies the condition by which groups of data are considered relevant?
a. Having
b. Group by
c. Order by
d. Analyze
118. The _ _ _ _ _ _ _ _ statement is used to specify the kind of knowledge to be mined.
a. Knowledge-mine-specification
b. Mine-knowledge-specification
c. Knowledge-specification-mine
d. Specification-mine-knowledge
119. An example of interestingness measures and threshold values is
a. Without support threshold=
b. With confidence threshold=
c. Without confidence threshold=
d. With support threshold=
120. CRISP-DM addresses the issue of
a. Mapping from data mining problems to business issues
b. Capturing and misunderstanding the data
c. Disintegrating data mining results within the business context
d. Deploying and maintaining data mining results
121. An example of a set-grouping hierarchy is
a. Define hierarchy age-hierarchy for age as customer on level1:{young, middle-aged, senior} level10:all level2:{20 39} level1: young level2:{20 59} level1: middle-aged level2:{60 89} level1: senior
b. Define hierarchy age-hierarchy as age for customer on level1:{young, middle-aged, senior} level10:all level2:{20 39} level1: young level2:{20 59} level1: middle-aged level2:{60 89} level1: senior
c. Define hierarchy age-hierarchy for age on customer as level1:{young, middle-aged, senior} level10:all level2:{20 39} level1: young level2:{20 59} level1: middle-aged level2:{60 89} level1: senior
d. Define hierarchy age-hierarchy on age for customer as level1:{young, middle-aged, senior} level10:all level2:{20 39} level1: young level2:{20 59} level1: middle-aged level2:{60 89} level1: senior
122. Which of the following data mining languages uses SQL-like syntax and serves as rule generation queries for mining association rules?
a. MINE RULE operator
b. RULE MINE operator
c. DATA MINE operator
d. DWH operator
123. Which of the following is not a data mining language?
a. DMQL
b. MSQL
c. PSQL
d. OLE DB for
124. System of schema hierarchy is
a. Define hierarchy location-hierarchy on address as [street, city, country]
b. Define hierarchy location-hierarchy as address on [street, city, country]
c. Define hierarchy location-hierarchy from address to [street, city, country]
d. Define hierarchy location-hierarchy for address all [street, city, country]
125. The DMQL statement syntax is
a. display as result _ from
b. display result _ from
c. display on result _ from
d. display for result _ from
126. Which of the following is a data mining query language
a. PSQL
b. QSQL
c. MSQL
d. RSQL
127. _ _ _ _ _ is used for efficient implementations of a few essential data mining primitives.
a. No coupling
b. Loose coupling
c. Tight coupling
d. Semi tight coupling
128. _ _ _ _ _ _ _ is a compromise between loose and tight coupling.
a. No coupling
b. Loose coupling
c. Tight coupling
d. Semi tight coupling
129. Which of the following coupling schemes is used to fetch data from a data repository managed by database systems?
a. No coupling
b. Loose coupling
c. Tight coupling
d. Semi tight coupling
130. A well designed data mining system should offer _ _ _ _ _ _ _ with a data warehouse system.
a. Semi tight coupling
b. No coupling
c. Loose coupling
d. Normal coupling
131. With which of the following is it difficult to achieve high scalability and good performance with large data sets?
a. No coupling
b. Tight coupling
c. Semi tight coupling
d. Loose coupling
132. _ _ _ _ _ _ _ _ means that a data mining system will not utilize any function of a data warehouse system.
a. Loose coupling
b. Semi tight coupling
c. Loose coupling
d. No coupling
133. _ _ _ _ _ _ _ _ means that a data mining system is smoothly integrated into a database system.
a. No coupling
b. Loose coupling
c. Tight coupling
d. Semi tight coupling
134. Which of the following provides a concise and succinct summarization of the given collection of data?
a. Comparison
b. Characterization
c. Summarization
d. Aggregation
135. _ _ _ _ _ _ _ _ data mining describes the data set in a concise and summarative manner and presents interesting general properties of the data.
a. Descriptive
b. Predictive
c. Active
d. Constructive
136. _ _ _ _ _ _ data mining analyzes the data in order to construct one or a set of models and attempts to predict the behavior of new data sets.
a. Descriptive
b. Predictive
c. Active
d. Constructive
137. Attribute removal is based on the following rule: If there is a large set of distinct values for an attribute of the initial working relation, but
a. There is a generalization operator on the attribute
b. There is no generalization operand on the attribute
c. There is no generalization operator on the attribute
d. There is no aggregation operator on the attribute
138. On-line analysis processing in data warehouses is a purely _ _ _ _ _ -controlled process.
a. Machine
b. Database
c. Developer
d. User
139. Which of the following approaches is used to control the generalization process?
a. Generalized relation threshold control
b. Generalized class threshold control
c. Generalized dimension threshold control
d. Generalized query threshold control
140. Many current OLAP systems confine dimensions to _ _ _ _ _ _ _ _ _ _ data.
a. Numeric
b. Non numeric
c. Meta
d. Summarized
141. _ _ _ _ _ _ _ is a process that abstracts a large set of task-relevant data in a database from a relatively low conceptual level to higher conceptual levels.
a. Data realization
b. Data characterization
c. Data summarization
d. Data generalization
142. The _ _ _ _ _ _ approach can be considered as a data warehouse-based, pre-computation-oriented, materialized-view approach.
a. Object-oriented induction
b. Data cube
c. Attribute-oriented induction
d. Data square
143. Which of the following approaches is a relational database query-oriented, generalization-based, on-line data analysis technique?
a. Attribute-oriented induction
b. Object-oriented approach
c. Data cube
d. Data square
144. _ _ _ _ _ _ _ _ performs off-line aggregation before an OLAP or data mining query is submitted for processing.
a. Object-oriented induction
b. Data cube
c. Attribute-oriented induction
d. Data square
145. The range of t-weight is
a.
b.
c.
d.
146. How can the t-weight and interestingness measures in general be used by the data mining system to display only the concept descriptions that it objectively evaluates as interesting?
a. By threshold
b. By generalization
c. By comparison
d. By characterization
147. The data cube implementation of attribute-oriented induction can be performed by
a. Using a defined data cube
b. Using a predefined data cube
c. Using a generalized data cube
d. Using a quantified data cube
148. A _ _ _ _ _ can be represented by a 3-D data cube.
a. Cross-tab
b. Bar chart
c. Pie chart
d. Flow chart
149. Step one of the attribute-oriented induction algorithm is essentially a relational query to collect the task-relevant data into the _ _ _ _ _ _ _ _ _ _ _ .
a. Prime relation
b. Secondary relation
c. Working relation
d. Analyzing relation
150. Which of the following relations collects the statistics of the attribute-oriented induction algorithm?
a. Working relation
b. Prime relation
c. Secondary relation
d. Analyzing relation
151. Descriptions can also be visualized in the form of _ _ _ _ _ _ _ _ .
a. Cross-relations
b. Cross-checks
c. Cross-boards
d. Cross-tabs
152. Step three of attribute-oriented induction derives the _ _ _ _ _ _ _ relation.
a. Working
b. Prime
c. Secondary
d. Analysing
153. The _ _ _ _ _ _ is an interestingness measure that describes the typicality of each disjunct in the rule, or of each tuple in the corresponding generalized relation.
a. Quantitative rule
b. Quantitative characteristic rule
c. c-weight
d. t-weight
154. The information gain is obtained by
a. Expected information + entropy
b. Entropy - Expected information
c. Expected information entropy
d. Entropy Expected information
155. The expected information needed to classify a given sample is
a. I(s_1, s_2, ..., s_m) = \sum_{i=1}^{n} ( /s) ( /s)
b. I(s_1, s_2, ..., s_m) = ( /s) ( /s)
c. I(s_1, s_2, ..., s_m) = - \sum_{i=1}^{n} ( /s) ( /s)
d. I(s_1, s_2, ..., s_m) = - \sum_{i=1}^{n} ( /s) ( /s)
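Note on Questions 154 and 155: parts of the formula options above were lost in extraction. As a hedged sketch, the code below follows the common textbook convention in which the expected information for class counts s_1, ..., s_m out of s samples is I = -sum((s_i/s) * log2(s_i/s)), the entropy E(A) of a split on attribute A is the weighted average of I over the resulting partitions, and Gain(A) = I - E(A). The function names and the sample counts are assumptions made for this illustration.

```python
from math import log2


def expected_information(counts):
    """I(s_1, ..., s_m): information needed to classify a sample."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def entropy_of_split(partitions):
    """E(A): weighted expected information over the partitions made by A."""
    total = sum(sum(p) for p in partitions)
    return sum(sum(p) / total * expected_information(p) for p in partitions)


def information_gain(class_counts, partitions):
    """Gain(A) = I(s_1, ..., s_m) - E(A)."""
    return expected_information(class_counts) - entropy_of_split(partitions)


# 14 samples: 9 of class "yes" and 5 of class "no"; an attribute splits
# them into three partitions with (yes, no) counts as listed.
classes = [9, 5]
split = [[2, 3], [4, 0], [3, 2]]
print(round(expected_information(classes), 3))     # 0.94
print(round(entropy_of_split(split), 3))           # 0.694
print(round(information_gain(classes, split), 3))  # 0.247
```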
156. Class comparison is also called
a. composition
b. aggregation
c. discrimination
d. characterization
157. _ _ _ _ _ _ can be used to perform some preliminary relevance analysis on the data by removing or generalizing attributes having a very large number of distinct values.
a. Object-oriented induction
b. Attribute-oriented induction
c. Batch-oriented induction
d. Class-oriented induction
158. Class characterization that includes the analysis of attribute/dimension relevance is called _ _ _ _ _ .
a. Analytical comparison
b. Analytical measurement
c. Analytical characterization
d. Analytical difference
159. _ _ _ _ _ _ _ irrelevant and weakly relevant attributes using the selected relevance analysis measure.
a. Insert
b. Update
c. Modify
d. Remove
160. The _ _ _ _ _ class is the class to be characterized.
a. base
b. target
c. contrasting
d. sub
161. The _ _ _ _ _ _ class is the set of comparable data that are not in the target class.
a. base
b. target
c. contrasting
d. sub
162. Generalization is performed on the _ _ _ _ _ _ _ _ to the level controlled by a user- or expert-specified dimension threshold, which results in a _ _ _ _ _ _ _
a. Target class, Prime target class relation
b. Contrasting class, Prime contrasting class relation
c. Target class, Secondary target class relation
d. Contrasting class, Secondary contrasting class relation
163. Let be a generalized tuple, and be the target class, the d-weight is defined as
a. d-weight = condition( ) / count( )
b. d-weight = condition( ) / \sum_{i=1}^{m} count( )
c. d-weight = condition( ) / count( )
d. d-weight = condition( ) / count( )
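Note on Question 163 (and the t-weight of Questions 145 and 153): the symbols in the options above were lost in extraction. As a sketch under the usual textbook definitions, the d-weight of a generalized tuple for the target class is the number of target-class tuples it covers divided by the number of tuples it covers across the target and all contrasting classes, while the t-weight is the tuple's count divided by the total count of all generalized tuples within one class. Both therefore fall between 0 and 1 (0% to 100%). The function names and the counts are hypothetical.

```python
def d_weight(counts_by_class, target):
    """counts_by_class: class name -> tuples covered by ONE generalized tuple."""
    return counts_by_class[target] / sum(counts_by_class.values())


def t_weight(counts_in_class, tuple_key):
    """counts_in_class: generalized tuple -> its count within ONE class."""
    return counts_in_class[tuple_key] / sum(counts_in_class.values())


# One generalized tuple covering 90 target-class (graduate) tuples and
# 210 contrasting-class (undergraduate) tuples.
covered = {"graduate": 90, "undergraduate": 210}
print(d_weight(covered, "graduate"))       # 0.3
print(d_weight(covered, "undergraduate"))  # 0.7

# Counts of three generalized tuples inside a single class.
in_class = {"tuple_1": 16, "tuple_2": 22, "tuple_3": 62}
print(t_weight(in_class, "tuple_3"))       # 0.62
```

A high d-weight for the target class means the concept comes mainly from the target class; a low one means it comes mainly from the contrasting classes.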
164. Can class comparison mining be implemented efficiently using data cube techniques?
a. yes
b. no
c. limited
d. difficult
165. Class discrimination is also called
a. class comparison
b. class hierarchy
c. class aggregation
d. class concept
166. The set of relevant data in the database is collected by query processing and is partitioned respectively into a target class and one or a set of _ _ _ _ _ class(es).
a. discrimination
b. contrasting
c. comparable
d. target
167. The range for the d-weight is
a.
b.
c.
d.
168. A _ _ _ _ _ _ d-weight in the target class indicates that the concept represented by the generalized tuple is primarily derived from the target class.
a. Low
b. High
c. Average
d. Middle
169. A _ _ _ _ _ _ d-weight implies that the concept is primarily derived from the contrasting class.
a. Low
b. High
c. Average
d. Middle
170. A quantitative discriminant rule for the target class of a given comparison description is written in the form
a. x, target_class(x) compare(x) [d: d-weight]
b. x, contrasting_class(x) condition(x) [d: d-weight]
c. x, contrasting_class(x) compare(x) [d: d-weight]
d. x, target_class(x) condition(x) [d: d-weight]
171. In d-weight, d stands for
a. divide
b. dead
c. discrimination
d. degree
172. The interquartile range is defined as
a. First quartile - Third quartile
b. First quartile + Third quartile
c. Third quartile + First quartile
d. Third quartile - First quartile
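Note on quartiles and outliers (Question 172): a minimal sketch of the interquartile range and the widely used rule of thumb that flags values lying more than 1.5 × IQR beyond the quartiles. The function name is hypothetical, and the quartiles are computed with the "inclusive" method of Python's `statistics.quantiles`; other quartile conventions can give slightly different cut points.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return (Q1, Q3, IQR, suspected outliers) for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # third quartile minus first quartile
    low, high = q1 - k * iqr, q3 + k * iqr
    return q1, q3, iqr, [v for v in values if v < low or v > high]


q1, q3, iqr, suspects = iqr_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(q1, q3, iqr)  # 3.0 7.0 4.0
print(suspects)     # [100]
```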
173. One common rule of thumb for identifying suspected outliers is to single out values falling at least _ _ _ _ _ _ _ above the third quartile or below the first quartile.
a.
b.
c.
d.
174. The most commonly used percentiles other than the median are _ _ _ _ _ _
a. Outliers
b. Boxplots
c. Quartiles
d. Modes
175. A popularly used visual representation of a distribution is the _ _ _ _ _ _ _ _
a. Boxplot
b. Outlier
c. Quartile
d. Histogram
176. Dispersion is also called
a. Mean
b. Variance
c. Median
d. Mode
177. Which of the following is a central tendency measure?
a. Outliers
b. Variance
c. Quartiles
d. Mode
178. Which of the following is a data dispersion measure?
a. Mean
b. Variance
c. Mode
d. Median
179. The average of the largest and smallest values in a data set is called the
a. Median
b. Mean
c. Mid range
d. Mode
180. The _ _ _ _ _ _ _ _ for a set of data is the value that occurs most frequently in the set.
a. Median
b. Mean
c. Mid range
d. Mode
181. Which of the following is not a central tendency measure?
a. Variance
b. Mean
c. Median
d. Mode
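Note on Questions 176 to 181: the measures named in these options can be computed with Python's standard `statistics` module, as sketched below. The helper `midrange` and the sample data are assumptions made for this illustration; the variance shown is the population variance, which is one of two common conventions.

```python
from statistics import mean, median, multimode, pvariance


def midrange(values):
    """Average of the smallest and largest values."""
    return (min(values) + max(values)) / 2


# Hypothetical salaries, in thousands.
salaries = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]

summary = {
    "mean": mean(salaries),          # central tendency
    "median": median(salaries),      # central tendency
    "modes": multimode(salaries),    # most frequent value(s); bimodal here
    "midrange": midrange(salaries),  # central tendency
    "variance": pvariance(salaries), # dispersion, not central tendency
}
print(summary)
```

For this data the mean is 58, the median is 54, the modes are 52 and 70, and the midrange is 70.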
182. A _ _ _ _ _ _ _ _ is one of the most effective graphical methods for determining whether there is a relationship or trend between two quantitative variables.
a. q-q plot
b. scatter plot
c. quantile plot
d. q-q-q plot
183. A _ _ _ _ _ _ _ _ is another important exploratory graphic aid that adds a smooth curve to a scatter plot in order to provide better perception of the pattern of dependence.
a. Loess curve
b. Scatter curve
c. Bar chart
d. Quantile plot
184. Histograms are also called _ _ _ _ _ _ _ _ _ histograms.
a. frequency
b. variance
c. quartile
d. outlier
185. The word loess is short for
a. Load compression
b. Local compression
c. Load regression
d. Local regression
186. A _ _ _ _ _ _ _ _ _ consists of a set of rectangles that reflect the counts of the classes present in the given data.
a. Quartile plot
b. q-q plot
c. Histogram
d. Loess curves
187. A _ _ _ _ _ _ is a simple and effective way to have a first look at a univariate data distribution.
a. q-q plot
b. scatter plot
c. histogram
d. quantile plot
188. A _ _ _ _ _ _ _ _ _ graphs the quantiles of one univariate distribution against the corresponding quantiles of another.
a. quantile plot
b. q-q-q plot
c. q-q plot
d. Scatter plot