B.Sc Computer Science and Information Technology

Institute of Science and Technology, TU

Data Warehousing and Data Mining (CSC420)

Year Asked: 2079, syllabus-wise questions

Classification and Prediction
1.
When is a multilayer perceptron a better choice than other classification algorithms? Consider the multilayer feed-forward neural network given below. Let the learning rate be 0.5. Assume the initial values of the weights and biases are as given in the table below. Train the network on the training tuples (1, 1, 0) and (0, 1, 1), where the last number in each tuple is the target output. Show the weight and bias updates using the back-propagation algorithm. Assume that the sigmoid activation function is used in the network.

$\begin{array}{c|c|c|c|c|c|c|c|c} w_{13} & w_{14} & w_{23} & w_{24} & w_{35} & w_{45} & b_3 & b_4 & b_5 \\ \hline 0.5 & 0.2 & -0.3 & 0.5 & 0.1 & 0.3 & 0.6 & -0.4 & 0.8 \\ \end{array}$
[10]
2.
What is a confusion matrix? Discuss various classification measures along with their mathematical formulae. [5]
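As a study aid for the confusion matrix question above, the following is a minimal sketch of how the usual measures follow from the four cells of a binary confusion matrix. The function name `classification_measures` and the counts passed to it are hypothetical and chosen only for illustration; they do not come from the question paper.

```python
def classification_measures(tp, fp, fn, tn):
    """Return common measures derived from a binary confusion matrix.

    tp, fp, fn, tn are the true-positive, false-positive,
    false-negative and true-negative counts.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)   # fraction of predicted positives that are correct
    recall = tp / (tp + fn)      # sensitivity, or true positive rate
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),   # true negative rate
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts, used only to exercise the formulas.
measures = classification_measures(tp=40, fp=10, fn=20, tn=30)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

The sketch assumes every denominator is non-zero; a fuller version would guard against classes with no predicted or actual positives.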
Cluster Analysis
1.
What are the two categories of hierarchical clustering? Divide the following data points into two clusters using agglomerative clustering: { (2,10), (2,5), (8,4), (5,8), (7,5), (6,4) } [5]
2.
Discuss the concepts of the K-means++ and Mini-batch K-means algorithms. [5]
Data Cube Technology
1.
Suppose that we have 5-dimensional data. What will be the total number of cuboids generated? If each dimension has 5 levels, what will be the number of cuboids generated? [5]
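The cuboid count asked for above can be checked with a short sketch. It uses the standard formula, total cuboids = ∏ (L_i + 1), where L_i is the number of hierarchy levels of dimension i and the extra 1 accounts for the virtual top level "all". Reading "5 levels" as excluding "all" is an assumption about the question's intent, and `count_cuboids` is a hypothetical helper name.

```python
from math import prod


def count_cuboids(levels_per_dimension):
    """Total cuboids for a cube whose i-th dimension has the given number
    of hierarchy levels (not counting the virtual top level 'all')."""
    return prod(levels + 1 for levels in levels_per_dimension)


# No concept hierarchies: each dimension has one level, giving 2**n cuboids.
no_hierarchy = count_cuboids([1] * 5)

# Each of the 5 dimensions has 5 levels (assumed to exclude 'all').
with_hierarchy = count_cuboids([5] * 5)

print(no_hierarchy, with_hierarchy)
```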
Data Preprocessing
1.
Discuss different types of attributes with a suitable example of each. [5]
Graph Mining and Social Network Analysis
1.
What are the application areas of graph mining? Explain the concept behind inductive logic programming with a suitable demonstration. [5]
Introduction to Data Warehousing
1.
Why are OLAP operations used? Discuss various OLAP operations with a suitable example of each. [10]
2.
Why is data normalization important in data mining? Explain the min-max and Z-score normalization approaches. [5]
3.
Write short notes on: a. Data Mart b. Market Basket Analysis [5]
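For the normalization question in this unit, a minimal sketch of both approaches may help when checking hand calculations. Min-max maps a value v to (v - min) / (max - min) * (new_max - new_min) + new_min, and Z-score maps it to (v - mean) / standard deviation. The sample values in `data` are hypothetical, and using the population standard deviation (`pstdev`) is an assumption; some texts use the sample standard deviation instead.

```python
from statistics import mean, pstdev


def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly into the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def z_score(values):
    """Centre values on the mean and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical attribute values, for illustration only.
data = [200, 300, 400, 600, 1000]
scaled = min_max(data)
standardized = z_score(data)

print(scaled)
print([round(z, 4) for z in standardized])
```

Both functions assume the values are not all identical, since that would make the denominator zero.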
Mining Frequent Patterns
1.
Discuss any two drawbacks of the Apriori algorithm. Find the frequent itemsets and association rules from the transaction database given below using the FP-growth algorithm. Assume the minimum support is 50% and the minimum confidence is 60%.

$\begin{array}{c|c} \text{Transaction\_ID} & \text{Items purchased} \\ \hline 1 & \text{Sausage, peanut, Beer} \\ 2 & \text{peanut, Beer, Apple} \\ 3 & \text{Apple, Milk} \\ 4 & \text{Sausage, peanut, Apple} \\ 5 & \text{Sausage, peanut, Beer, Milk} \\ 6 & \text{Sausage, peanut, Beer, Apple} \\ \end{array}$
[10]
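A hand-built FP-tree solution to the question above can be cross-checked with the brute-force sketch below. It is not FP-growth: it simply enumerates every candidate itemset and counts its support directly, which is only practical for a database this small. The variable and function names are hypothetical; the transactions and thresholds are taken from the question.

```python
from itertools import combinations

transactions = [
    {"Sausage", "peanut", "Beer"},
    {"peanut", "Beer", "Apple"},
    {"Apple", "Milk"},
    {"Sausage", "peanut", "Apple"},
    {"Sausage", "peanut", "Beer", "Milk"},
    {"Sausage", "peanut", "Beer", "Apple"},
]
min_support, min_confidence = 0.5, 0.6
min_count = min_support * len(transactions)  # 50% of 6 transactions


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions)


# Exhaustively enumerate candidate itemsets and keep the frequent ones.
items = sorted(set().union(*transactions))
frequent = {}
for k in range(1, len(items) + 1):
    for candidate in map(frozenset, combinations(items, k)):
        c = count(candidate)
        if c >= min_count:
            frequent[candidate] = c

# Generate rules antecedent -> consequent from each frequent itemset.
# Every subset of a frequent itemset is itself frequent, so the lookup is safe.
rules = []
for itemset, c in frequent.items():
    for k in range(1, len(itemset)):
        for antecedent in map(frozenset, combinations(itemset, k)):
            confidence = c / frequent[antecedent]
            if confidence >= min_confidence:
                rules.append((set(antecedent), set(itemset - antecedent), confidence))

for itemset, c in frequent.items():
    print(sorted(itemset), c)
for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(confidence, 2))
```

Working with raw counts rather than fractional supports keeps the threshold comparisons exact at the boundary values of 50% and 60%.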
Mining Spatial, Multimedia, Text and Web Data
1.
Discuss the concept of text mining with its practical implications. [5]