Tribhuvan University

Institute of Science and Technology

2080

Bachelor Level / Fourth Year / Seventh Semester / Science

B.Sc in Computer Science and Information Technology (CSC420)

(Data Warehousing and Data Mining)

Full Marks: 60

Pass Marks: 24

Time: 3 Hours

Candidates are required to give their answers in their own words as far as practicable.

The figures in the margin indicate full marks.

Section A

Long Answer Questions

Attempt any TWO questions.
[2 × 10 = 20]
1.
State the Apriori property. Find the frequent itemsets and association rules from the transaction database given below using the Apriori algorithm. Assume minimum support is 50% and minimum confidence is 75%.

$\begin{array}{c|c} \text{Transaction ID} & \text{Items Purchased} \\ \hline 1 & \text{Bread, Cheese, Egg, Juice} \\ 2 & \text{Bread, Cheese, Juice} \\ 3 & \text{Bread, Milk, Yogurt} \\ 4 & \text{Bread, Juice, Milk} \\ 5 & \text{Cheese, Juice, Milk} \\ \end{array}$
[10]
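As a cross-check for a worked answer to Question 1, the following is a minimal brute-force sketch of the Apriori idea in Python on the five transactions above (minimum support 50%, i.e. a count of at least 3 out of 5 transactions; minimum confidence 75%). It illustrates the level-wise search and rule generation only; it is not an optimized implementation and omits the explicit subset-prune step, relying on the support test instead.

```python
from itertools import combinations

transactions = [
    {"Bread", "Cheese", "Egg", "Juice"},
    {"Bread", "Cheese", "Juice"},
    {"Bread", "Milk", "Yogurt"},
    {"Bread", "Juice", "Milk"},
    {"Cheese", "Juice", "Milk"},
]
min_support, min_confidence = 0.5, 0.75

def support(itemset):
    # Fraction of transactions that contain every item in `itemset`.
    return sum(itemset <= t for t in transactions) / len(transactions)

# Level-wise search: keep only candidates meeting minimum support,
# then join the survivors to form the next level's candidates.
items = sorted(set().union(*transactions))
frequent = {}
level, k = [frozenset([i]) for i in items], 1
while level:
    survivors = [c for c in level if support(c) >= min_support]
    frequent.update({c: support(c) for c in survivors})
    k += 1
    level = list({a | b for a in survivors for b in survivors if len(a | b) == k})

# Rule generation: X -> Y is kept if confidence = support(X u Y) / support(X)
# meets the minimum confidence threshold.
for itemset in frequent:
    if len(itemset) < 2:
        continue
    for r in range(1, len(itemset)):
        for lhs in map(frozenset, combinations(itemset, r)):
            conf = frequent[itemset] / support(lhs)
            if conf >= min_confidence:
                print(f"{set(lhs)} -> {set(itemset - lhs)}: "
                      f"support={frequent[itemset]:.0%}, confidence={conf:.0%}")
```

With these thresholds the script should report the frequent 2-itemsets {Bread, Juice} and {Cheese, Juice}, and rules such as Bread → Juice, Juice → Bread, Juice → Cheese (75% confidence each) and Cheese → Juice (100%).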
2.
How does classification differ from regression? Train an ID3 classifier using the dataset given below, then predict the class label for the instance [Age=Mid, Competition=Yes, Type=HW].

$\begin{array}{c|c|c|c} \text{Age} & \text{Competition} & \text{Type} & \text{Profit (Class label)} \\ \hline \text{old} & \text{Yes} & \text{SW} & \text{Down} \\ \text{old} & \text{No} & \text{SW} & \text{Down} \\ \text{old} & \text{No} & \text{HW} & \text{Down} \\ \text{mid} & \text{Yes} & \text{SW} & \text{Down} \\ \text{mid} & \text{No} & \text{HW} & \text{Up} \\ \text{mid} & \text{Yes} & \text{HW} & \text{Up} \\ \text{new} & \text{Yes} & \text{SW} & \text{Up} \\ \text{new} & \text{No} & \text{HW} & \text{Up} \\ \text{new} & \text{No} & \text{SW} & \text{Up} \\ \end{array}$
[10]
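For checking the tree built in Question 2, here is a compact ID3 sketch in Python: it selects splits by information gain on the nine training rows above and then classifies the query instance. Tie-breaking and unseen attribute values are not handled; this is an illustration, not a full implementation.

```python
from collections import Counter
from math import log2

ATTRS = ["Age", "Competition", "Type"]
data = [
    (("old", "Yes", "SW"), "Down"), (("old", "No", "SW"), "Down"),
    (("old", "No", "HW"), "Down"), (("mid", "Yes", "SW"), "Down"),
    (("mid", "No", "HW"), "Up"),   (("mid", "Yes", "HW"), "Up"),
    (("new", "Yes", "SW"), "Up"),  (("new", "No", "HW"), "Up"),
    (("new", "No", "SW"), "Up"),
]

def entropy(rows):
    counts = Counter(label for _, label in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())

def info_gain(rows, a):
    # Entropy reduction obtained by splitting `rows` on attribute index `a`.
    parts = {}
    for x, y in rows:
        parts.setdefault(x[a], []).append((x, y))
    return entropy(rows) - sum(len(p) / len(rows) * entropy(p) for p in parts.values())

def id3(rows, remaining):
    labels = {y for _, y in rows}
    if len(labels) == 1 or not remaining:            # pure node, or attributes exhausted
        return Counter(y for _, y in rows).most_common(1)[0][0]
    best = max(remaining, key=lambda a: info_gain(rows, a))
    branches = {}
    for x, y in rows:
        branches.setdefault(x[best], []).append((x, y))
    return (best, {v: id3(sub, [a for a in remaining if a != best])
                   for v, sub in branches.items()})

def predict(tree, x):
    while isinstance(tree, tuple):                   # descend until a leaf label
        attr, branches = tree
        tree = branches[x[attr]]
    return tree

tree = id3(data, [0, 1, 2])
print("root split on:", ATTRS[tree[0]])                       # expected: Age
print("prediction:", predict(tree, ("mid", "Yes", "HW")))     # expected: Up
```

On this data the root split should be Age (highest information gain), the mid branch then splits on Type, and the query [Age=Mid, Competition=Yes, Type=HW] should come out as Up.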
3.
Why is the concept of a data mart important? Discuss different data warehouse schemas with examples. [10]
Section B

Short Answer Questions

Attempt any EIGHT questions.
[8 × 5 = 40]
4.
How does KDD differ from data mining? Explain the various stages of KDD with a suitable block diagram. [5]
5.
How many cuboids are possible from 5-dimensional data? Discuss the concepts of a full cube and an iceberg cube. [5]
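For reference on the first part of Question 5, assuming no concept hierarchies on the dimensions, an $n$-dimensional data cube generates $2^{n}$ cuboids, so for 5 dimensions $2^{5} = 32$ cuboids (the base cuboid plus all aggregations up to the apex). More generally, with concept hierarchies the count is $\prod_{i=1}^{n}(L_i + 1)$, where $L_i$ is the number of levels of dimension $i$ excluding the virtual top level.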
6.
How does K-medoids clustering differ from K-means clustering? Divide the following data points into two clusters using the K-medoids algorithm. Show the computation for up to 3 iterations. {(70,85), (65,80), (72,88), (75,90), (60,50), (64,55), (62,52), (63,58)}. [5]
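As a rough companion to Question 6, the sketch below runs a simple alternating variant of K-medoids (assign each point to its nearest medoid, then re-pick each cluster's medoid) for 3 iterations on the eight points. The Manhattan distance and the starting medoids are assumptions, since the question does not fix them, and this is not the full PAM swap procedure.

```python
points = [(70, 85), (65, 80), (72, 88), (75, 90),
          (60, 50), (64, 55), (62, 52), (63, 58)]

def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

medoids = [points[0], points[4]]        # assumed starting medoids
for it in range(3):
    # Assignment step: each point joins the cluster of its nearest medoid.
    clusters = {m: [] for m in medoids}
    for p in points:
        nearest = min(medoids, key=lambda m: manhattan(p, m))
        clusters[nearest].append(p)
    # Update step: the new medoid is the member minimizing total in-cluster distance.
    medoids = [min(c, key=lambda cand: sum(manhattan(cand, p) for p in c))
               for c in clusters.values()]
    print(f"iteration {it + 1}: medoids={medoids}, clusters={list(clusters.values())}")
```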
7.
Discuss the working of the DBSCAN algorithm. [5]
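A small illustration to accompany Question 7: DBSCAN grows clusters from core points (points with at least min_samples neighbours within radius eps) by density-reachability and labels sparse points as noise (label −1). The toy points and the eps/min_samples values below are assumed for demonstration; they are not part of the question.

```python
import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([
    [1.0, 1.1], [1.2, 0.9], [0.9, 1.0], [1.1, 1.2],   # dense region A
    [8.0, 8.1], [8.2, 7.9], [7.9, 8.0], [8.1, 8.2],   # dense region B
    [4.5, 9.5],                                        # isolated point
])

labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X)
print(labels)   # expected: two cluster ids, and -1 for the isolated (noise) point
```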
8.
Which algorithm is used for training a multi-layer perceptron? Discuss the algorithm in detail. [5]
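As a numerical companion to Question 8, here is a compact one-hidden-layer perceptron trained with backpropagation (gradient descent on squared error) to learn XOR. The layer sizes, learning rate, epoch count, and random seed are illustrative choices, not prescribed values.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)    # input -> hidden weights
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)    # hidden -> output weights
lr = 0.5

for epoch in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the error derivative layer by layer.
    d_out = (out - y) * out * (1 - out)            # dE/d(net) at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)             # dE/d(net) at the hidden layer
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(np.round(out, 2))   # expected to approach [[0], [1], [1], [0]] after training
```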
9.
Explain the OLAP operations with examples. [5]
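As a loose companion to Question 9, the snippet below imitates roll-up, slice, dice, and pivot on a tiny sales table with pandas. The column names and figures are assumed for illustration, and a flat DataFrame is only an analogy for a true OLAP cube; drill-down is simply the reverse of the roll-up shown.

```python
import pandas as pd

sales = pd.DataFrame({
    "year":    [2023, 2023, 2023, 2024, 2024, 2024],
    "quarter": ["Q1", "Q2", "Q3", "Q1", "Q2", "Q3"],
    "region":  ["East", "East", "West", "East", "West", "West"],
    "amount":  [100, 120, 90, 110, 95, 105],
})

# Roll-up: aggregate from the quarter level up to the year level.
print(sales.groupby("year")["amount"].sum())

# Slice: fix a single dimension (region = "East").
print(sales[sales["region"] == "East"])

# Dice: select a sub-cube on two dimensions.
print(sales[(sales["region"] == "West") & (sales["quarter"].isin(["Q2", "Q3"]))])

# Pivot: rotate the view so quarters become columns.
print(sales.pivot_table(index="region", columns="quarter",
                        values="amount", aggfunc="sum"))
```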
10.
Discuss the concept of multimedia data mining along with the concept of similarity search. [5]
11.
Write short notes on the following: [5]
a. Support Vector Machine
b. Multi-dimensional Data Model
12.
Discuss different ways of smoothing noisy data along with suitable examples. [5]
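As a small worked illustration for Question 12, the sketch below shows smoothing by bin means and by bin boundaries using equal-frequency (equi-depth) bins; smoothing by regression or by outlier analysis would be discussed separately. The data values and bin count are assumed for illustration.

```python
data = sorted([4, 8, 15, 21, 21, 24, 25, 28, 34])
n_bins = 3
size = len(data) // n_bins
bins = [data[i * size:(i + 1) * size] for i in range(n_bins)]

# Smoothing by bin means: every value is replaced by its bin's mean.
by_means = [[round(sum(b) / len(b), 1)] * len(b) for b in bins]

# Smoothing by bin boundaries: every value moves to the nearer bin edge.
by_bounds = [[min(b) if v - min(b) <= max(b) - v else max(b) for v in b]
             for b in bins]

print("bins:          ", bins)
print("by means:      ", by_means)
print("by boundaries: ", by_bounds)
```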