Bachelors In Information Technology

Institute of Science and Technology, TU

Nature of the course: (Theory+Lab)

F.M: 60+20+20 P.M: 24+8+8

Credit Hrs: 3

Data Warehousing and Data Mining [BIT454]
Course Objective
The main objective of this course is to provide knowledge of different data mining techniques and of data warehousing.
Course Description

This course introduces advanced aspects of data warehousing and data mining, encompassing the principles, research results, and commercial applications of current technologies.

S1:Introduction to Data Warehousing[5]
Data Warehouse and Data Warehousing, Differences between Operational Database and Data Warehouse, MOLAP, OLAP Operations, Conceptual Modeling of Data Warehouse, Components of Data Warehouse
S2:Introduction to Data Mining[2]
Motivation for Data Mining, Introduction to Data Mining System, Data Mining Functionalities, KDD, Data Mining Goals
S3:Data Preprocessing[3]
Data Types and Attributes, Various Similarity Measures, Data Cleaning, Data Integration and Transformation, Data Reduction, Data Discretization and Concept Hierarchy Generation
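As an illustration of the kind of lab exercise this unit lends itself to, the following is a minimal sketch of min-max normalization (a data transformation) and two common similarity measures. The function names and sample values are assumptions chosen for this example, not part of the syllabus.

```python
import math


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale numeric values linearly into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so map everything to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [(v - lo) * scale + new_min for v in values]


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical sample values, invented for illustration.
normalized = min_max_normalize([10, 20, 30])
orthogonal = cosine_similarity([1, 0], [0, 1])
overlap = jaccard_similarity({1, 2, 3}, {2, 3, 4})
```

Cosine similarity suits numeric vectors such as term frequencies, while Jaccard similarity suits set-valued or asymmetric binary attributes.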
S5:Data Cube Technology[4]
Cube Materialization (Introduction to Full Cube, Iceberg Cube, Closed Cube, Shell Cube), General Strategies for Cube Computation, Attribute Oriented Analysis (Attribute Generalization, Attribute Relevance, Class Comparison)
S7:Mining Frequent Patterns[6]
Frequent Patterns, Market Basket Analysis, Frequent Itemsets, Generating Itemsets and Association Rules, Finding Frequent Itemset (Apriori Algorithm, FP Growth), Generating Association Rules from Frequent Itemset, Limitation and Improving Apriori, Association Mining to Correlation Analysis, Constraint-Based Association Mining
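The frequent-itemset step of the Apriori algorithm listed in this unit can be sketched as follows. This is a minimal, unoptimized version assuming support is given as an absolute transaction count; the function name `apriori` and the basket data are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count the support of every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count support with one pass over the transactions per candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical market-basket data, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
]
frequent_itemsets = apriori(baskets, min_support=2)
```

With these four baskets and a minimum support of 2, every single item and every pair is frequent, while the three-item set appears in only one basket and is discarded.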
S8:Classification and Prediction[10]
Definition (Classification, Prediction), Learning and Testing of Classification, Classification by Decision Tree Induction, ID3 and Gini Index as Attribute Selection Algorithm, Bayesian Classification, Laplace Smoothing, Classification by Backpropagation, Rule-Based Classifier (Decision Tree to Rules, Rule Coverage and Accuracy, Efficiency of Rule Simplification), Support Vector Machine, Associative Classification, Lazy Learners, Accuracy and Error Measures, Ensemble Methods, Issues in Classification
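To make the Gini index topic concrete, here is a minimal sketch of how it can rank attributes when inducing a decision tree. The helper names `gini` and `gini_split` and the toy weather-style rows are assumptions made for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_i ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_split(rows, attribute, target):
    """Weighted Gini impurity after partitioning rows on one attribute."""
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    n = len(rows)
    return sum(len(part) / n * gini(part) for part in partitions.values())


# Hypothetical toy data set, invented for illustration.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]

# The attribute with the lowest weighted impurity is the preferred split.
best_attribute = min(["outlook", "windy"], key=lambda a: gini_split(rows, a, "play"))
```

In this toy data `outlook` separates the classes perfectly (weighted impurity 0), whereas `windy` leaves each partition evenly mixed (weighted impurity 0.5), so `outlook` is selected.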
S9:Cluster Analysis[8]
Types of Data in Cluster Analysis, Similarity and Dissimilarity between Objects, Clustering Techniques (Partitioning Methods, Hierarchical Methods, Density-Based Methods, Grid-Based Methods, Model-Based Clustering Methods), Clustering High-Dimensional Data, Constraint-Based Cluster Analysis, Outlier Analysis
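As one example of a partitioning method, the sketch below implements a basic k-means loop. It assumes the initial centroids are supplied by the caller (so the run is deterministic) and uses squared Euclidean distance; the function name and sample points are hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Basic k-means: returns (final centroids, clusters of points)."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)

        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(old)

        if new_centroids == centroids:
            break  # Converged: assignments can no longer change.
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D points and starting centroids, invented for illustration.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
final_centroids, final_clusters = kmeans(points, [(1.0, 1.0), (9.0, 8.0)])
```

In practice the initial centroids are usually chosen at random or with a seeding heuristic, and the result can depend on that choice.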
S10:Graph Mining and Social Network Analysis[5]
Graph Mining, Why Graph Mining, Graph Mining Algorithm (Beam Search), Mining Frequent Subgraphs, Apriori Graph, Pattern Growth Graph, Graph Indexing, Social Network Analysis, Characteristics of Social Network (Densification Power Law, Shrinking Diameter, Heavy-Tailed Out-Degree and In-Degree Distributions), Link Mining (Tasks Involved in Link Mining, Challenges Faced by Link Mining), Friends of Friends, Viral Marketing, Community Mining, Theory of Balance, Theory of Status, Conflict Between the Theory of Balance and Status, Predicting Positive and Negative Links
S11:Mining Spatial, Multimedia, Text and Web Data[2]
Spatial Data Mining, Mining Spatial Association, Multimedia Data Mining, An Introduction to Text Mining, Natural Language Processing and Information Extraction, Web Mining (Web Content Mining, Web Structure Mining, Web Usage Mining)
References
1. Jiawei Han, Micheline Kamber, and Jian Pei, Data Mining: Concepts and Techniques, 3rd ed., Morgan Kaufmann Series in Data Management Systems, Morgan Kaufmann Publishers, July 2011.
2. Pang-Ning Tan, Michael Steinbach, Anuj Karpatne, and Vipin Kumar, Introduction to Data Mining, 2nd ed., Pearson, 2019.
3. Jure Leskovec, Anand Rajaraman, and Jeffrey D. Ullman, Mining of Massive Datasets, 2014.
Laboratory Work
The laboratory work should cover all the topics mentioned in the course. It should include data preprocessing and cleaning; implementation of classification, clustering, and association algorithms in any programming language; and data visualization using data mining tools.