School of Computing Science and Engineering · Module 1:Jaspersoft (reporting and analytics...

Page 1: School of Computing Science and Engineering · Module 1:Jaspersoft (reporting and analytics server), Pentaho (data integration and business analytics), Splunk (platform for IT analytics),

School of Computing Science and Engineering Department of Computer Science and Engineering

Document Reference Revision No. / Date Prepared By Approved By SUN/SOCSE/COMP/PG/ACDS/31/08/18 R1 / 31 August 2018

M. Tech Advance Computing and Data Science Semester – I

CIA: Continuous Internal Assessment
L: Theory Lecture
T: Tutorial
P: Practical
TH: Theory Exam
#: Internship for 15 days
*: Oral Examination
UC: University Core
PC: Programme Core
PE: Programme Elective

CIA | Weightage | Description
CIA 1 | 10% | Home Assignment
CIA 2 | 20% | Mid-Term Exam (MTE)
CIA 3 | 10% | Seminar Presentation
CIA 4 | 10% | Research Based Activity
TOTAL | 50% |

Program Elective I:
1. Graph Theory and Applications (PADE01)
2. Data Mining (PADE02)

Teaching Scheme (Hrs./Week): L, T, P, C. Formative Assessment: CIA (Course, Lab). Summative Assessment: ESE (Course, Lab).

Sr. No. | Core | Course Code | Course Name | L | T | P | C | CIA Course | CIA Lab | ESE Course | ESE Lab | Total Marks
1 | PC | PAD101 | Statistics and Mathematical Analysis | 2 | 0 | 0 | 2 | 50 | -- | 50 | -- | 100
2 | PC | PAD102 | Advanced Databases | 3 | 0 | 0 | 3 | 50 | -- | 50 | -- | 100
3 | PC | PAD103 | Big Data Technologies | 3 | 0 | 0 | 3 | 50 | -- | 50 | -- | 100
4 | PE | PAD104 | Linux and Python for Machine Learning | 3 | 0 | 0 | 3 | 50 | -- | 50 | -- | 100
5 | PE | PADE__ | Program Elective I | 3 | 0 | 0 | 3 | 50 | -- | 50 | -- | 100
6 | PC | PAD111 | Big Data Technologies Lab | 0 | 0 | 4 | 2 | -- | 25 | -- | 25 | 50
7 | PC | PAD112 | R Programming Lab | 0 | 0 | 4 | 2 | -- | 25 | -- | 25 | 50
8 | PC | PAD113 | Advanced Databases Lab | 0 | 0 | 4 | 2 | -- | 25 | -- | 25 | 50
TOTAL | | | | 14 | 00 | 12 | 20 | 250 | 75 | 250 | 75 | 650

Year: First Year, Semester – I
Course: Statistics and Mathematical Analysis
Course Code: PAD101

Teaching Scheme (Hrs./Week): L 2, T 0, P 0, C 2
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Probability & Statistics, Statistical Algorithm, Mathematical Model

Objectives: Students will be able to:
1 Develop a deep understanding and working knowledge of one chosen area of mathematics, which can come from one of the areas of faculty expertise, such as analysis, algebra, topology, or statistics.
2 Successfully relate theoretical concepts to a real-world problem in a written report.
3 Demonstrate the ability to find research literature appropriate to the investigative task.

Unit No | Details | Hours
1 | Module 1: Introduction to statistics, descriptive statistics, summary statistics, basic probability theory, statistical concepts (univariate and bivariate sampling, distributions, resampling, statistical inference, prediction error) | 6
1 | Module 2: Probability distributions (continuous and discrete; Normal, Bernoulli, Binomial, Negative Binomial, Geometric and Poisson distributions) | 5
2 | Module 1: Eigenvalues, matrix and vector properties, matrix algebra, linear transformation, orthogonal transformation, mining the web: PageRank, vector spaces, subspaces, basis, dimension | 6
2 | Module 2: Linear transformations and their representation by matrices. Matrices: review of matrix algebra; rank of a matrix | 5
3 | Module 1: Eigenvalues and eigenvectors; diagonalisation; systems of linear equations; quadratic surfaces | 4
3 | Module 2: Inner product spaces, orthonormal sets | 4
4 | Module 1: Gram-Schmidt orthogonalisation process and its applications to the method of least squares | 4
4 | Module 2: QR algorithm | 2
5 | Module 1: Introduction to optimization problems | 3
5 | Module 2: Nature of solutions and algorithms. Case studies on various mathematical models | 3
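Unit 4 of this course lists the Gram-Schmidt orthogonalisation process and its application to the method of least squares. The following is a minimal sketch of that connection in pure Python, not course material: it uses classical Gram-Schmidt to factor a matrix as A = QR and then solves R x = Qᵀb by back substitution. The function names (`gram_schmidt_qr`, `least_squares`) and the sample data are assumptions made for illustration, and the columns are assumed to be linearly independent.

```python
# Minimal sketch: classical Gram-Schmidt QR and least squares in pure Python.
# Function names and sample data are illustrative, not from the syllabus.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt_qr(columns):
    """Orthonormalise linearly independent column vectors.

    Returns (q_cols, r): orthonormal columns of Q and the upper-triangular
    matrix R such that A = Q R.
    """
    n = len(columns)
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(columns):
        v = list(a)
        for i, q in enumerate(q_cols):
            r[i][j] = dot(q, a)          # projection coefficient on q_i
            v = [vk - r[i][j] * qk for vk, qk in zip(v, q)]
        r[j][j] = dot(v, v) ** 0.5       # length of the remaining component
        q_cols.append([vk / r[j][j] for vk in v])
    return q_cols, r

def least_squares(columns, b):
    """Solve min ||A x - b|| via R x = Q^T b using back substitution."""
    q_cols, r = gram_schmidt_qr(columns)
    n = len(columns)
    qtb = [dot(q, b) for q in q_cols]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(r[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (qtb[i] - s) / r[i][i]
    return x

# Fit y = c0 + c1 * t to the points (0, 1), (1, 3), (2, 5).
ts = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]
coeffs = least_squares([[1.0] * 3, ts], ys)
print(coeffs)  # approximately [1.0, 2.0]
```

Classical Gram-Schmidt is used here because it mirrors the textbook derivation most directly; for ill-conditioned problems the modified variant or Householder reflections are usually preferred.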

Outcomes: At the end of the course, students will be able to:
1 Provide a concise and clear description of a statistical problem.
2 Provide a description of the method used for analysis, including a discussion of advantages, disadvantages, and necessary assumptions.
3 Provide a discussion of the results and of the statistical analysis.
4 Provide a conclusion to the study, including a discussion of limitations of the analysis.
5 Provide a derivation for mathematical statistics problems.

Text Books
1. Statistics and Analysis of Scientific Data, Massimiliano Bonamente.
2. Numerical Linear Algebra, Lloyd N. Trefethen and David Bau, III, SIAM, Philadelphia, ISBN 0898713617.

Reference Book
1. Applied Numerical Linear Algebra, J. Demmel, SIAM, 1997.
2. Matrix Methods in Data Mining and Pattern Recognition, Lars Eldén, SIAM, 2007.

Year: First Year, Semester – I
Course: Advanced Databases
Course Code: PAD102

Teaching Scheme (Hrs./Week): L 3, T 0, P 0, C 3
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Understanding of database schemas and the need for normalization. Knowledge of designing a database schema with appropriate data types for storing data. Use of different types of database analysis and visualization tools.

Objectives: Students will be able to:
1 Evaluate emerging architectures for database management systems.
2 Develop an understanding of the manner in which relational systems are implemented and the implications of the techniques of implementation for database performance.
3 Assess the impact of emerging database standards on the facilities which future database management systems will provide.

Unit No | Details | Hours
1 | Module 1: Database concepts (file system and DBMS), database storage structures (tablespace, control files, data files) | 5
1 | Module 2: Structured and unstructured data, SQL commands (DDL, DML and DCL), data warehousing concepts and tools (ETL tools) | 5
2 | Module 1: Introduction to modern databases, NoSQL, NewSQL, NoSQL vs RDBMS databases, advantages and tradeoffs, NoSQL data models, XML, working with MongoDB | 4
2 | Module 2: Tools, OLTP and OLAP, data preparation and cleaning techniques | 5
3 | Module 1: Advanced query optimization: Volcano/Cascades framework for query optimization; multi-query optimization, materialized views and view maintenance | 4
3 | Module 2: Index and view selection, database tuning. Adaptive query processing and optimization | 4
4 | Module 1: Query processing on RDF data. Transaction and query processing on main-memory and columnar databases | 4
4 | Module 2: Data streams and stream management systems. Information retrieval and databases | 4
5 | Module 1: Handling uncertain and precise data. Security and privacy | 4
5 | Module 2: Crowd-sourced databases, applications of declarative querying outside of database applications | 3
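Unit 1 of this course groups SQL commands into DDL, DML and DCL. As a small illustration of the first two groups, the sketch below uses Python's standard `sqlite3` module with an in-memory database. The table, column names, and rows are invented for the example, and SQLite is an assumption chosen only because it needs no server; the syllabus does not prescribe a particular engine for this topic.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define the schema (table and column names are made up for this sketch).
cur.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL, credits INTEGER)"
)

# DML: insert, update, and query rows.
cur.executemany(
    "INSERT INTO student (name, credits) VALUES (?, ?)",
    [("Asha", 20), ("Ravi", 17)],
)
cur.execute("UPDATE student SET credits = credits + 3 WHERE name = ?", ("Ravi",))
conn.commit()

rows = cur.execute("SELECT name, credits FROM student ORDER BY name").fetchall()
print(rows)  # [('Asha', 20), ('Ravi', 20)]

# DCL (GRANT / REVOKE) is not shown: SQLite has no user accounts, so those
# statements apply to server databases rather than to this sketch.
conn.close()
```

The `?` placeholders pass values separately from the SQL text, which avoids building statements by string concatenation.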

Outcomes: At the end of the course, students will be able to:
1 Critically assess new developments in database technology.
2 Interpret and explain the impact of emerging database standards.
3 Evaluate the contribution of database theory to practical implementations of database management systems.

Text Books
1. Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Database System Concepts, 6th Ed., McGraw Hill, 2010.
2. J. Gray and A. Reuter, Transaction Processing: Concepts and Techniques, Morgan Kaufmann, 1994.

Reference Book
1. MongoDB in Action, Kyle Banker.
2. MongoDB: The Definitive Guide, Kristina Chodorow.
3. MongoDB Aggregation Framework Principles and Examples, John Lynn.
4. Getting Started with NoSQL, Gaurav Vaish.
5. Database System Concepts, Henry Korth, S. Sudarshan and Abraham Silberschatz.

Year: First Year, Semester – I
Course: Big Data Technologies
Course Code: PAD103

Teaching Scheme (Hrs./Week): L 3, T 0, P 0, C 3
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Knowledge of data warehouse architecture and the dimensional model for data warehousing. Intermediate-level knowledge of extract, transform and load (ETL) strategies. Working knowledge of Python, Java and SQL.

Objectives: Students will be able to:
1 Develop new products and services. The era of big data has created substantial opportunities for developing products aligned with consumer demands.

Unit No | Details | Hours
1 | Module 1: Big data definition, enterprise / structured data, social / unstructured data, unstructured data needs for analytics, what is Big Data, the big deal about Big Data, Big Data sources, industries using Big Data, Big Data challenges | 5
1 | Module 2: Introduction to Big Data programming, Hadoop, history of Hadoop, the ecosystem and stack, the Hadoop Distributed File System (HDFS), components of Hadoop, design of HDFS, Java interfaces to HDFS, architecture overview | 5
2 | Module 1: MapReduce: anatomy of a MapReduce job run, failures, job scheduling, shuffle and sort, task execution, MapReduce types and formats, MapReduce features, real-world MapReduce | 5
2 | Module 2: Hadoop ETL development, ETL process in Hadoop, discussion of ETL functions, data extraction, need for ETL tools, advantages of ETL tools | 4
3 | Module 1: Jaspersoft (reporting and analytics server), Pentaho (data integration and business analytics), Splunk (platform for IT analytics), Talend (big data integration, data management and application integration) | 4
3 | Module 2: Introduction to Pig and Hive. Programming Pig: engine for executing data flows in parallel on Hadoop. Programming with Hive: data warehouse system for Hadoop. Optimizing with combiners and partitioners (lab). More common algorithms: sorting, indexing and searching (lab) | 5
4 | Module 1: Setting up a Hadoop cluster, cluster specification, cluster setup and installation, Hadoop configuration, security in Hadoop, administering Hadoop, HDFS monitoring and maintenance, Hadoop benchmarks, Hadoop in the cloud | 4
4 | Module 2: Overview, linking with Spark, initializing Spark, Resilient Distributed Datasets (RDDs), external datasets, RDD operations, passing functions to Spark, working with key-value pairs, shuffle operations, RDD persistence, removing data, shared variables, deploying to a cluster | 6
5 | Module 1: Apache Phoenix overview, need for Phoenix, features, installation and configuration, views and multi-tenancy, secondary indexes, joins | 2
5 | Module 2: Query optimizations, roadmap of Phoenix | 2
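Unit 2 of this course covers the anatomy of a MapReduce job, including the shuffle and sort step between the map and reduce phases. The sketch below imitates those three phases for a word count in plain Python, in a single process. It is an illustration of the data flow only, not Hadoop code: the function names and input lines are assumptions, and a real job would distribute the phases across a cluster.

```python
from itertools import groupby
from operator import itemgetter

def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_and_sort(pairs):
    """Sort intermediate pairs by key and group the values for each key."""
    ordered = sorted(pairs, key=itemgetter(0))
    return [
        (key, [value for _, value in group])
        for key, group in groupby(ordered, key=itemgetter(0))
    ]

def reduce_phase(key, values):
    """Reducer: sum the counts collected for one key."""
    return key, sum(values)

lines = ["big data big deal", "data sources"]
intermediate = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, vs) for k, vs in shuffle_and_sort(intermediate))
print(counts)  # {'big': 2, 'data': 2, 'deal': 1, 'sources': 1}
```

Keeping the mapper and reducer as pure functions of their inputs is what lets a framework rerun them after a task failure.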

Outcomes: At the end of the course, students will be able to:
1 Understand the concept and challenge of big data and why existing technology is inadequate to analyze it.
2 Collect, manage, store, query, and analyze various forms of big data.
3 Gain hands-on experience with large-scale analytics tools to solve some open big data problems.
4 Demonstrate the ability to think critically in making decisions based on data and deep analytics.
5 Demonstrate the ability to use technical skills in predictive and prescriptive modeling to support business decision-making.

Text Books
1. Hadoop: The Definitive Guide, 3rd Edition, Tom White, O'Reilly.
2. Big Data Analytics: From Strategic Planning to Enterprise Integration with Tools, Techniques, NoSQL, and Graph, David Loshin, Morgan Kaufmann Publishers, ISBN: 9780124173194.

Reference Book
1. Hadoop Real-World Solutions Cookbook, Jonathan R. Owens, Jon Lentz, Brian Femiano, Packt Publishing.
2. Hadoop in Action, Chuck Lam, Manning Publications.

Year: First Year, Semester – I
Course: Linux and Python for Machine Learning
Course Code: PAD104

Teaching Scheme (Hrs./Week): L 3, T 0, P 0, C 3
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Operating system concepts, basic ideas of version control systems, object-oriented concepts.

Objectives: Students will be able to:
1 Understand the basic set of commands and utilities in Linux/UNIX systems.
2 Understand the basic set of functions and modules in Python.
3 Understand the inner workings of UNIX-like operating systems.

Unit No | Details | Hours
1 | Module 1: Linux history and operation, installing and configuring Linux, shells, commands | 5
1 | Module 2: Navigation, common text editors, administering Linux, introduction to users and groups, Linux shell scripting | 5
2 | Module 1: Version control systems, git, Git Bash, GitHub | 5
2 | Module 2: Introduction to Python, basic syntax, data types, variables, operators, input/output | 4
3 | Module 1: Flow of control (modules, branching), if, if-else, nested if-else, looping, for, while, nested loops, control structures, break, continue, pass, strings and tuples | 4
3 | Module 2: Accessing strings, basic operations, string slices, working with lists, introduction, accessing lists, operations, functions and methods, files, modules | 5
4 | Module 1: Dictionaries, functions and functional programming, declaring and calling functions, declaring, assigning and retrieving values from lists, introducing tuples, accessing tuples | 4
4 | Module 2: Advanced Python: object-oriented programming, OOP concepts, class and object, attributes, inheritance | 6
5 | Module 1: MDO in Python, overloading, overriding, data hiding, Operations Exception, exception handling, except clause, try-finally clause, user-defined exceptions, Python libraries | 2
5 | Module 2: Introduction to machine learning packages like NumPy, SciPy, pandas, Matplotlib | 2
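Unit 5 of this course lists exception handling, the except clause, the try-finally clause, and user-defined exceptions. A short sketch of how these pieces fit together follows. The exception class `InvalidMarksError`, the helper `record_marks`, and the marks range are hypothetical names and values chosen for the example.

```python
class InvalidMarksError(Exception):
    """User-defined exception (hypothetical name) for out-of-range marks."""

    def __init__(self, marks, maximum):
        super().__init__(f"marks {marks} outside 0..{maximum}")
        self.marks = marks
        self.maximum = maximum

def record_marks(marks, maximum=50):
    """Return marks unchanged, or raise InvalidMarksError when out of range."""
    if not 0 <= marks <= maximum:
        raise InvalidMarksError(marks, maximum)
    return marks

log = []
for value in (42, 65):
    try:
        log.append(("ok", record_marks(value)))
    except InvalidMarksError as err:      # except clause: handle the failure
        log.append(("error", str(err)))
    finally:                              # finally clause: runs in both cases
        log.append(("checked", value))

print(log)
```

Deriving from `Exception` lets callers catch this specific error without also swallowing unrelated ones.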

Outcomes: At the end of the course, students will be able to:
1 Navigate the Unix command-line environment with confidence and find online solutions for handling basic analytics problems.
2 Understand why Python is a useful scripting language for developers, and design and program Python applications.
3 Design object-oriented programs with Python classes, use class inheritance for reusability, and use exception handling for error handling in Python applications.

Text Books
1. Linux Bible, Christopher Negus.

Reference Book
1. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, Wes McKinney.

Year: First Year, Semester – I
Course: Graph Theory and Applications
Course Code: PADE01

Teaching Scheme (Hrs./Week): L 3, T 0, P 0, C 3
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Knowledge of the techniques of proofs and analysis. Familiarity with the most fundamental graph theory topics.

Objectives: Students will be able to:
1 Understand the fundamentals of graph theory.
2 Study proof techniques related to various concepts in graphs.

Unit No | Details | Hours
1 | Module 1: Graphs – Introduction – Isomorphism – Subgraphs – Walks, Paths | 5
1 | Module 2: Circuits – Connectedness – Components – Euler graphs – Hamiltonian paths and circuits – Trees – Properties of trees – Distance and centers in a tree – Rooted and binary trees | 5
2: TREES, CONNECTIVITY & PLANARITY | Module 1: Spanning trees – Fundamental circuits – Spanning trees in a weighted graph – Cut sets – Properties of cut sets – All cut sets – Fundamental circuits and cut sets – Connectivity and separability – Network flows | 4
2: TREES, CONNECTIVITY & PLANARITY | Module 2: 1-Isomorphism – 2-Isomorphism – Combinatorial and geometric graphs – Planar graphs – Different representations of a planar graph | 5
3: MATRICES, COLOURING AND DIRECTED GRAPH | Module 1: Chromatic number – Chromatic partitioning – Chromatic polynomial – Matching – Covering – Four color problem | 5
3: MATRICES, COLOURING AND DIRECTED GRAPH | Module 2: Directed graphs – Types of directed graphs – Digraphs and binary relations – Directed paths and connectedness – Euler graphs | 4
4: PERMUTATIONS & COMBINATIONS | Module 1: Fundamental principles of counting – Permutations and combinations – Binomial theorem – Combinations with repetition | 4
4: PERMUTATIONS & COMBINATIONS | Module 2: Combinatorial numbers – Principle of inclusion and exclusion – Derangements – Arrangements with forbidden positions | 4
5: GENERATING FUNCTIONS | Module 1: Generating functions – Partitions of integers – Exponential generating function – Summation operator | 3
5: GENERATING FUNCTIONS | Module 2: Recurrence relations – First order and second order – Non-homogeneous recurrence relations – Method of generating functions | 3
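Unit 1 of this course includes connectedness and Euler graphs. By Euler's theorem, a connected undirected graph has an Euler circuit exactly when every vertex has even degree. The sketch below checks both conditions for a graph given as an edge list. The function name, the edge-list representation, and the choice to return False for an empty graph are assumptions made for this example.

```python
from collections import defaultdict

def is_euler_graph(edges):
    """True if the undirected graph on these edges has an Euler circuit.

    Checks the two conditions of Euler's theorem: every vertex has even
    degree, and all vertices that carry edges lie in one component.
    """
    adjacency = defaultdict(list)
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    if not adjacency:
        return False  # convention chosen for this sketch: no edges, no circuit
    if any(len(neighbours) % 2 for neighbours in adjacency.values()):
        return False
    # Depth-first search to confirm that all vertices with edges are connected.
    start = next(iter(adjacency))
    seen, stack = {start}, [start]
    while stack:
        for nxt in adjacency[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(adjacency)

triangle = [("a", "b"), ("b", "c"), ("c", "a")]
path = [("a", "b"), ("b", "c")]
print(is_euler_graph(triangle), is_euler_graph(path))  # True False
```

The path fails because its two end vertices have odd degree, which is the same parity argument used in the proof of the theorem.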

Outcomes: At the end of the course, students will be able to:
1 Explore modern applications of graph theory.

Text Books
1. Narsingh Deo, "Graph Theory: With Application to Engineering and Computer Science", Prentice Hall of India, 2003.

Reference Book
1. Grimaldi R.P., "Discrete and Combinatorial Mathematics: An Applied Introduction", Addison Wesley, 1994.

Year: First Year, Semester – I
Course: Data Mining
Course Code: PADE02

Teaching Scheme (Hrs./Week): L 3, T 0, P 0, C 3
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Knowledge of database technologies and SQL programming, R programming, ITIL.

Objectives:
1 To introduce students to the basic concepts and techniques of data mining.
2 To develop skills in using recent data mining software for solving practical problems.
3 To understand the functionality of the various data mining and data warehousing components.

Unit No | Details | Hours
1 | Module 1: Introduction to data mining principles, data mining and knowledge discovery, the need for data mining, overview of data warehousing and mining, advantages and challenges, data mining applications in various application areas | 5
1 | Module 2: Data warehousing, data marts and OLAP, data warehouse architectures, data warehouse design, steps in data warehousing (ETL), OLAP vs OLTP, data marts, design and performance considerations | 5
2 | Module 1: Data mining overview, data mining process, understanding different types of data and preprocessing considerations, classification and prediction | 4
2 | Module 2: Naive Bayes classification, backpropagation-based classification, tree-based classification, support vector machines, associative classification, prediction, regression trees | 5
3 | Module 1: Clustering and association rule mining, k-means clustering, EM technique, hierarchical clustering, density-based methods, grid-based methods, model-based methods | 5
3 | Module 2: Cluster analysis and outlier analysis. Association rule mining, mining frequent patterns, mining various associations and rules | 4
4 | Module 1: Correlation analysis, Apriori (market basket analysis), constraint-based association rule mining, data mining applications in text mining, stream mining and fraud detection | 4
4 | Module 2: Working with Weka, exploring different data sets, mining trends, case studies | 4
5 | Module 1: Data quality, data quality assurance | 3
5 | Module 2: Data access, data privacy and ethics, data security | 3
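Unit 4 of this course names Apriori and market basket analysis. The sketch below shows the level-wise idea behind Apriori on a toy set of baskets: frequent k-itemsets are joined into candidates of size k + 1, and a candidate survives only if all of its k-subsets are frequent. It is a teaching sketch under stated assumptions, not the course's reference implementation; the function name, the use of an absolute support count, and the basket contents are invented for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for all frequent itemsets.

    min_support is an absolute count of transactions (an assumption of this
    sketch; many tools use a fraction instead).
    """
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = {frozenset([item]) for item in items}
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: unions of frequent k-itemsets that have size k + 1.
        joined = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k))
        }
        k += 1
    return frequent

baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]
result = apriori(baskets, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

In this toy data the three-item candidate is pruned without being counted, because the pair butter and milk appears in only one basket.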

Outcomes: At the end of the course, students will be able to:
1 Describe and utilise a range of techniques for designing data warehousing and data mining systems for real-world applications.
2 Gain experience of doing independent study and research.

Text Books
1. Data Warehousing and Mining: Concepts, Methodologies, Tools, and Applications, Volume V, John Wang (ed.), IGI Global, ISBN: 9781599049519.
2. Practical Data Mining, Monte F. Hancock, Auerbach Publications, ISBN: 9781439868362.

Reference Book
1. Data Mining and Data Warehousing, Bharat Bhushan Agarwal and Sumit Prakash Tayal, Laxmi Publications, ISBN: 9788131806586.
2. Handbook of Statistical Analysis and Data Mining Applications, Robert Nisbet, John Elder and Gary Miner, Academic Press, ISBN: 9780123747655.

Year: First Year, Semester – I
Course: Big Data Technologies Lab
Course Code: PAD111

Teaching Scheme (Hrs./Week): L 0, T 0, P 4, C 2
Continuous Internal Assessment (CIA): CIA-1 --, CIA-2 --, CIA-3 --, CIA-4 --, Lab 25
End Semester Examination: Theory --, Lab 25
Total: 50
Max. Time, End Semester Exam (Theory): 00 Hrs.
Max. Time, End Semester Exam (Lab): 03 Hrs.

Prerequisites: Knowledge of data warehouse architecture and the dimensional model for data warehousing. Intermediate-level knowledge of extract, transform and load (ETL) strategies. Working knowledge of Python, Java and SQL.

Objectives: Students will be able to:
1 Develop new products and services. The era of big data has created substantial opportunities for developing products aligned with consumer demands.

Guidelines for Assessment
Continuous assessment of laboratory work is based on the student's overall performance and performance in lab assignments. Each lab assignment is graded on a set of parameters with appropriate weightage. Suggested parameters for the overall assessment, as well as for each lab assignment, include timely completion, performance, innovation, efficient code, punctuality, and neatness.

Guidelines for Laboratory Conduction
The instructor is expected to frame the assignments by understanding the prerequisites, technological aspects, utility, and recent trends related to the topic. The assignment framing policy needs to address the average student and include an element that attracts and promotes the more capable students. The instructor may set multiple sets of assignments and distribute them among batches of students. Assignments based on real-world problems or applications are appreciated. Students should be encouraged to use Hungarian notation, indentation, and comments appropriately. Use of open source software is encouraged. In addition, the instructor may assign one real-life application in the form of a mini-project based on the concepts learned. The instructor may also set one assignment or mini-project that is suitable to the respective branch beyond the scope of the syllabus.

Operating system recommended: 64-bit open source Linux or its derivative
Programming tools recommended: open source C programming tool like GCC

Suggested List of Laboratory Assignments
The list will be provided by the subject trainer.

Year: First Year, Semester – I
Course: R Programming Lab
Course Code: PAD112

Teaching Scheme (Hrs./Week): L 0, T 0, P 4, C 2
Continuous Internal Assessment (CIA): CIA-1 --, CIA-2 --, CIA-3 --, CIA-4 --, Lab 25
End Semester Examination: Theory --, Lab 25
Total: 50
Max. Time, End Semester Exam (Theory): 00 Hrs.
Max. Time, End Semester Exam (Lab): 03 Hrs.

Prerequisites: Probability & Statistics, Statistical Algorithm, Mathematical Model

Objectives: Students will be able to:
1 Develop a deep understanding and working knowledge of one chosen area of mathematics, which can come from one of the areas of faculty expertise, such as analysis, algebra, topology, or statistics.
2 Successfully relate theoretical concepts to a real-world problem in a written report.
3 Demonstrate the ability to find research literature appropriate to the investigative task.

Guidelines for Assessment
Continuous assessment of laboratory work is based on the student's overall performance and performance in lab assignments. Each lab assignment is graded on a set of parameters with appropriate weightage. Suggested parameters for the overall assessment, as well as for each lab assignment, include timely completion, performance, innovation, efficient code, punctuality, and neatness.

Guidelines for Laboratory Conduction
The instructor is expected to frame the assignments by understanding the prerequisites, technological aspects, utility, and recent trends related to the topic. The assignment framing policy needs to address the average student and include an element that attracts and promotes the more capable students. The instructor may set multiple sets of assignments and distribute them among batches of students. Assignments based on real-world problems or applications are appreciated. Students should be encouraged to use Hungarian notation, indentation, and comments appropriately. Use of open source software is encouraged. In addition, the instructor may assign one real-life application in the form of a mini-project based on the concepts learned. The instructor may also set one assignment or mini-project that is suitable to the respective branch beyond the scope of the syllabus.

Operating system recommended: 64-bit open source Linux or its derivative
Programming tools recommended: open source C programming tool like GCC

Suggested List of Laboratory Assignments
The list will be provided by the subject trainer.

Year: First Year
Semester: I
Course: Advanced Databases Lab
Course Code: PAD113

Teaching Scheme (Hrs./Week): L 0, T 0, P 4, C 2
Continuous Internal Assessment (CIA): CIA-1 --, CIA-2 --, CIA-3 --, CIA-4 --, Lab 25
End Semester Examination: Theory --, Lab 25
Total: 50
Max. Time, End Semester Exam (Theory): 00 Hrs.
Max. Time, End Semester Exam (Lab): 03 Hrs.

Prerequisites: Understanding of database schemas and the need for normalization. Ability to design a database schema using appropriate data types for the stored data. Experience with different types of database analysis and visualization tools.

Objectives: Students will be able to:
1 Evaluate emerging architectures for database management systems.
2 Develop an understanding of the manner in which relational systems are implemented and the implications of the implementation techniques for database performance.
3 Assess the impact of emerging database standards on the facilities which future database management systems will provide.

Guidelines for Assessment
Continuous assessment of laboratory work is based on the student's overall performance and performance in lab assignments. Each lab assignment is graded on parameters with appropriate weightage. Suggested parameters for the overall assessment and for each lab assignment include timely completion, performance, innovation, efficient code, punctuality and neatness.

Guidelines for Laboratory Conduction
The instructor is expected to frame the assignments with an understanding of the prerequisites, technological aspects, utility and recent trends related to the topic. The assignments should address the average student and include an element that attracts and challenges the stronger students. The instructor may set multiple sets of assignments and distribute them among batches of students. Assignments based on real-world problems or applications are preferred. Students should be encouraged to use Hungarian notation, indentation and comments appropriately. Use of open source software is encouraged. In addition, the instructor may assign one real-life application as a mini-project based on the concepts learned. The instructor may also set one assignment or mini-project, suitable to the respective branch, that goes beyond the scope of the syllabus.

Operating system recommended: 64-bit open source Linux or its derivative
Programming tools recommended: Open source C programming tool such as GCC

Suggested List of Laboratory Assignments
The list will be provided by the subject trainer.


M. Tech Advance Computing and Data Science Semester – II

CIA: Continuous Internal Assessment
L: Theory Lecture
T: Tutorial
P: Practical
TH: Theory Exam.
#: Internship for 15 days.
*: Oral Examination
UC: University Core
PC: Programme Core
PE: Programme Elective

CIA Weightage Description
CIA 1: 10%, Home Assignment
CIA 2: 20%, Mid-Term Exam (MTE)
CIA 3: 10%, Seminar Presentation
CIA 4: 10%, Research Based Activity
TOTAL: 50%

Program Elective - II:
1. Business Intelligence, Data Analysis and Data Visualization (PADE03)
2. Social Computing (PADE04)

Columns: Sr. No., Core, Course Code, Course Name, Teaching Scheme in Hrs./Week (L, T, P, C), Formative Assessment CIA (Course, Lab), Summative Assessment ESE (Course, Lab), Total Marks

1 PC PAD201 Practical Machine Learning 3 0 0 3 50 -- 50 -- 100
2 PC PAD202 Advance Data Structures 3 0 0 3 50 -- 50 -- 100
3 PC PAD203 High Performance Computing 3 0 0 3 50 -- 50 -- 100
4 PC PAD204 Spatial Computing 3 0 0 3 50 -- 50 -- 100
5 PE PADE__ Program Elective II 3 0 0 3 50 -- 50 -- 100
6 PC PAD211 Practical Machine Learning Lab 0 0 4 2 -- 25 -- 25 50
7 PC PAD212 Advance Data Structures Lab 0 0 4 2 -- 25 -- 25 50
8 PC PAD213 Parallel Computing Lab 0 0 4 2 -- 25 -- 25 50
TOTAL 15 00 12 21 250 75 250 75 650


Year: First Year
Semester: II
Course: Practical Machine Learning
Course Code: PAD201

Teaching Scheme (Hrs./Week): L 3, T 0, P 0, C 3
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Familiarity with basic concepts of computer science (algorithms, data structures, and complexity); mathematical maturity in discrete math, matrix math, probability and statistics; and the ability to program algorithms in a language of your choice (e.g., C++ or Matlab) in a Linux environment.

Objectives: Students will be able to:
1 Understand a wide variety of learning algorithms.
2 Understand how to evaluate models generated from data.

Unit 1
Module 1 (5 Hrs.): Introduction to machine learning; supervised, semi-supervised, unsupervised and reinforcement learning; uses of machine learning
Module 2 (5 Hrs.): Clustering, k-means, hierarchical clustering, decision trees, oblique trees

Unit 2
Module 1 (4 Hrs.): Classification problems, Bayesian analysis and the Naïve Bayes classifier
Module 2 (5 Hrs.): Random forests, gradient boosting machines, association rule learning, Apriori and FP-growth algorithms

Unit 3
Module 1 (5 Hrs.): Support vector machines, linear and non-linear classification
Module 2 (4 Hrs.): ARIMA, ML in real time, ensemble methods, neural networks and their applications

Unit 4
Module 1 (4 Hrs.): Large-scale machine learning, anomaly detection, machine learning system design, regression
Module 2 (4 Hrs.): Regression methods: least-squares regression, kernel regression, regression trees

Unit 5
Module 1 (3 Hrs.): Unsupervised learning: k-means, hierarchical
Module 2 (3 Hrs.): EM, non-negative matrix factorization, rate distortion theory

Outcomes: At the end of the course students will be able to:
1 Develop an appreciation for what is involved in learning models from data.
2 Apply the algorithms to a real-world problem, optimize the models learned and report on the expected accuracy that can be achieved by applying the models.


Text Books
1. Machine Learning by Tom Mitchell, McGraw-Hill, 1997

Reference Books
1. Machine Learning for Big Data by Jason Bell, Wiley
2. Machine Learning with R by Brett Lantz


Year: First Year
Semester: II
Course: Advanced Data Structure
Course Code: PAD202

Teaching Scheme (Hrs./Week): L 3, T 0, P 0, C 3
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Know the importance of data governance for managing Big Data. Outline the components needed in a Big Data platform.

Objectives: Students will be able to:
1 Understand and apply amortised analysis on data structures, including binary search trees, mergeable heaps, and disjoint sets.

Unit 1
Module 1 (5 Hrs.): Basic concepts of OOP, templates, function and class templates, algorithms
Module 2 (5 Hrs.): Performance analysis, time complexity and space complexity

Unit 2
Module 1 (4 Hrs.): ADT, list (singly, doubly and circular) implementation, array, pointer, cursor implementation, stacks and queues, ADT, implementation and applications
Module 2 (5 Hrs.): Trees, general, binary, binary search, expression search, AVL, introduction to red-black trees and splay trees, B-trees

Unit 3
Module 1 (5 Hrs.): Implementations, tree traversals, set, implementation, basic operations on set, priority queue
Module 2 (4 Hrs.): Implementation, graphs, directed graphs, shortest path problem, undirected graphs, spanning trees, graph traversals

Unit 4
Module 1 (4 Hrs.): Hash table representation, hash functions, collision resolution, separate chaining, open addressing, linear probing, quadratic probing, double hashing, rehashing, issues, managing equal-sized blocks
Module 2 (4 Hrs.): Garbage collection algorithms for equal-sized blocks, storage allocation for objects with mixed sizes, buddy systems, storage compaction

Unit 5
Module 1 (3 Hrs.): Searching techniques, sorting, internal sorting, bubble sort, insertion sort, quick sort, heap sort, bin sort, radix sort, external sorting, merge sort, multiway merge sort, polyphase sorting
Module 2 (3 Hrs.): Design techniques, divide and conquer, dynamic programming, greedy algorithms, backtracking, local search algorithms


Outcomes: At the end of the course students will be able to:
1 Use a range of highly efficient algorithms and data structures for fundamental computational problems across a variety of areas.

Text Books
1. Advanced Data Structures by Prof. Peter Brass

Reference Books
1. Data Structures and Algorithm Analysis in C++ by Mark A. Weiss (hardcover, 13 Jun 2013)


Year: First Year
Semester: II
Course: High Performance Computing
Course Code: PAD203

Teaching Scheme (Hrs./Week): L 3, T 0, P 0, C 3
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Knowledge of computer architecture, Linux, C/C++, and operating systems. Knowledge of virtualization and basic virtual machine management systems.

Objectives: Students will be able to:
1 Demonstrate a basic knowledge of numerical computing using an appropriate programming language.
2 Be competent in experimental computing in a numerical context and in the optimization of algorithms on high performance architectures.
3 Reason about the accuracy of mathematical and numerical models of real physical phenomena.

Unit 1
Module 1 (5 Hrs.): Concepts of parallelism, introduction to Amdahl's law and Gustafson's law, dependencies, interconnection networks
Module 2 (5 Hrs.): Race conditions, mutual exclusion, synchronization, and parallel slowdown; fine-grained, coarse-grained, and embarrassing parallelism

Unit 2
Module 1 (5 Hrs.): Bit-level parallelism, instruction-level parallelism, data parallelism, task parallelism, classes of parallel computers, multicore computing, symmetric multiprocessing, distributed computing
Module 2 (4 Hrs.): Cluster computing, massively parallel processing, grid computing, specialized parallel computers, MPI programming

Unit 3
Module 1 (4 Hrs.): Application porting, execution and scalability analysis: compiler flags, vectorization
Module 2 (5 Hrs.): Memory alignment of data, porting of applications on Linux, measurement of application execution time and memory consumption with small, medium and large datasets

Unit 4
Module 1 (4 Hrs.): Scalability analysis and identification of performance bottlenecks, profiling of applications to find opportunities for performance optimization
Module 2 (6 Hrs.): Addition of directives, restructuring of code for performance optimization

Unit 5
Module 1 (2 Hrs.): Communication optimization through configuration of MPI calls of the underlying MPI implementation
Module 2 (2 Hrs.): Partitioning applications for heterogeneous resources, use of existing libraries, tools, and frameworks

Outcomes: At the end of the course students will be able to:
1 Analyze a given problem for possibilities of parallel computation.
2 Select algorithms and hardware for the solution of high performance projects.
3 Program computers with shared and distributed memory architectures.
4 Use appropriate programming languages efficiently for scientific computations, and run parallel programs on different hardware architectures and software environments.
5 Assess the performance of implementations and optimize the performance of codes.

Text Books
An Introduction to Parallel Computing: Design and Analysis of Algorithms by Vipin Kumar, Ananth Grama, Anshul Gupta, Pearson
Parallel Programming in C with MPI and OpenMP by Michael J. Quinn, McGraw Hill Education

Reference Books
Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers by Barry Wilkinson, Pearson
An Introduction to Parallel Computing: Design and Analysis of Algorithms, 2nd Edition, by Vipin Kumar, Ananth Grama, Anshul Gupta, George Karypis


Year: First Year
Semester: II
Course: Spatial Computing
Course Code: PAD204

Teaching Scheme (Hrs./Week): L 3, T 0, P 0, C 3
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Knowledge of geo-information systems, GPS services and spatial information.

Objectives: Students will be able to:
1 Study and simulate spatial objects or phenomena that occur in the real world, and so facilitate problem solving and planning.

Unit 1
Module 1 (5 Hrs.): Spatial Computing: introduction to spatial computing, applications of spatial computing, advantages and disadvantages
Module 2 (5 Hrs.): GIS: introduction to GIS, concepts of GIS, applications of GIS, Functional

Unit 2
Module 1 (5 Hrs.): Elements / Components of GIS, GIS data types, raster data structure, Vector Data
Module 2 (4 Hrs.): Structure, functions of a GIS, topology, geographic data acquisition, introduction to map projections

Unit 3
Module 1 (4 Hrs.): GPS: introduction to GPS, history, segments of GPS, operating principle, trilateration, surveying with GPS
Module 2 (5 Hrs.): Differential GPS, types of errors, advantages and limitations of GPS, applications of GPS

Unit 4
Module 1 (4 Hrs.): Location Based Services (LBS): overview of LBS, components, navigation services, applications
Module 2 (4 Hrs.): Spatial statistics: spatial autocorrelation, spatial neighbourhood, distance model, adjacency model, contiguity, spatial query, raster analysis, vector analysis, network model and analysis, proximity analysis, closest facility analysis, spatial overlay analysis, buffer analysis, application areas of spatial analysis/statistics

Unit 5
Module 1 (3 Hrs.): Spatial database management: introduction to spatial database management, spatial data models, retrieve / read data from an existing database, design a database schema, create spatially enabled tables in Postgres/PostGIS, manage spatial data, query language
Module 2 (3 Hrs.): Virtual globes: introduction to virtual globes, applications of virtual globes, advantages of virtual globes; visualization techniques, spatial data visualization techniques, applications

Outcomes: At the end of the course students will be able to:
1 Demonstrate in-depth knowledge of geo-information systems, GPS services and spatial information.

Text Books
Design, Implementation and Management Issues by Spatial Data Repositories in Business, by Joseph Hayes, ISBN-10: 1543078850, ISBN-13: 978-1543078855

Reference Book
Spatial Computing: Issues in Vision, Multimedia and Visualization Technologies


Year: First Year
Semester: II
Course: Business Intelligence, Data Analysis and Data Visualization
Course Code: PADE03

Teaching Scheme (Hrs./Week): L 3, T 0, P 0, C 3
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Knowledge of database technologies. Understanding of data visualization.

Objectives: Students will be able to:
1 Understand business analytics, ETL, data warehousing and mining techniques.

Unit 1
Module 1 (6 Hrs.): BI basics, information gathering, decision-making, managing BI, BI user segmentation, gathering BI requirements, content management, knowledge management, social media, strategic approach to BI
Module 2: Data analytics, overview and analytics life cycle, need, structured and multi-structured data analysis, big-data analytics major components, analytical models and approaches

Unit 2
Module 1: Relational and non-relational databases, application areas, design and analysis of analytics models, analytics design steps, understanding different data processing models, statistical models, predictive models, descriptive models
Module 2: Regression analysis, forecasting techniques, simulation and risk analysis, optimization (linear, non-linear, integer), decision analysis, strategy and analytics

Unit 3
Module 1: Information visualization, data analytics life cycle, analytic processes and tools, analysis vs. reporting, modern data analytic tools
Module 2: Visualization techniques, visual encodings, visualization algorithms, data collection and binding

Unit 4
Module 1: Design patterns/best practices, cognitive issues, interactive visualization
Module 2: Visualizing big data, structured vs. unstructured

Unit 5
Module 1: Visual analytics
Module 2: Geomapping, dashboard design

Outcomes: At the end of the course students will be able to:
1 Use appropriate models of analysis, assess the quality of input, derive insight from results, and investigate potential issues.

Text Books
Python Data Visualization Cookbook, Igor Milovanović, Packt Publishing
Data Visualization with D3.js Cookbook, Nick Qi Zhu, Packt Publishing

Reference Book
Business Intelligence Guidebook: From Data Integration to Analytics, 1st Edition, Rick Sherman, Paperback ISBN: 9780124114616, eBook ISBN: 9780124115286


Year: First Year
Semester: II
Course: Social Computing
Course Code: PADE04

Teaching Scheme (Hrs./Week): L 3, T 0, P 0, C 3
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Knowledge of data analytics and analysis of social networks.

Objectives: Students will be able to:
1 Understand important features of social computing.
2 Understand the research issues in this field.

Unit 1: Online Social Networks (OSNs)
Module 1 (5 Hrs.): Introduction, types of social networks (e.g., Twitter, Facebook), measurement and collection of social network data, techniques to study different aspects of OSNs: follower-followee dynamics, link farming, spam detection, hashtag popularity and prediction, linguistic styles of tweets
Module 2 (5 Hrs.): Human-centered computing, classes of human-centered computation, methods of human-centered computation, incentives for participation, computer-supported cooperative work, computer-supported collaborative learning, crowdsourcing as a model for problem solving, ESP Game

Unit 2: Models of Opinion Formation
Module 1 (4 Hrs.): Opinion dynamics, continuous and discrete models; cultural, language dynamics, Axelrod model and its variant, the Naming Game, the Category Game
Module 2 (5 Hrs.): Crowd behavior, flocking, pedestrian behavior, applause dynamics and Mexican wave; formation of hierarchies, the Bonabeau model, the advancement-decline model; social spreading phenomena, rumor spreading, gossip spreading

Unit 3: Fundamentals of Social Data Analytics
Module 1 (5 Hrs.): Introduction, working with social media data, topic models, modeling social interactions on the Web
Module 2 (4 Hrs.): Random walks, variants of random walk

Unit 4: Applied Social Data Analytics
Module 1 (4 Hrs.): Application of topic models; opinions and sentiments: mining, analysis and summarization
Module 2 (4 Hrs.): Recommendation systems, language dynamics and influence in online communities

Unit 5
Module 1 (3 Hrs.): Community identification, link prediction and topical search in social networks
Module 2 (3 Hrs.): Psychometric analysis

Outcomes: At the end of the course students will be able to:
1 Design and prototype new social computing systems.
2 Analyze data left behind in social media.

Text Books
1. Claudio Cioffi-Revilla, Introduction to Computational Social Science: Principles and Applications, ISBN 978-1-4471-5661-1.

Reference Books
1. Matthew A. Russell, Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More, 2nd Edition, O'Reilly Media, 2013.
2. Robert Hanneman and Mark Riddle, Introduction to Social Network Methods, online textbook, 2005.
3. Jennifer Golbeck, Analyzing the Social Web, Morgan Kaufmann, 2013.
4. Claudio Castellano, Santo Fortunato, and Vittorio Loreto, Statistical physics of social dynamics, Rev. Mod. Phys. 81, 591, 11 May 2009.
5. S. Fortunato and C. Castellano, Word of mouth and universal voting behaviour in proportional elections, Phys. Rev. Lett. 99 (2007).
6. Douglas D. Heckathorn, The Dynamics and Dilemmas of Collective Action, American Sociological Review (1996).
7. Michael W. Macy and Robert Willer, From factors to actors: Computational Sociology and Agent-Based Modeling, Annual Review of Sociology Vol. 28: 143-166 (2002).


Year: First Year
Semester: II
Course: Practical Machine Learning Lab
Course Code: PAD211

Teaching Scheme (Hrs./Week): L 0, T 0, P 4, C 2
Continuous Internal Assessment (CIA): CIA-1 --, CIA-2 --, CIA-3 --, CIA-4 --, Lab 25
End Semester Examination: Theory --, Lab 25
Total: 50
Max. Time, End Semester Exam (Theory): 00 Hrs.
Max. Time, End Semester Exam (Lab): 03 Hrs.

Prerequisites: Familiarity with basic concepts of computer science (algorithms, data structures, and complexity); mathematical maturity in discrete math, matrix math, probability and statistics; and the ability to program algorithms in a language of your choice (e.g., C++ or Matlab) in a Linux environment.

Objectives: Students will be able to:
1 Understand a wide variety of learning algorithms.
2 Understand how to evaluate models generated from data.

Guidelines for Assessment
Continuous assessment of laboratory work is based on the student's overall performance and performance in lab assignments. Each lab assignment is graded on parameters with appropriate weightage. Suggested parameters for the overall assessment and for each lab assignment include timely completion, performance, innovation, efficient code, punctuality and neatness.

Guidelines for Laboratory Conduction
The instructor is expected to frame the assignments with an understanding of the prerequisites, technological aspects, utility and recent trends related to the topic. The assignments should address the average student and include an element that attracts and challenges the stronger students. The instructor may set multiple sets of assignments and distribute them among batches of students. Assignments based on real-world problems or applications are preferred. Students should be encouraged to use Hungarian notation, indentation and comments appropriately. Use of open source software is encouraged. In addition, the instructor may assign one real-life application as a mini-project based on the concepts learned. The instructor may also set one assignment or mini-project, suitable to the respective branch, that goes beyond the scope of the syllabus.

Operating system recommended: 64-bit open source Linux or its derivative
Programming tools recommended: Open source C programming tool such as GCC

Suggested List of Laboratory Assignments
The list will be provided by the subject trainer.


Year: First Year
Semester: II
Course: Advanced Data Structures Lab
Course Code: PAD212

Teaching Scheme (Hrs./Week): L 0, T 0, P 4, C 2
Continuous Internal Assessment (CIA): CIA-1 --, CIA-2 --, CIA-3 --, CIA-4 --, Lab 25
End Semester Examination: Theory --, Lab 25
Total: 50
Max. Time, End Semester Exam (Theory): 00 Hrs.
Max. Time, End Semester Exam (Lab): 03 Hrs.

Prerequisites: Know the importance of data governance for managing Big Data. Outline the components needed in a Big Data platform.

Objectives: Students will be able to:
1 Understand and apply amortised analysis on data structures, including binary search trees, mergeable heaps, and disjoint sets.

Guidelines for Assessment
Continuous assessment of laboratory work is based on the student's overall performance and performance in lab assignments. Each lab assignment is graded on parameters with appropriate weightage. Suggested parameters for the overall assessment and for each lab assignment include timely completion, performance, innovation, efficient code, punctuality and neatness.

Guidelines for Laboratory Conduction
The instructor is expected to frame the assignments with an understanding of the prerequisites, technological aspects, utility and recent trends related to the topic. The assignments should address the average student and include an element that attracts and challenges the stronger students. The instructor may set multiple sets of assignments and distribute them among batches of students. Assignments based on real-world problems or applications are preferred. Students should be encouraged to use Hungarian notation, indentation and comments appropriately. Use of open source software is encouraged. In addition, the instructor may assign one real-life application as a mini-project based on the concepts learned. The instructor may also set one assignment or mini-project, suitable to the respective branch, that goes beyond the scope of the syllabus.

Operating system recommended: 64-bit open source Linux or its derivative
Programming tools recommended: Open source C programming tool such as GCC

Suggested List of Laboratory Assignments
The list will be provided by the subject trainer.


Year: First Year Semester – II Course: Parallel Computing Lab Course Code: PAD213

Teaching Scheme

(Hrs. /Week) Continuous Internal Assessment (CIA) End Semester

Examination Total

L T P C CIA-1 CIA-2 CIA-3 CIA-4 Lab Theory Lab 0 0 4 2 -- -- -- -- 25 -- 25 50 Max. Time, End Semester Exam (Theory) -00 Hrs. End Semester Exam (Lab) – 03 Hrs. Prerequisites: Knowledge of Computer Architecture, Linux, C/C++, and Operating System. Knowledge of Virtualization and Basic virtual machine management system Objectives: Students are able to:-

1 Demonstrate a basic knowledge of numerical computing using an appropriate programming language.

2 Be competent in experimental computing in a numerical context and in the optimization of algorithms on high-performance architectures

3 Be able to reason about the accuracy of mathematical and numerical models of real physical phenomena

Guidelines for Assessment
Continuous assessment of laboratory work is based on the student's overall performance and performance in lab assignments. Each lab assignment is graded on parameters with appropriate weightage. Suggested parameters for overall assessment, as well as for each lab assignment, include timely completion, performance, innovation, efficient code, punctuality and neatness.

Guidelines for Laboratory Conduction
The instructor is expected to frame the assignments by understanding the prerequisites, technological aspects, utility and recent trends related to the topic. The assignment framing policy needs to address the average student and include an element to attract and promote the intelligent students. The instructor may set multiple sets of assignments and distribute them among batches of students. Assignments based on real-world problems/applications are appreciated. Students should be encouraged to use Hungarian notation, indentation and comments appropriately. Use of open source software is encouraged. In addition, the instructor may assign one real-life application in the form of a mini-project based on the concepts learned. The instructor may also set one assignment or mini-project, suitable to the respective branch, beyond the scope of the syllabus.

Operating system recommended: 64-bit open source Linux or its derivative
Programming tools recommended: open source C programming tool such as GCC

Suggested List of Laboratory Assignments
The list will be provided by the subject trainer.

M. Tech Advance Computing and Data Science
Semester – III

CIA: Continuous Internal Assessment
L: Theory Lecture
T: Tutorial
P: Practical
TH: Theory Exam.
#: Internship for 15 days.
*: Oral Examination
UC: University Core
PC: Programme Core
PE: Programme Elective

CIA | Weightage | Description
CIA 1 | 10% | Home Assignment
CIA 2 | 20% | Mid-Term Exam (MTE)
CIA 3 | 10% | Seminar Presentation
CIA 4 | 10% | Research Based Activity
TOTAL | 50% |

Program Elective-III:
1. AI and Soft Computing (PADE05)
2. Data Security (PADE06)

Teaching Scheme (Hrs./Week): L, T, P, C. Examination Scheme: Formative Assessment (CIA) and Summative Assessment (ESE), each split into Course and Lab.

Sr. No. | Core | Course Code | Course Name | L | T | P | C | CIA Course | CIA Lab | ESE Course | ESE Lab | Total Marks
1 | PC | PAD301 | Cloud Computing and Virtualization | 3 | 0 | 0 | 3 | 50 | -- | 50 | -- | 100
2 | PE | PADE__ | Program Elective III | 3 | 0 | 0 | 3 | 50 | -- | 50 | -- | 100
3 | PC | PAD311 | Seminars and Technical Communications | 0 | 0 | 4 | 2 | -- | 50 | -- | -- | 50
4 | PC | PAD312 | Presentation of Literature Review | 0 | 0 | 4 | 2 | -- | 50 | -- | 50 | 100
5 | UC | PAD313 | Dissertation Phase - I | 0 | 0 | 16 | 10 | -- | 50 | -- | 100 | 150
TOTAL | | | | 06 | 00 | 24 | 20 | 100 | 150 | 100 | 150 | 500

Year: Second Year    Semester – III
Course: Cloud Computing & Virtualization    Course Code: PAD301

Teaching Scheme (Hrs./Week): L 3, T 0, P 0, C 3
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Networking concepts, client-server architecture, Linux operating systems, computer architecture

Objectives: Students are able to:

1 Articulate the main concepts, key technologies, strengths, and limitations of cloud computing and the possible applications for state-of-the-art cloud computing

2 Identify the architecture and infrastructure of cloud computing, including SaaS, PaaS, IaaS, public cloud, private cloud, hybrid cloud, etc.

Unit No | Details | Hours
1 | Module 1: Introduction to Cloud Computing, Evolution, Benefits and Barriers, Cloud SPI models, Cloud Computing vs. Cluster Computing, Technology Involved in Cloud Computing, Infrastructure as a Service (IaaS), Virtualization, Platform as a Service (PaaS), Cloud platform management, Software as a Service, Case studies | 5
1 | Module 2: Service Management in Cloud Computing, Service Level Agreements (SLAs), Billing & Accounting, Scaling Cloud Hardware, Managing Data | 5
2 | Module 1: Cloud computing standards and interoperability, technical considerations for migration to the cloud, integrating existing applications with the cloud | 5
2 | Module 2: Performance Management in a Virtual Environment: management techniques, methodology and key performance metrics used to identify CPU, memory, network, virtual machine and application performance bottlenecks in a virtualized environment | 4
3 | Module 1: Backup and recovery of virtual machines using data recovery techniques; Scalability: scalability features within enterprise virtualized environments using advanced management applications that enable clustering, distributed network switches for clustering | 4
3 | Module 2: Network and storage expansion; High Availability | 5
4 | Module 1: Virtualization high availability and redundancy techniques | 4

4 | Module 2: Software Defined Network | 6
5 | Module 1: Cloud Security and Privacy | 2
5 | Module 2: Infrastructure security, data security and storage, data privacy, access management | 2

Outcomes: At the end of the course, students will be able to:

1 Analyze the cloud computing setup with its vulnerabilities and applications using different architectures.

2 Design different workflows according to requirements and apply the MapReduce programming model.

3 Apply and design suitable virtualization concepts and cloud resource management, and design scheduling algorithms.

4 Create combinatorial auctions for cloud resources and design scheduling algorithms for computing clouds.

5 Understand the impact of engineering on the legal and societal issues involved in addressing the security issues of cloud computing.

Text Books
Cloud Computing Bible, Barrie Sosinsky, Wiley-India, 2010

Reference Book
Cloud Computing: Principles and Paradigms, Editors: Rajkumar Buyya, James Broberg, Andrzej M. Goscinski, Wiley, 2011

Year: Second Year    Semester – III
Course: AI and Soft Computing    Course Code: PADE05

Teaching Scheme (Hrs./Week): L 3, T 2, P 0, C 3
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Design of algorithms, language processing and artificial intelligence

Objectives: Students are able to:

1 To provide the students with the concepts of soft computing techniques such as neural networks, fuzzy systems, genetic algorithms

Unit No | Details | Hours
1 | Module 1: Introduction to AI, problem formulation, production system, heuristic search techniques, knowledge representation and reasoning, ontology, natural language processing | 4
1 | Module 2: Propositional logic, first-order predicate logic, fuzzy logic, methods of reasoning, pattern recognition, multilayer neural network, self-organizing neural network | 4
2 | Module 1: Artificial neural network: introduction, characteristics, learning methods, taxonomy, evolution of neural networks, basic models, important technologies, applications | 4
2 | Module 2: Fuzzy logic: introduction, crisp sets, fuzzy sets, crisp relations and fuzzy relations: Cartesian product of relation, classical relation, fuzzy relations, tolerance and equivalence relations, non-interactive fuzzy sets | 4
3 | Module 1: Genetic algorithm: introduction, biological background, traditional optimization and search techniques, genetic basic concepts. McCulloch-Pitts neuron, linear separability, Hebb network | 4
3 | Module 2: Supervised learning network: perceptron networks, adaptive linear neuron, multiple adaptive linear neuron, BPN, RBF, TDNN | 4
4 | Module 1: Associative memory network: auto-associative memory network, hetero-associative memory network, BAM, Hopfield networks, iterative auto-associative memory network and iterative associative memory network; unsupervised learning networks: Kohonen self-organizing feature maps, LVQ, CP networks, ART network | 4
4 | Module 2: Membership functions: features, fuzzification, methods of membership value assignments; Defuzzification: lambda-cuts methods; fuzzy arithmetic and fuzzy measures: fuzzy arithmetic, extension principle, fuzzy measures, measures of fuzziness | 4
5 | Module 1: Fuzzy integrals; fuzzy rule base and approximate reasoning: truth values and tables, fuzzy propositions, formation of rules, decomposition of rules, aggregation of fuzzy rules, fuzzy reasoning, fuzzy inference systems, overview of fuzzy expert system, fuzzy decision making | 3
5 | Module 2: Genetic algorithm and search space, general genetic algorithm, operators, generational cycle, stopping condition, constraints, classification, genetic programming, multilevel optimization, real-life problem, neuro-fuzzy hybrid systems, genetic neuro hybrid systems, genetic fuzzy hybrid and fuzzy genetic hybrid systems, simplified fuzzy ARTMAP. Applications: a fusion approach of multispectral images with SAR, optimization of the traveling salesman problem using a genetic algorithm approach, soft computing based hybrid fuzzy controllers | 7

Outcomes: At the end of the course, students will be able to:

1 Demonstrate good knowledge of AI algorithms
2 Explain the fundamentals of neural networks and architectures
3 Describe fuzzy logic, various fuzzy systems and their functions

Text Books
1. Neural Networks in a Softcomputing Framework, K.-L. Du, M.N.S. Swamy
2. Soft Computing: Fundamentals and Applications, Revised Edition, D. K. Pratihar, ISBN: 978-81-8487-495-2

Reference Book
1. Software Agents and Soft Computing: Towards Enhancing Machine Intelligence: Concepts and Applications

Year: Second Year    Semester – III
Course: Data Security    Course Code: PADE06

Teaching Scheme (Hrs./Week): L 3, T 2, P 0, C 3
Continuous Internal Assessment (CIA): CIA-1 10, CIA-2 20, CIA-3 10, CIA-4 10, Lab --
End Semester Examination: Theory 50, Lab --
Total: 100
Max. Time, End Semester Exam (Theory): 3 Hrs.
Max. Time, End Semester Exam (Lab): 00 Hrs.

Prerequisites: Basics of network security, data manipulation, authentication and authorization, PKI, information security, intrusion and malware.

Objectives: Students are able to:

1 To train students in the organization, technical realization and security of data

Unit No | Details | Hours
1 | Module 1: Concepts of scripting language, Documenting Functions, Objects, Indenting Code, Testing Modules | 4
1 | Module 2: Native Datatypes, Introducing Dictionaries, Defining Dictionaries, Modifying Dictionaries, Deleting Items From Dictionaries | 4
2 | Module 1: Introducing Lists, Defining Lists, Adding Elements to Lists, Searching Lists, Deleting List Elements, Using List Operators | 4
2 | Module 2: Introducing Tuples, Declaring Variables, Referencing Variables, Assigning Multiple Values at Once, Formatting Strings, Mapping Lists | 4
3 | Module 1: Joining Lists and Splitting Strings, Historical Note on String Methods, Using Optional and Named Arguments, Using type, str, dir, and Other Built-In Functions | 4
3 | Module 2: Object References, Filtering Lists, Objects and Object-Orientation, Exceptions and File Handling | 4
4 | Module 1: Securing Big Data Environments, Securing Big Data: Application Software Security, Maintenance | 4
4 | Module 2: Monitoring and Analysis of Audit Logs, Secure Configurations for Hardware and Software | 4
5 | Module 1: Account Monitoring and Control, Challenges and approaches in spatial Big Data management, Issues in storage of Big Data | 3
5 | Module 2: Big Data and information distillation in social sensing, Integration with cloud computing security, Cryptography for Big Data, Privacy and Accountability in data analytics | 7

Outcomes:

At the end of the course, students will be able to:
1 Analyse the offered system and point to the potential safety problems
2 Choose an appropriate engineering approach to problem solving

Text Books
1. Big Data: Storage, Sharing, and Security, Fei Hu (Kindle Edition), ISBN 9781498734868, CAT# K26395

Reference Book
1. Hadoop Security, Ben Spivey and Joey Echeverria (Paperback), ISBN-10: 1491900989, ISBN-13: 978-1491900987

Year: Second Year    Semester – III
Course: Seminar and Technical Communication    Course Code: PAD311

Teaching Scheme (Hrs./Week), Continuous Internal Assessment (CIA), End Semester Examination, Total
L T P C CIA-1 CIA-2 CIA-3 CIA-4 Lab Theory Lab
-- -- 4 2 -- -- -- -- 50 -- 50
Max. Time, End Semester Exam (Theory): 00 Hrs.
Max. Time, End Semester Exam (Lab): 03 Hrs.

Unit No Details Hours

1

Seminar based on the state of the art in the selected electives. The presentation and the report should cover motivation, mathematical modeling, data-table discussion and conclusion. The report is to be prepared using a LaTeX derivative. To maintain the quality of the seminar work, seminar guides must maintain a progressive record of the seminar contact hours (1 hour per month per seminar). The record shall include the discussion agenda, weekly outcomes achieved during practical sessions, corrective actions and comments on the progress report as per the plan submitted by the students, including dates and timings, along with the signature of the student as per the class and teacher timetable (as additional teaching load). Examiners shall refer to this record of progressive work during evaluation.

Year: Second Year    Semester – III
Course: Presentation of Literature Review    Course Code: PAD312

Teaching Scheme (Hrs./Week), Continuous Internal Assessment (CIA), End Semester Examination, Total
L T P C CIA-1 CIA-2 CIA-3 CIA-4 Lab Theory Lab
-- -- 4 2 -- -- -- -- 50 -- 50
Max. Time, End Semester Exam (Theory): 00 Hrs.
Max. Time, End Semester Exam (Lab): 03 Hrs.

Unit No Details Hours

1

Seminar based on the state of the art in the selected electives. The presentation and the report should cover motivation, mathematical modeling, data-table discussion and conclusion. The report is to be prepared using a LaTeX derivative. To maintain the quality of the seminar work, seminar guides must maintain a progressive record of the seminar contact hours (1 hour per month per seminar). The record shall include the discussion agenda, weekly outcomes achieved during practical sessions, corrective actions and comments on the progress report as per the plan submitted by the students, including dates and timings, along with the signature of the student as per the class and teacher timetable (as additional teaching load). Examiners shall refer to this record of progressive work during evaluation.

Year: Second Year    Semester – III
Course: Dissertation Stage I    Course Code: PAD313

Teaching Scheme (Hrs./Week): L --, T --, P 16, C 10
Continuous Internal Assessment (CIA): CIA-1 --, CIA-2 --, CIA-3 --, CIA-4 --, Lab 50
End Semester Examination: Theory --, Lab 100
Total: 150
Max. Time, End Semester Exam (Theory): 00 Hrs.
Max. Time, End Semester Exam (Lab): 03 Hrs.

Unit No Details Hours

1

Motivation, problem statement, survey of journal papers related to the problem statement, problem modeling and design using set theory, NP-hard analysis, SRS, UML, classes, signals, test scenarios and other necessary problem-specific UML and software engineering documents. The student should publish one paper in an international journal (Web of Science, Scopus, Thomson Reuters) having an ISSN number and preferably with Citation Index II; alternatively, the paper can be published in a reputed international journal recommended by the dissertation guide. To maintain the quality of the dissertation work, dissertation guides must maintain a progressive record of the dissertation contact hours (1 hour per week). The record shall include the dissertation discussion agenda, weekly outcomes achieved during practical sessions, corrective actions and comments on the progress report as per the plan submitted by the students, including dates and timings, along with the signature of the student as per the class and teacher timetable. The dissertation examiners shall refer to this record of progressive work during evaluation. At most 8 dissertations can be assigned to a guide.

M. Tech Advance Computing and Data Science
Semester – IV

CIA: Continuous Internal Assessment
L: Theory Lecture
T: Tutorial
P: Practical
TH: Theory Exam.
#: Internship for 15 days.
*: Oral Examination
UC: University Core
PC: Programme Core
PE: Programme Elective

CIA | Weightage | Description
CIA 1 | 10% | Home Assignment
CIA 2 | 20% | Mid-Term Exam (MTE)
CIA 3 | 10% | Seminar Presentation
CIA 4 | 10% | Research Based Activity
TOTAL | 50% |

Teaching Scheme (Hrs./Week): L, T, P, C. Examination Scheme: Formative Assessment (CIA) and Summative Assessment (ESE), each split into Course and Lab.

Sr. No. | Core | Course Code | Course Name | L | T | P | C | CIA Course | CIA Lab | ESE Course | ESE Lab | Total Marks
1 | PC | PAD411 | Mid Semester Thesis Progress Review | -- | -- | 8 | 5 | -- | 50 | -- | -- | 50
2 | UC | PAD412 | Dissertation Phase II | -- | -- | 16 | 10 | -- | 50 | -- | 100 | 150
TOTAL | | | | 00 | 00 | 24 | 15 | 00 | 100 | 00 | 100 | 200

Year: Second Year    Semester – IV
Course: Mid Semester Thesis Progress Review    Course Code: PAD411

Teaching Scheme (Hrs./Week), Continuous Internal Assessment (CIA), End Semester Examination, Total
L T P C CIA-1 CIA-2 CIA-3 CIA-4 Lab Theory Lab
-- -- 4 2 -- -- -- -- 50 -- 50
Max. Time, End Semester Exam (Theory): 00 Hrs.
Max. Time, End Semester Exam (Lab): 03 Hrs.

Unit No Details Hours

1

Seminar based on the state of the art in the selected electives. The presentation and the report should cover motivation, mathematical modeling, data-table discussion and conclusion. The report is to be prepared using a LaTeX derivative. To maintain the quality of the seminar work, seminar guides must maintain a progressive record of the seminar contact hours (1 hour per month per seminar). The record shall include the discussion agenda, weekly outcomes achieved during practical sessions, corrective actions and comments on the progress report as per the plan submitted by the students, including dates and timings, along with the signature of the student as per the class and teacher timetable (as additional teaching load). Examiners shall refer to this record of progressive work during evaluation.

Year: Second Year    Semester – IV
Course: Dissertation Stage II    Course Code: PAD412

Teaching Scheme (Hrs./Week): L --, T --, P 16, C 10
Continuous Internal Assessment (CIA): CIA-1 --, CIA-2 --, CIA-3 --, CIA-4 --, Lab 50
End Semester Examination: Theory --, Lab 100
Total: 150
Max. Time, End Semester Exam (Theory): 00 Hrs.
Max. Time, End Semester Exam (Lab): 03 Hrs.

Unit No Details Hours

1

Motivation, problem statement, survey of journal papers related to the problem statement, problem modeling and design using set theory, NP-hard analysis, SRS, UML, classes, signals, test scenarios and other necessary problem-specific UML and software engineering documents. The student should publish one paper in an international journal (Web of Science, Scopus, Thomson Reuters) having an ISSN number and preferably with Citation Index II; alternatively, the paper can be published in a reputed international journal recommended by the dissertation guide. To maintain the quality of the dissertation work, dissertation guides must maintain a progressive record of the dissertation contact hours (1 hour per week). The record shall include the dissertation discussion agenda, weekly outcomes achieved during practical sessions, corrective actions and comments on the progress report as per the plan submitted by the students, including dates and timings, along with the signature of the student as per the class and teacher timetable. The dissertation examiners shall refer to this record of progressive work during evaluation. At most 8 dissertations can be assigned to a guide.