University of Mumbai Syllabus for Semester 8 (BE Fourth Year) Big Data Analytics: Knowing the syllabus is important for students of Semester 8 (BE Fourth Year). Shaalaa also provides a list of topics that every student needs to understand.

The University of Mumbai Semester 8 (BE Fourth Year) Big Data Analytics syllabus for the academic year 2021-2022 is based on the Board's guidelines. Students should read the Semester 8 (BE Fourth Year) Big Data Analytics syllabus to learn about the subject's topics and subtopics.

The University of Mumbai Semester 8 (BE Fourth Year) Big Data Analytics Syllabus PDF 2021-2022 lists the unit names, the chapters under each unit, and the subtopics under each chapter. Students will also receive a complete practical syllabus for Semester 8 (BE Fourth Year) Big Data Analytics.

## University of Mumbai Semester 8 (BE Fourth Year) Big Data Analytics Revised Syllabus

University of Mumbai Semester 8 (BE Fourth Year) Big Data Analytics and its unit-wise marks distribution

## Syllabus

- Introduction to Big Data, Big Data characteristics, types of Big Data, Traditional vs. Big Data business approach, Case Study of Big Data Solutions.

- What is Hadoop?
- Core Hadoop Components
- Hadoop Ecosystem
- Physical Architecture
- Hadoop limitations

- What is NoSQL? NoSQL business drivers; NoSQL case studies.
- NoSQL data architecture patterns: Key-value stores, Graph stores, Column family (Bigtable) stores, Document stores, Variations of NoSQL architectural patterns;
- Using NoSQL to manage big data:- What is a big data NoSQL solution? Understanding the types of big data problems; Analyzing big data with a shared-nothing architecture; Choosing distribution models: master-slave versus peer-to-peer; Four ways that NoSQL systems handle big data problems.

- Physical Organization of Compute Nodes, Large-Scale File-System Organization.

- The Map Tasks, Grouping by Key, The Reduce Tasks, Combiners, Details of MapReduce Execution, Coping With Node Failures.

- Matrix-Vector Multiplication by MapReduce, Relational-Algebra Operations, Computing Selections by MapReduce, Computing Projections by MapReduce, Union, Intersection, and Difference by MapReduce, Computing Natural Join by MapReduce, Grouping and Aggregation by MapReduce, Matrix Multiplication, Matrix Multiplication with One MapReduce Step.

- Applications of Near-Neighbor Search, Jaccard Similarity of Sets, Similarity of Documents, Collaborative Filtering as a Similar-Sets Problem.
- Distance Measures:- Definition of a Distance Measure, Euclidean Distances, Jaccard Distance, Cosine Distance, Edit Distance, Hamming Distance.

- A Data-Stream-Management System, Examples of Stream Sources, Stream Queries, Issues in Stream Processing.

- Obtaining a Representative Sample, The General Sampling Problem, Varying the Sample Size.

- The Bloom Filter, Analysis.

- The Count-Distinct Problem, The Flajolet-Martin Algorithm, Combining Estimates, Space Requirements.

- The Cost of Exact Counts, The Datar-Gionis-Indyk-Motwani Algorithm, Query Answering in the DGIM Algorithm, Decaying Windows.

- PageRank Definition, Structure of the Web, Dead Ends, Using PageRank in a Search Engine, Efficient Computation of PageRank:- PageRank Iteration Using MapReduce, Use of Combiners to Consolidate the Result Vector.
- Topic-Sensitive PageRank, Link Spam, Hubs and Authorities.

- Algorithm of Park, Chen, and Yu, The Multistage Algorithm, The Multihash Algorithm.

- Sampling Methods for Streams, Frequent Itemsets in Decaying Windows.

- CURE Algorithm, Stream-Computing, A Stream-Clustering Algorithm, Initializing & Merging Buckets, Answering Queries.

- A Model for Recommendation Systems, Content-Based Recommendations, Collaborative Filtering.

- Social Networks as Graphs, Clustering of Social-Network Graphs, Direct Discovery of Communities, SimRank, Counting triangles using MapReduce.