All the Assignments for Big Data
Assignment 1 uses python queries and Jaccard index. Assignment 2 uses MapReduce and is tested on Hadoop through Docker. Uses n-grams, skipgrams, and inverted indices. Both Assignment 1 and 2 are run on a large database. Assignment 3 uses Spark to identify Twitter trends.