- Week 1: Data acquisition: Web scraping, Calling Internet APIs
- Week 2: Linear Regression: Multivariate linear regression, Polynomial regression, Regularization (Lasso, Ridge), Cross-validation, Train-Test split, MAE, MSE
- Week 3: Classification 1: Logistic regression, Accuracy, Confusion Matrix, Precision, Recall, F1-score
- Week 4: Classification 2: KNN Classifier, Decision Trees
- Week 5: Clustering: K-Means, Hierarchical clustering, Dendrogram
- Week 6: Association Rules: Association rule mining, Apriori algorithm
- Week 7: Recommender systems: User-User Collaborative Filtering (from scratch and using Surprise library), Mean-centered cosine similarity, Precision and Recall at rank k, Precision-recall curve
- Week 8: Text analytics: Text preparation (Tokenization, Lemmatization, Stopwords), Text representation (Bag of Words, TF-IDF), Text structure (Dependency Parsing, Entity recognition), Text similarity (cosine similarity)
- Week 9: Text analytics 2: Text embeddings, Bag of Words, TF-IDF, Word2vec, application to text classification
- Week 10: Neural Networks: Using PyTorch to build NN models, Artificial Neuron, Multilayer Perceptron, Using existing models from Hugging Face
- Week 11: Graph Analytics: Creating, visualizing, and analyzing graphs with NetworkX (Undirected Graph, Directed Graph, Weighted Graph, Erdős–Rényi graph, Zachary's karate club graph), Shortest path, Diameter, Centrality, Degree, PageRank, Community Detection
- Week 12: Chatbots
- Week 13: Dimensionality Reduction: PCA, t-SNE, MDS, Isomap
For the project, you will need to work with Git and GitHub. The following documentation may be useful: