Used natural language processing to perform topic modeling and build a content-based recommendation system for change.org petitions. Analysis and modeling were done with scikit-learn's feature extraction and decomposition methods, along with cosine similarity. Interactive dashboard visualizations were created with Plotly and Cufflinks and deployed with Flask on an AWS EC2 instance. Data was scraped from change.org using Scrapy and Selenium and stored in MongoDB.
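
As a rough illustration of the modeling approach described above, here is a minimal sketch of a content-based recommender built from scikit-learn feature extraction, a decomposition step, and cosine similarity. The specific choices here are assumptions and not necessarily what this project used: `TfidfVectorizer` for features, `NMF` for the topic model, two topics, and a toy list of made-up petition titles in place of the scraped data. The helper names `build_topic_vectors` and `recommend` are hypothetical.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy corpus standing in for the scraped petition text.
petitions = [
    "Ban single-use plastic bags in city grocery stores",
    "Stop plastic pollution in our oceans and rivers",
    "Increase funding for public school teachers",
    "Raise teacher salaries and school funding statewide",
    "Protect ocean wildlife from plastic waste",
    "Fund after-school programs for public school students",
]


def build_topic_vectors(texts, n_topics=2, random_state=0):
    """Turn raw texts into per-document topic weights (assumed TF-IDF + NMF)."""
    vectorizer = TfidfVectorizer(stop_words="english")
    tfidf = vectorizer.fit_transform(texts)  # sparse matrix: documents x terms
    nmf = NMF(
        n_components=n_topics,
        init="nndsvda",
        random_state=random_state,
        max_iter=500,
    )
    doc_topics = nmf.fit_transform(tfidf)  # dense matrix: documents x topics
    return vectorizer, nmf, doc_topics


def recommend(index, doc_topics, top_k=2):
    """Return indices of the top_k documents most similar to document `index`."""
    sims = cosine_similarity(doc_topics[index:index + 1], doc_topics).ravel()
    sims[index] = -1.0  # exclude the query document itself
    order = np.argsort(sims)[::-1][:top_k]
    return [int(i) for i in order]


vectorizer, nmf, doc_topics = build_topic_vectors(petitions)
recs = recommend(0, doc_topics)

print("Query:", petitions[0])
for i in recs:
    print("Recommended:", petitions[i])
```

Comparing petitions in topic space rather than raw term space lets two petitions match even when they share few exact words, at the cost of having to pick the number of topics up front.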
Please contact me at [email protected] if you would like access to the dataset.