Fake News Detection

Given a dataset of news articles or headlines, the goal is to classify each item as either "Fake" or "Authentic". Fake news is typically defined as news that contains false or intentionally misleading information, while authentic (real) news contains accurate, factual information. The challenge is to develop a model that can effectively distinguish between the two.

Working Implementation

Demo_Fake-News-Detection.mp4

Proposed Solution

The proposed solution is a Python-based machine learning model that takes a dataset of news articles, preprocesses and vectorizes the text, and trains a classifier to label each article as real or fake. The classifier is a Linear Support Vector Classifier (LinearSVC), combined with TF-IDF vectorization in a single pipeline, and it has shown high accuracy in detecting fake news. Exploratory Data Analysis is also performed on the dataset.
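As a rough illustration of the pipeline described above, here is a minimal sketch that assumes scikit-learn is the library behind the TF-IDF and LinearSVC steps. The toy headlines, the step names "tfidf" and "svc", and the stop-word setting are assumptions made for this example; they are not taken from the project's notebooks or dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hand-written toy examples for illustration only (not the project's dataset).
texts = [
    "Government publishes annual budget report",
    "Scientists confirm results in peer reviewed study",
    "City council approves new public transport plan",
    "Miracle pill cures every disease overnight",
    "Aliens secretly control the world banks",
    "Celebrity reveals shocking secret doctors hate",
]
labels = ["Authentic", "Authentic", "Authentic", "Fake", "Fake", "Fake"]

# Chain vectorization and classification so raw text goes in and labels come out.
model = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english")),  # text -> TF-IDF features
    ("svc", LinearSVC()),                              # linear SVM classifier
])

model.fit(texts, labels)

# Predict on the same toy texts just to show the call shape.
predictions = model.predict(texts)
for text, label in zip(texts, predictions):
    print(f"{label:9s} | {text}")
```

Wrapping both steps in one Pipeline means the vectorizer is fitted only on the training text and the same transformation is applied automatically at prediction time. A real evaluation would hold out a test split rather than predict on the training data as this sketch does.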

To Test

Installation

  1. Create a virtual environment.

    • This project uses the virtualenv package, which can be installed by running pip install virtualenv in the terminal.
    • Create a virtual environment by running python -m virtualenv venv.
    • Activate the virtual environment by running venv\Scripts\activate on Windows, or source venv/bin/activate on Linux and macOS.
  2. Install the required packages.

    • The packages can be installed by running pip install -r requirements.txt.
    • This should install everything the project needs; however, some of the listed packages may since have been deprecated.
  3. Run the cells in "prerequisites.ipynb".

  4. In the terminal, run streamlit run analysis.py (this will take some time to run).

    • hosted_analysis.py: Does not make use of PySpark
    • analysis.py: Makes use of PySpark
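Because the two entry points differ only in whether they need PySpark, one way to decide which to launch is to check whether PySpark can be imported. The helper below is a hypothetical sketch and not part of the repository: the function names pick_script and pyspark_installed are invented for illustration, and only the two script names come from the list above.

```python
import importlib.util


def pyspark_installed() -> bool:
    """Return True if the pyspark package can be found in this environment."""
    return importlib.util.find_spec("pyspark") is not None


def pick_script(pyspark_available: bool) -> str:
    """Choose the Streamlit entry point (hypothetical helper, not in the repo)."""
    # analysis.py uses PySpark; hosted_analysis.py does not.
    return "analysis.py" if pyspark_available else "hosted_analysis.py"


if __name__ == "__main__":
    script = pick_script(pyspark_installed())
    # Print the command to run rather than launching it.
    print(f"streamlit run {script}")
```

Running this in the activated virtual environment prints the streamlit command that matches the installed packages.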

Reference