twitter-watch

A twitter scraper using Django, Celery and Playwright.
This has been written for a Job Enterance competition.

How it Works

This is the block diagram of system:

This is how the application works

Task1 and Task2 are the core of application. We configure these tasks in twitterwatch\settings.py as follows:

CELERY_BEAT_SCHEDULE = {
    'scraping_task': {
        'task': 'scraping.tasks.scraping_method',
        'schedule': 60 * 15,
    },
    'sentiment_detection_task': {
        'task': 'scraping.tasks.sentiment_detection',
        'schedule': 60 * 17,
    }
}

Task1 is the Scraper which directly scrapes data from three accounts every 15 minutes and save the results in db.sqlite.
Task2 is responsible for sentiment detection using OpenAI's text-davinci-003 model. It checks the database every 17 minutes and if a tweet does note have a sentiment in DB ( == '') then detects vi API and records it in the DB.
Both tasks are defined in scraping\tasks.py

Run

You can simply run the project using docker-compose:

docker-compose up --build -d

And head over to url:8000/tweets to see the results in JSON format.
url:8000/tweets/@handle returns the tweets for every account and
url:8000/tweets/@handle/id is for each tweet with details such as "likes", "retweets" and etc.

[
    {
        "handle": "@elonmusk",
        "publish_date": "2023-03-16T17:10:48Z",
        "text": "You’re terrible & I love you",
        "sentiment": "Mixed sentiment.",
        "tweet_url": "https://twitter.com/elonmusk/status/1636414725549592576",
        "image_url": "",
        "video_url": "",
        "created_at": "2023-03-16T17:15:13.182073Z",
        "tweet_id": 1636414725549592576,
        "likes": "8,441",
        "retweets": "1,369",
        "replies": "2,120",
        "views": "274360"
    }
]

Name		Name	Last commit message	Last commit date
Latest commit History 41 Commits
Docs		Docs
scraping		scraping
twitterwatch		twitterwatch
.gitignore		.gitignore
Dockerfile		Dockerfile
README.md		README.md
celerybeat-schedule.db		celerybeat-schedule.db
db.sqlite3		db.sqlite3
docker-compose.yml		docker-compose.yml
docker-entrypoint.sh		docker-entrypoint.sh
manage.py		manage.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

twitter-watch

How it Works

This is how the application works

Run

About

Releases

Packages

Languages

h4med/twitter-watch

Folders and files

Latest commit

History

Repository files navigation

twitter-watch

How it Works

This is how the application works

Run

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages