-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
add readme documentation about background scheduler
- Loading branch information
1 parent
6766ece
commit 921baa9
Showing
1 changed file
with
145 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,145 @@ | ||
# Background Scheduler Documentation | ||
|
||
## What is a Background Scheduler? | ||
|
||
A Background Scheduler is a tool that allows you to run tasks or functions at specific intervals, asynchronously, | ||
without blocking the main application process. It is often used to perform background tasks that need to run | ||
periodically, such as data synchronization, sending notifications, running maintenance tasks, or cleaning up resources. | ||
|
||
**APScheduler** (Advanced Python Scheduler) is a popular Python library that provides powerful functionality for | ||
scheduling jobs in the background. It supports various job stores, execution options, and flexibility in scheduling. | ||
|
||
## How Background Scheduler is Typically Used | ||
|
||
1. **Running Periodic Tasks**: Background schedulers are commonly used for tasks that need to be executed at regular | ||
intervals, like cleaning up expired data from a database or updating a cache. | ||
2. **Scheduled Notifications**: Sending scheduled emails, reminders, or push notifications at set intervals or times. | ||
3. **Data Syncing**: Automatically fetching or syncing data from external services (e.g., APIs) at regular intervals to | ||
ensure fresh and up-to-date data. | ||
4. **Database Maintenance**: Running database cleanup jobs, backup tasks, or archiving operations periodically. | ||
|
||
## APScheduler Components | ||
|
||
APScheduler works by scheduling jobs and running them either once or on a repeating schedule. Here are some of the key | ||
components: | ||
|
||
- **Job**: A task or function that you want to schedule. This can be a function or method. | ||
- **Trigger**: Specifies when a job should be executed. Commonly used triggers include `interval` (to run periodically), | ||
`date` (to run once at a specific time), and `cron` (to run based on a cron expression). | ||
- **Job Store**: Where the job metadata is stored (in-memory, file, or a database). | ||
- **Executor**: Specifies how the job should be executed (e.g., in a thread or in a process). | ||
|
||
### Basic Example of Using APScheduler | ||
|
||
```python | ||
from apscheduler.schedulers.background import BackgroundScheduler | ||
import time | ||
|
||
|
||
# A simple function to be scheduled | ||
def job(): | ||
print("Running scheduled task") | ||
|
||
|
||
# Create a scheduler instance | ||
scheduler = BackgroundScheduler() | ||
|
||
# Add a job to run every 10 seconds | ||
scheduler.add_job(job, 'interval', seconds=10) | ||
|
||
# Start the scheduler in the background | ||
scheduler.start() | ||
|
||
# Keep the script running to allow the scheduler to run in the background | ||
try: | ||
while True: | ||
time.sleep(2) | ||
except (KeyboardInterrupt, SystemExit): | ||
scheduler.shutdown() # Shut down the scheduler when exiting the app | ||
``` | ||
|
||
In this example: | ||
|
||
- The `job` function will be executed every 10 seconds without blocking the main program. | ||
- The `BackgroundScheduler` allows the scheduler to run in the background while the main application continues to | ||
operate. | ||
|
||
### Common Use Cases for Background Schedulers | ||
|
||
- **Web Applications**: Running tasks in the background to offload work from the main application, such as updating | ||
data, sending notifications, or managing queues. | ||
- **Data Pipelines**: Running periodic data aggregation or transformation tasks in the background without user | ||
intervention. | ||
- **Monitoring Systems**: Scheduling tasks to monitor services or systems and send alerts or take action when certain | ||
conditions are met. | ||
- **Maintenance Jobs**: Scheduling cleanup tasks, backups, or other administrative tasks that need to run regularly | ||
without user interaction. | ||
|
||
## How Background Scheduler is Used in This Route | ||
|
||
In the provided route, the **APScheduler** is used to automate the updating of programming language tags from the Stack | ||
Overflow API at regular intervals (every 24 hours). This approach ensures that the data in the database remains | ||
up-to-date without having to rely on manual updates or requests to trigger the update process. | ||
|
||
### Key Functions Using Background Scheduler | ||
|
||
1. `update_tags_in_db` Function: | ||
- This function fetches the latest tags from the Stack Overflow API and updates the database with the new data. | ||
- It is an asynchronous function, allowing it to run in the background without blocking other operations of the | ||
application. | ||
|
||
2. `start_scheduler` Function: | ||
- This function is responsible for initializing the BackgroundScheduler and scheduling the update_tags_in_db | ||
function to run every 24 hours. | ||
- It sets the trigger type to 'interval', which means the job will run at fixed intervals, specifically every 24 | ||
hours in this case. | ||
- The scheduler runs as a background process, meaning the main application does not need to stop or wait for it to | ||
complete. | ||
|
||
#### How It Works in the Application | ||
|
||
- **Scheduled Update**: The scheduler runs automatically upon the application's startup and ensures that the Stack | ||
Overflow tags are updated every 24 hours without user intervention. | ||
- **Non-Blocking**: Since the scheduler runs in the background, the main FastAPI application is free to handle user | ||
requests and other processes without being blocked by this scheduled task. | ||
- **Database Update**: Every 24 hours, the `update_tags_in_db` function fetches the latest tags from the Stack Overflow | ||
API and updates the database with the new tag data, replacing the older entries. | ||
|
||
```python | ||
# Starts the scheduler to update Stack Overflow tags every 24 hours | ||
def start_scheduler(): | ||
scheduler = BackgroundScheduler() | ||
|
||
# Add a job to run the update_tags_in_db function every 24 hours | ||
scheduler.add_job(lambda: asyncio.run(update_tags_in_db()), 'interval', hours=24) | ||
|
||
# Start the scheduler in the background | ||
scheduler.start() | ||
``` | ||
|
||
### Advantages of Using APScheduler in This Route: | ||
|
||
- **Automation**: The update process is fully automated, so no manual triggers are needed to keep the data fresh. | ||
- **Efficiency**: Since the data is updated every 24 hours, API calls are limited, reducing the risk of hitting the | ||
quota limit for the Stack Overflow API. | ||
- **Scalability**: As the application scales, the scheduler will continue to handle periodic updates without additional | ||
complexity. | ||
- **Separation of Concerns**: The main application remains focused on handling user requests, while the scheduler | ||
manages the background task of updating the database. | ||
|
||
### Other Use Cases of Background Scheduler in FastAPI: | ||
|
||
- **Sending Automated Emails**: You can use APScheduler to schedule email reminders or notifications based on user | ||
actions or time intervals. | ||
- **Daily Data Backups**: Schedule regular backups of your database to prevent data loss. | ||
- **Periodic Cache Refreshing**: In applications that rely on external data sources, background schedulers can | ||
periodically refresh the cache to ensure the data is always fresh and up-to-date. | ||
- **Maintenance Tasks**: Tasks such as clearing old logs, cleaning up expired sessions, or recalculating metrics can be | ||
scheduled during off-peak hours. | ||
|
||
## Conclusion | ||
|
||
A **Background Scheduler** is a powerful tool for handling automated, periodic tasks within your applications. In the | ||
context of your FastAPI application, **APScheduler** is used to periodically update data from the Stack Overflow API, | ||
ensuring the system stays current without manual intervention. This enables efficient, scalable, and automated | ||
management of tasks that need to be performed regularly in the background. |