
Set worker prefetch multiplier from application level
kshitijrajsharma committed Mar 3, 2024
1 parent 6dbe157 commit 5f3802e
Showing 3 changed files with 11 additions and 0 deletions.
4 changes: 4 additions & 0 deletions API/api_worker.py
@@ -25,6 +25,7 @@
HDX_SOFT_TASK_LIMIT,
)
from src.config import USE_S3_TO_UPLOAD as use_s3_to_upload
from src.config import WORKER_PREFETCH_MULTIPLIER
from src.config import logger as logging
from src.query_builder.builder import format_file_name_str
from src.validation.models import (
@@ -45,6 +46,9 @@
celery.conf.task_reject_on_worker_lost = True
celery.conf.task_acks_late = True

if WORKER_PREFETCH_MULTIPLIER:
celery.conf.update(worker_prefetch_multiplier=WORKER_PREFETCH_MULTIPLIER)


@celery.task(
bind=True,
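The `if WORKER_PREFETCH_MULTIPLIER:` guard in the diff above means a falsy value (`0` or `None`) leaves Celery's built-in default of 4 untouched, and only a truthy value overrides it. A minimal sketch of that behavior, using a dict-backed stand-in for `celery.conf` (a real Celery app exposes the same `conf.update(...)` API):

```python
class Conf:
    """Stand-in for celery.conf, just enough to show the update pattern."""

    def __init__(self):
        # Celery's default worker_prefetch_multiplier is 4.
        self.settings = {"worker_prefetch_multiplier": 4}

    def update(self, **kwargs):
        self.settings.update(kwargs)


def apply_prefetch(conf: Conf, multiplier) -> None:
    # Mirrors the diff: only override when a truthy value is configured,
    # so 0 or None keeps Celery's default in place.
    if multiplier:
        conf.update(worker_prefetch_multiplier=multiplier)


conf = Conf()
apply_prefetch(conf, 0)
print(conf.settings["worker_prefetch_multiplier"])  # -> 4 (default kept)
apply_prefetch(conf, 1)
print(conf.settings["worker_prefetch_multiplier"])  # -> 1 (override applied)
```

With the application-level default of `1` (see `src/config.py` below in this commit), the guard is almost always truthy, but it still protects against an explicit `0`.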
2 changes: 2 additions & 0 deletions docs/src/installation/configurations.md
@@ -63,6 +63,7 @@ The following are the different configuration options that are accepted.
| `ENABLE_CUSTOM_EXPORTS` | `ENABLE_CUSTOM_EXPORTS` | `[API_CONFIG]` | False | Enables custom exports endpoint and imports | OPTIONAL |
| `POLYGON_STATISTICS_API_URL` | `POLYGON_STATISTICS_API_URL` | `[API_CONFIG]` | `None` | API URL for fetching polygon statistics metadata. Currently tested with the GraphQL query endpoint of Kontour. Only required if enabled via ENABLE_POLYGON_STATISTICS_ENDPOINTS | OPTIONAL |
| `POLYGON_STATISTICS_API_RATE_LIMIT` | `POLYGON_STATISTICS_API_RATE_LIMIT` | `[API_CONFIG]` | `5` | Rate limit applied to the statistics endpoint, in requests per minute. Defaults to 5 requests per minute | OPTIONAL |
| `WORKER_PREFETCH_MULTIPLIER` | `WORKER_PREFETCH_MULTIPLIER` | `[CELERY]` | `1` | Number of tasks a worker can prefetch at a time | OPTIONAL |
| `DEFAULT_SOFT_TASK_LIMIT` | `DEFAULT_SOFT_TASK_LIMIT` | `[API_CONFIG]` | `7200` | Soft task time limit signal for celery workers, in seconds. It gently reminds celery to finish up the task and terminate. Defaults to 2 hours | OPTIONAL |
| `DEFAULT_HARD_TASK_LIMIT` | `DEFAULT_HARD_TASK_LIMIT` | `[API_CONFIG]` | `10800` | Hard task time limit signal for celery workers, in seconds. It immediately kills the celery task. Defaults to 3 hours | OPTIONAL |
| `USE_DUCK_DB_FOR_CUSTOM_EXPORTS` | `USE_DUCK_DB_FOR_CUSTOM_EXPORTS` | `[API_CONFIG]` | `False` | Enable this setting to use DuckDB. By default DuckDB is disabled and Postgres is used | OPTIONAL |
@@ -132,6 +133,7 @@ API Tokens have expiry date, It is `important to update API Tokens manually each
| `ENABLE_CUSTOM_EXPORTS` | `[API_CONFIG]` | Yes | Yes |
| `CELERY_BROKER_URL` | `[CELERY]` | Yes | Yes |
| `CELERY_RESULT_BACKEND` | `[CELERY]` | Yes | Yes |
| `WORKER_PREFETCH_MULTIPLIER` | `[CELERY]` | Yes | Yes |
| `FILE_UPLOAD_METHOD` | `[EXPORT_UPLOAD]` | Yes | Yes |
| `BUCKET_NAME` | `[EXPORT_UPLOAD]` | Yes | Yes |
| `AWS_ACCESS_KEY_ID` | `[EXPORT_UPLOAD]` | Yes | Yes |
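As the table above shows, the new setting lives in the `[CELERY]` section alongside the broker and result-backend settings. An illustrative config-file fragment (the broker/backend values are the documented defaults, not prescriptions):

```ini
[CELERY]
CELERY_BROKER_URL = redis://localhost:6379
CELERY_RESULT_BACKEND = redis://localhost:6379
WORKER_PREFETCH_MULTIPLIER = 1
```

It can equally be supplied as the environment variable `WORKER_PREFETCH_MULTIPLIER`, which takes precedence over the config file.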
5 changes: 5 additions & 0 deletions src/config.py
@@ -51,6 +51,11 @@ def get_bool_env_var(key, default=False):
"CELERY", "CELERY_RESULT_BACKEND", fallback="redis://localhost:6379"
)

WORKER_PREFETCH_MULTIPLIER = int(
os.environ.get("WORKER_PREFETCH_MULTIPLIER")
or config.get("CELERY", "WORKER_PREFETCH_MULTIPLIER", fallback=1)
)

### API CONFIG BLOCK #######################

RATE_LIMIT_PER_MIN = os.environ.get("RATE_LIMIT_PER_MIN") or int(
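The lookup added above gives the `WORKER_PREFETCH_MULTIPLIER` environment variable precedence over the `[CELERY]` section of the config file, falling back to `1`. A self-contained stdlib sketch of that precedence chain (the inline ini content here is illustrative; the project reads its own config file):

```python
import configparser
import os


def get_prefetch_multiplier(config: configparser.ConfigParser) -> int:
    # Environment variable wins; an unset or empty value is falsy and
    # falls through to the ini file, then to the default of 1.
    return int(
        os.environ.get("WORKER_PREFETCH_MULTIPLIER")
        or config.get("CELERY", "WORKER_PREFETCH_MULTIPLIER", fallback=1)
    )


os.environ.pop("WORKER_PREFETCH_MULTIPLIER", None)  # start from a clean env
config = configparser.ConfigParser()
config.read_string("[CELERY]\nWORKER_PREFETCH_MULTIPLIER = 4\n")

print(get_prefetch_multiplier(config))  # -> 4, ini value used when env is unset
os.environ["WORKER_PREFETCH_MULTIPLIER"] = "2"
print(get_prefetch_multiplier(config))  # -> 2, env overrides the ini file
```

Note that `int(...)` raises `ValueError` on a non-numeric value, so a malformed setting fails loudly at import time rather than silently misconfiguring the worker.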
