We recently encountered a batch submission that eventually failed after numerous errors like this one — but nonetheless submitted a new batch containing zero jobs.
[…]
File "/usr/local/lib/python3.10/site-packages/hailtop/utils/utils.py", line 792, in retry_transient_errors
return await retry_transient_errors_with_debug_string('', 0, f, *args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/hailtop/utils/utils.py", line 834, in retry_transient_errors_with_debug_string
st = ''.join(traceback.format_stack())
. The most recent error was <class 'hailtop.httpx.ClientResponseError'> 500, message='Internal Server Error', url=URL('http://batch.hail/api/v1alpha/batches/485962/updates/1/jobs/create') body='500 Internal Server Error\n\nServer got itself in trouble'.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/hailtop/utils/utils.py", line 809, in retry_transient_errors_with_debug_string
return await f(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/hailtop/aiocloud/common/session.py", line 117, in _request_with_valid_authn
return await self._http_session.request(method, url, **kwargs)
File "/usr/local/lib/python3.10/site-packages/hailtop/httpx.py", line 148, in request_and_raise_for_status
raise ClientResponseError(
hailtop.httpx.ClientResponseError: 500, message='Internal Server Error', url=URL('http://batch.hail/api/v1alpha/batches/485962/updates/1/jobs/create') body='500 Internal Server Error\n\nServer got itself in trouble'
2024-09-25 01:54:55,288 - hailtop.utils 835 - WARNING - A transient error occured. We will automatically retry. We have thus far seen 50 transient errors (next delay: 60.0s).
The corresponding server-side error was
pymysql.err.DataError: (1406, "Data too long for column 'value' at row 106")
coming from the INSERT INTO job_attributes … query in insert_jobs_into_db().
We write a list of the samples being processed as a job attribute, and it turned out that for at least some of the jobs of this batch this list had grown to longer than 64K of text.
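For illustration, here is a minimal sketch of how such an attribute can end up oversized, using the hailtop.batch client (the batch name, sample-id format, and backend arguments are placeholders, not our actual pipeline code):

```python
# Hypothetical sketch: joining many sample ids into a single job attribute.
import hailtop.batch as hb

b = hb.Batch(backend=hb.ServiceBackend(), name='joint-calling')  # backend args elided
sample_ids = [f'SAMPLE{i:07d}' for i in range(5000)]             # placeholder ids
j = b.new_job(name='combine', attributes={'samples': ','.join(sample_ids)})
# ','.join(sample_ids) is roughly 70,000 bytes here, over the 65,535-byte TEXT limit
```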
The job_attributes.value database field is of type TEXT, which limits each individual attribute value to 64 KiB (65,535 bytes).
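That limit is easy to confirm with pymysql, the driver that raised the server-side error above; this sketch assumes a scratch database with placeholder credentials and MySQL's default strict SQL mode:

```python
# Sketch: reproducing error 1406 against a TEXT column (table name is made up).
import pymysql

conn = pymysql.connect(host='localhost', user='test', password='...', database='scratch')
with conn.cursor() as cur:
    cur.execute("CREATE TABLE demo_attributes (`value` TEXT)")
    try:
        cur.execute("INSERT INTO demo_attributes (`value`) VALUES (%s)", ('x' * 70_000,))
    except pymysql.err.DataError as e:
        print(e)  # (1406, "Data too long for column 'value' at row 1")
```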
While writing a long list of sample ids as an attribute may or may not be a great idea 😄, it is fair to say that 64K is not a large maximum for user-supplied data here in the 21st century!
It may be worth adding a database migration to change the job_attributes.value column type (and perhaps also that of job_group_attributes.value) from TEXT to MEDIUMTEXT, which would raise the limit to 16 MiB (at, it appears, a cost of one extra byte per row).
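In essence the migration would boil down to two ALTER statements, roughly as below (a sketch only: connection details are placeholders, Batch manages its schema migrations differently, and any NOT NULL or explicit charset on the column would need to be re-stated in the MODIFY):

```python
# Hypothetical migration sketch via pymysql; ALTER TABLE will rebuild the
# tables, so expect downtime proportional to their size.
import pymysql

conn = pymysql.connect(host='batch-db', user='admin', password='...', database='batch')
with conn.cursor() as cur:
    # MEDIUMTEXT uses a 3-byte length prefix instead of TEXT's 2 bytes,
    # raising the per-value cap from 64 KiB to 16 MiB at one extra byte per row.
    cur.execute("ALTER TABLE job_attributes MODIFY COLUMN `value` MEDIUMTEXT")
    cur.execute("ALTER TABLE job_group_attributes MODIFY COLUMN `value` MEDIUMTEXT")
```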
Hi @jmarshall, the team talked about this issue in our standup today. We had some concerns about the appropriateness of using this table as long-term storage for larger metadata, and about the likely developer effort and system downtime involved in performing the migration. So we don't currently plan to prioritize this in the immediate future, but do let us know if you have any concerns about that, or if it turns out to be impossible for you to work around, and we might be able to reconsider (or perhaps come up with alternative solutions). Thanks!