ERROR - Exception in ASGI application / Error in pixCreateHeader #333

Open
zakariamehbi opened this issue Dec 23, 2023 · 8 comments · Fixed by #334

Comments

@zakariamehbi

Describe the bug

2023-12-23 13:47:01,827 41.250.50.106:53059 POST /general/v0/general HTTP/1.1 - 500 Internal Server Error
2023-12-23 13:47:01,827 uvicorn.error ERROR Exception in ASGI application
Traceback (most recent call last):
  File "/home/notebook-user/.local/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/home/notebook-user/.local/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/fastapi/applications.py", line 1106, in __call__
    await super().__call__(scope, receive, send)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/home/notebook-user/.local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/home/notebook-user/.local/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/home/notebook-user/.local/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/fastapi/routing.py", line 274, in app
    raw_response = await run_endpoint_function(
  File "/home/notebook-user/.local/lib/python3.10/site-packages/fastapi/routing.py", line 193, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/notebook-user/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/home/notebook-user/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/home/notebook-user/prepline_general/api/general.py", line 811, in pipeline_1
    list(response_generator(is_multipart=False))[0]
  File "/home/notebook-user/prepline_general/api/general.py", line 749, in response_generator
    response = pipeline_api(
  File "/home/notebook-user/prepline_general/api/general.py", line 434, in pipeline_api
    elements = partition(**partition_kwargs)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/partition/auto.py", line 384, in partition
    elements = _partition_pdf(
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/documents/elements.py", line 503, in wrapper
    elements = func(*args, **kwargs)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/file_utils/filetype.py", line 591, in wrapper
    elements = func(*args, **kwargs)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/file_utils/filetype.py", line 546, in wrapper
    elements = func(*args, **kwargs)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/chunking/title.py", line 241, in wrapper
    elements = func(*args, **kwargs)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/partition/pdf.py", line 172, in partition_pdf
    return partition_pdf_or_image(
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/partition/pdf.py", line 279, in partition_pdf_or_image
    elements = _partition_pdf_or_image_local(
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/utils.py", line 214, in wrapper
    return func(*args, **kwargs)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/partition/pdf.py", line 409, in _partition_pdf_or_image_local
    final_layout = process_data_with_ocr(
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/partition/ocr.py", line 82, in process_data_with_ocr
    merged_layouts = process_file_with_ocr(
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/partition/ocr.py", line 168, in process_file_with_ocr
    raise e
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/partition/ocr.py", line 157, in process_file_with_ocr
    merged_page_layout = supplement_page_layout_with_ocr(
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/partition/ocr.py", line 190, in supplement_page_layout_with_ocr
    ocr_layout = get_ocr_layout_from_image(
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/partition/ocr.py", line 430, in get_ocr_layout_from_image
    ocr_regions = get_ocr_layout_tesseract(image, ocr_languages)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured/partition/ocr.py", line 465, in get_ocr_layout_tesseract
    ocr_df = unstructured_pytesseract.image_to_data(
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured_pytesseract/pytesseract.py", line 591, in image_to_data
    return {
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured_pytesseract/pytesseract.py", line 593, in <lambda>
    Output.DATAFRAME: lambda: get_pandas_output(
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured_pytesseract/pytesseract.py", line 568, in get_pandas_output
    return pd.read_csv(BytesIO(run_and_get_output(*args)), **kwargs)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured_pytesseract/pytesseract.py", line 347, in run_and_get_output
    run_tesseract(**kwargs)
  File "/home/notebook-user/.local/lib/python3.10/site-packages/unstructured_pytesseract/pytesseract.py", line 279, in run_tesseract
    raise TesseractError(proc.returncode, get_errors(error_string))
unstructured_pytesseract.pytesseract.TesseractError: (1, 'Error in pixCreateHeader: requested w = 34680, h = 48360, d = 32 Error in pixCreateHeader: requested bytes >= 2^31 Error in pixCreateNoInit: pixd not made Error in pixCreate: pixd not made Error in pixReadStreamPng: pix not made Error in pixReadStream: png: no pix returned Error in pixRead: pix not read Error during processing.')
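The numbers in that final error line can be checked by hand. The sketch below is only an arithmetic illustration, assuming that d = 32 means 4 bytes per pixel, that rows are tightly packed, and that the 2^31-byte ceiling quoted in the message is the only limit involved. The function name raster_bytes is hypothetical and not part of any library.

```python
# Limit quoted in the error text: "requested bytes >= 2^31".
MAX_BYTES = 2**31


def raster_bytes(width: int, height: int, depth_bits: int) -> int:
    """Uncompressed size in bytes of a width x height image at the given bit depth."""
    # Assumption: rows are tightly packed; a real imaging library may pad each row.
    bits_per_row = width * depth_bits
    bytes_per_row = (bits_per_row + 7) // 8
    return bytes_per_row * height


# Dimensions taken from the error message: w = 34680, h = 48360, d = 32.
requested = raster_bytes(34680, 48360, 32)
print(requested)               # 6708499200
print(requested >= MAX_BYTES)  # True
```

Under these assumptions the rendered page needs about 6.7 GB uncompressed, roughly three times the 2^31-byte limit, which matches the "requested bytes >= 2^31" failure.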

To Reproduce

var myHeaders = new Headers();
var formdata = new FormData();
formdata.append("files", fileInput.files[0], "ΣΤ ΔΗΜΟΤΙΚΟΥ 3.pdf");
formdata.append("output_format", "application/json");
formdata.append("coordinates", "false");
formdata.append("encoding", "utf-8");
formdata.append("hi_res_model_name", "detectron2_onnx");
formdata.append("include_page_breaks", "false");
formdata.append("ocr_languages", "");
formdata.append("pdf_infer_table_structure", "true");
formdata.append("skip_infer_table_types", "jpg, png");
formdata.append("strategy", "hi_res");
formdata.append("xml_keep_tags", "true");

var requestOptions = {
  method: 'POST',
  headers: myHeaders,
  body: formdata,
  redirect: 'follow'
};

fetch("http://my_hosted_api:8000/general/v0/general", requestOptions)
  .then(response => response.text())
  .then(result => console.log(result))
  .catch(error => console.log('error', error));

Filetype

Environment:

  • Self-hosting
  • Postman
@awalker4
Collaborator

awalker4 commented Dec 23, 2023

Hi there, this error has hopefully been fixed in the unstructured library. The unstructured version pinned in this repo's requirements is a bit behind. Can you try pip install unstructured==0.11.6 and see if that resolves it?

@zakariamehbi
Author

Hello @awalker4, I can't, because I'm using the Docker image. Is there any other way? Also, is there a reason the versions are lagging behind? Thank you.

@awalker4
Collaborator

No good reason other than our Dependabot seems to be broken 😂 Hang tight, I'll bump the versions now to get a new image out.

@zakariamehbi
Author

Thank you @awalker4 😂

awalker4 added a commit that referenced this issue Dec 23, 2023
This will close #333 which has been fixed in the library.
@zakariamehbi
Author

@awalker4 No new docker image?

@awalker4
Collaborator

Ah, seems this job just needs to finish: https://github.com/Unstructured-IO/unstructured-api/actions/runs/7310799243

@zakariamehbi
Author

@awalker4 I did another test with the new image and unfortunately, I got the same error.
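
One way to confirm which unstructured version a given image actually ships is to query the package metadata from inside the container. This is a minimal sketch, assuming Python 3.8+ is available there. The helper names are hypothetical, the version parsing is deliberately simple (it stops at the first non-numeric component), and 0.11.6 is the version suggested earlier in this thread.

```python
from importlib import metadata


def parse_version(text: str) -> tuple:
    """Turn '0.11.6' into (0, 11, 6); stops at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_at_least(installed: str, required: str) -> bool:
    """Compare two dotted version strings numerically."""
    return parse_version(installed) >= parse_version(required)


def check_unstructured(required: str = "0.11.6") -> str:
    """Report the installed unstructured version relative to the suggested one."""
    try:
        installed = metadata.version("unstructured")
    except metadata.PackageNotFoundError:
        return "unstructured is not installed in this environment"
    status = "ok" if is_at_least(installed, required) else "older than suggested"
    return f"unstructured {installed}: {status} (suggested >= {required})"


if __name__ == "__main__":
    print(check_unstructured())
```

If the reported version is already 0.11.6 or newer and the error persists, the version bump alone is probably not the cause.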

@awalker4
Collaborator

awalker4 commented Feb 9, 2024

Apologies, this bug slipped off the radar. Are you still seeing this issue?

@awalker4 awalker4 reopened this Feb 9, 2024