Describe the bug
When I set split_pdf_page=True and split_pdf_concurrency_level=15, most of the split requests fail.
For example, if the PDF is divided into 10 sets, the client logs the following errors and only the last set is partitioned:
ERROR: Failed to send request for page 1
...
WARNING: Failed to partition set #1, its elements will be omitted in the final result.
...
WARNING: Failed to partition set #9, its elements will be omitted in the final result.
INFO: Successfully partitioned set #10, elements added to the final result.
To Reproduce
Code:
import os

import requests
import unstructured_client
from unstructured_client.models import operations, shared

# Placeholder values; set the real API key and server URL here.
os.environ["UNSTRUCTURED_API_KEY"] = "EMPTY"
os.environ["UNSTRUCTURED_API_URL"] = ""

requests_client = requests.Session()
client = unstructured_client.UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY"),
    server_url=os.getenv("UNSTRUCTURED_API_URL"),
    client=requests_client,
)

filename = "./test_pdf.pdf"
# Read the PDF up front so the file handle is closed before the request is sent.
with open(filename, "rb") as file:
    content = file.read()

req = operations.PartitionRequest(
    partition_parameters=shared.PartitionParameters(
        files=shared.Files(
            content=content,
            file_name=filename,
        ),
        strategy=shared.Strategy.HI_RES,
        split_pdf_page=True,
        split_pdf_concurrency_level=15,
        chunking_strategy=shared.ChunkingStrategy("by_title"),
    )
)

try:
    res = client.general.partition(req)
    element_dicts = list(res.elements)
    print(element_dicts)
    for element in element_dicts:
        print(element["text"])
except Exception as e:
    print(e)
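Because the failed sets are dropped with only a warning, the script above still prints a result that looks successful. A minimal sketch of a coverage check follows. It assumes each returned element is a dict carrying `metadata["page_number"]`; the helper name `missing_pages` and the sample data are hypothetical and not part of unstructured_client.

```python
def missing_pages(elements, total_pages):
    """Return the page numbers in 1..total_pages that no element reports."""
    seen = set()
    for el in elements:
        # Assumption: elements are dicts with an optional metadata.page_number.
        page = (el.get("metadata") or {}).get("page_number")
        if page is not None:
            seen.add(page)
    return [p for p in range(1, total_pages + 1) if p not in seen]


# Hypothetical result in which only the last page of a 23-page PDF came back.
sample = [
    {"type": "CompositeElement", "text": "example", "metadata": {"page_number": 23}},
]
print(missing_pages(sample, total_pages=23))
```

With chunking enabled, a chunk may span several pages while reporting a single page number, so this check can over-report missing pages. It is more reliable when run without chunking_strategy.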
Console Information:
INFO: Preparing to split document for partition.
INFO: Concurrency level set to 15
INFO: Splitting pages 1 to 23 (23 total)
INFO: Determined optimal split size of 2 pages.
INFO: Partitioning 11 files with 2 page(s) each.
INFO: Partitioning 1 file with 1 page(s).
INFO: Partitioning set #1 (pages 1-2).
INFO: Partitioning set #2 (pages 3-4).
INFO: Partitioning set #3 (pages 5-6).
INFO: Partitioning set #4 (pages 7-8).
INFO: Partitioning set #5 (pages 9-10).
INFO: Partitioning set #6 (pages 11-12).
INFO: Partitioning set #7 (pages 13-14).
INFO: Partitioning set #8 (pages 15-16).
INFO: Partitioning set #9 (pages 17-18).
INFO: Partitioning set #10 (pages 19-20).
INFO: Partitioning set #11 (pages 21-22).
INFO: Partitioning set #12 (pages 23-23).
ERROR: Failed to send request for page 1
ERROR: Failed to send request for page 3
ERROR: Failed to send request for page 5
ERROR: Failed to send request for page 7
ERROR: Failed to send request for page 9
ERROR: Failed to send request for page 11
ERROR: Failed to send request for page 13
ERROR: Failed to send request for page 15
ERROR: Failed to send request for page 17
ERROR: Failed to send request for page 19
ERROR: Failed to send request for page 21
WARNING: Failed to partition set #1, its elements will be omitted in the final result.
WARNING: Failed to partition set #2, its elements will be omitted in the final result.
WARNING: Failed to partition set #3, its elements will be omitted in the final result.
WARNING: Failed to partition set #4, its elements will be omitted in the final result.
WARNING: Failed to partition set #5, its elements will be omitted in the final result.
WARNING: Failed to partition set #6, its elements will be omitted in the final result.
WARNING: Failed to partition set #7, its elements will be omitted in the final result.
WARNING: Failed to partition set #8, its elements will be omitted in the final result.
WARNING: Failed to partition set #9, its elements will be omitted in the final result.
WARNING: Failed to partition set #10, its elements will be omitted in the final result.
WARNING: Failed to partition set #11, its elements will be omitted in the final result.
INFO: Successfully partitioned set #12, elements added to the final result.
INFO: Successfully partitioned the document.
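To make the failure pattern in the log above easier to see, here is a minimal sketch that tallies which sets failed and which succeeded. It relies only on the message formats shown in this report. The function name `summarize_sets` is hypothetical, and the sample text is a shortened copy of the log.

```python
import re

# A few lines copied from the console output in this report.
LOG = """\
ERROR: Failed to send request for page 1
WARNING: Failed to partition set #1, its elements will be omitted in the final result.
WARNING: Failed to partition set #11, its elements will be omitted in the final result.
INFO: Successfully partitioned set #12, elements added to the final result.
"""

FAILED = re.compile(r"^WARNING: Failed to partition set #(\d+)")
SUCCEEDED = re.compile(r"^INFO: Successfully partitioned set #(\d+)")


def summarize_sets(log_text):
    """Return (failed_sets, successful_sets) as sorted lists of set numbers."""
    failed, succeeded = set(), set()
    for line in log_text.splitlines():
        match = FAILED.match(line)
        if match:
            failed.add(int(match.group(1)))
            continue
        match = SUCCEEDED.match(line)
        if match:
            succeeded.add(int(match.group(1)))
    return sorted(failed), sorted(succeeded)


failed, succeeded = summarize_sets(LOG)
print(f"failed sets: {failed}")
print(f"successful sets: {succeeded}")
```

Run against the full log, this reports sets 1 through 11 as failed and only set 12 (page 23) as successful.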