Slow inference when PaddleOCR is installed as a PaddleX plugin #2730

Open
3 tasks done
Longdexin opened this issue Dec 25, 2024 · 2 comments

Longdexin commented Dec 25, 2024

🔎 Search before asking

  • I have searched the PaddleOCR Docs and found no similar bug report.
  • I have searched the PaddleOCR Issues and found no similar bug report.
  • I have searched the PaddleOCR Discussions and found no similar bug report.

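🐛 Bug (Problem description)

  1. Running in pipeline mode with device='cpu', recognizing a single image takes 38 seconds; without setting device, it takes 10 seconds.
  2. Running in normal mode takes 20 seconds.
  3. All of these run times are far too long, and something seems wrong.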

🏃‍♂️ Environment

OS: Ubuntu 24.04.1 LTS
Mem: 32G
Python: 3.10.16
CPU: AMD Ryzen™ 9 5900X × 24

Package                   Version             Editable project location
------------------------- ------------------- -----------------------------------------
aiofiles                  24.1.0
aiohappyeyeballs          2.4.4
aiohttp                   3.11.11
aiosignal                 1.3.2
aistudio-sdk              0.2.6
albucore                  0.0.13
albumentations            1.4.10
annotated-types           0.7.0
anyio                     4.6.2.post1
argon2-cffi               23.1.0
argon2-cffi-bindings      21.2.0
arrow                     1.3.0
astor                     0.8.1
asttokens                 3.0.0
async-lru                 2.0.4
async-timeout             4.0.3
asyncio-atexit            1.0.1
attrdict3                 2.0.2
attrs                     24.3.0
babel                     2.16.0
backoff                   2.2.1
bce-python-sdk            0.9.25
beautifulsoup4            4.12.3
bleach                    6.2.0
blinker                   1.9.0
cachetools                5.5.0
certifi                   2024.8.30
cffi                      1.17.1
chardet                   5.2.0
charset-normalizer        3.4.1
chinese_calendar          1.10.0
click                     8.1.8
colorama                  0.4.6
colorlog                  6.9.0
comm                      0.2.2
cryptography              44.0.0
cssselect                 1.2.0
cssutils                  2.11.1
cycler                    0.12.1
Cython                    3.0.11
dataclasses-json          0.6.7
datasets                  3.2.0
debugpy                   1.8.11
decorator                 5.1.1
defusedxml                0.7.1
dill                      0.3.4
easydict                  1.13
editdistance              0.8.1
emoji                     2.14.0
erniebot                  0.5.0
erniebot_agent            0.5.0
et_xmlfile                2.0.0
eval_type_backport        0.2.2
exceptiongroup            1.2.2
executing                 2.1.0
faiss-cpu                 1.8.0.post1
fastapi                   0.115.6
fastjsonschema            2.21.1
filelock                  3.16.1
filetype                  1.2.0
fire                      0.7.0
Flask                     3.1.0
flask-babel               4.0.0
fonttools                 4.55.3
fqdn                      1.5.1
frozenlist                1.5.0
fsspec                    2024.9.0
future                    1.0.0
gast                      0.3.3
GPUtil                    1.4.0
greenlet                  3.1.1
grpcio                    1.51.3
h11                       0.14.0
html5lib                  1.1
httpcore                  1.0.6
httpx                     0.27.2
huggingface-hub           0.27.0
idna                      3.10
imageio                   2.36.1
imagesize                 1.4.1
imgaug                    0.4.0
ipykernel                 6.29.5
ipython                   8.31.0
isoduration               20.11.0
itsdangerous              2.2.0
jedi                      0.19.2
jieba                     0.42.1
Jinja2                    3.1.5
joblib                    1.4.2
json5                     0.10.0
jsonpatch                 1.33
jsonpath-python           1.0.6
jsonpointer               3.0.0
jsonschema                4.23.0
jsonschema-path           0.3.3
jsonschema-specifications 2023.12.1
jupyter_client            8.6.3
jupyter_core              5.7.2
jupyter-events            0.11.0
jupyter-lsp               2.2.5
jupyter_server            2.15.0
jupyter_server_terminals  0.5.3
jupyterlab                4.3.4
jupyterlab_pygments       0.3.0
jupyterlab_server         2.27.3
kiwisolver                1.4.8
langchain                 0.1.5
langchain-community       0.0.17
langchain-core            0.1.23
langdetect                1.0.9
langsmith                 0.0.87
lapx                      0.5.11.post1
lazy_loader               0.4
lazy-object-proxy         1.10.0
lmdb                      1.5.1
lxml                      5.3.0
markdown-it-py            3.0.0
MarkupSafe                3.0.2
marshmallow               3.23.2
matplotlib                3.5.2
matplotlib-inline         0.1.7
mdurl                     0.1.2
mistune                   3.0.2
more-itertools            10.5.0
motmetrics                1.4.0
multidict                 6.1.0
multiprocess              0.70.12.2
mypy-extensions           1.0.0
nbclient                  0.10.2
nbconvert                 7.16.4
nbformat                  5.10.4
nest-asyncio              1.6.0
networkx                  3.4.2
nltk                      3.9.1
notebook_shim             0.2.4
numpy                     1.24.4
olefile                   0.47
onnx                      1.17.0
openapi-schema-validator  0.6.2
openapi-spec-validator    0.7.1
opencv-contrib-python     4.10.0.84
opencv-python             4.5.5.64
opencv-python-headless    4.10.0.84
openpyxl                  3.1.5
opt-einsum                3.3.0
overrides                 7.7.0
packaging                 23.2
paddle2onnx               1.3.1
paddleclas                2.6.0
paddledet                 0.0.0
paddlefsl                 1.1.0
paddlenlp                 2.8.0.post0
paddleocr                 0.1.0.dev1+g3f32858
paddlepaddle              3.0.0b2
paddlex                   3.0.0b2             /home/longdexin/python-venvs/demo/PaddleX
pandas                    2.2.3
pandocfilters             1.5.1
Parsley                   1.3
parso                     0.8.4
pathable                  0.4.3
pexpect                   4.9.0
pillow                    11.0.0
pip                       24.3.1
platformdirs              4.3.6
premailer                 3.10.0
prettytable               3.12.0
prometheus_client         0.21.1
prompt_toolkit            3.0.48
propcache                 0.2.1
protobuf                  5.28.3
psutil                    5.9.8
ptyprocess                0.7.0
pure_eval                 0.2.3
pyarrow                   18.1.0
pybind11                  2.13.6
pyclipper                 1.3.0.post6
pycocotools               2.0.8
pycparser                 2.22
pycryptodome              3.21.0
pydantic                  2.9.2
pydantic_core             2.23.4
Pygments                  2.18.0
PyMuPDF                   1.25.1
pypandoc                  1.14
pyparsing                 3.2.0
pypdf                     5.1.0
python-dateutil           2.9.0.post0
python-docx               1.1.2
python-iso639             2024.10.22
python-json-logger        3.2.1
python-magic              0.4.27
python-oxmsg              0.0.1
pytz                      2024.2
PyYAML                    6.0.2
pyzmq                     26.2.0
qianfan                   0.0.3
RapidFuzz                 3.11.0
rarfile                   4.2
referencing               0.35.1
regex                     2024.11.6
requests                  2.32.3
requests-toolbelt         1.0.0
rfc3339-validator         0.1.4
rfc3986-validator         0.1.1
rich                      13.9.4
rpds-py                   0.22.3
ruamel.yaml               0.18.6
ruamel.yaml.clib          0.2.12
safetensors               0.4.5
scikit-image              0.25.0
scikit-learn              1.6.0
scipy                     1.14.1
Send2Trash                1.8.3
sentencepiece             0.2.0
seqeval                   1.2.2
setuptools                65.5.0
shapely                   2.0.6
shellingham               1.5.4
six                       1.17.0
sklearn                   0.0
sniffio                   1.3.1
soupsieve                 2.6
SQLAlchemy                2.0.36
stack-data                0.6.3
starlette                 0.41.3
tenacity                  8.5.0
termcolor                 2.5.0
terminado                 0.18.1
terminaltables            3.1.10
threadpoolctl             3.5.0
tifffile                  2024.12.12
tinycss2                  1.4.0
tokenizers                0.19.1
tomark                    0.1.4
tomli                     2.2.1
tool_helpers              0.1.2
tornado                   6.4.2
tqdm                      4.67.1
traitlets                 5.14.3
typeguard                 4.4.1
typer                     0.15.1
types-python-dateutil     2.9.0.20241206
typing_extensions         4.12.2
typing-inspect            0.9.0
tzdata                    2024.2
ujson                     5.10.0
unstructured              0.16.11
unstructured-client       0.28.1
uri-template              1.3.0
urllib3                   2.3.0
uvicorn                   0.34.0
visualdl                  2.5.3
wcwidth                   0.2.13
webcolors                 24.11.1
webencodings              0.5.1
websocket-client          1.8.0
Werkzeug                  3.1.3
wget                      3.2
wrapt                     1.17.0
xmltodict                 0.14.2
xxhash                    3.5.0
yacs                      0.1.8
yarl                      1.18.3

🌰 Minimal Reproducible Example

OCR.yaml

Global:
  pipeline_name: OCR
  input: https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_001.png
  
Pipeline:
  text_det_model: PP-OCRv4_server_det
  text_rec_model: PP-OCRv4_server_rec
  text_rec_batch_size: 1

Code

import cv2
import time
from paddlex import create_pipeline
from paddleocr import PaddleOCR

# Build both OCR entry points once, outside the timed loop.
pp_ocr = PaddleOCR(use_angle_cls=False)
ocr_pipeline = create_pipeline(pipeline='OCR.yaml', device='cpu')
for i in range(5):
    time1 = time.perf_counter()
    img = cv2.imread('FM00064655_7.jpg')
    height, width, _ = img.shape
    # ocr1: PaddleX pipeline (the timing also includes reading the image from disk).
    ocr_output = ocr_pipeline.predict(img)
    ocr_data = []
    for res in ocr_output:
        for idx in range(len(res.json['dt_polys'])):
            box = res.json['dt_polys'][idx]
            score = res.json['rec_score'][idx]
            text = res.json['rec_text'][idx]
            ocr_data.append({'text': text, 'score': score, 'box': box})
    time2 = time.perf_counter()
    print(f"ocr1: {i}|{time2-time1} seconds")
    # ocr2: PaddleOCR called directly on the same image.
    ocr_result = pp_ocr.ocr(img, cls=False)
    ocr_data = []
    for idx in range(len(ocr_result)):
        res = ocr_result[idx]
        for line in res:
            box, text, score = line[0], line[1][0], line[1][1]
            ocr_data.append({"text": text, "score": score, "box": box})
    time3 = time.perf_counter()
    print(f"ocr2: {i}|{time3-time2} seconds")
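
Since the loop above already runs five iterations, it may help to report the first iteration separately from the later ones, because a first call can include one-time setup cost that the later calls do not pay. The following is only a minimal sketch of that idea: `time_runs` and `summarize` are hypothetical helper names, not part of PaddleX or PaddleOCR, and the workload in the example is a stand-in for the real OCR call.

```python
import statistics
import time
from typing import Callable, List, Optional


def time_runs(fn: Callable[[], object], runs: int = 5) -> List[float]:
    """Call fn `runs` times and return each call's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> dict:
    """Report the first call separately from the median of the remaining calls."""
    later = durations[1:]
    median_later: Optional[float] = statistics.median(later) if later else None
    return {"first": durations[0], "median_later": median_later}


if __name__ == "__main__":
    # Stand-in workload (an assumption for illustration only).
    durations = time_runs(lambda: sum(range(100_000)), runs=5)
    print(summarize(durations))
```

To time the pipeline itself, the stand-in could be swapped for something like `lambda: list(ocr_pipeline.predict(img))`. Wrapping the result in `list(...)` is a precaution in case `predict` produces its results lazily, which would otherwise leave the actual work outside the timed region.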


Longdexin changed the title from "Inference time is too long in version 2.9.1" to "Slow inference when PaddleOCR is installed as a PaddleX plugin" on Dec 25, 2024
jzhang533 transferred this issue from PaddlePaddle/PaddleOCR on Dec 25, 2024
cuicheng01 (Collaborator) commented

Would it be possible for you to share your image?

Longdexin (Author) commented Dec 30, 2024

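@cuicheng01, hello, and thank you very much! The image comes from a colleague's experimental results, so it is not really suitable to post here. The same thing happens with this image: the run time is still long, 8 seconds for one and a little over 1 second for the other. https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_001.png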