Commit

Merge pull request #1422 from myhloli/dev
docs(README): update for 1.0.0 release and improve documentation
myhloli authored Jan 6, 2025
2 parents 57e0af4 + 29dde7c commit bc39fa8
Showing 2 changed files with 22 additions and 6 deletions.
14 changes: 11 additions & 3 deletions README.md
@@ -42,6 +42,13 @@
</div>

# Changelog
- 2025/01/06 1.0.0 released. This is our first official release, in which we have introduced a completely new API and enhanced compatibility through extensive refactoring:
  - New API Interface
    - For the data-side API, we have introduced the Dataset class, designed to provide a robust and flexible data processing framework. This framework currently supports a variety of document formats, including images (.jpg and .png), PDFs, Word documents (.doc and .docx), and PowerPoint presentations (.ppt and .pptx), and covers data processing tasks ranging from simple to complex.
    - For the user-side API, we have designed the MinerU processing workflow as a series of composable Stages. Each Stage represents a specific processing step, so users can define new Stages according to their needs and combine them to customize their data processing workflows.
  - Enhanced Compatibility
    - By optimizing the dependency environment, we ensure stable and efficient operation on ARM architecture Linux systems.
    - We have deeply integrated with Huawei Ascend NPU acceleration, providing autonomous and controllable high-performance computing capabilities. This supports the localization and development of AI application platforms in China.
- 2024/11/22 0.10.0 released, introducing hybrid OCR text extraction capabilities:
  - Significantly improved parsing performance in complex text distribution scenarios such as dense formulas, irregular span regions, and text represented by images.
  - Combines the accurate content extraction and faster speed of text mode with the more precise span/line region recognition of OCR mode.
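
The composable-Stage idea described in the 1.0.0 changelog entry can be illustrated with a minimal sketch. This is not MinerU's actual API: `Document`, `compose`, `strip_whitespace`, and `count_words` are hypothetical names chosen for illustration, and a Stage is assumed here to be a plain function from a document to a document.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical illustration only; none of these names come from MinerU.


@dataclass
class Document:
    """A toy document: extracted text plus free-form metadata."""
    text: str
    metadata: Dict[str, object] = field(default_factory=dict)


# Assumption: a Stage is any callable that maps a Document to a new Document.
Stage = Callable[[Document], Document]


def strip_whitespace(doc: Document) -> Document:
    """Example stage: trim leading and trailing whitespace."""
    return Document(doc.text.strip(), dict(doc.metadata))


def count_words(doc: Document) -> Document:
    """Example stage: record a word count in the metadata."""
    meta = dict(doc.metadata)
    meta["word_count"] = len(doc.text.split())
    return Document(doc.text, meta)


def compose(*stages: Stage) -> Stage:
    """Chain stages left to right into a single stage."""
    def pipeline(doc: Document) -> Document:
        for stage in stages:
            doc = stage(doc)
        return doc
    return pipeline


pipeline = compose(strip_whitespace, count_words)
result = pipeline(Document("  hello composable stages  "))
```

Because each stage in this sketch returns a new `Document` rather than mutating its input, stages can be reordered or swapped without hidden coupling between them.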
@@ -248,7 +255,7 @@ You can modify certain configurations in this file to enable or disable features
{
// other config
"layout-config": {
- "model": "layoutlmv3" // Please change to "doclayout_yolo" when using doclayout_yolo.
+ "model": "doclayout_yolo" // Please change to "layoutlmv3" when using layoutlmv3.
},
"formula-config": {
"mfd_model": "yolo_v8_mfd",
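
This hunk changes the default layout model from `layoutlmv3` to `doclayout_yolo`. A minimal sketch of switching the value programmatically follows, assuming the configuration has already been loaded into a Python dict. The helper `set_layout_model` is hypothetical; only the key names and the two model names are taken from the diff.

```python
import copy

# The two model names appear in the config comments of this diff.
# The helper itself is a hypothetical illustration, not part of MinerU.
KNOWN_LAYOUT_MODELS = {"doclayout_yolo", "layoutlmv3"}


def set_layout_model(config: dict, model: str) -> dict:
    """Return a copy of config with layout-config.model set to model."""
    if model not in KNOWN_LAYOUT_MODELS:
        raise ValueError(f"unknown layout model: {model!r}")
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    updated.setdefault("layout-config", {})["model"] = model
    return updated


config = {
    "layout-config": {"model": "doclayout_yolo"},
    "formula-config": {"mfd_model": "yolo_v8_mfd"},
}
switched = set_layout_model(config, "layoutlmv3")
```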
@@ -288,20 +295,21 @@ If your device supports CUDA and meets the GPU requirements of the mainline envi
### Using NPU

If your device has NPU acceleration hardware, you can follow the tutorial below to use NPU acceleration:

[Ascend NPU Acceleration](docs/README_Ascend_NPU_Acceleration_zh_CN.md)

## Usage

### Command Line

- [Using MinerU via Command Line](https://mineru.readthedocs.io/en/latest/user_guide/quick_start/command_line.html)
+ [Using MinerU via Command Line](https://mineru.readthedocs.io/en/latest/user_guide/usage/command_line.html)

> [!TIP]
> For more information about the output files, please refer to the [Output File Description](docs/output_file_en_us.md).
### API

- [Using MinerU via Python API](https://mineru.readthedocs.io/en/latest/user_guide/quick_start/to_markdown.html)
+ [Using MinerU via Python API](https://mineru.readthedocs.io/en/latest/user_guide/usage/api.html)


### Deploy Derived Projects
14 changes: 11 additions & 3 deletions README_zh-CN.md
@@ -42,6 +42,13 @@
</div>

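# Changelog
- 2025/01/06 1.0.0 released. This is our first official release, in which extensive refactoring brings a completely new API and broader compatibility:
  - New API Interface
    - For the data-side API, we have introduced the Dataset class, designed to provide a robust and flexible data processing framework. This framework currently supports a variety of document formats, including images (.jpg and .png), PDF, Word (.doc and .docx), and PowerPoint (.ppt and .pptx), so that data processing tasks from simple to complex are effectively supported.
    - For the user-side API, we have designed the MinerU processing workflow as a series of composable Stages. Each Stage represents a specific processing step; users can freely define new Stages according to their needs and combine them to build customized data processing workflows.
  - Broader Compatibility
    - By fine-tuning the dependency environment, we ensure stable and efficient operation on ARM architecture Linux systems.
    - Deeply adapted to Huawei Ascend NPU acceleration, in response to Xinchuang (IT application innovation) requirements, providing autonomous and controllable high-performance computing capabilities and supporting the localization and development of AI application platforms in China.
- 2024/11/22 0.10.0 released, introducing hybrid OCR text extraction capabilities:
  - Significantly improved parsing results in complex text distribution scenarios such as dense formulas, irregular span regions, and text represented by images.
  - Combines the accurate content extraction and faster speed of text mode with the more precise span/line region recognition of OCR mode.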
@@ -253,7 +260,7 @@ pip install -U magic-pdf[full] --extra-index-url https://wheels.myhloli.com -i h
{
// other config
"layout-config": {
- "model": "layoutlmv3" // Change to "doclayout_yolo" when using doclayout_yolo
+ "model": "doclayout_yolo" // Change to "layoutlmv3" when using layoutlmv3
},
"formula-config": {
"mfd_model": "yolo_v8_mfd",
@@ -292,20 +299,21 @@ pip install -U magic-pdf[full] --extra-index-url https://wheels.myhloli.com -i h
### Using NPU

If your device has NPU acceleration hardware, you can follow the tutorial below to use NPU acceleration:

[NPU Acceleration Tutorial](docs/README_Ascend_NPU_Acceleration_zh_CN.md)

## Usage

### Command Line

- [Using MinerU via Command Line](https://mineru.readthedocs.io/zh-cn/latest/user_guide/quick_start/command_line.html)
+ [Using MinerU via Command Line](https://mineru.readthedocs.io/en/latest/user_guide/usage/command_line.html)

> [!TIP]
> For more information about the output files, please refer to the [Output File Description](docs/output_file_zh_cn.md).
### API

- [Calling MinerU from Python Code](https://mineru.readthedocs.io/zh-cn/latest/user_guide/quick_start/to_markdown.html)
+ [Calling MinerU from Python Code](https://mineru.readthedocs.io/en/latest/user_guide/usage/api.html)


### Deploy Derived Projects