How can I focus only on the main text and ignore document headers, footers, and reference information? #775

Closed
equationdz opened this issue Oct 23, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@equationdz

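After recognition, many documents end with several hundred reference entries. My task is to extract information from the main text only and pass it to an AI model for summarization. How can I ignore the document headers, footers, and reference information? Is there a parameter that controls this?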

@equationdz equationdz added the enhancement New feature or request label Oct 23, 2024
@myhloli myhloli closed this as not planned Jan 5, 2025
@myhloli
Collaborator

myhloli commented Jan 5, 2025

You can try splitting the main text from the references by detecting the References keyword.
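One way to read the suggestion above is as a post-processing step on the extracted text. The following is a minimal sketch under that assumption, not an option of the tool itself. The function name `split_body_and_references` and the keyword list are hypothetical. It only cuts at a line that consists solely of a references heading, so an inline mention of "References" in the body is left alone, and it does nothing about headers and footers.

```python
import re

# Hypothetical heading keywords; extend this tuple for your own corpus.
REFERENCE_HEADINGS = ("References", "Bibliography", "参考文献")


def split_body_and_references(text, headings=REFERENCE_HEADINGS):
    """Return (body, references), cutting at the last heading-only line."""
    keywords = "|".join(re.escape(h) for h in headings)
    # A heading line: optional Markdown '#' markers, optional section
    # number such as "7." or "7.1", the keyword, and an optional colon.
    pattern = re.compile(
        rf"^[ \t]*(?:#+[ \t]*)?(?:\d+(?:\.\d+)*\.?[ \t]+)?(?:{keywords})[ \t]*:?[ \t]*$",
        re.IGNORECASE | re.MULTILINE,
    )
    matches = list(pattern.finditer(text))
    if not matches:
        # No heading found: treat the whole text as body.
        return text, ""
    # Use the last match so an earlier table-of-contents entry is not used.
    cut = matches[-1].start()
    return text[:cut].rstrip(), text[cut:]


if __name__ == "__main__":
    sample = (
        "# Title\n\n"
        "Body mentions References inline.\n\n"
        "## References\n\n"
        "[1] A. Author. Paper.\n"
    )
    body, refs = split_body_and_references(sample)
    print(body)
```

Only `body` would then be passed on to the summarization model. Documents whose reference section has no heading, or uses a heading not in the keyword list, are returned unchanged, so the list needs tuning against real output.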
