Add default document loader and parser for RAG #624
Thanks @AgentGenie! A couple of things:
On the subject of Selenium, it may also be worth looking at the Crawl4AI package to scrape the page when loading a URL. Its description says: "Creates smart, concise Markdown optimized for RAG and fine-tuning applications."
Thanks for updating the parser name. Can you also add an extra to pyproject.toml that lists all the additional packages required?
I fixed the packaging and related stuff, will push my changes soon.
Thanks @AgentGenie and @davorrunje
Gemini test failure unrelated.
Why are these changes needed?
Add document loader and parser (Docling) for RAG.
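For readers unfamiliar with Docling, here is a minimal sketch of what loading and parsing a document for RAG might look like. It is not the code from this PR: the helper names `is_url` and `parse_to_markdown` are hypothetical, and the sketch assumes the `docling` package is installed and that its `DocumentConverter` accepts either a local path or a URL.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_url(source: str) -> bool:
    """Return True when `source` looks like an http(s) URL rather than a local path."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def parse_to_markdown(source: str) -> str:
    """Convert a local file or URL to Markdown text (hypothetical helper, not the PR's API)."""
    # Fail early on a missing local file, before touching the optional dependency.
    if not is_url(source) and not Path(source).exists():
        raise FileNotFoundError(f"No such file: {source}")

    # Imported lazily so this module still imports when docling is not installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)
    # Markdown is a convenient intermediate format to chunk and embed for RAG.
    return result.document.export_to_markdown()
```

Calling `parse_to_markdown("report.pdf")` or passing an `https://` URL would return Markdown that can then be chunked and indexed. The lazy import matches the idea of keeping the parser's dependencies in an optional extra.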
Related issue number
#438
Checks