`PyPDFToDocument`: make conversion customization easier for users #8553
Is your feature request related to a problem? Please describe.
This stemmed from deepset-ai/haystack-tutorials#362.
Currently, to customize the PDF conversion process, the user has to provide a custom `Converter` (adhering to the `PyPDFConverter` protocol). While this allows great flexibility, it requires considerable effort for users who wish to customize only one extraction parameter (for example, `extraction_mode`). See PyPDF extraction parameters.
Describe the solution you'd like
It would be nice to provide an easier way to do simple customizations.
- Add `extraction_kwargs` in `__init__` and also include them in the `PyPDFConverter` protocol.
- Provide a `CustomConverter` implementation (adhering to the `PyPDFConverter` protocol) and make it possible for users to use it in a simple way. Something like:
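A rough sketch of how the two ideas could fit together, not a proposal for the final API. Everything named here is hypothetical: `SketchPyPDFToDocument` and `SketchConverter` stand in for the component and a ready-made converter, and `FakePage` stands in for a pypdf page so the example runs without a PDF file. The only behavior assumed from pypdf is that `extract_text` accepts an `extraction_mode` keyword.

```python
from typing import Any, Dict, List, Optional


class FakePage:
    """Hypothetical stand-in for a pypdf page (no real PDF needed)."""

    def __init__(self, plain: str, layout: str) -> None:
        self._plain = plain
        self._layout = layout

    def extract_text(self, extraction_mode: str = "plain", **kwargs: Any) -> str:
        # Mimics only the `extraction_mode` keyword of pypdf's extract_text.
        return self._layout if extraction_mode == "layout" else self._plain


class SketchConverter:
    """Hypothetical ready-made converter configured through keyword arguments."""

    def __init__(self, **extraction_kwargs: Any) -> None:
        self.extraction_kwargs = extraction_kwargs

    def convert(self, pages: List[FakePage]) -> str:
        # Forward the user's kwargs unchanged to every page, then join the
        # page texts with a form feed character.
        return "\f".join(p.extract_text(**self.extraction_kwargs) for p in pages)


class SketchPyPDFToDocument:
    """Hypothetical component: accepts a converter or plain extraction kwargs."""

    def __init__(
        self,
        converter: Optional[SketchConverter] = None,
        extraction_kwargs: Optional[Dict[str, Any]] = None,
    ) -> None:
        # An explicit converter wins; otherwise build one from the kwargs.
        self.converter = converter or SketchConverter(**(extraction_kwargs or {}))

    def run(self, pages: List[FakePage]) -> str:
        return self.converter.convert(pages)


pages = [FakePage(plain="a b", layout="a   b"), FakePage(plain="c", layout="  c")]

# Default behavior: no customization needed.
default_text = SketchPyPDFToDocument().run(pages)

# First idea: tweak a single parameter through extraction_kwargs.
layout_text = SketchPyPDFToDocument(
    extraction_kwargs={"extraction_mode": "layout"}
).run(pages)

# Second idea: pass a ready-made, configurable converter.
same_text = SketchPyPDFToDocument(
    converter=SketchConverter(extraction_mode="layout")
).run(pages)
```

With either route, changing one extraction parameter takes a single argument instead of a full custom converter class.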
We would like to get @shadeMe's opinion on this...