This project demonstrates how to perform common actions on Document AI Warehouse through the API. dw_processing.ipynb uses DocumentWarehouseUtils.py for abstraction and readability.
We recommend reading the code in the utils Python file alongside the notebook. The notebook covers the following steps:
1. Create a document schema and a folder schema.
2. Create a folder using the folder schema created in step 1.
3. Create a document using the document schema created in step 1, from an inline raw document, and set property values.
4. Create a document using the document schema created in step 1, from a document stored in GCS, and embed the Document AI processor output along with it.
5. Link the document created in step 4 to the folder.
6. Search for the document.
7. Clean up.
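
The steps above can be sketched as plain request bodies, without calling the service. This is a minimal illustration, not the code in DocumentWarehouseUtils.py: the helper names (`location_path`, `schema_body`, `document_body`, `link_body`, `search_body`, `with_caller`) and all sample values are hypothetical, and the JSON field names reflect our reading of the Document AI Warehouse REST API, so check them against the API reference before relying on them.

```python
from __future__ import annotations

import base64
import json


def location_path(project_number: str, location: str) -> str:
    """Parent resource under which schemas and documents are created."""
    return f"projects/{project_number}/locations/{location}"


def schema_body(display_name: str, is_folder: bool, text_properties: list[str]) -> dict:
    """Step 1: body of a document or folder schema with searchable text properties."""
    return {
        "displayName": display_name,
        "documentIsFolder": is_folder,
        "propertyDefinitions": [
            {"name": name, "displayName": name, "isSearchable": True, "textTypeOptions": {}}
            for name in text_properties
        ],
    }


def document_body(display_name, schema_name, properties=None, raw_bytes=None,
                  gcs_uri=None, docai_document=None,
                  file_type="RAW_DOCUMENT_FILE_TYPE_PDF") -> dict:
    """Steps 2-4: body of a folder or document that uses an existing schema."""
    body = {
        "displayName": display_name,
        "documentSchemaName": schema_name,
        "properties": [
            {"name": key, "textValues": {"values": [value]}}
            for key, value in (properties or {}).items()
        ],
    }
    if raw_bytes is not None:  # step 3: inline raw document (base64 in JSON)
        body["inlineRawDocument"] = base64.b64encode(raw_bytes).decode("ascii")
    if gcs_uri is not None:  # step 4: raw document stored in Cloud Storage
        body["rawDocumentPath"] = gcs_uri
    if raw_bytes is not None or gcs_uri is not None:
        body["rawDocumentFileType"] = file_type
    if docai_document is not None:  # step 4: embedded Document AI processor output
        body["cloudAiDocument"] = docai_document
    return body


def link_body(folder_name: str, document_name: str) -> dict:
    """Step 5: link from the folder (source) to the document (target)."""
    return {
        "documentLink": {
            "sourceDocumentReference": {"documentName": folder_name},
            "targetDocumentReference": {"documentName": document_name},
        }
    }


def search_body(query: str) -> dict:
    """Step 6: free-text document search."""
    return {"documentQuery": {"query": query}}


def with_caller(payload: dict, user_id: str) -> dict:
    """Attach request metadata identifying the calling user."""
    return {**payload, "requestMetadata": {"userInfo": {"id": user_id}}}


if __name__ == "__main__":
    parent = location_path("123456789", "us")  # hypothetical project number
    schema = schema_body("Invoice", is_folder=False, text_properties=["supplier_name"])
    document = document_body(
        "invoice-001",
        f"{parent}/documentSchemas/SCHEMA_ID",  # placeholder schema ID
        properties={"supplier_name": "Acme"},
        raw_bytes=b"%PDF-1.4 ...",
    )
    request = with_caller({"document": document}, "user:someone@example.com")
    print(json.dumps(schema, indent=2))
    print(json.dumps(request, indent=2))
```

Keeping the payloads as plain dictionaries makes the order of operations easy to inspect; the notebook itself goes through the utils module rather than building requests by hand.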
- Please ensure that you have a Document AI Warehouse instance in your project. You can follow this quickstart to complete the setup.
- Create a Document AI Invoice processor and update the DOCAI_PROCESSOR_ID variable below.
- If you are using a Vertex AI Workbench Managed Notebook, make sure to grant the following roles:
If you are using your own dev environment, make sure to grant the specified permissions to the identity.
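
As an illustration of the processor setup above, the sketch below builds the full Document AI processor resource name from the processor ID and fails early if the placeholder was never replaced. Only DOCAI_PROCESSOR_ID comes from this project; DOCAI_PROJECT_ID, DOCAI_LOCATION, the placeholder value, and the `processor_name` helper are assumptions made for the example.

```python
# Hypothetical configuration values; replace them with your own.
DOCAI_PROJECT_ID = "my-project"
DOCAI_LOCATION = "us"
DOCAI_PROCESSOR_ID = "YOUR_PROCESSOR_ID"  # placeholder, must be replaced


def processor_name(project_id: str, location: str, processor_id: str) -> str:
    """Return the processor resource name, rejecting an unset placeholder."""
    if not processor_id or processor_id.startswith("YOUR_"):
        raise ValueError("Set DOCAI_PROCESSOR_ID to the ID of your Invoice processor.")
    return f"projects/{project_id}/locations/{location}/processors/{processor_id}"


if __name__ == "__main__":
    try:
        print(processor_name(DOCAI_PROJECT_ID, DOCAI_LOCATION, DOCAI_PROCESSOR_ID))
    except ValueError as error:
        print(f"Configuration problem: {error}")
```

Failing early on the placeholder is a design choice for this sketch; it surfaces a missing setting before any request is sent.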
- Install the dependencies listed in requirements.txt:
  pip install -r requirements.txt
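
To confirm that the install step worked, a small check along these lines may help. It is a simplified sketch: `requirement_names` and `missing_packages` are hypothetical helpers, the parsing handles only plain `name` plus version-specifier lines (not URLs or environment markers), and it assumes requirements.txt is in the current directory.

```python
import re
from importlib import metadata
from pathlib import Path


def requirement_names(requirements_text: str) -> list:
    """Extract package names from simple requirements lines."""
    names = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names: list) -> list:
    """Return the names that have no installed distribution."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    path = Path("requirements.txt")
    if path.exists():
        missing = missing_packages(requirement_names(path.read_text()))
        print("Missing packages:", ", ".join(missing) if missing else "none")
    else:
        print("requirements.txt not found in the current directory.")
```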
- Open dw_processing.ipynb and follow the step-by-step guide.