This demo illustrates how to build a fully automated specialized document processing pipeline with Document AI.
For an example use case, the application is equipped to process an individual US Tax Return using the Lending Document AI Processors.
NOTE: Most of the processors in this demo require allowlisting before you can use them.
Read More about Lending DocAI
- Document Classification
- Document Parsing
- Entity Extraction
- Data Storage
- Data Processing
- Lending Document AI Processors
  - LDAI Splitter & Classifier
  - W-2 Parser
  - 1099 Parser(s)
- Firestore (Native Mode)
- Cloud Run
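As a rough illustration of how the classification, parsing, and entity extraction steps could be driven from Python, here is a minimal sketch that sends one file to a Document AI processor and flattens the returned entities. It assumes the `google-cloud-documentai` client library is installed; the helper names (`api_endpoint`, `entities_to_rows`, `process_document`) are hypothetical and are not taken from this demo's source code.

```python
from typing import Any, Dict, Iterable, List


def api_endpoint(location: str) -> str:
    """Regional Document AI endpoint for a processor location ("us" or "eu")."""
    return f"{location}-documentai.googleapis.com"


def entities_to_rows(entities: Iterable[Any]) -> List[Dict[str, Any]]:
    """Flatten Document AI entities into plain dicts that are easy to store."""
    return [
        {
            "type": entity.type_,
            "value": entity.mention_text,
            "confidence": float(entity.confidence),
        }
        for entity in entities
    ]


def process_document(
    project_id: str,
    location: str,
    processor_id: str,
    content: bytes,
    mime_type: str = "application/pdf",
) -> List[Dict[str, Any]]:
    """Send one file to a processor and return its extracted entities."""
    # Imported lazily so the helpers above work without the client library.
    from google.api_core.client_options import ClientOptions
    from google.cloud import documentai

    client = documentai.DocumentProcessorServiceClient(
        client_options=ClientOptions(api_endpoint=api_endpoint(location))
    )
    request = documentai.ProcessRequest(
        name=client.processor_path(project_id, location, processor_id),
        raw_document=documentai.RawDocument(content=content, mime_type=mime_type),
    )
    result = client.process_document(request=request)
    return entities_to_rows(result.document.entities)
```

The same call shape should work for a splitter/classifier and for a parser; only the processor ID changes. The flattened rows are plain dictionaries, so they could be written to a Firestore document without further conversion.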
- Install Python
- Install the Google Cloud SDK
- Install the prerequisites:
pip install -r requirements.txt
- Run gcloud init, create a new project, and enable billing
- Enable the Document AI API:
gcloud services enable documentai.googleapis.com
- Set up application default authentication:
gcloud auth application-default login
- Create a Firestore database in Native Mode:
gcloud firestore databases create
- Create a config.yaml with the following format:
docai_processor_location: us # Document AI Processor Location (us OR eu)
docai_project_id: YOUR_PROJECT_ID # Project ID for Document AI Processors
firestore:
  collection: tax_documents # Set with your preferred Firestore Collection Name
  project_id: YOUR_PROJECT_ID # Project ID for Firestore Database
docai_active_processors:
- Run the setup scripts to create the processors and the Cloud Run app in your project:
python3 setup.py
gcloud run deploy tax-demo --source .
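Before running the setup scripts, it can help to check that config.yaml has the keys described in the setup steps above. The following is a minimal sketch of such a check, not the demo's own loader: the function names are hypothetical, and it assumes PyYAML is available (it is imported only inside `load_config`).

```python
from typing import Any, Dict, List

REQUIRED_KEYS = ("docai_processor_location", "docai_project_id", "firestore")
REQUIRED_FIRESTORE_KEYS = ("collection", "project_id")
PLACEHOLDER = "YOUR_PROJECT_ID"


def config_problems(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; empty means the config looks usable."""
    problems = []
    for key in REQUIRED_KEYS:
        if not config.get(key):
            problems.append(f"missing key: {key}")

    location = config.get("docai_processor_location")
    if location and location not in ("us", "eu"):
        problems.append("docai_processor_location must be 'us' or 'eu'")

    if config.get("docai_project_id") == PLACEHOLDER:
        problems.append("docai_project_id still holds the placeholder value")

    firestore = config.get("firestore")
    if isinstance(firestore, dict):
        for key in REQUIRED_FIRESTORE_KEYS:
            if not firestore.get(key):
                problems.append(f"missing key: firestore.{key}")
        if firestore.get("project_id") == PLACEHOLDER:
            problems.append("firestore.project_id still holds the placeholder value")
    return problems


def load_config(path: str = "config.yaml") -> Dict[str, Any]:
    """Read config.yaml and fail early if required settings are absent."""
    import yaml  # PyYAML; assumed to be installed alongside the demo's requirements

    with open(path, encoding="utf-8") as handle:
        config = yaml.safe_load(handle) or {}
    problems = config_problems(config)
    if problems:
        raise ValueError("; ".join(problems))
    return config
```

Calling `load_config()` from the project root would raise a `ValueError` listing every problem at once, which is usually easier to act on than a failure halfway through deployment.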
- Visit the deployed web page
- Upload sample documents
- Currently supports the following document types (2020 editions):
  - W-2
  - 1099-DIV
  - 1099-INT
  - 1099-MISC
  - 1099-NEC
- Click the "Upload" button and wait for processing to complete
- Click "View Saved Data" to see the tax calculation output
- This output is designed to match up with the 2020 1040 US Tax Return Form
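To make the idea of matching extracted values to Form 1040 lines concrete, here is a small sketch that totals a few amounts by 1040 line number. It is not the demo's calculation: the field names in `FORM_1040_LINES` are hypothetical stand-ins for whatever entity types the parsers actually return, and only three straightforward mappings are shown (W-2 box 1 to line 1, 1099-INT box 1 to line 2b, 1099-DIV box 1a to line 3b).

```python
from decimal import Decimal
from typing import Dict, Iterable, Tuple

# Hypothetical (document type, field name) pairs mapped to 2020 Form 1040 lines.
FORM_1040_LINES = {
    ("W-2", "box_1_wages"): "1",
    ("1099-INT", "box_1_interest_income"): "2b",
    ("1099-DIV", "box_1a_ordinary_dividends"): "3b",
}


def parse_amount(text: str) -> Decimal:
    """Convert strings such as "$1,234.56" to Decimal; empty text counts as zero."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return Decimal(cleaned or "0")


def summarize(fields: Iterable[Tuple[str, str, str]]) -> Dict[str, Decimal]:
    """Total extracted amounts per Form 1040 line, skipping unmapped fields."""
    totals: Dict[str, Decimal] = {}
    for doc_type, field_name, raw_value in fields:
        line = FORM_1040_LINES.get((doc_type, field_name))
        if line is None:
            continue  # field has no mapping in this sketch
        totals[line] = totals.get(line, Decimal("0")) + parse_amount(raw_value)
    return totals
```

`Decimal` is used instead of `float` so that cent amounts add up exactly. Income from 1099-MISC and 1099-NEC is left out on purpose, because it does not flow directly onto a single 1040 line.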
WARNING: This is NOT financial advice. It is for educational purposes only!
Copyright 2022 Google LLC
Author: Holt Skinner