GitHub - ayushayush591/EIGEN-High-Fidelity-Extraction-Document-Images

Joint Learning Aggregation for High-Fidelity Information Extraction from Document Images

EIGEN: Expert-Informed Joint Learning Aggregation for High-Fidelity Information Extraction from Document Images
Abhishek Singh, Venkatapathy Subramanian, Ayush Maheshwari, Pradeep Narayan, Devi Prasad Shetty and Ganesh Ramakrishnan
Machine Learning for Health (ML4H), 2023

### Instructions for Training

Create a new virtual environment, navigate to this directory, and follow these steps:

1. Clone the main branch of this repository with `git clone`.
2. Download the CORDS receipt dataset into the current directory from https://drive.google.com/drive/folders/1mKrsYBW7xXzfxNLSYwQ02bHayqVfe-94?usp=sharing.
3. Run `pip install -r requirements.txt` to install all the dependencies.
4. Run `git clone https://github.com/iitb-research-code/spear4HighFidelity.git` to get all the files required to run SPEAR and CAGE.
5. (Optional) Change the labeling functions to suit your needs, for example by adding or removing a labeling function, and make the corresponding changes elsewhere in the code.
6. Run the labeling-function script: `python main.py`.
7. The pickle files required for training and the trained model files are stored in the `Paths` folder.
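The optional step about adding or removing labeling functions can be illustrated with a small, library-free sketch. This is not the SPEAR API and not the code in `main.py`; the label names (`TOTAL`, `DATE`), the function names, and the token-level interface are all assumptions made for illustration. The idea is only that each labeling function either votes for a label or abstains, and that changing the set of functions changes the columns of the vote matrix passed on for aggregation.

```python
import re
from typing import Callable, List, Optional

# Hypothetical labels for illustration; the repository's actual label set may differ.
ABSTAIN = None
TOTAL = "TOTAL"
DATE = "DATE"


def lf_total_keyword(token: str) -> Optional[str]:
    """Vote TOTAL when the token contains the word 'total', otherwise abstain."""
    return TOTAL if "total" in token.lower() else ABSTAIN


def lf_date_pattern(token: str) -> Optional[str]:
    """Vote DATE when the token looks like dd/mm/yyyy or dd-mm-yyyy, otherwise abstain."""
    return DATE if re.fullmatch(r"\d{1,2}[/-]\d{1,2}[/-]\d{2,4}", token) else ABSTAIN


# Adding or removing a labeling function amounts to editing this list.
LABELING_FUNCTIONS: List[Callable[[str], Optional[str]]] = [
    lf_total_keyword,
    lf_date_pattern,
]


def apply_lfs(tokens: List[str]) -> List[List[Optional[str]]]:
    """Return one row per token and one column per labeling function."""
    return [[lf(token) for lf in LABELING_FUNCTIONS] for token in tokens]


votes = apply_lfs(["Total:", "12/05/2023", "Coffee"])
```

In this sketch, a token that no function recognizes yields a row of abstentions, which is the case an aggregation model such as CAGE has to handle alongside agreeing and conflicting votes.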

### File Information

- `Cage_cords.ipynb` contains the code for running the CAGE model on the CORDS dataset.
- `NH_cage.ipynb` contains the code for running the CAGE model on the NH dataset.
- The `Paths` directory contains all the pickle files needed for training.
- `cords_demo.ipynb` contains the code for running inference on CORDS data with the stored model.
- `nh_demo.ipynb` contains the code for running inference on NH data with the stored model.
- `train.py` contains the code for jointly training the feature model and the CAGE model.
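Before training or running the demo notebooks, it can help to check what the pickle files in `Paths` contain. The following is a minimal sketch using only the standard library. The `.pkl` extension and the helper names `load_pickles` and `summarize` are assumptions, not part of the repository. Only unpickle files from a source you trust, since `pickle.load` can execute arbitrary code.

```python
import pickle
from pathlib import Path
from typing import Any, Dict


def load_pickles(directory: str) -> Dict[str, Any]:
    """Load every *.pkl file in `directory`, keyed by file name.

    The '.pkl' extension is an assumption; adjust the pattern if the
    files in the Paths folder use a different one.
    """
    loaded: Dict[str, Any] = {}
    for path in sorted(Path(directory).glob("*.pkl")):
        with path.open("rb") as handle:
            loaded[path.name] = pickle.load(handle)
    return loaded


def summarize(objects: Dict[str, Any]) -> Dict[str, str]:
    """Map each file name to the type name of the object stored in it."""
    return {name: type(obj).__name__ for name, obj in objects.items()}


if __name__ == "__main__":
    # 'Paths' is the folder named in this README; skip quietly if it is absent.
    if Path("Paths").is_dir():
        for name, type_name in summarize(load_pickles("Paths")).items():
            print(f"{name}: {type_name}")
```

Printing only the type of each object keeps the output short; the objects themselves can then be inspected interactively in a notebook.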
