Update ATF importer #552

Open
wants to merge 47 commits into
base: master
Changes from 42 commits
Commits
68d54f7
Refactor atf importer (WiP)
khoidt May 16, 2024
e3e058a
Update ebl/atf_importer/application/lemma_lookup.py
khoidt May 16, 2024
ab33279
Update ebl/atf_importer/application/lemma_lookup.py
khoidt May 16, 2024
f6bed68
Update ebl/atf_importer/application/atf_importer_base.py
khoidt May 16, 2024
44f2daf
Update ebl/atf_importer/application/atf_importer_base.py
khoidt May 16, 2024
a3e9e52
Update ebl/atf_importer/domain/atf_preprocessor_cdli.py
khoidt May 16, 2024
43a9cfa
Update ebl/atf_importer/domain/atf_preprocessor_cdli.py
khoidt May 16, 2024
860ad08
Update ebl/atf_importer/domain/atf_preprocessor_cdli.py
khoidt May 16, 2024
171ebac
Fix lark paths
khoidt May 16, 2024
65aa777
Update test
khoidt May 16, 2024
0ffb31e
Refactor & update
khoidt May 17, 2024
72813c8
Clean up
khoidt May 17, 2024
6bdc503
Refactor more
khoidt May 17, 2024
b50e5b9
Update
khoidt May 17, 2024
08f4430
Fix type
khoidt May 21, 2024
ac21caa
Improve
khoidt May 21, 2024
4f53377
Improve
khoidt May 21, 2024
cf568d2
Update & fix preprocessor tests
khoidt May 23, 2024
9c19bc4
Refactor & update
khoidt May 23, 2024
115c5f6
Fix test (use transliteration chars)
khoidt May 23, 2024
0426000
Improve
khoidt May 23, 2024
7464010
Fix glossary data (WiP)
khoidt May 24, 2024
8d03672
Update, improve & refactor to fix test (WiP)
khoidt May 27, 2024
b1c8081
Update, refactor & add logging (WiP)
khoidt May 28, 2024
a7e070a
Update logging & improve
khoidt May 29, 2024
e662c72
Refactor
khoidt May 31, 2024
fb84e2c
Update logging
khoidt May 31, 2024
e39e8b3
Update preprocessor & add importer test (WiP)
khoidt Jun 3, 2024
14be543
Update atf preprocessor (WiP)
khoidt Jun 4, 2024
18a1ef7
Fix
khoidt Jun 4, 2024
3c50de9
Update structure, use only ebl atf parser (WiP)
khoidt Jun 12, 2024
1cea5ff
Refactor, update & fix tests (WiP)
khoidt Jul 4, 2024
172631a
Update (WiP)
khoidt Jul 10, 2024
b2a0405
Refactor & fix tests (WiP)
khoidt Oct 15, 2024
4e4693f
Merge remote-tracking branch 'origin/master' into atf-import-update
khoidt Oct 15, 2024
b4b8159
Update, refactor & fix (WiP)
khoidt Oct 23, 2024
276cc63
Update visitor & transformers (WiP)
khoidt Oct 24, 2024
0a0e481
Update transformers pipeline & tests (WiP)
khoidt Oct 29, 2024
60a809f
Add paths in transformers to trace ancestors & break at, fix tests
khoidt Oct 31, 2024
9f75e7f
Add & update transformers & tests (WiP)
khoidt Nov 5, 2024
1973bef
Update serialization logic & fix tests (WiP)
khoidt Nov 7, 2024
d3ab25b
Merge remote-tracking branch 'origin/master' into atf-import-update
khoidt Nov 7, 2024
7e20341
Add transformers and tests, update parser & fix typing
khoidt Nov 12, 2024
f1e3729
Refactor lark grammar (correct at line and structure in general)
khoidt Nov 15, 2024
9974997
Update & fix tests (WiP)
khoidt Nov 19, 2024
9ad365b
Fix more tests & format (WiP)
khoidt Nov 19, 2024
eedec78
Restructure main lark parser (WiP)
khoidt Nov 26, 2024
4 changes: 2 additions & 2 deletions README.md
@@ -494,8 +494,8 @@ poetry run python -m ebl.atf_importer.application.atf_importer -i "ebl/atf_impor

#### Troubleshooting

-If a fragment cannot be imported check the console output for errors. Also check the specified log folder (`error_lines.txt`,`unparseable_lines_[fragment_file].txt`, `not_imported.txt`) and see which lines could not be parsed.
-If lines are faulty, fix them manually and retry the import process. If tokes are not lemmatized correctly, check the log-file `not_lemmatized.txt`.
+If a fragment cannot be imported check the console output for errors. Also check the specified log folder (`error_lines.txt`,`unparsable_lines_[fragment_file].txt`, `not_imported_files.txt`) and see which lines could not be parsed.
+If lines are faulty, fix them manually and retry the import process. If tokes are not lemmatized correctly, check the log-file `not_lemmatized_tokens.txt`.
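
As a rough illustration of the troubleshooting step described here, the sketch below scans a log folder for the renamed log files and reports which ones contain entries. It is not part of the importer: `summarize_logs` and `LOG_PATTERNS` are hypothetical names, the glob `unparsable_lines_*.txt` stands in for `unparsable_lines_[fragment_file].txt`, and the code assumes the logs are plain UTF-8 text with one entry per line.

```python
from pathlib import Path

# File names as listed in the updated README text; the wildcard is an
# assumption standing in for the `[fragment_file]` placeholder.
LOG_PATTERNS = (
    "error_lines.txt",
    "unparsable_lines_*.txt",
    "not_imported_files.txt",
    "not_lemmatized_tokens.txt",
)


def summarize_logs(log_dir):
    """Return {file name: count of non-blank lines} for logs that have entries."""
    summary = {}
    for pattern in LOG_PATTERNS:
        for path in sorted(Path(log_dir).glob(pattern)):
            # Assumption: one log entry per line; blank lines are ignored.
            with path.open(encoding="utf-8") as handle:
                count = sum(1 for line in handle if line.strip())
            if count:
                summary[path.name] = count
    return summary


if __name__ == "__main__":
    import sys

    # "logs" is a placeholder default, not the importer's actual log folder.
    folder = sys.argv[1] if len(sys.argv) > 1 else "logs"
    for name, count in summarize_logs(folder).items():
        print(f"{name}: {count} line(s) to review")
```

Run against the folder given to the importer as its log directory, this lists only the logs that need attention, so an empty result suggests nothing was logged as failed.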

## Acknowledgements

2 changes: 1 addition & 1 deletion docs/ebl-atf.md
@@ -8,7 +8,7 @@ encoding. The grammar definitions below use [EBNF](https://en.wikipedia.org/wiki
The EBNF grammar below is an idealized representation of the eBL-ATF as it does
not deal with ambiguities and implementation details necessary to create the
domain model in practice. A fully functional grammar is defined in
-[ebl-atf.lark](https://github.com/ElectronicBabylonianLiterature/ebl-api/blob/master/ebl/transliteration/domain/ebl_atf.lark).
+[ebl-atf.lark](https://github.com/ElectronicBabylonianLiterature/ebl-api/blob/master/ebl/transliteration/domain/atf_parsers/lark_parser/ebl_atf.lark).
The file uses the EBNF variant of the [Lark parsing library](https://github.com/lark-parser/lark).
See [Grammar Reference](https://lark-parser.readthedocs.io/en/latest/grammar/)
and [Lark Cheat Sheet](https://lark-parser.readthedocs.io/en/latest/lark_cheatsheet.pdf).