OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones? 🔍

The first attempt to apply large multimodal models in paleography and archaeology

Zijian Chen¹, Tingzhu Chen²*, Wenjun Zhang¹, Guangtao Zhai¹*

¹Institute of Image Communication and Information Processing, Shanghai Jiao Tong University

²School of Humanities, Shanghai Jiao Tong University

*Corresponding authors

Chinese version: Zhihu

Overview of OBI-Bench: the benchmark covers five tasks in the oracle bone inscription (OBI) processing workflow: 1) recognition: locating densely arranged oracle bone characters on original oracle bones or rubbings; 2) rejoining: reassembling fragments into coherent texts; 3) classification: assigning individual characters to categories according to their meanings; 4) retrieval: returning relevant results for a given query OBI image; 5) deciphering: interpreting OBI for historical and cultural investigation.
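To make the five-task layout concrete, here is a minimal Python sketch of how per-task results could be tallied. It is not the repository's evaluation code: the `Task` enum, the `Sample` record, the file paths, and the exact-match scoring rule are all assumptions made for illustration. The benchmark itself may score each task with a different metric, and recognition in particular is a localization task that exact match does not capture.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Task(Enum):
    """The five OBI-Bench tasks named in the overview."""
    RECOGNITION = "recognition"
    REJOINING = "rejoining"
    CLASSIFICATION = "classification"
    RETRIEVAL = "retrieval"
    DECIPHERING = "deciphering"


@dataclass
class Sample:
    """One benchmark item (hypothetical schema, not the released format)."""
    task: Task
    image_path: str  # placeholder path, for illustration only
    question: str
    answer: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.strip().lower().split())


def score_by_task(samples: list[Sample], predictions: list[str]) -> dict[Task, float]:
    """Return exact-match accuracy per task (an assumed, simplified metric)."""
    if len(samples) != len(predictions):
        raise ValueError("samples and predictions must have the same length")
    hits: dict[Task, int] = defaultdict(int)
    totals: dict[Task, int] = defaultdict(int)
    for sample, prediction in zip(samples, predictions):
        totals[sample.task] += 1
        hits[sample.task] += int(normalize(prediction) == normalize(sample.answer))
    return {task: hits[task] / totals[task] for task in totals}


if __name__ == "__main__":
    demo = [
        Sample(Task.CLASSIFICATION, "img/a.png", "Which category?", "B"),
        Sample(Task.CLASSIFICATION, "img/b.png", "Which category?", "C"),
        Sample(Task.REJOINING, "img/c.png", "Can these fragments be rejoined?", "Yes"),
    ]
    for task, acc in score_by_task(demo, ["b", "A", " yes "]).items():
        print(f"{task.value}: {acc:.2f}")
```

Grouping scores by task rather than reporting one pooled number keeps a model's strength on one task from masking weakness on another.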

Release

[2024/12/2] 🔥 GitHub repo for OBI-Bench is online.

General Principles

Focusing on OBI Task-oriented Abilities of LMMs & Covering Multi-stage Font Appearances

Image Sources

We collect 5,523 OBI images from 11 distinct sources. Due to the lack of publicly available OBI recognition datasets on real oracle bones and OBI rejoining datasets, we propose the original oracle bone recognition (O2BR) dataset and OBI-rejoin dataset.

Benchmark Candidates

We select 23 up-to-date, widely used LMMs for evaluation, including 6 proprietary LMMs and 17 open-source LMMs.

Performance Benchmark on Five OBI Tasks

Results on the recognition tasks

Results on the rejoining tasks

Results on the classification tasks

Effects of the number of character categories on classification accuracy:

Results on the retrieval tasks

Results on the deciphering tasks

Comparison between GPT-4o and Qwen-VL-Max:

More deciphering results

Original Oracle Bone Recognition (O2BR) Dataset 📦

To be released

OBI-rejoin Dataset 📦

To be released

Contact 📧

Please contact the first author of this paper for queries.

Zijian Chen, [email protected]

Citation📎

If you find our work interesting, please feel free to cite our paper:

@article{chen2024obi,
  title={OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones?},
  author={Chen, Zijian and Chen, Tingzhu and Zhang, Wenjun and Zhai, Guangtao},
  journal={arXiv preprint arXiv:2412.01175},
  year={2024}
}

Acknowledgements💡

We extend our deepest gratitude to the frontline OBI researchers and scholars involved in the meticulous collation and proofreading of the oracle bone inscriptions. It is your persistent manual efforts that have provided a valuable data foundation for the development of artificial intelligence models.
