If you use these data, please cite this dataset using the DOI of the particular released version you used.
This dataset is licensed under a CC-BY-4.0 license.
The goal of this work is to create a comparative dictionary […] as the primary source of historical data on the entire Austronesian language family.
- ABOUT.md provides the introduction to the legacy online version of the ACD at https://trussel2.com/ACD/, formatted as a Markdown document.
- USAGE.md describes tools and techniques which help with using the highly relational data of the ACD.
- VALIDATION.md describes how our dataset was validated, in particular with respect to completeness and correctness of the conversion from the legacy HTML pages.
- Varieties: 1,063 (linked to 820 different Glottocodes)
- Concepts: 86,502 (linked to 0 different Concepticon concept sets)
- Lexemes: 146,733
- Sources: 454
- Synonymy: 1.01
- Cognacy: 121,682 cognates in 10,857 cognate sets (5 singletons)
- Cognate Diversity: -1.26
- Entries missing sources: 38249/146733 (26.07%)
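
The share of entries missing sources follows directly from the two counts in the last item. As a quick sanity check, the minimal Python sketch below recomputes it; the variable and function names are illustrative assumptions and not part of the dataset's tooling.

```python
# Counts copied from the statistics above; names are illustrative only.
lexemes = 146_733
missing_sources = 38_249


def percentage(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


missing_pct = percentage(missing_sources, lexemes)
print(f"{missing_pct}%")  # 26.07%
```
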
Name | GitHub user | Role |
---|---|---|
Robert Blust | | author |
Stephen Trussel | | author |
Robert Forkel | @xrotwang | DataCurator, author |
Alexander D. Smith | @The-Austronesianist | author |
The following CLDF datasets are available in cldf:
- CLDF Wordlist at cldf/cldf-metadata.json
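
To get a first overview of the Wordlist, the metadata file can be inspected with the Python standard library alone. The sketch below is a minimal example, not the maintainers' tooling: the function names `load_metadata` and `list_tables` are hypothetical, and it assumes the script is run from the repository root and that the metadata follows the usual CLDF layout (a `tables` list whose entries carry `url` and, for standard components, `dc:conformsTo`).

```python
import json
from pathlib import Path


def load_metadata(path="cldf/cldf-metadata.json"):
    """Read the CLDF metadata file (path assumed relative to the repo root)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def list_tables(metadata):
    """Return (component, url) pairs for each table described in the metadata.

    The component name is taken from the fragment of `dc:conformsTo`
    (e.g. "FormTable"); tables without that property get None.
    """
    tables = []
    for table in metadata.get("tables", []):
        conforms = table.get("dc:conformsTo") or ""
        component = conforms.rsplit("#", 1)[-1] if "#" in conforms else None
        tables.append((component, table.get("url")))
    return tables


if __name__ == "__main__":
    metadata_path = Path("cldf/cldf-metadata.json")
    # Only print when the file is actually present in the working directory.
    if metadata_path.exists():
        for component, url in list_tables(load_metadata(metadata_path)):
            print(component, url)
```

For anything beyond a quick look, such as following foreign keys between tables, a dedicated CLDF library like `pycldf` is likely a better fit than hand-rolled JSON and CSV handling.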