GitHub - donnekgit/autoglosser: Bangor Autoglosser

Bangor autoglosser

The code here was produced to POS-tag the conversational corpora assembled by the ESRC Centre for Research on Bilingualism in Theory & Practice at University of Wales Bangor.

The data is bilingual conversational running text, which the autoglosser tags in a single pass using constraint-grammar rules for each language.
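To make the idea of rule-based tagging concrete, here is a minimal sketch of constraint-grammar-style disambiguation over code-switched text. It is not the autoglosser's actual code (the repository's scripts are PHP). The lexicons, tag names, rules, and function names below are all hypothetical, and the example utterance is contrived. Each word first receives every candidate reading from the lexicon for its language, and contextual rules then remove the readings that do not fit.

```python
# Toy illustration only: every lexicon entry, tag, and rule here is hypothetical.

# Hypothetical per-language lexicons: word -> candidate readings (POS tags).
LEXICONS = {
    "cym": {"mae": ["V"], "y": ["DET", "PRT"], "ci": ["N"]},
    "eng": {"the": ["DET"], "dog": ["N", "V"], "barks": ["V", "N"]},
}


def build_cohorts(tokens):
    """Attach all candidate readings to each (word, language) token."""
    return [
        {"word": w, "lang": lang, "readings": list(LEXICONS[lang].get(w, ["UNK"]))}
        for w, lang in tokens
    ]


def remove_reading(cohort, tag):
    """Drop a reading, but never the last remaining one."""
    if tag in cohort["readings"] and len(cohort["readings"]) > 1:
        cohort["readings"].remove(tag)


def apply_rules(cohorts):
    """Run the contextual rules once, left to right."""
    for i, cohort in enumerate(cohorts):
        prev = cohorts[i - 1] if i > 0 else None
        nxt = cohorts[i + 1] if i + 1 < len(cohorts) else None
        if cohort["lang"] == "cym":
            # Hypothetical Welsh rule: "y" is not a particle before a possible noun.
            if nxt is not None and "N" in nxt["readings"]:
                remove_reading(cohort, "PRT")
        elif cohort["lang"] == "eng":
            # Hypothetical English rules: no verb right after an unambiguous
            # determiner, and no noun right after an unambiguous noun.
            if prev is not None and prev["readings"] == ["DET"]:
                remove_reading(cohort, "V")
            if prev is not None and prev["readings"] == ["N"]:
                remove_reading(cohort, "N")
    return cohorts


def tag(tokens):
    """Return (word, language, remaining readings) for each token."""
    return [
        (c["word"], c["lang"], c["readings"])
        for c in apply_rules(build_cohorts(tokens))
    ]


if __name__ == "__main__":
    utterance = [("mae", "cym"), ("y", "cym"), ("dog", "eng"), ("barks", "eng")]
    for word, lang, readings in tag(utterance):
        print(f"{word}\t{lang}\t{'/'.join(readings)}")
```

Because each rule only inspects its neighbours' current readings, a single left-to-right pass lets an earlier disambiguation (here, "y" resolving to a determiner) feed the rules applied to later words, even across a language switch.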

Note that this code is not really packaged properly: because a lot of the work was done ad hoc, it's more like a compendium of things that worked for us. (To get a smaller, cleaner implementation, try the Gáidhlig autoglosser.)

This was remedied to some extent in the second version, Autoglosser2, though that was aimed at written Welsh only, rather than the conversational, code-switched, multilingual text in the Bangor corpora.

Folders and files (139 commits):

autoglosser
clauses
cognates
combiwords
dbs
docs
enlist
florian
grammar
gt
histcorpus
includes
insertions
lookups
mc
tex
tiers
unknowns
utils
GNU_Affero_GPL.txt
GNU_GPL.txt
README.md
anonymise_audio.php
anonymise_audio_nowww.php
anonymise_audio_wav.php
append_or.php
apply_cg.php
apply_prepub.php
apply_traced_cg.php
autogloss_only.php
cgimport.php
copy_header.php
corrections.txt
create_cgfinished.php
create_cgutterances.php
create_cgwords.php
create_prepub.php
create_sampleclauses.php
diff_my_files.php
do_everything.php
do_mor.php
fix_um-uh.php
gather_fixes.php
import_and_convert.php
import_only.php
join_tags.php
newlangid.php
osfixes.php
owfixes.php
prepare_file.php
rewrite_utterances.php
tidy_or.php
write_cgautogloss.php
write_cgfinished.php
write_cohorts.php
write_compare_glosses.php
write_dataset.php
write_mysyntax.php
write_tagless.php
writeout_only.php
