This repository contains a joint collection of libraries and tools for multimodal content analysis and machine translation from Aalto University, EURECOM, INA and the University of Helsinki. Some of the included tools were initiated before the MeMAD project and developed further during it; others are direct results of the project.
The collection consists of the following submodules:
- PicSOM: https://github.com/aalto-cbir/PicSOM
- DeepCaption: https://github.com/aalto-cbir/DeepCaption
- Visual storytelling: https://github.com/aalto-cbir/visual-storytelling
- Speech recognition training scripts for Finnish: https://github.com/psmit/char-fin-2017
- Speaker-aware ASR training: https://github.com/MeMAD-project/speaker-aware-attention-asr
- SphereDiar: https://github.com/Livefull/SphereDiar
- Multimodal ASR: https://github.com/aalto-speech/avsr
- Spoken language identification: https://github.com/py-lidbox/lidbox
- Audio event classification: https://github.com/MeMAD-project/AudioTagger
- Multimodal image caption translation: https://github.com/MeMAD-project/image-caption-translation
- Statistical tools for caption dataset analysis: https://github.com/MeMAD-project/statistical-tools
- Face recognition: https://github.com/D2KLab/FaceRec
- Media memorability in MediaEval 2019-20: https://github.com/MeMAD-project/media-memorability
- Video content segmentation: https://github.com/MeMAD-project/content-segmentation
- MeMAD metadata converter: https://github.com/MeMAD-project/rdf-converter
- MeMAD Knowledge Graph API: https://github.com/MeMAD-project/api
- MeMAD Explorer: https://github.com/MeMAD-project/explorer
- MeMAD metadata interchange formats: https://github.com/MeMAD-project/interchange-formats
- inaSpeechSegmenter: https://github.com/ina-foss/inaSpeechSegmenter
- inaFaceGender: https://github.com/ina-foss/inaFaceGender
- Subtitle translation: https://github.com/MeMAD-project/subtitle-translation
- Tools for converting and aligning subtitles: https://github.com/MeMAD-project/subalign
- Speech translation: https://github.com/MeMAD-project/speech-translation
- Discourse-aware machine translation: https://github.com/MeMAD-project/doclevel-translation
- Cross-lingual content retrieval: https://github.com/MeMAD-project/cross-lingual-retrieval
- OPUS-MT: MT servers and pre-trained translation models: https://github.com/MeMAD-project/Opus-MT
- OPUS-MT-train: MT training procedures and pipelines: https://github.com/MeMAD-project/OPUS-MT-train
- OPUS-MT-eval: A collection of MT benchmarks: https://github.com/MeMAD-project/OPUS-MT-eval
- The Tatoeba MT Challenge: Multilingual data sets and benchmarks for machine translation: https://github.com/MeMAD-project/Tatoeba-Challenge
- OPUS-CAT: MT plugins for professional translators: https://github.com/MeMAD-project/OPUS-CAT
- OPUS-translator: Web interface for machine translation: https://github.com/MeMAD-project/OPUS-translator
- Document-level machine translation benchmarks:
  - OpenSubtitles2018: a large collection of aligned movie subtitles
  - TED2020: aligned TED Talk subtitles
  - QED: aligned subtitles of educational videos
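
Since the projects above are tracked as git submodules, a plain `git clone` of this repository yields empty submodule directories. A minimal checkout sketch, assuming the listed repositories are registered in this repository's `.gitmodules` (replace `<repository-url>` with the clone URL of this repository):

```sh
# Clone the collection together with every registered submodule
git clone --recurse-submodules <repository-url>

# Or, inside an existing clone, initialize and fetch all submodules
git submodule update --init --recursive

# Later, pull upstream changes for every submodule
git submodule update --remote --merge
```

The `--recurse-submodules` flag fetches everything in one step; the `update` commands are useful for clones that were made without it.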