From failing to study medicine ➡️ BSc industrial engineer ➡️ MSc computer scientist.
Life can be strange, so better enjoy it.
I'm sure I do by: 👨🏽‍🍳 Cooking, 👨🏽‍💻 Coding, Committing.
- No data? No problem! - synthetic data to the rescue
- Practical AI Podcast - Towards high-quality (maybe synthetic) datasets
- Code Together Podcast | Intel Software - Scaling LLM Datasets with Less Effort Using Argilla
- Mastering LLMs - Creating, curating, and cleaning data for LLMs
- 🧼 From GPU-poor to data-rich - data quality practices for LLM fine-tuning
- Deeplearning.ai LLM workshop - get started with Argilla for human- and distilabel for AI feedback
- NLP Healthcare Summit 2023 - Smart Shortcuts for Bootstrapping a Healthcare NER Project
- Anyscale Ray Europe Meetup - Smart shortcuts for Bootstrapping a Text Classification project
- Hugging Face 🤗 (2024-current) - The AI community building the future
- Argilla (2022-current) - data annotation and monitoring for enterprise NLP
- Pandora Intelligence (2020-2022) - an independent intelligence company, specialized in security risks
- observers - A Lightweight Library for AI Observability
- dataset-viber - Data viber is your chill repo for data collection and vibe checks
- concise-concepts - a word similarity approach to few-shot NER
- fast-sentence-transformers - simply, faster, sentence-transformers
- classy-classification - a quick and dirty few-shot text classification solution
- crosslingual-coreference - a multi-lingual CoRef resolver using cross-lingual training
- adept-augmentations - a Python library aimed at dissecting and augmenting NER training data
- spacy-setfit - a Python library aimed to facilitate easy SetFit usage in spaCy
- Haystack - small feature and CI/CD updates
- InMemoryDatabase - Serialization + to and from disk methods
- GitHub Actions - caching for pip environment
- spaCy - several additions to the spacy-universe
- spanmarker - added .pipe() method to spaCy integration
- spacy-dbpedia-spotlight - added a batch processing functionality
- spacy-fishing - added a batch processing functionality + bug fixes
- spacy-opentapioca - added a batch processing functionality
- streamlit-url-fragment - resolved Python versioning issues
- allennlp-models - added a batch processing functionality
- mutate - resolved Python versioning issues and added PyPI support
- rebel - added a batch processing functionality
- trl - updated RLHF documentation for PPOTrainer
- Bonfari - small to medium sustainable scale projects in Gambia 🇬🇲
- 510 red-cross - occasional projects to improve humanitarian aid with data