kermitt2

Organizations

@istex @termith-anr @anHALytics @science-miner @howisonlab @DataSeer


Analyses of software mentions and dependencies

Go 5 Updated Jan 8, 2025

Tympi News web app

JavaScript 2 Updated Nov 6, 2024

Source of the article "Mining experimental data from Materials Science literature with Large Language Models: an evaluation study"

TeX 5 Updated Aug 15, 2024

🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

Jupyter Notebook 8,300 881 Updated Jul 26, 2024

Indexes metadata from Crossref into Elasticsearch. Primarily to be used with Biblio-Glutton

Go 3 Updated May 30, 2023

PhD Dissertation "Automated Extraction and Curation of Materials Information from Scientific Literature"

TeX 8 Updated Feb 20, 2024

Open-source IDE for exploring and testing APIs (lightweight alternative to Postman/Insomnia)

JavaScript 29,581 1,387 Updated Jan 11, 2025

Convert PDF to markdown + JSON quickly with high accuracy

Python 19,170 1,137 Updated Jan 10, 2025

Viewer for the structure extracted by Grobid on PDF documents

Python 43 8 Updated Jan 11, 2025

Streamlit PDF viewer

Python 117 9 Updated Jan 11, 2025

library supporting NLP and CV research on scientific papers

Python 724 57 Updated Nov 8, 2024

Scientific Document Insight Q/A

Python 26 5 Updated Nov 21, 2024

A collection of datasets for language model pretraining, including scripts for downloading, preprocessing, and sampling.

Python 56 5 Updated Jul 29, 2024

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Python 5,955 480 Updated Jul 11, 2024

Implementation of Nougat Neural Optical Understanding for Academic Documents

Python 9,140 585 Updated Apr 16, 2024

Python tools for processing the stackexchange data dumps into a text dataset for Language Models

Python 80 15 Updated Dec 6, 2023

Active learning for systematic reviews

Python 662 123 Updated Jan 11, 2025

One downloader for many scientific data and code repositories! DOI 👐 Data

Python 66 10 Updated Jan 6, 2025

The official code for "PMC-LLaMA: Towards Building Open-source Language Models for Medicine"

Python 621 54 Updated Jul 8, 2024

A fast DVI, EPS, and PDF to SVG converter

C++ 316 34 Updated Jan 11, 2025

Slides and resources from my CSV Conf 2023 keynote

17 Updated May 1, 2023

Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper)

Python 66 1 Updated Nov 11, 2022

This repository collects 100 papers related to negative sampling methods.

190 19 Updated Jun 25, 2023

A JavaScript library for text annotation

JavaScript 373 42 Updated Mar 28, 2024

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports comp…

Python 8,188 1,199 Updated Jan 12, 2025

Get answers to research questions from 200M+ papers.

Jupyter Notebook 204 22 Updated Dec 28, 2023

Easily compute clip embeddings and build a clip retrieval system with them

Jupyter Notebook 2,469 216 Updated Apr 15, 2024

Unstructured data extract platform based on LlamaIndex, Pgvector, React and Django.

Python 751 67 Updated Jan 8, 2025