From 341b1297355af23133908f19854fe853fb5f9ac7 Mon Sep 17 00:00:00 2001
From: Adrien Barbaresi
Date: Mon, 28 Oct 2024 14:48:16 +0100
Subject: [PATCH] update readme: add Trafilatura

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index bb746e3..2671439 100644
--- a/README.md
+++ b/README.md
@@ -42,6 +42,7 @@ This is a curated list of tools, resources, and services supporting the Digital
 - [OpenArchive](https://open-archive.org/) - Making it easy to store, share, and amplify your mobile media while protecting your identity.
 - [Open EU Data Portal](https://data.europa.eu/euodp/en/data/) - European Union open data.
 - [Social Feed Manager](https://gwu-libraries.github.io/sfm-ui/) - Open source software that harvests social media data and web resources from Twitter, Tumblr, Flickr, and Sina Weibo.
+- [Trafilatura](https://trafilatura.readthedocs.io/) - Open source software to gather text and metadata on the Web: Crawling, scraping, extraction, output in multiple formats. Usable with Python, R and on the command-line.
 - [Transkribus](https://transkribus.eu/) - Transcribe. Collaborate. Share and benefit from cutting edge research in Handwritten Text Recognition!
 - [Textgrid](https://textgrid.de/) - Open source tools and services support humanistic scholars during the entire process of research, especially in digital scholarly editing.
 - [webrecorder.io](https://webrecorder.io/) - Web archiving service anyone can use for free to save web pages.