Skip to content

apinf/api-harvester

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

83 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

API-harvester

Apinf.io has a catalog that lists APIs. We need content. The harvester collects API metadata from multiple sources.

Product Owner: Jarkko Moilanen (APInf Oy), jarkko(at)apinf.io

Scope

The harvesting happens in 2 separate steps:

  • Source step
  • Transform step

Source step

We collect data in 3 different ways: web scraping, REST api and CSV files. The output is JSON. Source step requirements are in separate file.

Transform step

JSON input is transformed to the appropriate output using the correct datamodel. Transform step requirements are in separate file.

Graphical overview

Graphical overview

Requirements

  • No GUI is needed. We work with command line.
  • NodeJS is the most appropriate: JSON, mature, easy to deploy
  • This project needs to be done modular. This way we can expand easily.
  • We should be able to run this with a cron script or just by calling command line.
  • Needs to be easily deployed with Docker (Compose).

Out of scope

  • Any other site to be scraped (only programmableweb.com is in scope)
  • Conflict resolution
  • Ownership
  • Deleted APIs
  • Scheduling
  • Lifecycle
  • Performance
  • Frontend design
  • Legal

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •