Hassle-free web scraping service.
- Supports client-side-rendered web pages
- Automatically extracts metadata and article content
- Extracts DOM elements via CSS selectors
- Domain blocking (when the BLOCKLIST_URL environment variable is provided)
- HTTP proxy (when the HTTP_PROXY environment variable is provided)
- Bundled with a blocklist of over 57,000 adware and malware domains
- Built-in user-agent pool
- Built-in rotating proxies
- Node.js >= 14
- Environment variables specified in .env.example
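
As a rough illustration of how the two optional environment variables mentioned above might be consumed, here is a minimal TypeScript sketch. It is not the service's actual implementation: `ScraperConfig`, `readConfig`, and `isBlocked` are hypothetical names, and matching parent domains against the blocklist is an assumption. Only the variable names BLOCKLIST_URL and HTTP_PROXY come from this README.

```typescript
// Hypothetical shape for the optional settings; not taken from the project source.
interface ScraperConfig {
  blocklistUrl?: string;
  httpProxy?: string;
}

// Read the optional settings from an environment-like object.
// Empty or whitespace-only values are treated as "not provided".
function readConfig(env: Record<string, string | undefined>): ScraperConfig {
  const clean = (value?: string): string | undefined => {
    const trimmed = value?.trim();
    return trimmed ? trimmed : undefined;
  };
  return {
    blocklistUrl: clean(env["BLOCKLIST_URL"]),
    httpProxy: clean(env["HTTP_PROXY"]),
  };
}

// Assumed matching rule: a hostname is blocked when it, or any parent
// domain of it, appears in the blocklist.
function isBlocked(hostname: string, blocklist: Set<string>): boolean {
  const labels = hostname.toLowerCase().split(".");
  for (let i = 0; i < labels.length; i++) {
    if (blocklist.has(labels.slice(i).join("."))) {
      return true;
    }
  }
  return false;
}
```

In a Node.js process this could be called as `readConfig(process.env)`. Domain blocking and proxying would then be enabled only when the corresponding field is defined, which mirrors the "when ... is provided" wording in the feature list.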
$ npm i # yarn install
$ npm run start:dev # yarn start:dev
$ npm run docker:build:app # yarn docker:build:app
$ npm run docker:start:prod # yarn docker:start:prod
Start the app and go to /docs for interactive API documentation.
Read more here.