Newsscraping

Scrapes data from more than 20 news channels and adds it to a database through an API. If you don't want to scrape a news channel, change that channel's variable to 'True'. News channel list: Dawn, Geo, Ary, Ptv, Express, Sama, Duniya, Bol, Abbtak, Ninetytwonews, Twentyfournews, Gnnnews, Dailypakistan, Newsone, Mashion, Mangobaaz, Sunday, Urdunews, Urdupoint, Tribune, Hellopakistanmag, Zaiqa, Islamicinfoenglish, Islamicinfourdu, Royal, Neonews, City42, Jang.
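The skip setting described above can be pictured with a minimal sketch. The names `SKIP`, `channels_to_scrape`, `push_articles`, and `send` are assumptions made for illustration; the notebook's actual variables and API call may look different.

```python
# Hypothetical skip flags: the real variable names in the notebook may differ.
# Following the README, True means "do not scrape this channel".
SKIP = {
    "Dawn": False,
    "Geo": True,  # this channel is skipped
    "Ary": False,
}


def channels_to_scrape(skip_flags):
    """Return the channels whose skip flag is not True, in listed order."""
    return [name for name, skip in skip_flags.items() if not skip]


def push_articles(articles, send):
    """Pass each scraped article to `send` and count the accepted ones.

    `send` is a placeholder for whatever posts one article to the API,
    for example a small wrapper around requests.post that returns True
    on success.
    """
    return sum(1 for article in articles if send(article))
```

With these flags, `channels_to_scrape(SKIP)` returns only Dawn and Ary. Passing the sender in as an argument keeps the sketch runnable without a live API.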

Requirements

  • Python 3.10
  • pandas: pip install pandas
  • requests, used for the API calls: pip install requests
  • Beautiful Soup, used for scraping: pip install bs4
  • lxml, the parser used by Beautiful Soup: pip install lxml
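As a rough illustration of how Beautiful Soup and lxml work together, here is a minimal sketch that parses a made-up HTML snippet. The markup, the CSS selector, and the `extract_headlines` helper are assumptions; each real channel page has its own structure and needs its own selectors.

```python
from bs4 import BeautifulSoup

# Made-up sample markup; real channel pages use their own tags and classes.
SAMPLE_HTML = """
<html><body>
  <article><h2><a href="/news/1">First headline</a></h2></article>
  <article><h2><a href="/news/2">Second headline</a></h2></article>
</body></html>
"""


def extract_headlines(html):
    """Return one {"title", "link"} dict per <article> heading link."""
    soup = BeautifulSoup(html, "lxml")  # "lxml" selects the lxml parser
    rows = []
    for link in soup.select("article h2 a"):
        rows.append({"title": link.get_text(strip=True), "link": link.get("href")})
    return rows
```

The resulting list of dicts can be loaded into pandas with `pandas.DataFrame(rows)`, or sent to an API with `requests.post`.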

Others

  • You need JupyterLab to run the scraping-for-latestnews.ipynb notebook.
  • If you don't have JupyterLab, you can run the .py file instead.