GitHub - johnedstone/python-webscraping-example

References

Some notes about rendering

After reading a few Selenium articles, this blog, and this article, it became clear that the sleep parameter should be used to give the page time to render.
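As a minimal sketch of that advice, assuming the sleep parameter in question is the one accepted by the render() method in Requests-HTML, the helper below renders a response and pauses before returning the HTML. The name render_page and the two-second default are illustrative choices, not taken from this repository.

```python
def render_page(response, sleep_seconds=2):
    """Render the JavaScript on a Requests-HTML response and return the HTML.

    `response` is assumed to be the object returned by HTMLSession().get(url).
    `sleep_seconds` is a hypothetical default; tune it for the page being scraped.
    """
    # sleep= makes render() pause after the initial render, which gives
    # scripts on the page time to finish before the HTML is read back.
    response.html.render(sleep=sleep_seconds)
    # After render(), .html.html holds the rendered markup as a string.
    return response.html.html
```

With a real session this might be called as render_page(session.get(url), sleep_seconds=3). The first call to render() downloads Chromium, so it can be slow.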

Requests-HTML vs Selenium

The following are notes for Requests-HTML. See Readme_Selenium.md for notes on getting started with Selenium.

Simple script

A simple example that scrapes the words "Log In" from Facebook's home page:

(.venv) $ python simple_fb_test.py 

200
<button class="_42ft _4jy0 _6lth _4jy6 _4jy1 selected _51sy" data-testid="royal_login_button" id="u_0_5_gE" name="login" type="submit" value="1">Log In</button>
Log In
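To show how the last line of that output follows from the button markup, here is a small offline sketch that parses the same markup with the standard library's html.parser. It is not the code in simple_fb_test.py; it is a dependency-free stand-in, and the names ButtonTextParser and button_texts are hypothetical.

```python
from html.parser import HTMLParser


class ButtonTextParser(HTMLParser):
    """Collect the text content of every <button> element."""

    def __init__(self):
        super().__init__()
        self._in_button = False
        self._parts = []
        self.buttons = []

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self._in_button = True
            self._parts = []

    def handle_data(self, data):
        # Only keep text that appears between <button> and </button>.
        if self._in_button:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "button" and self._in_button:
            self.buttons.append("".join(self._parts).strip())
            self._in_button = False


def button_texts(html):
    """Return the stripped text of each <button> found in an HTML string."""
    parser = ButtonTextParser()
    parser.feed(html)
    parser.close()
    return parser.buttons


# The button markup printed by the script above.
SAMPLE = (
    '<button class="_42ft _4jy0 _6lth _4jy6 _4jy1 selected _51sy" '
    'data-testid="royal_login_button" id="u_0_5_gE" name="login" '
    'type="submit" value="1">Log In</button>'
)

print(button_texts(SAMPLE))  # ['Log In']
```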

Getting started coding

$ python3 -m venv .venv
$ source .venv/bin/activate

$ python3
    Python 3.10.6 (main, May 29 2023, 11:10:38) [GCC 11.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.

>>> from requests_html import HTMLSession
>>> from bs4 import BeautifulSoup

>>> session = HTMLSession()
>>> r = session.get('https://www.facebook.com')

>>> r.status_code
200

>>> r.html.render()
>>> soup = BeautifulSoup(r.html.html, "html.parser")
>>> soup.find('button', string='Log In')
>>> result = soup.find('button', string='Log In')
>>> result.text
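The session above assumes that find() returns a match. In Beautiful Soup, find() returns None when nothing matches, so result.text would raise AttributeError if, for instance, the button label changed. A hedged sketch of a guard follows; find_button_text is a hypothetical helper name, and soup is expected to be a BeautifulSoup object like the one built above.

```python
def find_button_text(soup, label="Log In"):
    """Return the text of the first <button> whose string equals `label`.

    Returns None when no such button exists instead of raising.
    """
    # find() returns None rather than raising when no element matches.
    button = soup.find("button", string=label)
    if button is None:
        return None
    return button.text
```

Calling find_button_text(soup) in the session above should give the same value as result.text when the button is present.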

