
refactored out to function to allow multipages (#4) #5

Open
wants to merge 2 commits into
base: main
from

Conversation

markman123

Refactored some of the tasks out into functions to allow for multiple pages. I've also added a limit on the number of pages and made global variables for things that are reused often.
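
For readers following along, the shape of such a refactor might look roughly like the sketch below. It is not the code in this PR: `collect_job_titles` and `go_to_next_page` are hypothetical names, `browser` is assumed to be a Selenium WebDriver (or anything with a compatible `find_elements` method), and the `//h2/a` XPath is taken from the traceback later in this thread.

```python
def get_jobs(browser):
    """Return the job-title link elements on the current results page."""
    # "xpath" is the string value of selenium's By.XPATH locator strategy.
    return browser.find_elements("xpath", "//h2/a")


def collect_job_titles(browser, max_pages, go_to_next_page):
    """Collect job titles from at most max_pages result pages.

    go_to_next_page is a hypothetical callable: it advances the browser to
    the next results page and returns False when there is no further page.
    """
    titles = []
    for page in range(max_pages):
        titles.extend(link.text for link in get_jobs(browser))
        is_last_requested_page = page == max_pages - 1
        # Stop at the page limit, or earlier if the results run out.
        if is_last_requested_page or not go_to_next_page(browser):
            break
    return titles
```

Passing the page-advance step in as a callable keeps the loop independent of how the "next" link is located, which is the part most likely to change with the site's markup.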

@funbeedev
Member

funbeedev commented Oct 18, 2020

Lots of useful changes. But the error posted below needs to be addressed.

The error occurs whenever the popup appears during the transition to the next page. The script runs normally until the popup appears.
For example, it might get through page 5, but if the popup appears on page 6, the error is raised and the program stops without processing the remaining pages.

Can you run it several times to see if you also get the error? Thanks.

Traceback (most recent call last):
  File "job-search-web-scraping.py", line 77, in <module>
    indeed_job_search("machine learning", PAGES, PATH_TO_DRIVER)
  File "job-search-web-scraping.py", line 61, in indeed_job_search
    search_results = get_jobs(browser)
  File "job-search-web-scraping.py", line 40, in get_jobs
    return browser.find_elements_by_xpath("//h2/a")
  File "/home/user/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 410, in find_elements_by_xpath
    return self.find_elements(by=By.XPATH, value=xpath)
  File "/home/user/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 1005, in find_elements
    return self.execute(Command.FIND_ELEMENTS, {
  File "/home/user/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/home/user/.local/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidSessionIdException: Message: Tried to run command without establishing a connection
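
One possible way to guard against this, sketched under assumptions: check for the popup right after each page transition and close it before reading the results. `POPUP_CLOSE_XPATH` is a placeholder rather than the site's real markup, and `dismiss_popup_if_present` and `get_jobs_after_popup_check` are hypothetical helpers, not the functions in this PR. Separately, `InvalidSessionIdException` generally indicates that the WebDriver session is no longer active, so it may also be worth checking that nothing closes or quits the browser while the popup is being handled.

```python
# Placeholder locator (an assumption): replace with the popup's real close button.
POPUP_CLOSE_XPATH = "//button[@aria-label='Close']"


def dismiss_popup_if_present(browser, close_xpath=POPUP_CLOSE_XPATH):
    """Click the popup's close button if it is on the page.

    Returns True if a popup was dismissed, False if none was found.
    """
    # find_elements (plural) returns an empty list when nothing matches,
    # so the common "no popup" case needs no exception handling.
    close_buttons = browser.find_elements("xpath", close_xpath)
    if not close_buttons:
        return False
    close_buttons[0].click()
    return True


def get_jobs_after_popup_check(browser):
    """Dismiss any popup, then return the job links on the current page."""
    dismiss_popup_if_present(browser)
    return browser.find_elements("xpath", "//h2/a")
```

If the popup renders a moment after the page loads, a single check may miss it; an explicit wait (for example selenium's `WebDriverWait`) before the check could help in that case.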

@markman123
Author

Hmm, I put a function in there to handle the pop-up. Will test again.

@funbeedev
Member

@markman123 Any updates? :)

@funbeedev funbeedev linked an issue Oct 26, 2020 that may be closed by this pull request
@funbeedev funbeedev added the stale no activity label Jan 4, 2021
Development

Successfully merging this pull request may close these issues.

Expand to retrieve more than one page of the search results