Need help with retrieving html data with VBA #145070
Replies: 2 comments 1 reply
-
Updated VBA Code to Retrieve All EventsSub RetrieveAllEvents()
Dim ie As Object
Dim strJson As String
Dim totalEvents As Integer
Dim lastHeight As Long
Dim currentHeight As Long
Dim scrollAttempts As Integer
Dim maxScrollAttempts As Integer
' Create a new Internet Explorer application
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
' Navigate to the webpage
ie.Navigate "https://www.meetup.com/cflfreethought/events/"
' Wait for the page to load completely
Do While ie.Busy Or ie.ReadyState <> 4
DoEvents
Loop
' Initialize variables
totalEvents = 0
lastHeight = 0
scrollAttempts = 0
maxScrollAttempts = 10 ' Set a limit to avoid infinite loops
' Scroll until no new content is loaded
Do While scrollAttempts < maxScrollAttempts
' Scroll down the page
ie.Document.parentWindow.scrollBy 0, 1000
' Wait for new content to load
Application.Wait Now + TimeValue("00:00:02") ' Adjust wait time if necessary
' Get the current height of the document
currentHeight = ie.Document.body.scrollHeight
' Check if the height has changed
If currentHeight = lastHeight Then
' No new content loaded, exit the loop
Exit Do
End If
' Update lastHeight for the next iteration
lastHeight = currentHeight
' Increment the scroll attempt counter
scrollAttempts = scrollAttempts + 1
Loop
' Now retrieve the HTML content
strJson = ie.Document.body.innerText
' Output the result to the Immediate Window (Ctrl + G to view)
Debug.Print strJson
' Optionally, parse strJson to extract the event data
' Your parsing logic goes here
' Clean up
ie.Quit
Set ie = Nothing
End Sub
|
Beta Was this translation helpful? Give feedback.
-
💬 Your Product Feedback Has Been Submitted 🎉 Thank you for taking the time to share your insights with us! Your feedback is invaluable as we build a better GitHub experience for all our users. Here's what you can expect moving forward ⏩
Where to look to see what's shipping 👀
What you can do in the meantime 💻
As a member of the GitHub community, your participation is essential. While we can't promise that every suggestion will be implemented, we want to emphasize that your feedback is instrumental in guiding our decisions and priorities. Thank you once again for your contribution to making GitHub even better! We're grateful for your ongoing support and collaboration in shaping the future of our platform. ⭐ |
Beta Was this translation helpful? Give feedback.
-
Body
This webpage lists events for an organization:
https://www.meetup.com/cflfreethought/events/
Currently, there are 33 events, but only 10 are displayed initially when the page is loaded.
The remaining 23 events are retrieved when downward scrolling occurs. This may be called dynamic loading of the data or webpage.
Data is needed for all 33 events.
This VB code returns the data for the first 10 events which can be parsed out using strJson.
Set ie = CreateObject("InternetExplorer.Application")
ie.Navigate html_webpage
strJson = ie.Document.body.innertext
The difficulty is trying to obtain data for events 11 to 33.
Some things that have been tried:
Me!EdgeBrowser1.ExecuteJavascript "window.scrollBy(0, 5000);"
Me!EdgeBrowser1.ExecuteJavascript "window.scrollTo(0, document.body.scrollHeight);"
Neither of these have worked. An html expert is needed to advise how to retrieve the missing data. Thanks!
Guidelines
Beta Was this translation helpful? Give feedback.
All reactions