Can only get at most 20 relevant comments #43

asheseux16 · 2024-03-17T13:21:27Z

Here is my code:

import facebook_scraper as fs

# fanpage holds the page name or ID (defined elsewhere)
gen = fs.get_posts(
    post_urls=["https://mbasic.facebook.com/" + fanpage + "/posts/" + "939460300875189"],
    options={"comments": 30, "progress": True},
)

post = next(gen)

comments = post["comments_full"]

for i, comment in enumerate(comments):
    print(i)
    print(comment["comment_text"])
    print()

It returns at most 20 comments, and they all come from the "relevant" view rather than from "all comments".
The kevinzg version succeeds in getting all comments, but I think it requires URLs starting with "pfbid", which I don't know how to get.
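
To make the gap easier to report, a small helper can count top-level comments and replies separately. This is only a sketch: `summarize_comments` is a hypothetical name, the sample data is made up, and it assumes each comment dict stores its replies as a list under a `replies` key, which may differ in this fork.

```python
def summarize_comments(comments):
    """Count top-level comments and their replies separately."""
    top_level = len(comments)
    # Assumption: replies, if present, are a list under the "replies" key.
    replies = sum(len(c.get("replies") or []) for c in comments)
    return {"top_level": top_level, "replies": replies}


# Made-up sample standing in for post["comments_full"].
sample = [
    {"comment_text": "first", "replies": [{"comment_text": "a reply"}]},
    {"comment_text": "second", "replies": []},
    {"comment_text": "third"},
]
print(summarize_comments(sample))
```

Running it on `post["comments_full"]` would show whether the cap of 20 applies to top-level comments only or to comments and replies together.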

Raymondkltse · 2024-03-17T13:28:25Z

In extractor.py, I made the following change to get more than 20 comments:

        replies_url = comment.find(
            #"div.async_elem[data-sigil='replies-see-more'] a[href], div[id*='comment_replies_more'] a[href]",
            "div[id*='comment_replies_more'] a[href]",
            first=True,
        )

asheseux16 · 2024-03-17T17:28:34Z

Thank you for answering. My problem is with comments, not replies, and I found that the culprit is actually this line:

more_selector = f"div#see_next_{self.post.get('post_id')} a"

that is causing the problem: it returns None.
After I passed the post_id in through kwargs it worked, and I can now get all relevant comments, but still not "all comments".

I think it has something to do with mbasic pages; they seem to serve only the relevant comments.
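
A minimal sketch of why a missing post_id breaks that selector, assuming it is `self.post.get('post_id')` that comes back as None. The helper `build_more_selector` is hypothetical and is not part of the library; it only illustrates guarding against the missing value.

```python
def build_more_selector(post):
    """Return the 'see next' CSS selector, or None when post_id is missing."""
    post_id = post.get("post_id")
    if post_id is None:
        # Without this guard the f-string would produce "div#see_next_None a",
        # a selector that matches nothing on the page.
        return None
    return f"div#see_next_{post_id} a"


# What the unguarded f-string produces when post_id is absent.
empty_post = {}
broken = f"div#see_next_{empty_post.get('post_id')} a"
print(broken)

# With a post_id supplied, the selector is well formed.
print(build_more_selector({"post_id": "939460300875189"}))
```

The silent `None` in the selector string is easy to miss because the f-string never raises; the lookup just finds no link and pagination stops.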

Raymondkltse · 2024-03-17T23:15:48Z

I have managed to update some of the code; you can see it in my repository, especially the extractor.

I will try to make a pull request if I think it is good enough.

asheseux16 · 2024-03-18T15:53:55Z

I got a lot of warnings and errors, and still only relevant comments are scraped

kilbu · 2024-05-23T13:49:56Z

> I have managed to update some of the code; you can see it in my repository, especially the extractor.
>
> I will try to make a pull request if I think it is good enough.

@Raymondkltse
Since I don't seem to be able to open an issue in your repo:

Thank you for the great work! But I have the same problem. While your code performs well in scraping comment replies, I cannot get all comments on a post. An example:

import facebook_scraper as fs
url = "https://facebook.com/100044484187467/posts/913109196848545"
MAX_COMMENTS = True
gen = fs.get_posts(
    post_urls=[url],
    base_url = "https://mbasic.facebook.com",
    cookies="mycookies.txt",
    options={"comments": MAX_COMMENTS, "progress": True}
)

post = next(gen)
result = post['comments_full']

result

gives me the following warnings:

WARNING:facebook_scraper.extractors:[None] Extract method extract_post_url didn't return anything
ERROR:facebook_scraper.extractors:'NoneType' object has no attribute 'attrs'
WARNING:facebook_scraper.extractors:[None] Exception while running extract_user_id: KeyError('content_owner_id_new')
WARNING:facebook_scraper.extractors:[None] Extract method extract_video didn't return anything
WARNING:facebook_scraper.extractors:[None] Extract method extract_video_thumbnail didn't return anything
WARNING:facebook_scraper.extractors:[None] Extract method extract_video_id didn't return anything
WARNING:facebook_scraper.extractors:[None] Extract method extract_video_meta didn't return anything
WARNING:facebook_scraper.extractors:[None] Extract method extract_factcheck didn't return anything
WARNING:facebook_scraper.extractors:[None] Extract method extract_share_information didn't return anything
WARNING:facebook_scraper.extractors:[None] Extract method extract_listing didn't return anything
WARNING:facebook_scraper.extractors:[None] Extract method extract_with didn't return anything
ERROR:facebook_scraper.extractors:ecwr: Extracting the replies URL NOT success
ERROR:facebook_scraper.extractors:ecwr: Extracting the replies URL NOT success
ERROR:facebook_scraper.extractors:Unable to parse comment <Element 'div' class=('ei',)>: 'NoneType' object has no attribute 'text'
ERROR:facebook_scraper.extractors:Unable to parse comment <Element 'div' class=('ej',)>: 'NoneType' object has no attribute 'text'
ERROR:facebook_scraper.extractors:Unable to parse comment <Element 'div' class=('bb', 'ek', 'dy')>: 'NoneType' object has no attribute 'text'
ERROR:facebook_scraper.extractors:ecwr: Extracting the replies URL NOT success
ERROR:facebook_scraper.extractors:ecwr: Extracting the replies URL NOT success
ERROR:facebook_scraper.extractors:Unable to parse comment <Element 'div' class=('ei',)>: 'NoneType' object has no attribute 'text'
ERROR:facebook_scraper.extractors:Unable to parse comment <Element 'div' class=('ej',)>: 'NoneType' object has no attribute 'text'
ERROR:facebook_scraper.extractors:Unable to parse comment <Element 'div' class=('bb', 'ek', 'dy')>: 'NoneType' object has no attribute 'text'
ERROR:facebook_scraper.extractors:ecwr: Extracting the replies URL NOT success
Oh USer_id is none, ............ and so on

After that I get only the first, most relevant comments, including their replies, but not all comments.

Can you think of any solution to this?
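
Since the log above is long and repetitive, one way to make it easier to share is to tally records by level instead of printing each one. This is a sketch using only the standard `logging` module; the logger name `facebook_scraper.extractors` is taken from the log lines, `LevelCounter` is a hypothetical helper, and the three log calls are stand-ins for what the scraper would emit.

```python
import logging
from collections import Counter


class LevelCounter(logging.Handler):
    """Tally log records by level name instead of printing each one."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def emit(self, record):
        self.counts[record.levelname] += 1


# Logger name taken from the log lines above.
logger = logging.getLogger("facebook_scraper.extractors")
counter = LevelCounter()
logger.addHandler(counter)
logger.propagate = False  # keep these records out of the root logger's output

# Stand-in records; in real use the scraper itself would emit them.
logger.warning("Extract method extract_video didn't return anything")
logger.error("Unable to parse comment")
logger.error("ecwr: Extracting the replies URL NOT success")

print(dict(counter.counts))
```

Attaching the handler before calling `fs.get_posts(...)` would give a per-level count for the whole run, which makes it easier to compare runs with and without the extractor changes.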
