Can only get at most 20 relevant comments #43

Open
asheseux16 opened this issue Mar 17, 2024 · 5 comments

@asheseux16

asheseux16 commented Mar 17, 2024

Here is my code:

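import facebook_scraper as fs

# `fanpage` holds the page name or ID and is defined elsewhere.
gen = fs.get_posts(
    post_urls=["https://mbasic.facebook.com/" + fanpage + "/posts/" + "939460300875189"],
    options={"comments": 30, "progress": True},
)

# Only one URL was requested, so take the first (and only) post.
post = next(gen)
comments = post["comments_full"]

# Print each comment with its index.
for i, comment in enumerate(comments):
    print(i)
    print(comment["comment_text"])
    print()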

It returns at most 20 comments, and they all come from the "relevant" view rather than "all comments".
The kevinzg version succeeds in getting all comments, but I think it requires URLs starting with "pfbid", which I don't know how to get.

@Raymondkltse

In extractors.py, I made the following change to get more than 20 comments:

        replies_url = comment.find(
            #"div.async_elem[data-sigil='replies-see-more'] a[href], div[id*='comment_replies_more'] a[href]",
            "div[id*='comment_replies_more'] a[href]",
            first=True,
        )

@asheseux16
Author

asheseux16 commented Mar 17, 2024

Thank you for answering. My problem is with top-level comments, not replies, and I found that the cause is actually this line:

more_selector = f"div#see_next_{self.post.get('post_id')} a"

That is where it fails: it returns None.
After I passed the post_id in through kwargs it worked, and I can now get all "relevant" comments, but still not "all comments".

I think this is something about the mbasic pages; they seem to serve only the relevant comments.
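
One plausible reading of the failure, sketched under assumptions: if `post_id` is missing from the post dict, `dict.get` returns `None` and the f-string produces the literal selector `div#see_next_None a`, which cannot match the real element on the page. `build_more_selector` is a hypothetical stand-alone helper written for illustration, not part of facebook_scraper, and the sample ID is the one from the URL in the first comment.

```python
def build_more_selector(post):
    """Build the 'see next comments' selector the same way the quoted line does."""
    return f"div#see_next_{post.get('post_id')} a"


# post_id missing: dict.get returns None, which is formatted as the text "None".
print(build_more_selector({}))

# post_id supplied explicitly: the selector now targets a concrete element id.
print(build_more_selector({"post_id": "939460300875189"}))
```

If this is what happens, it would explain why supplying the post_id makes pagination of the relevant comments work, while leaving the "relevant" versus "all comments" question untouched.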

@Raymondkltse

I have updated some of the code provided; you can see it in my repository, especially the extractor.

I will try to open a pull request if I think it is good.

@asheseux16
Author

I got a lot of warnings and errors, and still only relevant comments are scraped

@kilbu

kilbu commented May 23, 2024

> I have updated some of the code provided; you can see it in my repository, especially the extractor.
>
> I will try to open a pull request if I think it is good.

@Raymondkltse
Since I cannot seem to open an issue in your repo:

Thank you for the great work! But I have the same problem. While your code performs well at scraping comment replies, I cannot get all comments on a post. An example:

import facebook_scraper as fs

url = "https://facebook.com/100044484187467/posts/913109196848545"
MAX_COMMENTS = True

gen = fs.get_posts(
    post_urls=[url],
    base_url="https://mbasic.facebook.com",
    cookies="mycookies.txt",
    options={"comments": MAX_COMMENTS, "progress": True},
)

post = next(gen)
result = post["comments_full"]

# Trailing expression: displays the result in a notebook or REPL.
result

gives me the following warnings:

WARNING:facebook_scraper.extractors:[None] Extract method extract_post_url didn't return anything
ERROR:facebook_scraper.extractors:'NoneType' object has no attribute 'attrs'
WARNING:facebook_scraper.extractors:[None] Exception while running extract_user_id: KeyError('content_owner_id_new')
WARNING:facebook_scraper.extractors:[None] Extract method extract_video didn't return anything
WARNING:facebook_scraper.extractors:[None] Extract method extract_video_thumbnail didn't return anything
WARNING:facebook_scraper.extractors:[None] Extract method extract_video_id didn't return anything
WARNING:facebook_scraper.extractors:[None] Extract method extract_video_meta didn't return anything
WARNING:facebook_scraper.extractors:[None] Extract method extract_factcheck didn't return anything
WARNING:facebook_scraper.extractors:[None] Extract method extract_share_information didn't return anything
WARNING:facebook_scraper.extractors:[None] Extract method extract_listing didn't return anything
WARNING:facebook_scraper.extractors:[None] Extract method extract_with didn't return anything
ERROR:facebook_scraper.extractors:ecwr: Extracting the replies URL NOT success
ERROR:facebook_scraper.extractors:ecwr: Extracting the replies URL NOT success
ERROR:facebook_scraper.extractors:Unable to parse comment <Element 'div' class=('ei',)>: 'NoneType' object has no attribute 'text'
ERROR:facebook_scraper.extractors:Unable to parse comment <Element 'div' class=('ej',)>: 'NoneType' object has no attribute 'text'
ERROR:facebook_scraper.extractors:Unable to parse comment <Element 'div' class=('bb', 'ek', 'dy')>: 'NoneType' object has no attribute 'text'
ERROR:facebook_scraper.extractors:ecwr: Extracting the replies URL NOT success
ERROR:facebook_scraper.extractors:ecwr: Extracting the replies URL NOT success
ERROR:facebook_scraper.extractors:Unable to parse comment <Element 'div' class=('ei',)>: 'NoneType' object has no attribute 'text'
ERROR:facebook_scraper.extractors:Unable to parse comment <Element 'div' class=('ej',)>: 'NoneType' object has no attribute 'text'
ERROR:facebook_scraper.extractors:Unable to parse comment <Element 'div' class=('bb', 'ek', 'dy')>: 'NoneType' object has no attribute 'text'
ERROR:facebook_scraper.extractors:ecwr: Extracting the replies URL NOT success
Oh USer_id is none, ............ and so on

After that I get only the first, most relevant comments, including their replies, but not all comments.

Can you think of any solution to this?
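
To put a number on the gap, one option is to compare how many entries ended up in `comments_full` with the comment count the post itself reports. This is a minimal sketch, assuming the post dict exposes that count under a `comments` key and that each comment may carry a `replies` list (both are assumptions about the output format and are not confirmed for this fork). `comment_coverage` and `sample_post` are hypothetical names used only for illustration.

```python
def comment_coverage(post):
    """Return (fetched, reported) comment counts for a scraped post dict."""
    comments = post.get("comments_full") or []
    # Count each top-level comment plus any nested replies (assumed key: "replies").
    fetched = sum(1 + len(c.get("replies") or []) for c in comments)
    # Assumed key for the count shown on the post; None if it is absent.
    reported = post.get("comments")
    return fetched, reported


# Stand-in for a scraped post, for illustration only.
sample_post = {
    "comments": 5,
    "comments_full": [
        {"comment_text": "first", "replies": [{"comment_text": "reply"}]},
        {"comment_text": "second", "replies": []},
    ],
}

fetched, reported = comment_coverage(sample_post)
print(f"fetched {fetched} of {reported} reported comments")
```

A fetched count that stays well below the reported one across runs would support the idea that only the "relevant" subset is being served.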
