Segmentation fault (core dumped) in CAR reading from Firehose #437
Comments
Super duper interesting. It probably fails on the https://github.com/MarshalX/python-libipld side. Could you please export the problematic data bytes to a file and prepare a reproducible example? It will help a lot! Thank you
Does this help?
@mockthebear thank you! Super strange. I ran the repos firehose from cursor 3720770000, and nothing happened locally for 3+ minutes. How long should I wait to reach the problematic frame?

Regarding your example: it gives errors, but after converting it like this:

    import ast

    # Read the printed bytes literal and turn it back into real bytes
    with open('crashy.txt', 'r') as file:
        binary_data = ast.literal_eval(file.read().strip())
    # Write the raw bytes out so they can be fed to the parser directly
    with open('data.bin', 'wb') as file:
        file.write(binary_data)

it no longer segfaults and instead gives proper errors about wrong varints, etc.

This is the code I use to reproduce with the cursor (please run it locally):

    from atproto import models, FirehoseSubscribeReposClient, firehose_models, parse_subscribe_repos_message

    # Start reading the firehose from the reported cursor
    client = FirehoseSubscribeReposClient(models.ComAtprotoSyncSubscribeRepos.Params(cursor=3720770000))

    def on_message_handler(message: firehose_models.MessageFrame) -> None:
        _ = parse_subscribe_repos_message(message)
        print(message.header)

    client.start(on_message_handler)
Okay, found it thanks to Discord: @DavidBuchanan314 is triggering the reported issue MarshalX/python-libipld#9
Usually a few seconds; it never gets past the xxx7000 to xxx8000 range.

Sorry about that. I'm not very experienced in Python, so I asked GPT to parse that output for me x.x

I see, I see. Maybe it happens because the machine I'm running it on has only 1 GB of RAM left?

The code I'm running is this one, and it crashes on this line:

Changing your code just a bit, I get the crash:

It takes about 9889 lines of 'MessageFrameHeader' to get to the problematic message.
Reproduced, thank you! It is a known edge case, but this is the first time someone has exploited it in the network.
Apparently the post came from https://bsky.app/profile/david.dev.retr0.id
Hi, I encountered the same issue. Currently, my approach is to skip the data/cursor after the segmentation fault occurs. Is there a better solution for this? Thanks :)
That's the same solution I used, and that's THE solution so far, until someone fixes it. Let's just hope nobody abuses this bug again XD
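For anyone who wants to automate the "skip the cursor after a segfault" workaround described above, here is a minimal sketch, not the SDK's own mechanism. It assumes a POSIX system and assumes you can express the parsing step as a small Python script that reads one frame from stdin. `run_isolated` and `child_code` are hypothetical names invented for this illustration. The idea is that a segfault kills only the child process, so the parent can note the cursor and move on.

```python
import subprocess
import sys


def run_isolated(frame: bytes, child_code: str, timeout: float = 10.0) -> str:
    """Feed `frame` on stdin to a child interpreter running `child_code`.

    Returns "ok" (exit code 0), "error" (the child raised or exited non-zero),
    "crashed" (the child was killed by a signal such as SIGSEGV), or "timeout".
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", child_code],
            input=frame,
            capture_output=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    # On POSIX a negative return code means "terminated by signal -returncode".
    if proc.returncode < 0:
        return "crashed"
    return "ok" if proc.returncode == 0 else "error"


if __name__ == "__main__":
    # Stand-in for a real parser script: it deliberately segfaults itself.
    crashing_child = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    print(run_isolated(b"frame bytes", crashing_child))
```

Spawning an interpreter per frame is far too slow for the live firehose, so in practice this is more useful for re-checking a suspicious cursor range, or as a supervisor that restarts the consumer past the last recorded cursor when the child dies.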
Encountered the same issue here. Does it help to upgrade the
@Li-WeiCheng hi, no better solution yet :(
I've finished the fix and will release it soon: MarshalX/python-libipld#51
I was consuming data from the firehose, specifically at cursor 3720770000, and every time I get a consistent core dump.
I managed to find that the crash is coming from:
.venv/lib/python3.12/site-packages/atproto_firehose/client.py: frame = Frame.from_bytes(data)
If I print the data I get:
There is no verification of invalid or oversized data, which causes the crash.
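To illustrate the kind of pre-parse verification the report asks for, here is a rough sketch of a guard that could run before handing bytes to the parser. Everything in it is an assumption made for illustration: `check_frame`, `FrameRejected`, and the `MAX_FRAME_BYTES` value are hypothetical and are not part of atproto or python-libipld. A size cap also cannot detect a small frame that is malformed in some other way, so it is at best a partial mitigation.

```python
# Hypothetical cap chosen for illustration; not an AT Protocol constant.
MAX_FRAME_BYTES = 5 * 1024 * 1024


class FrameRejected(ValueError):
    """Raised when a frame fails the basic checks below."""


def check_frame(data: bytes, max_bytes: int = MAX_FRAME_BYTES) -> bytes:
    """Return `data` as bytes if it passes basic sanity checks, else raise."""
    if not isinstance(data, (bytes, bytearray)):
        raise FrameRejected(f"expected bytes, got {type(data).__name__}")
    if len(data) == 0:
        raise FrameRejected("empty frame")
    if len(data) > max_bytes:
        raise FrameRejected(f"frame is {len(data)} bytes, limit is {max_bytes}")
    return bytes(data)
```

A caller would wrap the parsing step, for example by passing `check_frame(data)` to the decoder and logging then skipping the frame on `FrameRejected`.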