federation between instances showing many comments not replicating (2023-06-13 example) #3101

Closed
RocketDerp opened this issue Jun 14, 2023 · 36 comments
Labels: area: federation (support federation via activitypub), bug (Something isn't working)

Comments

@RocketDerp
Contributor

RocketDerp commented Jun 14, 2023

The normal bug report form does not serve this kind of issue. So far, the problem can be observed and reported, but not identified at the code or log level.

Even after lemmy.ml upgraded its hardware yesterday, this post serves as an example of how comments are not making it to other instances of Lemmy. This is the same post, with a different id on each instance:

https://lemmy.ml/post/1239920 has 19 comments (community home)
https://lemmy.pro/post/1354 has 1 comment
https://sh.itjust.works/post/74015 has 9 comments
https://lemmy.world/post/103943 has 9 comments
https://programming.dev/post/29631 has 13 comments
https://beehaw.org/post/536574 has 7 comments
https://lemmy.haigner.me/post/8 has 6 comments (posting originated with user there)
https://lemmy.wtf/post/1021 has 10 comments

RocketDerp added the bug label Jun 14, 2023
@Nutomic
Member

Nutomic commented Jun 14, 2023

I answered this in #3062 (comment)

Nutomic added the area: federation label Jun 14, 2023
@Hellhound1

@Nutomic could you take a look at this then, please? I believe it's the same problem, but I can't find anything in the logs to suggest why this is happening.

https://lemmy.zip/post/15952 - our community
https://lemmy.world/post/72446 - their community

To add to that, if I look at the user's profile from our instance, there is only one comment:
https://lemmy.zip/u/[email protected]

If I look from their instance, there are lots of comments, including others in that community:
https://lemmy.world/u/robotsheepboy

Is this the same issue? There is no user ban, and no community ban either.

@RocketDerp

This comment was marked as abuse.

@Loriborn

Loriborn commented Jun 16, 2023

I'm experiencing a similar issue on my instance, but I'm not sure if it's a federation issue or an issue with my configuration. My setup is via Docker, and while I can federate and see federated instances/communities, I cannot see votes or comments at all.

tabletop.place

@Berulacks

Berulacks commented Jun 16, 2023

> I'm experiencing a similar issue on my instance but I'm not sure if it's a federation issue or an issue with my configuration. My setup is via Docker, and while I can federate, and see federated instances/communities, I cannot see votes or comments at all.
>
> tabletop.place

I'm having the same issue, trying to figure out what I did wrong during set-up.

Edit: My fault, I was using PostgreSQL 11 instead of 15 and was getting errors trying to add comments to the DB. Woopsie! After upgrading Postgres and my DB, all's good.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@DomiStyle

This is by far the biggest issue I'm having with Lemmy. Missing content on federated instances is sort of a deal breaker. I posted these examples 3 days ago, but the situation has still not improved:

https://lemmy.ml/post/1250165

| Instance | Comments | Votes |
| --- | --- | --- |
| lemmy.ml | 13 | 95 |
| beehaw.org | 6 | 22 |
| lemmy.world | 11 | 64 |
| sh.itjust.works | 10 | 45 |

https://beehaw.org/post/548636

| Instance | Comments | Votes |
| --- | --- | --- |
| beehaw.org | 70 | 59 |
| lemmy.ml | 63 | 20 |
| lemmy.world | 66 | 160 |
| sh.itjust.works | 62 | 127 |

This seems to happen mostly with lemmy.ml and beehaw.org, so it is probably related to server load. lemmy.ml is pretty much dead, with only about a quarter of comments coming through; in smaller communities, sometimes none at all.

With lemmy.ml it has gotten so bad that even trying to subscribe to a new community is stuck at "Subscribe pending", while subscriptions to lemmy.world communities complete in under a second. beehaw.org needs a few seconds to come through, but it works.

From issue #3062, @Nutomic:

> @DomiStyle I'm not sure what the reason is; it's certainly worth investigating. A possibility would be instance blocks or user bans. It could also be networking problems, or a software bug. There is also #2142, which means that activities will get lost during restart.

I don't think it's related to instance blocks or user bans; my instance is not blocked on any of the listed instances. It's also not related to server restarts, since it's been like this for days. And since it happens on all instances, it's probably not network related either.

Is there a way to get more diagnostic info from Lemmy? Something like database queries per second, average query response time, running jobs/tasks, incoming/outgoing federation activities, and the number of errors per hour/day/week?

It seems like we're poking in the dark right now.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@DomiStyle

Looks very promising so far, posts and comments from lemmy.ml are flowing in again.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@jheidecker

REPEAT from over 30 hours ago: I am reaching a point where I feel like the major site operators are not publicly, on Lemmy or GitHub, acknowledging the scaling problems that the Lemmy platform is having.

The 0.18 release notes should have said that there are major problems with data not making it between instances. The front page of this GitHub repo says that Lemmy is "high performance"; in reality it is not scaling, and the person who did the DDoS yesterday, causing a multi-hour outage, shared the details of just how trivial it is to bring down 0.18.0.

Uh. Seriously. The elephant in the room right now. Pretty weird to think these admins are already building their little empires for fake internet points. I don't think the developers need to fix scalability problems if someone's instance running on a Raspberry Pi blows up metaphorically and then blows up literally. They should change the messaging around "high performance", though.

@Nutomic
Member

Nutomic commented Jun 26, 2023

LemmyNet/activitypub-federation-rust#52 will help with this.

@artindk

artindk commented Jun 27, 2023

Another example. My comment on https://lemmy.world/post/530448 is not visible on https://rblind.com/post/2240607.

@RocketDerp

This comment was marked as abuse.

@RocketDerp

This comment was marked as abuse.

@Nutomic
Member

Nutomic commented Jun 28, 2023

The federation fix mentioned above is still not merged (#3379). It will be included in one of the next RCs, so you should wait a bit before further testing. Anyway, pasting different comment counts is not helpful at all.

@RocketDerp

This comment was marked as abuse.

@sunaurus
Collaborator

sunaurus commented Jul 1, 2023

I am seeing much improved federation on 0.18.1. After lemmy.world upgraded to 0.18.1 today, a large number of lemmy.world posts and comments are now visible on the lemm.ee front page - it's a huge improvement compared to when lemmy.world was on 0.17.4.

@RocketDerp

This comment was marked as abuse.

@sunaurus
Collaborator

sunaurus commented Jul 4, 2023

lemmy.ml is not on 0.18.1 yet @anonCantCode

@RocketDerp

This comment was marked as abuse.

@Dakkaron

Dakkaron commented Jul 7, 2023

The improved federation due to the performance fixes is really good and important.

But is there any kind of retry mechanism in case syncing fails?

With growing numbers of users, Lemmy is bound to run into performance issues again.

It would be good to have some kind of eventual consistency here.

@sunaurus
Collaborator

sunaurus commented Jul 7, 2023

There is a retry mechanism, but no guaranteed eventual consistency.

  • Retries are not unlimited, so if the destination server is not reachable for several retry attempts (which are spaced out quite a lot, by the way), then retrying will stop
  • There is no persistence for outgoing messages currently, so killing lemmy_server will lose all messages in the retry queue (and all unsent messages in general)
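For anyone wondering what that means in practice, here is a minimal sketch of that behaviour. It is not Lemmy's actual delivery code: the `Activity` type, the stubbed `deliver` function, and the backoff schedule are all assumptions for illustration.

```rust
// A minimal sketch of the behaviour described above, NOT Lemmy's actual
// delivery code. The `Activity` type, the `deliver` stub, and the backoff
// schedule are assumptions for illustration.
// Requires: tokio = { version = "1", features = ["full"] }

use std::time::Duration;

#[derive(Debug, Clone)]
struct Activity {
    inbox_url: String,
    json: String,
}

// Hypothetical delivery step; a real implementation would POST `activity.json`
// to `activity.inbox_url` with an HTTP signature. Here we just simulate an
// unreachable destination server.
async fn deliver(activity: &Activity) -> Result<(), String> {
    let _payload = &activity.json;
    Err(format!("{} unreachable", activity.inbox_url))
}

// Finite, widely spaced retries. The pending activity lives only in this
// task's memory, so killing the process loses it (the missing persistence
// mentioned in the second bullet point).
async fn send_with_retries(activity: Activity) {
    let backoff = [
        Duration::from_secs(60),           // 1 minute
        Duration::from_secs(60 * 60),      // 1 hour
        Duration::from_secs(24 * 60 * 60), // 1 day
    ];
    for (attempt, delay) in backoff.iter().enumerate() {
        if deliver(&activity).await.is_ok() {
            return;
        }
        eprintln!("attempt {} failed, retrying in {:?}", attempt + 1, delay);
        tokio::time::sleep(*delay).await;
    }
    // Last attempt after the final wait; on failure the activity is dropped.
    if deliver(&activity).await.is_err() {
        eprintln!("giving up on delivery to {}", activity.inbox_url);
    }
}

#[tokio::main]
async fn main() {
    send_with_retries(Activity {
        inbox_url: "https://example.instance/inbox".to_string(),
        json: "{}".to_string(),
    })
    .await;
}
```

Persisting the queue (for example in Postgres) before attempting delivery would be one way to move towards the eventual consistency asked about above, at the cost of extra writes per activity.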

@RocketDerp

This comment was marked as abuse.

@james2432

@RocketDerp

This comment was marked as abuse.

@airjer

airjer commented Jul 11, 2023

Same issue here with my own instance on the latest release. Posts from over 24 hours ago still aren’t showing up on the alternate instance.

@RocketDerp

This comment was marked as abuse.

@alesito85

Any tips on what to do if this is happening to new instances?
