Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jump to end of reports on init, better handle missing next_token #360

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
55 changes: 36 additions & 19 deletions src/report/ReportPoller.ts
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ export class ReportPoller {
* https://matrix-org.github.io/synapse/latest/admin_api/event_reports.html
* "from" is an opaque token that is returned from the API to paginate reports
*/
private from = 0;
private from: number | null = null;
/**
* The currently-pending report poll
*/
Expand Down Expand Up @@ -60,26 +60,32 @@ export class ReportPoller {
}

private async getAbuseReports() {
let params: { dir: string, from?: number} = {
// short for direction: forward; i.e. show newest last
dir: "f",
}
if (this.from !== null) {
params["from"] = this.from;
}

let response_: {
event_reports: { room_id: string, event_id: string, sender: string, reason: string }[],
next_token: number | undefined
next_token: number | undefined,
total: number,
} | undefined;
try {
response_ = await this.mjolnir.client.doRequest(
"GET",
"/_synapse/admin/v1/event_reports",
{
// short for direction: forward; i.e. show newest last
dir: "f",
from: this.from.toString()
}
params,
);
} catch (ex) {
await this.mjolnir.logMessage(LogLevel.ERROR, "getAbuseReports", `failed to poll events: ${ex}`);
return;
}

const response = response_!;

for (let report of response.event_reports) {
if (!(report.room_id in this.mjolnir.protectedRooms)) {
continue;
Expand All @@ -104,18 +110,29 @@ export class ReportPoller {
});
}

/*
* This API endpoint returns an opaque `next_token` number that we
* need to give back to subsequent requests for pagination, so here we
* save it in account data
*/
if (response.next_token !== undefined) {
this.from = response.next_token;
try {
await this.mjolnir.client.setAccountData(REPORT_POLL_EVENT_TYPE, { from: response.next_token });
} catch (ex) {
await this.mjolnir.logMessage(LogLevel.ERROR, "getAbuseReports", `failed to update progress: ${ex}`);
}
let from;
if (this.from === null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that we want to do something to avoid displaying tens of thousands of old reports the first time we start the Report Poller on the server.

I believe that we should rather:

  • Upon startup of the ReportPoller.
    1. Load from from Mjölnir's account data.
    2. If there is no from
      a. Fetch a cutout date from the configuration, or some default date if not specified (say August 1st 2022).
      b. Start background enumerating all reports until the cutout date, without displaying them.
      c. Use this to initialize from. This may also give us a first few abuse reports that will need to be returned the first time we call getAbuseReports.
      d. Store this value of from in Mjölnir's account data.
  • Whenever we call getAbuseReports.
    1. Wait until background enumeration is complete before returning any new reports.
    2. By definition, we have a from (which may be null if the server has never encountered any abuse report).
    3. Proceed as usual.
    4. Whenever we update from, store it in Mjölnir's account data.

What do you think?

/*
* If this is our first call to this endpoint, we want to skip to
* the end of available reports, so we'll only consider reports
* that happened after we supported report polling.
*/
from = response.total;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds fishy. Isn't from supposed to be an opaque identifier?

} else {
/*
* If there are more pages for us to read, this endpoint will
* return an opaque `next_token` number that we want to provide
* on the next endpoint call. If not, we're on the last page,
* which means we want to skip to the end of this page.
*/
from = response.next_token ?? this.from + response.event_reports.length;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this.from + response.event_reports.length

isn't ideal because the from token is meant to be opaque, but i don't see a better way to do this. i've been in discussions with synapse devs to talk about how this endpoint could be better

}

this.from = from;
try {
await this.mjolnir.client.setAccountData(REPORT_POLL_EVENT_TYPE, { from: from });
} catch (ex) {
await this.mjolnir.logMessage(LogLevel.ERROR, "getAbuseReports", `failed to update progress: ${ex}`);
}
}

Expand Down