Chat thread missing original message thread was created from #1265

Open
5 of 6 tasks
realkosty opened this issue Jul 30, 2024 · 6 comments

Comments

@realkosty

realkosty commented Jul 30, 2024

Version

v2.43.3

Flavor

CLI (Command-Line Interface)

Platform

MacOS

Export format

TXT

Steps to reproduce

https://discord.gg/XYVZf6we (public channel)

./DiscordChatExporter.Cli exportguild -t "$DISCORD_TOKEN" -g 621778831602221064 --after "2024-06-11 00:00" --before "2024-06-14 00:00" --include-threads all -f PlainText -o ~/Desktop/Chat-thread-missing-original-%G-%a-%b

Locate file

Details

Actual result:

Note that the [6/12/2024 6:03 PM] message from charonthegondolier is missing:

==============================================================
Guild: Sentry Community
Channel: Sentry / 🪀|chat / When grouping things based on callstack
After: 6/23/2023 12:00 AM
==============================================================

[6/12/2024 7:33 PM] dhrumilpm



[6/12/2024 7:33 PM] dhrumilpm
Hi do you mind providing some more details about your use case, what are things in your stacktrace that we should be avoiding?

You can check this doc out to see if custom fingerprinting can help solve your use:

https://docs.sentry.io/concepts/data-management/event-grouping/fingerprint-rules/

{Embed}
https://docs.sentry.io/concepts/data-management/event-grouping/fingerprint-rules/
Fingerprint Rules
Learn about fingerprint rules, matchers for fingerprinting, how to combine matchers, and using variables for fingerprinting.
https://images-ext-1.discordapp.net/external/jFgbTN27f4PjB2EGzrUacwBAzX1TBZ0D_7VPUdNhUmc/https/sentry-docs-cc5bz9ym0.sentry.dev/meta.jpg


[6/13/2024 6:45 AM] charonthegondolier
It's mostly the stuff at the very beginning,
FEngineLoop::Tick vs. FTaskGraphCompatibilityImplementation::ProcessThreadUntilRequestReturn

These are two originating points where the code can end up being called, but these are things like 30 steps down the frame, i don't care about the stuff *that* far back in the frame.


==============================================================
Exported 3 message(s)
==============================================================

Expected result

==============================================================
Guild: Sentry Community
Channel: Sentry / 🪀|chat / When grouping things based on callstack
After: 6/23/2023 12:00 AM
==============================================================

[6/12/2024 6:03 PM] charonthegondolier
When grouping things based on callstack, is there a way to tell sentry where is a useful place in callstacks to begin grouping from, and ignore everything before X?

[6/12/2024 7:33 PM] dhrumilpm
Hi do you mind providing some more details about your use case, what are things in your stacktrace that we should be avoiding?

You can check this doc out to see if custom fingerprinting can help solve your use:

https://docs.sentry.io/concepts/data-management/event-grouping/fingerprint-rules/

{Embed}
https://docs.sentry.io/concepts/data-management/event-grouping/fingerprint-rules/
Fingerprint Rules
Learn about fingerprint rules, matchers for fingerprinting, how to combine matchers, and using variables for fingerprinting.
https://images-ext-1.discordapp.net/external/jFgbTN27f4PjB2EGzrUacwBAzX1TBZ0D_7VPUdNhUmc/https/sentry-docs-cc5bz9ym0.sentry.dev/meta.jpg


[6/13/2024 6:45 AM] charonthegondolier
It's mostly the stuff at the very beginning,
FEngineLoop::Tick vs. FTaskGraphCompatibilityImplementation::ProcessThreadUntilRequestReturn

These are two originating points where the code can end up being called, but these are things like 30 steps down the frame, i don't care about the stuff *that* far back in the frame.


==============================================================
Exported 3 message(s)
==============================================================

Checklist

  • I have looked through existing issues to make sure that this bug has not been reported before
  • I have provided a descriptive title for this issue
  • I have made sure that this bug is reproducible on the latest version of the application
  • I have provided all the information needed to reproduce this bug as efficiently as possible
  • I have sponsored this project
  • I have not read any of the above and just checked all the boxes to submit the issue
@realkosty realkosty added the bug label Jul 30, 2024
@Tyrrrz
Owner

Tyrrrz commented Aug 1, 2024

Hi. Can you check if this also happens when you export in HTML? I think the issue might be that the TXT format does not know how to render the "thread start" system event.

@realkosty
Author

Hi @Tyrrrz, no it looks like HTML behaves the same way:

Thread - EXPECTED (Discord.com)

[Screenshot: expected-thread]

Thread - ACTUAL (thread export HTML)

Note how this is missing the full initial message and only has a truncated version of it as the subject line.

[Screenshot: actual-thread-HtmlDark]

Channel view on Discord.com

[Screenshot: expected-channel]

Channel export HTML

[Screenshot: actual-channel-HtmlDark]

@Tyrrrz
Owner

Tyrrrz commented Aug 15, 2024

Thanks for the screenshots

@realkosty
Author

@Tyrrrz
could you please point me to the area in the code to look at?
or alternatively lmk if you are interested in a commercial arrangement to prioritize this fix on your end

@Tyrrrz
Owner

Tyrrrz commented Sep 4, 2024

@realkosty your points of interest are:

  • [Command("exportguild", Description = "Exports all channels within the specified server.")]
    public class ExportGuildCommand : ExportCommandBase
    {
        [CommandOption("guild", 'g', Description = "Server ID.")]
        public required Snowflake GuildId { get; init; }
        [CommandOption("include-vc", Description = "Include voice channels.")]
        public bool IncludeVoiceChannels { get; init; } = true;
        [CommandOption(
            "include-threads",
            Description = "Which types of threads should be included.",
            Converter = typeof(ThreadInclusionModeBindingConverter)
        )]
        public ThreadInclusionMode ThreadInclusionMode { get; init; } = ThreadInclusionMode.None;
        public override async ValueTask ExecuteAsync(IConsole console)
        {
            await base.ExecuteAsync(console);
            var cancellationToken = console.RegisterCancellationHandler();
            var channels = new List<Channel>();
            // Regular channels
            await console.Output.WriteLineAsync("Fetching channels...");
            var fetchedChannelsCount = 0;
            await console
                .CreateStatusTicker()
                .StartAsync(
                    "...",
                    async ctx =>
                    {
                        await foreach (
                            var channel in Discord.GetGuildChannelsAsync(GuildId, cancellationToken)
                        )
                        {
                            if (channel.IsCategory)
                                continue;
                            if (!IncludeVoiceChannels && channel.IsVoice)
                                continue;
                            channels.Add(channel);
                            ctx.Status(Markup.Escape($"Fetched '{channel.GetHierarchicalName()}'."));
                            fetchedChannelsCount++;
                        }
                    }
                );
            await console.Output.WriteLineAsync($"Fetched {fetchedChannelsCount} channel(s).");
            // Threads
            if (ThreadInclusionMode != ThreadInclusionMode.None)
            {
                await console.Output.WriteLineAsync("Fetching threads...");
                var fetchedThreadsCount = 0;
                await console
                    .CreateStatusTicker()
                    .StartAsync(
                        "...",
                        async ctx =>
                        {
                            await foreach (
                                var thread in Discord.GetGuildThreadsAsync(
                                    GuildId,
                                    ThreadInclusionMode == ThreadInclusionMode.All,
                                    Before,
                                    After,
                                    cancellationToken
                                )
                            )
                            {
                                channels.Add(thread);
                                ctx.Status(Markup.Escape($"Fetched '{thread.GetHierarchicalName()}'."));
                                fetchedThreadsCount++;
                            }
                        }
                    );
                await console.Output.WriteLineAsync($"Fetched {fetchedThreadsCount} thread(s).");
            }
            await ExportAsync(console, channels);
        }
  • public async ValueTask ExportChannelAsync(
        ExportRequest request,
        IProgress<Percentage>? progress = null,
        CancellationToken cancellationToken = default
    )
    {
        // Forum channels don't have messages, they are just a list of threads
        if (request.Channel.Kind == ChannelKind.GuildForum)
        {
            throw new DiscordChatExporterException(
                $"Channel '{request.Channel.Name}' "
                    + $"of guild '{request.Guild.Name}' "
                    + $"is a forum and cannot be exported directly. "
                    + "You need to pull its threads and export them individually."
            );
        }
        // Check if the channel is empty
        if (request.Channel.IsEmpty)
        {
            throw new DiscordChatExporterException(
                $"Channel '{request.Channel.Name}' "
                    + $"of guild '{request.Guild.Name}' "
                    + $"does not contain any messages."
            );
        }
        // Check if the 'after' boundary is valid
        if (request.After is not null && !request.Channel.MayHaveMessagesAfter(request.After.Value))
        {
            throw new DiscordChatExporterException(
                $"Channel '{request.Channel.Name}' "
                    + $"of guild '{request.Guild.Name}' "
                    + $"does not contain any messages within the specified period."
            );
        }
        // Check if the 'before' boundary is valid
        if (
            request.Before is not null
            && !request.Channel.MayHaveMessagesBefore(request.Before.Value)
        )
        {
            throw new DiscordChatExporterException(
                $"Channel '{request.Channel.Name}' "
                    + $"of guild '{request.Guild.Name}' "
                    + $"does not contain any messages within the specified period."
            );
        }
        // Build context
        var context = new ExportContext(discord, request);
        await context.PopulateChannelsAndRolesAsync(cancellationToken);
        // Export messages
        await using var messageExporter = new MessageExporter(context);
        await foreach (
            var message in discord.GetMessagesAsync(
                request.Channel.Id,
                request.After,
                request.Before,
                progress,
                cancellationToken
            )
        )
        {
            try
            {
                // Resolve members for referenced users
                foreach (var user in message.GetReferencedUsers())
                    await context.PopulateMemberAsync(user, cancellationToken);
                // Export the message
                if (request.MessageFilter.IsMatch(message))
                    await messageExporter.ExportMessageAsync(message, cancellationToken);
            }
            catch (Exception ex)
            {
                // Provide more context to the exception, to simplify debugging based on error messages
                throw new DiscordChatExporterException(
                    $"Failed to export message #{message.Id} "
                        + $"in channel '{request.Channel.Name}' (#{request.Channel.Id}) "
                        + $"of guild '{request.Guild.Name} (#{request.Guild.Id})'.",
                    ex is not DiscordChatExporterException dex || dex.IsFatal,
                    ex
                );
            }
        }
        // Throw if no messages were exported
        if (messageExporter.MessagesExported <= 0)
        {
            throw new DiscordChatExporterException(
                $"Channel '{request.Channel.Name}' (#{request.Channel.Id}) "
                    + $"of guild '{request.Guild.Name}' (#{request.Guild.Id}) "
                    + $"does not contain any matching messages within the specified period."
            );
        }
    }
  • public async IAsyncEnumerable<Message> GetMessagesAsync(
        Snowflake channelId,
        Snowflake? after = null,
        Snowflake? before = null,
        IProgress<Percentage>? progress = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default
    )
    {
        // Get the last message in the specified range, so we can later calculate the
        // progress based on the difference between message timestamps.
        // This also snapshots the boundaries, which means that messages posted after
        // the export started will not appear in the output.
        var lastMessage = await TryGetLastMessageAsync(channelId, before, cancellationToken);
        if (lastMessage is null || lastMessage.Timestamp < after?.ToDate())
            yield break;
        // Keep track of the first message in range in order to calculate the progress
        var firstMessage = default(Message);
        var currentAfter = after ?? Snowflake.Zero;
        while (true)
        {
            var url = new UrlBuilder()
                .SetPath($"channels/{channelId}/messages")
                .SetQueryParameter("limit", "100")
                .SetQueryParameter("after", currentAfter.ToString())
                .Build();
            var response = await GetJsonResponseAsync(url, cancellationToken);
            var messages = response
                .EnumerateArray()
                .Select(Message.Parse)
                // Messages are returned from newest to oldest, so we need to reverse them
                .Reverse()
                .ToArray();
            // Break if there are no messages (can happen if messages are deleted during execution)
            if (!messages.Any())
                yield break;
            // If all messages are empty, make sure that it's not because the bot account doesn't
            // have the Message Content Intent enabled.
            // https://github.com/Tyrrrz/DiscordChatExporter/issues/1106#issuecomment-1741548959
            if (
                messages.All(m => m.IsEmpty)
                && await ResolveTokenKindAsync(cancellationToken) == TokenKind.Bot
            )
            {
                var application = await GetApplicationAsync(cancellationToken);
                if (!application.IsMessageContentIntentEnabled)
                {
                    throw new DiscordChatExporterException(
                        "Provided bot account does not have the Message Content Intent enabled.",
                        true
                    );
                }
            }
            foreach (var message in messages)
            {
                firstMessage ??= message;
                // Ensure that the messages are in range
                if (message.Timestamp > lastMessage.Timestamp)
                    yield break;
                // Report progress based on timestamps
                if (progress is not null)
                {
                    var exportedDuration = (message.Timestamp - firstMessage.Timestamp).Duration();
                    var totalDuration = (lastMessage.Timestamp - firstMessage.Timestamp).Duration();
                    progress.Report(
                        Percentage.FromFraction(
                            // Avoid division by zero if all messages have the exact same timestamp
                            // (which happens when there's only one message in the channel)
                            totalDuration > TimeSpan.Zero
                                ? exportedDuration / totalDuration
                                : 1
                        )
                    );
                }
                yield return message;
                currentAfter = message.Id;
            }
        }
    }

In order of execution.

@Tyrrrz
Owner

Tyrrrz commented Nov 7, 2024

The issue appears to be that the starting message has content: "", but contains the actual message as a referenced message:

{
  "type": 21,
  "content": "",
  "mentions": [],
  "mention_roles": [],
  "attachments": [],
  "embeds": [],
  "timestamp": "2024-11-07T16:40:27.109000+00:00",
  "edited_timestamp": null,
  "flags": 0,
  "components": [],
  "id": "1304123312372453390",
  "channel_id": "1304123293049294940",
  "author": {
    "id": "128178626683338752",
    "username": "tyrrrz",
    "avatar": "d32cfe56f68f523cda67c9c5a3ef57aa",
    "discriminator": "0",
    "public_flags": 0,
    "flags": 0,
    "banner": null,
    "accent_color": null,
    "global_name": "Tyrrrz",
    "avatar_decoration_data": null,
    "banner_color": null,
    "clan": null
  },
  "pinned": false,
  "mention_everyone": false,
  "tts": false,
  "message_reference": {
    "type": 0,
    "channel_id": "1304123107845476363",
    "message_id": "1304123293049294940",
    "guild_id": "866458392705105940"
  },
  "position": 0,
  "referenced_message": {
    "type": 0,
    "content": "Thread starting message with special characters ? \" / | >",
    "mentions": [],
    "mention_roles": [],
    "attachments": [],
    "embeds": [],
    "timestamp": "2024-11-07T16:40:22.502000+00:00",
    "edited_timestamp": null,
    "flags": 32,
    "components": [],
    "id": "1304123293049294940",
    "channel_id": "1304123107845476363",
    "author": {
      "id": "128178626683338752",
      "username": "tyrrrz",
      "avatar": "d32cfe56f68f523cda67c9c5a3ef57aa",
      "discriminator": "0",
      "public_flags": 0,
      "flags": 0,
      "banner": null,
      "accent_color": null,
      "global_name": "Tyrrrz",
      "avatar_decoration_data": null,
      "banner_color": null,
      "clan": null
    },
    "pinned": false,
    "mention_everyone": false,
    "tts": false,
    "thread": {
      "id": "1304123293049294940",
      "type": 11,
      "last_message_id": "1304123315022991390",
      "flags": 0,
      "guild_id": "866458392705105940",
      "name": "Thread starting message with special",
      "parent_id": "1304123107845476363",
      "rate_limit_per_user": 0,
      "bitrate": 64000,
      "user_limit": 0,
      "rtc_region": null,
      "owner_id": "128178626683338752",
      "thread_metadata": {
        "archived": false,
        "archive_timestamp": "2024-11-07T16:40:27.071000+00:00",
        "auto_archive_duration": 4320,
        "locked": false,
        "create_timestamp": "2024-11-07T16:40:27.071000+00:00"
      },
      "message_count": 1,
      "member_count": 1,
      "total_message_sent": 1
    }
  }
}

I think Discord uses the same approach for forwarded messages as well. We already parse this data, so we need to figure out how to render it properly.
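
To make the idea concrete, here is a rough sketch of the fallback, written in Python against the raw API payload rather than the exporter's own C# model. It is not the project's implementation: the helper names `resolve_renderable` and `render_plain_text` are made up for illustration, the output layout is only an approximation of the TXT format, and the sample payload is trimmed to the fields it reads. The one fixed assumption is that `"type": 21` identifies the thread starter message, as in the payload above.

```python
# Message type 21 marks the thread starter message (see "type": 21 in the payload above).
THREAD_STARTER_MESSAGE = 21


def resolve_renderable(message: dict) -> dict:
    """Pick the payload whose author and content should be rendered.

    Hypothetical helper: a thread starter with empty content falls back to the
    embedded referenced_message; any other message is returned unchanged.
    """
    referenced = message.get("referenced_message")
    if (
        message.get("type") == THREAD_STARTER_MESSAGE
        and not message.get("content")
        and referenced is not None
    ):
        return referenced
    return message


def render_plain_text(message: dict) -> str:
    """Render a '[timestamp] author' header followed by the content.

    This only approximates the TXT layout; the real exporter formats dates differently.
    """
    source = resolve_renderable(message)
    return f"[{source['timestamp']}] {source['author']['username']}\n{source['content']}"


if __name__ == "__main__":
    # Trimmed copy of the payload from this comment, keeping only the fields used here.
    payload = {
        "type": 21,
        "content": "",
        "timestamp": "2024-11-07T16:40:27.109000+00:00",
        "author": {"username": "tyrrrz"},
        "referenced_message": {
            "type": 0,
            "content": 'Thread starting message with special characters ? " / | >',
            "timestamp": "2024-11-07T16:40:22.502000+00:00",
            "author": {"username": "tyrrrz"},
        },
    }
    print(render_plain_text(payload))
```

If the original message has been deleted, `referenced_message` may be null, in which case the sketch simply returns the empty starter message as-is. Whether the export should show the original timestamp and author (as done here) or those of the starter event is a design decision the sketch does not settle.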
