Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Application crashes with v0.11.0 #39

Closed
aiolos01 opened this issue Jun 23, 2024 · 35 comments
Closed

Application crashes with v0.11.0 #39

aiolos01 opened this issue Jun 23, 2024 · 35 comments
Assignees
Labels
bug Something isn't working

Comments

@aiolos01
Copy link

aiolos01 commented Jun 23, 2024

Several bugs to report. First of all I get a dialog saying the latest version is 0.11 despite already using 0.11

Second it crashes every time I choose refresh checked or refresh all tabs. If I select some tabs and choose refresh selected it doesn't crash.

Third, I got a different crash once.
Snap247

All this if I use OATH. If I use POESESSID there are no crashes.

@gerwaric gerwaric self-assigned this Jun 24, 2024
@gerwaric gerwaric added the bug Something isn't working label Jun 24, 2024
@gerwaric gerwaric changed the title Application crashes Application crashes with v0.11.0 Jun 24, 2024
@gerwaric
Copy link
Owner

gerwaric commented Jun 24, 2024

Thank you for the reports. I'll put together a manual release checklist for future updates, since I don't know how to automate UI testing. A lot changed under the hood witih this release, so I'm glad at least POESESSID still works for you.

I hope those issue don't take long to fix, because I really would like to fix pseudo-mod searching next.

@gerwaric
Copy link
Owner

gerwaric commented Jul 2, 2024

@aiolos01 When you have time, I have an update that should fix most of the bugs you found:

https://github.com/gerwaric/acquisition/releases/tag/v0.11.1-alpha.1

I was able to run all three kinds of refreshes on both POESESSID and OAuth without crashes, with a random variety of tabs checked. I also think I fixed the update notice logic. Finally, I've added extra error logging in places where I think I might be causing a crash, e.g. before dereferencing a pointer that might be null.

The one issue I haven't fixed is the bucket row out of bounds error, because I haven't been able to replicate it yet. Are there certain actions you take that reliabley cause that error?

Thanks for your patience.

@aiolos01
Copy link
Author

aiolos01 commented Jul 5, 2024 via email

@gerwaric
Copy link
Owner

gerwaric commented Jul 5, 2024

Heck. I thought I had guarded the most likely failures with error logging, but I clearly missed something. Could you turn up the logging to TRACE level and run one more time?

There is a submenu in the Setting menu to set the logging level. I'm hoping the additional logging will give a few more clues. In the meantime, I'll work on something like a debug mode to track exactly what's happening on a function-by-function basis.

(Acquisition also used to provide stack dumps when it crashed, but that was years ago. I'll look into this as well, since that could tell us exactly where the crash happened).

@gerwaric
Copy link
Owner

gerwaric commented Jul 5, 2024

UPDATE: I think I have figured out how to use Windows crash dumps, which will let me see exactly where acquisition crashed for you. This would mean no more trying to guess from the log files. If we can get it working, I'll know exactly which line in which function is causing the crash.

Are you open to chatting on Discord? I'm gerwaric there as well. There are some tools to automate crash reporting, but they would take time for me to figure out, integrate into acquisiton, and test. It will be faster and easier to start with Discord.

@aiolos01
Copy link
Author

aiolos01 commented Jul 7, 2024 via email

@gerwaric
Copy link
Owner

gerwaric commented Jul 8, 2024

Over the weekend I found a way to automatically report crashes:

https://github.com/gerwaric/acquisition/releases/tag/v0.11.1-alpha.2

I've tested it on Windows on my laptop. On the crash reporting website, I can see the file and line number of each crash, although there is no account information, so I won't know which crash came from which user without trying to compare upload times with log messages people report here.

If you can give it a try, I would love to confirm that it works for someone else.

@aiolos01
Copy link
Author

aiolos01 commented Jul 8, 2024 via email

@gerwaric
Copy link
Owner

gerwaric commented Jul 9, 2024

Yes, acquisition is using crashpad to send reports to BugSplat, where I'm hoping to get by with the free tier. Unfortunately that didn't help, however. The crash reporting isn't configured correctly yet.

Apologies this taking so long. I'd love to get OAuth working, but this is frustrating.

When you can get to it, here's a special release that is a zip file instead of an installer. You can just run it from where you unzip it:

https://github.com/gerwaric/acquisition/releases/tag/v0.11.1-alpha.trace.3
https://github.com/gerwaric/acquisition/releases/tag/v0.11.1-alpha-trace.4 (even more logging than trace.3)

That build will default to the TRACE log level. That's going to flood the log file with messages. I've added a lot of extra log messages at that TRACE level now, and I'm working on more. So far it's mostly focused on the code that executes before the login dialog is shown, but I want it to cover the entire codebase to help with debugging in the future.

If you could run the latest alpha-trace build and post the last 5 lines from your log file after a crash, I'm hoping this will narrow it down. Once this issue is fixed, I'll go back to building installers and setting the default logging level to INFO like it was before.

@aiolos01
Copy link
Author

aiolos01 commented Jul 10, 2024 via email

@gerwaric
Copy link
Owner

Yes. I can see a crash with alpha-trace.3 that looks plausible. This may have sold me on BugSplat, because it would have taken a long time to find the issue without it.

Specifically, the crash occured while trying to parse the colour of a stash tab after receiving the list of stashes in OAuth mode. That field is optional according to the GGG API developer documentation. My OAuth code assumed it was always present. This wasn't happening to me, so apparently all my tabs have colour.

If you were experiencing this in Standard, it's possible that stash tabs created in the early days of PoE don't have that colour field. This problem is also a non-issue when using POESESSID because the legacy API returns stash tabs in a different json format.

Long story short--this should fix that particular crash:
https://github.com/gerwaric/acquisition/releases/tag/v0.11.1-colour-fix

While you're testing that, I've been down in the guts of the items model trying to diagnose the "bucket row out of bounds error." It's making my head hurt, but that's the next thing I'm working on.

@aiolos01
Copy link
Author

aiolos01 commented Jul 11, 2024 via email

@gerwaric
Copy link
Owner

One of those crashes is showing up on BugSplat, and it looks the crash was outside acquisition in the Microsoft Visual C++ runtime library msvcp140.dll:

d:\a01\_work\12\s\src\vctools\crt\github\stl\src\mutex.cpp(103):
mtx_do_lock(_Mtx_internal_imp_t*, xtime const*)

Can you check to see which version(s) of the c++ runtime you have installed?

Here's what's on my development system:
image

@aiolos01
Copy link
Author

aiolos01 commented Jul 11, 2024 via email

@gerwaric
Copy link
Owner

I'll see if I can roll back to those versions and replicate the bugs you're seeing. If that works, then I can link to the latest installer from Microsoft and mention it in the release notes. I'll update you here.

@gerwaric
Copy link
Owner

Please delete those files from the installation folder (msvcp140.dll, msvcp140_1.dll, and msvcp140_2.dll) and let me know what happens.

I installed multiple versions of the visual c++ runtime one at a time, from 14.20.27519 to 14.40.33810, including 14.29.30139. Each time I tried running acquisition both with and without those three files from 14.29.30139, and acquisition crashed every time those .dlls were present in the installation folder.

@aiolos01
Copy link
Author

aiolos01 commented Jul 15, 2024 via email

@gerwaric
Copy link
Owner

Whew. I'm glad that SSID works for you again. I'll see what I can figure out from those crash reports and try to get another alpha release soon.

@gerwaric
Copy link
Owner

The next release has extra checks when parsing the OAuth json, so if that was causing the crashes you are seeing, they will at least show up in the log file as warnings or errors.

https://github.com/gerwaric/acquisition/releases/tag/v0.11.1-alpha.6

This update should also warn you if there are extra dll files in the application folder, since there may be other users experiencing this crash.

@aiolos01
Copy link
Author

aiolos01 commented Jul 16, 2024 via email

@aiolos01
Copy link
Author

aiolos01 commented Jul 16, 2024 via email

@gerwaric
Copy link
Owner

Scratch that. It crashes with SSID too during search.

Oof. I'm sorry. It's possible that once you did a refresh with OAuth the data that caused the crash was saved into the data files. This could be the reason you didn't see a crash with SSID at first.

Fortunately I can see those crashes in bugsplat.

When I get up in a couple hours (I'm on US mountain time), I'll dig into those reports. Hopefully they will help identify whatever elusive thing is happening.

Another thought I have is that if it's crashing in offline search for you, then I should be able to replicate the crash if I had your json data. I don't like the idea of asking you to share data files directly, so I might add an "export" function if the crash reports aren't enough.

Can't stress enough how grateful I am at your persistence, since I can't replicate these crashes on my own. Thank you very much.

@aiolos01
Copy link
Author

aiolos01 commented Jul 16, 2024 via email

@gerwaric
Copy link
Owner

I'm glad to hear the reboot helped. From the crash reports, I think we might be back to your "bucket row out of bounds" error. If you can reliably reproduce that error, a short writeup of how would be useful. I've been playing around with searches, but think I've only seen that error once, so I need to do some more poking around.

@gerwaric
Copy link
Owner

@aiolos01 I think I may have solved the "bucket row out of bounds error":

https://github.com/gerwaric/acquisition/releases/tag/v0.11.1-alpha.8

The issue I found is that acquisition always assumed that when the selected item changed, it was changing to a new item. However, it is possible that the change meant that no item was selected. When that happens, an internal row index is negative, which caused the error you screenshotted at the start of this thread. Now acquisition checks for that case and should gracefully do nothing.

When you can get around to testing this, let me know if it's working, or if there are any other new issues for you.

@gerwaric
Copy link
Owner

@aiolos01 I found a new place where the same bucket row error happens. If I missed that one, I likely missed several, so I'm going to go through the code with a fine-toothed comb tomorrow. I'll reply here when I have an update.

@gerwaric
Copy link
Owner

gerwaric commented Jul 22, 2024

I have a lot to say about this issue, but I think it might finally be fixed:

https://github.com/gerwaric/acquisition/releases/tag/v0.11.1-alpha.10

First, I was able to replicate a search crash by clicking on the "corrupted" checkbox, then selecting a search result that was not priced, then clicking on the "priced" checkbox. That should cause the item I had selected to be removed from the search results, but instead it was crashing. It turns out I was incorrectly clearing the selection object after the search was updated, so it was making calls with invalid arguments. This is fixed.

Second, I'm not sure that is the only source of those mysterious search crashes. However, the common issue seems to be certain function calls that lead to dereferencing an invalid pointer, which is a big no-no. To forestall any future crashes, I added extra code to make sure those functions aren't called with invalid arguments. Now, acquisition will log and error and continue running.

Finally, I think the guts of the item searching needs to be refactored. In the ten years since acquisition was created, people now have millions of items, which acquisition was not designed to manager. There are new frameworks that may be simpler, faster, and fewer lines of code. This would be a lot of work, so it's on the back burner for now.

@aiolos01
Copy link
Author

aiolos01 commented Jul 22, 2024 via email

@gerwaric
Copy link
Owner

That's good feedback. Since acquisition isn't popular these days, the only feedback I get about what features people use or care about comes from the ones willing to use GitHub to post issues.

Part of me wants to rebuild a new acquisition from scratch given changes GGG has made, and changes in technology over the last decade. The current guts are pretty complicated and tricky to understand. However, a more practical priority is to improve the mod searching, since it doesn't distinguish between implicit, explicit, and other modifier types the way that the trade site does. It also doesn't compute pseudo-mods at all. (It looks like it might have worked at some point in the past, but right now it's not).

@aiolos01
Copy link
Author

aiolos01 commented Jul 24, 2024 via email

@gerwaric
Copy link
Owner

gerwaric commented Jul 25, 2024

That's great to hear. Don't tell anyone just yet, but my long-term vision is to have three types of searches:

  1. Equipable items and jewels
  2. Stackable items, aka currency
  3. Unstackables, such as maps and logbooks

However, it's probably more immediately useful to get pseudo mods working. There's code that suggests they worked some time in the past, but I haven't looked into when that stopped working. Heck, it's even possible some of the early changes I made to acquisition did it, because I was learning c++ as well as acquisition's code base and GGG's API and data structures.

Coming back to the topic of this thread...

If you don't have any issues with the latest alpha, I'd like to release v0.11.1 in the next week or two. (Caveat: #50 may become a blocking issue).

@aiolos01
Copy link
Author

aiolos01 commented Jul 25, 2024 via email

@gerwaric
Copy link
Owner

gerwaric commented Jul 26, 2024

A few hours ago another user had their API access blocked at the network level due to a previously unknown bug in acquisition (#50). Details along with an update here:

https://github.com/gerwaric/acquisition/releases/tag/v0.11.1-alpha.12

I strongly recommend installing this update to avoid the risk of losing access to third-party tools until GGG has time to respond to any support requests you'd have to send.

@aiolos01
Copy link
Author

aiolos01 commented Jul 26, 2024 via email

@gerwaric
Copy link
Owner

I have an idea of what happened now. If this build fixes it I'll write it up in more detail:

https://github.com/gerwaric/acquisition/releases/tag/v0.11.1.1-alpha.1

I'm also closing this issue for now, since I think the issues causing the crashes you were reporting have been fixed:

  • "bucket row out of bounds"
  • incompatible MSVC libraries

If these issues come back, please reopen this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants