locating decompression functions #1

Open
hur opened this issue Mar 8, 2023 · 5 comments

Comments

@hur

hur commented Mar 8, 2023

I stumbled across this repo as I was trying to follow the process of https://research.checkpoint.com/2021/security-probe-of-qualcomm-msm/ to decompress the compressed segments in Pixel 5 modem firmware.

I've managed to identify at least one compressed segment by analyzing, in IDA, a function which seems to check the integrity of compressed segments. However, I see no other references to this segment's address. Unless I've generated the modem ELF incorrectly, I haven't been able to find any functions that refer to the address, or any q6zip / dlpager strings, so far.

I was wondering if you had any tips to locate the decompression functions?

@Rot127

Rot127 commented Mar 13, 2023

I can't remember anymore how I located the q6zip function in my binary (and of course I didn't document it).

But I see a few possibilities:

  • Search for the load addresses listed in the ELF header (I guess you did this already).
  • I think some Qualcomm code leaked a while ago (around 2020? I'm not sure) and can be found online. You could get it, create a FLIRT signature from the function in question, and search for that.
  • Make a FLIRT signature from the function in the Pixel 2 binary and hope that they haven't touched the function since then.
  • Script something which searches for calls and jumps into unmapped address space. Align the lowest address found down to some reasonable boundary where a segment might be loaded, then search for references to this aligned address. This might reveal a few more candidate functions for segment loading/decompression.
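The last bullet could be sketched roughly as below. This is only an illustration in plain Python: the branch targets and the mapped segment ranges are assumed to have been exported from the disassembler already, and the names `unmapped_targets` and `candidate_base` as well as the default alignment are made up for this example.

```python
def unmapped_targets(branch_targets, segments):
    """Return the sorted, de-duplicated targets that lie outside every segment.

    segments is an iterable of (start, end) pairs with end exclusive.
    """
    segments = list(segments)
    return sorted(
        {
            target
            for target in branch_targets
            if not any(start <= target < end for start, end in segments)
        }
    )


def candidate_base(targets, alignment=0x1000):
    """Align the lowest target down to a power-of-two boundary.

    The alignment is a guess; try several values (page size, 1 MiB, ...).
    """
    if not targets:
        return None
    return min(targets) & ~(alignment - 1)


# Hypothetical example data, not taken from a real firmware image.
mapped = [(0x1000, 0x2000), (0x4000, 0x5000)]
targets = [0x1010, 0x4FFC, 0xC0123458, 0xC0200000]
outside = unmapped_targets(targets, mapped)
base = candidate_base(outside, alignment=0x100000)
```

The resulting `base` is the value to search references for; whether it really is a segment load address still has to be checked by hand.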

I don't know about the scripting capabilities of IDA but I'd like to invite you to take a look at Rizin for this. I wrote the Hexagon disassembler for it and it was not hard to script something like this for Rizin with Python.

> or any q6zip / dlpager strings so far

To my knowledge, strings are stored in a separate .bXX file/segment. You can check whether the string segment was loaded at the correct offset in IDA and then search for a reference to a string manually. I think I had some problems back then because of confusion between the physical and virtual address spaces: the ELF header had a physical address instead of a virtual one as its entry point.
But if I remember correctly, the functions around the q6zip algorithm didn't use any helpful strings anyway.
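To check for the physical/virtual mix-up mentioned above, something like the following sketch might help. It assumes a 32-bit little-endian ELF and only uses the standard ELF32 header and program header layouts; the function name `entry_address_space` is invented for this example.

```python
import struct

ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")  # 52-byte ELF32 file header
ELF32_PHDR = struct.Struct("<8I")                  # 32-byte ELF32 program header


def entry_address_space(data):
    """Report whether e_entry lies in a segment's virtual or physical range."""
    fields = ELF32_HEADER.unpack_from(data, 0)
    ident = fields[0]
    # ident[4] == 1: ELFCLASS32, ident[5] == 1: little-endian data encoding.
    if ident[:4] != b"\x7fELF" or ident[4] != 1 or ident[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF")
    e_entry, e_phoff = fields[4], fields[5]
    e_phentsize, e_phnum = fields[9], fields[10]

    in_virtual = in_physical = False
    for index in range(e_phnum):
        (_p_type, _p_offset, p_vaddr, p_paddr,
         _p_filesz, p_memsz, _p_flags, _p_align) = ELF32_PHDR.unpack_from(
            data, e_phoff + index * e_phentsize
        )
        in_virtual |= p_vaddr <= e_entry < p_vaddr + p_memsz
        in_physical |= p_paddr <= e_entry < p_paddr + p_memsz

    if in_virtual and in_physical:
        return "both"
    if in_virtual:
        return "virtual"
    if in_physical:
        return "physical"
    return "neither"
```

If this reports "physical", the loader in the disassembler probably needs the segments rebased to their virtual addresses before string references resolve.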

@hur
Author

hur commented Mar 14, 2023

Thank you for the advice.

> Script something which searches for calls and jumps to not allocated address space. Align the lowest found address to some reasonable value where a segment might be loaded to. And search for references to this aligned address. Might reveal a few more possible function for segment loading/decompression.

Doing this in Binja (which turned out to identify these automatically) helped me locate a few interesting functions. I'll definitely give Rizin a try, been using IDA and Binja for this project.

I also noticed that one of the .bXX sections is an ELF shared library, CORE_USER.so. Getting that loaded correctly into Rizin/IDA/Binja might be helpful too.
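A quick way to check the other .bXX files for embedded ELF images is to scan for the ELF magic bytes. A minimal sketch, where the function name is invented and a magic match is only a hint that still needs manual confirmation:

```python
ELF_MAGIC = b"\x7fELF"


def find_embedded_elfs(blob):
    """Return every offset in blob at which the ELF magic bytes occur."""
    offsets = []
    position = blob.find(ELF_MAGIC)
    while position != -1:
        offsets.append(position)
        position = blob.find(ELF_MAGIC, position + 1)
    return offsets


# Synthetic example data; a real run would read the bytes of a .bXX file.
sample = b"\x00" * 8 + ELF_MAGIC + b"\x01" * 4 + ELF_MAGIC
found = find_embedded_elfs(sample)
```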

@Rot127

Rot127 commented Mar 14, 2023

Another idea would be to locate and dump the decompression function of the Pixel 2 and search for an algorithm which measures similarity between byte strings. This way you can compare all found functions with the Pixel 2 byte string and hope that you get a match.
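As a rough sketch of that idea, Python's standard `difflib.SequenceMatcher` can score how similar two byte strings are. The helper names and the way candidates are passed in (a dict of name to raw function bytes) are assumptions for this example, and other similarity measures may work better.

```python
from difflib import SequenceMatcher


def similarity(reference, candidate):
    """Return a ratio in [0, 1]; 1.0 means the byte strings are identical."""
    # autojunk=False disables the heuristic that ignores very frequent
    # elements, which would otherwise skew results for long inputs.
    return SequenceMatcher(None, reference, candidate, autojunk=False).ratio()


def rank_candidates(reference, candidates):
    """Return (score, name) pairs sorted from most to least similar."""
    scored = [
        (similarity(reference, code), name) for name, code in candidates.items()
    ]
    return sorted(scored, reverse=True)


# Synthetic bytes standing in for dumped functions.
reference = bytes(range(64))
candidates = {
    "sub_a": bytes(range(64)),  # identical to the reference
    "sub_b": bytes(64),         # all zero bytes
}
ranking = rank_candidates(reference, candidates)
```

Addresses and immediates embedded in the instructions will differ between builds, so partial matches are more likely than exact ones. `SequenceMatcher` is also quadratic in the worst case, so it is best applied to individual functions rather than whole segments.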

I don't know much about the function analysis capabilities of Binja and IDA, but I think the plugins don't handle function prologues, and only Binja supports emulation of some instructions. So a lot of functions which are only reached via indirect calls/jumps probably never get analyzed. Please correct me if I am wrong here.
You might have to script something which makes good guesses about which unanalyzed code could be an additional function.
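One possible shape for such a script is sketched below. It walks the 4-byte aligned instruction words of a code segment and reports those outside already-known functions that match a masked bit pattern. The `mask` and `value` parameters are placeholders: the real pattern for a prologue instruction (for example `allocframe`) has to be taken from the Hexagon reference manual, and the values used in the example are not a real encoding.

```python
import struct


def prologue_candidates(code, base, known_ranges, mask, value):
    """Return addresses of unanalyzed words where (word & mask) == value.

    code is the raw segment, base its 4-byte aligned load address, and
    known_ranges holds (start, end) pairs of functions that were already
    analyzed (end exclusive).
    """
    known_ranges = list(known_ranges)
    hits = []
    for offset in range(0, len(code) - 3, 4):
        address = base + offset
        if any(start <= address < end for start, end in known_ranges):
            continue
        (word,) = struct.unpack_from("<I", code, offset)  # little-endian word
        if word & mask == value:
            hits.append(address)
    return hits


# Synthetic words and a made-up pattern, for illustration only.
segment = struct.pack("<4I", 0xAAAA0001, 0x11111111, 0xAAAA0002, 0xAAAA0003)
hits = prologue_candidates(
    segment,
    base=0x1000,
    known_ranges=[(0x1000, 0x1004)],
    mask=0xFFFF0000,
    value=0xAAAA0000,
)
```

Each hit is only a guess at a function start and should be confirmed by letting the disassembler analyze from that address.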

@hur
Author

hur commented Mar 14, 2023

That's an interesting approach. I'm still very inexperienced with reverse engineering so I appreciate the ideas.

> So probably a lot of functions, which are only reached via indirect calls/jumps, get never analyzed. Please correct me if I am wrong here.

Yes, my understanding is the same.

@mzakocs

mzakocs commented Jun 21, 2023

@hur Newer firmware binaries don't seem to use q6zip compression anymore. They have switched to CLADE compression which is mainly implemented in hardware. I have a script that can decompress it here. Hope it helps!
