msdn_crawler.py not working on IDA for Linux (6.95) #47

Closed
kolad85 opened this issue May 13, 2017 · 11 comments

Comments

@kolad85

kolad85 commented May 13, 2017

Hello,

I tried generating the XML database file using msdn_crawler.py on a Linux machine (Ubuntu 16.04 x64), and it failed with the following errors (for both tilib and tilib64):

python ./MSDN_crawler/msdn_crawler.py "/home/<username>/extracted1033/" "/opt/bin/ida-6.95/tilib" "/opt/bin/ida-6.95/til/pc"
MSDN crawler based on zynamics msdn-crawler - Copyright 2010
WARNING:til_extractor:Error calling tilib.exe with /opt/bin/ida-6.95/til/pc/nlm.til -- Command '['/opt/bin/ida-6.95/tilib', '-l', '/opt/bin/ida-6.95/til/pc/nlm.til']' returned non-zero exit status 126
...SNIP...
WARNING:til_extractor:Error calling tilib.exe with /opt/bin/ida-6.95/til/pc/ntapi.til -- Command '['/opt/bin/ida-6.95/tilib', '-l', '/opt/bin/ida-6.95/til/pc/ntapi.til']' returned non-zero exit status 126
Traceback (most recent call last):
  File "./MSDN_crawler/msdn_crawler.py", line 413, in <module>
    main()
  File "./MSDN_crawler/msdn_crawler.py", line 398, in main
    (file_counter, results) = parse_files(msdn_directory, tilib_exe, til_dir)
  File "./MSDN_crawler/msdn_crawler.py", line 357, in parse_files
    const_enum = extract_til_constant_info.main(tilib_exe, til_dir)
  File "/opt/bin/flare-ida-master/MSDN_crawler/extract_til_constant_info.py", line 95, in main
    for enum_name, enum in enums.iteritems():
UnboundLocalError: local variable 'enums' referenced before assignment

I have also tried with the Windows version of IDA, and it failed in the way others have pointed out in other issue threads:

MSDN crawler based on zynamics msdn-crawler - Copyright 2010
Traceback (most recent call last):
  File "MSDN_crawler\msdn_crawler.py", line 413, in <module>
    main()
  File "MSDN_crawler\msdn_crawler.py", line 398, in main
    (file_counter, results) = parse_files(msdn_directory, tilib_exe, til_dir)
  File "MSDN_crawler\msdn_crawler.py", line 371, in parse_files
    result = parse_file(os.path.join(root, file), const_enum)
  File "MSDN_crawler\msdn_crawler.py", line 276, in parse_file
    return parse_new_style(file, content, const_enum)
  File "MSDN_crawler\msdn_crawler.py", line 183, in parse_new_style
    parsed_html.find_all(width='60%')]
TypeError: 'NoneType' object is not callable 

I tried obtaining the modified version of the crawler and the already generated database, but they are no longer available.
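
The first traceback is consistent with a common Python pattern: a name is bound only on a code path that never ran (here, presumably because every tilib call failed) and is then read afterwards. The following is a minimal sketch of that pattern and one defensive fix. The function names and the `list_til` stand-in are hypothetical and do not reproduce the actual code of extract_til_constant_info.py.

```python
import logging

logger = logging.getLogger("til_extractor_sketch")


def list_til(path):
    """Hypothetical stand-in for the tilib call; it always fails here."""
    raise OSError("cannot execute tilib for %s" % path)


def collect_enums_fragile(til_files):
    for path in til_files:
        try:
            enums = list_til(path)  # only bound if the call succeeds
        except OSError as err:
            logger.warning("Error calling tilib with %s -- %s", path, err)
    return sorted(enums)  # UnboundLocalError if every call failed


def collect_enums_robust(til_files):
    enums = {}  # bound up front, so the later read is always valid
    for path in til_files:
        try:
            enums.update(list_til(path))
        except OSError as err:
            logger.warning("Error calling tilib with %s -- %s", path, err)
    return sorted(enums)
```

With the robust variant, a run in which every tilib call fails yields an empty result instead of a crash, which makes the underlying exit status 126 warnings the only thing left to investigate.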

@kolad85
Author

kolad85 commented May 14, 2017

I was able to partially fix the first issue by removing shell=True (line 55) in extract_til_constant_info.py. I fixed the second issue by changing the import in msdn_crawler.py from BeautifulSoup 3 to BeautifulSoup 4:
from bs4 import BeautifulSoup
and by explicitly specifying html.parser, since it was complaining otherwise:
parsed_html = BeautifulSoup(descriptions[i], 'html.parser')

The problem I'm facing now is that the generated XML database file does not contain any <value> entries, which ultimately leads to no constants being renamed in the IDB file when the plugin runs. Is that because I'm using BeautifulSoup 4, or because of one of my other modifications?
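
For context on the shell=True change: on POSIX systems, passing a list together with shell=True makes only the first item the command, while the remaining items become arguments to the shell itself. The sketch below shows this with `echo` as a harmless stand-in for tilib; the `run` helper is hypothetical. It may be relevant to why removing shell=True helped, although it does not by itself explain the exit status 126.

```python
import os
import subprocess


def run(cmd, use_shell):
    """Run cmd (a list of strings) and return its stdout as text."""
    return subprocess.check_output(cmd, shell=use_shell).decode()


if os.name == "posix":
    # Without a shell, every list item reaches the program as an argument.
    direct = run(["echo", "hello"], use_shell=False)
    # With shell=True this becomes `sh -c echo hello`: echo receives no
    # arguments, and "hello" is consumed by the shell itself.
    via_shell = run(["echo", "hello"], use_shell=True)
    print(repr(direct), repr(via_shell))
```

Applied to the command in the warnings above, this would mean tilib was started without `-l` and without the .til path whenever shell=True was set on Linux.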

@mr-tz
Contributor

mr-tz commented May 15, 2017

The script hasn't been tested on Linux, and there are other existing issues that you've already discovered. Can you please check whether the following file works for you after unzipping it?
https://github.com/mr-tz/flare-ida/blob/master/MSDN_data/msdn_data.zip

@kolad85
Author

kolad85 commented May 15, 2017

Thanks for getting back to me so quickly. I tried with your database file, but only a few constant values were resolved to their names. At the same time, no error is reported in IDA's Python console (except for missing information about a few insignificant functions), so I'm not sure where the glitch is.

@mr-tz
Contributor

mr-tz commented May 15, 2017

Great, it sounds like the plugin is running successfully now. It likely won't rename all constants automatically. For some, the MSDN information might be missing (you can provide it in additional data files in the data directory), and for some, it might fail to track the arguments correctly.

@kolad85
Author

kolad85 commented May 16, 2017

Unfortunately, this is not the case. It was able to resolve just a few constant names, but the vast majority of constants remained unchanged. I have manually checked the XML database file, and all values were in there. If the problem were incorrect argument handling, then I suppose the argument comments shouldn't be there either.
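
A quick way to repeat this check without reading the file by hand is to count the non-empty <value> elements. This is a minimal sketch: only the `value` element name comes from the discussion above, while the surrounding element names in `SAMPLE` and the `count_values` helper are hypothetical and may not match the real database layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; only the <value> element name is taken from the thread.
SAMPLE = """<msdn>
  <function>
    <name>ExampleFunction</name>
    <argument>
      <name>dwFlags</name>
      <constant><name>EXAMPLE_FLAG</name><value>1</value></constant>
    </argument>
  </function>
  <function><name>NoConstants</name></function>
</msdn>"""


def count_values(xml_text):
    """Return the number of <value> elements that carry non-empty text."""
    root = ET.fromstring(xml_text)
    values = [el for el in root.iter("value") if (el.text or "").strip()]
    return len(values)


print(count_values(SAMPLE))
```

For a file on disk, `ET.parse(path).getroot()` can replace `ET.fromstring`. A count of zero would point at the crawler output, while a large count would point at the plugin side.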

@mr-tz
Contributor

mr-tz commented May 16, 2017

Correctly annotated arguments do not necessarily mean that the constants can be renamed successfully. Do you have a sample you can share?

@kolad85
Author

kolad85 commented May 17, 2017

I can share an XtremeRAT IDB file that I downloaded from the Internet. Please let me know how to transfer it to you.

@mr-tz
Contributor

mr-tz commented May 17, 2017

I can try to test it if you can provide a hash.

@kolad85
Author

kolad85 commented May 19, 2017

Here's the hash of the malicious file itself: 9E6B9D375DC5998E63F7376FEDF5CDF0

@mr-tz
Contributor

mr-tz commented May 19, 2017

Thanks. Which constants (at which offsets) are you expecting to be renamed?
One issue might be that the functions are called via function thunks rather than directly.
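
To illustrate the thunk concern: if a lookup only matches call targets that are imports themselves, a call that goes through a thunk will not match until the thunk is followed to its destination. The sketch below is a toy model in plain Python, not IDA's API and not the plugin's code; the `THUNKS` and `IMPORTS` tables, the addresses, and the helper name are all hypothetical.

```python
def resolve_call_target(addr, thunks, max_hops=8):
    """Follow thunk -> target links until a non-thunk address is reached."""
    seen = set()
    while addr in thunks and addr not in seen and len(seen) < max_hops:
        seen.add(addr)
        addr = thunks[addr]
    return addr


# Hypothetical data: a thunk at 0x401000 that jumps to an import entry.
THUNKS = {0x401000: 0x40A000}
IMPORTS = {0x40A000: "CreateFileA"}

call_target = 0x401000
print(IMPORTS.get(call_target))                               # direct lookup
print(IMPORTS.get(resolve_call_target(call_target, THUNKS)))  # via thunk
```

In this model the direct lookup finds nothing, while the lookup after following the thunk finds the imported function name.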

@mr-tz
Contributor

mr-tz commented Oct 5, 2018

This is potentially the same issue as #62. If it is not, please reopen this issue.

mr-tz closed this as completed Oct 5, 2018