More i386 xrefs #899
base: master
afe9a48
555b606
d29ee1a
ef13b33
ce80cd8
c4458cb
47c79f1
606d01a
3fdef6c
@@ -465,6 +465,34 @@ def get_struct_string_candidates(pe: pefile.PE) -> Iterable[StructString]:
    # dozens of seconds or more (suspect many minutes).


def get_raw_xrefs_rdata_i386(pe: pefile.PE, buf: bytes) -> Iterable[VA]:
    """
    scan for raw xrefs in .rdata section
What are raw xrefs? Can you add an example disassembly listing and some comments, please?

From the screenshots in #885 (comment), I can't tell whether these are strings. It would help to have some comments explaining what the algorithm looks for.

Raw xrefs are unprocessed xrefs in the binary file; they indicate points where strings can be divided. I'll add an example with comments.

Thanks, can you share a few example binary hashes?
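To make the "raw xrefs" idea in this thread concrete, here is a minimal sketch of the scan on a synthetic buffer. It is not the PR's implementation: `scan_raw_pointers`, the buffer contents, and the image bounds are all hypothetical, and it uses `struct.iter_unpack` with an explicit little-endian format rather than the `array` module used in the diff.

```python
import struct
from typing import Iterator


def scan_raw_pointers(buf: bytes, low: int, high: int) -> Iterator[int]:
    """Yield each aligned 32-bit little-endian word in buf that lies in [low, high)."""
    # iter_unpack needs a buffer whose length is a multiple of 4, so trim any tail.
    usable = len(buf) - (len(buf) % 4)
    for (word,) in struct.iter_unpack("<I", buf[:usable]):
        if word == 0:
            continue  # null words are padding, not pointers
        if low <= word < high:
            yield word


# Hypothetical section contents: a two-entry pointer table, padding, then strings.
# On disk the table is just consecutive 32-bit values:
#   00 30 40 00   -> 0x00403000 (would reference "hello")
#   06 30 40 00   -> 0x00403006 (would reference "world")
IMAGE_LOW, IMAGE_HIGH = 0x00400000, 0x00410000  # assumed image bounds
table = struct.pack("<II", 0x00403000, 0x00403006)
rdata = table + b"\x00\x00\x00\x00" + b"hello\x00world\x00"

candidates = list(scan_raw_pointers(rdata, IMAGE_LOW, IMAGE_HIGH))
```

Here `candidates` holds the two table entries; the string bytes decode to words outside the assumed image range and are skipped. Unlike the loop in the diff, which lags one word behind and so never tests the final word of the buffer, this sketch checks every word.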
    """
    format = "I"

    if not buf:
        return

    low, high = get_image_range(pe)
This appears to be the range of the entire file, not the
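If the intent is to bound candidate addresses by a single section rather than the whole image, the range could be derived from that section's header. The sketch below is only an illustration under that assumption: `Section` and `get_section_range` are hypothetical names, and the fields stand in for what a PE parser exposes (in pefile, a section's `VirtualAddress` and `Misc_VirtualSize`, plus `OPTIONAL_HEADER.ImageBase`).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Section:
    # Hypothetical stand-in for a parsed PE section header.
    name: str
    virtual_address: int  # RVA of the section start
    virtual_size: int     # size of the section once mapped in memory


def get_section_range(image_base: int, section: Section) -> Tuple[int, int]:
    """Return the half-open virtual address range [low, high) of one section."""
    low = image_base + section.virtual_address
    return low, low + section.virtual_size


# Assumed example values, not taken from any real binary.
rdata = Section(".rdata", virtual_address=0x3000, virtual_size=0x1200)
low, high = get_section_range(0x00400000, rdata)
```

Which bounds are appropriate depends on where the referenced strings live: a pointer table in one section may legitimately point into another, so a per-section range is a stricter filter, not necessarily a more correct one.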

    # using array module as a high-performance way to access the data as fixed-sized words.
    words = iter(array.array(format, buf))

    last = next(words)
    for current in words:
        address = last
        last = current

        if address == 0x0:
            continue

        if not (low <= address < high):
            continue

        yield address
Arker123 marked this conversation as resolved.

def get_extract_stats(
    pe: pefile, all_ss_strings: List[StaticString], lang_strings: List[StaticString], min_len: int, min_blob_len=0
) -> float:

Please add a few tests for these strings.
This routine doesn't seem limited to i386, so let's remove that from the function name. Otherwise, we should add a check of the PE architecture to restrict it to i386.

If the data are virtual addresses (rather than RVAs), we could additionally use relocation entries to find pointers and/or verify that this data is in fact a pointer.
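Both suggestions in this comment can be sketched with plain values. The helper names below (`is_i386`, `highlow_reloc_rvas`, `confirmed_pointers`) are hypothetical; the two constants come from the PE format. With pefile, the inputs would, to my understanding, come from `pe.FILE_HEADER.Machine` and from the `rva` and `type` of each entry under `pe.DIRECTORY_ENTRY_BASERELOC`, but that wiring is an assumption and is not shown here.

```python
from typing import Iterable, Iterator, Set, Tuple

IMAGE_FILE_MACHINE_I386 = 0x014C  # Machine value for 32-bit x86 in the PE file header
IMAGE_REL_BASED_HIGHLOW = 3       # base relocation type applied to a full 32-bit address


def is_i386(machine: int) -> bool:
    """True when the PE file header's Machine field identifies an i386 image."""
    return machine == IMAGE_FILE_MACHINE_I386


def highlow_reloc_rvas(relocs: Iterable[Tuple[int, int]]) -> Set[int]:
    """Collect the RVAs of (rva, type) relocation entries that patch 32-bit addresses."""
    return {rva for rva, kind in relocs if kind == IMAGE_REL_BASED_HIGHLOW}


def confirmed_pointers(buf: bytes, section_rva: int, reloc_rvas: Set[int]) -> Iterator[int]:
    """Yield aligned 32-bit words in buf whose location is covered by a relocation."""
    for offset in range(0, len(buf) - 3, 4):
        if section_rva + offset in reloc_rvas:
            yield int.from_bytes(buf[offset:offset + 4], "little")


# Assumed example data: three words, only the first and third have relocations.
buf = (0x00403000).to_bytes(4, "little") \
    + (0x00401234).to_bytes(4, "little") \
    + (0x00403006).to_bytes(4, "little")
relocs = [(0x3000, IMAGE_REL_BASED_HIGHLOW), (0x3004, 0), (0x3008, IMAGE_REL_BASED_HIGHLOW)]
pointers = list(confirmed_pointers(buf, 0x3000, highlow_reloc_rvas(relocs)))
```

The middle word looks like an in-image address but has no matching relocation, so it is dropped. This check only helps when the image still carries its relocation table, and the sketch inspects only 4-byte-aligned offsets.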