Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add scripts for sparse elf generation #2078

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

soohyuk-cho
Copy link

SooHyuk: Working on sparse elf file generation to speed up the checkpoint process.
Added some scripts I wrote for elf file scan and linker script generation.

Copy link
Contributor

@abejgonzalez abejgonzalez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good 1st pass. I think the main thing I want to see is a top-level script that given a ELF file generates the final sparse ELF file. Users shouldn't have to know what sequence to run all the underlying scripts.

Comment on lines +19 to +21
void usage(const char *progname) {
std::cerr << "Usage: " << progname << " <input_elf_file> <chunk_size_in_bytes> <output_file>" << std::endl;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Move this into main since it's small, no need for an extra function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also add more of a description here. This processes a RISC-V elf file serially in chunks (specified by the user) and emits a metadata file indicating which chunks (w/ addr + chunk size) are non-zero.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also call the output file, metadata file, in the code + desc.

Comment on lines 76 to 83
std::cout << "\nProcessing Segment " << i << ":" << std::endl;
std::cout << " Type: " << phdr.p_type << std::endl;
std::cout << " Offset: 0x" << std::hex << phdr.p_offset << std::dec << std::endl;
std::cout << " Virtual Address: 0x" << std::hex << phdr.p_vaddr << std::dec << std::endl;
std::cout << " Physical Address: 0x" << std::hex << phdr.p_paddr << std::dec << std::endl;
std::cout << " File Size: " << phdr.p_filesz << " bytes" << std::endl;
std::cout << " Memory Size: " << phdr.p_memsz << " bytes" << std::endl;
std::cout << " Flags: " << phdr.p_flags << std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many program headers have you seen in existing ELFs that you've processed. If this number is large I think it would be better to not print (or maybe add a #ifdef DEBUG preprocessor command to disable prints, and have prints disabled by default).

Comment on lines 91 to 93
std::cout << " Segment Virtual Address: 0x" << std::hex << phdr.p_vaddr << std::dec << std::endl;
std::cout << " Segment Size (filesz): " << phdr.p_filesz << " bytes" << std::endl;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto with all debug prints in this loop.

Comment on lines +122 to +123
bool all_zero = std::all_of(segment_data.begin() + chunk_offset, segment_data.begin() + chunk_end,
[](unsigned char c) { return c == 0; });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misc. thought, does this get optimized into a check per larger word (i.e. 64b instead of 8b). If not, maybe this can optimized (low priority).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the script to compile this file?

# Define a single MEMORY region
f.write("MEMORY\n")
f.write("{\n")
f.write(" MEM (rwx) : ORIGIN = 0x80000000, LENGTH = 0x80000000\n")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This length should be auto-generated based on the regions. last_region_addr + chunk - origin_addr.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The start can also be autogenerated based on the 1st region addr.

try:
start = int(start_str, 16) # Start address in hexadecimal
size = int(size_str, 10) # Size in decimal
mem_region = "MEM" # Single memory region
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might confuse others why there is only a single memory region, also if this is the case (only 1 memory region) no need to have code to assign this to each region, just assume this in the upper function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this script since this was just for testing


def main():
if len(sys.argv) != 3:
print("Usage: python3 extract_regions.py <mem.elf> <regions.txt>")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, all scripts should use a shebang. See

#!/usr/bin/env bash
#!/usr/bin/env python
for examples.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script creates the .o files needed for the linker script that's generated... why is this a separate script. I would try to combine the two files together s.t. you do:

  1. Scan the memory regions to create the metadata file (C++ executable)
  2. Given the metadata and elf, create all the .o's/etc + linker script
  3. Pass that output to a automated way to generate the final elf file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants