A semblance of assembly

Learning an assembly language is very helpful for understanding machine operations at a low level, and knowing how a compiler translates expressive lines of C source (for example) into individual instructions can help one write more efficient code. Unfortunately, most tools dedicated to compilers and assembly languages are geared towards experts, and there are relatively few resources available to those trying to learn assembly from scratch.

The purpose of this project is to develop an interactive, browser-based explorer for C90 and x86-64 that illustrates the relationships between source and assembly and provides easy access to information about different assembly mnemonics and registers. A similar project developed by Matt Godbolt can be found at http://gcc.godbolt.com. That project appears to focus more on comparing the assembly generated by different compilers, while this tool is aimed more at teaching assembly language patterns and concepts to those unfamiliar with them.

Special thanks to Bob Dondero for his advising during this project, and the x86-64 resources he developed for Princeton's COS217: Systems Programming.

Limitations

  • At this point the project is restricted to the same subset of x86-64 used by Princeton's COS217. Specifically, this excludes floating point arithmetic and assumes that functions take no more than six arguments and that structs are not passed as arguments.
  • It's possible that some cases of compiler optimization have not been accounted for in the DWARF parsing. If you encounter an error of any kind, let me know!
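
The six-argument limit lines up with the System V AMD64 calling convention, which passes the first six integer or pointer arguments in registers and any further ones on the stack. A minimal Python sketch of that mapping follows; the names `ARG_REGISTERS` and `argument_register` are hypothetical and not taken from this project's code.

```python
# System V AMD64 ABI: the first six integer/pointer arguments are passed in
# these registers, in order. Each row lists the sub-register names for one
# operand size in bytes. (Illustrative table, not this project's data.)
ARG_REGISTERS = {
    8: ["rdi", "rsi", "rdx", "rcx", "r8", "r9"],
    4: ["edi", "esi", "edx", "ecx", "r8d", "r9d"],
    2: ["di", "si", "dx", "cx", "r8w", "r9w"],
    1: ["dil", "sil", "dl", "cl", "r8b", "r9b"],
}


def argument_register(index, size=8):
    """Return the register holding the index-th (0-based) integer argument."""
    if size not in ARG_REGISTERS:
        raise ValueError("size must be 1, 2, 4, or 8 bytes")
    names = ARG_REGISTERS[size]
    if not 0 <= index < len(names):
        # A seventh or later argument would be passed on the stack instead.
        raise ValueError("only the first six integer arguments use registers")
    return names[index]
```

For example, `argument_register(1, 4)` gives `esi`, the register in which the second `int` argument of a function arrives.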

Progress

  • (3/9) Compiled dictionary of assembly mnemonics in static/ref.json
  • (3/10) Boilerplate for uploading a file to the server
  • (3/11) Render uploaded source file in template
  • (3/11) Perform source-assembly line-matching on backend
  • (3/19) Assembly tokens parsed and each returned in a different div
  • (3/20) Tooltips on mnemonics are functional
    • All text generated on backend, visibility controlled via CSS
    • Current formatting is acceptable but could be improved
  • (3/20) Line-matching is functional
    • Corresponding lines of C and assembly are wrapped in divs of the same class
  • (3/20) Extraneous compiler-generated assembly labels and directives are ignored in output
  • (3/21) Fixed line-matching bug and line numbering
  • (3/21) Moved compilation to server-side
    • The user now uploads only the source; both .s and .o files are generated on the server by gcc 4.8.1
    • Eliminates session bug resulting from having to upload multiple files
  • (3/21) Added parsing and syntax highlighting for registers and labels
    • Now can recognize mnemonics, registers, and labels.
  • (3/23) Began work on parsing debugging info
    • Currently have a dictionary of memory locations by function, with their corresponding symbols, declaration lines, and types. Unfortunately, the memory location is still in an opaque format. Trying to figure out the exact relationship between these numbers and stack/base pointer offsets.
  • (3/23) Fixed tokenizing bug with quotes and whitespace
  • (3/25) Started branch for more advanced animations/interactions with javascript
    • Added scrollbars to each panel
    • Working on horizontally aligning lines of C with corresponding blocks of assembly, to be triggered by a click event
    • Broke tooltip popups, need to re-style them
  • (3/27) Can successfully determine locations of variables stored on the stack and in registers
    • Markup not yet completed / matching not visible browser side
  • (3/28) Finished scrolling animations and fixed popup tooltips to work with scrolling
  • (4/4) Rewrote assembly operand parsing by subclassing pygments' RegexLexer
  • (4/4) Added C syntax highlighting with subclasses of pygments CLexer and HtmlFormatter
    • Issue: lexing is done line-by-line, so multiline comments are not lexed correctly.
  • (4/6) Implemented basic variable location lookups
    • formal parameters and local variables only
    • tooltip includes type and declaration line (only works for base types, still need to add support for enums, typedefs, pointer types/arrays etc.)
  • (4/6) Automatically clean uploads folder on startup
  • (4/11) Added support for pointers, typedefs, and enumerations for variable tooltips
  • (4/15) Added support and test file for arrays for variable tooltips
  • (4/15) Added optimization select to UI and fixed form submission
    • Still requires re-upload of file...
  • (4/15) Added filename display to UI
  • (4/22) Fixed back and front end for optimization and merged with master
  • (11/17) LIVE on assemblance.cs.princeton.edu
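
The source-assembly line-matching mentioned in the 3/11 and 3/20 entries can be sketched roughly as follows. This is only an illustration under an assumption: that the `.s` file was produced with `gcc -S -g`, so it carries `.loc <file> <line> <column>` directives. The project's actual matching may work differently (for instance, from the DWARF line table in the `.o` file). The names `match_lines`, `LOC_RE`, and `SAMPLE` are hypothetical.

```python
import re
from collections import defaultdict

# Matches the ".loc <file-number> <line> ..." debug directive emitted with -g.
LOC_RE = re.compile(r"^\.loc\s+(\d+)\s+(\d+)")


def match_lines(asm_text):
    """Group assembly instructions by the C source line that produced them."""
    by_line = defaultdict(list)
    current = None  # source line announced by the most recent .loc
    for raw in asm_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = LOC_RE.match(line)
        if m:
            current = int(m.group(2))
            continue
        # Ignore other directives (".text", ".cfi_...") and labels ("add:").
        if line.startswith(".") or line.endswith(":"):
            continue
        if current is not None:
            by_line[current].append(line)
    return dict(by_line)


# Hand-written sample in the shape of gcc output, for illustration only.
SAMPLE = """\
    .file "add.c"
    .text
    .globl add
add:
.LFB0:
    .loc 1 1 0
    pushq %rbp
    movq %rsp, %rbp
    .loc 1 2 0
    movl %edi, %eax
    addl %esi, %eax
    .loc 1 3 0
    popq %rbp
    ret
"""

matched = match_lines(SAMPLE)
```

Each key of `matched` is a C line number, and its value is the list of instructions to wrap in a div of the same class as that C line. Labels and directives are dropped along the way, in the spirit of the 3/20 entry about ignoring compiler-generated labels and directives.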