-
Excellent file structure, good work splitting up the different header files into folders to make your repo easier to navigate.
-
However, note that it's standard practice to split up the header files (.hpp) declaring your classes / functions and seperate implementation files (.cpp) which actually defines them. This improves build times, as changing a header implementation only requires one file to be recompiled (where the header is actually defined), rather than having to recompile each file the header is included in.
-
Additionally, your makefile doesn't track header file changes, meaning that if you do change a header file,
make
won't notice this automatically and you'd need tomake clean
andmake
the entire project again (see https://stackoverflow.com/a/30142139 on how to avoid this) -
[nitpick]
Your.gitignore
file seems to exclude a lot more than it needs to (why do you have a rule for Fortran modules 😂), but doesn't actually ignore some generated files like yourlexer.yy.cpp
orparser.tab.*
.
-
-
You have a lot of compiler warnings when building your project. It's best to try and address these, or (if you know what you're doing) to explicitly tell the compiler to ignore them (for example using the
[[maybe_unused]]
attribute) -
Excellent commit messages and good use of branches to add features.
-
Great work writing lots of your own custom test-cases, I'm sure this was very helpful in catching bugs.
-
It's impressive you went with the writing your own parser approach (and even more impressive that you scored so well doing this). Unfortunately, the C grammar is quite complex, and your parser did not fully capture it, making it the leading cause of your unseen test failures 😭 (21 failures due to parse error alone, accounting ~51% of errors). With a project like this, especially when you ended up supporting so many features, it would've been best to switch to the standard (ANSI-C), trimming down any unnecessary parts, rather than continuing to build on your own version.
-
There are issues with your operator precedence that may have caused you to loose other marks on some of the hidden testcases. For example, relation operators (
>
,<
, etc.) have a higher precedence that bitwise operators (&
,|
,^
- note that these don't even have the same precedence between each other), but in your grammar they all have equal precedence (see the C Operator Precedence) -
There seems to be a semantic error for
case
statements. You always allocate a newBlock
(which represents a scope) for case statements:
CASES : CASE EXPR ':' BODY { $$ = new case_statement($2, new Block($4)); } | DEFAULT ':' BODY { $$ = new default_statement( new Block($3)); }
However this is not the correct behaviour - case statements share the same scope, and new blocks should only be created if the case statement has scoped blocks succeeding it, i.e.
int f(int x) { int y = 0; switch(x) { case 0: y += 3; // No new scope here case 1: { // New scope here because of {} int y; y += 5; // Has no impact on return value (refers to y in new scope) } y++; } return y; }
- Nice work implementing the "lexer hack" to handle
typedef
s!
-
-
I like that your
Node
class doesn't have any member variables, and declaresprint
as a pure virtual function that your derived classes have to implement. However, ideally all of the other methods inNode
(getId
,getValue
, etc.) should also be made pure virtual (=0
), (especiallyriscv_asm
), or if they aren't implemented by every derived class, should be only defined in those relevant subclasses, and store pointers to those subclasses rather than generalNode*
where relevant. As they are, these methods don't even produce a runtime error (with the exception ofgetValue
), meaning that you could be callinggetId
on a completely invalid derived type and not notice immediately.- I'm not sure why you include a
std::map<std::string, std::vector<std::string>>& bindings
in yourriscv_asm
method, could this not be part ofHelper& helper
? Forstd::string datatype
, I think it would be better to define an overload ofriscv_asm
which takes it, and one which doesn't, so you can be more explicit about theNode
constructs which rely on type information and those which don't. Something like the following could be good:
enum class Type; // Encodes the types as an enum class rather than a string class Node { virtual void riscv_asm(std::ostream& dst, Helper& helper, const std::string& destReg) =0; // Default implementation (for derived constructs that require type info this could give a runtime error when called) virtual void riscv_asm(std::ostream& dst, Helper& helper, const std::string& destReg, [[maybe_unused]] Type type) { // Note: up to you to decide if simply logging a warning is sufficient or if calling the type-specialised version on a class which doesn't implement it should result in a runtime error std::cerr << "[Warning] Calling type-specialised riscv_asm on class which doesn't implement it" << std::endl; riscv_asm(dst, helper, destReg); } };
This approach isn't perfect as ideally, a class which derives from
Node
has the requirement to implement the non-type-specialised version ofriscv_asm
, and never needs to implement the type-specialised version. Ideally, derived classes should implement one or the other, shifting all the run-time errors in the implementation above to compile-time (always preferable). There might be a way to capture these semantics using some new C++20 features likeconcepts
, but I wasn't able to easily find a way to do this, so it's possible that the implementation above is the best you could do at the moment.-
I'm not sure why you've added the
evaluate
method to your compiler, this isn't really relevant for the purposes of this coursework. You've also implemenented it for many of your derived classes, but never actually use it anywhere meaningful in your program. -
[nitpick]
You should have made thedestReg
parameter inriscv_asm
aconst std::string&
rather than astd::string
. In this case I'm assuming that yourdestReg
strings are only 2 characters so it's not that much memory to copy (although it would be more than a const string reference - pointer, when considering the other private member variables ofstd::string
), but in general it's always good practice to pass these container types which can be expensive to copy by const reference instead of value. This is a bigger issue for the return type ofgetId()
, as this could lead to copying an aribtrarily long variable name which would significantly slow down your code. -
In general, avoid statements like
typedef const Node *NodePtr;
. Although it ensures you don't accidentally forget aconst
somewhere (although you'd get a compiler error if you do this), aliasingconst Node*
asNodePtr
doesn't save you particularly many characters, and only obfuscates your code. It might be worth it specifically fortypedef std::vector<NodePtr> List
as that name is starting to get long, but it's still debatable in that case (and I'd argue that the nameList
isn't descriptive enough). Similarly, I don't think theinit_list
andconcat_list
helper functions are adding any value.
- I'm not sure why you include a
-
You use
std::string
s instead of enums in many places, leading to quite unsafe code - especially forgetClass
andgetType
. You even seem to have created anenum _Types
(which should be anenum class
- see why) which is perfect for this, but don't really use it to it's full potential, instead converting it to a string. Sometimes you also don't throw a runtime error when given an invalid value, but rather return a value indicating failure. For example, consider the following function:std::string get_loop_labels(std::string where){ if (where == "Start"){ return loop_start.back(); } else if (where == "End"){ return loop_end.back(); } else if (where == "Update"){ return loop_update.back(); } else{ return "invalid location to jump"; } }
This is quite error prone, as even a slight typo such as calling
get_loop_lables("start")
would quitely return"invalid location to jump"
, which you might not even notice (although you'd probably get an assembler error). It would be much safer to either have seperate methods for each of the loop labels, or pass in an enum class which defines the 3 possibilities, and callingexit(1)
if one of these values isn't matched (along with the appropriate error message). If you do want to have the possibility of failure, it would be much safer to return astd::optional<std::string>>
(see here), which would allow you to safely perform checks on the returned value, and even throw if you ever accidentally use the result of the function when it's invalid. Your loop label vectors could also contain this optional type, meaning that you could use the null value for the update label in while loops, rather than relying on the string"None"
to replicate this behaviour. This would be a much nicer version of implementing the same semantics:enum class LoopLabel { START, END, UPDATE }; const std::optional<std::string>& GetLoopLabels(LoopLabel where) { switch (where) { // compiler will warn you if you miss any cases here case LoopLabel::START: return loop_start.back(); case LoopLabel::END: return loop_end.back(); case LoopLabel::UPDATE: return loop_update.back(); } std::cerr << "Unexpected loop label (should be impossible)" << std::endl; exit(1); }
An even better solution would've been to have a data
struct
tying all of these loop labels together:struct LoopLabel { std::string start; std::string end; std::optional<std::string> update; // While loop doesn't have update label }; class Helper { private: // ... std::vector<LoopLabel> loop_labels; public: // ... void EnterLoop(const std::string& start, const std::string& end, const std::optional<std::string>& update = {}) { // Constructs LoopLabel object in-place (more efficient - eliminates need to call copy/move constructor) // Note that this will still need to make copies of all the labels when constructing the LoopLabel instance, it just won't need to copy the instance itself. If you didn't need to use the labels passed into this function elsewhere you could always pass them in by r-value reference instead (which would now allow you to use a move instead of a copy when constructing the LoopLabel instance) loop_labels.emplace_back(start, end, update); } void ExitLoop() { if (loop_labels.empty()) { std::cerr << "Tried to exit loop when there are no loops" << std::endl; exit(1); } loop_labels.pop_back(); } const LoopLabel& GetLoopLabels() const { return loop_labels.back(); } };
This ensures that the loop start, end and update labels cannot get out of sync with each other (which is a risk in your code), and also gives you a very clean syntax when trying to access loop labels, i.e.
void Continue::riscv_asm(std::ostream& dst, Helper& context, ...) { const auto& labels = context.GetLoopLabels(); const std::string& target = labels.update.value_or(labels.start) // sets target to labels.update if it's defined, otherwise labels.start dst << "beq zero, zero, " << target << std::endl; }
Hopefully you can see where these sort of concepts could be useful in other parts of your code to help make things safer 😄!
-
You use
std::map
s in many places where you should actually be usingstd::unordered_map
s.std::map
s are implemented as red-black binary trees, meaning that insertion and element lookup take$O(\log{n})$ , whilststd::unordered_map
s are implemented as hash tables, so these same operations only take$O(1)$ time (amortised in the case of insertion).-
[nitpick]
You use the.count
method to check if your maps contain a particular element. While works fine, you should ideally use.contains
instead (particularly since you're already compiling under C++20 which includes it).
-
-
Your
Helper
class does a lot of really smart things to make your life easier, although I was a little surprised to see that you rely on a fixed stack size for your functions (128 bytes == 32 words). Although this is probably sufficient for most functions you need to compile, it's definitely possible for you to run out of space, an ideal approach would've been to compute how much memory you need as you build up your AST and use this value to decrement your stack. It's good to see that you did consider the possibility of running out of memory, and produce an appropriate runtime error when you do.-
Your class has some
protected
members, as well as avirtual
destructor, even though no class inherits fromHelper
. It would've been better to declare these membersprivate
. In general it's bad practice to haveprotected
members - see here. -
[nitpick]
I thinkHelper
isn't the best name for your compiler context, as it's a bit too generic, I find something likeContext
would be better. You could even split your helper class into aLabelManager
,StackManager
, etc. calling the overall classContextManager
- this might be ideal as your class is quite large.
-
-
[nitpick]
In general, it's better to putprivate
members of a class at the bottom of a class definition andpublic
members at the top. This is because you usually want the most relevant information at the top of a file (since we read top down) - and for a class this is the public interface, since as a user of the class you don't really care about the implementation details. -
[nitpick]
In general, classes (in C++) are named usingUpperCamelCase
. You seem to have a mix of this,snake_case
,Upper_Snake_Case
andalllowercase
- try to be consistent with your naming approach for all classes in a project. -
[nitpick]
Using#pragma once
instead of#ifndef ... #define ... #endif
is better for header guards as it is less prone to manual error.
You passed 161 / 202, giving you an overall score of 80%, with a seen to unseen split as follows:
-
87 / 87 seen (100%)
-
74 / 115 unseen (64%)
Although a lot of the feedback that I've provided here is constructive, your compiler is very impressive, and you should both be incredibly proud of your results for this coursework, passing 100% of seen testcases is no small feat so amazing job to both of you -- in fact no other group from this year managed to achieve this. It's a shame that your parser cost you so many marks, I really think you both deserve more, but keep in mind that your overall score is still extremely high relative to the rest of the cohort so great work! I'm sure you both learnt a lot from this project, and I hope that the feedback above can give you some ideas on improving your software engineering skills even further 🚀🌔 .