Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Next cpp setup #4

Draft
wants to merge 26 commits into
base: main
Choose a base branch
from
Draft

Next cpp setup #4

wants to merge 26 commits into from

Conversation

hobovsky
Copy link

@hobovsky hobovsky commented Jan 22, 2024

Very rudimentary prototype of a setup with snippets preprocessor.

Directory snippets contains snippets of a kata as received from user or read from database: solution.snippet, test_solution.snippet, and preloaded.snippet.

Directory preprocessor contains source code of a homegrown preprocessor cwpp.

The snippet are preprocessed with cwpp in a following way:

  • solution.snippet is searched for a header marker //! CW_SOLUTION_H and implementation marker //! CW_SOLUTION_CPP.
    • if no marker is present, then solution.snippet gets copied to both include/solution.h and src/solution.cpp,
    • if only header marker is present, then solution.snippet gets copied to include/solution.h and an empty src/solution.cpp is created,
    • if only cpp marker is present, then solution.snippet gets copied to src/solution.cpp and an empty include/solution.h is created,
    • if both markers are present, then solution.snippet gets split into both include/solution.h and src/solution.cpp in a way described with the markers.
  • The same procedure is repeated for preloaded.snippet.
  • After snippets are converted into source files and distributed into expected locations, configure and run-tests are called.

see distribute-snippets.sh for invocation of the CW preprocessor. Use run-all.sh to run all the steps: preprocessing, build, and tests.

@kazk
Copy link
Member

kazk commented Jan 23, 2024

The runner works like the following for the Codewars style challenge with max 3 files:

  1. Receives a payload with the snippets
  2. Convert the snippets into files array (type File = {path: string; contents: string};)
  3. Create a container
  4. Copy the files into the container under the project directory
  5. Start the container

So in the actual implementation, the preprocessor to split the solution/preloaded will be in step 2, outside the container.


For the marker, can't we use #ifndef NAME_INCLUDED/#endif // NAME_INCLUDED?

#ifndef SOLUTION_INCLUDED
#define SOLUTION_INCLUDED
// solution.h
#endif // SOLUTION_INCLUDED
// anything after is `solution.cpp`
  • If a line with #endif // SOLUTION_INCLUDED was found:
    • Anything up to that line is include/solution.h.
    • Anything after that line is src/solution.cpp.
  • Otherwise, everything is src/solution.cpp (same as if it were found at line 0). Is there a reason to also copy it to include/solution.h?

@hobovsky
Copy link
Author

hobovsky commented Jan 23, 2024

For the marker, can't we use #ifndef NAME_INCLUDED/#endif // NAME_INCLUDED?

We can, and it would be even better, with a small caveat that if we wanted to use a language-neutral #endif (without a trailing comment), then logic of the parser would get a little bit more complex because #if/#ifdef directives can nest. It's not a big problem because preprocessor grammar is extremely simple and a simple nesting counter would suffice. If we decide that closing comment is literally #endif // NAME_INCLUDED, or #endif NAME_INCLUDED, then it's absolutely no problem.

  • Is there a reason to also copy it to include/solution.h?

Not saving a header when is not there in the snippet would be indeed clearer. I saved it because I was trying to build on top of Unnamed's idea to let author choose if the sippet should be used as a header or as an implementation if they choose to not use markers. The idea works (see snippets-Cafeteria), but feels a bit hacky to me, yes. I would support the idea of not saving a header file if there is no header markup. If we agree that we do preprocessing, then saving a snippet as both header and cpp is not necessary. It would be helpful if we decided to go without preprocessing.

@kazk
Copy link
Member

kazk commented Jan 23, 2024

I think the following works:

const splitSnippet = (code: string, name: string): [header: string, impl: string] => {
  const lines = code.split(/(?<=\n)/); // Keep LF in each line
  const marker = new RegExp(
    `^\\s*#\\s*ifndef\\s+${name.toUpperCase()}_INCLUDED`
  );
  const start = lines.findIndex((line) => marker.test(line));
  // No header file unless there's a start marker.
  if (start === -1) return ["", code];

  let counter = 1;
  const end = lines.findIndex((line, i) => {
    if (i > start) {
      if (/^\s*#\s*endif\s*/.test(line)) return --counter === 0;
      if (/^\s*#\s*if(?:n?def)?\s+/.test(line)) ++counter;
    }
    return false;
  });
  // Put everything in header unless there's a matching `#endif`.
  // The error should make more sense that way.
  return end === -1
    ? [code, ""]
    : [lines.slice(0, end + 1).join(""), lines.slice(end + 1).join("")];
};

It's only #if, #ifdef and #ifndef, right? # can have whitespace before and after.

Used like:

const [solutionH, solutionCpp] = splitSnippet(solution, "solution");
const files = [
  {path: "include/solution.h", contents: solutionH},
  {path: "src/solution.cpp", contents: solutionCpp},
  // ...
];

With this convention, if users want to solve locally, they can just use separate files and concatenate on Codewars without adding anything proprietary.

@hobovsky
Copy link
Author

It's only #if, #ifdef and #ifndef, right?

Looking at the reference, there's a few more, but yes, these three begin a new nesting level.

# can have whitespace before and after.

I think the latter is not necessary. IIRC a preprocessor directive must follow its starting #, but I think it does not help much in our case? Maybe simplifies one or two regex a bit.

Important part is to make sure that both files, h and cpp, exist after preprocessing, even if they end up empty.

Copy link
Member

@kazk kazk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying this gtest setup locally now.

gtest/snippets/OriginalExample/preloaded.snippet Outdated Show resolved Hide resolved
gtest/snippets/OriginalExample/preloaded.snippet Outdated Show resolved Hide resolved
gtest/snippets/UniqueInOrder/solution.snippet Outdated Show resolved Hide resolved
gtest/snippets/UniqueInOrder/solution.snippet Outdated Show resolved Hide resolved
@hobovsky
Copy link
Author

hobovsky commented Nov 8, 2024

I just checked the preprocessor reference again and it seems that C++ 23 adds new directives which cwpp would have to account for: #elifdef and #elifndef: https://en.cppreference.com/w/cpp/preprocessor/conditional

It seems to be not relevant for C++20, but I think we might want to support them even now to reduce future maintenance when we introduceC++23?

@kazk
Copy link
Member

kazk commented Nov 13, 2024

Do we need to? Aren't those basically specialized #elif? Not creating a new nesting level, right?

@hobovsky
Copy link
Author

Do we have to?

It depends. Theoretically, if we use #ifndef SOLUTION_INCLUDED to mark a beginning of a block, then the block can be ended not just by #endif, but also by #else or any of the #elif[[n]def]. Question is, what should our preprocessor do if our starting tag is closed by anything else than #endif. Maybe we do not need to add any special handling, but I think that we need to at least make sure that in such cases a failure, if any, is handled gracefully.

Another tricky case is when an author or a user uses #if defined or #if !defined instead of #if[n]def. But I think that all in all it is a matter of flexibility vs. convention, and whether we enforce specific form of tags, or allow users and authors to change them to equivalent forms. I have no firm opinion on which is better.

@kazk
Copy link
Member

kazk commented Nov 13, 2024

Question is, what should our preprocessor do if our starting tag is closed by anything else than #endif. Maybe we do not need to add any special handling, but I think that we need to at least make sure that in such cases a failure, if any, is handled gracefully.

If the starting tag is not closed properly, the preprocessor simply puts everything in the header file, which should be the expected outcome because it's missing the end of a header file. Then we'll get error: unterminated conditional directive in include/solution.h. The error output is noisy, but it's probably graceful enough.

Another tricky case is when an author or a user uses #if defined or #if !defined instead of #if[n]def. But I think that all in all it is a matter of flexibility vs. convention, and whether we enforce specific form of tags, or allow users and authors to change them to equivalent forms. I have no firm opinion on which is better.

I can't think of any reason for flexibility here. Also, I believe using #ifndef is the convention for the include guards. I guess the only cost is a slightly more complex regular expression to detect the marker, so I'm not against supporting them, but I don't really see the point.

The current preprocessor is simply:

  1. If #ifndef SOLUTION_INCLUDED or #ifndef PRELOADED_INCLUDED exists, split the file at the end of this condition or the end of the file.
  2. Otherwise, no header file.

One thing the preprocessor can include is #line lino "file" to make it easier to find the relevant line mentioned.

@hobovsky
Copy link
Author

As I said, I do not think we have to change anyrhing from the initial idea, I just wanted to make sure that all possibilities are considered so we do not miss anything important. If you checked it and it works reasonably well for you, then great.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants