Before you create a pull request, please create a new issue first to coordinate.
It might be that we are already working on the same or similar feature, but we haven't made our work visible yet.
We usually develop in a virtual environment. To create one, change to the root directory of the repository and invoke:
python -m venv venv
You need to activate it. On *nix (Linux, Mac, etc.):
source venv/bin/activate
and on Windows:
venv\Scripts\activate
Once you have activated the virtual environment, you can install the development dependencies using pip:
pip3 install --editable .[dev]
The --editable option is necessary so that all the changes made to the repository are automatically reflected in the virtual environment (see also this StackOverflow question).
Please familiarize yourself with the general literature on compiler design.
For example, Crafting Interpreters is a good book.
Please also read relevant publications related to the aas-core-codegen:
- https://www.researchgate.net/publication/375497058_Empowering_Industry_40_with_Generative_and_Model-Driven_SDK_Development
- https://www.researchgate.net/publication/356039469_Generative_and_Model-driven_SDK_development_for_the_Industrie_40_Digital_Twin
- https://www.researchgate.net/publication/373325991_Generation_of_Digital_Twins_for_Information_Exchange_Between_Partners_in_the_Industrie_40_Value_Chain
Always write explicit types in function arguments.
If you really expect any type, mark that explicitly with Any.
Also always mark your local variables with a type if it cannot be deduced.
For example:
lst = [] # type: List[str]
For files, use typing.IO.
For readability, we prefer to put types in comments if they are short. However, put them in code when they span multiple lines:
some_map: Optional[
    Dict[
        SomeType,
        AnotherType
    ]
] = None
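As a minimal sketch of these typing rules in one place (all function and variable names below are hypothetical and serve only as illustration):

```python
import io
from typing import IO, Any, List, Optional


def describe(value: Any) -> str:
    """Render the value as text; ``Any`` marks that every type is expected."""
    return repr(value)


def write_lines(lines: List[str], stream: IO[str]) -> None:
    """Write each line to the stream followed by a new-line character."""
    for line in lines:
        stream.write(line + "\n")


# The type of an empty list can not be deduced, so it is annotated.
collected = []  # type: List[str]
collected.append(describe(42))

maybe_first = None  # type: Optional[str]
if len(collected) > 0:
    maybe_first = collected[0]

buffer = io.StringIO()
write_lines(collected, buffer)
```

The short annotations live in comments, while the function arguments are always typed in code.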
Use the _set suffix for sets.
Prefer to designate the key of a mapping with a _by_ suffix.
For example, our_types_by_name is a mapping string (a name) 🠒 OurType.
Do not put get_ in method names.
If you want to make sure that the reader understands that some method is going to take longer than just a simple getter, prefix it with a meaningful verb such as compute_..., collect_... or retrieve_....
Do not duplicate module (package) or class names in the property names.
For example, if you have a class called Features, and want to add a property to hold feature names, call the property simply names and not feature_names.
The caller code would otherwise redundantly read Features.feature_names or features.feature_names.
Do not call your modules, classes or functions ..._helpers or ..._utils.
A general name is most probably an indicator that there is either a flaw in the design (e.g., tight coupling which should be decoupled) or that there needs to be more thought spent on the naming.
If you have shared functionality in a module used by all or most of the submodules, put it in the common submodule.
- Prefer functional programming to object-oriented programming.
- Better be explicit about the data flow than implicit.
- Prefer namespaced functions in a (sub)module instead of class methods.
- Side effects are difficult to trace.
- Context of a function is immediately visible when you look at arguments. A function is much easier to isolate and unit test than a class method.
- Use inheritance only when you need polymorphism.
- Do not use inheritance to share implementation; use namespaced functions for that.
- Prefer simplicity with a small number of classes; see http://thedailywtf.com/articles/Enterprise-Dependency-The-Next-Generation
- Use stateful objects in moderation.
- Some thoughts: https://medium.com/@cscalfani/goodbye-object-oriented-programming-a59cda4c0e53
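The preference for namespaced functions with explicit data flow can be sketched as follows. The function names are hypothetical; the point is only that each function receives everything it needs as arguments and returns a new value instead of mutating shared state:

```python
from typing import List, Sequence, Set


def normalize(names: Sequence[str]) -> List[str]:
    """Strip and lower-case the names without modifying the input."""
    return [name.strip().lower() for name in names]


def deduplicate(names: Sequence[str]) -> List[str]:
    """Drop the repeated names while keeping the order of first occurrence."""
    seen_set = set()  # type: Set[str]
    result = []  # type: List[str]
    for name in names:
        if name not in seen_set:
            seen_set.add(name)
            result.append(name)
    return result


raw_names = [" A", "b", "a "]

# The data flow is visible at the call site; there is no hidden state.
unique_names = deduplicate(normalize(raw_names))
```

Each function can be unit tested in isolation, since its whole context is in its arguments.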
Do not split script-like parts of the code into small chunks of one-time usage functions.
Use comments or regions to give an overview.
It's ok to have long scripts; they are usually more readable than a patchwork of small functions. Jumping around a file while keeping the context in your head is difficult and error-prone.
Do not ever use stateful singletons. Pass objects down the calling stack even if it seems tedious at first.
Very common symbols such as Error or Identifier can be imported without prefix.
Usually, these symbols reside in aas_core_codegen.common.
In addition, do not prefix typing symbols such as List or Mapping, and the assertion functions from the icontract design-by-contract library (see below).
Otherwise, the code would be completely unreadable.
All other symbols should be imported with an aliased prefix corresponding to the module. For example:
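A minimal sketch of this import style, using standard-library modules as stand-ins for the project's own modules. The exact alias naming scheme (package and module joined with an underscore) is an assumption made for illustration:

```python
from email import utils as email_utils
from typing import List
from urllib import parse as urllib_parse


def list_hosts(urls: List[str]) -> List[str]:
    """Extract the host part of each URL."""
    # The prefix makes it obvious at the call site where ``urlsplit`` lives.
    return [urllib_parse.urlsplit(url).netloc for url in urls]


def extract_address(header: str) -> str:
    """Extract the bare e-mail address from a header such as ``To``."""
    return email_utils.parseaddr(header)[1]
```

The typing symbol List is imported without a prefix, while every other symbol is reached through its module alias.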
The indentation constants (I, II etc.) are the only aliases allowed for symbols.
No other symbol should be aliased.
Use pathlib, not os.path.
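For illustration, a small sketch of file handling with pathlib only. The function name and the file names are hypothetical:

```python
import pathlib
import tempfile
from typing import List


def list_python_files(directory: pathlib.Path) -> List[pathlib.Path]:
    """List the Python files directly in the directory, sorted by path."""
    return sorted(
        path
        for path in directory.iterdir()
        if path.is_file() and path.suffix == ".py"
    )


with tempfile.TemporaryDirectory() as tmp_dir:
    root = pathlib.Path(tmp_dir)

    # Paths are joined with the ``/`` operator instead of ``os.path.join``.
    (root / "a.py").write_text("", encoding="utf-8")
    (root / "notes.txt").write_text("", encoding="utf-8")

    found_names = [path.name for path in list_python_files(root)]
```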
Use design-by-contract as much as possible. We use the icontract library.
Prefer immutable to mutable objects and structures.
Distinguish between internally and externally mutable structures. Annotate for immutability if the structures are only internally mutable.
For example, the types in aas_core_codegen.intermediate._types are all marked as immutable since they should not be mutated after the intermediate translation phase.
They are, however, mutated within aas_core_codegen.intermediate._translate.
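One way to annotate such structures is sketched below with typing.Final and read-only collection types. The class and function are hypothetical, and whether the project uses exactly these annotations is an assumption:

```python
from typing import Final, List, Sequence


class Enumeration:
    """Represent an enumeration that is immutable for the external callers."""

    name: Final[str]

    # ``Sequence`` offers no mutating methods to the type checker,
    # even though the underlying object is a list.
    literals: Final[Sequence[str]]

    def __init__(self, name: str, literals: Sequence[str]) -> None:
        self.name = name
        self.literals = literals


def translate(name: str, raw_literals: Sequence[str]) -> Enumeration:
    """Build the enumeration; the list is mutated only within this function."""
    literals = []  # type: List[str]
    for raw_literal in raw_literals:
        literals.append(raw_literal.strip())

    return Enumeration(name=name, literals=literals)


color = translate("Color", [" red", "green "])
```

The annotations are checked statically (for example, by mypy); they do not prevent mutation at runtime.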
Double-asterisks are unpredictable for the reader, as all the keys need to be kept in mind, and overridden keys are simply ignored.
Please do not use the ** operator unless it is utterly necessary, and explain in a comment why it is necessary.
Check for overwriting keys where appropriate.
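A minimal sketch of merging two mappings without the ** operator, with an explicit check for overwritten keys. The function name is hypothetical:

```python
from typing import Dict, Mapping


def merge_without_overwriting(
    first: Mapping[str, int], second: Mapping[str, int]
) -> Dict[str, int]:
    """Merge the two mappings and raise ``KeyError`` if they share a key."""
    merged = dict(first)
    for key, value in second.items():
        if key in merged:
            raise KeyError(f"The key would be overwritten: {key!r}")

        merged[key] = value

    return merged


merged = merge_without_overwriting({"a": 1}, {"b": 2})
```

In contrast, {**first, **second} would silently keep only the value from second for a shared key.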
Always use classes in the code.
Use TypedDict only if you have to deal with serialization (e.g., to JSON).
We intensively use PyCharm's # region ... and # endregion to structure code into regions.
Mark notes with # NOTE ({github username}, {date in ISO 8601}):.
No # TODO in the code, please.
Comment only where the comments really add information. Do not write self-evident comments.
Comments should be in proper English. Write in simple present tense; avoid imperative mood.
Be careful about the capitals. Start the sentence with a capital. If you list bullet points, start with a capital, and do not forget conjunctions:
# * We ...,
# * Then, ..., and finally
# * We ...
The abbreviations are to be written properly in capitals (e.g., JSON and not json).
No code is allowed in the comments since it always rots.
You can write full-blown Sphinx docstrings, if you wish.
In many cases, a short docstring is enough.
We are not religious about :param ...: and :return: fields.
Follow PEP 287. Use imperative mood in the docstrings.
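As an illustration, a short docstring in reStructuredText with imperative mood might look like the following. The function is hypothetical:

```python
from typing import List


def split_identifier(identifier: str) -> List[str]:
    """
    Split the identifier into lower-case parts at underscores.

    Drop the empty parts, such as those between two adjacent underscores.

    :param identifier: to be split
    :return: non-empty lower-case parts in the original order
    """
    return [part.lower() for part in identifier.split("_") if len(part) > 0]
```

For a function this small, the first line alone would often be enough.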
Write unit tests for everything that can be obviously tested at the function/class level.
For many inter-dependent code regions, writing unit tests is too tedious or nigh impossible to later maintain.
For such parts of the system, prefer integration tests with comparisons against initially recorded and reviewed golden data.
See, for example, tests/csharp/test_main.py or tests/intermediate/test_translate.py.
The golden test data resides in test_data/.
The structure of the test data directory follows in general the test module structure.
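The idea of testing against recorded golden data can be sketched as below. This is not the project's actual test harness: the function names, the rerecord flag and the file layout are all assumptions made for illustration.

```python
import pathlib
import tempfile


def render(text: str) -> str:
    """Stand in for the code under test (hypothetical)."""
    return text.upper()


def check_against_golden(
    actual: str, golden_path: pathlib.Path, rerecord: bool
) -> None:
    """Compare the output against the golden file, or re-record the file."""
    if rerecord:
        golden_path.parent.mkdir(parents=True, exist_ok=True)
        golden_path.write_text(actual, encoding="utf-8")
        return

    expected = golden_path.read_text(encoding="utf-8")
    if actual != expected:
        raise AssertionError(
            f"The output differs from the golden data in {golden_path}"
        )


with tempfile.TemporaryDirectory() as tmp_dir:
    golden_path = pathlib.Path(tmp_dir) / "some_case" / "expected_output.txt"

    # The first run records the golden data, which is then reviewed by hand.
    check_against_golden(render("hello"), golden_path, rerecord=True)

    # The subsequent runs compare against the recorded data.
    check_against_golden(render("hello"), golden_path, rerecord=False)
    recorded = golden_path.read_text(encoding="utf-8")
```

Re-recording is only safe when the resulting differences are reviewed before they are committed.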
We provide a battery of pre-commit checks to make the code uniform and consistent across the code base.
We use black to format the code and use the default maximum line length of 88 characters.
To run all pre-commit checks, run from the root directory:
python continuous_integration/precommit.py
You can re-format the code and fix certain files automatically with:
python continuous_integration/precommit.py --overwrite
The pre-commit script also runs as part of our continuous integration pipeline.
We follow Chris Beams' guidelines on commit messages:
- Separate subject from body with a blank line
- Limit the subject line to 50 characters
- Capitalize the subject line
- Do not end the subject line with a period
- Use the imperative mood in the subject line, full sentences in the body
- Wrap the body at 72 characters
- Use the body to explain what and why vs. how
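The mechanically checkable rules above can be sketched as a small function. This is a hypothetical helper and not part of the pre-commit script; mood and content of the message still need a human reviewer:

```python
from typing import List


def find_violations(message: str) -> List[str]:
    """List the violations of the mechanically checkable rules."""
    lines = message.splitlines()
    if len(lines) == 0:
        return ["The message is empty."]

    violations = []  # type: List[str]

    subject = lines[0]
    if len(subject) > 50:
        violations.append("The subject line exceeds 50 characters.")

    if not subject[:1].isupper():
        violations.append("The subject line is not capitalized.")

    if subject.endswith("."):
        violations.append("The subject line ends with a period.")

    if len(lines) > 1 and lines[1] != "":
        violations.append("No blank line separates the subject from the body.")

    if any(len(line) > 72 for line in lines[2:]):
        violations.append("A body line exceeds 72 characters.")

    return violations
```

For example, find_violations("fix stuff.") reports the missing capital and the trailing period.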
If you are merging in a pull request, please squash before merging. We want to keep the Git history as simple as possible, and the commits during the development are rarely insightful later.
We frequently need to back-propagate test data from the aas-core-meta repository. To facilitate fetching and re-recording test output whenever the meta-models change, we wrote a couple of scripts in dev_scripts/.
The scripts are hopefully self-explanatory. Please let us know if you need more information so that we can improve this documentation accordingly.