Skip to content

Commit

Permalink
iox-eclipse-iceoryx#136 Address review findings
Browse files Browse the repository at this point in the history
Signed-off-by: Simon Hoinkis <[email protected]>
  • Loading branch information
mossmaurice committed Jan 8, 2021
1 parent a2dca05 commit 73f608c
Show file tree
Hide file tree
Showing 5 changed files with 96 additions and 92 deletions.
58 changes: 30 additions & 28 deletions doc/design/error-handling.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ The logger is responsible for logging information about the state of the system

The logger is thread-safe and can hence be safely used from multiple threads concurrently. Currently the logger is synchronous.

### Log Levels
### Log levels

The following log levels are supported, ordered by the amount of information displayed from highest to lowest.

Expand All @@ -25,7 +25,7 @@ The following log levels are supported, ordered by the amount of information dis

For ``ERR`` and ``FATAL`` see also error levels ``MODERATE``, ``SEVERE`` (logged with ``LogErr``) and ``FATAL`` (logged with ``LogFatal``) in [Error Levels](#Error-Levels). The levels ``ERR`` and ``FATAL`` are only supposed to be used together with the error handler, i.e. need to be accompanied with a corresponding error handler call (currently this cannot be enforced).

## Error Handling
## Error handling

Errors are considered to be system states that should not be reached regularly and usually are the result of an external failure, such as when the OS is unable to provide a certain resource (e.g. a semaphore) or an application does not respond. In contrast, regular behavior such as a receiver receiving no data when none was sent is not an error. On the other hand, losing data that was sent would be considered an error.

Expand Down Expand Up @@ -68,27 +68,27 @@ It is not supposed to be called by applications at any time.
* When the reaction on an error is just logging, computation in other threads shall not be influenced.
* If the error does not require termination, the error handler must return eventually.

#### Error Levels
### Error Levels
The following error levels are supported.

* ``MODERATE``
* ``SEVERE``
* ``FATAL``

##### MODERATE
#### MODERATE
A recoverable error. Leads to an error log entry (``LogErr``) and continues execution.

**Example:**
1) RouDi receives an unexpected message and discards it. The remaining communication proceeds normally.
2) A port requested by an application cannot be provided due to e.g. resource exhaustion.

##### SEVERE
#### SEVERE
RouDi may continue but applications may be compromised or the functionality reduced. Leads to an error log entry (``LogErr``) and assert, terminating execution in debug mode. The handler must return to be able to continue execution in release mode.

**Example:**
A message queue is overflowing and messages are lost. RouDi can continue but lost data may affect applications.

##### FATAL
#### FATAL
RouDi cannot continue and will shut down. Leads to an error log entry (``LogFatal``), assert and calls ``std::terminate``, terminating execution in debug and release mode.
Before calling terminate, a callback is invoked (if configured), which can execute specific error handling code (e.g. call a 3rd party error handler).
The handler is not required to return here (since this may not be always possible or reasonable). The reporting code should still try to proceed to a safe state if possible in order to improve testability in case of such errors.
Expand All @@ -114,7 +114,7 @@ Although Expects end Ensures behave the same, the former is used to signify a pr

Examples include expecting pointers that are not null (as input, intermediate or final result) or range checks of variables.

#### cxx::expected
### cxx::expected

``cxx::expected<T, E>`` is a template which either holds the result of the computation of type ``T`` or an object of error type ``E``. The latter can be used to obtain additional information about the error, e.g. an error code.
In a way this extends error codes and may act as kind of a replacement of exceptions. It is usually used as return type of functions which may fail for various reasons and should be used if the error is supposed to be delegated to the caller and handled in the caller context.
Expand All @@ -125,12 +125,14 @@ If the error cannot be handled at a higher level by the caller, the error handle

Examples include wrapping third party API functions that return error codes or obtaining a value from a container when this can fail for some reason (e.g. container is empty). If no additional information about the error is available or required, ``cxx::optional<T>`` can be used instead.

#### Error Handling in posh
### Error Handling in posh

Error logging shall be done by the logger only, no calls to ``std::cerr`` or similar should be performed.

All the methods presented (``cxx::expected``, ``Expects`` and ``Ensures`` and the error handler) can be used in posh. The appropriate way depends on the type of error scenario (cf. the respective sections for examples). The error handler should be considered the last option.

#### Error Handling in utils
### Error Handling in utils

Error logging is currently done by calls to ``std::cerr``. In the future those might be redirected to the logger.

The error handler cannot be used in utils.
Expand All @@ -143,9 +145,9 @@ It should be noted that since currently Expects and Ensures are active at releas

The Error handler as well as logger shall be able to use or redirect to 3rd party error handling or logging libraries in the future. Currently this is neither fully supported nor used. The error handler has a callback function which can in principle be used to call 3rd party code.

### Usage
## Usage

#### Logger
### Logger
The logger can be used similar to the streams in the C++ standard API.
To select the log level, the corresponding logger has to be used, e.g. ``LogErr``, ``LogWarn`` etc.

Expand All @@ -154,7 +156,7 @@ LogWarn() << "log message " << someValue << "log message continued";
```
A line break is inserted implicitly at the end (after "log message continued" in the example).

#### Error Handler
### Error Handler
The most general use case is the following
```
if(noError()) {
Expand All @@ -175,7 +177,7 @@ If no callback but an error level is desired, a *nullptr* has to be provided for
errorHandler(Error::kSOME_ERROR_CODE, nullptr, ErrorLevel::MODERATE);
```

#### Expects and Ensures
### Expects and Ensures

Assume myAlgorithm is part of an inner API and not supposed to be called with a ``nullptr``. We may have used a reference here, this is just for illustration.
In addition the value pointed to is assumed to be in the range (-1024, 1024). While we could check this every time, this can be avoided if we specify that the caller is responsible to ensure that these conditions hold.
Expand Down Expand Up @@ -203,7 +205,7 @@ int32_t myAlgorithm(int32_t* ptr) {
Note that in the case of ``nullptr`` checks it is also an option to use references in arguments (or ``not_null`` if it is supposed to be stored since references are not copyable). It should be considered that ``not_null`` incurs a runtime cost, which may be undesirable.
When Expects and Ensures are implemented to leave no trace in release mode, we do not incur a runtime cost using them. For this reason, it is advised to use them to document and verify assumptions where appropriate.

#### cxx::expected
### cxx::expected
This example checks the arguments and if they are valid proceeds to compute a result and returns it.
Otherwise it creates an Error object from an errorCode and returns it.

Expand Down Expand Up @@ -245,17 +247,17 @@ auto errorFunc = [](Error& error) {
func(arg).and_then(successFunc).or_else(errorFunc);
```

### Open Points
## Open Points

#### Centralized Error Handling
### Centralized Error Handling

It may be desirable to have a centralized error handling instance where runtime errors on application side are logged and (maybe) handled.
This could also be done in RouDi (by sending information to RouDi), but RouDi already has too much responsibility. Preferably this should be done by a separate application with this sole purpose.
If the application cannot reach the central handler, it shall try to handle the error locally if possible (at least log it).

However, it might be too slow if this would rely on error transmission and responses. If this is to be implemented, the exact mechanism has to be decided on.

#### 3rd Party Error Handling
### 3rd Party Error Handling
We need to decide how to provide an interface for 3rd party error handling, especially for the runtime. This interface will probably rely on hooks/callbacks. The signature and callsites of these need to be discussed.
This is related to centralized error handling as well.

Expand All @@ -267,51 +269,51 @@ time with no or few runtime artifacts). It could be an option to e.g. disable al

This is also related to the hooks for 3rd party error handling we may want to provide.

#### Return in Case of Fatal Error
### Return in Case of Fatal Error
The reporting code does not need to be able to continue properly in case of a fatal error, but there needs to be a return after the error handler call. While the error handler is not required to return, it still might under certain circumstances (e.g. a mock error handler in a test case).

The (complete) intended behavior of the error handler requires some further clarification, especially in the case of fatal errors. In the case of non-fatal errors the code invoking the error handler must be able to continue after the error handler returns.

#### Error Handling vs. Logging
### Error Handling vs. Logging
Does it make sense to have ``LogErr`` without an error handler call? If an error occurs it should probably be enforced that the handler is called and not just lead to a log entry.
One reason for this is that currently it is not possible to provide an additional error message to the error handler.

#### Additional Error Information
### Additional Error Information
It would be desirable to allow the possibility to provide additional messages (or even general functions/arguments) to the error handler, which is currently missing.
This can be combined with the addition of error location to the error handler.

An optional stack-trace (at least in debug mode) may also prove very useful.
What is needed to have a limited stack-trace even in release mode?

#### Debug vs. release mode
### Debug vs. release mode
We need to further clarify behavior in release and debug mode of the error handler and ``Expects`` and ``Ensures`` (and maybe the logger as well). Can we have a release build with additional information? (e.g. symbols for a stack-trace).

#### Assert
### Assert
Do we want an Assert in addition to ``Expects`` and ``Ensures``? If so, shall it possibly be active in release mode or only debug mode?

In principle with a sufficiently powerful Assert or ``Expects`` (resp. ``Ensures``), this should not be needed (they are equivalent in their functionality).

#### Errors in utils
### Errors in utils
Currently there are a few occurrences in utils where terminate is called directly in case of an error. We need to evaluate whether it is possible to replace them all with assert-like constructs such as ``Expects``, ``Ensures`` or ``assert`` or something else.

### Future improvements
## Future improvements
In this section we briefly describe ways to potentially improve or extend functionality of existing error handling components. This list is by no means exhaustive and all suggestions are up for discussion and may be refined further.

#### Logger
### Logger
1. The logger could be extended to include logging over a network to a remote server or similar.
2. Support asynchronous logging.

#### Error Handler
### Error Handler
1. Allow customization for ``MODERATE`` and ``SEVERE`` errors to continue according to a user defined configuration.
2. Add file, line and function information (using ``__FILE__``, ``__LINE__`` and ``__func__``). This would require using macros for the error handler call in a future implementation.
3. Allow generalized callbacks with variadic arguments.
4. Change the order of arguments in a future design (callback and additional arguments last). Providing the callback and potential arguments can be made fully optional this way.
5. If deactivation or reduced operation (e.g. not handling ``MODERATE`` errors) is desired, this partial deactivation should cause no (or at least very little) runtime overhead in the deactivated cases.

#### Expects and Ensures
### Expects and Ensures
Allow deactivation in release mode, but it should still be possible to leave them active in release mode as well if desired. Deactivation in debug mode can also be considered but is less critical. Deactivation should eliminate all runtime overhead (i.e. condition evaluation).

#### cxx::expected
### cxx::expected
1. Consider renaming ``cxx::expected`` to ``cxx::result``, which is more in line with languages such as Rust and conveys the meaning more clearly.
2. Add ``has_value`` (in addition to ``has_error``) for consistency with ``cxx::optional``.
3. Improve the monadic error handling (beyond ``and_then``, ``or_else``) to allow for better pipelining of multiple consecutive calls (especially in the success case). This requires careful consideration of supported use cases and intended behavior but can reduce control flow code (``if ... else ...``) in error cases considerably.
31 changes: 0 additions & 31 deletions doc/website/advanced/installation-for-developers.md

This file was deleted.

65 changes: 65 additions & 0 deletions doc/website/advanced/installation-guide-for-contributors.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
# Installation guide for contributors

## Build and run tests

While developing on iceoryx you want to know if your changes are breaking existing functions or if your newly written googletests are passing.
For that purpose we are generating cmake targets which are executing the tests. But first we need to build them:
```
cmake -Bbuild -Hiceoryx_meta -DBUILD_TEST=ON
cmake --build build
```
Cmake is automatically installing googletest as dependency and build the tests against it. Please note that if you want to build tests for extensions like the C-Binding you need to enable that in the cmake build. To build all tests simply add `-DBUILD_ALL` to the cmake command

Now lets execute the all tests:
```
cd iceoryx/build
make all_tests
```
Some of the tests are timing critical and needs a stable environment. We call them timing tests and have them in a separate target available:
```
make timing_tests
```
In iceoryx we distinguish between different testlevels. The most important are: Moduletests and Integrationtests.
Moduletests are basically Unit-tests where the focus is on class level with black-box testing.
In Integrationtests are multiple classes within one component (e.g. iceoryx_posh) tested together.
The sourcecode of the tests is placed into the folder `test` within the different iceoryx components. You can find there at least a folder `moduletests` and sometimes ``integrationtests`.

when you now want to create a new test you can place the sourcefile directly into the right folder. Cmake will automatically detect the new file when doing a clean build and will add it to a executable. There is no need to add a gtest main function because we already provide it.
For every test level are executables created, for example `posh_moduletests`. They are placed into the corresponding build folder (e.g. `iceoryx/build/posh/test/posh_moduletests`).

If you want to execute only individual testcases then you can use these executables and a gtest filter. Let's assume you want to execute only the `ServiceDescription_test` from the posh_moduletests, then you can do the following:
```
./build/posh/test/posh_moduletests --gtest_filter="ServiceDescription_test*"
```

## Use Sanitizer Scan

Due to the fact that iceoryx works a lot with system memory it should be ensured that errors like memory leaks are not introduced.
To prevent these, we use the clang toolchain which offers several tools for scanning the codebase. One of them is the [Address-Sanitizer](https://clang.llvm.org/docs/AddressSanitizer.html) which checks for example on dangling pointers.

In iceoryx below sanitizers are enabled at the moment.
- [AddressSanitizer](https://clang.llvm.org/docs/AddressSanitizer.html)
AddressSanitizer is a fast memory error detector.
**NOTE :** AddressSanitizer exits on the first detected error, which means there could be more errors in the codebase when this error is reported.
- [LeakSanitizer](https://clang.llvm.org/docs/LeakSanitizer.html)
LeakSanitizer is a run-time memory leak detector. In iceoryx , it runs as part of the AdderssSanitizer.
- [UndefinedBehaviorSanitizer](https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html)
UndefinedBehaviorSanitizer (UBSan) is a fast undefined behavior detector. Iceoryx uses default behaviour ie `print a verbose error report and continue execution`

In iceoryx are scripts available to do the scan on your own. Additionally the Scans are running on the CI in every Pull-Request.
As Requirement you should install the clang compiler:
```
sudo apt install clang
```

Then you need to compile the iceoryx with the sanitizer flags:
```
./tools/iceoryx_build_test.sh build-strict build-all sanitize clang clean
```
After that we can run the tests with enabled sanitizer options:
```
cd build
../tools/run_all_tests.sh
```
When the tests are running without errors then it is fine, else an error report is shown with a stacktrace to find the place where the leak occurs. If the leak comes from an external dependency or shall be handled later then it is possible to set a function on a suppression list.
This should be only rarely used and only in coordination with an iceoryx maintainer.
Loading

0 comments on commit 73f608c

Please sign in to comment.