Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SerialCommHub: several improvements #731

Merged
merged 10 commits into from
Jul 8, 2024

Conversation

mhei
Copy link
Contributor

@mhei mhei commented Jun 18, 2024

Describe your changes

This PR accumulates several minor fixes and some improvements:

  • the currently internal retry handling is made public and configurable
  • the internal error handling is refactored so that logging can be improved and managed centrally
  • some additional minor stuff

Issue ticket number and link

This PR is considered a fix for #523, in combination with #730.

Checklist before requesting a review

  • I have performed a self-review of my code
  • I have made corresponding changes to the documentation
  • I read the contribution documentation and made sure that my changes meet its requirements
  • chargebyte internal review approved this PR

@mhei mhei force-pushed the cb-serialcommhub-improvements branch from 5ed073b to a3e5cb4 Compare June 18, 2024 20:11
@barsnick
Copy link
Contributor

@hikinggrass, @corneliusclaussen, @Pietfried
Ping for review (pre-release).

@barsnick barsnick force-pushed the cb-serialcommhub-improvements branch from a3e5cb4 to 796450d Compare June 21, 2024 12:00
Copy link
Contributor

@corneliusclaussen corneliusclaussen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See uncaught exception, the rest is just my taste.


namespace tiny_modbus {
std::string FunctionCode_to_string_with_hex(FunctionCode fc) {
std::stringstream ss;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could probably be written more readable with fmt::format, but just my taste ;-)

try {
response = modbus.txrx(device_address, function, first_register_address, register_quantity,
config.max_packet_size, wait_for_reply, request);
} catch (tiny_modbus::TinyModbusException& e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be const

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix.

modules/SerialCommHub/tiny_modbus_rtu.cpp Show resolved Hide resolved
@barsnick
Copy link
Contributor

I have added the fixes. I used four separate commits, for easier review, and so that they can be squashed with their respective commits (and the whole bunch will be squashed anyway later).

@barsnick barsnick force-pushed the cb-serialcommhub-improvements branch from e8bb53f to d686fee Compare June 25, 2024 15:52
@mhei
Copy link
Contributor Author

mhei commented Jun 25, 2024

Regarding the uncaught exception: usually this should never trigger, since the if-constructs a few lines above already cover it and in case the exceptions is raised, then the code review process failed. That said, adding UnimplementedFunction as child of TinyModbusException is the wrong way to go in my eyes because it is a complete different error domain and has nothing in common with the other TinyModbusException exceptions. I see two options: either just drop raising an exception here, or catch this specific exception (std::logic_error), too - in this case, spamming the log with messages would be fine with me 😇.

Regarding failed write: since this indicates major system trouble, it should be caught on the highest level possible and in turn force a shutdown (as clean as possible in this case) and reboot of the whole system. But according to my understanding, this still requires more infrastructure around and parts are already in progress.

Just my 2 cts, back in holiday mode again 🌞.

@corneliusclaussen
Copy link
Contributor

If it never triggers, we don't need to throw it ;-) The problem that we need to change our minds a bit how to handle errors in modules in general (not just here). If a module cannot perform its work anymore (e.g. because the modbus device disappeared as someone unplugged the USB dongle), it should go into an error mode but it should never kill the process. I think the clean solution here would be what we start using in other modules: Define an error (e.g. CommunicationFault to align with other modules) to indicate to the modules that are using modbus that the connection is not available. All calls will then return with a failure.
Other modules using the modbus will also need to go into a fault mode etc.
This way, we have proper error reporting through ErrorHistory Module, OCPP etc and we do not kill everything.

There are many things that are still functional even if one device cannot be reached on the modbus anymore: E.g. the charger may be configured to still charge for free etc and cloud communication still works and reports that error properly instead of getting stuck in a reboot loop.

So I disagree that this always needs to result in a system reboot.

@corneliusclaussen
Copy link
Contributor

A better way of handling full system reboots could for example be a separate module that observes the errors of all modules and if they do not self recover (which all modules should try as best as they can) e.g. within 24h or so it reboots. This way it could also do this when e.g. no charging session is active and the behaviour could be configured (e.g. to select which errors/modules are uncritical and should be ignored in the actual setup

@barsnick
Copy link
Contributor

barsnick commented Jul 2, 2024

I think the clean solution here would be what we start using in other modules: Define an error (e.g. CommunicationFault to align with other modules) to indicate to the modules that are using modbus that the connection is not available. All calls will then return with a failure.

This requires somewhat of an infrastructure though, including interfaces like in your PR #724 (why was that closed), and handling in EvseManager, right? I guess this is something which goes beyond this PR right now, even though that is some of the thought behind it.

@corneliusclaussen
Copy link
Contributor

I think the clean solution here would be what we start using in other modules: Define an error (e.g. CommunicationFault to align with other modules) to indicate to the modules that are using modbus that the connection is not available. All calls will then return with a failure.

This requires somewhat of an infrastructure though, including interfaces like in your PR #724 (why was that closed), and handling in EvseManager, right? I guess this is something which goes beyond this PR right now, even though that is some of the thought behind it.

Yes, this is not for this PR

throw IncompletePacketException("Result data not completely in received message.");
}

// check if result is completely in received data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is a copy paste error

}

TinyModbusRTU::~TinyModbusRTU() {
if (fd)
if (fd != -1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are bunch of cases in this file where we still check if fd <= 0 . can you make them consistent?

@barsnick barsnick force-pushed the cb-serialcommhub-improvements branch 2 times, most recently from 95990bb to 9b7db02 Compare July 4, 2024 14:18
@barsnick
Copy link
Contributor

barsnick commented Jul 4, 2024

I resolved most comments, squashed the fix-ups with their respective commits, and rebased to current main HEAD.

I will look at the byte count tomorrow.

mhei and others added 9 commits July 5, 2024 10:18
At the moment, an automatic retry up to 2 times is done in case the
Modbus operation fails. With some devices or in same use-cases, this
is not desired, so let us make the retry count configurable via
manifest.

So mimic the current behavior and to retain backwards compatibility,
let's use a default value of 2.

While at, we slightly change the debug messages and unify the
loop counting code.

Signed-off-by: Michael Heimpold <[email protected]>
open returns -1 on error, any other value has to be considered a
valid fd. So we have to initialize with -1 instead of zero
(which is stdin usually).
This must also be considered in the destructor to not close stdin
by accident.

Signed-off-by: Michael Heimpold <[email protected]>
When devices do not respond or some other problems occur,
then the the log output might be flooded with messages.

To have some more central control over the log output, let's
refactor the module:
- introduce various exceptions and use then in places where
  previously the log output was directly created
- introduce a common helper function which performs the
  Modbus function which also increases code sharing between
  different high-level callbacks
- emit only a warning in case the last trial of all configured
  retries fails finally (so in case the call can be completed
  successfully within the configure retries, only debug
  messages are created, but no usual user visible output)

Signed-off-by: Michael Heimpold <[email protected]>
For registers, the byte_cnt should always be even.
But for coils/inputs, the byte_cnt could be odd, so let's
cover this by ensuring that we round upwards.

Signed-off-by: Michael Heimpold <[email protected]>
Since Modbus functions which operates on registers (which are defined
as 16-bit wide) are expected to return full registers, we can and
should check the received response for this.

Signed-off-by: Michael Heimpold <[email protected]>
Signed-off-by: Moritz Barsnick <[email protected]>
Ignoring the return code of write results in a compiler warning
as theoretically the write could return without having written
the complete packet. So handle it accordingly.

In case of error, we can throw an exception.

Signed-off-by: Michael Heimpold <[email protected]>
This is assumed to be fatal and non-recoverable, therefore logging is
protected against flooding. A code path for possible recovery is
implemented nevertheless.

This should be replaced by a proper error reporting and handling solution,
yet the error interfaces and handling for this are not in place yet.

Signed-off-by: Moritz Barsnick <[email protected]>
When converting a reply with an odd number of bytes of data ("read coils" or
"read discrete inputs"), the byte-to-word conversion was reading beyond the
end of the response message buffer.

This is fixed by checking for the number of remaining bytes, and copying
only what is left.

Signed-off-by: Moritz Barsnick <[email protected]>
@barsnick barsnick force-pushed the cb-serialcommhub-improvements branch from f831bc1 to fd2c256 Compare July 5, 2024 08:20
@barsnick
Copy link
Contributor

barsnick commented Jul 5, 2024

I believe all issues are resolved with my latest commits.
I fixed the reading beyond the end of the buffer, and rebased on top of current main HEAD.

Copy link
Contributor

@corneliusclaussen corneliusclaussen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now!

@barsnick barsnick merged commit bae72ef into EVerest:main Jul 8, 2024
7 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants