ssh connection troubleshooting #8718

alexandrejbr · 2024-08-13T15:50:39Z

Is your feature request related to a problem? Please describe.
For troubleshooting purposes, we would like to have more detailed information about the setup of an SSH connection, for instance the algorithms proposed by each role and the authentication methods. This would be useful both for connections that are established successfully and for connections that fail to be established.

What we have in mind is to be able to debug a connection like one would do in openssh with the verbose modes. Since that is perhaps quite ambitious, starting with what each role proposes for the connection and information about the authentication attempts would already be great.

Describe the solution you'd like
It's hard to imagine what would be the best way to obtain this information, but I imagine that a callback could work for both roles, even though for the client role the error would come from the result of ssh:connect and then a callback function would be called as well.

Logging would also work, but is perhaps a bit less flexible.

Do you think it would be interesting to have such troubleshooting capabilities in the ssh application?

u3s · 2024-08-27T07:40:48Z

have you seen ssh_dbg module? it is pretty powerful.

for seeing result of algorithm negotiation I would use:
ssh:start(), ssh_dbg:on([alg]).

and for also getting SSH messages leading to negotiation result:
ssh:start(), ssh_dbg:on([alg, ssh_messages]).

to get all debug:
ssh:start(), ssh_dbg:on().

apply above before establishing connection. should work for both roles.
it is based on tracing feature.

alexandrejbr · 2024-08-27T08:33:19Z

I'll have a look and see if it will work for our use case. What we want is that, every time an SSH connection is established, this information is stored/logged so that we can investigate afterwards why a connection could not be established.

If we use the traces, I imagine we would need to activate this, capture the traces we are interested in, and later on disable the tracing. Could work.

u3s · 2024-08-27T08:54:30Z

you disable it with ssh_dbg:off().

it might be slightly challenging to predict which connection will fail ... traces and production systems are tricky to combine I'm afraid.

ssh_dbg is not a documented feature, so you would need to read the source to understand how it works.
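A minimal sketch of the activate, capture, disable cycle discussed above, written as Erlang shell expressions. It uses only the ssh_dbg:on/1 and ssh_dbg:off/0 calls mentioned in this thread plus ssh:connect/4; the host name, port, options and timeout are placeholders, and since ssh_dbg is undocumented its behaviour may differ between OTP releases.

```erlang
%% Sketch only: "example.invalid", port 22 and the options are assumptions.
ssh:start(),
ssh_dbg:on([alg]),                       % trace algorithm negotiation only
Result =
    try
        ssh:connect("example.invalid", 22,
                    [{silently_accept_hosts, true},
                     {user_interaction, false}],
                    5000)                % negotiation timeout in ms
    after
        ssh_dbg:off()                    % always switch tracing off again
    end.
```

The try ... after wrapper is there only so that tracing is switched off even if the connect call crashes; Result holds whatever ssh:connect/4 returned.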

u3s · 2024-09-24T18:08:10Z

ping

alexandrejbr · 2024-09-25T07:54:50Z

I think I agree with you: while the ssh_dbg module is extremely powerful, it seems to me that building features on top of it can be a bit of a hack.

If one could get similar information in callbacks or some other stream of events it would be great.

Being pragmatic, for the client role we already have some information from the connection info, but that is only for successful connections. If one could get more information for failed connections, and a way to correlate it with the ssh:connect call, that would be great.

For the server role it would be interesting to have the same capabilities: to know that a connection was established and on which terms, and if it ends, why it ended. The same goes for connection attempts that were made but failed: why did they fail to establish the connection?

u3s · 2024-09-25T10:00:34Z

> If one could get similar information in callbacks or some other stream of events it would be great.

There was a discussion about a generic solution for tracing of OTP apps (@rickard-green), maybe even an OTP behavior.
Unfortunately it has a lower backlog priority and no details have been agreed at this stage.

I guess your thoughts lean towards a possibility for subscribing to troubleshooting events, which you could then process in whatever way you like or store it for later.

alexandrejbr · 2024-09-25T11:03:31Z

> I guess your thoughts lean towards a possibility for subscribing to troubleshooting events, which you could then process in whatever way you like or store it for later.

Yes, I think you phrased it quite well. I think this could replace the existing logs, and if one prefers to have the logs instead, we could have a built-in subscriber of these events that translates them into info-level logs.

u3s · 2024-10-11T08:59:21Z

What about these callbacks that are already present?

https://www.erlang.org/doc/apps/ssh/ssh.html#t:callbacks_daemon_options/0
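A hedged sketch of how those daemon callbacks could be wired to the logger, assuming the documented connectfun, failfun and disconnectfun daemon options with the arities shown. The port number and system_dir path are placeholders, and the log messages are illustrative only.

```erlang
%% Sketch only: port 2222 and "/etc/ssh" are assumptions.
DaemonOpts =
    [{system_dir, "/etc/ssh"},
     %% Called after a user has authenticated successfully.
     {connectfun,
      fun(User, Peer, Method) ->
              logger:info("ssh login ok: user=~p peer=~p method=~p",
                          [User, Peer, Method])
      end},
     %% Called when a user fails to authenticate.
     {failfun,
      fun(User, Peer, Reason) ->
              logger:warning("ssh login failed: user=~p peer=~p reason=~p",
                             [User, Peer, Reason])
      end},
     %% Called when the peer disconnects.
     {disconnectfun,
      fun(Reason) ->
              logger:info("ssh disconnected: reason=~p", [Reason])
      end}],
{ok, _Daemon} = ssh:daemon(2222, DaemonOpts).
```

Note that none of these callbacks receives the negotiated algorithms or a connection reference.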

u3s · 2024-11-25T09:30:05Z

ping?

alexandrejbr · 2024-11-25T12:14:04Z

@u3s I have not been so active because I wanted to come up with a more concrete proposal, but maybe we can keep the discussion at a higher level and talk about the problems before we reach that state.

We use both the ssh client and server and build on top of them, so our use case for debugging an ssh connection is not that we are going to open an Erlang shell to do the debugging. It's more that problems occurred and we want to look at what happened and why after the fact.

I can give you one example of a situation where the error return loses the debug information and that information only goes to the logs: ssh_transport:handle_kexinit_msg. In the client role the user code gets just the default text for the error.

Probably, if the user could supply a custom log function and we placed the error code and the extra information in a proplist, that would already be extremely good as far as error scenarios are concerned.

Then, in cases where all is good with a connection, there may be some interest in knowing what was actually negotiated. For the client role one can use the ssh:connection_info function; for the server role there's no way unless we use tracing or the ssh_dbg module, isn't that right?

In a nutshell, I think one part of this feature request is a programmatic way of customising the logs in case of error, and the other part is being able to do some introspection on connections that were successfully established. Maybe the connectfun (daemon callbacks) could have the connection "handler"/pid so we can call connection_info on it, or just have the connection information there as a parameter of the callback.

Do you think any of these are something you would consider having?
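For reference, a small sketch of the client-side introspection mentioned above, assuming ssh:connection_info/2 accepts a list of keys such as algorithms, peer and user. The host, port and options are placeholders.

```erlang
%% Sketch only: host, port and options are assumptions.
{ok, Conn} = ssh:connect("example.invalid", 22,
                         [{silently_accept_hosts, true},
                          {user_interaction, false}]),
%% Ask an established connection what was actually negotiated.
Info = ssh:connection_info(Conn, [algorithms, peer, user]),
logger:info("ssh connection established: ~p", [Info]),
ssh:close(Conn).
```

This only works once ssh:connect has returned {ok, Conn}, which is why it does not help with failed connection attempts.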

u3s · 2024-11-27T07:12:11Z

> Do you think any of these are something you would consider having?

Your argumentation sounds convincing to me.
Unfortunately this has to wait until I have more time to investigate it further.

alexandrejbr · 2024-11-27T13:09:54Z

@u3s would it help if we did a draft PR? Something incomplete, but just enough to make it easier for you to reflect on it.

u3s · 2024-11-27T13:12:37Z

sure. if you have some concept code, please share it as a draft.

alexandrejbr added the enhancement label Aug 13, 2024

IngelaAndin added the team:PS Assigned to OTP team PS label Aug 14, 2024

u3s self-assigned this Aug 27, 2024
