ssh connection troubleshooting #8718

Open
alexandrejbr opened this issue Aug 13, 2024 · 13 comments
Labels: enhancement, team:PS (Assigned to OTP team PS)

Comments

@alexandrejbr

Is your feature request related to a problem? Please describe.
For troubleshooting purposes, we would like more detailed information about the setup of an SSH connection, for instance the algorithms proposed by each role and the authentication methods. This would be useful both for connections that are established successfully and for connections that fail to be established.

What we have in mind is being able to debug a connection as one would in OpenSSH with the verbose modes. Since that is perhaps quite ambitious, starting with what each role proposes for the connection and information about the authentication attempts would already be great.
Describe the solution you'd like
It's hard to say what the best way to obtain this information would be, but I imagine that a callback could work for both roles, even though for the client role the error would come from the result of ssh:connect and a callback function would be called as well.

Logging also works, but is perhaps a bit less flexible.

Do you think it would be interesting to have such troubleshooting capabilities in the ssh application?

@IngelaAndin IngelaAndin added the team:PS Assigned to OTP team PS label Aug 14, 2024
@u3s u3s self-assigned this Aug 27, 2024
@u3s
Contributor

u3s commented Aug 27, 2024

Have you seen the ssh_dbg module? It is pretty powerful.

for seeing result of algorithm negotiation I would use:
ssh:start(), ssh_dbg:on([alg]).

and for also getting SSH messages leading to negotiation result:
ssh:start(), ssh_dbg:on([alg, ssh_messages]).

to get all debug:
ssh:start(), ssh_dbg:on().

Apply the above before establishing the connection. It should work for both roles.
It is based on the tracing feature.

@alexandrejbr
Author

I'll have a look and see if it will work for our use case. What we want is that every time an SSH connection is established, this information is stored/logged so that we can afterwards investigate why a connection could not be established.

If we use the traces, I imagine we would need to activate tracing, capture the traces we are interested in, and later disable it. Could work.

@u3s
Contributor

u3s commented Aug 27, 2024

You disable it with ssh_dbg:off().

It might be slightly challenging to predict which connection will fail ... traces and production systems are tricky to combine, I'm afraid.

ssh_dbg is not a documented feature, so you would need to read the source to understand how it works.
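
A minimal sketch of the enable, connect, disable cycle discussed above, wrapped in a function so that tracing is switched off even if the connection attempt fails. The module name `ssh_dbg_sketch`, the function `traced_connect/3`, and the 5000 ms negotiation timeout are assumptions for illustration; since ssh_dbg is undocumented, its trace points and return values may differ between OTP releases.

```erlang
-module(ssh_dbg_sketch).
-export([traced_connect/3]).

%% Hypothetical helper: connect with ssh_dbg tracing enabled.
%% Host, Port and Opts are passed through to ssh:connect/4 unchanged.
traced_connect(Host, Port, Opts) ->
    ok = ssh:start(),
    %% Enable the trace points mentioned in this thread before connecting.
    ssh_dbg:on([alg, ssh_messages]),
    try
        %% Returns {ok, ConnRef} or {error, Reason}; trace output is
        %% printed in both cases.
        ssh:connect(Host, Port, Opts, 5000)
    after
        %% Always disable tracing again, also on failure.
        ssh_dbg:off()
    end.
```

This only helps when it is known in advance which connection to trace, which is the limitation raised above for production systems.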

@u3s
Contributor

u3s commented Sep 24, 2024

ping

@alexandrejbr
Author

I think I agree with you. While the ssh_dbg module is extremely powerful, it seems to me that building features on top of it can be a bit of a hack.

If one could get similar information in callbacks or some other stream of events it would be great.

Being pragmatic, for the client role we already have some information from the connection info, but only for successful connections. If for failed connections one could get more information, and a way to correlate it with the ssh:connect call, that would be great.

For the server role it would be interesting to have the same capabilities: to know that a connection was established and on which terms, and, if it ends, why it ended. The same goes for connection attempts that failed: why did they fail to establish the connection?

@u3s
Contributor

u3s commented Sep 25, 2024

If one could get similar information in callbacks or some other stream of events it would be great.

There was a discussion about a generic solution for tracing of OTP apps (@rickard-green), maybe even an OTP behavior.
Unfortunately it has lower backlog priority and no details have been agreed at this stage.

I guess your thoughts lean towards a possibility for subscribing to troubleshooting events, which you could then process in whatever way you like or store it for later.

@alexandrejbr
Author

I guess your thoughts lean towards a possibility for subscribing to troubleshooting events, which you could then process in whatever way you like or store it for later.

Yes, I think you phrased it quite well. I think this could replace the existing logs, and if one prefers to have the logs instead, we could have a built-in subscriber of these events that translates them into info-level logs.

@u3s
Contributor

u3s commented Oct 11, 2024

What about these callbacks that are already present?

https://www.erlang.org/doc/apps/ssh/ssh.html#t:callbacks_daemon_options/0
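
As a rough illustration of what the existing callbacks can provide on the server side, the sketch below logs authentication outcomes and disconnects through the `connectfun`, `failfun`, and `disconnectfun` options. The module name, the log messages, and the `system_dir` path are placeholders; the callback arities follow the linked documentation as I understand it and should be checked against the OTP release in use.

```erlang
-module(ssh_daemon_log_sketch).
-export([start/1, daemon_opts/0]).

%% Daemon options that log login successes, login failures and disconnects.
daemon_opts() ->
    [{connectfun,
      fun(User, Peer, Method) ->
              logger:info("ssh login ok: user=~p peer=~p method=~p",
                          [User, Peer, Method])
      end},
     {failfun,
      fun(User, Peer, Reason) ->
              logger:warning("ssh login failed: user=~p peer=~p reason=~p",
                             [User, Peer, Reason])
      end},
     {disconnectfun,
      fun(Reason) ->
              logger:info("ssh disconnected: ~p", [Reason])
      end}].

%% Start a daemon on Port with the logging callbacks installed.
%% The system_dir value is a placeholder for the host key directory.
start(Port) ->
    ok = ssh:start(),
    ssh:daemon(Port, [{system_dir, "/path/to/host_keys"} | daemon_opts()]).
```

These callbacks report who authenticated and how, but they do not hand over the negotiated algorithms or a connection reference.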

@u3s
Contributor

u3s commented Nov 25, 2024

ping?

@alexandrejbr
Author

@u3s I have not been so active because I wanted to come up with a more concrete proposal, but maybe we can keep the discussion a bit more high level and discuss the problems before we reach that stage.

We use both the ssh client and server and build on top of them, so our use case for debugging an ssh connection is not one where we open an Erlang shell to do the debugging. It's more that problems occurred and we want to look at what happened and why after the fact.

I can give you an example of a situation where the error return loses the debug information and it goes only to the logs: ssh_transport:handle_kexinit_msg. In the client role the user code gets just the default text for the error.

Probably, if the user could supply a custom log function and we placed the error code and the extra information in a proplist, that would already be extremely good as far as error scenarios go.
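
To make the idea concrete, here is a purely hypothetical sketch of what such a user-supplied function could look like. The ssh application does not pass a proplist like this today; the keys `code` and `details`, the module name, and the function name are all invented for illustration.

```erlang
-module(ssh_log_fun_sketch).
-export([format_failure/1]).

%% Hypothetical: formats a failure proplist of the shape
%% [{code, integer()}, {details, string()}] into a flat string.
%% Neither the keys nor the callback exist in the ssh application.
format_failure(Info) ->
    Code = proplists:get_value(code, Info, undefined),
    Details = proplists:get_value(details, Info, "no details"),
    lists:flatten(
      io_lib:format("ssh failure code=~p details=~s", [Code, Details])).
```

A function of this shape could write to a log, store the data for later inspection, or forward it to a monitoring system, which is the flexibility being asked for.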

Then, in cases where all is good with a connection, there may be some interest in knowing what was actually negotiated. For the client role one can use the ssh:connection_info function; for the server role there's no way unless we use tracing or the ssh_dbg module, isn't that right?
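
For reference, the client-role introspection mentioned here could look roughly like the following. The module and function names are assumptions, and `ConnRef` is the connection reference returned by a successful ssh:connect call.

```erlang
-module(ssh_conn_info_sketch).
-export([log_negotiated/1]).

%% Log what was negotiated for an established client connection.
%% ssh:connection_info/2 returns a list of {Key, Value} tuples for the
%% requested keys.
log_negotiated(ConnRef) ->
    Info = ssh:connection_info(ConnRef,
                               [peer, client_version, server_version,
                                algorithms]),
    logger:info("ssh connection established: ~p", [Info]),
    Info.
```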

In a nutshell, I think one part of this feature request is a programmatic way of customising the logs in case of error, and the other part is being able to do some introspection on connections that were successfully established. Maybe the connectfun (daemon callbacks) could receive the connection "handler"/pid so we can call connection_info on it, or just have the connection information as a parameter of the callback.

Do you think any of these are something you would consider having?

@u3s
Contributor

u3s commented Nov 27, 2024

Do you think any of these are something you would consider having?

Your argumentation sounds convincing to me.
Unfortunately this has to wait until I have more time to investigate it further.

@alexandrejbr
Author

@u3s would it help if we did a draft PR? Something incomplete, but just enough to make it easier for you to reflect on it.

@u3s
Contributor

u3s commented Nov 27, 2024

Sure. If you have concept code, please share it as a draft.
