Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No way to have SFTP connections load_host_keys() via transport_params #715

Open
3 tasks done
Kache opened this issue Aug 16, 2022 · 4 comments
Open
3 tasks done

No way to have SFTP connections load_host_keys() via transport_params #715

Kache opened this issue Aug 16, 2022 · 4 comments

Comments

@Kache
Copy link
Contributor

Kache commented Aug 16, 2022

Problem description

While currently system host keys are loaded: https://github.com/RaRe-Technologies/smart_open/blob/v5.2.1/smart_open/ssh.py#L91

There's currently no way to load_host_keys() for verifying the host. It could be added via transport_params at paramiko client instantiation. For example, something like:

        ssh = _SSH[key] = paramiko.client.SSHClient()
        ssh.load_system_host_keys()
        if 'load_host_keys' in transport_params:
            ssh.load_host_keys(transport_params['load_host_keys'])
        ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())

Steps/code to reproduce the problem

Currently unable to verify host key using a local known_hosts file.

Versions

Darwin-20.6.0-x86_64-i386-64bit
Python 3.7.6 (default, Nov 24 2021, 00:59:23)
[Clang 13.0.0 (clang-1300.0.29.3)]
smart_open 5.2.1

Checklist

Before you create the issue, please make sure you have:

  • Described the problem clearly
  • Provided a minimal reproducible example, including any required data
  • Provided the version numbers of the relevant software
@Kache
Copy link
Contributor Author

Kache commented Aug 16, 2022

If you're open to accepting a PR, I'd be willing to create a PR based on the example above.

@mpenkov
Copy link
Collaborator

mpenkov commented Aug 21, 2022

if 'load_host_keys' in transport_params:
    ssh.load_host_keys(transport_params['load_host_keys'])

Can't the user do this prior to the smart_open.open call?

@Kache
Copy link
Contributor Author

Kache commented Aug 22, 2022

Yes, but notice the first line in my example -- that solution involves accessing smart_open's private cache ,_SSH, to "grab" the instance of the paramiko client, which is not a good software engineering practice.

smart_open does not normally (and rightfully, IMO) expose the underlying paramiko client to the user

@mpenkov
Copy link
Collaborator

mpenkov commented Aug 23, 2022

I wonder if there's a better way to do this without having smart_open know all these paramiko details. I don't want to handle more transport parameters than absolutely necessary.

How about:

def ssh_client_init():  # user's code
    client = paramiko.client.SSHClient()
    # additional ssh config goes here
    return client

transport_params = {'ssh_client_init': ssh_client_init}
with smart_open(url, 'rb', transport_params=transport_params) as fin:
    ...

I think it's better to pass a callable instead of the client itself because we can use the callable to create a new client whenever we get disconnected.

If there is no callable passed, then we can use the default client settings, e.g. what is currently being done.

Yet another way is to expose the underlying client. I'm not opposed to that idea, either. Hiding implementation details is a good thing in general, but here it's getting in the way of the user achieving what they want, so it isn't something we have to strictly stick to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants