Handling of local encoding in gesftpserver #5
Comments
This sounds like a bug in sshd or its configuration to me. LC_CTYPE is how Unix programs expect to determine the encoding of all text including filenames, so in general we expect callers to set it appropriately.
Unix programs are free to use whatever encoding they want in filenames, regardless of LC_CTYPE. As well, different users may set LC_CTYPE or LANG in their shell startup scripts, and sshd has no way of knowing that when it starts gesftpserver. I guess it would be nice to have some better error handling here if the filenames cannot be decoded. Right now, the gesftpserver daemon just aborts.
What can I say? Applications have to make some decision about filename encoding. "Assume UTF-8" is indeed one possible policy, but I'm not ready to desupport users of other encodings, and the most widespread other approach I've ever found anything using is to honor LC_CTYPE. So that's the policy adopted here, and I don't see any need to change it. I'll look into improving the error behavior, however; you're right that terminating the server is unfriendly.
Sure, it makes sense to use LC_CTYPE if it is set. It just sucks to use ASCII if it is unset. Maybe we could default to UTF-8 in that case? For example:
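A minimal sketch of such a fallback in C, offered as an illustration rather than as the snippet originally posted here. The names `env_set` and `local_charset` are hypothetical and are not gesftpserver functions. Falling back to UTF-8 when the requested locale is not installed is an extra assumption on top of the proposal.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: true if the variable is set to a non-empty value. */
static int env_set(const char *name) {
  const char *v = getenv(name);
  return v != NULL && v[0] != '\0';
}

/* Hypothetical helper: name of the charset to treat as the "local encoding".
 * Honors the locale environment when present, otherwise assumes UTF-8
 * instead of the ASCII-only "C" locale. */
const char *local_charset(void) {
  if (!env_set("LC_ALL") && !env_set("LC_CTYPE") && !env_set("LANG"))
    return "UTF-8"; /* nothing configured: default to UTF-8 */
  if (setlocale(LC_CTYPE, "") == NULL)
    return "UTF-8"; /* assumption: also fall back if the locale is unavailable */
  return nl_langinfo(CODESET); /* charset of the configured locale */
}
```

With this approach a user who sets LC_CTYPE to a non-UTF-8 locale still gets that encoding; only the fully unset case changes.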
Also, I started a bug report at Gentoo to see if we can get the system default locale setting to be set by pam_env, which should mostly resolve this problem.
I tried out the latest `gesftpserver` code on my distro of choice (Gentoo Linux).

As a test, I downloaded some files using WinSCP with SFTP v6 enabled. Transferring files with ASCII filenames works fine, but transferring files whose names contain characters outside the ASCII range fails and the connection gets dropped.

I debugged the `gesftpserver` process, and I was hitting a fatal error in `sftp_send_path`. This ends up calling `sftp_iconv` to translate the path from the "local encoding" to UTF-8. I store all my filenames in UTF-8 on disk, so this doesn't make much sense.

Looking into it, I see that `sshd` (OpenSSH) has `LANG=en_US.UTF-8` set in the environment, which it inherits from `systemd`. However, when `sshd` forks to start a new login session, it wipes the environment, including `LANG`. In other words, `sftp_iconv` fails due to `LANG` and `LC_CTYPE` being unset in the environment.

I was able to get the transfer to succeed by setting `LANG=en_US.UTF-8` via the pam_env module, which gets invoked after the new session is created by `sshd`.

It seems like there must be a better way to make this work. I know that Linux doesn't really keep track of the encoding used for filenames. However, maybe the `gesftpserver` program could check to see if the string is already a valid UTF-8 sequence before throwing an encoding error?
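
The check suggested above could look roughly like the following C sketch. `is_valid_utf8` is a hypothetical helper, not an existing gesftpserver function, and where it would be called from (for example before the `sftp_iconv` conversion fails) is left open. It is only a heuristic: a name stored in another encoding can occasionally happen to be valid UTF-8 as well.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if the len bytes at s are well-formed
 * UTF-8. Rejects overlong forms, UTF-16 surrogates, and code points above
 * U+10FFFF. */
bool is_valid_utf8(const char *s, size_t len) {
  const unsigned char *p = (const unsigned char *)s;
  size_t i = 0;
  while (i < len) {
    unsigned char c = p[i];
    size_t need;       /* continuation bytes expected after the lead byte */
    unsigned long cp;  /* code point being assembled */
    unsigned long min; /* smallest code point allowed for this length */
    if (c < 0x80) { i++; continue; } /* plain ASCII byte */
    else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; min = 0x80; }
    else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
    else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
    else return false; /* stray continuation byte or invalid lead byte */
    if (len - i - 1 < need) return false; /* truncated sequence */
    for (size_t k = 1; k <= need; k++) {
      unsigned char cc = p[i + k];
      if ((cc & 0xC0) != 0x80) return false; /* not a continuation byte */
      cp = (cp << 6) | (cc & 0x3F);
    }
    if (cp < min) return false;                     /* overlong encoding */
    if (cp >= 0xD800 && cp <= 0xDFFF) return false; /* surrogate */
    if (cp > 0x10FFFF) return false;                /* beyond Unicode range */
    i += need + 1;
  }
  return true;
}
```

A caller could pass the path through unchanged when this returns true and the locale-based conversion is unavailable, and report an error to the client otherwise instead of terminating.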