-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
URDF Parsers' code is implicitly assuming that "C" locale is set #288
Comments
Note that actually checking if the parsing of the number is successful and halting the parser if this is not the case it would make much easier to spot this kind of locale-related problems. Related link : http://www.cplusplus.com/articles/D9j2Nwbp/ . |
Note that for some of the numerical attributes of URDF, the [1] https://github.com/ros/urdfdom/blob/master/xsd/urdf.xsd#L47 |
cc @diegoferigo that is also working on XML parsers. |
The error in iDynTree classes and functions should be solved by #289 . The |
Oww this is really nasty, thank for this debugging, it'll bear in mind for the XML parser (robotology-playground/xsens-mvn#12) we'll use for |
Problem solved in the iDynTree's parser, and the urdfdom-parser are in the process of being deprecated, and the upstream issue have been opened. Closing. |
Related (old) YARP issue: https://sourceforge.net/p/yarp0/bugs/52/ . |
Related stackoverflow question: https://stackoverflow.com/questions/13919817/sscanf-and-locales-how-does-one-really-parse-things-like-3-14 . |
Updated the original issue with some new alternative solutions. |
Related PR: fkanehiro/openhrp3#144 . |
Related issue: gazebosim/gz-rendering#136 . |
Related comment: https://github.com/JuliaComputing/llvm-cbe/pull/83/files#r537553875 . |
An interesting comparison of string to double routines: https://github.com/shaovoon/floatbench . |
Related issue: robotology-legacy/gym-ignition#301 . |
Another occurence of a similar bug: google-deepmind/mujoco#131 . |
Related issue on rapidcsv: d99kris/rapidcsv#102 . |
Some more issues: |
Some more issues for my collection: |
More issues for my collection: |
Even on Twitter: |
Problem statement
In the code of the URDF parsers, we use a number of standard C/C++ functions for conversion from string to double/integer, in particular the
std::atof
function.This function works fine, but according to the C/C++ standard its behavior depends on the locale currently set in the process, using the
std::setlocale
function. The problem is that (even if this choice is not actually documented anywhere in the URDF specification or documentation) URDF assume that all the numeric attributes are encoded according to the"C"
locale, i.e. the default locale setting for C/C++ . The problem is that some libraries such as Qt on some platforms *actually change the process-wide locale to be""
, i.e. the user-preferred locale, see [1] .I noticed this issue because the iDynTree URDF parser was failing to work properly when used in an application using Qt.
Rejected solutions
To solve this issue, we need to make sure that each conversion from string to double uses the correct locale, i.e.
"C"
for URDF files. In SDFormat a similar issue [2] was fixed by introducing a call tostd::setlocale
inside the library [3], but I feel that this solution is a prone to error because the on some platforms it modifies the process-wide locale setting, and so it is possible that another thread in the same process sets callsstd::setlocale("")
before the conversion is actually performed.Preferred solution
The solution that I prefer at the moment is to use an instance of
std::stringstream
for actually performing this numeric conversions for the URDF parser, and to manually set the locale for just that object (that is a local variable of the function) to"C"
(or more specifical theclassic()
locale) using theimbue(const std::locale& loc)
method (see [4]):I wonder if there could be some performance by using
std::stringstream
and always setting the desired locale, but given that the parser code is typically called only once at configuration time, I will ignore the performance at the moment (even if I would be really interested in seeing some benchmarks).Additional valid solutions
std::from/to_chars
In C++20, the "fast" conversion functions
std::from_chars
andstd::to_chars
will be available, and they will always be using the C locale, regardless of any other locale setting. However, note thatstd::from_chars
just accepts a limited subset of all the string that are considered floating point numbers in other string conversion functions: for example, it does not support a trailing+
, so it cannot be used as it is (without string preprocessing) for parsing XML floating point strings (thanks @francesco-romano for correctly spotting this).Custom functions or libraries
If you are using
C
, you want to optimize for efficiency or for any reson you cannot use the standard C++ library, there are several custom string <---> double libraries, that may just support the C locale regardless of any other locale setting. See the following lists for some pointers regarding this libraries:An interesting (but C++ only) library is
google/double-conversion
.Refs
[1] http://stackoverflow.com/questions/25661295/why-does-qcoreapplication-call-setlocalelc-all-by-default-on-unix-linux
[2] https://bitbucket.org/osrf/sdformat/issues/60
[3] https://bitbucket.org/osrf/sdformat/pull-requests/147/fix-problems-with-latin-locales-and/diff#Lsrc/Param.ccT145
[4] http://stackoverflow.com/questions/1333451/locale-independent-atof
The text was updated successfully, but these errors were encountered: