Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

URDF Parsers' code is implicitly assuming that "C" locale is set #288

Closed
traversaro opened this issue May 5, 2017 · 23 comments
Closed

URDF Parsers' code is implicitly assuming that "C" locale is set #288

traversaro opened this issue May 5, 2017 · 23 comments

Comments

@traversaro
Copy link
Member

traversaro commented May 5, 2017

Problem statement

In the code of the URDF parsers, we use a number of standard C/C++ functions for conversion from string to double/integer, in particular the std::atof function.

This function works fine, but according to the C/C++ standard its behavior depends on the locale currently set in the process, using the std::setlocale function. The problem is that (even if this choice is not actually documented anywhere in the URDF specification or documentation) URDF assume that all the numeric attributes are encoded according to the "C" locale, i.e. the default locale setting for C/C++ . The problem is that some libraries such as Qt on some platforms *actually change the process-wide locale to be "", i.e. the user-preferred locale, see [1] .

I noticed this issue because the iDynTree URDF parser was failing to work properly when used in an application using Qt.

Rejected solutions

To solve this issue, we need to make sure that each conversion from string to double uses the correct locale, i.e. "C" for URDF files. In SDFormat a similar issue [2] was fixed by introducing a call to std::setlocale inside the library [3], but I feel that this solution is a prone to error because the on some platforms it modifies the process-wide locale setting, and so it is possible that another thread in the same process sets calls std::setlocale("") before the conversion is actually performed.

Preferred solution

The solution that I prefer at the moment is to use an instance of std::stringstream for actually performing this numeric conversions for the URDF parser, and to manually set the locale for just that object (that is a local variable of the function) to "C" (or more specifical the classic() locale) using the imbue(const std::locale& loc) method (see [4]):

std::string textToConvert;
double convertedDouble;
std::stringstream ss(textToConvert);
ss.imbue(std::locale::classic());
ss >> convertedDouble;

I wonder if there could be some performance by using std::stringstream and always setting the desired locale, but given that the parser code is typically called only once at configuration time, I will ignore the performance at the moment (even if I would be really interested in seeing some benchmarks).

Additional valid solutions

std::from/to_chars

In C++20, the "fast" conversion functions std::from_chars and std::to_chars will be available, and they will always be using the C locale, regardless of any other locale setting. However, note that std::from_chars just accepts a limited subset of all the string that are considered floating point numbers in other string conversion functions: for example, it does not support a trailing +, so it cannot be used as it is (without string preprocessing) for parsing XML floating point strings (thanks @francesco-romano for correctly spotting this).

Custom functions or libraries

If you are using C, you want to optimize for efficiency or for any reson you cannot use the standard C++ library, there are several custom string <---> double libraries, that may just support the C locale regardless of any other locale setting. See the following lists for some pointers regarding this libraries:

An interesting (but C++ only) library is google/double-conversion.

Refs

[1] http://stackoverflow.com/questions/25661295/why-does-qcoreapplication-call-setlocalelc-all-by-default-on-unix-linux
[2] https://bitbucket.org/osrf/sdformat/issues/60
[3] https://bitbucket.org/osrf/sdformat/pull-requests/147/fix-problems-with-latin-locales-and/diff#Lsrc/Param.ccT145
[4] http://stackoverflow.com/questions/1333451/locale-independent-atof

@traversaro traversaro changed the title Parsers code is using implicitly assuming that "C" locale is being set URDF Parsers code is using implicitly assuming that "C" locale is being set May 5, 2017
@traversaro traversaro changed the title URDF Parsers code is using implicitly assuming that "C" locale is being set URDF Parsers code is implicitly assuming that "C" locale is being set May 5, 2017
@traversaro
Copy link
Member Author

traversaro commented May 5, 2017

Note that actually checking if the parsing of the number is successful and halting the parser if this is not the case it would make much easier to spot this kind of locale-related problems. Related link : http://www.cplusplus.com/articles/D9j2Nwbp/ .

@traversaro traversaro changed the title URDF Parsers code is implicitly assuming that "C" locale is being set URDF Parsers code is implicitly assuming that "C" locale is set May 5, 2017
@traversaro
Copy link
Member Author

traversaro commented May 5, 2017

Note that for some of the numerical attributes of URDF, the xsd file available at [1] specifies that they are of type xsd::double [2] . I don't know if format double accepted by C++ functions with the
"C" locale is totally compatible with the xsd::double, but it would be worth checking.

[1] https://github.com/ros/urdfdom/blob/master/xsd/urdf.xsd#L47
[2] https://www.w3.org/TR/xmlschema-2/#double .

@traversaro
Copy link
Member Author

cc @diegoferigo that is also working on XML parsers.

@traversaro
Copy link
Member Author

The error in iDynTree classes and functions should be solved by #289 . The urdfdom library is subject as well to the problem, and it is used in the legacy KDL-based classes. For that we opened an upstream issue in ros/urdfdom#98 .

@traversaro traversaro changed the title URDF Parsers code is implicitly assuming that "C" locale is set URDF Parsers' code is implicitly assuming that "C" locale is set May 5, 2017
@diegoferigo
Copy link
Member

Oww this is really nasty, thank for this debugging, it'll bear in mind for the XML parser (robotology-playground/xsens-mvn#12) we'll use for mvnx files.

@traversaro
Copy link
Member Author

Problem solved in the iDynTree's parser, and the urdfdom-parser are in the process of being deprecated, and the upstream issue have been opened. Closing.

@traversaro
Copy link
Member Author

Related (old) YARP issue: https://sourceforge.net/p/yarp0/bugs/52/ .

@traversaro
Copy link
Member Author

@traversaro
Copy link
Member Author

@traversaro
Copy link
Member Author

@traversaro
Copy link
Member Author

Updated the original issue with some new alternative solutions.

@traversaro
Copy link
Member Author

Related PR: fkanehiro/openhrp3#144 .

@traversaro
Copy link
Member Author

Related issue: gazebosim/gz-rendering#136 .

@traversaro
Copy link
Member Author

@traversaro
Copy link
Member Author

An interesting comparison of string to double routines: https://github.com/shaovoon/floatbench .

@traversaro
Copy link
Member Author

Related issue: robotology-legacy/gym-ignition#301 .

@traversaro
Copy link
Member Author

Another occurence of a similar bug: google-deepmind/mujoco#131 .

@traversaro
Copy link
Member Author

Related issue on rapidcsv: d99kris/rapidcsv#102 .

@traversaro
Copy link
Member Author

traversaro commented May 14, 2024

@traversaro
Copy link
Member Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants