-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch to a path interface instead of using std::string
for file paths
#51
Comments
AFAIK these utf8 conversion problems are all fixed already. We added some extensive tests to ensure files with unicode characters can be loaded. The folder
Perhaps the call site is missing the proper locale initialization? See https://github.com/NativeInstruments/ni-media/blob/master/audiostream/test/ni/media/test_helper.cpp#L46. This is needed to properly interface with windows APIs (which use utf16 encoded wchar_t). Deep down in boost::iostreams, the sources / sinks will convert char <-> wchar_t and expect the proper conversion to take place. I am not sure switching to Maybe we should document this as a requirement on Windows? Or maybe we should use our own facet inside of ni-media, but I don't know if this even works, and if it's a good idea. Tbh I lost a bit track about the current state of utf8 in cpp, and what's best practice. Actually, I just noticed that https://en.cppreference.com/w/cpp/locale/codecvt_utf8_utf16 is already deprecated in C++17... that didn't last long :) |
CC @C4rsten
Yes. This is how we fixed it on the user side. If we use our own facet, it's one less detail the user has to worry about, so I would be up for that. Though it seems like we would have to roll our own, as you pointed out There's a PR open right now related to this issue as well where a |
Yes I agree, this is a hidden requirement and totally not obvious for clients. I would treat this issue separately from the C++ 17 std::filesystem support though. I would suggest the following approach:
|
Heyhey, thanks for looking into this! I would like to help execute this but I'm not yet 100% clear on how these steps would look like exactly. For Step 1 you're envisioning something like this? (really not sure about the formatting here) class aiff_file_source : public aiff_source<boost::iostreams::file_descriptor_source>
{
using base_type = aiff_source<boost::iostreams::file_descriptor_source>;
#if BOOST_OS_WINDOWS
using converter = std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>>;
#endif
public:
explicit aiff_file_source( const std::string& path )
#if BOOST_OS_WINDOWS
: base_type( converter{}.from_bytes(path), BOOST_IOS::binary | BOOST_IOS::in )
{}
#else
: base_type( path, BOOST_IOS::binary | BOOST_IOS::in )
{}
#endif
}; Step 2 is then using fs::path internally: class aiff_file_source : public aiff_source<boost::iostreams::file_descriptor_source>
{
using base_type = aiff_source<boost::iostreams::file_descriptor_source>;
public:
explicit aiff_file_source( const std::string& path )
: base_type( std::filesystem::u8path(path), BOOST_IOS::binary | BOOST_IOS::in )
{}
}; and Step 3 is then just using fs::path as the input param type as done in the PR but without using Is that somewhat along the lines of what you had in mind or am I completely going in the wrong direction? |
As stated initially I would avoid string conversions entirely. Instead on Windows use |
On Windows, conversions between
std::string
andpath
interfaces results in UTF-8 strings being converted to ANSI, resulting in non-ANSI characters being lost in the conversion process. This results in files not being loaded correctly because they'll fail our valid path checks.The text was updated successfully, but these errors were encountered: