Along with other recent attempts to make these libraries actually usable on Windows (MSVC-related fixes, workarounds etc.), I'd like to have first-class UTF-8 support on this platform as well (because it basically "just works" everywhere else). As you may already know, there is no such thing as using UTF-8 in WINAPI directly; you either use ANSI or UTF-16. Both are ugly and nasty, but UTF-16 is clearly the way to go (although there is very little support for it in standard C++).
Recommended reading (the rest of the issue relies on decisions from this manifesto): http://utf8everywhere.org/
So, the goal is to have all text in const char*/std::string and in UTF-8, not to force UTF-16 onto the users like in Qt, Java and elsewhere. The actual problems:
- File/directory operations (Corrade::Utility::Directory) -- accepting UTF-8 filenames, converting them to UTF-16 internally and explicitly using *W Windows APIs (instead of these goddamn macros), for directory listing converting the returned UTF-16 strings back to UTF-8. mosra/corrade@a1061d0
- Loading of plugins from UTF-8 directories (Corrade::PluginManager) mosra/corrade@2328ba3
- UTF-8 environment variables in Corrade::Utility::Arguments mosra/corrade@49be6d0
- UTF-8 filenames in Corrade::Utility::Configuration mosra/corrade@f2a9f1c
- UTF-8 filenames passable to corrade_add_resource() mosra/corrade@b90f5b4
  - Test for that (Test corrade_add_resource() with Unicode filenames corrade#49, currently causing infinite recursion w/ Ninja 😢)
- std::[io]fstream -- MSVC has a non-standard constructor and open() that take a UTF-16 filename as a parameter, sadly nothing like that on MinGW (but there are some solutions). This is nothing clean, so the final solution is probably to move away from STL streams in importers etc. altogether and handle that some other way instead (which would also be far more portable to platforms w/o filesystem access). Done for Corrade in mosra/corrade@a1061d0 (ensuring nobody else uses fstream directly). (Obsoleted by Compilation time, CI time and executable size improvements #293, we moved to C I/O in mosra/corrade@c1a5eed, and it's UTF-8 aware as well.)
  - Ensure no other library uses fstream or C I/O directly (plugins and 3rd party libs, especially). (Impossible to check, we'll be using our own file API if possible.)
- UTF-8 argc and argv parameters. As with everything else, these "just work" everywhere except on Windows. I could probably do something similar to SDLmain, QtMain that would have int wmain() internally, convert the UTF-16 arguments to UTF-8 and call the user-provided main() afterwards. I need to look into that more closely, because standard C++ allows some variations on main() such as an implicit return statement and omitting argc/argv. Any tips on this would be greatly appreciated. MinGW doesn't support wmain(), but there is a workaround. -- New CorradeMain library corrade#37
- Proper WinMain()/main() wrapper on Windows. Currently it's only possible to create console applications using the MAGNUM_APPLICATION_MAIN() macro, because it uses main(). Calling add_executable(something WIN32 ...) then makes the MSVC linker complain about missing WinMain(). -- New CorradeMain library corrade#37
- Unicode standard output (std::wcout/std::wprintf etc.) that does not depend on some random codepage setting in the terminal (whose "clever" idea was that, anyway). Most of (all?) the output is currently handled with Corrade::Utility::Debug, so that means just reworking the internals to produce UTF-16 on Windows. Standard input is probably not an issue here (these libraries have different scope). (AFAIK, this would mean redirecting to a file would make it UTF-16 encoded, which is definitely not wanted. Instead, set the output encoding to UTF-8 like in dart-lang/sdk@92b746c#diff-5c4ad2f03f9aac0f124bf4e6dba66156. Tools like CTest already enable that on their own.)
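To make the argc/argv item more concrete, here is a rough sketch of how UTF-8 arguments could be obtained on Windows. Note that it swaps the wmain() wrapper idea for a different technique: it asks the system for the UTF-16 command line via GetCommandLineW() and CommandLineToArgvW() and converts each argument with WideCharToMultiByte(). The utf8Arguments() name is made up for illustration and is not an existing Corrade API; on other platforms the sketch assumes argv is already UTF-8 and passes it through.

```cpp
#include <string>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#include <shellapi.h> /* CommandLineToArgvW(), needs linking to Shell32 */
#endif

/* Hypothetical helper (not part of Corrade): returns program arguments as
   UTF-8 strings. */
std::vector<std::string> utf8Arguments(int argc, char** argv) {
    std::vector<std::string> out;
#ifdef _WIN32
    /* The narrow argv is in the ANSI codepage, so ignore it and ask the
       system for the original UTF-16 command line instead */
    static_cast<void>(argc);
    static_cast<void>(argv);
    int wargc = 0;
    wchar_t** wargv = CommandLineToArgvW(GetCommandLineW(), &wargc);
    if(!wargv) return out;
    for(int i = 0; i < wargc; ++i) {
        /* With -1 as the input length, the returned size includes the
           terminating null character */
        const int size = WideCharToMultiByte(CP_UTF8, 0, wargv[i], -1,
            nullptr, 0, nullptr, nullptr);
        std::string arg;
        if(size > 0) {
            arg.resize(size);
            WideCharToMultiByte(CP_UTF8, 0, wargv[i], -1,
                &arg[0], size, nullptr, nullptr);
            arg.resize(size - 1); /* drop the terminating null */
        }
        out.push_back(arg);
    }
    LocalFree(wargv);
#else
    /* Assumption: everywhere else the arguments already are UTF-8 */
    for(int i = 0; i < argc; ++i) out.push_back(argv[i]);
#endif
    return out;
}
```

A user-facing main() would then call utf8Arguments(argc, argv) first and work only with the returned strings. A wrapper library could hide this step entirely, which is what the wmain() approach above is after.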
As UTF-8/UTF-16 conversion would be needed only on Windows, I'll use WINAPI functions to do the conversion and won't make any public API for this, because it really shouldn't be needed anywhere else. I can't employ the (horrifically convoluted) std::codecvt etc., because (if I'm not mistaken) it is not supported in GCC < 4.9 (and probably also elsewhere), making it currently useless. (It's also awfully slow and the headers are bloated.)
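As an illustration of what such an internal conversion could look like, below is a minimal sketch that assumes MultiByteToWideChar() for the UTF-8 to UTF-16 direction and _wfopen() as the wide-character file API. The widen() and openUtf8() names are hypothetical and only serve the example; the actual Corrade internals may differ. Outside of Windows the UTF-8 filename is passed to std::fopen() unchanged.

```cpp
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <windows.h>

/* Hypothetical internal helper: UTF-8 to UTF-16 conversion via WINAPI.
   Returns an empty string if the input is empty or invalid. */
static std::wstring widen(const std::string& utf8) {
    if(utf8.empty()) return {};
    /* First call queries the size in UTF-16 code units, second one converts */
    const int size = MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
        int(utf8.size()), nullptr, 0);
    if(size <= 0) return {};
    std::wstring out(size, L'\0');
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(), int(utf8.size()),
        &out[0], size);
    return out;
}
#endif

/* Hypothetical helper: opens a file given by a UTF-8 filename using C I/O */
std::FILE* openUtf8(const std::string& filename, const char* mode) {
#ifdef _WIN32
    /* Explicitly use the wide variant instead of the ANSI one */
    return _wfopen(widen(filename).c_str(), widen(mode).c_str());
#else
    /* Assumption: the filesystem API takes UTF-8 directly */
    return std::fopen(filename.c_str(), mode);
#endif
}
```

Keeping widen() file-local matches the intent of not exposing any public conversion API; callers only ever see UTF-8 in std::string.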
Looking at the bigger picture, because it's apparently impossible to portably use std::fstream, std::cout etc. with proper UTF-8 filename/output on all platforms, I might as well get rid of the stream library altogether and have far lighter executables as a result.