UTF-8 output on Windows console
Problem description
The following code shows unexpected behaviour on my machine (tested with Visual C++ 2008 SP1 on Windows XP and VS 2012 on Windows 7):
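    #include <cstdio>
    #include <iostream>
    #include "Windows.h"

    int main() {
        SetConsoleOutputCP( CP_UTF8 );
        std::cout << "\xc3\xbc";                    // UTF-8 encoding of 'ü' via iostream
        int fail = std::cout.fail() ? '1' : '0';    // '1' if the stream failed
        fputc( fail, stdout );
        fputs( "\xc3\xbc", stdout );                // same bytes via C stdio
    }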
I just compiled with cl /EHsc test.cpp.
Windows XP: Output in a console window is ü0ü (translated to codepage 1252; it originally shows some line-drawing characters in the default codepage, perhaps 437). When I change the settings of the console window to use the "Lucida Console" character set and run my test.exe again, the output changes to 1ü, which means:
- the character ü can be written using fputs and its UTF-8 encoding C3 BC
- std::cout does not work, for whatever reason
- the stream's failbit is set after trying to write the character
Windows 7: Output using Consolas is ��0ü. Even more interesting: the correct bytes are probably written (at least when redirecting the output to a file) and the stream state is OK, but the two bytes are written as separate characters.
I tried to raise this issue on "Microsoft Connect" (see here), but MS has not been very helpful. You might as well look here as something similar has been asked before.
Can you reproduce this problem?
What am I doing wrong? Shouldn't std::cout and fputs have the same effect?
SOLVED (sort of): Following mike.dld's idea, I implemented a std::stringbuf that does the conversion from UTF-8 to Windows-1252 in sync() and replaced the streambuf of std::cout with this converter (see my comment on mike.dld's answer).
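
A converter of this kind could look roughly like the following. This is a minimal sketch, not the original poster's code: the names utf8_to_cp1252 and Utf8ToCp1252Buf are hypothetical, the conversion is hand-written so that the example stays portable (on Windows, MultiByteToWideChar and WideCharToMultiByte could do the same job), and only code points that share their value with the Windows-1252 byte are mapped.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper: decode UTF-8 and emit Windows-1252 bytes.
// Assumption: only U+0000..U+007F and U+00A0..U+00FF are mapped, because
// those code points equal their Windows-1252 byte. Everything else,
// including malformed input, becomes '?'.
inline std::string utf8_to_cp1252(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size();) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        unsigned long cp = c;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (c >= 0xF0)      { cp = c & 0x07; extra = 3; }
        else if (c >= 0xE0) { cp = c & 0x0F; extra = 2; }
        else if (c >= 0xC0) { cp = c & 0x1F; extra = 1; }
        else if (c >= 0x80) { out += '?'; ++i; continue; }  // stray continuation byte

        if (i + extra >= in.size()) { out += '?'; break; }  // truncated sequence

        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (b & 0x3F);
        }
        if (!ok) { out += '?'; ++i; continue; }

        i += extra + 1;
        const bool mappable = cp < 0x80 || (cp >= 0xA0 && cp <= 0xFF);
        out += mappable ? static_cast<char>(static_cast<unsigned char>(cp)) : '?';
    }
    return out;
}

// Hypothetical stream buffer: collects UTF-8 text and, on every flush,
// converts it and forwards the result to the wrapped target buffer.
class Utf8ToCp1252Buf : public std::stringbuf {
public:
    explicit Utf8ToCp1252Buf(std::streambuf* target)
        : std::stringbuf(std::ios_base::out), target_(target) {
        assert(target_ != nullptr);
    }

protected:
    int sync() override {
        const std::string converted = utf8_to_cp1252(str());
        str(std::string());  // discard the text that was just converted
        target_->sputn(converted.data(),
                       static_cast<std::streamsize>(converted.size()));
        return target_->pubsync();
    }

private:
    std::streambuf* target_;
};
```

To use it with std::cout, construct the buffer from std::cout.rdbuf(), install it with std::cout.rdbuf(&buf), and restore the original buffer before buf goes out of scope. Since sync() converts whatever is buffered at that moment, a flush in the middle of a multi-byte sequence would turn that character into '?'.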
Recommended answer
It's time to close this now. Stephan T. Lavavej says the behaviour is "by design", although I cannot follow this explanation.
My current understanding is: the Windows XP console in the UTF-8 codepage does not work with C++ iostreams.
Windows XP is getting out of fashion now, and so is VS 2008. I'd be interested to hear if the problem still exists on newer Windows systems.
On Windows 7 the effect is probably due to the way the C++ streams output characters. As seen in an answer to "Properly print utf8 characters in windows console", UTF-8 output also fails with C stdio when printing one byte after another, as in putc('\xc3'); putc('\xbc');. Perhaps this is what C++ streams do here.
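
If byte-by-byte output is indeed the cause, one way to sidestep it is to build the text in memory and hand the complete UTF-8 sequence to C stdio in a single call, much like the fputs call in the question. The following is only a sketch under that assumption; write_whole is a hypothetical helper name, and whether it cures the problem on a given Windows version and CRT has not been verified here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: pass the whole UTF-8 string to C stdio in one
// fwrite call instead of emitting it one byte at a time, then flush.
// Returns true if every byte was accepted and the flush succeeded.
inline bool write_whole(std::FILE* out, const std::string& utf8) {
    assert(out != nullptr);
    const std::size_t written = std::fwrite(utf8.data(), 1, utf8.size(), out);
    return written == utf8.size() && std::fflush(out) == 0;
}
```

Text formatted with a std::ostringstream can be passed along as write_whole(stdout, os.str()), which keeps iostream formatting while leaving the actual console write to stdio.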