Convert from UTF-8 to Unicode in C++
Question
How do I convert ú in a C++ application that receives the character UTF-8 encoded as %C3%BA and needs to store it as its Unicode equivalent, %FA? I just want to know how to write code that performs this conversion.
Recommended answer
I just wrote some code to do this yesterday.
I'm not saying this is the "perfect" way to do it, but it appears to work for all the test cases I've run through it (I wrote both directions for that purpose).
I'll leave it to you to translate "%NN" to an integer value.
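#include <iostream>
#include <deque>

// Encode one Unicode code point as a sequence of UTF-8 byte values.
std::deque<int> unicode_to_utf8(int charcode)
{
    std::deque<int> d;
    if (charcode < 128)
    {
        // ASCII is encoded as a single byte.
        d.push_back(charcode);
    }
    else
    {
        int first_bits = 6;        // shrinks by one for every continuation byte emitted
        const int other_bits = 6;  // payload bits in each continuation byte
        int first_val = 0xC0;      // lead-byte prefix: 0xC0, then 0xE0, 0xF0, ...
        int t = 0;
        while (charcode >= (1 << first_bits))
        {
            // Emit the low six bits as a continuation byte (10xxxxxx).
            t = 128 | (charcode & ((1 << other_bits) - 1));
            charcode >>= other_bits;
            first_val |= 1 << first_bits;
            first_bits--;
            d.push_front(t);
        }
        // Whatever remains fits in the lead byte.
        t = first_val | charcode;
        d.push_front(t);
    }
    return d;
}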
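
// Decode one UTF-8 sequence from the front of `coded`, removing its bytes.
// Assumes `coded` starts with a complete, well-formed sequence; nothing is validated.
int utf8_to_unicode(std::deque<int> &coded)
{
    int charcode = 0;
    int t = coded.front();
    coded.pop_front();
    if (t < 128)
    {
        // Single-byte (ASCII) case.
        return t;
    }
    int high_bit_mask = (1 << 6) - 1;
    int high_bit_shift = 0;
    int total_bits = 0;
    const int other_bits = 6;
    // Each pass consumes one continuation byte and shifts the next
    // length bit of the lead byte into the top position.
    while ((t & 0xC0) == 0xC0)
    {
        t <<= 1;
        t &= 0xff;
        total_bits += 6;
        high_bit_mask >>= 1;
        high_bit_shift++;
        charcode <<= other_bits;
        charcode |= coded.front() & ((1 << other_bits) - 1);
        coded.pop_front();
    }
    // Place the payload bits of the lead byte above the continuation bits.
    charcode |= ((t >> high_bit_shift) & high_bit_mask) << total_bits;
    return charcode;
}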

int main()
{
    int charcode;
    for (;;)
    {
        std::cout << "Enter unicode value:" << std::endl;
        if (!(std::cin >> charcode))
        {
            break;  // stop on end of input or non-numeric input
        }
        auto x = unicode_to_utf8(charcode);
        for (auto c : x)
        {
            // "\\x" prints a literal backslash followed by x.
            std::cout << "\\x" << std::hex << c << " ";
        }
        std::cout << std::endl;
        int c = utf8_to_unicode(x);
        std::cout << "reversed:" << std::dec << c << std::hex << " in hex:" << c << std::endl;
    }
}
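
The answer leaves the "%NN" step to the reader. A minimal sketch of that step follows, assuming the input is a plain percent-encoded string such as "%C3%BA". The helper names hex_digit_value, percent_decode and percent_encode_byte are hypothetical and not part of the answer above; percent_decode returns a std::deque<int> only so that its result can be passed to utf8_to_unicode.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if `c` is not a hex digit.
int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: turns "%C3%BA" into the byte values {0xC3, 0xBA}.
// Characters that are not part of a valid "%NN" escape are copied through
// as their own byte value.
std::deque<int> percent_decode(const std::string &text)
{
    std::deque<int> bytes;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        if (text[i] == '%' && i + 2 < text.size())
        {
            int hi = hex_digit_value(text[i + 1]);
            int lo = hex_digit_value(text[i + 2]);
            if (hi >= 0 && lo >= 0)
            {
                bytes.push_back(hi * 16 + lo);
                i += 2;  // skip the two hex digits just consumed
                continue;
            }
        }
        bytes.push_back(static_cast<unsigned char>(text[i]));
    }
    return bytes;
}

// Hypothetical helper: formats a value in 0..255 as "%NN" with uppercase hex.
std::string percent_encode_byte(int value)
{
    assert(value >= 0 && value <= 0xFF);
    const char digits[] = "0123456789ABCDEF";
    std::string out = "%";
    out += digits[value >> 4];
    out += digits[value & 0x0F];
    return out;
}
```

With the answer's functions in scope, passing the result of percent_decode("%C3%BA") to utf8_to_unicode should give 250 (0xFA), and percent_encode_byte(250) then gives "%FA". A single "%NN" escape only has room for code points up to U+00FF, so anything larger needs a different storage format.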