UnicodeEncodeError: 'latin-1' codec can't encode character
Problem description
What could be causing this error when I try to insert a foreign character into the database?
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u201c' in position 0: ordinal not in range(256)
And how do I resolve it?
Thanks!
Solution
Character U+201C Left Double Quotation Mark is not present in the Latin-1 (ISO-8859-1) encoding.
It is present in code page 1252 (Western European). This is a Windows-specific encoding that is based on ISO-8859-1 but which puts extra characters into the range 0x80-0x9F. Code page 1252 is often confused with ISO-8859-1, and it's an annoying but now-standard web browser behaviour that if you serve your pages as ISO-8859-1, the browser will treat them as cp1252 instead. However, they really are two distinct encodings:
>>> u'He said \u201CHello\u201D'.encode('iso-8859-1')
UnicodeEncodeError
>>> u'He said \u201CHello\u201D'.encode('cp1252')
'He said \x93Hello\x94'
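To see exactly where the two encodings part ways, here is a minimal sketch in Python 3 syntax (the session above is Python 2). The function name differing_bytes is made up for illustration. It decodes every possible byte with both codecs and collects the values where the results differ. It assumes Python's bundled cp1252 codec, which leaves a handful of bytes in that range unassigned and raises on them.

```python
def differing_bytes():
    """Return the byte values that ISO-8859-1 and cp1252 decode differently."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        as_latin1 = raw.decode("iso-8859-1")  # never fails: all 256 bytes are defined
        try:
            as_cp1252 = raw.decode("cp1252")
        except UnicodeDecodeError:
            as_cp1252 = None  # byte is unassigned in Python's cp1252 codec
        if as_latin1 != as_cp1252:
            diffs.append(value)
    return diffs


if __name__ == "__main__":
    diffs = differing_bytes()
    print(hex(min(diffs)), hex(max(diffs)), len(diffs))
    # Byte 0x93 is a control character in ISO-8859-1 but U+201C in cp1252.
    print(repr(bytes([0x93]).decode("iso-8859-1")))
    print(repr(bytes([0x93]).decode("cp1252")))
```

Every differing byte falls in 0x80-0x9F, which matches the description above: outside that range the two encodings agree.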
If you are using your database only as a byte store, you can use cp1252 to encode “ and other characters present in the Windows Western code page. However, Unicode characters that are not present in cp1252 will still cause errors.
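If you want to check up front which characters a given encoding will reject, something like the following might help. It is only a sketch in Python 3 syntax; can_encode, unencodable_chars and the sample string are hypothetical names and data chosen for illustration, not part of any library.

```python
def can_encode(text, encoding):
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def unencodable_chars(text, encoding):
    """List the characters of text that the encoding cannot represent."""
    return [ch for ch in text if not can_encode(ch, encoding)]


# Sample text: curly quotes (present in cp1252) plus two CJK characters (not present).
sample = "He said \u201cHello\u201d \u4e2d\u6587"

if __name__ == "__main__":
    for name in ("iso-8859-1", "cp1252", "utf-8"):
        print(name, unencodable_chars(sample, name))
```

With this sample, ISO-8859-1 rejects both the curly quotes and the CJK characters, cp1252 rejects only the CJK characters, and UTF-8 rejects nothing.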
You can use encode(..., 'ignore') to suppress the errors by dropping the offending characters, but really in this century you should be using UTF-8 in both your database and your pages. This encoding allows any character to be used. Ideally, you should also tell MySQL that you are using UTF-8 strings (by setting the character set on the database connection and the collation on string columns), so that it can get case-insensitive comparison and sorting right.
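For comparison, here is a rough sketch (Python 3 syntax) of what the different choices do to the same string. The helper name encode_variants is made up for illustration, and the 'replace' handler is not mentioned above; it is included only to contrast with 'ignore'.

```python
def encode_variants(text):
    """Encode text three ways to compare lossy error handlers with UTF-8."""
    return {
        # 'ignore' silently drops characters that Latin-1 cannot represent.
        "ignore": text.encode("iso-8859-1", "ignore"),
        # 'replace' substitutes '?' for each such character.
        "replace": text.encode("iso-8859-1", "replace"),
        # UTF-8 can represent every character, so nothing is lost.
        "utf-8": text.encode("utf-8"),
    }


if __name__ == "__main__":
    text = "He said \u201cHello\u201d"
    variants = encode_variants(text)
    for name, data in variants.items():
        print(name, data)
    # Only the UTF-8 bytes decode back to the original string.
    print(variants["utf-8"].decode("utf-8") == text)
```

Both Latin-1 variants lose the quotation marks for good, which is why UTF-8 is the better fix when you control the database and the pages.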