Remove four-byte UTF-8 characters in classic ASP/VBScript (MySQL related)
Problem description
I've spent about 18 hours trying different things and searching around, and I've finally given up and have to ask you guys.
Backstory: I am finally migrating an old MS Access database to MySQL (version 5.6.16-log).
Problem: Some Unicode text in the Access database contains characters that take four bytes in UTF-8.
MySQL still has a problem with inserting four-byte UTF-8 characters. This problem is getting old and I was surprised to discover it's not fixed yet: http://bugs.mysql.com/bug.php?id=67297
I'm using the "MySQL ODBC 5.3 Unicode Driver" (the latest beta development release) to transfer data between the databases. No matter what I try, the process ends up freezing when I try to insert a string containing four-byte UTF-8 characters (the thread uses 100% CPU forever). I have tried all the workarounds suggested everywhere on the Internet, and nothing works.
For now I will just accept the limitations of MySQL: I can't store all Unicode characters.
So I want to remove all four-byte UTF-8 characters from the text before I insert it into the database. But I can't for the life of me find a way to do it in classic ASP.
Can anyone help?
(I can't avoid using ASP, by the way; there is far too much code to rewrite it in a different language. Just changing databases is a remarkable feat; there are several of them and it will take days to complete.)
A solution in JScript is also acceptable, since it can be run from ASP pages.
Recommended answer
This should work:
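Function UTF8Filter(strString)
    ' Keeps only ASCII characters with codes 32 (space) to 127; every other
    ' character, including all four-byte UTF-8 characters, is dropped.
    Dim i, charCode, strResult
    On Error Resume Next
    strResult = ""
    For i = 1 To Len(strString)
        charCode = AscW(Mid(strString, i, 1))
        If charCode >= 32 And charCode <= 127 Then ' here was OR
            ' Append valid character
            strResult = strResult & Mid(strString, i, 1)
        End If
    Next
    UTF8Filter = strResult
    On Error Goto 0
End Function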
Updated function:
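Function Remove4ByteUFT8(strString)
    ' VBScript strings are UTF-16, so a character that takes four bytes in
    ' UTF-8 appears as a surrogate pair (high surrogate, then low surrogate).
    Dim objRegEx
    Set objRegEx = CreateObject("VBScript.RegExp")
    objRegEx.Global = True
    ' VBScript.RegExp patterns take no /.../ delimiters or trailing flags.
    objRegEx.Pattern = "[\uD800-\uDBFF][\uDC00-\uDFFF]"
    Remove4ByteUFT8 = objRegEx.Replace(strString, "")
End Function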
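Since a JScript solution was also asked for, here is a minimal sketch of the same filtering in JScript-compatible JavaScript. It assumes the text arrives as an ordinary UTF-16 script string, where every character that needs four bytes in UTF-8 is stored as a surrogate pair. The function name remove4ByteUtf8 is made up for this example and is not part of any library.

```javascript
// Hypothetical helper: strips characters outside the Basic Multilingual Plane.
// In UTF-16 these are surrogate pairs, and they are exactly the characters
// that take four bytes when encoded as UTF-8.
function remove4ByteUtf8(text) {
    // High surrogate (D800-DBFF) followed by low surrogate (DC00-DFFF).
    return String(text).replace(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g, "");
}

// Usage: U+1F600 (written here as the pair \uD83D\uDE00) is removed, while
// the two-byte character U+00E9 and the three-byte character U+20AC are kept.
var cleaned = remove4ByteUtf8("caf\u00E9 \uD83D\uDE00 \u20AC5");
// cleaned is "caf\u00E9  \u20AC5" (the two spaces around the removed character remain)
```

In a classic ASP page this could live in a script block with language="JScript" and runat="server", and be called on each text value before it is passed to the INSERT statement. Lone (unpaired) surrogates are left untouched by this sketch.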