
PHP to clean up pasted Microsoft input


Question

I have a site where users can post stuff (as in forums, comments, etc.) using a customised implementation of TinyMCE. A lot of them like to copy and paste from Word, which means their input often comes with a plethora of associated MS inline formatting.

I can't just get rid of <span whatever>, as TinyMCE relies on the span tag for some of its formatting, and I can't (and don't want to) force those users to use TinyMCE's "Paste From Word" feature (which doesn't seem to work that well anyway).

Does anyone know of a library/class/function that would take care of this for me? It must be a common problem, though I can't find anything definitive. I've been thinking recently that a series of brute-force regexes looking for MS-specific patterns might do the trick, but I don't want to rewrite something that may already be available unless I must.

Also, fixing curly quotes, em dashes, etc. would be good. I have my own code to do this now, but I'd really just like to find one MS-conversion filter to rule them all.

Recommended answer

HTML Purifier will create standards-compliant markup and filter out many possible attacks (such as XSS).
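As a rough illustration of the HTML Purifier route, the sketch below whitelists a small set of tags and keeps span only when it carries an allowed style. It assumes HTML Purifier is already installed and autoloadable (for example via the Composer package ezyang/htmlpurifier); the function name purifyPastedHtml and the particular whitelist are hypothetical choices for this example, not something from the original answer.

```php
<?php
// Sketch only: assumes the HTMLPurifier classes are autoloadable
// (e.g. vendor/autoload.php from Composer has already been required).

/**
 * Run pasted HTML through HTML Purifier with a narrow whitelist.
 * The whitelist below is an illustrative assumption; adjust it to
 * whatever formatting your TinyMCE setup actually produces.
 */
function purifyPastedHtml(string $html): string
{
    $config = HTMLPurifier_Config::createDefault();

    // Only these elements/attributes survive; Word's class attributes,
    // comments and namespaced tags are not listed, so they are dropped.
    $config->set('HTML.Allowed', 'p,br,strong,b,em,i,ul,ol,li,a[href],span[style]');

    // Only these CSS properties survive inside style attributes.
    $config->set('CSS.AllowedProperties', ['font-weight', 'font-style', 'text-decoration', 'color']);

    // Remove spans left with no attributes, and elements left empty.
    $config->set('AutoFormat.RemoveSpansWithoutAttributes', true);
    $config->set('AutoFormat.RemoveEmpty', true);

    // Disable the on-disk definition cache to keep the sketch self-contained.
    $config->set('Cache.DefinitionImpl', null);

    return (new HTMLPurifier($config))->purify($html);
}

// Usage, guarded so the file still loads when the library is missing.
if (class_exists('HTMLPurifier')) {
    echo purifyPastedHtml('<p class="MsoNormal"><span style="color:red">Hi</span></p>');
}
```

Because the whitelist keeps span[style], TinyMCE's own span-based formatting can pass through while Word-specific properties are filtered out. In production you would normally leave the definition cache enabled, since building the definitions on every request is slow.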

For faster cleanups that don't require XSS filtering, I use the PECL extension Tidy, which is a binding for the Tidy HTML utility.
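A minimal sketch of the Tidy route might look like the following. It assumes the tidy extension is loaded and falls back to returning the input unchanged when it is not; the function name tidyWordHtml and the choice of options are assumptions for illustration. Note that Tidy does no XSS filtering, so this is only suitable where the input is otherwise trusted or filtered separately.

```php
<?php
/**
 * Clean Word-pasted HTML using the tidy extension, if it is available.
 * Returns the input unchanged when the extension is not loaded.
 */
function tidyWordHtml(string $html): string
{
    if (!function_exists('tidy_repair_string')) {
        return $html; // tidy extension not installed
    }

    $options = [
        'word-2000'      => true, // strip the surplus markup Word adds
        'bare'           => true, // strip Microsoft-specific HTML, plain spaces for nbsp
        'show-body-only' => true, // return a fragment, not a full document
        'wrap'           => 0,    // do not re-wrap lines
    ];

    $clean = tidy_repair_string($html, $options, 'utf8');

    // tidy_repair_string() returns false on failure; keep the original then.
    return is_string($clean) ? $clean : $html;
}

echo tidyWordHtml('<p class="MsoNormal">Hello</p>');
```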

If those don't help you, I suggest you switch to FCKEditor, which has this feature built in.
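If none of those is an option, the brute-force regex idea and the curly-quote fix mentioned in the question can be sketched with core PHP alone. This is a hedged illustration, not an existing library: the function names stripWordMarkup and normalizeSmartPunctuation, the pattern list, and the ASCII replacements are all assumptions, and regexes will miss cases that a real HTML parser would catch. UTF-8 input is assumed.

```php
<?php
/** Remove common Word-specific markup from an HTML fragment (regex sketch). */
function stripWordMarkup(string $html): string
{
    $patterns = [
        '/<!--\[if[^\]]*\]>.*?<!\[endif\]-->/is', // hidden conditional comments
        '/<!\[(?:if[^\]]*|endif)\]>/i',           // revealed conditional markers
        '/<!--.*?-->/s',                          // any other comments
        '/<\/?[a-z]+:[^>]*>/i',                   // namespaced tags such as <o:p>
        '/\sclass\s*=\s*"Mso[^"]*"/i',            // class="MsoNormal" and similar
    ];
    $html = preg_replace($patterns, '', $html) ?? $html;

    // Drop "mso-*" declarations from double-quoted style attributes,
    // keeping any other declarations so span-based formatting survives.
    $html = preg_replace_callback(
        '/\sstyle\s*=\s*"([^"]*)"/i',
        function (array $m): string {
            $css = html_entity_decode($m[1], ENT_QUOTES, 'UTF-8');
            $kept = [];
            foreach (explode(';', $css) as $decl) {
                $decl = trim($decl);
                if ($decl !== '' && stripos($decl, 'mso-') !== 0) {
                    $kept[] = $decl;
                }
            }
            if ($kept === []) {
                return ''; // nothing left: drop the attribute entirely
            }
            $css = htmlspecialchars(implode('; ', $kept), ENT_QUOTES, 'UTF-8');
            return ' style="' . $css . '"';
        },
        $html
    ) ?? $html;

    return $html;
}

/** Replace curly quotes, dashes and similar characters with ASCII. */
function normalizeSmartPunctuation(string $text): string
{
    return strtr($text, [
        "\u{2018}" => "'",   // left single quotation mark
        "\u{2019}" => "'",   // right single quotation mark
        "\u{201C}" => '"',   // left double quotation mark
        "\u{201D}" => '"',   // right double quotation mark
        "\u{2013}" => '-',   // en dash
        "\u{2014}" => '--',  // em dash
        "\u{2026}" => '...', // horizontal ellipsis
        "\u{00A0}" => ' ',   // non-breaking space
    ]);
}

$pasted = '<p class="MsoNormal" style="margin:0; mso-pagination:none">Hi<o:p></o:p></p>';
echo normalizeSmartPunctuation(stripWordMarkup($pasted));
// prints: <p style="margin:0">Hi</p>
```

Filtering the style attribute declaration by declaration, rather than deleting it, is what lets span-based formatting from TinyMCE remain intact.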
