How to do regex HTML tag replace in SQL Server?
Question
I have a table in SQL Server 2005 with hundreds of rows with HTML content. Some of the content has HTML like:
<span class=heading-2>Directions</span>
where "Directions" changes depending on page name.
I need to change all the <span class=heading-2> and </span> tags to <h2> and </h2> tags.
I wrote this query to do content changes in the past, but it doesn't work for my current problem because of the ending HTML tag:
Update ContentManager
Set ContentManager.Content = replace(Cast(ContentManager.Content AS NVARCHAR(Max)), 'old text', 'new text')
Does anyone know how I could accomplish the span to h2 replacing purely in T-SQL? Everything I found showed I would have to do CLR integration. Thanks!
Recommended answer
Indeed, T-SQL does not natively support regular expressions, and this is the sort of problem for which regular expressions would be the tool of choice. First, I'll say that the complexity of the solution depends greatly on how consistent your data is. For example, suppose we search for items with the heading:
Select ..
From ...
Where HtmlContent Like '<span class="heading-2">%'
This assumes no additional spacing between span and class, and no additional spacing after the final double quote before the closing bracket. We could write '%<span%class="heading-2"%>%' to account for the spaces, but that would also find div tags marked as heading-2 in the same content as any span tag. If this latter scenario shouldn't happen but you might have varying spaces, then use this revised pattern. Where we will really run into trouble is the closing tag. Suppose our content looks like so:
<span class="heading-2"> Foo <span class="heading-3">Bar</span> And Gamma Too</span> .... <span class="heading-4">Fubar Is the right way!</span>...
It is not so simple to find the correct closing span tag to change to </h2>. You cannot simply find the first </span> and change it to </h2>. If you knew that you had no nested span tags, then you could write a user-defined function that would do it:
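Create Function dbo.ReplaceSpanToH2( @HtmlContent nvarchar(max) )
Returns nvarchar(max)
As
Begin
    Declare @OpenTag nvarchar(50)
    Declare @StartPos int
    Declare @ClosePos int
    Set @OpenTag = '<span class="heading-2">'
    Set @StartPos = CharIndex(@OpenTag, @HtmlContent)
    -- Handle each heading span in turn; assumes span tags are never nested
    While @StartPos > 0
    Begin
        -- Find the next </span> after this opening tag
        Set @ClosePos = CharIndex('</span>', @HtmlContent, @StartPos)
        -- Unmatched opening tag: stop here, since Stuff at position 0 returns NULL
        If @ClosePos = 0
            Break
        -- Replace the closing tag first so @StartPos remains valid
        Set @HtmlContent = Stuff(@HtmlContent, @ClosePos, 7, '</h2>')
        Set @HtmlContent = Stuff(@HtmlContent, @StartPos, Len(@OpenTag), '<h2>')
        -- Look for the next heading span
        Set @StartPos = CharIndex(@OpenTag, @HtmlContent, @StartPos)
    End
    Return @HtmlContent
End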
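As a rough sketch of how the function above might be applied, the statements below reuse the ContentManager table and Content column from the question. They assume the function was created in the dbo schema, and that the stored markup uses the quoted form class="heading-2" (the question's sample shows the attribute unquoted, so adjust the literal in both the function and the Where clause to match your data). Running the Select first gives a preview of the change before anything is updated.

```sql
-- Preview: compare the original and converted content side by side.
Select Cast(Content As nvarchar(max)) As OriginalContent,
       dbo.ReplaceSpanToH2(Cast(Content As nvarchar(max))) As ConvertedContent
From ContentManager
Where Cast(Content As nvarchar(max)) Like '%<span class="heading-2">%'

-- Apply: only touch rows that actually contain the heading span.
Update ContentManager
Set Content = dbo.ReplaceSpanToH2(Cast(Content As nvarchar(max)))
Where Cast(Content As nvarchar(max)) Like '%<span class="heading-2">%'
```

The Where clause keeps the update from rewriting rows that have nothing to convert. Scalar user-defined functions must be called with at least a two-part name, which is why the dbo. prefix appears in the call.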