Regular Expression, only replace first occurrence of HTML tag

- Last updated: 2015-12-20
Question:

I've got several files that have double tags in them (either on purpose or by accident). I'm looking to find the first occurrence only of the tag and append it with additional HTML code. But the second occurrence shouldn't be affected. I'm using TextWrangler. The regex I'm using now replaces both occurrences rather than just the first.

Text:

<body someattribute=...>
existing content
<body onUnload=...>

RegEx I'm using:

Find: (\<body.*\>)

Replace with: 

\n\1
appended HTML code

Current result:

<body someattribute=...>
appended HTML code
existing content
<body onUnload=...>
appended HTML code

So it's adding my appended code twice. I just want it to happen to the first only.

Comment: Don't use regexes with HTML.

Comment: You should be more specific with your regex if the two tags aren't 100% identical. If they are, and your text editor (I've never used that one) doesn't have an option to replace only one instance at a time, there's not a whole lot you can do without a much longer and more complicated regex.

Comment: I think this question is more about your text editor than about regular expressions. At the moment I can't think of any regex feature that would help here (except including someattribute in your regex, but I think this is not an option).

Comment: @AndreaCorbellini Oh, it's possible to do this with regex. For instance, you could use capture groups to capture the "existing content" down to the next tag, then add the new text in the replacement while keeping the old. But it's much more complex than just using text editor options (there should be one for "replace one", instead of every replace being "replace all", after all).

Comment: @Kendra: oh, you're right! (<body.*?>)(.*<body)

Answer:

Regex:

(?s)(<body.*?>)(.*)

Replace:

\1\nappended content\n\2

Explanation:

  • (?s) makes the . character match new lines. Without this, . matches any character except a new line, so a match cannot extend past the end of the current line.
  • (<body.*?>) Finds the first "body" and captures as group 1 (\1).
  • (.*) Finds everything after the first "body", and captures as group 2 (\2).
  • Replaces everything that was found with group 1 + new line + appended content + new line + group 2

Tested in Notepad++
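
For anyone who wants to see the difference outside an editor, here is a minimal sketch using Python's standard `re` module. This is an assumption for illustration: the thread itself only mentions TextWrangler and Notepad++, and their regex engines may differ in details. `SAMPLE` is just the text from the question, and "appended content" is the placeholder from the answer.

```python
import re

# Sample text from the question (the "..." is part of the sample, not elided code).
SAMPLE = (
    "<body someattribute=...>\n"
    "existing content\n"
    "<body onUnload=...>\n"
)

# Pattern from the question: without (?s), "." stops at each newline, so every
# <body ...> line is a separate match and a "replace all" edits both of them.
both = re.sub(r"(<body.*>)", r"\1\nappended content", SAMPLE)

# Pattern from the answer: (?s) lets "." cross newlines, so group 2 swallows
# the rest of the text (including the second tag) and only one match exists.
once = re.sub(r"(?s)(<body.*?>)(.*)", r"\1\nappended content\n\2", SAMPLE)

print(both.count("appended content"))  # 2
print(once.count("appended content"))  # 1
```

Because group 2 starts with the newline that follows the first tag, the replacement above leaves an empty line after the appended content; drop the second \n from the replacement if that is not wanted.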

Asker: Thanks for this. I tried it out, but it added the "appended text" below the first <body> tag and the second <body> tag. I can see how it should do what I was hoping for, but it didn't.

Comment: @scotthorvath If the regex in this answer is still appending below the second tag as well, it might be that your text editor isn't playing very nicely with regex. Actually, it could be that you don't have the option set for . to match new lines. Is there an option for this in your text editor? (If there isn't, it might take regular regex arguments, but I unfortunately don't know any of those offhand.)

Comment: @scotthovath - What application / editor are you using?

Comment: @tekim It says in the question they're using TextWrangler.

Comment: @scotthorvath Try adding (?s) to the start of this regex, e.g. (?s)(<body.*?>)(.*), as this will make . match new lines and therefore make this regex work.

Answer:

You could just do it from the beginning.

Find: (?s-m)^.*?<body.*?>\K
Replace: HTML code
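
The \K in the pattern above discards everything matched so far from the reported match, so the replacement is inserted right after the first tag. Python's standard `re` module does not support \K, so the sketch below gets a similar effect by capturing the prefix and writing it back, with `count=1` acting as the "replace one" option mentioned in the comments. The function name `append_after_first_body` and its parameters are hypothetical, chosen only for this illustration.

```python
import re


def append_after_first_body(text: str, addition: str) -> str:
    """Insert `addition` right after the first <body ...> tag in `text`."""
    # \A anchors at the very start of the text; DOTALL lets "." cross newlines,
    # and the lazy quantifiers stop at the first <body ...> tag that appears.
    pattern = re.compile(r"\A(.*?<body.*?>)", re.DOTALL)
    # A function replacement avoids backslash processing in `addition`;
    # count=1 makes the single-replacement intent explicit.
    return pattern.sub(lambda m: m.group(1) + addition, text, count=1)


sample = (
    "<body someattribute=...>\n"
    "existing content\n"
    "<body onUnload=...>\n"
)
print(append_after_first_body(sample, "\nappended HTML code"))
```

If the text contains no <body ...> tag at all, the pattern does not match and the text comes back unchanged.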