C# Regex definition issue

- Content updated: 2015-12-20

Question:

I am confused about the regex ".*". If I apply it to this sentence:

we have a problem with "string one" and "string two". Please respond.

I think it should work like this:

1) First, it should find the double quotation mark in the sentence.

2) Then the dot matches any character (except a line break) and the star * repeats that. So the result should be like this:

 "string one" and "string two". Please respond.

I think .* finishes the sentence, because it matches every character except a line break. So the second double quotation mark in the regex should have nothing left to match, because .* has already consumed the rest of the sentence. I think I am making a mistake and I don't understand how this works. Can anyone explain the procedure to me?

Comment: .* is greedy and eats everything between the outer quotes.

Answer:

You got that right. .* goes up until the end of the string or the first newline.

Then, it will backtrack.

See, the next regex token is a required ". So you have to match it for the match to succeed.

Therefore, the * will "give up" one character, and an attempt to match " will be made again on the resulting string:

"string one" and "string two". Please respond

This fails, so the * will give up another one, and so on:

"string one" and "string two". Please respon
"string one" and "string two". Please respo
"string one" and "string two". Please resp
"string one" and "string two". Please res
... snip ...
"string one" and "string two". P
"string one" and "string two". 
"string one" and "string two"
"string one" and "string two

Aha, that substring is immediately followed by ", so it is consumed and the match succeeds at:

"string one" and "string two"

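To check this against .NET, here is a minimal sketch. It assumes C# 9 or later (top-level statements), and the variable names are mine, not from the question. It runs the greedy pattern through Regex.Match and then imitates the give-back steps described above by hand. The loop is only an illustration of the idea, not how the .NET engine is actually implemented.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "we have a problem with \"string one\" and \"string two\". Please respond.";

// What the engine returns for the greedy pattern ".*"
string greedy = Regex.Match(input, "\".*\"").Value;
Console.WriteLine(greedy);      // "string one" and "string two"

// Hand-written imitation of the give-back steps (illustration only).
int start = input.IndexOf('"'); // the opening quote in the pattern matches here
int end = input.Length;         // .* first runs to the end of the line
int givenUp = 0;

// Give back one character at a time until the character at 'end' is a quote,
// i.e. until the closing quote in the pattern has something to match.
while (end > start + 1 && (end == input.Length || input[end] != '"'))
{
    end--;
    givenUp++;
}

string simulated = input.Substring(start, end - start + 1);
Console.WriteLine(simulated);   // same text as the greedy match
Console.WriteLine(givenUp);     // number of characters .* gave back
```

The hand-written loop and Regex.Match should agree on the matched text, which is the point of the explanation: greedy does not mean "never gives anything back".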
You might want to try the ungreedy (lazy) version: ".*?". In that case, *? will try to match any character (.) as few times as possible for a successful match.

For the match to succeed, you still need a closing ", so the .*? version will try to consume characters until the engine can proceed forward in the pattern. The result you'll get will be:

"string one"
Comment: Good explanation. When I was starting to learn regexes I was stuck on backtracking and could not understand how it worked.

OP: Really good explanation, thanks +1

Answer:

You have a second " in your regex that is matching the last literal ". Remove that and just use

".*