C# Regex definition issue

- Content updated: 2015-12-20



I'm confused about the regex ".*". If I apply it to this sentence: we have a problem with "string one" and "string two". Please respond. I think it should work like this:

1) First, it finds a double quotation mark in the sentence.

2) Then the dot matches any character (except a line break), and the star * repeats that. So the result should be:

 "string one" and "string two". Please respond.

I think .* runs to the end of the sentence, because it matches every character except a line break. That would leave nothing for the second double quotation mark in the pattern to match, since .* has already consumed the rest of the sentence. I think I'm making a mistake and I don't understand how this works. Can anyone explain the procedure?


(Comment: * is greedy and eats everything between the outer quotes.)


You got that right. .* goes up until the end of the string or the first newline.

Then, it will backtrack.

See, the next regex token is a required ". So you have to match it for the match to succeed.

Therefore, the * will "give up" one character, and an attempt to match " will be made again on the resulting string:

"string one" and "string two". Please respond

This fails, so the * will give up another character, and so on:

"string one" and "string two". Please respon
"string one" and "string two". Please respo
"string one" and "string two". Please resp
"string one" and "string two". Please res
... snip ...
"string one" and "string two". P
"string one" and "string two". 
"string one" and "string two"
"string one" and "string two

Aha, that substring is immediately followed by ", so it is consumed and the match succeeds at:

"string one" and "string two"
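The walkthrough above can be checked with a minimal C# sketch. The variable names are illustrative and not from the original question; it assumes the standard `Regex.Match` method from `System.Text.RegularExpressions` and a C# version that supports top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// The sentence from the question (the variable name is illustrative).
string input = "we have a problem with \"string one\" and \"string two\". Please respond.";

// Greedy: .* first runs to the end of the line, then backtracks
// one character at a time until the closing quote can match.
Match greedy = Regex.Match(input, "\".*\"");

Console.WriteLine(greedy.Value);  // "string one" and "string two"
```

The match starts at the first quote and ends at the last one, because backtracking stops as soon as the final " in the pattern can match.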

You might want to try the non-greedy (lazy) version: ".*?". In that case, *? will try to match any character (.) as few times as possible for a successful match.

For the match to succeed, you still need a closing ", so the .*? version will try to consume characters until the engine can proceed forward in the pattern. The result you'll get will be:

"string one"
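As a rough illustration of the lazy version, here is a small C# sketch. The variable names are made up for this example, and the use of `Regex.Matches` to collect every quoted string goes slightly beyond what the answer describes.

```csharp
using System;
using System.Text.RegularExpressions;

// The sentence from the question (the variable name is illustrative).
string input = "we have a problem with \"string one\" and \"string two\". Please respond.";

// Lazy: .*? grows one character at a time and stops at the first
// position where the closing quote can match.
Match lazy = Regex.Match(input, "\".*?\"");
Console.WriteLine(lazy.Value);  // "string one"

// Regex.Matches resumes searching after each match,
// so each quoted string is reported separately.
MatchCollection all = Regex.Matches(input, "\".*?\"");
foreach (Match m in all)
{
    Console.WriteLine(m.Value);
}
```

With this input, the loop prints "string one" and then "string two", since the search for the second match begins after the first closing quote.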

(Comment: Good explanation. When I was starting to learn regexes, I got stuck on backtracking and could not grasp how it worked.)


(Comment: Really good explanation, thanks +1)


You have a second " in your regex that is matching the last literal ". Remove that and just use