How to scan for substrings with specific characters in them

- Content updated: 2016-02-15
Question:

This is a follow-up to this question: "How to scan and return a set of words with specific characters in them in Ruby".

We want to scan for words starting with a certain set of letters and then return them in an array. Something like this:

 b="h ARCabc s and other ARC12".scan(/\w+ARC*\w+/)

and get back:

["ARCabc","ARC12"]

How would I do this (and I know this is very similar to what I asked yesterday)?

Solution:

Just use the following regex:

\bARC\w*\b

or (to exclude underscores from matching)

\bARC[[:alnum:]]*\b

The regex matches:

  • \b - a word boundary (ARC at the start of a word only)
  • ARC - a fixed sequence of characters
  • \w* - 0 or more letters, digits or underscores. NOTE: if you want to limit the matches to letters and digits only, replace \w* with [[:alnum:]]*.
  • \b - end of word (trailing) boundary.

The IDEONE demo outputs ARCabc and ARC12.

NOTE2: If you plan to match Unicode strings, consider using either of the following regexps:

  • \bARC\p{Word}*\b - this variation will match words with underscores after ARC
  • \bARC[\p{L}\p{M}\d]*\b - this regex will match words that only have digits and Unicode letters after ARC.
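
The two Unicode variants can be compared with a small Ruby sketch. The sample words below are made up for illustration (they are not from the question) and are written with \u{...} escapes so the snippet does not depend on the file encoding.

```ruby
# Hypothetical sample: "ARCé12", "ARC_ño" and "ARCabc".
str = "ARC\u{E9}12 ARC_\u{F1}o ARCabc"

# \p{Word} covers Unicode letters, marks, digits and the underscore.
with_underscores = str.scan(/\bARC\p{Word}*\b/)

# Letters, combining marks and digits only: the underscore word is skipped.
letters_and_digits = str.scan(/\bARC[\p{L}\p{M}\d]*\b/)

p with_underscores    #=> ["ARCé12", "ARC_ño", "ARCabc"]
p letters_and_digits  #=> ["ARCé12", "ARCabc"]
```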
Comment: thx so much - great explanation

Solution:

For good readability, you could split the string into words and then select the ones you want:

str = "h ARCabc s and other ARC12"
target = "ARC"

str.split.select { |w| w.include?(target) }
  #=> ["ARCabc", "ARC12"] 

If the words must begin with target:

str.split.select { |w| w.start_with?(target) }
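
To see how the two selections differ, here is a small sketch on a made-up string (not the one from the question) that contains a word with ARC in the middle and some punctuation. One thing to keep in mind: split breaks on whitespace only, so punctuation stays attached to the words, whereas a regex with \b stops at the word boundary.

```ruby
# Hypothetical input with punctuation and a word that merely contains ARC.
str = "see ARCabc, xARCy and ARC12."
target = "ARC"

containing = str.split.select { |w| w.include?(target) }
starting   = str.split.select { |w| w.start_with?(target) }

# Regex version for comparison: punctuation is not part of the match.
scanned = str.scan(/\bARC\w*\b/)

p containing  #=> ["ARCabc,", "xARCy", "ARC12."]
p starting    #=> ["ARCabc,", "ARC12."]
p scanned     #=> ["ARCabc", "ARC12"]
```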