Regex in R
^ = Beginning of Line or String
$ = End of Line or String
. = Matches any single character, like a Joker Card (inc blank spaces)
\\. = A literal period (escape the dot to search for an actual period)
\\d or [[:digit:]] = 0, 1, 2, 3, ...
\\w or [[:alnum:]_] = a, b, c, ..., 1, 2, 3, ..., _
[A-Za-z] or [[:alpha:]] = A, B, C, ... a, b, c, ...
[aeiou] = a, e, i, o, u
\\s or [[:space:]] = " ", tabs or line breaks
\\D = any character that is not a digit
\\W = any character that is not a word char (word chars inc letters, numbers and underscore)
\\S = any character that is not a space, tab or line break
[^A-Za-z] = any character that is not a letter of the alphabet
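A minimal sketch of the anchors and character classes above, assuming the stringr package is installed. The sample vector x is hypothetical and only there for illustration.

```r
library(stringr)  # assumption: stringr is installed

# Hypothetical sample strings, not taken from the notes above.
x <- c("cat 9", "Dog_7!", "a.b", "axb")

# Anchors: "^" pins the match to the start, "$" to the end.
starts_with_D  <- str_detect(x, "^D")     # FALSE  TRUE FALSE FALSE
ends_with_digit <- str_detect(x, "\\d$")  # TRUE FALSE FALSE FALSE

# An unescaped "." matches any character; "\\." matches only a real period.
any_char   <- str_detect(x, "a.b")        # FALSE FALSE  TRUE  TRUE
literal_dot <- str_detect(x, "a\\.b")     # FALSE FALSE  TRUE FALSE

# POSIX classes need double brackets inside a pattern.
first_digit <- str_extract(x, "[[:digit:]]")  # "9" "7" NA NA

# Negated class: characters that are neither word characters nor whitespace.
# Only "!" in "Dog_7!" and "." in "a.b" qualify ("_" counts as a word char).
strange <- str_extract_all(x, "[^\\w\\s]")
```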
\\d{2} = exactly 2 digits (double digit)
\\d{2,3} = min 2 digits, max 3 digits (no space after the comma)
\\d{2,} = min 2 digits, no max
\\d+ = 1 or more repetitions
\\d* = 0 or more repetitions
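The quantifiers above can be tried out with a short sketch like the following, again assuming stringr is installed. The vectors x and y are hypothetical sample data.

```r
library(stringr)  # assumption: stringr is installed

# Hypothetical sample strings.
x <- c("7", "42", "123", "2024", "no digits")

exactly_two  <- str_extract(x, "\\d{2}")    # NA "42" "12"  "20"   NA
two_to_three <- str_extract(x, "\\d{2,3}")  # NA "42" "123" "202"  NA
two_or_more  <- str_extract(x, "\\d{2,}")   # NA "42" "123" "2024" NA
one_or_more  <- str_extract(x, "\\d+")      # "7" "42" "123" "2024" NA

# "*" also accepts zero repetitions, "+" needs at least one.
y <- c("a", "a1", "a12")
star <- str_detect(y, "^a\\d*$")  # TRUE TRUE TRUE
plus <- str_detect(y, "^a\\d+$")  # FALSE TRUE TRUE
```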
| = OR (pattern = "apple|apples") [Not limited to 2 options only]
# Do not put spaces around the vertical bar(s); a space there is matched literally
? = make the preceding group or character optional, or
make the preceding quantifier "lazy" instead of "greedy".
pattern = "apples?" detects the same strings as "apple|apples"
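A small sketch of alternation and the optional "?", assuming stringr is installed. The vector x is hypothetical. It also shows one difference worth knowing: both patterns detect the same strings, but they can extract different text, because alternatives are tried from left to right.

```r
library(stringr)  # assumption: stringr is installed

# Hypothetical sample strings.
x <- c("apple", "apples", "pear")

# Alternation is not limited to two options.
any_fruit <- str_detect(x, "apple|pear|plum")   # TRUE TRUE TRUE

# Spaces around "|" become part of the alternatives ("apple " or " pear").
with_spaces <- str_detect(x, "apple | pear")    # FALSE FALSE FALSE

# Same detection result for both patterns ...
opt_detect <- str_detect(x, "apples?")          # TRUE TRUE FALSE
alt_detect <- str_detect(x, "apple|apples")     # TRUE TRUE FALSE

# ... but the extracted text differs: "apple" wins as the first alternative.
opt_extract <- str_extract("apples", "apples?")       # "apples"
alt_extract <- str_extract("apples", "apple|apples")  # "apple"
```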
Greedy and Lazy [See Examples 5a and 5b]
pattern = ".*3" (match everything up to and including the last 3) [Greedy]
pattern = ".*?3" (match everything up to and including the first 3) [Lazy]
Examples:
1) Find all digits and spaces
stringr::str_extract_all(x, pattern = "[\\d\\s]")
2) List all strings that end with a space followed by a digit
x[stringr::str_detect(x, pattern = "\\s\\d$")]
3) List all strings that contain "Grey" or "Gray"
x[stringr::str_detect(x, pattern = "Gr[ae]y")]
4) List all strings with strange characters (neither word chars nor spaces)
x[stringr::str_detect(x, pattern = "[^\\w\\s]")]
5) Greedy vs Lazy
a) Greedy
> stringr::str_extract("a 3 b 3 c", ".*3")
"a 3 b 3"
b) Lazy
> stringr::str_extract("a 3 b 3 c", ".*?3")
"a 3"
6) Three or more word characters, followed by an at-sign,
one or more word characters, and ".com".
pattern = "\\w{3,}@\\w+\\.com"
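The examples above can be run end to end with a sketch like this one, assuming stringr is installed. The vector x and the address in it are made-up sample data.

```r
library(stringr)  # assumption: stringr is installed

# Hypothetical sample strings.
x <- c("Grey cat 1", "Gray dog", "green #2", "mail bob@site.com")

# 1) All digits and spaces in the first string.
ex1 <- str_extract_all(x[1], "[\\d\\s]")[[1]]     # " " " " "1"

# 2) Strings that end with a space followed by a digit.
ex2 <- x[str_detect(x, "\\s\\d$")]                # "Grey cat 1"

# 3) Strings that contain "Grey" or "Gray".
ex3 <- x[str_detect(x, "Gr[ae]y")]                # "Grey cat 1" "Gray dog"

# 4) Strings with characters that are neither word chars nor spaces.
ex4 <- x[str_detect(x, "[^\\w\\s]")]              # "green #2" "mail bob@site.com"

# 5) Greedy vs lazy.
ex5a <- str_extract("a 3 b 3 c", ".*3")           # "a 3 b 3"
ex5b <- str_extract("a 3 b 3 c", ".*?3")          # "a 3"

# 6) Simple ".com" address pattern.
ex6 <- str_extract(x, "\\w{3,}@\\w+\\.com")       # NA NA NA "bob@site.com"
```

The pattern in 6) is only as strict as the description says: it would not match addresses with dots or hyphens before the at-sign, or domains other than ".com".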