Start using regex
The definitions below are taken from Regex Basics to give a basic understanding before we start practising. I have covered only a few basics so far and will keep updating the post as time permits.
Literal: A literal is any character we use in a search or matching expression. For example, to find ind in windows, ind is a literal string: each character plays a part in the search, and it is literally the string we want to find.
Metacharacter: A metacharacter is one or more special characters that have a unique meaning and are NOT used as literals in the search expression. For example, the character ^ (circumflex or caret) is a metacharacter.
Target string: This term describes the string that we will be searching, that is, the string in which we want to find our match or search pattern.
Search expression: Most commonly called the regular expression. This term describes the pattern we use to search our target string and find what we want.
Escape sequence: An escape sequence is a way of indicating that we want to use one of our metacharacters as a literal. In a regular expression, we place the metacharacter \ (backslash) in front of the metacharacter that we want to treat as a literal. For example, to find (s) in the target string window(s) we use the search expression \(s\), and to find \\file in the target string c:\\file we use the search expression \\\\file (each of the 2 literal \ characters is preceded by an escaping \).
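The escape rule above can be sketched in Ruby, the language used for the examples in this post. This is only an illustration: it assumes regex literals (/.../) instead of string patterns, and the constant names are made up for the example.

```ruby
# Target strings taken from the definitions above.
WINDOW = 'window(s)'
PATH   = "c:\\file" # the actual string is c:\file (one backslash)

# Escape the parentheses by hand so they are treated as literals.
BY_HAND = WINDOW.match(/\(s\)/)

# Regexp.escape adds the backslashes for us.
BY_RUBY = WINDOW.match(Regexp.new(Regexp.escape('(s)')))

# One literal backslash in the target needs \\ in a regex literal.
BACKSLASH = PATH.match(/\\file/)

puts BY_HAND[0]   # (s)
puts BY_RUBY[0]   # (s)
puts BACKSLASH[0] # \file
```

When the pattern is passed as a string, as in the rest of this post, Ruby's own string escaping is applied first, which is why the backslashes in the later examples appear doubled again.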
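Characters inside square brackets form a character class: the class matches exactly one character from the set, and match returns only the first match in the target string. The commas in the patterns below are themselves members of the class, not separators.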
"c:\\file".match('[1,2,3]')
=> nil
"c:\\file1".match('[1,2,3]')
=> #<MatchData "1">
"c:\\file123123".match('[12,2,3]')[0]
=> "1"
"c:\\file123123".match('[12,2,3,1,2,3]')
=> #<MatchData "1">
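To see that a comma inside the brackets is treated as a member of the class rather than a separator, here is a small sketch. The target string 'a,b' and the constant names are assumptions made for the illustration.

```ruby
# A comma inside square brackets is just another member of the class.
COMMA_MATCH = 'a,b'.match('[1,2,3]')

# Without commas the class accepts only the digits 1, 2 and 3.
NO_COMMA = 'a,b'.match('[123]')

# Listing a character twice changes nothing: both classes accept the same set.
SAME_RESULT = 'c:\\file123123'.match('[12,2,3]')[0] ==
              'c:\\file123123'.match('[123,]')[0]

puts COMMA_MATCH[0]   # ,
puts NO_COMMA.inspect # nil
puts SAME_RESULT      # true
```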
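Search for \\file followed by a single digit (here, \\file9). A range such as 0-9 or A-C inside the brackets stands for every character between its two endpoints.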
"c:\\file923123".match('\\\\file[0-9]')
=> #<MatchData "\\file9">
"c:\\file923123".match('[0-9]')
=> #<MatchData "9">
"c:\\file923123".match('[0-9A-C]')
=> #<MatchData "9">
"c:\\file923123".match('923[0-9A-C]')
=> #<MatchData "9231">
"c:\\file923-C".match('923-[0-9A-C]')
=> #<MatchData "923-C">
"c:\\file923f".match('923[0-9A-C]')
=> nil
"c:\\file923".match('923[0-9A-C]')
=> nil
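Two things in the results above are easy to miss: ranges are case-sensitive, and a class always has to consume one character. A minimal sketch, assuming regex literals where an option is needed (the constant names are only for illustration):

```ruby
# Ranges are case-sensitive: A-C does not cover a-c.
UPPER = 'C'.match('[0-9A-C]')
LOWER = 'c'.match('[0-9A-C]')

# The i option on a regex literal makes the same class ignore case.
IGNORE_CASE = 'c'.match(/[0-9A-C]/i)

# A class consumes exactly one character, so the pattern needs
# something after 923 in the target.
NOTHING_AFTER = 'c:\\file923'.match('923[0-9A-C]')

puts UPPER[0]              # C
puts LOWER.inspect         # nil
puts IGNORE_CASE[0]        # c
puts NOTHING_AFTER.inspect # nil
```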
The ^ (circumflex or caret) as the first character inside square brackets negates the class: it matches any single character that is NOT listed.
"c:\\file923v".match('923[^0-9A-C]')
=> #<MatchData "923v">
"c:\\file923ffsdfsdf".match('923[^0-9A-C]')
=> #<MatchData "923f">
"c:\\file9238".match('923[^0-9A-C]')
=> nil
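The position of the caret inside the brackets matters. The sketch below uses made-up target strings to show a negating caret, a literal caret, and a negated class that finds nothing because there is no character left to match.

```ruby
# ^ negates only when it is the first character inside the brackets.
NEGATED = '9a'.match('[^0-9]')

# Anywhere else in the class it is an ordinary literal caret.
LITERAL = 'a^b'.match('[0-9^]')

# A negated class still has to consume a character.
EMPTY_TAIL = 'c:\\file923'.match('923[^0-9A-C]')

puts NEGATED[0]         # a
puts LITERAL[0]         # ^
puts EMPTY_TAIL.inspect # nil
```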
The ^ (circumflex or caret) means look only at the start of the target string (in Ruby, the start of any line within it).
"fox Mozilla fox".match('^fox')
=> #<MatchData "fox">
The $ (dollar) means look only at the end of the target string (in Ruby, the end of any line within it).
"Mozilla fox".match('fox$')
=> #<MatchData "fox">
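In Ruby, ^ and $ anchor to lines rather than to the whole string, which only shows up when the target contains a newline. A short sketch with a made-up two-line target, comparing them with the string anchors \A and \z:

```ruby
TWO_LINES = "Mozilla\nfox"

# ^ matches at the start of any line, so the second line counts.
CARET = TWO_LINES.match(/^fox/)
# \A matches only at the very start of the string.
STRING_START = TWO_LINES.match(/\Afox/)

# $ matches at the end of any line; \z only at the very end of the string.
DOLLAR = TWO_LINES.match(/Mozilla$/)
STRING_END = TWO_LINES.match(/Mozilla\z/)

puts CARET[0]             # fox
puts STRING_START.inspect # nil
puts DOLLAR[0]            # Mozilla
puts STRING_END.inspect   # nil
```

For single-line targets like the ones in this post, the two kinds of anchor behave the same.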
The . (period) means any single character in this position, for example, ton. will find tons, tone and tonneau but not wanton because it has no following character.
"wanton".match('ton.')
=> nil
"wanton ".match('ton.')
=> #<MatchData "ton ">
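One detail worth knowing is how the period treats a newline. The sketch below assumes regex literals so that the m option can be shown; the constant names are illustrative.

```ruby
# By default . matches any single character except a newline.
NEWLINE_DEFAULT = "ton\n".match(/ton./)

# With the m option the newline is matched too.
NEWLINE_MULTI = "ton\n".match(/ton./m)

# . stands for exactly one character, so only the next one is taken.
ONE_CHAR = 'tonneau'.match('ton.')

puts NEWLINE_DEFAULT.inspect # nil
puts NEWLINE_MULTI[0].inspect # "ton\n"
puts ONE_CHAR[0]              # tonn
```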
The ? (question mark) matches the preceding character 0 or 1 times only, for example, colou?r will find both color (0 times) and colour (1 time).
"color".match('colou?r')
=> #<MatchData "color">
"colour".match('colou?r')
=> #<MatchData "colour">
The * (asterisk or star) matches the preceding character 0 or more times, for example, tre* will find tree (2 times) and tread (1 time) and trough (0 times).
'treeeseeeeeeeeeeef'.match('tre*')
=> #<MatchData "treee">
'tree'.match('tre*')
=> #<MatchData "tree">
"treah".match('tre*')
=> #<MatchData "tre">
The difference between * and +: * matches the preceding character 0 or more times, whereas + matches it 1 or more times.
'tr'.match('tre*')
=> #<MatchData "tr">
'tr'.match('tre+')
=> nil
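To compare ?, * and + side by side, here is a small sketch. The helper quantifier_results is hypothetical, written only for this post; it returns the text each pattern matches in a target, or nil when there is no match.

```ruby
# Hypothetical helper: maps each pattern to the text it matches, or nil.
def quantifier_results(target)
  ['tre?', 'tre*', 'tre+'].map do |pattern|
    match = target.match(pattern)
    [pattern, match && match[0]]
  end.to_h
end

TR     = quantifier_results('tr')     # ? and * match "tr", + finds nothing
TREE   = quantifier_results('tree')   # ? stops at "tre", * and + take "tree"
TROUGH = quantifier_results('trough') # ? and * match "tr", + finds nothing

puts TREE['tre?'] # tre
puts TREE['tre*'] # tree
puts TREE['tre+'] # tree
```

The ? quantifier takes at most one e, * takes as many as it can (possibly none), and + insists on at least one.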
Post a comment if you feel that more regex basics should be added; I would love to update the post.