
Start using regex

 

The definitions below are taken from Regex Basics to give a basic understanding before we start practising. I have only covered a few basics so far and will keep updating the post as time permits.

Literal     A literal is any character we use in a search or matching expression, for example, to find ind in windows the ind is a literal string - each character plays a part in the search, it is literally the string we want to find.

Metacharacter     A metacharacter is one or more special characters that have a unique meaning and are NOT used as literals in the search expression, for example, the character ^ (circumflex or caret) is a metacharacter.

Target string     This term describes the string that we will be searching, that is, the string in which we want to find our match or search pattern.

Search expression     Most commonly called the regular expression. This term describes the search expression that we will be using to search our target string, that is, the pattern we use to find what we want.

Escape sequence     An escape sequence is a way of indicating that we want to use one of our metacharacters as a literal. In a regular expression, an escape sequence involves placing the metacharacter \ (backslash) in front of the metacharacter that we want to use as a literal. For example, to find (s) in the target string window(s) we use the search expression \(s\). To find \\file in the target string c:\\file we need the search expression \\\\file: each of the two \ characters we want to match as a literal is preceded by an escaping \.
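A minimal Ruby sketch of the escaping rule above. Regexp.escape is Ruby's built-in helper for escaping metacharacters; the variable names are only illustrative. Note that in Ruby source a double-quoted "c:\\file" holds a single backslash, which is why the examples further down show one backslash in the match.

```ruby
# Escape the parentheses by hand, using a Regexp literal.
target = "window(s)"
by_hand = target.match(/\(s\)/)
puts by_hand[0]                  # prints (s)

# Regexp.escape builds the escaped pattern for us.
escaped = Regexp.escape("(s)")
puts escaped                     # prints \(s\)
puts target.match(escaped)[0]    # prints (s)

# "c:\\file" in Ruby source is the 7-character string c:\file
path = "c:\\file"
puts path.length                 # prints 7
puts path.match(/\\file/)[0]     # prints \file
```

Regexp.escape is handy when the text to search for comes from user input and may contain metacharacters.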



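Square brackets define a character class, not an array: the pattern matches any single character listed inside the brackets. Commas are not separators, they are just one more character in the class. match returns only the first match it finds.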
"c:\\file".match('[1,2,3]')
 => nil

"c:\\file1".match('[1,2,3]')
 => #<MatchData "1">

"c:\\file123123".match('[12,2,3]')[0]
 => "1"

"c:\\file123123".match('[12,2,3,1,2,3]')
 => #<MatchData "1">
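A small sketch to show that the commas inside the brackets are ordinary members of the class, and that String#scan can be used when every match is wanted instead of only the first one. The sample strings and variable names are made up for illustration.

```ruby
# The comma is a member of the class, so it can match on its own.
comma_hit = "a,b".match('[1,2,3]')
puts comma_hit[0]        # prints ,

# Without the commas, the class holds only the digits 1, 2 and 3.
no_commas = "a,b".match('[123]')
p no_commas              # prints nil

# match stops at the first hit; scan collects all of them.
all_hits = "c:\\file123123".scan(/[123]/)
p all_hits               # prints ["1", "2", "3", "1", "2", "3"]
```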

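Search for \\file followed by one digit. A range such as 0-9 inside the brackets matches any single character in that range.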
"c:\\file923123".match('\\\\file[0-9]')
 => #<MatchData "\\file9">

"c:\\file923123".match('[0-9]')
 => #<MatchData "9">

"c:\\file923123".match('[0-9A-C]')
 => #<MatchData "9">

"c:\\file923123".match('923[0-9A-C]')
 => #<MatchData "9231">

"c:\\file923-C".match('923-[0-9A-C]')
 => #<MatchData "923-C">

"c:\\file923f".match('923[0-9A-C]')
 => nil

"c:\\file923".match('923[0-9A-C]')
 => nil
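Ranges are case-sensitive, which is one reason a lowercase letter is rejected by [0-9A-C]. A short sketch of two possible ways around that, assuming we want to accept a, b and c as well: list both ranges, or use the /i (ignore case) flag on a Regexp literal. The target string here is a made-up example.

```ruby
# Uppercase range only: lowercase c is not in the class.
strict = "file923c".match('923[0-9A-C]')
p strict                 # prints nil

# Option 1: add the lowercase range to the class.
both_cases = "file923c".match('923[0-9A-Ca-c]')
puts both_cases[0]       # prints 923c

# Option 2: keep the class and ignore case for the whole pattern.
ignore_case = "file923c".match(/923[0-9A-C]/i)
puts ignore_case[0]      # prints 923c
```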

The ^ (circumflex or caret) as the first character inside square brackets negates the class: it matches any single character that is NOT listed.
"c:\\file923v".match('923[^0-9A-C]')
 => #<MatchData "923v">

"c:\\file923ffsdfsdf".match('923[^0-9A-C]')
 => #<MatchData "923f">

"c:\\file9238".match('923[^0-9A-C]')
 => nil

The ^ (circumflex or caret) outside square brackets means look only at the start of the target string.
"fox Mozilla fox".match('^fox')
 => #<MatchData "fox">

The $ (dollar) means look only at the end of the target string.
"Mozilla fox".match('fox$')
 => #<MatchData "fox">
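One Ruby-specific detail worth knowing about the two anchors: in Ruby, ^ and $ match at the start and end of each line, while \A and \z anchor to the whole string. For single-line strings like the ones above there is no difference. The sketch below uses made-up two-line strings to show where they diverge.

```ruby
first_line = "fox\nMozilla"
last_line  = "Mozilla\nfox"

# $ matches before the newline, so "fox" on line 1 counts as an end.
dollar = first_line.match(/fox$/)
puts dollar[0]           # prints fox
# \z only matches at the very end of the string.
p first_line.match(/fox\z/)   # prints nil

# ^ matches right after the newline, so "fox" on line 2 counts as a start.
caret = last_line.match(/^fox/)
puts caret[0]            # prints fox
# \A only matches at the very start of the string.
p last_line.match(/\Afox/)    # prints nil
```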


The . (period) means any single character in this position, for example, ton. will find tons, tone and tonneau but not wanton because it has no following character.

"wanton".match('ton.')
 => nil

"wanton ".match('ton.')
 => #<MatchData "ton ">
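A short sketch of two more things about the period: it consumes exactly one character, and by default in Ruby it does not match a newline unless the /m flag is given. The strings are illustrative.

```ruby
# Only one character after "ton" is consumed.
one_char = "tonneau".match('ton.')
puts one_char[0]         # prints tonn

# By default the period does not match a newline.
no_newline = "ton\n".match(/ton./)
p no_newline             # prints nil

# With the /m flag the period matches a newline too.
with_m = "ton\n".match(/ton./m)
p with_m[0]              # prints "ton\n"
```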

?     The ? (question mark) matches the preceding character 0 or 1 times only, for example, colou?r will find both color (0 times) and colour (1 time).

"color".match('colou?r')
 => #<MatchData "color">

"colour".match('colou?r')
 => #<MatchData "colour">


*     The * (asterisk or star) matches the preceding character 0 or more times, for example, tre* will find tree (2 times) and tread (1 time) and trough (0 times).

'treeeseeeeeeeeeeef'.match('tre*')
 => #<MatchData "treee">

'tree'.match('tre*')
 => #<MatchData "tree">

"treah".match('tre*')
 => #<MatchData "tre">


Difference between * and +: * matches the preceding character 0 or more times, whereas + matches it 1 or more times.

'tr'.match('tre*')
 => #<MatchData "tr">

'tr'.match('tre+')
 => nil
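Because * is allowed to match zero times, it can succeed with an empty match at the very first position, which is easy to overlook. A small sketch using MatchData#begin to show where each match starts; both quantifiers are greedy and take as many characters as they can.

```ruby
# e* matches zero e's at position 0, so the result is an empty string.
star_first = "tree".match('e*')
p star_first[0]          # prints ""
p star_first.begin(0)    # prints 0

# e+ needs at least one e, so it skips ahead to the first run of e's.
plus_first = "tree".match('e+')
p plus_first[0]          # prints "ee"
p plus_first.begin(0)    # prints 2

# Greedy: + takes the whole run of e's that follows "tr".
greedy = "treeeseee".match('tre+')
puts greedy[0]           # prints treee
```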

Post comments if you feel that some more regex basics should be added. I would love to update the post.
