Here's a short article that could be called "RegEx for dummies" (or for beginners). The most important thing about all of this is to read the primer below.
PRIMER: Regular expressions (RegEx) are used to find text that matches certain criteria. RegEx does NOT find words or sentences. It finds single characters. So a regular expression like "gin" does not find the word "gin"; it finds the letter "g" followed by the letter "i" followed by the letter "n". Make a note of this, because everything else becomes much easier once you think of it this way.
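To see the primer in action, here is a minimal sketch using Python's built-in re module (an assumption on my part; any regex tool would do, and the sample sentence is made up). It shows that "gin" is found wherever those three characters appear in a row, whether or not they form a word of their own.

```python
import re

# Made-up sample text: "gin" appears inside "Beginners" and as a word by itself.
text = "Beginners drink gin"

# finditer walks through the text and reports every place where
# the characters g, i, n occur one after another.
matches = [(m.start(), m.group()) for m in re.finditer("gin", text)]

for position, found in matches:
    print(f"found {found!r} at position {position}")
```

Both hits are reported the same way, because the expression knows nothing about words.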
So here are a few examples, followed by an explanation of some of the meta characters in RegEx.
RegEx: "gin" matches Beginners.
RegEx: "gin?" matches Begixxers, because the n is optional, so "gi" alone is enough (? = preceding character is optional)
RegEx: "(ht|f)tp://" matches both http:// and ftp:// (| = a logical OR between the enclosed character sequences)
RegEx: "[1-2][0-9]" matches any number sequence where the 1st digit is 1 or 2 and the following digit is between 0 and 9. So numbers between 10 and 29. But it also matches inside 10000 or 294.51 or 14degrees, because nothing in the expression says what may come before or after the two digits.
RegEx: "/[a-z0-9]/" matches any sequence where a / is followed by exactly one character, either a letter from a-z or a digit from 0 to 9, and then followed by a /. So /a/ and /7/ match, but /ab/ does not, since the brackets stand for one character only.
RegEx: "s.x" matches any sequence where an s is followed by ANY character and then followed by an x. So sex, six, sux, sax are all good matches.
RegEx: "http:.*\.zip" matches the letters http: followed by any character repeated 0 or more times, and then finally .zip. The * means repetition 0 or more times, the . means any character, and \. means the literal dot itself.
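The examples above can be tried out with a few lines of Python. This is only a sketch: it assumes Python's re module, whose regex flavor may differ in details from EditPad Pro's, and the sample texts (such as the example.com addresses) are made up for illustration.

```python
import re

# Each pair is (regular expression, made-up sample text).
examples = [
    ("gin", "Beginners"),
    ("gin?", "Begixxers"),
    ("(ht|f)tp://", "ftp://example.com"),
    ("[1-2][0-9]", "14degrees"),
    ("/[a-z0-9]/", "a/b/c"),
    ("s.x", "saxophone"),
    (r"http:.*\.zip", "get http://example.com/file.zip now"),
]

found = []
for pattern, text in examples:
    # re.search returns the first place in the text where the pattern matches.
    m = re.search(pattern, text)
    found.append(m.group(0) if m else None)
    print(f"{pattern!r} in {text!r} -> {found[-1]!r}")
```

Note how "gin?" only picks up "gi" in Begixxers, and how "[1-2][0-9]" picks the "14" out of 14degrees.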
So here is a tiny explanation:
? = The preceding character is optional. It can be there or not.
(x|y) = A sub group where either x or y is required. One could make the whole group optional with (x|y)?, see ?
[a-z0-9/] = A set for 1 character, which can be any letter from a-z, any digit from 0-9, or a /.
. = Means any character whatsoever (in most tools, any character except a line break).
* = Like ?, it applies to the preceding character, which can be repeated 0 times or an unlimited number of times.
\. = The literal dot itself. It is "escaped" by the \ to tell that we mean the . character and not the . that stands for any character.
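Each meta character in the list can be checked on its own. The sketch below again assumes Python's re module, and the test strings are invented; re.fullmatch is used where the whole string has to fit the expression, re.search where a match anywhere in the text is enough.

```python
import re

checks = {
    # ? : the n may be there or not
    "gin? fits gi": re.fullmatch("gin?", "gi") is not None,
    "gin? fits gin": re.fullmatch("gin?", "gin") is not None,
    # (x|y)? : the whole group is optional
    "(x|y)?z fits z": re.fullmatch("(x|y)?z", "z") is not None,
    "(x|y)?z fits yz": re.fullmatch("(x|y)?z", "yz") is not None,
    # [a-z0-9/] : exactly one character from the set
    "[a-z0-9/] fits /": re.fullmatch("[a-z0-9/]", "/") is not None,
    "[a-z0-9/] fits A": re.fullmatch("[a-z0-9/]", "A") is not None,
    # . : any character
    "s.x fits s-x": re.fullmatch("s.x", "s-x") is not None,
    # * : zero or many repetitions of the preceding character
    "ab* fits a": re.fullmatch("ab*", "a") is not None,
    "ab* fits abbb": re.fullmatch("ab*", "abbb") is not None,
    # \. : a literal dot, versus . meaning any character
    r"\.zip found in filezip": re.search(r"\.zip", "filezip") is not None,
    ".zip found in unzip": re.search(".zip", "unzip") is not None,
}

for description, result in checks.items():
    print(f"{description}: {result}")
```

The last two lines show why the escape matters: without the \, the dot happily accepts the "n" in unzip.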
That's all. Download EditPad Pro and load/write some text, use the search (Ctrl+F) and write your first regular expressions to try searching.
In later articles we shall construct more "game on" kinds of regular expressions, and explore tools like GREP, which has a slightly different and less fully featured regex than EditPad Pro.