Regular expressions: Introduction
1. Regular expressions: Introduction
A regular expression is an expression that can be used to recognize whether a string of characters matches the pattern specified by the expression.
Note: The theory goes much deeper than this, but this is all we need to know for the present purposes.
We will now cover some applications, which are much simpler than the theory.
Many programming languages, including JavaScript, VBScript, and Perl, support regular expressions.
One use of regular expressions is to validate data entered by users into forms.
2. Common uses
Here are some common uses of regular expressions.
data validation (forms, etc.)
web server URL redirection and URL rewriting (e.g., Apache)
almost any type of pattern matching
3. Literals
Regular expression literals in JavaScript are delimited by forward slash characters. The standard backslash escapes used in string literals apply (e.g., \n for newline).
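As a minimal sketch of the literal syntax, the following compares a regular expression literal with the same pattern built by the RegExp constructor. The variable names are illustrative only.

```javascript
// A regular expression literal is written between forward slashes.
const literal = /a\nb/;

// The same pattern built from a string with the RegExp constructor.
// Inside a string, the backslash must itself be escaped to reach the pattern.
const constructed = new RegExp("a\\nb");

// test() returns true if the pattern is found anywhere in the string.
const sample = "a\nb"; // "a", a newline, then "b"
console.log(literal.test(sample));     // true
console.log(constructed.test(sample)); // true
console.log(literal.test("a b"));      // false
```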
Here are some examples. For simplicity, only a subset of the available patterns is covered here.
^ABC matches any string starting with "ABC".
XYZ$ matches any string ending with "XYZ".
ab* matches any string containing the letter "a" followed by zero or more of the letter "b".
a+ matches any string containing one or more of the letter "a".
a? matches zero or one of the letter "a" (that is, the "a" is optional).
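The examples above can be tried directly, as in this rough sketch. The sample strings and the property names are made up for illustration, and the pattern /^xa?y$/ is an extra example used to show the optional "a" in context.

```javascript
// Results of trying each example pattern on a few sample strings.
const anchorResults = {
  startsWithABC: /^ABC/.test("ABCDEF"),    // true
  abcNotAtStart: /^ABC/.test("xABC"),      // false
  endsWithXYZ: /XYZ$/.test("UVWXYZ"),      // true
  // ab* is "a" followed by zero or more "b"; match()[0] is the matched text.
  aThenBs: "xabbbc".match(/ab*/)[0],       // "abbb"
  aThenNoBs: "xac".match(/ab*/)[0],        // "a"
  oneOrMoreA: "baaad".match(/a+/)[0],      // "aaa"
  noA: /a+/.test("bcd"),                   // false
  // a? makes the "a" optional; the anchors force the whole string to be checked.
  optionalAPresent: /^xa?y$/.test("xay"),  // true
  optionalAAbsent: /^xa?y$/.test("xy"),    // true
  twoAs: /^xa?y$/.test("xaay"),            // false
};
console.log(anchorResults);
```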
4. Special characters
Here are some special characters.
The caret "^" matches the following characters at the beginning of a line.
The dollar sign "$" matches the preceding characters at the end of a line.
The decimal point "." matches any single character except the newline character.
To match a special character literally, put the backslash character "\" before that character.
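A short sketch of these special characters follows. One detail worth knowing in JavaScript: without the m (multiline) flag, "^" and "$" match only at the start and end of the whole string, not at each line. The sample strings are illustrative.

```javascript
const specialResults = {
  // "." matches any single character except a newline.
  dotLetter: /a.c/.test("abc"),            // true
  dotHyphen: /a.c/.test("a-c"),            // true
  dotNewline: /a.c/.test("a\nc"),          // false
  // A backslash turns a special character into a literal one.
  escapedDot: /3\.14/.test("3.14"),        // true
  escapedDotOther: /3\.14/.test("3x14"),   // false
  unescapedDot: /3.14/.test("3x14"),       // true
  escapedDollar: /\$5/.test("costs $5"),   // true
  // "^" matches at the start of each line only when the m flag is set.
  caretWholeString: /^b/.test("a\nb"),     // false
  caretMultiline: /^b/m.test("a\nb"),      // true
};
console.log(specialResults);
```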
5. Repeating characters
The asterisk "*" matches the preceding character 0 or more times.
The plus "+" matches the preceding character 1 or more times.
The question mark "?" matches the preceding character 0 or 1 times.
The pattern {n}, where n is a positive integer, matches exactly n occurrences of the preceding character.
The vertical bar "|" is used for disjunction (i.e., the logical "or").
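As a small illustration of the counted repeat and the vertical bar, the sketch below uses two made-up patterns. The first is anchored with "^" and "$" so that the count applies to the whole string.

```javascript
// Exactly three occurrences of "a", and nothing else.
const exactlyThree = /^a{3}$/;

// Either "cat" or "dog", anywhere in the string.
const catOrDog = /cat|dog/;

const repeatResults = {
  three: exactlyThree.test("aaa"),    // true
  two: exactlyThree.test("aa"),       // false
  four: exactlyThree.test("aaaa"),    // false
  dog: catOrDog.test("hotdog"),       // true
  cat: catOrDog.test("catalog"),      // true
  neither: catOrDog.test("bird"),     // false
};
console.log(repeatResults);
```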
6. Character ranges
Square brackets delimit character ranges for matches.
The pattern [xyz] matches any one of the enclosed characters.
The pattern [^xyz] matches any single character that is not one of the enclosed characters.
The pattern [0-9] matches any digit character from 0 to 9.
The pattern [0-9] is the same as 0|1|2|3|4|5|6|7|8|9.
The pattern [A-Z] matches any character from A to Z.
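The pattern [0-9A-Za-z] matches any digit or uppercase or lowercase alphabetic character. Ranges are written one after another with no separator; a comma inside the brackets would itself be matched as a character.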
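The character ranges above can be checked with a sketch like the following. Note that a comma placed inside the brackets is treated as one more character to match, not as a separator. The sample strings are illustrative.

```javascript
const classResults = {
  oneOfXyz: /^[xyz]$/.test("y"),                           // true
  notOneOfXyz: /^[xyz]$/.test("w"),                        // false
  negated: /^[^xyz]$/.test("w"),                           // true
  negatedRejectsX: /^[^xyz]$/.test("x"),                   // false
  digit: /^[0-9]$/.test("7"),                              // true
  sameAsAlternation: /^(0|1|2|3|4|5|6|7|8|9)$/.test("7"),  // true
  alphanumeric: /^[0-9A-Za-z]+$/.test("Abc123"),           // true
  rejectsComma: /^[0-9A-Za-z]+$/.test("Abc,123"),          // false
  // With commas inside the brackets, a comma is accepted as well.
  commaInClass: /^[0-9,A-Z,a-z]+$/.test("Abc,123"),        // true
};
console.log(classResults);
```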
7. Grouped patterns
The text matched by any part of a regular expression enclosed in parentheses is remembered and can be accessed via the elements of the match array starting at index 1, or via the predefined names $1 to $9.
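Here is a minimal sketch of grouped patterns in JavaScript, assuming a made-up year-month pattern. The array returned by match() holds the whole match at index 0 and the groups from index 1, and $1, $2 can be used in the replacement string of replace().

```javascript
// Two groups: a four-digit year and a two-digit month (illustrative pattern).
const yearMonth = /([0-9]{4})-([0-9]{2})/;

const groupMatch = "Due 2024-05 at noon".match(yearMonth);
const whole = groupMatch[0]; // "2024-05" (the whole match)
const year = groupMatch[1];  // "2024"    (first group)
const month = groupMatch[2]; // "05"      (second group)

// In a replacement string, $1 and $2 refer to the remembered groups.
const swapped = "2024-05".replace(yearMonth, "$2/$1"); // "05/2024"

console.log(whole, year, month, swapped);
```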
8. Example pattern
What does the following pattern recognize?
[0-9]{5}(\-[0-9]{4})?
9. Getting started
One way to start is to create some sample strings that are matched by the regular expression. For each optional part (zero or one), create one sample with that part and one without it.
12345
12345-6789
Note: One can also work the other way, starting from sample strings to get a regular expression.
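The sample strings can be tried against the pattern as in the sketch below. The anchored variant with "^" and "$" is an assumption added here: without anchors, test() succeeds as soon as any part of the string matches. The extra sample strings are made up.

```javascript
const unanchored = /[0-9]{5}(\-[0-9]{4})?/;
const anchored = /^[0-9]{5}(\-[0-9]{4})?$/;

const samples = ["12345", "12345-6789", "1234", "12345-678", "123456"];

// Each row: [sample, unanchored result, anchored result]
const sampleResults = samples.map((s) => [s, unanchored.test(s), anchored.test(s)]);
for (const row of sampleResults) {
  console.log(row.join("  "));
}
// "12345" and "12345-6789" match both patterns.
// "1234" matches neither.
// "12345-678" and "123456" match only the unanchored pattern,
// because they contain five digits in a row.
```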
10. Example pattern
What does the following pattern recognize?
[A-Za-z][A-Za-z0-9]*
11. Example pattern
What does the following pattern recognize?
bc(.)*xyz$
Try creating regular expressions for the following patterns.
12. Credit card numbers
Credit card numbers of the form 9999-9999-9999-9999, where each 9 represents a digit.
13. Expiration year and month
Expiration year and month of the form 99/99, where each 9 represents a digit.
14. Time
Time of the form 99:99, where each 9 represents a digit, but the first digit of the hour must be from 0 to 2 and the first digit of the minute must be from 0 to 5.
15. Money amounts
Money amounts of the form $9999.99, where the dollar amount can range from 0 to 9999 dollars.
16. Telephone numbers
Telephone numbers of the form 999-9999 or 999-999-9999 or 9-999-999-9999, where each 9 represents a digit.
Write a regular expression to recognize these telephone numbers.
You might start with three different regular expressions, one for each of the following forms.
999-9999
999-999-9999
9-999-999-9999
17. Getting started
One way to get started is to write regular expressions that match each desired case.
\d{3}-\d{4}
\d{3}-\d{3}-\d{4}
\d{1}-\d{3}-\d{3}-\d{4}
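As a rough sketch, the three expressions can be tried one after another. The anchors "^" and "$", the variable names, and the helper matchesAnyForm are assumptions added for illustration. In JavaScript, \d matches a digit from 0 to 9.

```javascript
const localForm = /^\d{3}-\d{4}$/;
const areaForm = /^\d{3}-\d{3}-\d{4}$/;
const prefixForm = /^\d{1}-\d{3}-\d{3}-\d{4}$/;

// Hypothetical helper: true if the string has any one of the three forms.
function matchesAnyForm(s) {
  return [localForm, areaForm, prefixForm].some((re) => re.test(s));
}

console.log(matchesAnyForm("555-1234"));       // true
console.log(matchesAnyForm("800-555-1234"));   // true
console.log(matchesAnyForm("1-800-555-1234")); // true
console.log(matchesAnyForm("55-1234"));        // false
console.log(matchesAnyForm("1-555-1234"));     // false
```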
18. Grouping parentheses
Assume that we want to recover each group, so add parentheses as follows.
(\d{3})-(\d{4})
(\d{3})-(\d{3})-(\d{4})
(\d{1})-(\d{3})-(\d{3})-(\d{4})
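A minimal sketch of recovering the groups from the longest form follows. The anchors and the sample number are assumptions for illustration.

```javascript
const prefixGroups = /^(\d{1})-(\d{3})-(\d{3})-(\d{4})$/;

// match() returns an array: index 0 is the whole match, 1..4 are the groups.
const phoneMatch = "1-800-555-1234".match(prefixGroups);
const phoneParts = phoneMatch.slice(1);
console.log(phoneParts); // [ '1', '800', '555', '1234' ]

// When the string does not match, match() returns null, so check before indexing.
const noMatch = "800-555".match(prefixGroups);
console.log(noMatch); // null
```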
19. Common part
The following is the common part on the right.
(\d{3})-(\d{4})
Add an optional part on the left as follows.
(.*)?(\d{3})-(\d{4})
20. More detail
The match for (.*)? on the left needs to be completed.
This method is similar to the Towers of Hanoi problem in that a smaller problem is being solved. This can be done as follows.
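(((.*)-)?((\d{3})-))?((\d{3})-(\d{4}))
21. Complete it
Now complete the innermost part. The hyphen that follows the leading digit goes inside the optional group, so that it is required only when the digit is present.
(((\d{1})-)?((\d{3})-))?((\d{3})-(\d{4}))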
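As a rough check of the completed pattern, the sketch below anchors it with "^" and "$" (an assumption) and tries the three forms. It uses the variant in which the hyphen after the leading digit sits inside the optional group. A second pattern with that hyphen outside the optional digit is included only for comparison. The helper parsePhone and its property names are hypothetical.

```javascript
const phone = /^(((\d{1})-)?((\d{3})-))?((\d{3})-(\d{4}))$/;
const hyphenOutside = /^((\d{1})?-((\d{3})-))?((\d{3})-(\d{4}))$/;

// Hypothetical helper: picks the digit groups out of the match array.
// Groups that did not take part in the match are undefined.
function parsePhone(s) {
  const m = s.match(phone);
  if (m === null) return null;
  return { prefix: m[3], area: m[5], exchange: m[7], line: m[8] };
}

console.log(parsePhone("555-1234"));       // exchange "555", line "1234"
console.log(parsePhone("800-555-1234"));   // area "800" as well
console.log(parsePhone("1-800-555-1234")); // prefix "1" as well
console.log(parsePhone("1-555-1234"));     // null

// With the hyphen outside the optional digit, the middle form is rejected
// and a stray leading hyphen is accepted.
console.log(hyphenOutside.test("800-555-1234"));  // false
console.log(hyphenOutside.test("-800-555-1234")); // true
```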
Now test it with the regular expression tester.
If you have trouble, try creating a test program and experiment with various regular expressions until you get something that works (see the following discussion on how you might do this).
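One possible shape for such a test program is sketched below. The function checkPattern is a hypothetical helper, not part of JavaScript: it takes a candidate pattern, a list of strings that should match, and a list that should not, and reports every string the pattern gets wrong. It assumes the pattern has no g flag, since test() on a global pattern keeps state between calls.

```javascript
// Hypothetical helper: returns a list of messages, empty if all samples behave.
function checkPattern(pattern, shouldMatch, shouldNotMatch) {
  const failures = [];
  for (const s of shouldMatch) {
    if (!pattern.test(s)) failures.push("expected match: " + s);
  }
  for (const s of shouldNotMatch) {
    if (pattern.test(s)) failures.push("unexpected match: " + s);
  }
  return failures;
}

const good = ["555-1234"];
const bad = ["5551234", "555-12345", "1-555-1234"];

// An anchored candidate handles every sample.
const anchoredFailures = checkPattern(/^\d{3}-\d{4}$/, good, bad);
console.log(anchoredFailures); // []

// Without anchors, strings that merely contain a match slip through.
const unanchoredFailures = checkPattern(/\d{3}-\d{4}/, good, bad);
console.log(unanchoredFailures);
// [ 'unexpected match: 555-12345', 'unexpected match: 1-555-1234' ]
```

Experimenting then amounts to editing the candidate pattern and the two sample lists until the list of failures is empty.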
To see a regular expression tester, see Regular expressions: Tester.