C# - Regular Expressions

Regular Expressions

Regular expressions let you define a text pattern and match it against a string. In the .NET Framework, the System.Text.RegularExpressions namespace contains the Regex class for working with regular expressions.

Searching for a Match

To use the Regex class, first import the System.Text.RegularExpressions namespace:

    using System.Text.RegularExpressions;

The following statements show how to create an instance of the Regex class, specify the pattern to search for, and match it against a string:

    string s = "This is a string";
    Regex r = new Regex("string");
    if (r.IsMatch(s))
    {
        Console.WriteLine("Matches.");
    }

In this example, the Regex constructor takes a string argument, which is the pattern you are searching for. Here, the pattern is the word "string", and it is matched against the string variable s. The IsMatch() method returns true if there is a match (that is, if s contains the word "string").
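
As a minimal sketch, the same check can be wrapped in a complete program. The class name IsMatchSketch and the helper ContainsPattern are illustrative names that do not appear in the original example; the static Regex.IsMatch() overload shown at the end is a convenience for one-off checks.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IsMatchSketch
{
    // Illustrative helper: true when the pattern occurs anywhere in the input.
    public static bool ContainsPattern(string input, string pattern)
    {
        Regex r = new Regex(pattern);
        return r.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(ContainsPattern("This is a string", "string")); // True
        Console.WriteLine(ContainsPattern("This is a string", "number")); // False

        // Static overload: no Regex instance is needed for a one-off check.
        Console.WriteLine(Regex.IsMatch("This is a string", "string"));   // True
    }
}
```

Creating a Regex instance is usually worthwhile when the same pattern is applied to many strings, because the instance can be reused.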

To find the exact position of the text "string" in the variable, use the Match() method of the Regex class. It returns a Match object whose Index property gives the zero-based position of the text that matches the search pattern:

    string s = "This is a string";
    Regex r = new Regex("string");
    if (r.IsMatch(s))
    {
        Console.WriteLine("Matches.");
    }
    Match m = r.Match(s);
    if (m.Success)
    {
        Console.WriteLine("Match found at " + m.Index);
        //---Match found at 10---
    }
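
Besides Index, a Match object also exposes the matched text (Value) and its length (Length). The following sketch prints all three and shows what happens when nothing matches. The class name MatchSketch and the helper FirstIndex are illustrative, and returning -1 for "no match" is an assumption made for this example, not a convention of the Regex class.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchSketch
{
    // Illustrative helper: index of the first match, or -1 when there is none.
    public static int FirstIndex(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Index : -1;
    }

    public static void Main()
    {
        Match m = new Regex("string").Match("This is a string");
        if (m.Success)
        {
            Console.WriteLine(m.Value + " at " + m.Index + ", length " + m.Length);
            // string at 10, length 6
        }

        // No match: Success is false, so the helper falls back to -1.
        Console.WriteLine(FirstIndex("This is a string", "number")); // -1
    }
}
```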

What if a string contains multiple matches? In that case, use the Matches() method of the Regex class. This method returns a MatchCollection object that you can loop through to obtain the index position of each match:

    string s = "This is a string and a long string indeed";
    Regex r = new Regex("string");
    MatchCollection mc = r.Matches(s);
    foreach (Match m1 in mc)
    {
        Console.WriteLine("Match found at " + m1.Index);
        //---Match found at 10---
        //---Match found at 28---
    }
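
If you need the positions for later processing rather than for printing, one option is to collect them into a list. This is a sketch only: the class name MatchesSketch and the helper AllIndexes are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchesSketch
{
    // Illustrative helper: the start index of every match, in order of appearance.
    public static List<int> AllIndexes(string input, string pattern)
    {
        List<int> indexes = new List<int>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            indexes.Add(m.Index);
        }
        return indexes;
    }

    public static void Main()
    {
        List<int> found = AllIndexes("This is a string and a long string indeed", "string");
        Console.WriteLine(found.Count + " matches at " + string.Join(", ", found));
        // 2 matches at 10, 28
    }
}
```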

More Complex Pattern Matching

You can specify more complex searches using regular expression operators. For example, to find out whether a string contains either the word "Mr" or "Mrs", you can use the | operator, like this:

    string gender = "Mr Wei-Meng Lee";
    Regex r = new Regex("Mr|Mrs");
    if (r.IsMatch(gender))
    {
        Console.WriteLine("Matches.");
    }
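
One subtlety of the | operator is that alternatives are tried from left to right, so with "Mr|Mrs" the text "Mrs" is reported as a match for "Mr". That makes no difference to IsMatch(), but it matters when you read the matched text. The sketch below illustrates this; the class name AlternationSketch, the helper FirstTitle, and the sample name "Mrs Smith" are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlternationSketch
{
    // Illustrative helper: the text of the first match, or null when there is none.
    public static string FirstTitle(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        string name = "Mrs Smith"; // made-up sample input

        Console.WriteLine(FirstTitle(name, "Mr|Mrs"));    // Mr  (first alternative wins)
        Console.WriteLine(FirstTitle(name, "Mrs|Mr"));    // Mrs (longer alternative listed first)
        Console.WriteLine(FirstTitle(name, @"\bMrs?\b")); // Mrs (optional "s", whole word only)
    }
}
```

Listing the longer alternative first, or making the trailing "s" optional with ?, are two ways to get the full title back.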

The following table describes regular expression operators commonly used in search patterns.

Operator    Description
.           Match any one character (except a newline)
[ ]         Match any one character listed between the brackets
[^ ]        Match any one character not listed between the brackets
?           Match the preceding element zero or one time
*           Match the preceding element zero or more times
+           Match the preceding element one or more times
{n}         Match the preceding element exactly n times
{n,}        Match the preceding element at least n times
{n,N}       Match the preceding element at least n times, but not more than N times
^           Match at the beginning of the string (or of each line with RegexOptions.Multiline)
$           Match at the end of the string (or of each line with RegexOptions.Multiline)
\<          Match at the beginning of a word (supported by some other regex tools, not by .NET; use \b instead)
\>          Match at the end of a word (supported by some other regex tools, not by .NET; use \b instead)
\b          Match at a word boundary (the beginning or end of a word)
\B          Match at a position that is not a word boundary
\d          Shorthand for digits (0-9)
\w          Shorthand for word characters (letters, digits, and the underscore)
\s          Shorthand for whitespace
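
To make a few of these operators concrete, here is a small sketch that exercises the quantifiers, the anchors, \d, and \b. The sample strings, the class name OperatorSketch, and the helper HasWholeWord are made up for illustration; Regex.Escape() is used so that the searched word is treated as literal text.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OperatorSketch
{
    // Illustrative helper: true when 'word' occurs in 'text' as a whole word.
    public static bool HasWholeWord(string text, string word)
    {
        return Regex.IsMatch(text, @"\b" + Regex.Escape(word) + @"\b");
    }

    public static void Main()
    {
        // ? makes the preceding "u" optional.
        Console.WriteLine(Regex.IsMatch("color", "colou?r"));   // True
        Console.WriteLine(Regex.IsMatch("colour", "colou?r"));  // True

        // * allows zero occurrences of "b"; + requires at least one.
        Console.WriteLine(Regex.IsMatch("ac", "ab*c"));          // True
        Console.WriteLine(Regex.IsMatch("ac", "ab+c"));          // False

        // ^ and $ force the whole string to consist of exactly four digits.
        Console.WriteLine(Regex.IsMatch("2024", @"^\d{4}$"));    // True
        Console.WriteLine(Regex.IsMatch("20245", @"^\d{4}$"));   // False

        // \b restricts the match to whole words.
        Console.WriteLine(HasWholeWord("category", "cat"));      // False
        Console.WriteLine(HasWholeWord("a cat sat", "cat"));     // True
    }
}
```

The verbatim string prefix (@) keeps the backslashes in \d and \b from being interpreted as C# escape sequences.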

Some commonly used search patterns are described in the following table.

Pattern                                               Description
[0-9]                                                 Digits
[A-Fa-f0-9]                                           Hexadecimal digits
[A-Za-z0-9]                                           Alphanumeric characters
[A-Za-z]                                              Alphabetic characters
[a-z]                                                 Lowercase letters
[A-Z]                                                 Uppercase letters
[ \t]                                                 Space and tab
[ \t\r\n\v\f]                                         Whitespace characters
\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*          Email address
http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?         Internet URL
((\(\d{3}\) ?)|(\d{3}-))?\d{3}-\d{4}                  U.S. phone number
\d{3}-\d{2}-\d{4}                                     U.S. Social Security number
\d{5}(-\d{4})?                                        U.S. ZIP code
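
When you use patterns like these to validate input, keep in mind that IsMatch() succeeds if the pattern matches any part of the string. The following sketch wraps the ZIP code and Social Security number patterns in ^ and $ so that the whole input must match. The class and method names are illustrative, and the checks cover format only; they do not tell you whether a ZIP code or number actually exists.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ValidationSketch
{
    // Patterns from the table, anchored with ^ and $ so the entire input must match.
    private static readonly Regex ZipCode = new Regex(@"^\d{5}(-\d{4})?$");
    private static readonly Regex Ssn = new Regex(@"^\d{3}-\d{2}-\d{4}$");

    public static bool IsZipCode(string s) { return ZipCode.IsMatch(s); }
    public static bool IsSsn(string s) { return Ssn.IsMatch(s); }

    public static void Main()
    {
        Console.WriteLine(IsZipCode("12345"));       // True
        Console.WriteLine(IsZipCode("12345-6789"));  // True
        Console.WriteLine(IsZipCode("1234"));        // False
        Console.WriteLine(IsSsn("123-45-6789"));     // True
        Console.WriteLine(IsSsn("123456789"));       // False

        // Without anchors, the ZIP pattern matches a substring of longer input.
        Console.WriteLine(Regex.IsMatch("id 123456", @"\d{5}(-\d{4})?")); // True
    }
}
```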