{infiniteZest}
// Articles. Tutorials. Utilities.
How to match and not match period and other non-alphanumeric characters?
Summary
The backslash escapes special characters, such as the period, in regular expressions. This article contains some examples of how to match and not match special characters like the period.
 
Table of Contents

Removing Non-Alpha Numeric Characters

Removing Special Characters Selectively

Replacing the spaces and periods with dashes

 

The main class that provides regular expression functionality in .NET is the Regex class, which lives in the System.Text.RegularExpressions namespace. To access it, add the following using directive:

using System.Text.RegularExpressions;

The Regex class in the above namespace contains several overloaded Replace methods. All of them are public, and several of them are static as well. A static method can be called directly on the class, without first creating a Regex instance:

title = Regex.Replace(title, @"[^\w\s]", "");

In the above line, let’s say title holds the title of an article or a blog post. The regular expression removes every character that is not a word character (a letter, a digit, or the underscore) or whitespace. In other words, special characters such as the period (.), comma (,), and quotes are removed.
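As a minimal sketch of the static-versus-instance distinction described above, the following compares the two ways of calling Replace. It is written as C# top-level statements (C# 9 or later), and the sample title and variable names are illustrative assumptions rather than anything from a real application.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input (an assumption made for this sketch).
string title = "Hello, World! (2nd edition)";

// Static call: no Regex instance is needed.
string viaStatic = Regex.Replace(title, @"[^\w\s]", "");

// Instance call: build the Regex once, then reuse it for many inputs.
var nonWordOrSpace = new Regex(@"[^\w\s]");
string viaInstance = nonWordOrSpace.Replace(title, "");

Console.WriteLine(viaStatic);   // Hello World 2nd edition
Console.WriteLine(viaInstance); // Hello World 2nd edition
```

The static form is convenient for one-off calls, while the instance form can be preferable when the same pattern is applied to many strings.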

Removing Non-Alpha Numeric Characters

Let’s look at the regular expression string [^\w\s]:

  • \w matches any word character: a letter, a digit, or the underscore
  • \s matches any whitespace character (space, tab, newline)
  • ^ as the first character inside the brackets negates the set, so the class matches anything that is not a word character or whitespace
  • [] defines a character class, which matches exactly one character at a time

So, with the above regular expression, we are telling the Replace method to match every character that is NOT a word character or whitespace.

The Replace overload used here takes three parameters:

  • title - the string input on which we are applying the regular expression pattern
  • [^\w\s] - a regular expression pattern to look for in the string
  • "" - the replacement string. In this case it is an empty string, so, the characters will be deleted.

So, if the title string is: "Why use @blah?"

It becomes: "Why use blah"
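To make the behavior of [^\w\s] concrete, here is a small sketch that applies the same pattern to a few inputs. Apart from the first one, the sample strings are made up for illustration. Note that the underscore survives, because \w counts it as a word character, and that tabs and newlines survive, because \s covers all whitespace.

```csharp
using System;
using System.Text.RegularExpressions;

string pattern = @"[^\w\s]";

// The example from the article.
string a = Regex.Replace("Why use @blah?", pattern, "");
// a == "Why use blah"

// The underscore is a word character, so it is kept.
string b = Regex.Replace("snake_case: it's 100% fine", pattern, "");
// b == "snake_case its 100 fine"

// Tabs and newlines count as whitespace, so they are kept too.
string c = Regex.Replace("tabs\tand\nnewlines!", pattern, "");
// c == "tabs\tand\nnewlines"

Console.WriteLine(a);
Console.WriteLine(b);
Console.WriteLine(c);
```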

Removing Special Characters Selectively

Now, let’s say we don’t want to remove all the ’special’/non-alpha-numeric characters. Let’s say we want to keep the periods (.) and dashes (-) in. In this case, we can explicitly say what we want to keep:

title = Regex.Replace(title, @"[^a-zA-Z0-9\.\-\s]", "");

In this case, the pattern we are looking for is: [^a-zA-Z0-9\.\-\s]

  • a-z : All the lower-case letters
  • A-Z : All the upper-case letters
  • 0-9 : All the digits
  • \. : The period, escaped with \ (inside brackets the escape is optional, but harmless)
  • \- : The dash, escaped so that it is not read as a range
  • \s : Whitespace

By explicitly listing the lower-case letters, upper-case letters, and digits, we match only a subset of what \w covers: \w also matches the underscore and, in .NET, non-ASCII letters and digits.

With the above pattern, the following happens:

Title: "Isn’t asp.net cool?"

Becomes: "Isnt asp.net cool"
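The difference between the explicit a-zA-Z0-9 list and \w can be seen in a short sketch. The "Café-style!" suffix is an assumption added to the article's example to bring in a non-ASCII letter, and the second pattern, which swaps in \w, is shown only for comparison.

```csharp
using System;
using System.Text.RegularExpressions;

// \u2019 is the curly apostrophe, \u00E9 is the letter e with an acute accent.
string title = "Isn\u2019t asp.net cool? Caf\u00E9-style!";

// Explicit ASCII ranges: the accented letter is treated as "special" and removed.
string asciiOnly = Regex.Replace(title, @"[^a-zA-Z0-9\.\-\s]", "");
// asciiOnly == "Isnt asp.net cool Caf-style"

// \w instead of the explicit ranges: the accented letter is kept.
string wordBased = Regex.Replace(title, @"[^\w\.\-\s]", "");
// wordBased == "Isnt asp.net cool Café-style"

Console.WriteLine(asciiOnly);
Console.WriteLine(wordBased);
```

Which of the two is appropriate depends on whether non-ASCII letters should survive in the cleaned title.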

Replacing the spaces and periods with dashes

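If you then apply the following Replace to the above string (the string without some of the special characters):

title = Regex.Replace(title, @"[\s.]+", "-");

Each run of spaces and periods becomes a single dash, and you will get a nice title that’s search engine friendly. The + quantifier sits outside the brackets because, inside a character class, + is only a literal plus sign.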

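For example, this article’s title: "How to match and not match period and other non-alphanumeric characters?"

Becomes, once the special characters are removed, the spaces are replaced, and the result is lower-cased (for instance with ToLowerInvariant(), since Replace does not change case): "how-to-match-and-not-match-period-and-other-non-alphanumeric-characters"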
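Putting the steps together, a search-engine-friendly title could be produced roughly as follows. This is a sketch under stated assumptions, not the site's actual code: ToSlug is a hypothetical helper name, and the lower-casing and the trimming of leading and trailing dashes are added assumptions. The dash pattern here is written [\s.]+, with the quantifier outside the brackets, because a + inside a character class is a literal plus sign rather than "one or more".

```csharp
using System;
using System.Text.RegularExpressions;

// ToSlug is a hypothetical helper name, not part of .NET.
static string ToSlug(string title)
{
    // Step 1: keep only ASCII letters, digits, periods, dashes and whitespace.
    string cleaned = Regex.Replace(title, @"[^a-zA-Z0-9\.\-\s]", "");

    // Step 2: collapse each run of whitespace and periods into one dash.
    string dashed = Regex.Replace(cleaned, @"[\s.]+", "-");

    // Step 3 (assumption): drop leading/trailing dashes and lower-case the result.
    return dashed.Trim('-').ToLowerInvariant();
}

Console.WriteLine(ToSlug("How to match and not match period and other non-alphanumeric characters?"));
// how-to-match-and-not-match-period-and-other-non-alphanumeric-characters

Console.WriteLine(ToSlug("Isn\u2019t asp.net cool?"));
// isnt-asp-net-cool
```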

This is further discussed in the following article:

Some Quick and Easy Ways to Rewrite URLs.
