So, I've been reading up on ASP.NET and trying to familiarize myself with everything quickly. I searched for a regular expression tutorial so that I could brush up on the RegularExpressionValidator control. Of course, I found the info on MSDN:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnpag2/html/paght000001.asp
which led to a link with great information:
http://www.regular-expressions.info/tutorial.html
I know some regular expressions from using vi in Unix/Linux environments, but I guess a refresher is in order. Here is a basic rundown (a quick tutorial) of regular expressions in ASP.NET:
^ - the beginning of the input string
$ - the end of the input string
Note: If you omit these markers, an attacker could affix malicious input to the beginning or end of valid content and bypass your filter.
Note: By default, $ also matches just before a newline at the very end of the input, so a pattern anchored with ^ and $ still accepts a value that ends in a trailing newline. Use \z instead of $ if the match must stop at the true end of the string.
? - makes the preceding character or group optional (zero or one occurrence)
Example: Will(iam)? matches both forms of my name, Will and William
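Here is a small sketch of the anchor notes and the optional group. It uses Python's re module as a stand-in, since I'm assuming these particular constructs behave the same way there as in .NET; the one spelling difference is that Python writes the strict end-of-string anchor as \Z where .NET uses \z. The pattern names and the accepts helper are made up for illustration.

```python
import re

# Hypothetical validation patterns for the name example above.
ANCHORED = re.compile(r"^Will(iam)?$")   # must span the whole input
UNANCHORED = re.compile(r"Will(iam)?")   # may match anywhere in the input
STRICT = re.compile(r"^Will(iam)?\Z")    # \Z is Python's spelling of .NET's \z


def accepts(pattern, text):
    """Return True if the pattern matches somewhere in the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    samples = ["Will", "William", "William<script>", "William\n"]
    for text in samples:
        print(repr(text),
              accepts(ANCHORED, text),
              accepts(UNANCHORED, text),
              accepts(STRICT, text))
```

The unanchored pattern accepts "William<script>" because it only needs to find the name somewhere in the input, which is exactly the bypass the first note warns about. The anchored pattern rejects it, but still lets a trailing newline through unless the strict end anchor is used.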
\ - escapes a metacharacter so that it is treated as a literal character.
\b - matches at a word boundary: the position just before or just after a run of word characters (letters, digits, underscore)
Example: ^\bWilliam\b finds the whole word "William" by itself at the beginning of the input.
\B - matches at any position that is not a word boundary
More info: http://www.regular-expressions.info/wordboundaries.html
\d - matches a single digit.
\D - matches any single character that is not a digit.
[0-9] - represents one numeric digit
Example: \d\+\d matches one digit, a literal +, and another digit, like 1+1. The expression [0-9]\+[0-9] does the same.
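A quick sketch of word boundaries and digit matching, again using Python's re module as a stand-in for .NET (an assumption on my part; \b, \B, \d and [0-9] work the same way for these inputs). The pattern names and the found helper are hypothetical.

```python
import re

WHOLE_WORD = re.compile(r"\bWill\b")        # "Will" as a standalone word
INSIDE_WORD = re.compile(r"Will\B")         # "Will" followed by more word characters
DIGIT_PLUS_DIGIT = re.compile(r"^\d\+\d$")  # one digit, a literal +, one digit
CLASS_FORM = re.compile(r"^[0-9]\+[0-9]$")  # same idea written with a character class


def found(pattern, text):
    """Return True if the pattern matches somewhere in the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    print(found(WHOLE_WORD, "Ask Will today"))     # standalone word
    print(found(WHOLE_WORD, "Ask William today"))  # "Will" is only a prefix here
    print(found(INSIDE_WORD, "William"))
    print(found(DIGIT_PLUS_DIGIT, "1+1"))
    print(found(DIGIT_PLUS_DIGIT, "12+1"))         # two digits before the +
```

Note that the + has to be escaped as \+ here, because an unescaped + is the "one or more" quantifier.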
\t - matches the non-printable tab character.
\n - matches the non-printable line feed character.
\r - matches the non-printable carriage return character.
\s - matches a single whitespace character (space, tab, carriage return, etc.)
\S - matches a single character that is not whitespace.
\xHH - matches the character whose code is the two-digit hexadecimal value HH.
Example: \xA9 represents the copyright symbol (©)
More info: http://www.regular-expressions.info/characters.html
* - represents zero or more occurrences
+ - represents one or more occurrences
{min,max} - represents a range for the number of occurrences; leave out max for no upper limit
Example: \d{0,} represents zero or more digits, the same as \d*
Example: {1,} is the same as +
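To check the quantifier equivalences for myself, here is a small sketch. It uses Python's re module as a stand-in for .NET (my assumption is that *, + and {min,max} behave identically in both), and the matches_all helper is a made-up name.

```python
import re


def matches_all(pattern, text):
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    for pattern in [r"\d*", r"\d{0,}", r"\d+", r"\d{1,}", r"\d{2,4}"]:
        results = [matches_all(pattern, s) for s in ["", "1", "123", "12345"]]
        print(pattern, results)
```

The empty string is the interesting case: \d* and \d{0,} accept it, while \d+ and \d{1,} do not.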
[a-z] - represents one lowercase letter
[A-Z] - represents one uppercase letter
Example: ^[a-zA-Z0-9'.]*$ represents a single string of zero or more occurrences of lowercase letters, uppercase letters, digits, single quotes, and/or dots
Example: [WB]ill represents the word Will or Bill
[^...] - represents a negated character class: any one character that is not listed inside the brackets
Example: Will[^y] represents any string containing Will followed by one more character, as long as that character isn't y. So Willie and Williy would be OK, but Will (nothing follows) and Willy would not be.
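The character class examples can be tried out like this. As before, Python's re module stands in for .NET (an assumption; character classes are written the same way in both), and the names below are hypothetical.

```python
import re

NAME_CHARS = re.compile(r"^[a-zA-Z0-9'.]*$")  # only letters, digits, ' and .
WILL_OR_BILL = re.compile(r"[WB]ill")         # W or B, then "ill"
NOT_Y = re.compile(r"Will[^y]")               # "Will" plus one character that isn't y


def found(pattern, text):
    """Return True if the pattern matches somewhere in the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    print(found(NAME_CHARS, "O'Brien.Jr"))
    print(found(NAME_CHARS, "bad input!"))  # the space and ! are not in the class
    print(found(WILL_OR_BILL, "Bill"))
    for name in ["Willie", "Williy", "Will", "Willy"]:
        print(name, found(NOT_Y, name))
```

"Will" on its own fails the last pattern because a negated class still has to consume one character; it does not mean "y is absent".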
| - (vertical line) represents or
Example: Will(iam|ie) represents William or Willie
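$number - references the text captured by an earlier group in parentheses when used in a replacement string; inside the pattern itself, write \number instead
Example: (w*)(abc)=\1\2 matches only when both sides of the = are identical, in cases like abc=abc, wabc=wabc, wwabc=wwabc, etc.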
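Here is a sketch of alternation and group references, once more with Python's re module standing in for .NET. One difference I'm aware of: in a replacement string Python writes \1 where .NET writes $1; inside the pattern both use \1. The names below, including swap_names, are made up for illustration.

```python
import re

NICKNAME = re.compile(r"^Will(iam|ie)$")     # William or Willie
MIRRORED = re.compile(r"^(w*)(abc)=\1\2$")   # right side must repeat the left side


def found(pattern, text):
    """Return True if the pattern matches somewhere in the text."""
    return pattern.search(text) is not None


def swap_names(text):
    """Swap two space-separated words using group references in the replacement."""
    return re.sub(r"(\w+) (\w+)", r"\2 \1", text)


if __name__ == "__main__":
    for text in ["abc=abc", "wabc=wabc", "wwabc=wwabc", "wwabc=wabc"]:
        print(text, found(MIRRORED, text))
    print(swap_names("William Will"))
```

The last sample, wwabc=wabc, is rejected because the first group captured "ww" and the right side only supplies one w.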
Be careful with the backslash (\), since it is also the escape character in C# string literals. My advice is to prefix the string with @ so that your regular expression is taken as is; otherwise you would be forced to write something like \\\\ in C# just to get \\, which is the regular expression for one literal backslash. Some additional important info on using regular expressions in .NET environments:
http://www.regular-expressions.info/dotnet.html
Well, I hope these are correct. Sometimes my wording might not be as precise as it should be, which can make an explanation ambiguous or wrong.
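Coming back to the backslash warning: the same double-escaping problem shows up in other languages, so here is a sketch in Python, where a raw string (r"...") plays roughly the role that the @ prefix plays in C#. That analogy is my own assumption for illustration, and count_backslashes is a hypothetical helper.

```python
import re

ESCAPED = "\\\\"  # regular literal: four typed backslashes become two in the regex
RAW = r"\\"       # raw literal: typed exactly as the regex engine sees it


def count_backslashes(text):
    """Count literal backslashes in text using the raw-string pattern."""
    return len(re.findall(RAW, text))


if __name__ == "__main__":
    print(ESCAPED == RAW)  # both literals produce the same two-character pattern
    print(count_backslashes(r"C:\temp\logs"))
```

Both spellings hand the regex engine the same two characters, but the raw form is much easier to read and to get right.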