Enhance document tagging with regular expressions
In Tagging, there are several places where you can specify patterns or regular expressions to constrain the metadata that is extracted or tagged. Regular expressions let you apply formatting rules, check lengths, and so on, to make sure the text matches a specific pattern. In essence, they validate the metadata before it is extracted from the document or tagged in SharePoint.
Here are some basic examples:
Regular Expression | Example matches | Description |
---|---|---|
abc$ | abc, 123abc | Any text ending with abc |
^abc | abc, abc123 | Any text that starts with abc |
^[0-9]{5}$ | 11111, 12345, 99999 | Any 5-digit number |
\d{1,4} | 1, 24, 445, 3333 | Any number with 1 to 4 digits |
[A-Za-z]{4}-\d{4} | ABCD-1234, GYDL-8450 | 4 letters followed by a dash, then 4 numbers |
[A-Za-z]{4}(-\|_)\d{4} | ABCD-1234, ABCD_1234 | 4 letters followed by a dash or an underscore, then 4 numbers |
[A-Za-z]{4}[\W_]\d{4} | ABCD-1234, ABCD_1234, ABCD 1234, ABCD+1234, ABCD#1234 | 4 letters followed by any non-word separator, then 4 numbers |
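The basic patterns in the table can be tried out before they are used in a tagging rule. The following is a minimal sketch using Python's standard `re` module, which is an assumption made for illustration: the regular expression engine used by Tagging may differ in some details. The dictionary keys and the `matches` helper are hypothetical names, not part of the product.

```python
import re

# Patterns copied from the table above; the keys are illustrative labels.
BASIC_PATTERNS = {
    "ends_with_abc": r"abc$",
    "starts_with_abc": r"^abc",
    "five_digits": r"^[0-9]{5}$",
    "one_to_four_digits": r"\d{1,4}",
    "code_dash": r"[A-Za-z]{4}-\d{4}",
    "code_dash_or_underscore": r"[A-Za-z]{4}(-|_)\d{4}",
    "code_any_separator": r"[A-Za-z]{4}[\W_]\d{4}",
}


def matches(name: str, text: str) -> bool:
    """Return True if the named pattern is found anywhere in the text."""
    return re.search(BASIC_PATTERNS[name], text) is not None


if __name__ == "__main__":
    samples = [
        ("ends_with_abc", "123abc"),
        ("starts_with_abc", "abc123"),
        ("five_digits", "12345"),
        ("code_dash", "GYDL-8450"),
        ("code_dash_or_underscore", "ABCD_1234"),
        ("code_any_separator", "ABCD#1234"),
    ]
    for name, text in samples:
        print(f"{name:26} {text:12} {matches(name, text)}")
```

Patterns without the `^` and `$` anchors, such as `\d{1,4}`, also succeed when the match is only part of a longer value (for example, inside 12345). To require that the whole value matches, anchor the pattern, as in `^\d{1,4}$`.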
Below are a few useful resources to get you started with regular expressions:
Some useful regular expressions taken from the resources above:
Field | Regular Expression | Example matches | Description |
---|---|---|---|
Social Security Number | ^\d{3}-\d{2}-\d{4}$ | 111-11-1111 | Validates the format, type, and length of the supplied input field. The input must consist of 3 numeric characters followed by a dash, then 2 numeric characters followed by a dash, and then 4 numeric characters. |
Phone Number | ^[01]?[- .]?(\([2-9]\d{2}\)\|[2-9]\d{2})[- .]?\d{3}[- .]?\d{4}$ | (425) 555-0123, 425-555-0123, 425 555 0123, 1-425-555-0123 | Validates a U.S. phone number. It must consist of 3 numeric characters, optionally enclosed in parentheses, followed by a set of 3 numeric characters and then a set of 4 numeric characters. |
Email | ^(?("")("".+?""@)\|(([0-9a-zA-Z]((\.(?!\.))\|[-!#\$%&'\*\+/=\?\^`\{\}\|~\w])*)(?<=[0-9a-zA-Z])@))(?(\[)(\[(\d{1,3}\.){3}\d{1,3}\])\|… | | |
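ZIP Code | ^(\d{5}-\d{4}\|\d{5}\|\d{9})$ | 12345, 12345-6789, 123456789 | Validates the format of a U.S. ZIP Code. The input must consist of 5 numeric characters, 9 numeric characters, or 5 numeric characters followed by a dash and 4 numeric characters. |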
Currency (non-negative) | ^\d+(\.\d\d)?$ | 1.00 | Validates a non-negative currency amount. If there is a decimal point, it requires 2 numeric characters after the decimal point. For example, 3.00 is valid but 3.1 is not. |
Currency (positive or negative) | ^(-)?\d+(\.\d\d)?$ | 1.20 | Validates a positive or negative currency amount. If there is a decimal point, it requires 2 numeric characters after the decimal point. |
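To illustrate how such patterns validate metadata before it is tagged, here is a minimal sketch in Python. It assumes Python's standard `re` module rather than the engine used by Tagging, and the names `FIELD_PATTERNS`, `is_valid`, and `filter_valid` are hypothetical helpers, not part of any product API. The Email pattern is left out because it is incomplete in the table.

```python
import re

# Patterns copied from the table above; the keys are illustrative field names.
FIELD_PATTERNS = {
    "ssn": r"^\d{3}-\d{2}-\d{4}$",
    "phone": r"^[01]?[- .]?(\([2-9]\d{2}\)|[2-9]\d{2})[- .]?\d{3}[- .]?\d{4}$",
    "zip": r"^(\d{5}-\d{4}|\d{5}|\d{9})$",
    "currency_non_negative": r"^\d+(\.\d\d)?$",
    "currency_signed": r"^(-)?\d+(\.\d\d)?$",
}


def is_valid(field: str, value: str) -> bool:
    """Return True if the whole value matches the pattern for the field."""
    return re.fullmatch(FIELD_PATTERNS[field], value) is not None


def filter_valid(metadata: dict) -> dict:
    """Keep only the values that match their field's pattern."""
    return {field: value for field, value in metadata.items()
            if is_valid(field, value)}


if __name__ == "__main__":
    extracted = {
        "ssn": "111-11-1111",
        "phone": "(425) 555-0123",
        "zip": "1234",                    # too short, so it is dropped
        "currency_non_negative": "3.1",   # needs two decimals, so it is dropped
    }
    print(filter_valid(extracted))
```

The sketch uses `re.fullmatch` so that a value is accepted only when the entire string fits the pattern. Values that fail validation are simply dropped here; a real workflow might instead log them or flag them for review.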