Aware - Regular Expression Pattern Creation

This article contains information on creating and validating regular expression patterns in Aware, including examples for identifying sensitive data like PII and VAT IDs and custom patterns for industry-specific needs.

Regular Expressions are patterns that can trigger Events in Aware. They look for a specific combination of character, keyword, or number patterns within your connected Content Platform.

Examples include account numbers, addresses, credit card numbers, and national identification numbers. We have created and validated many common patterns for you, and they are available in Aware. If you would like to create a pattern for something industry- or company-specific, like an employee ID or customer account number, read below for some helpful tips.

Remember that your Customer Success Manager can help you build and validate a regular expression pattern for your use.

See http://www.regexr.com or https://www.regex101.com for a simple regular expression tool.

Creating a Regular Expression

There are many ways to create a regular expression. A common approach is to follow these steps:

  • Identify the type of pattern you are trying to find within a message (e.g., a 6-digit number written as three pairs separated by dashes: ##-##-##).
  • Use a RegEx tool for help when creating your pattern. The regular expression for this example is: \b\d{2}-\d{2}-\d{2}\b
  • Validate in a RegEx tool or in Aware that the pattern is correct and returns the intended content.
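
The validation step can also be sketched in a few lines of Python. This is only an illustration: it assumes Python's built-in re module behaves like the engine Aware uses for this pattern, which the article does not confirm, and the sample messages and the find_matches helper are made up for the example.

```python
import re

# Pattern from the steps above: six digits in three dash-separated pairs (##-##-##).
PATTERN = re.compile(r"\b\d{2}-\d{2}-\d{2}\b")


def find_matches(text: str) -> list:
    """Return every substring of `text` that the pattern matches."""
    return PATTERN.findall(text)


# Hypothetical sample messages, invented for this illustration.
samples = [
    "Your reference is 12-34-56, please keep it.",  # intended match
    "Call extension 123-45-67 tomorrow.",           # first group has 3 digits
    "Order 12-34-567 shipped.",                     # last group has 3 digits
]

for message in samples:
    print(message, "->", find_matches(message))
```

Only the first message should produce a match. Testing near-misses like the other two is a quick way to confirm that the \b word boundaries keep the pattern from firing on longer numbers.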

Below are some examples of Regular Expression Patterns that we do not have available in our product:

  • Personally Identifiable Information (PII): The following patterns match types of information that many countries consider to be personally identifiable.
Each entry lists the country, the identity number format, and the regular expression pattern.

China (PRC): Matches an 18-digit number. Pattern: \b\d{18}\b
Finland: Matches an 11-character value: 10 digits followed by a final digit or letter. Pattern: \b\d{10}\w\b
Ireland: Matches a 7-digit number followed by two trailing letters. Pattern: \b\d{7}[a-zA-Z]{2}\b
Israel: Matches a 9-digit number. Pattern: \b\d{9}\b
Italy: Matches 6 letters, followed by 9 digits and a final trailing character. Pattern: \b[a-zA-Z]{6}\d{9}\w\b
Poland: Matches an 11-digit number. Pattern: \b\d{11}\b
South Korea: Matches a 6-digit number followed by a dash and 7 trailing digits. Pattern: \b\d{6}-\d{7}\b
Sweden: Matches a 6-digit number followed by a dash and 4 trailing digits. Pattern: \b\d{6}-\d{4}\b
Switzerland: Matches an 11-digit number grouped as AAA.BB.CCC.DDD, or the newer 13-digit format 756.XXXX.XXXX.XY. Pattern: \b\d{3}[.]\d{2}[.]\d{3}[.]\d{3}\b|\b756[.]\d{4}[.]\d{4}[.]\d{2}\b
Spain: Matches an 8-digit number followed by a dash and a trailing letter. ########-X. Pattern: \b\d{8}-[a-zA-Z]\b
Taiwan: Matches a letter followed by 9 digits. Pattern: \b[a-zA-Z]\d{9}\b
Thailand: Matches a 13-digit number separated by dashes. #-####-#####-##-#. Pattern: \b\d{1}-\d{4}-\d{5}-\d{2}-\d\b
Turkey: Matches a 13-digit number. Pattern: \b\d{13}\b
United Kingdom: Matches a 10-digit number separated by dashes, or the placeholder equivalent. ###-###-#### or xxx-xxx-xxxx. Pattern: \b\d{3}[-.]?\d{3}[-.]?\d{4}\b|xxx-xxx-xxxx
United States:
  • Matches a 9, followed by groupings of digits separated by dashes. 9##-##-#### or 9xx-xx-xxxx. Pattern: \b9\d{2}[-.]?\d{2}[-.]?\d{4}\b|9xx-xx-xxxx
  • Matches 2 digits followed by a dash and 7 trailing digits. ##-#######. Pattern: \b\d{2}[-.]?\d{7}\b|xx-xxxxxxx
Vietnam: Matches a 9-digit number in groupings of 3 separated by dashes. ###-###-###. Pattern: \b\d{3}[-.]?\d{3}[-.]?\d{3}\b|xxx-xxx-xxx
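
A few of the identity number patterns above can be checked against sample text as follows. This is a minimal sketch that assumes Python's re module; Aware's own matching behavior may differ. The countries_matching helper and every sample value are hypothetical and shaped only to fit each format, not real identifiers.

```python
import re

# A few patterns copied from the identity number entries above, keyed by country.
PII_PATTERNS = {
    "South Korea": r"\b\d{6}-\d{7}\b",
    "Switzerland": r"\b\d{3}[.]\d{2}[.]\d{3}[.]\d{3}\b|\b756[.]\d{4}[.]\d{4}[.]\d{2}\b",
    "Thailand": r"\b\d{1}-\d{4}-\d{5}-\d{2}-\d\b",
}


def countries_matching(text: str) -> list:
    """Return the countries whose pattern matches somewhere in `text`."""
    return [
        country
        for country, pattern in PII_PATTERNS.items()
        if re.search(pattern, text)
    ]


# Made-up sample values; none of these is a real identifier.
for sample in [
    "ID 900101-1234567 on file",
    "new format 756.1234.5678.97",
    "old format 123.45.678.901",
    "card 1-2345-67890-12-3",
    "no identifiers here",
]:
    print(sample, "->", countries_matching(sample))
```

Running each sample against every pattern, rather than only the one you expect, helps reveal patterns that overlap or match unintended content.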
  • VAT Identification Numbers: The following patterns match value-added tax (VAT) identification numbers by country.
Each entry lists the country, the VAT ID format, and the regular expression pattern.

Austria: Matches ATU + 8 digits. Pattern: \bATU\d{8}\b|U\d{8}
Belgium: Matches BE + 10 digits. Pattern: \bBE\d{10}\b|\d{10}
Bulgaria: Matches BG + 9 to 10 digits. Pattern: \bBG\d{9,10}\b|\d{9,10}
Croatia: Matches HR + 11 digits. Pattern: \bHR\d{11}\b|\d{11}
Cyprus: Matches CY + 8 digits + 1 trailing character. Pattern: \b(cy|CY)?\d{8}\w\b
Czech Republic: Matches CZ + 8 to 10 digits. Pattern: \b(cz|CZ)?\d{8,10}\b
Denmark: Matches DK + 8 digits. Pattern: \b(dk|DK)?\d{8}\b
Estonia: Matches EE + 9 digits. Pattern: \b(ee|EE)?\d{9}\b
Finland: Matches FI + 8 digits. Pattern: \b(fi|FI)?\d{8}\b
France: Matches FR + 2 letters followed by 9 digits. Pattern: \b(fr|FR)?[a-zA-Z]{2}\d{9}\b
Germany: Matches DE + 9 digits. Pattern: \b(de|DE)?\d{9}\b
Greece: Matches EL + 9 digits. Pattern: \b(el|EL)?\d{9}\b
Hungary: Matches HU + 8 digits. Pattern: \b(hu|HU)?\d{8}\b
Ireland: Matches IE + 7 digits followed by 1 or 2 letters. Pattern: \b(ie|IE)?\d{7}[a-zA-Z]{1,2}\b
Italy: Matches IT + 11 digits. Pattern: \b(it|IT)?\d{11}\b
Latvia: Matches LV + 11 digits. Pattern: \b(lv|LV)?\d{11}\b
Lithuania: Matches LT + 9 or 12 digits. Pattern: \b(lt|LT)?\d{9}\b|LT\d{12}
Luxembourg: Matches LU + 8 digits. Pattern: \b(lu|LU)?\d{8}\b
Malta: Matches MT + 8 digits. Pattern: \b(mt|MT)?\d{8}\b
Netherlands: Matches NL + 9 digits followed by the letter B and 2 more digits. Pattern: \b(nl|NL)?\d{9}B\d{2}\b
Poland: Matches PL + 10 digits. ###-###-##-## or ###-##-##-###. Pattern: \b(pl|PL)?\s\d{3}-\d{3}-\d{2}-\d{2}\b|PL\s\d{3}-\d{2}-\d{2}-\d{3}
Portugal: Matches PT + 9 digits. Pattern: \b(pt|PT)?\d{9}\b
Romania: Matches RO + 2 to 10 digits. Pattern: \b(ro|RO)?\d{2,10}\b
Slovakia: Matches SK + 10 digits. Pattern: \b(sk|SK)?\d{10}\b
Slovenia: Matches SI + 8 digits. Pattern: \b(si|SI)?\d{8}\b
Spain: Matches ES + a letter or digit, followed by 7 digits and a final letter or digit. Pattern: \b(es|ES)?[a-zA-Z0-9]\d{7}[a-zA-Z0-9]\b
Sweden: Matches SE + 10 digits followed by 01. Pattern: \b(se|SE)?\d{10}01\b
United Kingdom (Standard): Matches GB + 9 digits separated in groupings of 3, 4, and 2. GB### #### ##. Pattern: \b(gb|GB)?\d{3}\s\d{4}\s\d{2}\b
United Kingdom (Branch Traders): Matches GB + 9 digits, then a following block of 3 digits. GB######### ###. Pattern: \b(gb|GB)?\d{9}\s\d{3}\b
United Kingdom (Government Departments): Matches GBGD + 3 digits. Pattern: \b(gbgd|GBGD)?\d{3}\b
United Kingdom (Health Authorities): Matches GBHA + 3 digits. Pattern: \b(gbha|GBHA)?\d{3}\b
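
Because most of the VAT patterns above make the country prefix optional, it is worth checking what they match with and without it. The sketch below assumes Python's re module, which may not behave identically to Aware's engine. The first_match helper and all sample numbers are hypothetical.

```python
import re
from typing import Optional

# Patterns copied from the VAT entries above.
VAT_PATTERNS = {
    "Germany": r"\b(de|DE)?\d{9}\b",
    "Netherlands": r"\b(nl|NL)?\d{9}B\d{2}\b",
    "Sweden": r"\b(se|SE)?\d{10}01\b",
}


def first_match(country: str, text: str) -> Optional[str]:
    """Return the first substring matched by the country's pattern, or None."""
    found = re.search(VAT_PATTERNS[country], text)
    return found.group(0) if found else None


# Made-up sample numbers, used only to exercise the patterns.
print(first_match("Germany", "VAT DE123456789"))       # prefix present
print(first_match("Germany", "ticket 123456789"))      # bare 9 digits also match
print(first_match("Germany", "VAT De123456789"))       # mixed-case prefix is not matched
print(first_match("Netherlands", "NL123456789B01"))
print(first_match("Sweden", "SE123456789001"))
```

The second call shows that an optional prefix lets any standalone 9-digit number match the Germany pattern, so validating against realistic content before enabling a pattern can help avoid unexpected Events.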

 

 

 
