Content Examination - Entities

This article contains information on using Content Examination definitions to search for message content using entities, simplifying sensitive content detection without writing regular expressions, and details on Non-Privacy and Privacy Pack entities.

Overview

With Content Examination definitions, you can search for message content using one or more entities. This is in addition to keywords, phrases, or regular expressions. Entities are records we maintain that include the regular expressions, keywords, and phrases required to find content. This simplifies the process by allowing you to search for sensitive content without writing regular expressions. You can use the following:

      • A single entity: Use this method if an entity has a low percentage of false positives or has broad criteria for matching the content.
      • Multiple entities: This method triggers a content match when two entities are found.
      • Operators (e.g., AND, OR).

        The AND operator helps ensure that multiple keywords are found within the same section of a message. It scores a weight value only if both specified keywords are present in the same email section and within proximity of each other. The default proximity is 250 characters, but a different value can be specified. This helps reduce false positives and gives more precise matching. See Content Examination - Proximity Operators for more information.
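The proximity behavior described above can be pictured with a minimal Python sketch. It is an illustration only, not the product's implementation: the function names `within_proximity` and `score_and` are hypothetical, matching is assumed to be case-insensitive, and the distance is assumed to be measured between the start offsets of the two keywords.

```python
import re


def within_proximity(section: str, first: str, second: str, proximity: int = 250) -> bool:
    """Return True if both keywords occur in the section within `proximity` characters.

    Assumption: distance is measured between the start offsets of the matches.
    """
    text = section.lower()
    first_hits = [m.start() for m in re.finditer(re.escape(first.lower()), text)]
    second_hits = [m.start() for m in re.finditer(re.escape(second.lower()), text)]
    return any(abs(a - b) <= proximity for a in first_hits for b in second_hits)


def score_and(section: str, first: str, second: str, weight: int = 1, proximity: int = 250) -> int:
    """Score the weight only when both keywords are found close together (hypothetical helper)."""
    return weight if within_proximity(section, first, second, proximity) else 0
```

With the default of 250 characters, two keywords that sit 307 characters apart do not score, while raising the proximity to 400 makes the same text score the weight.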

This article must be read in conjunction with the following pages:

Entities fall into one of the following categories:

      • Non-Privacy Pack: These entities are available to all customers.
      • Privacy Pack: These entities are only available to customers with the privacy pack add-on enabled on their accounts.

Non-Privacy Pack Entities

The following entities do not require the Privacy Pack:

Entity Name Example Usage
NHS (National Health Service) Number
1 detect nhs
National Insurance Number
1 detect nin
CHI (Community Health Index) Number
1 detect chi
Passport Numbers. Passports of the following countries can also be specified by using their unique identifier:
Country                     Unique Id
Australia                   au
Canada                      ca
Germany                     de
Finland                     fi
France                      fr
Japan                       jp
Philippines                 pl
Sweden                      se
Taiwan                      tw
United Kingdom              uk
United States of America    us
1 detect passport
1 detect passport_au
Date of Birth (DOB)
1 detect date_dob
UK Driver Licenses
1 detect drivers_license_uk
VIN Number (Vehicle Identification Number)
1 detect vin
Telephone Number. Telephone numbers from the following countries can also be specified by using their unique identifier:
Country                     Unique Id
United Kingdom              UK
United States of America    US
Australia                   AU
1 detect phonenumber
1 detect phonenumber_uk
Email Address
1 detect email
Fax Number

Fax numbers use the same format as telephone numbers, so the same unique identifiers are available.

1 detect phonenumber
1 detect phonenumber_au
UK Electoral Roll
1 detect uk_electoral_roll
Dates. You can use any of the following date formats:
Criteria              Unique Id
Day / Month / Year    dmy
Month / Day / Year    mdy
Year / Month / Day    ymd
Month / Year          my
1 detect date
1 detect date_mdy
1 detect date_my
IP Address
1 detect ip
URL
1 detect url
IBAN Number (International Bank Account Number). Individual IBAN country codes can also be specified by using their unique identifier:
Country                 Unique Id  Country                   Unique Id
Albania                 al         Lebanon                   lb
Andorra                 ad         Liechtenstein             li
Austria                 at         Lithuania                 lt
Azerbaijan              az         Luxembourg                lu
Bahrain                 bh         Macedonia                 mk
Belgium                 be         Malta                     mt
Bosnia and Herzegovina  ba         Mauritania                mr
Brazil                  br         Mauritius                 mu
Bulgaria                bg         Moldova                   md
Costa Rica              cr         Monaco                    mc
Croatia                 hr         Montenegro                me
Cyprus                  cy         Netherlands               nl
Czech Republic          cz         Norway                    no
Denmark                 dk         Pakistan                  pk
Faroe Islands           fo         Palestinian Territories   ps
Greenland               gl         Poland                    pl
Dominican Republic      do         Portugal                  pt
Estonia                 ee         Qatar                     qa
Finland                 fi         Romania                   ro
France                  fr         Saint Lucia               lc
Georgia                 ge         San Marino                sm
Germany                 de         Sao Tome & Principe       st
Gibraltar               gi         Saudi Arabia              sa
Greece                  gr         Serbia                    rs
Guatemala               gt         Slovakia                  sk
Hungary                 hu         Slovenia                  si
Iceland                 is         Spain                     es
Ireland                 ie         Sweden                    se
Israel                  il         Switzerland               ch
Italy                   it         Timor-Leste               tl
Jordan                  jo         Tunisia                   tn
Kazakhstan              kz         Turkey                    tr
Kosovo                  xk         United Arab Emirates      ae
Kuwait                  kw         United Kingdom            gb
Latvia                  lv         Virgin Islands (British)  vg
1 detect iban
1 detect iban_az
Credit Cards. Individual credit/debit card types can also be specified by using their unique identifier:
Credit Card Type      Unique Id
American Express      americanexpress
Dankort               dankort
Diners Club           diners
Discover              discover
Forbrugsforeningen    forbrugsforeningen
JCB                   jcb
Laser                 laser
MasterCard            mastercard
Solo                  solo
Switch                switch
Visa                  visa
1 detect creditcard
1 detect visa
Global: Bank Identifier Code (BIC/SWIFT)
1 detect swift
Spain: Numero de Identificacion Fiscal (NIF) Number
1 detect nif_es
Japanese Corporate Number
1 detect corporate_number_jp
Japanese Individual Number (My Number)
1 detect my_num_jp
Portuguese Numero de Identificacao Fiscal Number
1 detect nif_pt
Turkey Tax Identification Number
1 detect tr_vkn_cd
UK Unique Tax Identification Code
1 detect utr_uk
UK Company Registration Number (CRN)
1 detect crn_uk
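In the examples in this table, a unique identifier is attached to the base entity name with an underscore (for example, passport_au, phonenumber_uk, date_mdy). The following Python sketch shows one way to build and sanity-check such expressions before pasting them into a definition. It is a convenience helper only and is not part of the product: the names `KNOWN_IDS` and `detect_expression` are hypothetical, the identifiers are lowercased on the assumption that the lowercase form used in the examples is the expected one, and the leading number shown in the article's examples is left out because the sketch builds only the detect expression. Credit card types are not covered, because their identifiers are used on their own (for example, detect visa).

```python
# Unique ids copied from the passport, telephone number, and date tables in this article.
KNOWN_IDS = {
    "passport": {"au", "ca", "de", "fi", "fr", "jp", "pl", "se", "tw", "uk", "us"},
    "phonenumber": {"uk", "us", "au"},
    "date": {"dmy", "mdy", "ymd", "my"},
}


def detect_expression(entity, unique_id=None):
    """Build a 'detect <entity>' or 'detect <entity>_<id>' expression (hypothetical helper)."""
    if unique_id is None:
        return f"detect {entity}"
    uid = unique_id.lower()  # assumption: lowercase ids, as in the article's examples
    allowed = KNOWN_IDS.get(entity)
    if allowed is not None and uid not in allowed:
        raise ValueError(f"'{uid}' is not a listed unique id for '{entity}'")
    return f"detect {entity}_{uid}"


print(detect_expression("passport", "au"))    # detect passport_au
print(detect_expression("phonenumber", "UK")) # detect phonenumber_uk
print(detect_expression("date"))              # detect date
```

Entities that are not in `KNOWN_IDS` (such as iban) are passed through without validation, so the helper can still be used for them.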

Privacy Pack Entities

The Privacy Pack contains several different entities. These are split into the healthcare, PII (Personally Identifiable Information), financial, and entity group categories.

Healthcare Entities

Entity Name Example Usage
HIC (Healthcare Insurance Claim) Number. The following HIC numbers can also be specified by using their unique identifier:
HIC Number Type                             Unique Id
Railroad Retirement Board                   rrb
Centers for Medicare & Medicaid Services    cms
1 detect hicn
1 detect hicn_rrb
DEA (Drug Enforcement Administration) Number
1 detect dea
NDC (National Drug Code). The following drug code lists can also be specified by using their unique identifier:
Drug Code List              Unique Id
Private Duty Nursing        pdn
Non-Private Duty Nursing    npdn
1 detect ndc
1 detect ndc_npdn
FDA-Approved Prescription Drugs
1 detect fdadrugs
Medicare Identifier
1 detect medicare_id
NPI (National Provider Identifier)
1 detect npi

PII (Personally Identifiable Information) Entities

Entity Name Example Usage
US Driver Licenses. Licenses of the following US states can also be specified by using their unique identifier:
State           Unique Id  State           Unique Id
Alaska          ak         Montana         mt
Alabama         al         North Carolina  nc
Arkansas        ar         North Dakota    nd
Arizona         az         Nebraska        ne
California      ca         New Hampshire   nh
Colorado        co         New Jersey      nj
Connecticut     ct         New Mexico      nm
Delaware        de         Nevada          nv
Florida         fl         New York        ny
Georgia         ga         Ohio            oh
Hawaii          hi         Oklahoma        ok
Idaho           id         Oregon          or
Illinois        il         Pennsylvania    pa
Indiana         in         Rhode Island    ri
Iowa            ia         South Carolina  sc
Kansas          ks         South Dakota    sd
Kentucky        ky         Tennessee       tn
Louisiana       la         Texas           tx
Massachusetts   ma         Utah            ut
Maryland        md         Vermont         vt
Maine           me         Virginia        va
Michigan        mi         Washington      dc
Minnesota       mn         Wisconsin       wi
Missouri        mo         West Virginia   wv
Mississippi     ms         Wyoming         wy
1 detect drivers_license_us
1 detect drivers_license_us_ak
SSN (Social Security Number)
1 detect ssn
SIN (Canadian Social Insurance Number)
1 detect sin
South Africa ID
1 detect south_africa_id
Mexico Clave Única de Registro de Población (CURP) Number
1 detect mexico_curp
Mexico Registro Federal de Contribuyentes (RFC) Number
1 detect mexico_rfc
Mexico Voter Registration Code
1 detect mexico_voter_reg
Employer Identification Number
1 detect ein_us

Financial Entities

Entity Name Example Usage
ITIN (Individual Taxpayer Identification Number)
1 detect us_itin
Tax Identification Numbers of the following countries can be specified by using their unique identifier:
Country     Unique Id  Country     Unique Id
Austria     at         Bulgaria    bg
Belgium     be         Croatia     hr
1 detect at_tin
1 detect hr_tin
ABA Number (American Bankers Association Number)
1 detect aba
Australia Tax File Number
1 detect tfn
Austria Tax Id Number
1 detect at_tin
Belgium Tax Id Number
1 detect be_tin

Entity Groups

We provide a set of entity groups that contain either a list of search terms or a selection of entities that align to a particular area. These include:

Entity Group Description Example Usage
ICD9cm A list of medical terms used in the diagnosis of a medical condition.
1 detect icd9cm
1 detect icd9cm medicare_id
ICD10cm
1 detect icd10cm
1 detect icd10cm_codes
PII A collection of entities focusing on personal information, including:
  • Names
  • Date of Birth
  • Passport Number
  • US Driver License
  • SSN
  • IBAN
  • CreditCard
  • PhoneNumber
  • URL
  • VIN
  • IP
  • EmailAddress
1 detect pii
PHI A collection of entities focusing on healthcare-related personal information, including:
  • Names
  • DateDOB
  • SSN
  • MedicareID
  • PhoneNumber
  • VIN
  • IP
  • EmailAddress
  • URL
1 detect phi
Names A collection of first names and surnames gathered from the US Social Security Administration.
1 detect names

Using this entity in conjunction with another entity or entity group is recommended to minimize the number of false positives.

Healthcare Common Procedure Code System (HCPCS) A collection of codes representing Medicare procedures, supplies, and services.
1 detect hcpcs
HIPAA Entities Wrapper class for all HIPAA entities.
1 detect hipaa
1 detect hipaa medicare_id

Each entity group can be used with other search terms or entities. For example:

1 (detect icd10cm) Proximity (detect Email)
1 (detect icd10cm) Proximity ("diagnosed with")

The contents of the icd10cm, FDA Prescription Drug, and Names entity groups cannot be viewed via the Administration Console due to the size of these lists.

No Keywords Entity Operator

This operator disables the context keyword matching associated with many of the entities we support. Using it increases the likelihood of false positives but makes it easier to predict whether a match will be found. For example, using the NKW (No Keywords) operator causes the checks for terms associated with social security numbers to be ignored, so the check looks only for a regular expression match:

1 detect SSN_NKW
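The trade-off can be illustrated with a short Python sketch. It is a conceptual model only: the pattern `SSN_PATTERN`, the keyword list `SSN_KEYWORDS`, and the function `ssn_matches` are hypothetical and much simpler than whatever the real entity uses.

```python
import re

# Hypothetical, simplified stand-ins for the entity's regular expression and context keywords.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
SSN_KEYWORDS = ("ssn", "social security")


def ssn_matches(text: str, use_keywords: bool = True) -> bool:
    """Return True if the text matches the sketch entity.

    use_keywords=True  models the default entity (pattern plus a context keyword).
    use_keywords=False models the NKW variant (pattern only).
    """
    if not SSN_PATTERN.search(text):
        return False
    if not use_keywords:
        return True
    lowered = text.lower()
    return any(keyword in lowered for keyword in SSN_KEYWORDS)


print(ssn_matches("Order ref 123-45-6789"))                      # False
print(ssn_matches("Order ref 123-45-6789", use_keywords=False))  # True
print(ssn_matches("SSN: 123-45-6789"))                           # True
```

The second call shows the false positive risk: an order reference that happens to look like a social security number matches once the keyword check is switched off.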