6 STEPS TO PREPARE
YOUR CUSTOMER DATA
FOR GDPR
By Roger Matus, Vice President of Global Marketing, and
Ed Barbeau, Vice President of Product Management
If you have a website with a form or an email list that includes European
customers, the new European General Data Protection Regulation (GDPR) will
require you to be able to access, delete, and make corrections to all of
your data about any EU individual, even if you don’t have offices in
Europe. This paper outlines six steps that will help prepare your customer
record management system to handle the demand.

Golden Master Records help companies to quickly find, edit, or delete all
the information they have about any EU individual at any time, as required
by GDPR.

GDPR, which goes into effect on May 25, 2018, applies to anyone in the
world who offers goods or services to European Union (EU) individuals or
monitors the behavior of EU individuals. (In June 2017, the United Kingdom
confirmed that GDPR will form a part of UK law, even after Brexit.
Therefore, for this document, the term EU is meant to include the UK.)

Unlike previous privacy directives, GDPR applies to companies even if the
company does not have a physical presence in the EU. Many companies,
including Internet-only companies, that had never concerned themselves with
the handling of EU personal data must do so now.

The penalty for a failure to comply could mean a fine of up to 20-million
Euros (about US$24-million) or four percent of worldwide revenues,
whichever is greater.
Global-Z International | www.globalz.com | t: +1.802.445.1011 | gdpr_info@globalz.com

“Personal data” is defined by the European Commission as “any information
relating to an individual, whether it relates to his or her private,
professional or public life. It can be anything from a name, a home
address, a photo, an email address, bank details, posts on social
networking websites, medical information, or a computer’s IP address.”
Almost anything collected on a website, including behavioral activity,
related to an EU person is regulated.

An “individual” includes both citizens and residents of any EU member
state.

GDPR gives EU individuals meaningful rights over the personal data you
collect, including the following:

• The Right to Consent: Individuals must explicitly consent for you to use
  their personal data. Opt-outs and pre-checked opt-in boxes are no longer
  sufficient. In addition, the uses for the data must be disclosed at the
  time of collection. Consent may be withdrawn or limited to specific
  purposes at any time. You must be able to prove that they have given
  consent.

• The Right to Access: Individuals may obtain a copy of the information
  held about them. They also have the right to know where the information
  is stored, who can access it, how they access it, and the reasons for
  access.

• The Right to Be Forgotten: Individuals may request that any or all
  information be deleted.

• The Right to Rectify: Individuals may require companies to correct
  inaccurate data and may object to any profiling that may result in
  discrimination.

This list covers only a small subset of the rights and regulations in the
GDPR, which consists of 99 articles and 173 recitals over 261 pages in
English.

Finding All The Data

To meet these GDPR requirements, it will be necessary for organizations to
be able to quickly find and modify all of the information about an
individual on demand.

The most effective way to do this is to create a searchable record
management system with a Golden Master Record for each customer containing
both the most recent information and a key that lets you find all of the
source records.

However, building a Golden Master Record from multiple sources for each
customer is not simple because the data in different databases or even
different records in the same database rarely match exactly. (See the
Finding Data Example below.)

For global organizations, the challenge of accurately matching a name
increases when a name is written in a different character set in one
country (e.g., Cyrillic or Greek) than another (e.g., Latin/Roman/Western).
When an EU resident of Greece writes his or her name and address in
Greece, he/she may use the
Greek character set, which is different from the way it would be written
in the rest of Europe. Generally, matching systems use transliterations of
names for matching when character sets are mixed. However, since
transliterations vary, a matching system cannot rely on exact matches
across countries.

However, a system that is flexible in one country could incorrectly match
different people in countries where names are common. For example,
according to The Economist, one in five South Koreans have the surname Kim.

FINDING DATA EXAMPLE

If the famous footballer (soccer player) Therese Sjögran asked for a copy
of all of the personal data your company maintained about her, would your
system know if the names listed below belong to Ms. Sjögran?

Therese Sjögran
Thérèse Sjögran
Teresa Sjogren
Terry Sjogran
T. Sjögran
KIT Sjögran
Theresa Sjogrann
Kirstin Sjögran
Kerstin Ingrid Therese Sjögran

The answer is that any of these could be Ms. Sjögran, whose real name is
Kerstin Ingrid Therese Sjögran. The name she might submit on a Right To
Access request might not match all of the names used in all of her
transactions with your organization.

Resolving Identity

The solution to the problem of making accurate matches is to use as many
customer attributes as possible and to establish a tolerance level for each
attribute to accept a match. Local knowledge is critical to do the matching
properly because the significance of each attribute may vary by location.

The best practice followed by those who specialize in customer identity
resolution is to take the following six steps:

1. Gathering the data
2. Data parsing
3. Postal address hygiene
4. Email address, phone number, and other data hygiene (if available)
5. Matching and merging (reconciliation)
6. Metadata management

These steps are explained below in more detail:

1. Gathering The Data

The first step in the process of resolving a person’s identity is to gather
all of the person’s data. Taking an inventory to determine where personal
data is stored often involves large parts of an organization to identify
all of the potential source files. Examples of places where personal data
may be found are the following:
• Marketing: Website forms, email lists, tradeshow leads, etc.
• Accounting: Billing, receiving, credit reports, etc.
• Shipping / fulfillment records
• Sales: CRM systems, contact lists, etc.

Proper data governance and data stewardship include collecting and keeping
track of this source information so that when requests are made from
individuals the proper actions can be taken on the source files.

2. Data Parsing

Simple approaches to matching will not work because the same address may be
entered differently in different source systems. For example, “23 David
Street, St. Helier” may be the same address as “Avon House, David St.,
Saint Helier, Jersey.”

The best practice is to parse the information into its component
attributes. For example, a street address may be broken down into building
number, directional (e.g., north or east), street name, and street type
(e.g., street or avenue).
DATA PARSING EXAMPLES

ORIGINAL:               PARSED:
23 DAVID PLACE          Number:      23
ST HELIER JE2 4TE       Street:      David Place
                        City:        St Helier
                        Postal Code: JE2 4TE

PARSING EXAMPLE WITH NON-ROMAN CHARACTER SET

An individual may write an address for billing in Japan as follows:

北海道札幌市東区北二十四条東3-3-1

The source information in the Japanese system would be parsed as shown in
the first column. The same address would be Romanized for a European copy
of the database as shown in the second column:

                  Japanese source    Romanized
Block Sequence:   3-3-1              3-3-1
Area Name:        北二十四条東       Kita-24 Johigashi
District:         東区               Higashi-Ku
City:             札幌市             Sapporo-Shi
State:            北海道             Hokkaido
Postal Code:      065-0024           065-0024
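
As a rough illustration of the parsing step, the sketch below splits the
two-line Jersey address from the example above into its components. It is a
minimal sketch only: the function name parse_address and the two patterns
are assumptions made for this illustration, they cover only this simple
UK-style layout, and a production parser would need country-specific rules
and reference data.

```python
import re

# Hypothetical patterns for a simple UK/Jersey-style two-line address.
# Line 1: building number followed by the street.
LINE1 = re.compile(r"^\s*(?P<number>\d+[A-Za-z]?)\s+(?P<street>.+?)\s*$")
# Line 2: city followed by a UK-style postcode such as "JE2 4TE".
LINE2 = re.compile(
    r"^\s*(?P<city>.+?)\s+(?P<postcode>[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2})\s*$",
    re.IGNORECASE,
)

def parse_address(line1, line2):
    """Return the address components as a dict, or None if a line does not fit."""
    m1, m2 = LINE1.match(line1), LINE2.match(line2)
    if not (m1 and m2):
        return None
    return {
        "number": m1.group("number"),
        "street": m1.group("street").title(),
        "city": m2.group("city").title(),
        "postal_code": m2.group("postcode").upper(),
    }

parsed = parse_address("23 DAVID PLACE", "ST HELIER JE2 4TE")
print(parsed)
```

Returning None for lines that do not fit, rather than guessing, lets such
records be routed to a more specific parser or to manual review.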
When building a system for global use, it is important to create a
consistent data structure that can be used across countries. While parsing
resolution varies by geography, data usually may be parsed in the following
way: Unit/Apartment/Flat, Premises, Number, Street, City, District, Town,
Postal Code, and Country.

3. Address Hygiene

The same address may be entered into a database in a variety of ways, as
shown in the Address Hygiene Example below. Therefore, the next step is to
validate and standardize addresses.

Validation involves comparing a parsed address with official records and
making corrections. In some countries, a location may have multiple
designations or even names. Best practice would be to perform the following
operations:

• Determine or validate the country
• Match to Postcode Address File (PAF) reference data for the specific
  country, which includes officially licensed sources such as the country’s
  postal authority, other government agencies, and third parties.
• Correct and standardize identified components
• Append/insert missing components

While licensed and up-to-date PAF data may cost more than other sources,
they can vastly improve accuracy and quality. It is a good idea to ask
vendors whether they use officially licensed sources.
ADDRESS HYGIENE EXAMPLE

The original records appear as the user entered them into various on-line
forms (fields: FirstName, LastName, Address1, Address2, Address3, City,
Postcode; empty fields are not shown):

1. CHRISTIANE | GELLESCH | AVON HOUSE | 23 DAVID PLACE, ST. HELIER | JERSEY | JE2 4TE
2. CHRISTIANE | GELLESCH | AVON HOUSE | 23 DAVID PLACE | ST HELIER JERSEY | CHANNEL ISLANDS | JE2 2TE
3. CHRISTIANE | GELLESCH | AVON HOUSE | 23 DAVID PLACE, | ST. HELIER | JERSEY | JE2 4TE
4. CHRISTIANE | GELLESCH | 43 DAVID PLACE | AVON HOUSE | ST HELIER | JE2 4TE

The parsed address is validated and standardized as follows:

Building: Avon House
Number:   23
Street:   David Place
District: St. Helier
City:     Jersey
Postcode: JE2 4TE
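
The validation and correction operations can be sketched as a lookup
against reference data. In this minimal sketch, the REFERENCE list is a
hypothetical in-memory stand-in for licensed postal reference data, and the
names norm and standardize, along with the threshold of three agreeing
components, are assumptions chosen for illustration rather than a
recommended rule.

```python
import re

# Hypothetical stand-in for licensed postal reference data (e.g. PAF).
REFERENCE = [
    {"building": "Avon House", "number": "23", "street": "David Place",
     "district": "St. Helier", "city": "Jersey", "postcode": "JE2 4TE"},
]

def norm(text):
    """Uppercase, drop punctuation, and expand a leading 'ST' to 'SAINT'."""
    text = re.sub(r"[^\w\s]", "", text.upper()).strip()
    return re.sub(r"^ST\b", "SAINT", text)

def standardize(parsed):
    """Return the reference address that agrees with the most parsed components."""
    best, best_score = None, 0
    for ref in REFERENCE:
        score = sum(
            1 for key, value in parsed.items()
            if value and norm(value) == norm(ref.get(key, ""))
        )
        if score > best_score:
            best, best_score = ref, score
    # Assumed threshold: at least three components must agree.
    return best if best_score >= 3 else None

# A record with a mistyped postcode is corrected from the reference data.
entered = {"building": "AVON HOUSE", "number": "23", "street": "DAVID PLACE",
           "district": "ST HELIER", "postcode": "JE2 2TE"}
print(standardize(entered)["postcode"])
```

Because the reference record is returned whole, missing components such as
the city are appended and mistyped ones such as the postcode are corrected
in the same step.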
In large urban areas with multistory buildings, such as Hong Kong, there
may be apartment buildings in which many people have the same name within
the same premises. Sub-premises detail, such as floor numbers or building
information, adds significantly to the confidence in matching. Small towns
may use building names or route numbers without more specific address
components.

4. Email Address and Phone Number Hygiene

Email addresses and phone numbers may add valuable information when used
with names to resolve an individual’s identity. As with postal addresses, a
critical step is hygiene.

The following steps may be used to cleanse a customer’s email address:

• Check the email format for compliance with internet standards (RFC 5322,
  which superseded RFC 2822).
• Parse email address into user and domain.
• Correct common domain errors.
• Search Domain Name System (DNS) to confirm that the domain exists.
• When even greater confidence is required, send a test message to see if
  the message is accepted by the server to validate the specific email
  address.

Phone numbers may be similarly corrected using the following steps:

• Parse phone number to identify components.
• Remove illegal characters.
• Identify or insert country code and local area code.
• Validate against reference database.
• Format phone number.

EMAIL HYGIENE EXAMPLE

In this example, the customer’s email address in the database is as
follows:

Christiane @ yaho.couk

• Recognize that “.couk” is not a proper top-level domain.
• Parse the top-level domain and identify that “uk” is a proper country
  code top-level domain for the United Kingdom.
• Alter the domain to “@ yaho.co.uk”.
• Check that the domain “yaho.co.uk” exists using the available DNS
  records. In this case, it does not.
• Recognize common errors, such as typing “yaho” instead of “yahoo”.
• Alter the domain to “@ yahoo.co.uk”.
• Validate the domain “yahoo.co.uk” using the available DNS records.
• For even higher accuracy, send a message to the email to see if the
  message bounces or use a third-party service to validate the user
  mailbox.

The standardized address would be as follows:

christiane @ yahoo.co.uk
With the postal address, email address, and phone number fields cleansed,
standardized, and verified, it is now possible to compare records with each
other for matching and reconciliation, which is the key to identity
resolution and building the Golden Master Record.

PHONE NUMBER HYGIENE EXAMPLE

The same phone number may be entered into a database in the following ways:

+44 7509312345
07509 312345
0 7509312345
75.09.31.23.45

A check of the address record would determine that the phone number is
likely to be in the Channel Islands, which use the UK country code. If the
incoming number has a +44 at the beginning, it can be matched to the
country code for the UK. If it does not begin with a +44, the country can
be postulated from the country field of the address. The remaining digits
can then be parsed for a known UK pattern. Once properly parsed, the area
code can be used to identify the phone type, as follows:

Country code: +44
NDD Prefix:   0
Area code:    7509
Local number: 312345
Phone type:   Mobile

The final result can then be shown in the correct formats. The domestic
format would be 07509 312345. The international number format would be
+44 7509 312345.

Validating using known patterns for a particular country adds confidence
that it was correctly entered into the system and parsed. Here are some
ways in which invalid phone numbers may be identified:

Country    Data Found   Problem
Australia  26621764     “26621” numbers must have exactly nine digits
Germany    5188547213   Numbers in Germany may not start with “5188”
Germany    0114989250   Numbers may not start with “498” and “89250”
                        numbers must have seven digits
France     3.33600E+11  Exponential notation caused by corruption

5. Matching and Merging (Duplicate Reconciliation)

Simple attribute matching is not enough to retrieve all customer records
because even standardized records rarely match
exactly. Personal data is always changing. People move. Individuals may
provide a work phone number during one interaction and a mobile number
during another interaction. Many people also have multiple addresses (e.g.,
primary and vacation homes), multiple phone numbers (e.g., fixed landline
and mobile), and multiple email addresses (e.g., personal and work).

Systems must be able to build a Golden Master Record even when the
information is incomplete or disagrees. Without that capability, a customer
could appear multiple times. Therefore, a search would not retrieve all of
the customer data required during a GDPR request.

An important technique known as “cascading” helps to confidently build a
complete Golden Master Record even when some attributes disagree and to
fill in incomplete information. An example of cascading is described below.

For cascading to work, matching tolerances must be established individually
for each attribute. The match tolerance for any given attribute may differ
based on the geography. As mentioned earlier, in some densely populated
locations, such as Hong Kong, sub-premises detail is important. The weight
placed upon those address components in Hong Kong would be higher than in a
smaller city.
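
Per-attribute tolerances and geography-dependent weights might be
expressed as follows. This is a minimal sketch: the attribute names, the
tolerance and weight values in RULES and HONG_KONG_RULES, and the use of
Python's difflib similarity ratio are all assumptions chosen for
illustration, not values or methods taken from any particular matching
product.

```python
import unicodedata
from difflib import SequenceMatcher

def fold(text):
    """Lowercase and strip diacritics so 'Sjögran' compares equal to 'Sjogran'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()

def similarity(a, b):
    """Similarity between 0.0 and 1.0 after folding case and accents."""
    return SequenceMatcher(None, fold(a), fold(b)).ratio()

# Hypothetical rules: attribute -> (minimum similarity to agree, weight).
RULES = {
    "last_name": (0.85, 3.0),
    "first_name": (0.60, 1.0),
    "postcode": (1.00, 2.0),
    "sub_premises": (1.00, 0.5),
}
# Assumed regional override: sub-premises detail weighs more in Hong Kong.
HONG_KONG_RULES = {**RULES, "sub_premises": (1.00, 3.0)}

def match_score(rec_a, rec_b, rules=RULES):
    """Weighted share of comparable attributes that agree within tolerance."""
    agreed = total = 0.0
    for attr, (tolerance, weight) in rules.items():
        a, b = rec_a.get(attr), rec_b.get(attr)
        if not a or not b:
            continue  # an attribute missing on either side is not compared
        total += weight
        if similarity(a, b) >= tolerance:
            agreed += weight
    return agreed / total if total else 0.0

a = {"last_name": "Sjögran", "first_name": "Therese", "postcode": "JE2 4TE"}
b = {"last_name": "Sjogran", "first_name": "Thérèse", "postcode": "JE2 4TE"}
print(match_score(a, b))
```

A pair of records would then be accepted as a match when the score exceeds
a threshold chosen for the geography in question.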
CASCADING: A KEY MATCHING TECHNIQUE

Cascading uses all of the customer information available across multiple
records to increase the probability of a match. Consider the example of
three records:

• “Record 1” contains a name, postal address, and phone number.
• “Record 2” contains a name, postal address, phone number, and email.
• “Record 3” contains a name, email, and a different phone number.

Record 1 and Record 2 match because the name, address and phone number
match.

Record 1 and Record 3 considered in isolation would not be a match because
the phone number differs and there is no other useful data to match to
Record 1 alone.

However, with cascading, Record 3 can be matched. Because Record 1 and
Record 2 are known to match, there is knowledge about the customer’s
identity gained from the combination of the records: name, address, email
address, and phone number. This new combined knowledge can be used for
matching. The email address in Record 3 matches the combined knowledge of
Record 1 and Record 2.

As more data is compared the knowledge cascades. In addition to the name,
address, email address, and phone number attributes, some valuable
information used in cascading includes gender, social media handles and
identification numbers.
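
The three-record example can be sketched as repeated merging of record
clusters, where each cluster carries the combined knowledge of its
members. This is a minimal sketch: the rule that two clusters match when
they share a name and at least one other attribute value is an assumption
made for illustration, exact comparison stands in for tolerance-based
matching, and the sample values are hypothetical.

```python
from collections import defaultdict

ATTRS = ("address", "phone", "email")

def knowledge_of(record):
    """Turn one record into cluster knowledge: attribute -> set of known values."""
    knowledge = defaultdict(set)
    for attr, value in record.items():
        if value:
            knowledge[attr].add(value)
    return knowledge

def compatible(k1, k2):
    """Assumed rule: a shared name plus at least one other shared value."""
    if not (k1["name"] & k2["name"]):
        return False
    return any(k1[attr] & k2[attr] for attr in ATTRS)

def cascade(records):
    """Merge clusters until no two remaining clusters are compatible."""
    clusters = [([i], knowledge_of(r)) for i, r in enumerate(records)]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if compatible(clusters[i][1], clusters[j][1]):
                    members, knowledge = clusters.pop(j)
                    clusters[i][0].extend(members)
                    for attr, values in knowledge.items():
                        clusters[i][1][attr] |= values  # knowledge cascades
                    merged = True
                    break
            if merged:
                break
    return [sorted(members) for members, _ in clusters]

r1 = {"name": "Christiane Gellesch", "address": "23 David Place", "phone": "07509 312345"}
r2 = {**r1, "email": "christiane@yahoo.co.uk"}
r3 = {"name": "Christiane Gellesch", "email": "christiane@yahoo.co.uk", "phone": "07509 000000"}
print(cascade([r1, r3, r2]))
```

Restarting the scan after every merge keeps the result independent of the
order in which the records arrive: here Record 3 is listed before Record 2
and is still pulled into the same cluster.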
GOLDEN MASTER RECORD AND CASCADING EXAMPLE

The following illustration shows a Golden Master Record with the most
accurate consolidated information about the customer after successful
matching with cascading:

Attribute  Record 1  Record 2  Record 3  Record 4  Golden Master
First Name  Christiane  Christiane  Chris  Christiane  Christiane
Last Name  Gellesch  Gellesch  Gellesch  Gellesch  Gellesch
Building  Avon House  Avon House  Avon House
Number  23  43  23  23
Street  David Place  David Place  David Place  David Place
District  St. Helier  St. Helier  St. Helier  St. Helier  St. Helier
City  Jersey  Jersey  Jersey  Jersey
Postcode  JE2 4TE  JE2 4TE  JE2 4TE  JE2 4TE  JE2 4TE
Country  Channel Islands  Channel Islands  Channel Islands  Channel Islands  Channel Islands
Country Code  +44  +44  +44  +44
Local Area Code  0  0  0  0
Area Code  7509  7509  7509  7509
Local Number  312345  312345  312345  312345
Email  Christiane@yahoo.co.uk  Christiane@yahoo.co.uk  Christiane@yahoo.co.uk  Christiane@yahoo.co.uk
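
One simple way to consolidate matched records into a Golden Master Record
is sketched below. This is an illustration under assumptions: taking the
most common non-empty value for each attribute is a rule chosen here for
simplicity (other rules, such as preferring the most recent value, are
equally plausible), and the record identifiers and sample values are
hypothetical.

```python
from collections import Counter

def build_golden_record(matched):
    """Consolidate matched records; 'matched' maps source-record id -> attributes."""
    golden = {}
    attributes = {attr for record in matched.values() for attr in record}
    for attr in sorted(attributes):
        values = [record[attr] for record in matched.values() if record.get(attr)]
        if values:
            # Assumed survivorship rule: keep the most common non-empty value.
            golden[attr] = Counter(values).most_common(1)[0][0]
    # Keep the keys of the source records so they can be found again later.
    golden["source_ids"] = sorted(matched)
    return golden

matched = {
    "crm:1": {"first_name": "Christiane", "number": "23", "building": "Avon House"},
    "web:7": {"first_name": "Chris", "number": "23"},
    "ship:3": {"first_name": "Christiane", "number": "43", "building": "Avon House"},
}
golden = build_golden_record(matched)
print(golden)
```

Storing the source identifiers alongside the consolidated values is what
allows an access, correction, or deletion request to be carried through to
every underlying record.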
The art in the process of matching is to understand the importance of each
attribute in a particular geography and to set the match threshold
accordingly. Significant local knowledge and experience are required to get
this right.

6. Metadata

Because GDPR allows individuals to grant and change permissions for each
attribute, it is important that the Golden Master Record contain additional
information (“metadata”) for each attribute or group of attributes about
these permissions. (An example of a group of attributes would be a person’s
address. If an individual grants the use of an address, the permission
would apply equally to all of the address attributes: building, number,
street, district, and city.)

Best practice is for the metadata for each attribute or group of attributes
to include the types of permissions granted and the date of the grant. The
date is critical because permissions can change and it is necessary to know
the current permissions. For example, it is possible that an individual
will grant the use of an address for any purpose in January, limit that use
to credit card notifications in March, and then allow it for sale
notifications in June.
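
The dated permissions in the address example could be stored as a history
of grants per attribute group. In this minimal sketch, the Grant class, the
purpose names, and the dates in 2018 are hypothetical, and it assumes that
each entry records the complete set of purposes permitted from that date
onward, so the latest applicable entry decides.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Grant:
    granted_on: date
    purposes: frozenset  # purposes permitted from this date; "*" means any purpose

def current_purposes(history, as_of):
    """Return the purposes from the most recent grant on or before 'as_of'."""
    applicable = [g for g in history if g.granted_on <= as_of]
    if not applicable:
        return frozenset()  # no consent recorded yet
    return max(applicable, key=lambda g: g.granted_on).purposes

def may_use(history, purpose, as_of):
    """True if the attribute group may be used for 'purpose' on the given date."""
    allowed = current_purposes(history, as_of)
    return "*" in allowed or purpose in allowed

# Permission history for the address group (hypothetical dates and purpose names).
address_history = [
    Grant(date(2018, 1, 15), frozenset({"*"})),
    Grant(date(2018, 3, 1), frozenset({"credit_card_notifications"})),
    Grant(date(2018, 6, 1), frozenset({"credit_card_notifications", "sale_notifications"})),
]
print(may_use(address_history, "sale_notifications", date(2018, 4, 1)))
```

Keeping the full history, rather than only the latest state, also supports
the requirement to prove what consent was in force at any given time.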
As the Golden Master Record is created, it is also best practice to create
and maintain a link back to the source records. This is especially
important as it is necessary to change or delete a source record when
requested.

The source databases must be considered when building the metadata for the
Golden Master Record as well. A large conglomerate may want to have a
Golden Master Record that covers the entire conglomerate. However, an
individual may grant permission to one subsidiary or brand, but not another
subsidiary or brand. This may mean that the conglomerate does not have the
permission to use the attribute in any place other than where the
permission was granted. Master record management requires knowing the
source of the permissions and the applicable entity.

Conclusion

To meet European General Data Protection Regulation (GDPR) requirements, a
customer identity resolution process becomes critical. While there are many
other requirements for GDPR compliance, it is necessary to be able to find,
edit, and potentially delete any personal data about any EU individual.

Creating Golden Master Records with links to find the original source
records is critical to accomplishing this goal. Deep international
experience with an understanding of data in each country greatly increases
the chance of success.

If you would like to know more about Golden Master Records, this process or
how they would apply in your circumstance, please contact Global-Z at
gdpr_info@globalz.com.
v1/18
Copyright 2018 by Global-Z International. All Rights Reserved.
This article is for informational purposes only and should not be considered legal advice. You should seek appropriate counsel for
your own situation. Global-Z International is not liable for the information provided herein. Global-Z International is not responsible
for errors and all information is subject to change without notice. Global-Z and the Global-Z logo are trademarks of Global-Z
International. All other marks are the property of their respective owners.