Lexer's Identity Resolution

What is Identity Resolution?

Identity Resolution sets the foundation for a single-customer view. It works by linking customer touchpoints to their customer profile; these connections are aptly named “links”. By mapping different datapoints into one customer profile, segmentation and insights become possible across the entire MarTech stack.

“Identity Resolution” is a data workflow, whereas the “Single-Customer View” is its output, which is available in Lexer’s Understand tool and lets you activate to different marketing channels.

Problem and pain points

Let’s start with the problem: fragmented data that arises from multiple systems supporting customer experience.

For example, imagine a customer who receives messages from two different marketing campaigns. On Monday the customer receives an email promoting jackets for outdoor activities; on Wednesday they get a notification for streetwear ensembles. These messages could conflict and prevent conversion. How could this happen?

A common source of this discrepancy is that the same customer profile actually lives in two segments. Data from two systems, such as ecommerce and email, form the basis of a customer profile, but without resolution these profiles aren’t unified.

This impacts customer experience and also limits progress in the marketing program because it erodes trust. When trust in data is low, it becomes difficult to commit to plans, the cycle of learning from campaigns slows down, and the whole marketing program can stall.

How it works

Identity Resolution framework

The Identity Resolution workflow has five steps:

  1. Collect: Lexer connects to customer-facing systems, via integration or solution design, and acquires data into our secure environment.
  2. Clean: The data is then processed so that each record meets data type specifications, e.g. email and mobile formats.
  3. Link: Relationships are discovered between profiles and touchpoints.
  4. Resolve: A profile of the customer is now formed from linked records.
  5. Deploy: A single view of the customer is made available for personalization through attribute calculation, segmentation, and activation.

Note: It can be important to spend some time at step three to ensure that the selected linkage configuration matches the business logic.
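
The middle steps of the workflow above can be illustrated with a minimal sketch. It assumes email is the only identifier and uses hypothetical record fields (`source`, `email`, `order_id`, `campaign`); it is not Lexer's implementation, only a toy version of Clean, Link, and Resolve.

```python
from collections import defaultdict

def clean(record):
    """Clean step: normalize the email so equal addresses compare equal."""
    cleaned = dict(record)
    cleaned["email"] = record["email"].strip().lower()
    return cleaned

def resolve(records):
    """Link and Resolve steps: group cleaned records that share an email."""
    profiles = defaultdict(list)
    for record in map(clean, records):
        profiles[record["email"]].append(record)
    return dict(profiles)

# Collect step stand-in: hypothetical records from two source systems.
records = [
    {"source": "ecommerce", "email": "Ana@Example.com ", "order_id": "A-1"},
    {"source": "email", "email": "ana@example.com", "campaign": "jackets"},
    {"source": "email", "email": "bo@example.com", "campaign": "streetwear"},
]

profiles = resolve(records)
for email, linked in profiles.items():
    print(email, [r["source"] for r in linked])
```

Without the Clean step, the first two records would not share an identifier and the same customer would end up in two profiles.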

Concepts

Here’s a set of terms you’ll see a lot when exploring this topic. This glossary pulls together the Lexer definition of these terms.

Identity Resolution

Identity Resolution is the process of linking records of a person to form a “Single-Customer View” (SCV). An accurate SCV is crucial for powerful segmentation and personalized messaging. Broadly there are two approaches to unification: deterministic and probabilistic.

The deterministic approach trusts the identifiers (such as email and customer_id) to represent a person and link the records.

The probabilistic approach is more complex but can overcome poor data quality to find links between records.

Identity Resolution is otherwise known as unification, deduplication, and stitching. It’s also a more specialized implementation of Record Linkage in that it focuses on people instead of other entities.

Graph

A graph is a useful data structure for representing relationships between records. In this case, we are using a graph to represent the relationships between customers. If you’ve ever played Six Degrees of Kevin Bacon, you’ll understand the concept of a graph: actors are connected through the movies they share. Replace actors and movies with customer records and shared touchpoints, such as buying t-shirts and opening emails, and you’ve got a Single-Customer View.
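
One way to picture this is as connected components: every group of records joined by links becomes one customer. The sketch below uses a small union-find structure over hypothetical node names (`pos:1001`, `shopify:77`, and so on); it illustrates the graph idea only and is not a description of how Lexer stores its identity graph.

```python
def find(parent, node):
    """Return the representative of node's component, compressing the path."""
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node

def connected_components(nodes, edges):
    """Group nodes into sets, where each set is joined by at least one chain of edges."""
    parent = {n: n for n in nodes}
    for a, b in edges:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_b] = root_a
    groups = {}
    for n in nodes:
        groups.setdefault(find(parent, n), set()).add(n)
    return list(groups.values())

# Hypothetical records (nodes) and links (edges).
nodes = [
    "pos:1001", "shopify:77", "email:ana@example.com",
    "klaviyo:9", "email:bo@example.com",
]
edges = [
    ("pos:1001", "email:ana@example.com"),
    ("shopify:77", "email:ana@example.com"),
    ("klaviyo:9", "email:bo@example.com"),
]

components = connected_components(nodes, edges)
print(len(components))  # 2
```

The Point of Sale record and the Shopify record never link to each other directly, but both link to the same email address, so they land in the same component.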

Touchpoint

A touchpoint is an interaction with a customer, such as an in-store purchase, email click, or website login. Touchpoints are otherwise known as interactions, events, or observations.

Identifier

A datapoint in a profile that directly (deterministically) or indirectly (probabilistically) can be found in another profile or touchpoint. A high-confidence identifier will uniquely link to only one customer profile.

Good examples of identifiers are:

  • Customer ID assigned by a Point of Sale system or CRM
  • System ID assigned by a commerce or marketing application such as Shopify or Klaviyo
  • Email Address provided by the customer at the time of signup, although shared email addresses are common

Examples of identifiers that link with less confidence include:

  • A mobile phone number
  • A web browser or cookie
  • A household address

Link

A link represents a relationship between two or more profiles (entities) or touchpoints (events). A deterministic link is a shared datapoint such as a customer’s email address or customer ID.

Deterministic Identity Resolution

The simplest kind of identity resolution, called deterministic or rules-based record linkage, generates links based on the number of individual identifiers that match among the available data sets. Two records are said to match via a deterministic record linkage procedure if all or some identifiers (above a certain threshold) are identical. Deterministic record linkage is a good option when the entities in the data sets are identified by a common identifier, or when there are several representative identifiers (e.g. name, date of birth, and sex, when identifying a person) whose quality of data is relatively high.
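
The threshold rule described above can be sketched in a few lines. The field names and the `min_matches` parameter are assumptions made for illustration; the source does not state which identifiers or thresholds Lexer uses.

```python
def deterministic_match(record_a, record_b, identifiers, min_matches=1):
    """Return True when at least min_matches identifiers are present and identical."""
    matches = sum(
        1
        for key in identifiers
        if record_a.get(key) is not None and record_a.get(key) == record_b.get(key)
    )
    return matches >= min_matches

IDENTIFIERS = ["customer_id", "email", "mobile"]

# Hypothetical records.
a = {"customer_id": "C-1", "email": "ana@example.com", "mobile": "+61400000001"}
b = {"customer_id": None, "email": "ana@example.com", "mobile": "+61400000001"}
c = {"customer_id": "C-2", "email": "ana@example.com", "mobile": "+61400000009"}

print(deterministic_match(a, b, IDENTIFIERS, min_matches=2))  # True
print(deterministic_match(a, c, IDENTIFIERS, min_matches=2))  # False
print(deterministic_match(a, c, IDENTIFIERS, min_matches=1))  # True
```

Raising `min_matches` trades recall for precision: records `a` and `c` share an email, which links them at a threshold of one but not at a threshold of two.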

Probabilistic Identity Resolution

Probabilistic record linkage, sometimes called fuzzy matching, takes a different approach to the record linkage problem by taking into account a wider range of potential identifiers. For each identifier compared between two records, a model estimates a weight for a match or a non-match, and these weights are used to calculate the probability that the two records refer to the same entity. Record pairs with probabilities above a certain threshold are considered to be matches, while pairs with probabilities below another threshold are considered to be non-matches. Pairs that fall between these two thresholds are considered to be "possible matches" and can be dealt with accordingly (e.g. human reviewed, linked, or not linked, depending on the requirements). Probabilistic identity resolution is also called probabilistic merging or fuzzy merging.
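
A minimal sketch of the two-threshold scheme follows, in the style of the classic Fellegi-Sunter model. Every number here is an assumption chosen for illustration: the per-field `m` and `u` probabilities, the prior, and both thresholds are hypothetical, and the source does not say which model Lexer uses.

```python
# Hypothetical per-field probabilities:
# m = P(field agrees | same person), u = P(field agrees | different people).
FIELD_PARAMS = {
    "email": (0.95, 0.001),
    "mobile": (0.90, 0.01),
    "postcode": (0.85, 0.05),
}

def match_probability(record_a, record_b, prior=0.01):
    """Combine per-field evidence into the probability that two records match."""
    odds = prior / (1 - prior)
    for field, (m, u) in FIELD_PARAMS.items():
        value_a, value_b = record_a.get(field), record_b.get(field)
        if value_a is None or value_b is None:
            continue  # missing data contributes no evidence either way
        if value_a == value_b:
            odds *= m / u  # agreement raises the odds
        else:
            odds *= (1 - m) / (1 - u)  # disagreement lowers the odds
    return odds / (1 + odds)

def classify(probability, upper=0.9, lower=0.1):
    """Apply the two thresholds: match, non-match, or possible match."""
    if probability > upper:
        return "match"
    if probability < lower:
        return "non-match"
    return "possible match"

# Hypothetical records.
a = {"email": "ana@example.com", "mobile": "+61400000001", "postcode": "3000"}
b = {"email": "ana@example.com", "mobile": "+61400000002", "postcode": "3000"}
c = {"email": "ana@work.example", "mobile": "+61400000001", "postcode": "3000"}
d = {"email": "bo@example.com", "mobile": "+61400000003", "postcode": "2000"}

for other in (b, c, d):
    print(classify(match_probability(a, other)))
```

With these assumed numbers, a shared email outweighs a differing mobile number, while a shared mobile and postcode alone leave the pair in the "possible match" band for review.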

Defining Success

Success in Identity Resolution means finding the best links between profiles and interactions for a customer profile.

To reach this objective and get linkage just right, we need to avoid two failure cases: finding too many links (over unification) or too few links (under unification).

Over unification

Over unification is where profiles are merged that should actually be distinct. In the worst case, the identity graph can collapse, with many identities merged into a single profile. Factors leading to over unification include low data quality or spurious input, such as false addresses like test@test.com.

To protect against over unification Lexer implements link occurrence thresholds and term frequency analysis. For data that may not have sufficient linkage quality Lexer also supports appending attributes to profiles after resolution has occurred.
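
As an illustration of a link occurrence threshold, the sketch below discards any identifier value that is attached to too many profiles before linking. The function name, the `(profile_id, value)` pair format, and the cutoff of three are assumptions; Lexer's actual thresholds and term frequency analysis are not described here.

```python
from collections import defaultdict

def trusted_links(links, max_profiles_per_value=3):
    """Keep links whose identifier value touches at most max_profiles_per_value profiles."""
    profiles_by_value = defaultdict(set)
    for profile_id, value in links:
        profiles_by_value[value].add(profile_id)
    return [
        (profile_id, value)
        for profile_id, value in links
        if len(profiles_by_value[value]) <= max_profiles_per_value
    ]

# Hypothetical (profile_id, identifier value) pairs.
links = [
    ("p1", "test@test.com"),
    ("p2", "test@test.com"),
    ("p3", "test@test.com"),
    ("p4", "test@test.com"),
    ("p1", "ana@example.com"),
    ("p5", "ana@example.com"),
]

kept = trusted_links(links)
print(kept)  # [('p1', 'ana@example.com'), ('p5', 'ana@example.com')]
```

The placeholder address is shared by four profiles, so it is dropped and cannot collapse them into one identity, while the address shared by two profiles still links them.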

Under unification

Under unification occurs when records that belong together fail to link because their identifiers are formatted differently. Lexer deploys rich data type specifications to clean and normalize data to maximize the linkage opportunity.
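
A small sketch shows why normalization matters for linkage. The rules below are assumptions for illustration, not Lexer's data type specifications: emails are trimmed and lowercased, and mobile numbers are assumed to be either international (leading `+`) or national with a leading `0`, using a hypothetical default country code of 61.

```python
import re

def normalize_email(raw):
    """Trim whitespace and lowercase so equal addresses compare equal."""
    return raw.strip().lower()

def normalize_mobile(raw, default_country_code="61"):
    """Reduce a mobile number to +<country code><number> with digits only."""
    digits = re.sub(r"\D", "", raw)
    if raw.strip().startswith("+"):
        return "+" + digits  # already international
    if digits.startswith("0"):
        digits = digits[1:]  # drop the national trunk prefix
    return "+" + default_country_code + digits

print(normalize_email("  Ana@Example.COM "))  # ana@example.com
print(normalize_mobile("0412 345 678"))       # +61412345678
print(normalize_mobile("+61 412 345 678"))    # +61412345678
```

Before normalization the two mobile numbers are different strings and would not link; afterwards they are identical.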

Identity Resolution recap

Lexer's Identity Resolution sets the foundation for a single-customer view and works by linking customer touchpoints to their customer profile. By mapping different datapoints into one customer profile, segmentation and insights become possible across the entire MarTech stack.

Updated:
September 23, 2022