Dataset management in the Hub
Datasets in Lexer: An explainer
Lexer’s Dataset Manager gives you the ability to explore the data and high-level statistics from each of your integrated accounts, directly in the Hub.
Each of your datasets contains record types specific to an integrated account that has data flowing into the Hub. This data has been transformed and organized into datasets that follow Lexer’s standard schema.
For example, Shopify datasets contain record types related to Shopify online transactions, including customer records, products, orders, and returns. By contrast, Klaviyo datasets contain record types relating to email events, such as customer, emails sent, emails clicked, and subscribed.
Organizing data into a dataset
In the diagram below, we have used Klaviyo as an example to illustrate how data makes its way into a dataset from the original Klaviyo source. You are then able to view this dataset within the Lexer Hub.
- Klaviyo connects with Lexer via an API, which is used to transfer data into Lexer.
- The raw data that comes in from Klaviyo is stored in an Amazon S3 bucket.
- The data from this S3 bucket is then transformed into Lexer's accepted data schema.
- The data is then arranged into a dataset that can be accessed and viewed via the Lexer Hub.
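
To make the flow above more concrete, here is a minimal sketch of the transform and arrange steps. It is illustrative only: the field names, record type names, and the `Dataset`, `transform`, and `build_dataset` helpers are hypothetical and do not reflect Lexer's actual schema or pipeline code. The raw events stand in for payloads that would have been landed in the S3 bucket.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A named collection of records, grouped by record type (hypothetical shape)."""
    name: str
    records_by_type: dict[str, list[dict]] = field(default_factory=dict)


def transform(raw_event: dict) -> tuple[str, dict]:
    """Map one raw source payload to (record_type, record).

    The input and output field names are assumptions made for this example.
    """
    record = {
        "customer_email": raw_event["email"].strip().lower(),
        "timestamp": raw_event["timestamp"],
    }
    return raw_event["type"], record


def build_dataset(name: str, raw_events: list[dict]) -> Dataset:
    """Transform every raw event and group the results by record type."""
    dataset = Dataset(name)
    for raw_event in raw_events:
        record_type, record = transform(raw_event)
        dataset.records_by_type.setdefault(record_type, []).append(record)
    return dataset


# Stand-in for raw payloads read from storage.
raw_events = [
    {"type": "email_send", "email": " Ana@Example.com ", "timestamp": "2024-01-01T09:00:00Z"},
    {"type": "email_click", "email": "ana@example.com", "timestamp": "2024-01-01T09:05:00Z"},
]

dataset = build_dataset("klaviyo", raw_events)
for record_type, records in dataset.records_by_type.items():
    print(record_type, len(records))
```

Grouping by record type mirrors how a dataset is presented in the Hub, where each record type appears as its own tab.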

So, how can I use datasets?
Having your datasets accessible within the Hub has several benefits. The main use cases are outlined below.
Data QA and validation
Lexer’s Dataset Manager lets you quickly view and validate the health of your integration feed, including the continuity, volume, and freshness of the data flowing into your datasets. To drill down further, you can click into a record to view the payload for an individual customer record.
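
As a rough illustration of what continuity and freshness mean in practice, the sketch below checks a set of daily record counts for gaps and staleness. The counts, dates, and the `missing_days` and `days_since_last_record` helpers are made up for this example; the Dataset Manager surfaces this information in its UI, and this is not how the Hub computes it.

```python
from datetime import date, timedelta


def missing_days(daily_counts: dict[date, int], start: date, end: date) -> list[date]:
    """Return every day in [start, end] with no records (a continuity gap)."""
    total_days = (end - start).days + 1
    all_days = [start + timedelta(days=offset) for offset in range(total_days)]
    return [day for day in all_days if daily_counts.get(day, 0) == 0]


def days_since_last_record(daily_counts: dict[date, int], today: date) -> int:
    """Return how many days ago the most recent record arrived (freshness)."""
    last_day_with_data = max(day for day, count in daily_counts.items() if count > 0)
    return (today - last_day_with_data).days


# Hypothetical daily volumes for one record type.
daily_counts = {
    date(2024, 1, 1): 410,
    date(2024, 1, 2): 395,
    date(2024, 1, 4): 402,
}

gaps = missing_days(daily_counts, date(2024, 1, 1), date(2024, 1, 5))
staleness = days_since_last_record(daily_counts, today=date(2024, 1, 5))
print("Days with no data:", gaps)
print("Days since last record:", staleness)
```

A gap in the middle of the range suggests a continuity problem, while a large staleness value suggests the feed has stopped delivering data.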
Dataset statistics
Within the Dataset Manager you can also access high-level statistics and charts, which allow you to quickly visualize important metrics for each dataset. These will vary depending on the type of records contained within the dataset.


As we continue to build out our API capabilities, the Dataset Manager will also give you the ability to create your own datasets, which you can write to directly using JSON and CSV uploads!
Finding your datasets in the Hub
You can view your datasets in the Hub by navigating to Manage > Datasets in the top navigation bar.

Please note that access to datasets is managed via Group Permissions. Lexer admins can give access to this feature by selecting the Can view and manage datasets checkbox when creating or editing a permissions group. Contact Lexer if you need your permissions updated to enable access to this feature.
Navigating the Dataset Manager UI
Once you find your datasets in the Hub, navigating this tool and understanding each different section is easy!
All your datasets will be listed in the left-side panel. For each dataset, you’ll see its name and a brief description, along with the status of the last load job and when it ran.

Click on a dataset in this panel to open the detailed dataset view in the main window. You can also click the View button in the top right-hand corner of the dataset screen to see more details about your dataset and the jobs that have run.

The Jobs tab is especially useful because you’ll be able to see a history of your dataset. This includes:
- The status of the last job load: Did it run successfully, did it fail, or is it still pending?
- When the job run started (dates and times are displayed in your local timezone).
- How long the job took to run.
- Which record types were updated.
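
If it helps to think of the job history as data, each row can be pictured as a small record holding the four fields listed above. The sketch below is a hypothetical model only: the `Job` class, the status labels, and the record type names are assumptions made for this example, not a Lexer API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Job:
    """One row of a dataset's job history (hypothetical shape)."""
    status: str                # assumed labels: "success", "failed", "pending"
    started_at: datetime       # stored here as a timezone-aware UTC value
    duration: timedelta        # how long the job took to run
    record_types: list[str]    # record types updated by this job


def latest_job(jobs: list[Job]) -> Job:
    """Return the job that started most recently."""
    return max(jobs, key=lambda job: job.started_at)


jobs = [
    Job("success", datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc),
        timedelta(minutes=4), ["customer", "email_send"]),
    Job("failed", datetime(2024, 1, 2, 2, 0, tzinfo=timezone.utc),
        timedelta(minutes=1), []),
]

last = latest_job(jobs)
# astimezone() with no argument converts to the machine's local timezone,
# similar to how the Jobs tab shows dates and times in your local timezone.
print(last.status, last.started_at.astimezone(), last.duration, last.record_types)
```

Reading the history this way makes the key question easy to answer: did the most recent job succeed, and which record types did it touch?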

To find out more information about each individual job, you can click on the row it belongs to, which will bring up the Job View panel.
Towards the bottom of the panel you’ll see a section called Stats. The table in this section displays a list of the record types that were updated, including:
- Total Records: The sum of New Records + Updated Records.
- New Records: All new records that will be loaded to the CDE.
- Updated Records: All existing records that have been updated.
- Rejected Records: All records that have been rejected and are not a part of the Total Records count.
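
The relationship between these counts can be written out as a small worked example. The numbers and the `RecordTypeStats` class are made up for illustration, and the rejection rate is a derived figure added here for convenience; it is not described as a column in the Stats table.

```python
from dataclasses import dataclass


@dataclass
class RecordTypeStats:
    """Counts for one record type in a single job (illustrative values only)."""
    new_records: int
    updated_records: int
    rejected_records: int

    @property
    def total_records(self) -> int:
        # Total Records = New Records + Updated Records; rejected records are excluded.
        return self.new_records + self.updated_records

    @property
    def rejection_rate(self) -> float:
        # Share of all records seen by the job that were rejected.
        seen = self.total_records + self.rejected_records
        return self.rejected_records / seen if seen else 0.0


stats = RecordTypeStats(new_records=120, updated_records=30, rejected_records=10)
print(stats.total_records)   # 150
print(stats.rejection_rate)  # 0.0625
```

Because rejected records sit outside the total, a job can report a healthy Total Records count while still rejecting a meaningful share of incoming records, so it is worth checking both figures.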

You can then collapse these panels and move back to the main view, where you will see a list of record types for the selected dataset. These form the basis of the dataset you are viewing.
The example below shows a list of Customer records in a Klaviyo dataset. You can view other record types within the dataset by selecting a different dataset record type from the tabs along the top.

Click on a record to view detailed payload information.
Use the date picker at the top of the page to change the date range of the data you wish to view.

Dataset metrics and statistics
Top-line metrics for the selected record type are presented at the top of the main window, along with a chart of high-level metrics for that record type across the selected date range.

What’s next for the Dataset Manager?
The next step with datasets is to give you the ability to create and edit your own datasets within the Dataset Manager, and to support direct integration using Lexer’s Write API. The Write API is coming soon, so watch this space!