Getting started with Logs

Data ingestion process
Before you get stuck into this article, have you checked out How data is ingested into the Hub? It covers our data ingestion process, and we recommend reading it first as it provides the foundation for Logs.

Monitoring your data in the Hub has just got even easier with Logs. Logs lets you view the status of your data load and processing jobs at a glance. Logs isn’t just about monitoring; it can also help you troubleshoot data issues, making resolution with the Lexer team even quicker. Get ready to take control of your data management like never before!

The order of Logs jobs

Jobs in the Hub run in a specific order, and this order can be a great help in identifying if and where something has gone wrong (see the sketch after this list). The general order when uploading data involves 5 job types:

  1. Integration or File Upload: You integrate or upload a file.
  2. Dataset Load: The uploaded data is loaded into a dataset.
  3. Dataset health checks: These checks determine whether there are any issues with the data in the dataset, with a focus on volume, recency and rejected records. 
  4. Dataflow: Your data goes through multiple processes to organize and enrich it for Hub readiness.
  5. Build: The final stages involve building your data into the functional Hub interface, enabling interrogation.
Your upload may not include all of the steps shown here. The jobs depend on the source and method you’ve chosen to use for your upload.
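
If you like to think in code, here’s a minimal sketch of how you might use this order to locate a failure. It’s an illustration only, assuming hypothetical job records; it isn’t a Lexer API.

    # Sketch: use the pipeline order to find the earliest stage that
    # did not succeed. The job records are hypothetical examples,
    # not real output from the Hub.
    PIPELINE_ORDER = [
        "File Upload",            # or "Integration Job"
        "Dataset Load Job",
        "Dataset health checks",
        "Dataflow Job",
        "Build Index",
    ]

    def first_problem(jobs):
        """Return the earliest stage whose job did not succeed, else None."""
        status_by_type = {job["job_type"]: job["status"] for job in jobs}
        for stage in PIPELINE_ORDER:
            if status_by_type.get(stage) != "Success":
                return stage
        return None

    # Example: the dataflow failed, so every later stage is suspect.
    jobs = [
        {"job_type": "File Upload", "status": "Success"},
        {"job_type": "Dataset Load Job", "status": "Success"},
        {"job_type": "Dataset health checks", "status": "Success"},
        {"job_type": "Dataflow Job", "status": "Failed"},
    ]
    print(first_problem(jobs))  # -> Dataflow Job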

Understanding Logs components

Let’s start by navigating to Logs: go to Manage > Logs in your Hub.

There are 6 components shown on the main Logs page, and 4 filters at the top of the page to help you find what you need.

1. Job type

The specific type of job. A description of job types can be found below. 

  • History Activation: Exports event data out of a dataset for use in outbound integrations, e.g. Facebook CAPI and Google Conversions.
  • Build Index: Creates the client Elasticsearch indexes. These indexes are what the Hub uses to perform segmentation, run analytical queries on the data, and power Activate.
  • Dataflow Job: Combines the output from datasets, surveys, activations, and feature tables, and generates an Elasticsearch index. Dataflows are the mechanism that provides a single customer view and standard attributes.
  • Integration Job: Allows Lexer to retrieve data from integration partners, e.g. Shopify and Klaviyo.
  • Dataset Load Job: Loads data into Datasets.
  • File Upload: Securely uploads customer data to Lexer.
  • Activations (to be added): Sends segments to paid and owned channels, e.g. Klaviyo and Dot Digital.
  • Delete Dataset Job: Deletes the dataset and/or the data inside a dataset.

2. Name 

The name of the job. 

3. Status

The current status of the job. There are 4 potential statuses:

  • Success: The job finished successfully.
  • Failed: There was an issue and the job did not successfully finish.
  • Pending: A job has been created but hasn’t started yet.
  • In Progress: The job is currently running.
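
If you’re watching jobs over time, note that only Success and Failed are final; Pending and In Progress jobs will still change state. A minimal sketch of that idea (the class is hypothetical, not part of any Lexer tooling):

    from enum import Enum

    class JobStatus(Enum):
        """The 4 statuses a job can show in Logs."""
        SUCCESS = "Success"
        FAILED = "Failed"
        PENDING = "Pending"
        IN_PROGRESS = "In Progress"

        def is_terminal(self):
            # Success and Failed are final; the other two will change.
            return self in (JobStatus.SUCCESS, JobStatus.FAILED)

    print(JobStatus("In Progress").is_terminal())  # -> False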



4. Source

The source indicates where the job originated from and is inferred from the data that’s received. 

5. Created at

The time the job was created.

6. Duration

The total time taken for the job to complete. 

Navigating and filtering Logs

There are a few options to help you refine your search to find the correct logs. 

You can start by selecting which column dictates the order. To do this, click on any column header to sort in ascending or descending order, indicated by a corresponding arrow icon.

The filters located at the top of the page enable you to further refine your search based on the components outlined in the previous section. When adding filtering options, you will see a count in brackets showing the number of selections within that filter. 

Last 7 Days - A date picker for the time period you’d like to search, which defaults to the last 7 days.

Job Type - Select as many job types as you like from the drop-down menu.

Status - Select the statuses you are interested in seeing in your search.

Dataset - Refers to the specific dataset that relates to the job. The name of the dataset will contain the dataset ID and dataset name, i.e. [.code]datasetID - datasetname[.code].

Once you’ve made your selections in the filters, click the Apply button on the right side of the page. The results count will adjust to show the total number of jobs found in your search.
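
To make the filter logic concrete, here’s a minimal sketch assuming the four filters combine with AND, i.e. a job appears in the results only if it matches every filter you’ve set. The records and field names are hypothetical illustrations, not how the Hub is actually implemented.

    from datetime import datetime

    # Hypothetical job records, for illustration only.
    all_jobs = [
        {"job_type": "Dataflow Job", "status": "Failed",
         "created_at": datetime(2023, 9, 1), "dataset": "1234 - orders"},
        {"job_type": "File Upload", "status": "Success",
         "created_at": datetime(2023, 9, 2), "dataset": "1234 - orders"},
    ]

    def matches(job, since, job_types=None, statuses=None, dataset=None):
        """Assumption: a job must pass every filter that is set."""
        if job["created_at"] < since:
            return False
        if job_types and job["job_type"] not in job_types:
            return False
        if statuses and job["status"] not in statuses:
            return False
        if dataset and job["dataset"] != dataset:
            return False
        return True

    # Example: failed dataflows since 10 August 2023.
    since = datetime(2023, 8, 10)
    failed = [j for j in all_jobs
              if matches(j, since, {"Dataflow Job"}, {"Failed"})]
    print(failed)  # -> only the first record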

Health checking your data

The best way to make sure your data is accurate is to conduct regular health checks. Within Logs you can see all of your load jobs, covering the ingestion of data from start to finish, so you can determine how your CDP is tracking and whether everything has been ingested properly. Datasets is a bit more granular, allowing you to check the health of a specific dataset. For more information about Datasets, check out our Learn articles here.

Finding a job or group of jobs

If you are attempting to find a particular job, use a combination of filters to narrow down the scope. For example, you might want to check on dataflows in the last 30 days.

  1. Add a 30 day time frame to the date picker by clicking on “Last 7 Days”. 
  2. Select dataflows in the “Job type” field. 
  3. Click Apply on the right to initiate the search. 

You’ll now be able to see all dataflows in the last 30 days. You can narrow this down even further by adding a status filter if you are trying to identify any failed or pending dataflows. 

Failed jobs
Scheduled jobs can fail from time to time and correct themselves; this can happen for a number of reasons. If you have any concerns, please contact us and we can look into it.

More detail with the job view

If you are looking for additional information about a job, click on it. This will open up the job view.

The job view adjusts based on the type of job, as some details are not universal.

Each view will contain:

  • Status
  • Created at
  • Updated at
  • Job Type
  • An ID (format will vary depending on job type)
  • Name
  • Duration (if relevant)
  • Provider (integrations only)
  • Pipeline type (integrations only)

This additional information can be really helpful if you are worried some data is yet to arrive in the Hub. For instance, you can check the pipeline type to determine whether an integration is scheduled.
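
To summarize the list above, here’s one hypothetical way a job view record could be modelled, with optional fields for the details that only appear on certain job types. This is an illustration, not Lexer’s actual data model.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class JobView:
        # Fields shown for every job.
        status: str                          # e.g. "Success"
        created_at: str
        updated_at: str
        job_type: str
        job_id: str                          # format varies by job type
        name: str
        # Fields that only appear for some job types (hypothetical modelling).
        duration: Optional[str] = None       # if relevant
        provider: Optional[str] = None       # integrations only
        pipeline_type: Optional[str] = None  # integrations only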

Need some help?

If something looks a bit off and you need some help, you can copy the exact job you are having an issue with and send it to your Success Manager or Support. To share, open the job view, copy the URL, and send it through. It’s that simple! This provides us with more information about the search query you used and the job that needs attention, and ultimately speeds up the investigation significantly.

That’s a wrap!

In this article we covered how to assess the status of your data load and processing jobs, making sure everything is working as it should, and how to use Logs to troubleshoot data issues and speed up resolutions with the dedicated support of the Lexer team. Logs is here to add that extra oomph to your data game! 🚀

Updated:
September 6, 2023