Validate data in the datalake
Last update: February 14, 2025
Topics: Queries
Created for: Intermediate, Developer
Learn how to validate that data has been successfully ingested into the data lake using Adobe Experience Platform’s Query Service. For detailed product documentation, see the Query Editor UI guide.
Transcript
Hi everyone. Today, we are going to discuss how to validate data in the data lake. The first thing is to request the schema name, dataset name, and dataset ID for the dataset that ingested the data for the profile. Keep these details handy.

The second thing is, why do we need dataset and schema information for Query Service? Firstly, we need the table name to look up when building the query. This can be found on the dataset overview page. Now log in to your AEP profile. AEP stands for Adobe Experience Platform. On the dashboard, in the left panel, scroll down and click the tab named Datasets. There you will find a Browse option. Click it and search for your desired dataset. Now click the dataset. In the right panel, you can easily locate the table name. There’s also an option to copy it for building the query.

Secondly, we can preview the last successful batch ingestion, which has a default limit of a hundred rows, to see which XDM field holds the data that we want to look up via Query Service. To preview the dataset, find the Preview dataset option in the top right corner. Click it to see the XDM field that’s holding the ECID value. Now you can view the XDM schema, and when scrolling down in the right panel, we can locate the XDM field path that will be used in the SQL statement.

Now let’s try to build an SQL statement using XDM paths. So why do we need to run the query? It is because queries in AEP run on ADLS data. So if we need to confirm that the data is in ADLS, we need to query it in the Queries tab. If records are returned, it proves that the data exists in ADLS. If not, ADLS does not have the data. Go back to your AEP profile and navigate to the Queries tab in the left panel. Then, in the top right corner, you can see an option to create a query. Click it and write down your SQL statement.
The statement is: select aapsupport.identification.ecid as ECID, from Gupta event dataset for website, limit 10. You can see the desired results over here.

Lastly, let’s move on to the explanation of the SQL statement. The first line, select aapsupport.identification.ecid as ECID, selects the XDM field that holds the ECID values and places them into a column named ECID using the AS command. The AS command is used to rename a column of a table with an alias. An alias only exists for the duration of the query. The second line, from Gupta event dataset for website, points to the dataset table in question. The last line, limit 10, limits the output to 10 rows.

Please contact the AEP support team for any further assistance. Hope this was helpful. Thank you.
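
The three-line statement described in the transcript can be sketched as a small Python helper that assembles the SQL text before it is pasted into the query editor. This is only an illustration under stated assumptions: the function names `build_validation_query` and `data_exists` are hypothetical, the table name `example_event_dataset_for_website` is a placeholder for the table name copied from the dataset overview page, and the XDM path is written as it is heard in the recording, so the exact path should be taken from the dataset preview.

```python
import re

# Plain SQL identifiers only; anything else is rejected rather than quoted.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_validation_query(table_name, xdm_path, alias, limit=10):
    """Return a SELECT that samples one XDM field from a dataset table."""
    # Check the table name, the alias, and every segment of the dotted XDM path.
    for part in [table_name, alias, *xdm_path.split(".")]:
        if not _IDENTIFIER.match(part):
            raise ValueError(f"unexpected identifier: {part!r}")
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        f"SELECT {xdm_path} AS {alias}\n"  # field to read, renamed with AS
        f"FROM {table_name}\n"             # dataset table to read from
        f"LIMIT {limit}"                   # cap on the number of rows returned
    )


def data_exists(rows):
    """Records returned means the data is in the data lake; none means it is not."""
    return len(rows) > 0


query = build_validation_query(
    table_name="example_event_dataset_for_website",  # hypothetical placeholder
    xdm_path="aapsupport.identification.ecid",       # as heard in the transcript
    alias="ECID",
)
print(query)
```

The printed text can be pasted into the query editor as is. `data_exists` simply restates the rule from the transcript: if the query returns records, the data is in the data lake, and if it returns none, it is not.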