Prepare data
Last update: February 14, 2025
- Topics:
- Queries
CREATED FOR:
- Beginner
- Developer
Learn how to clean, prepare, and combine data from multiple datasets to create a new dataset using CTAS (Create Table AS) and Spark SQL functions for reporting and dashboarding. For more information, please visit the Query Service documentation.
Transcript
Hi there. Adobe Experience Platform ingests data from a wide variety of sources. A major challenge for marketers is making sense of this data, to gain insights about their customers. Adobe Experience Platform’s Query Service facilitates that by allowing you to use standard SQL to query data in platform, using an interface and API. In this video let me show you how data engineers can query existing data sets and derive data, that can be used for reporting and dash boarding purposes. From your Experience Platform homepage navigate to “Data Sets” under “Data Management”. For this video, we are using a fictional retail brand called Luma. Luma loyalty data set contains information about customers, loyalty details, geographic details, et cetera. Luma web data contains web traffic information about customers’ interaction on the Luma site, including products viewed by a customer, visited pages, products purchased, et cetera. Open the Luma web data set and the last successful batch run ingested around 87 key records. These records are sample data created for demo purposes and capture user interaction with the Luma site. Let’s scroll down to the bottom of the source file to ensure that the total number of records ingested into Experience Platform, and the number of records within the source files, match. Let’s select the web interactions for a single user and check out some of the operations performed by a user, and what information is captured. Let’s take a closer look at the order placed by our customer. In a single order, the customer purchased two items stored in a product list item array. Each array object contains information about the product, such as SKU, total price, and description. Now let’s switch back to the platform UI and navigate to queries. Let’s click on “Create a New Query” option. Let’s take a use case where a marketer needs to create a report about orders placed by a loyalty customer, and it should include customer details and product details. So clearly, we need to get information from two different data sets, Luma loyalty data and Luma web data, to create this report. As a data engineer, you can write a query to combine the data sets, by performing a “Join SQL” operation and then filtering the result to display necessary details. Remember when one customer ordered multiple products in a single purchase, and product details were stored as a product list item array. When your query result contains an array object column, and you want to separate elements of an array into multiple roles, you can use the “Explode Spark” SQL function. Adobe Experience Platform Query Service, provides several built-in Spark SQL functions to extend SQL functionality. This document contains the Spark SQL functions that Query Service currently supports. Also note that we have a “CTAS” statement in our query, which means a query result gets stored as a data set in Experience Platform. Let’s run the query and open successful query execution and you can obtain the data set ID for the new data set under the results tab. Let’s navigate to the query window and check the logs. Click on the data set created, and it should open the order summary data set, created by the “CTAS” command, and the last batch ingestion, inserted a few records. Let’s preview the data set to make sure that the data set contains data required by the marketer. This data set contains order transaction details of loyalty members and products purchased by them. The marketer can now use this data set to create a report or create a dashboard using this data. Using Query Service in Adobe Experience Platform, data engineers can quickly explore, and validate data ingested to data sets, derive insights from the data, and prepare it for various purposes. -
Previous pageExplore Data
Next pageAdobe Defined Functions
Experience Platform
- Platform Tutorials
- Introduction to Platform
- A customer experience powered by Experience Platform
- Behind the scenes: A customer experience powered by Experience Platform
- Experience Platform overview
- Key capabilities
- Platform-based applications
- Integrations with Experience Cloud applications
- Key use cases
- Basic architecture
- User interface
- Roles and project phases
- Introduction to Real-Time CDP
- Getting started: Data Architects and Data Engineers
- Authenticate to Experience Platform APIs
- Import sample data to Experience Platform
- Administration
- AI Assistant
- Audiences and Segmentation
- Introduction to Audience Portal and Composition
- Upload audiences
- Overview of Federated Audience Composition
- Connect and configure Federated Audience Composition
- Create a Federated Audience Composition
- Audience rule builder overview
- Create audiences
- Use time constraints
- Create content-based audiences
- Create conversion audiences
- Create audiences from existing audiences
- Create sequential audiences
- Create dynamic audiences
- Create multi-entity audiences
- Create and activate account audiences (B2B)
- Demo of streaming segmentation
- Evaluate batch audiences on demand
- Evaluate an audience rule
- Create a dataset to export data
- Segment Match connection setup
- Segment Match data governance
- Segment Match configuration flow
- Segment Match pre-share insights
- Segment Match receiving data
- Audit logs
- Data Collection
- Collaboration
- Dashboards
- Data Governance
- Data Hygiene
- Data Ingestion
- Overview
- Batch ingestion overview
- Create and populate a dataset
- Delete datasets and batches
- Map a CSV file to XDM
- Sources overview
- Ingest data from Adobe Analytics
- Ingest data from Audience Manager
- Ingest data from cloud storage
- Ingest data from CRM
- Ingest data from databases
- Streaming ingestion overview
- Stream data with HTTP API
- Stream data using Source Connectors
- Web SDK tutorials
- Mobile SDK tutorials
- Data Lifecycle
- Destinations
- Destinations overview
- Connect to destinations
- Create destinations and activate data
- Activate profiles and audiences to a destination
- Export datasets using a cloud storage destination
- Integrate with Google Customer Match
- Configure the Azure Blob destination
- Configure the Marketo destination
- Configure file-based cloud storage or email marketing destinations
- Configure a social destination
- Activate through LiveRamp destinations
- Adobe Target and Custom Personalization
- Activate data to non-Adobe applications webinar
- Identities
- Intelligent Services
- Monitoring
- Partner data support
- Profiles
- Understanding Real-Time Customer Profile
- Profile overview diagram
- Bring data into Profile
- Customize profile view details
- View account profiles
- Create merge policies
- Union schemas overview
- Create a computed attribute
- Pseudonymous profile expirations (TTL)
- Delete profiles
- Update a specific attribute using upsert
- Privacy and Security
- Introduction to Privacy Service
- Identity data in Privacy requests
- Privacy JavaScript library
- Privacy labels in Adobe Analytics
- Getting started with the Privacy Service API
- Privacy Service UI
- Privacy Service API
- Subscribe to Privacy Events
- Set up customer-managed keys
- 10 considerations for Responsible Customer Data Management
- Elevating the Marketer’s Role as a Data Steward
- Queries
- Overview
- Query Service UI
- Query Service API
- Explore Data
- Prepare Data
- Adobe Defined Functions
- Data usage patterns
- Run queries
- Generate datasets from query results
- Tableau
- Analyze and visualize data
- Build dashboards using BI tools
- Recharge your customer data
- Connect clients to Query Service
- Validate data in the datalake
- Schemas
- Overview
- Building blocks
- Plan your data model
- Convert your data model to XDM
- Create schemas
- Create schemas for B2B data
- Create classes
- Create field groups
- Create data types
- Configure relationships between schemas
- Use enumerated fields and suggested values
- Copy schemas between sandboxes
- Update schemas
- Create an ad hoc schema
- Sources
- Use Case Playbooks
- Experience Cloud Integrations
- Industry Trends