Delete datasets and batches
Last update: February 21, 2025
Topics: Data Hygiene, Datasets
Created for: Intermediate, Developer
Learn how to delete datasets and batches in Adobe Experience Platform. If a dataset needs to be removed from the system for any reason, such as cleaning up test datasets in lower environments or datasets that were added in error, you can simply delete that dataset and remove its contents from the data lake, identity graph, and profile store. Individual batches can be deleted from the data lake, but not from the identity graph and profile store.

Transcript
In this video, we’ll show you how to delete a dataset in Experience Platform. When you ingest data into Platform, that data is stored in a dataset in Platform’s data lake. If the dataset has been enabled for Profile, the data is used to build identity graphs and populate your customer profiles with attributes and events. If a dataset needs to be removed from the system for any reason, such as cleaning up test datasets in lower environments or datasets that were added in error, you can simply delete that dataset and remove its contents from the data lake, identity graph, and profile store.
There are a few methods you can use to delete Platform datasets. To delete a dataset manually, you can use the Platform interface or make calls to the Catalog API. This will be the method we’ll focus on in this video. If your organization has access to Healthcare Shield or Privacy and Security Shield, you can also configure your Platform datasets to automatically expire on a scheduled future date. Check out our video on automated expirations for more details. Keep in mind that while any dataset you create can be deleted using these methods, system datasets that are auto-created by Adobe applications like Analytics, Audience Manager, or Offer Decisioning cannot be deleted.
All that being said, though, why would you want to delete a dataset in the first place? Well, while every organization’s data requirements are different, here are a few common use cases. First, deleting datasets helps enforce data minimization principles, since it naturally reduces the amount of data in the system. If a dataset hasn’t been used for anything for an extended period, it may need to be deleted. Second, deleting datasets should be a common practice when removing old test data from lower-level development sandboxes when engineers develop and migrate new features. This keeps your test environments clean without the need to do a full sandbox reset and reconfigure everything from scratch.
Finally, in rare cases, it may be necessary to make breaking changes to your data model and associated dataflows to account for oversights in their initial design. While careful schema planning should make this an avoidable situation in most cases, under extreme circumstances you can delete the offending schema and its associated datasets and attempt to re-ingest the data under a new or modified schema.
Now, if your data issues simply stem from a bad batch or a few erroneous records, you won’t need to go as far as deleting the whole dataset. In the case of a bad batch, you can simply delete that batch from the dataset itself, which we’ll show you in this video. Keep in mind that, unlike deleting datasets, deleting batches only removes data from the data lake. It doesn’t remove data from the identity graph or profile store. If there are some individual profiles with incorrect attributes that you want to correct, you can do that directly by using the upsert method through Platform APIs. Refer to our corresponding video guide for more info.
Okay. Now that that’s all out of the way, let’s jump into the Platform interface and walk through the process of deleting a dataset and seeing how that can affect a customer profile. Let’s start with our dataset first. Start by selecting Datasets in the left navigation, and here you’ll see a list of all datasets available for your organization and which ones are participating in Real-Time Customer Profile. Using the search feature, I can narrow down the list to this loyalty dataset, which captures some loyalty program information about our customers. Our engineers only added this dataset recently, and we’re still troubleshooting how it behaves in Real-Time Customer Profile. So we need to delete this dataset before they add an updated version. We could just go ahead and delete the dataset from here.
But before we do that, let’s quickly head over to Profiles in the left nav and look up a sample profile to see how this will affect our data downstream. Since I know the email ID of a particular profile, I’ll use it here to pull it up from the list and click into it to see its details. And here I can see a list of attributes derived from the loyalty dataset from earlier. While we’re just looking at a single profile here, deleting the dataset will affect all profiles that use it as a source.
Clicking into the customer’s identity graph, we can see the various identities we have for this profile and how they’re linked to each other. One of these is the customer’s loyalty ID, and when we select it, we can see which datasets this identity link was inferred from. See, in this case, we know that loyalty IDs were not defined correctly in our test schema, and this actually isn’t the right value for this customer. So I’ll click this node, and in the right rail I can see our Luma Loyalty dataset listed as the source. This is the data that I want to delete, so I’ll click into it from here.
Once we’re in the details view for the dataset, you can see a history of previously ingested batches. If this happened to be an ingestion issue with one of the batches and not the dataset itself, we could click into the batch from here and then use the Delete batch control to remove it from the system cleanly. In this case, though, we want to delete the whole dataset, so we’ll go back to the dataset view, and from here we’ll select More in the top right and then select Delete. We’ll confirm our choice in the dialog, and that’s it.
We can check back on the sample profile we looked up earlier, and now we can see that the loyalty details it had earlier are gone, since those attributes were sourced from the dataset we just deleted. The effects are immediate, down to the individual profile level. Clicking into the identity graph for this profile, we can see that deleting the dataset also removed the node for the loyalty ID tied to this customer’s profile. As a result, any future ingested events or records containing that loyalty ID will not be stitched to this profile unless also paired with one or more of the remaining identities in the graph.
As you can see, it’s really easy to delete datasets in the interface, but keep in mind that Experience Platform uses an API-first architecture, meaning that any action you can do in the interface can also be done using calls to Platform’s open APIs. When it comes to deleting datasets, you can use the /dataSets endpoint in the Catalog Service API. Simply use the DELETE method and include the dataset’s ID in the path.
Now you know how to delete a dataset in Experience Platform. We hope this functionality will help you ensure that you’re not spending resources on storing any information that you no longer need. Thanks for watching.
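The API call described in the transcript (a DELETE request to the /dataSets endpoint with the dataset’s ID in the path) might look roughly like the following Python sketch. The video does not show the base URL or the header names, so these are assumptions taken from Adobe’s public API conventions and should be checked against the Catalog Service API reference. The function name and all argument values are hypothetical placeholders. The sketch only builds the request and never sends it.

```python
import urllib.request
from urllib.parse import quote

# Assumption: base URL of the Catalog Service API (not stated in the video).
CATALOG_BASE_URL = "https://platform.adobe.io/data/foundation/catalog"


def build_delete_dataset_request(dataset_id, access_token, api_key, org_id, sandbox_name):
    """Build (but do not send) a DELETE request for a single dataset.

    Hypothetical helper for illustration; the header names below are
    assumptions to verify against the Catalog Service API reference.
    """
    if not dataset_id:
        raise ValueError("dataset_id must be a non-empty string")

    # The dataset ID goes in the path, as described in the transcript.
    url = f"{CATALOG_BASE_URL}/dataSets/{quote(dataset_id, safe='')}"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "x-api-key": api_key,
        "x-gw-ims-org-id": org_id,
        "x-sandbox-name": sandbox_name,
    }
    return urllib.request.Request(url, headers=headers, method="DELETE")


if __name__ == "__main__":
    # Placeholder values only; nothing is sent over the network here.
    request = build_delete_dataset_request(
        dataset_id="EXAMPLE_DATASET_ID",
        access_token="EXAMPLE_ACCESS_TOKEN",
        api_key="EXAMPLE_API_KEY",
        org_id="EXAMPLE_ORG_ID",
        sandbox_name="EXAMPLE_SANDBOX",
    )
    print(request.get_method(), request.full_url)
```

To actually delete the dataset, you would pass the request to `urllib.request.urlopen(request)` with real credentials. Because deletion also removes the dataset’s contents from the data lake, identity graph, and profile store, it is worth confirming the dataset ID and target sandbox before sending.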