What is the difference between Query Service and Data Distiller?

Query Service: Used for SQL queries focused on data exploration, validation, and experimentation. Outputs are not stored in the data lake, and execution time is limited to 10 minutes. Ad hoc queries are suited for lightweight, interactive data checks and analyses.

Data Distiller: Enables batch queries that process, clean, and enrich data, with results stored back in the data lake. These queries support longer execution (up to 24 hours) and additional features like scheduling, monitoring, and accelerated reporting. Data Distiller is ideal for in-depth data manipulation and scheduled data processing tasks.

See the Query Service packaging document for more detailed information.

Question categories

The answers to frequently asked questions are divided into the following categories:

General Query Service questions

This section includes information on performance, limits, and processes.

Can I turn off the auto-complete feature in the Query Service Editor?

Answer
No. Turning off the auto-complete feature is not currently supported by the editor.

Why does the Query Editor sometimes become slow when I type in a query?

Answer
One potential cause is the auto-complete feature. The feature processes certain metadata commands that can occasionally slow the editor during query editing.

Can I use Postman for the Query Service API?

Answer
Yes, you can visualize and interact with all Adobe API services using Postman (a free, third-party application). Watch the Postman setup guide for step-by-step instructions on how to set up a project in Adobe Developer Console and acquire all the necessary credentials for use with Postman. See the official documentation for guidance on starting, running, and sharing Postman collections.
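For readers who prefer a script to Postman, the same kind of request can be sketched in Python. This is a minimal sketch, not official client code: the base URL, the `/queries` path, and the header names are assumptions that should be checked against the Query Service API reference, and the credential values are placeholders. The request is only constructed here, not sent.

```python
import urllib.request

# Assumed base URL for the Query Service API; verify it in the API reference.
BASE_URL = "https://platform.adobe.io/data/foundation/query"


def build_list_queries_request(access_token, api_key, org_id, sandbox_name):
    """Build (but do not send) a GET request that would list queries."""
    headers = {
        # Header names are assumptions; values come from Adobe Developer Console.
        "Authorization": f"Bearer {access_token}",
        "x-api-key": api_key,
        "x-gr-org-id": org_id,
        "x-sandbox-name": sandbox_name,
        "Accept": "application/json",
    }
    return urllib.request.Request(f"{BASE_URL}/queries", headers=headers, method="GET")


# Placeholder credentials; replace them with real values before sending.
request = build_list_queries_request(
    access_token="{ACCESS_TOKEN}",
    api_key="{API_KEY}",
    org_id="{ORG_ID}",
    sandbox_name="prod",
)
print(request.get_method(), request.full_url)
```

Passing the finished object to `urllib.request.urlopen(request)` would send it. Building the request separately makes it easy to inspect the URL and headers first, much as you would in a Postman collection.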

Is there a limit to the maximum number of rows returned from a query through the UI?

Answer
Yes, Query Service internally applies a limit of 50,000 rows unless an explicit limit is specified externally. See the guidance on interactive query execution for more details.

Can I use queries to update rows?

Answer
In batch queries, updating a row inside the dataset is not supported.

Is there a data size limit for the resulting output from a query?

Answer
No. There is no limit on data size, but interactive sessions have a query timeout limit of 10 minutes. If the query is executed as a batch CTAS, the 10-minute timeout does not apply. See the guidance on interactive query execution for more details.

How do I stop my queries from timing out in 10 minutes?

Answer

One or more of the following solutions are recommended in case of queries timing out.

Is there any issue or impact on Query Service performance if multiple queries run simultaneously?

Answer
No. Query Service has an autoscaling capability that ensures concurrent queries do not have any noticeable impact on the performance of the service.

Can I use reserved keywords as a column name?

Answer
Certain reserved keywords, such as ORDER, GROUP BY, WHERE, and DISTINCT, cannot be used directly as column names. To use one of these keywords as a column name, you must escape it.
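As an illustration of escaping, the sketch below uses Python's built-in `sqlite3` module, not Query Service, to show a reserved keyword working as a column name once it is quoted. SQLite accepts standard double-quoted identifiers. The escape character that Query Service expects may differ, so treat the quoting style and the table name `purchases` as assumptions.

```python
import sqlite3

# In-memory SQLite database used only to illustrate identifier escaping.
conn = sqlite3.connect(":memory:")

# "order" and "group" are reserved keywords, so they are double-quoted.
conn.execute('CREATE TABLE purchases ("order" INTEGER, "group" TEXT)')
conn.execute('INSERT INTO purchases ("order", "group") VALUES (?, ?)', (1, "retail"))

# Unescaped, the keyword is parsed as SQL syntax and the statement fails.
try:
    conn.execute("SELECT order FROM purchases")
    unescaped_ok = True
except sqlite3.OperationalError:
    unescaped_ok = False

# Escaped, the same name is treated as a column.
rows = conn.execute('SELECT "order", "group" FROM purchases').fetchall()
print(unescaped_ok, rows)
```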

How do I find a column name from a hierarchical dataset?

Answer

The following steps describe how to display a tabular view of a dataset through the UI, including all nested fields and columns in a flattened form.

  • After logging into Experience Platform, select Datasets in the left navigation of the UI to navigate to the Datasets dashboard.
  • The datasets Browse tab opens. You can use the search bar to refine the available options. Select a dataset from the list displayed.

The Datasets dashboard in the Experience Platform UI with the search bar and a dataset highlighted.

  • The Datasets activity screen appears. Select Preview dataset to open a dialog of the XDM schema and tabular view of flattened data from the selected dataset. More details can be found in the preview a dataset documentation.

The Dataset activity tab of the Datasets dashboard with Preview dataset highlighted.

  • Select any field from the schema to display its contents in a flattened column. The name of the column is displayed above its contents on the right side of the page. You should copy this name to use for querying this dataset.

The XDM schema and tabular view of the flattened data. The column name of a nested dataset is highlighted in the UI.

See the documentation for full guidance on how to work with nested data structures using the Query Editor or a third-party client.
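To make the idea of a flattened column name concrete, here is a small Python sketch that walks a nested record and returns dot-separated paths, similar in spirit to the names shown in the dataset preview. The sample record, its field names, and the `flatten_columns` helper are hypothetical, and the dot-separated form is an assumption about how nested fields are addressed. Copy the actual name from the UI when writing a query.

```python
def flatten_columns(record, prefix=""):
    """Return dot-separated paths for every leaf field in a nested dict."""
    paths = []
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the parent path along.
            paths.extend(flatten_columns(value, path))
        else:
            paths.append(path)
    return paths


# Hypothetical nested record used only for illustration.
sample = {
    "timestamp": "2024-01-01T00:00:00Z",
    "web": {"webPageDetails": {"name": "home", "pageViews": {"value": 1}}},
}
columns = flatten_columns(sample)
print(columns)
```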

How do I speed up a query on a dataset that contains arrays?

Answer
To improve the performance of queries on datasets containing arrays, you should explode the array as a CTAS query at runtime, and then explore the result for further opportunities to improve its processing time.
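The sketch below illustrates what exploding an array does at the row level, using plain Python rather than Query Service. The `EXPLODE_CTAS` string shows roughly what such a CTAS statement might look like. The `explode` call, the table names, and the column names in it are assumptions for illustration, so check the SQL reference before running anything like it.

```python
# Hypothetical CTAS statement; table and column names are placeholders, and
# the explode() call is an assumption about the available SQL functions.
EXPLODE_CTAS = """
CREATE TABLE orders_exploded AS
SELECT order_id, explode(items) AS item
FROM orders
"""


def explode_rows(rows, array_field, alias):
    """Emit one output row per array element, copying the other fields."""
    exploded = []
    for row in rows:
        base = {k: v for k, v in row.items() if k != array_field}
        for element in row[array_field]:
            exploded.append({**base, alias: element})
    return exploded


# Hypothetical input rows, each holding an array of items.
orders = [
    {"order_id": 1, "items": ["pen", "ink"]},
    {"order_id": 2, "items": ["paper"]},
]
flat = explode_rows(orders, "items", "item")
print(flat)
```

Once the array is flattened into its own dataset, later queries can filter and aggregate on `item` directly instead of unpacking the array each time.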

Why is my CTAS query still processing after many hours for only a small number of rows?

Answer

If the query has taken a long time on a very small dataset, please contact customer support.

A query can become stuck while processing for any number of reasons. Determining the exact cause requires in-depth analysis on a case-by-case basis. Contact Adobe customer support to begin this process.

How do I contact Adobe customer support?

Answer

A complete list of Adobe customer support telephone numbers is available on the Adobe help page. Alternatively, help can be found online by completing the following steps:

  • Navigate to https://www.adobe.com/ in your web browser.
  • On the right side of the top navigation bar, select Sign In.

The Adobe website with Sign in highlighted.

  • Sign in with the Adobe ID and password registered with your Adobe license.
  • Select Help & Support from the top navigation bar.

The top navigation bar dropdown menu with Help and support, Enterprise support and Contact us highlighted.

A dropdown banner appears containing a Help and support section. Select Contact us to open the Adobe Customer Care Virtual Assistant, or select Enterprise support for dedicated help for large organizations.