Spark SQL functions
Last update: December 20, 2023
- Topics:
- Queries
CREATED FOR:
- User
- Developer
You can use several built-in Spark SQL functions to extend SQL functionality with Adobe Experience Platform Query Service. This document lists the Spark SQL functions that are supported by Query Service.
For more detailed information about the functions, including their syntax, usage, and examples, read the Spark SQL function documentation.
Not all functions in the external documentation are supported.
Math and statistical operators and functions
Operator/Function | Description |
---|---|
% | Returns the remainder of the two numbers |
* | Multiplies the two numbers |
+ | Adds the two numbers |
- | Subtracts the two numbers |
/ | Divides the two numbers |
abs | Returns the absolute value of the input |
acos | Returns the inverse cosine value |
approx_count_distinct | Returns the estimated cardinality by HyperLogLog++ |
approx_percentile | Returns the approximate percentile value at the given percentage |
asin | Returns the inverse sine value |
atan | Returns the inverse tangent value |
atan2 | Returns the angle between the positive x-axis plane and the points given by the coordinates |
avg | Returns the average value |
cbrt | Returns the cube root |
ceil or ceiling | Returns the smallest integer not larger than the inputted value |
conv | Convert from one base to another |
corr | Returns the Pearson coefficient between the numbers |
cos | Returns the cosine value |
cosh | Returns the hyperbolic cosine value |
cot | Returns the cotangent value |
dense_rank | Returns the rank of a value in a group of values |
e | Returns Euler’s number |
exp | Returns e to the power of the value |
expm1 | Returns e to the power of the value minus 1 |
factorial | Returns the factorial of the value |
floor | Returns the largest integer not smaller than the value |
greatest | Returns the largest value of all the parameters |
hypot | Returns the hypotenuse of the two values given |
kurtosis | Returns the kurtosis value from the group |
least | Returns the smallest value of all the parameters |
ln | Returns the natural logarithm of the value |
log | Returns the logarithm of the value |
log10 | Returns the logarithm, in base 10, of the value |
log1p | Returns the logarithm of the value plus 1 |
log2 | Returns the logarithm, in base 2, of the value |
max | Returns the maximum value of the expression |
mean | Returns the mean calculated from the values |
min | Returns the minimum value of the expression |
monotonically_increasing_id | Returns monotonically increasing IDs |
negative | Returns the negated value |
percent_rank | Returns the percentage ranking of a value |
percentile | Returns the exact percentile at a given percentage |
percentile_approx | Returns the approximate percentile at a given percentage |
pi | Returns pi |
pmod | Returns the positive modulo between two values |
positive | Returns the positive value |
pow , power | Returns the first value to the power of the second value |
radians | Converts the value to radians |
rand | Returns a random number from 0 through 1 |
randn | Returns a random value |
rint | Returns the closest double value |
round | Returns the closest rounded value |
sign , signum | Returns the number’s sign |
sin | Returns sine of the value |
sinh | Returns hyperbolic sine of the value |
sqrt | Returns the square root of the value |
stddev | Returns the standard deviation of the value |
sttdev_pop | Returns the population standard deviation of the value |
stddev_samp | Returns the sample standard deviation of the value |
sum | Returns the sum of the values |
tan | Returns tangent of the value |
tanh | Returns hyperbolic tangent of the value |
var_pop | Returns the calculated population variance |
var_samp , variance | Returns the calculated sample variance |
Logical operators and functions
Operator/Function | Description |
---|---|
! or not | Logical not |
< | Less than |
<= | Less than or equal to |
= | Equal to |
> | Greater than |
>= | Greater than or equal to |
^ | Bitwise exclusive or |
| | Bitwise or |
~ | Bitwise not |
arrays_overlap | Returns the common elements |
assert_true | Asserts if the expression is true |
if | If the expression evaluates to true, return the second expression. Otherwise, return the third expression. |
ifnull | If the expression is null, it returns the second expression. Otherwise, it returns the first expression. |
in | Returns true if the first expression is in any of the subsequent expressions. |
isnan | Returns true if the value is not a number |
isnotnull | Returns true if the value is not null |
isnull | Returns true if the value is null |
nanvl | Returns the first expression if not a number, returns the second expression otherwise |
or | Logical or |
when | When can be used to create branch conditions for comparison |
xpath_boolean | Returns true if the XPath expression evaluates to true or if a matching node is found |
Date/time functions
Function | Description |
---|---|
add_months | Add months to date |
date_add | Add days to date |
date_format | Modify date format |
date_sub | Subtract days from date |
date_trunc | Returns the date truncated to the specified unit |
datediff | Returns the difference between dates in days |
day , dayofmonth | Returns the day of the month |
dayofweek | Returns the day of week (1-7) |
dayofyear | Returns the day of year |
from_unixtime | Returns date in UNIX® time |
from_utc_timestamp | Returns date in UTC time |
hour | Returns the hour of the input |
last_day | Returns the last day of the month that the date belongs to |
minute | Returns the minute of the input |
month | Returns the month of the input |
months_between | Number of months between |
next_day | Returns the first day later than the input |
quarter | Returns the quarter of the input |
second | Returns the second of the string |
to_date | Converts the string to a date. Note: The string must be in the format yyyy-mm-ddTHH24:MM:SS . |
to_timestamp | Converts the string to a timestamp. Note: The string must be in the format yyyy-mm-ddTHH24:MM:SS . |
to_unix_timestamp | Converts the string to a UNIX® timestamp |
to_utc_timestamp | Converts the string to a UTC timestamp |
trunc | Truncates the date |
unix_timestamp | Returns the UNIX® timestamp |
weekday | Day of the week (0-6) |
weekofyear | Returns the week of the year for a given date |
year | Returns the year of the string |
Arrays
Function
Description
Creates an array with the given elements
Checks if the array contains the value
Removes duplicate values from the array
Returns an array of the elements in the first array, but not the second
Returns the intersection of the two arrays
Joins two arrays together
Returns the maximum value of the array
Returns the minimum value of the array
Returns the 1-based position of the element
Removes all elements that are equal to the element
Creates an array containing the value counted times
Sorts the array
Joins the array together, without any duplicates
Combines the values of given arrays with the values of the original collection at a given index
Return the size of the array
Return the element at position
Separate elements of array into multiple rows, excluding null
Separate elements of array into multiple rows, including null
Returns the 1 based position of array
Flattens an array of arrays
Separate array of structs into a table, excluding null
Separate array of structs into a table, including null
Separate elements of an array into multiple rows with positions, excluding null
Reverse elements of the array
Return a random permutation of the array
Subsets an array
Sort an array, given an order
Merges the two arrays into a single array, before applying a function
Datatype casting functions
Function
Description
Change the data type to bigint
Change the data type to binary
Change the data type to boolean
Change the data type to the specified type
Change the data type to date
Change the data type to decimal
Change the data type to double
Change the data type to float
Change the data type to int
Change the data type to smallint
Create a map from a string
Change the data type to string
Create a struct
Change the data type to tinyint
Conversion and formatting functions
Function
Description
Return the numeric (ASCII) value
Change the argument to a base64 string
Change the argument to a binary value
Return the bit length
Return the string length
Returns the cyclic redundancy check value
Convert radians to degrees
Change the number’s format
Get data from JSON
Return the hash value
Convert the argument to a hexadecimal value
Changes the string to be title case
Pads the left side of a string
Create a map
Create a map from an array
Create a map from an array of structs
Return the md5 value
Pads the right side of a string
Removes trailing spaces
Return the SHA2 value
Return the soundex code
Separate values into rows
Returns a JSON string
Replace values within string
Remove leading and trailing characters
Convert the base64 string to binary
Convert the hexadecimal to binary
Return a UUID
Data evaluation
Function
Description
Return the first non-null argument
Return a list of non-unique elements
Return a set of unique elements
Concatenation
Concatenation with separator
Returns the total count for rows
Decode using a character set
Encode using a character set
Returns the first value
Indicates if a column is grouped
Returns the level of grouping
Returns a 1-based index of character occurrence
Returns a tuple from a JSON input
Returns the last value
Returns the length of the string
Returns the Levenshtein distance between strings
Concatenate a map
Return a map’s keys
Return a map’s values
Divide rows into partitions
Returns null if true
Returns value if null
Returns value if not null
Extracts part of a URL
Computes rank of a value
Extracts something that matches the regex
Replaces something that matches the regex
Returns a string that repeats
Replace all instances of a string
Create a multi-dimensional rollup
Assigns a unique row number
Returns the schema of the JSON
Splits string into an array of words
Generates an array of elements
Signed bitwise shift left
Signed bitwise shift right
Unsigned bitwise shift right
Return the size of the array
Split string
Return index of substring
Window
Parse XML nodes
Parse XML nodes for double
Parse XML nodes for float
Parse XML nodes for integer
Parse XML nodes for long
Parse XML nodes for short integer
Parse XML nodes for string
Current information
Function
Description
Returns current database
Returns current date
Returns current timestamp
Higher-order functions
recommendation-more-help
ccf2b369-4031-483f-af63-a93b5ae5e3fb