WHERE clause
You can filter your dataset in Python with the following comparison and logical functions. Each one corresponds to a familiar operator:
eq() = '='
gt() = '>'
ge() = '>='
lt() = '<'
le() = '<='
And() = logical AND
Or() = logical OR
An example of using these filtering functions can be seen below:
df = dataset_reader.where(experience_ds['timestamp'].gt(87879779797).And(experience_ds['timestamp'].lt(87879779797)).Or(experience_ds['a'].eq(123)))
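To make the chaining in the example above concrete, the sketch below mimics the same call style in plain Python. The Column and Condition classes are hypothetical stand-ins written for illustration, not the SDK's own classes. The sketch also assumes that chained calls group from left to right, so that x.And(y).Or(z) means (x AND y) OR z.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

Row = Dict[str, Any]


@dataclass(frozen=True)
class Condition:
    """Hypothetical filter expression (illustration only, not the SDK class)."""
    test: Callable[[Row], bool]

    def And(self, other: "Condition") -> "Condition":
        # True only when both conditions hold for the row.
        return Condition(lambda row: self.test(row) and other.test(row))

    def Or(self, other: "Condition") -> "Condition":
        # True when at least one of the conditions holds for the row.
        return Condition(lambda row: self.test(row) or other.test(row))


@dataclass(frozen=True)
class Column:
    """Hypothetical column handle exposing the comparison functions listed above."""
    name: str

    def eq(self, value: Any) -> Condition:
        return Condition(lambda row: row[self.name] == value)

    def gt(self, value: Any) -> Condition:
        return Condition(lambda row: row[self.name] > value)

    def ge(self, value: Any) -> Condition:
        return Condition(lambda row: row[self.name] >= value)

    def lt(self, value: Any) -> Condition:
        return Condition(lambda row: row[self.name] < value)

    def le(self, value: Any) -> Condition:
        return Condition(lambda row: row[self.name] <= value)


# Made-up sample rows, used only to exercise the filter.
rows = [
    {"timestamp": 50, "a": 0},
    {"timestamp": 150, "a": 0},
    {"timestamp": 500, "a": 123},
]

ts, a = Column("timestamp"), Column("a")

# (timestamp > 100 AND timestamp < 200) OR a == 123
condition = ts.gt(100).And(ts.lt(200)).Or(a.eq(123))
matched = [row for row in rows if condition.test(row)]
```

If the SDK groups chained calls the same way, the timestamp part of the original example can never match, because it uses the same value for both the lower and the upper bound. Only rows where a equals 123 would be returned. Use two distinct bounds to select a time range.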
ORDER BY clause
The ORDER BY clause sorts the returned results by one or more columns, each in ascending or descending order. This is done with the sort() function.
An example of using the sort() function can be seen below:
df = dataset_reader.sort([('column_1', 'asc'), ('column_2', 'desc')])
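The list passed to sort() presumably acts as a sort priority, with the first tuple as the primary key and later tuples breaking ties. The sketch below reproduces that behavior on plain Python dictionaries. sort_rows is a hypothetical helper and is not part of the SDK. It relies on Python's stable sort, applying the keys from least to most significant.

```python
from typing import Any, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


def sort_rows(rows: Sequence[Row], order: Sequence[Tuple[str, str]]) -> List[Row]:
    """Sort rows by several (column, direction) pairs; direction is 'asc' or 'desc'."""
    result = list(rows)
    # Sort by the least significant key first. Because list.sort() is stable,
    # each later pass preserves the relative order set by the earlier passes.
    for column, direction in reversed(list(order)):
        if direction not in ("asc", "desc"):
            raise ValueError(f"unknown sort direction: {direction!r}")
        result.sort(key=lambda row: row[column], reverse=(direction == "desc"))
    return result


# Made-up sample rows, used only to exercise the helper.
sample = [
    {"column_1": 2, "column_2": "a"},
    {"column_1": 1, "column_2": "a"},
    {"column_1": 1, "column_2": "b"},
]

ordered = sort_rows(sample, [("column_1", "asc"), ("column_2", "desc")])
```

Here the two rows with column_1 equal to 1 come first, and within that group the row with column_2 equal to "b" precedes the one with "a".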
LIMIT clause
The LIMIT clause caps the number of records returned from the dataset.
An example of using the limit() function can be seen below:
df = dataset_reader.limit(100).read()
OFFSET clause
The OFFSET clause skips a given number of rows from the beginning of the results, so that rows are returned starting from a later point. Combined with LIMIT, it can be used to iterate over rows in blocks.
An example of using the offset() function can be seen below:
df = dataset_reader.offset(100).read()
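The block-wise iteration mentioned above can be sketched as a loop that keeps the limit fixed and advances the offset by one block each time. Whether limit() and offset() can be chained on the same reader is not shown here, so the sketch uses a hypothetical read_block helper that slices an in-memory list in place of a real dataset read.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def read_block(records: Sequence[T], limit: int, offset: int) -> List[T]:
    """Hypothetical stand-in for a read with LIMIT and OFFSET applied."""
    return list(records[offset:offset + limit])


def iter_blocks(records: Sequence[T], block_size: int) -> Iterator[List[T]]:
    """Yield consecutive blocks of at most block_size records."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    offset = 0
    while True:
        block = read_block(records, limit=block_size, offset=offset)
        if not block:
            # An empty block means the previous one reached the end.
            break
        yield block
        offset += block_size


# 250 made-up records read in blocks of 100.
blocks = list(iter_blocks(list(range(250)), block_size=100))
```

The last block is shorter than the others whenever the total number of records is not a multiple of the block size. For the blocks to be consistent between reads, the underlying results need a stable order, which is one reason to pair this pattern with a sort.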
Writing a dataset
To write to a dataset, pass a pandas DataFrame to a DatasetWriter created for that dataset.
Writing the pandas DataFrame
client_context = get_client_context(config_properties)
# Fetch the existing dataset by its ID
dataset = Dataset(client_context).get_by_id({DATASET_ID})
dataset_writer = DatasetWriter(client_context, dataset)
write_tracker = dataset_writer.write(<your_dataFrame>, file_format='json')