WHERE clause

You can filter your dataset in Python by using the following comparison and logical functions.

NOTE
The functions used for filtering are case sensitive.
eq() = '='
gt() = '>'
ge() = '>='
lt() = '<'
le() = '<='
And() = logical AND operator
Or() = logical OR operator

An example of using these filtering functions can be seen below:

df = dataset_reader.where(experience_ds['timestamp'].gt(87879779797).And(experience_ds['timestamp'].lt(87879779899)).Or(experience_ds['a'].eq(123)))
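To make the semantics of the chained filter concrete, the sketch below expresses the same kind of condition with plain pandas boolean masks rather than the reader API. The data, the bounds, and the left-to-right grouping, (A And B) Or C, are assumptions made for illustration only; they are not taken from the SDK.

```python
import pandas as pd

# Illustrative data; the column names mirror the example above.
df = pd.DataFrame({
    "timestamp": [100, 200, 300, 400],
    "a": [123, 7, 7, 9],
})

lower, upper = 150, 350  # hypothetical bounds, chosen for this sketch

# gt() corresponds to '>', lt() to '<', And to '&', Or to '|'.
# Assumed grouping: (timestamp > lower AND timestamp < upper) OR a == 123.
in_range = (df["timestamp"] > lower) & (df["timestamp"] < upper)
mask = in_range | (df["a"] == 123)

filtered = df[mask]
print(filtered)
```

The first row is kept because a equals 123, the next two because their timestamps fall inside the range, and the last row is dropped because it satisfies neither condition.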

ORDER BY clause

The ORDER BY clause sorts the returned results by one or more columns, each in ascending or descending order. To do this, use the sort() function and pass it a list of (column, order) tuples.

An example of using the sort() function can be seen below:

df = dataset_reader.sort([('column_1', 'asc'), ('column_2', 'desc')])
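For comparison, the same ordering can be reproduced on an in-memory DataFrame with pandas itself. This is a minimal sketch of what a list of (column, order) tuples is assumed to mean, with earlier tuples taking precedence; it does not call the reader API, and the sample data is invented.

```python
import pandas as pd

df = pd.DataFrame({
    "column_1": [2, 1, 2, 1],
    "column_2": [10, 20, 30, 40],
})

# Same shape as the argument passed to sort() in the example above.
sort_spec = [("column_1", "asc"), ("column_2", "desc")]

# Translate the spec into the arguments that DataFrame.sort_values expects.
ordered = df.sort_values(
    by=[column for column, _ in sort_spec],
    ascending=[direction == "asc" for _, direction in sort_spec],
).reset_index(drop=True)
print(ordered)
```

Rows are ordered by column_1 ascending first, and ties within column_1 are broken by column_2 descending.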

LIMIT clause

The LIMIT clause caps the number of records returned from the dataset. To do this, use the limit() function.

An example of using the limit() function can be seen below:

df = dataset_reader.limit(100).read()

OFFSET clause

The OFFSET clause skips a given number of rows from the beginning of the results, so that rows are returned starting from a later point. Combined with LIMIT, it lets you iterate over rows in blocks.

An example of using the offset() function can be seen below:

df = dataset_reader.offset(100).read()
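The block-wise iteration mentioned above can be sketched as a simple loop. FakeReader is a hypothetical in-memory stand-in written for this example, not the SDK's reader class, and chaining limit() and offset() in one call is an assumption about how the two are combined.

```python
class FakeReader:
    """Hypothetical in-memory stand-in for a dataset reader (not the SDK class)."""

    def __init__(self, rows, limit=None, offset=0):
        self._rows = rows
        self._limit = limit
        self._offset = offset

    def limit(self, n):
        return FakeReader(self._rows, n, self._offset)

    def offset(self, n):
        return FakeReader(self._rows, self._limit, n)

    def read(self):
        end = None if self._limit is None else self._offset + self._limit
        return self._rows[self._offset:end]


def read_in_blocks(reader, block_size):
    """Yield successive blocks until an empty or short block signals the end."""
    start = 0
    while True:
        block = reader.limit(block_size).offset(start).read()
        if len(block) == 0:
            break
        yield block
        if len(block) < block_size:
            break
        start += block_size


blocks = list(read_in_blocks(FakeReader(list(range(250))), 100))
print([len(block) for block in blocks])
```

With 250 rows and a block size of 100, the loop yields blocks of 100, 100, and 50 rows. When paging against a real dataset, it is usually worth applying a sort as well, since paging without a defined order may not return rows in a stable sequence.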

Writing a dataset

To write to a dataset, pass a pandas DataFrame to a DatasetWriter created for that dataset.

Writing the pandas dataframe

client_context = get_client_context(config_properties)

# Fetch an existing dataset by its ID
dataset = Dataset(client_context).get_by_id({DATASET_ID})

dataset_writer = DatasetWriter(client_context, dataset)

write_tracker = dataset_writer.write(<your_dataFrame>, file_format='json')