[Beta]{class="badge informative"}
Machine learning-assisted schema creation
- Machine learning-assisted schema creation is currently in beta. The documentation and the functionality are subject to change.
Use ML algorithms to generate a schema from sample data. This process saves time and increases accuracy when defining the structure, fields, and data types for large, complex datasets.
With ML schema generation, you can quickly integrate new data sources and reduce the mistakes that come from manual creation. Non-technical users can use it to generate schemas or manage large and complex datasets without any extra effort. This assistance shortens the path from receiving data to gaining insights, as it makes it easier to combine new data sources and perform data analysis.
Getting started
This tutorial requires a working understanding of the requirements for schema creation. Before continuing with this guide, you should read the UI guide to creating and editing schemas.
This guide explains how to use machine learning (ML) algorithms to generate a schema from sample data. To enhance your understanding of the schema creation process, see the manual schema creation workflow guide or the document on field-based workflows in the Schema Editor.
Navigate to the Create schema workflow
From the left navigation of the Platform UI, select Schemas. The Schemas workspace appears. Select Create schema to start the schema creation workflow.
Create a schema
The Create a schema dialog appears. Select the ML-Assisted schema creation option, followed by Select to confirm your choice.
Select a base class
The Create schema workflow appears. Select a base class for your schema, then select Next.
Upload a CSV file
The Select data stage of the creation workflow appears. In the Upload files section, either select Choose files or use the Drag and Drop files area. Choose a .csv file from your computer to use as the sample data for schema generation.
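Before uploading, it can help to sanity-check the sample file locally. The sketch below is a minimal example using Python's standard `csv` module. The function name `check_csv_header` and the specific checks (blank column names, duplicate column names, rows with the wrong number of values) are assumptions about what makes a clean sample, not documented requirements of the workflow.

```python
import csv
import io


def check_csv_header(text: str) -> list[str]:
    """Return a list of problems found in CSV text (an empty list means none found).

    Hypothetical pre-upload check; the rules here are illustrative assumptions.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]

    problems = []
    header = rows[0]

    # Every column should have a usable name.
    if any(not name.strip() for name in header):
        problems.append("blank column name in header")

    # Duplicate names make it ambiguous which column a field came from.
    if len(set(header)) != len(header):
        problems.append("duplicate column names in header")

    # Each data record should have one value per header column.
    for record_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"record {record_no} has {len(row)} values, expected {len(header)}"
            )
    return problems
```

To check a file on disk, read it as text first, for example `check_csv_header(open("sample.csv", newline="").read())`, where `sample.csv` is a placeholder file name.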
Preview data
The Upload file section displays the name of the CSV file that you imported and the Preview section displays rows of sample data from the file you uploaded. Select Next to continue the workflow.
Review and edit schema
The Review and edit stage of the creation workflow appears, displaying the machine learning-assisted Schema recommendation in a tabular view. At this stage, you can edit, add, or remove fields from the recommended schema generated by the machine learning model. The table contains the following fields:
Data Type: the data type of the field (for example, String, Date).
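To build intuition for what a data type recommendation involves, the sketch below guesses a type label for each CSV column from its sample values. It is a simple rule-based illustration, not the machine learning model the workflow uses. The function names, the ISO `YYYY-MM-DD` date format, and the fallback to `String` are all assumptions made for this example.

```python
import csv
import io
from datetime import datetime


def _is_iso_date(value: str) -> bool:
    """Return True if the value parses as YYYY-MM-DD (assumed date format)."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def guess_type(values: list[str]) -> str:
    """Guess a type label for one column; only 'Date' and 'String' are used here."""
    non_empty = [v.strip() for v in values if v.strip()]
    # With no usable samples, or any non-date sample, fall back to String.
    if non_empty and all(_is_iso_date(v) for v in non_empty):
        return "Date"
    return "String"


def guess_schema(text: str) -> dict[str, str]:
    """Map each column name in the CSV text to a guessed type label."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    columns = reader.fieldnames or []
    # Missing values in short rows come back as None, so normalize them to "".
    return {col: guess_type([row.get(col) or "" for row in rows]) for col in columns}
```

For example, a sample with a `name` column of free text and a `joined` column of ISO dates would be labeled `String` and `Date` respectively. A real recommendation would also propose target fields and field groups, which this sketch does not attempt.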
Add a field
To add a field to the schema, select Add new field.
The Select field dialog appears. The dialog contains a diagram of the schema as it currently exists. Choose the desired field, then select Select to add it to the schema. To close the dialog without adding a field, select Cancel.
A new row appears on your recommended schema. You can now edit the field.
Edit a field
To edit a field, select the pencil icon of the row you wish to edit. A details panel appears to the right where you can edit the custom field mapping. The details panel contains the Target field, Display Name, Data Type, and Field Group. Make any necessary changes and select Apply to confirm. Select the pencil icon again to close the details panel.
Remove a field
To remove a field, select the minus icon on the row you want to remove.
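The add, edit, and remove actions above can be pictured as operations on a list of rows. The sketch below is a hypothetical in-memory model only. The `FieldRow` class and the three helper functions are invented for illustration and do not correspond to any Platform API. The attribute names mirror the items shown in the details panel (Target field, Display Name, Data Type, Field Group).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FieldRow:
    """One row of a recommended schema (hypothetical model for illustration)."""
    target_field: str
    display_name: str
    data_type: str
    field_group: str


def add_field(rows: list[FieldRow], row: FieldRow) -> list[FieldRow]:
    """Return a new list with the row appended, like selecting Add new field."""
    return rows + [row]


def edit_field(rows: list[FieldRow], index: int, **changes: str) -> list[FieldRow]:
    """Return a new list with the row at `index` updated, like applying edits."""
    updated = list(rows)
    updated[index] = replace(rows[index], **changes)
    return updated


def remove_field(rows: list[FieldRow], index: int) -> list[FieldRow]:
    """Return a new list without the row at `index`, like the minus icon."""
    return [row for i, row in enumerate(rows) if i != index]
```

Each helper returns a new list rather than changing the original, so an earlier version of the recommendation stays available for comparison. The field and group names you pass in, such as `"Contact"`, are placeholders.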
Approve your recommended schema
To approve your recommended schema and continue the Create schema workflow, select Next.
Name and save schema
The Name and save stage of the creation workflow appears. Enter a Schema display name and an optional description. The Schema generated section provides a diagram of the ML-generated schema. Select Finish to complete the schema creation workflow.
View in the Schema Editor
The Schema Editor appears with your newly created schema displayed in the canvas. Select Save to return to the Schemas workspace.
Next steps
After creating your schema, you can use the Schema Editor to make further modifications, if necessary. Your new schema is now ready to be integrated with your data sources and used for data analysis.
See the Edit an existing schema guide for more information on using the Schema Editor.