Forms JEE workflows | Handling user data forms-jee-workflows-handling-user-data

CAUTION
AEM 6.4 has reached the end of extended support and this documentation is no longer updated. For further details, see our technical support periods. Find the supported versions here.

AEM Forms JEE workflows provide tools to design, create, and manage business processes. A workflow process consists of a series of steps that execute in a specified order. Each step performs a specific action such as assigning a task to a user or sending an email message. A process can interact with assets, user accounts, and services, and can be triggered using any of the following methods:

  • Starting a process from AEM Forms Workspace
  • Using SOAP or RESTful service
  • Submitting an adaptive form
  • Using watched folder
  • Using Email

For more information about creating AEM Forms JEE workflow process, see Workbench Help.

User data and data stores user-data-and-data-stores

When a process is triggered and as it progresses, it captures data about the process participants, data entered by participants in the form associated with the process, and attachments added to the form. The data is stored in AEM Forms JEE server database, and if configured, some data like attachments are stored in the Global Document Storage (GDS) directory. The GDS directory can be configured on a shared file system or a database.

Access and delete user data access-and-delete-user-data

When a process is triggered, a unique process instance ID and long-lived invocation ID is generated and associated with the process instance. You can access and delete data for a process instance based on the long-lived invocation ID. You can deduce the long-lived invocation ID of a process instance with the user name of the process initiator or process participants who have submitted their tasks.

However, you cannot identify the process instance ID for an initiator in the following scenarios:

  • Process triggered through a watched folder: A process instance cannot be identified using its initiator if the process is triggered by a watched folder. In this case, the user information is encoded in the stored data.
  • Process initiated from publish AEM instance: All process instances triggered from AEM publish instance do not capture information about the initiator. However, user data may be captured in the form associated with the process, which is stored in workflow variables.
  • Process initiated through email: The email ID of the sender is captured as a property in an opaque blob column of the tb_job_instance database table, which cannot be queried directly.

Identify process instance IDs when workflow initiator or participant is known initiator-participant

Perform the following steps to identify process instance IDs for a workflow initiator or a participant:

  1. Execute the following command in AEM Forms server database to retrieve the principal ID for workflow initiator or participant from the edcprincipalentity database table.

    code language-sql
    select id from edcprincipalentity where canonicalname='user_ID'
    

    The query returns the principal ID for the specified user_ID.

  2. (For workflow initiator) Execute the following command to retrieve all tasks associated with the principal ID for the initiator from the tb_task database table.

    code language-sql
    select * from tb_task where start_task = 1 and create_user_id= 'initiator_principal_id'
    

    The query returns tasks initiated by the specified initiator_ principal_id. The tasks are of two types:

    • Completed tasks: These tasks have been submitted and display an alphanumeric value in the process_instance_id field. Take note of all process instance IDs for submitted tasks and continue with the steps.
    • Tasks initiated but not complete: These tasks have initiated but not submitted yet. The value in the process_instance_id field for these tasks is 0 (zero). In this case, take note of the corresponding task IDs and see Work with orphan tasks.
  3. (For workflow participants) Execute the following command to retrieve process instance IDs associated with the principal ID of the process participant for the initiator from the tb_assignment database table.

    code language-sql
    select distinct a.process_instance_id from tb_assignment a join tb_queue q on a.queue_id = q.id where q.workflow_user_id='participant_principal_id'
    

    The query returns instance IDs for all processes associated with the participant including those where the participant has not submitted any task.

    Take note of all process instance IDs for submitted tasks and continue with the steps.

    For orphan tasks or tasks where the process_instance_id is 0 (zero), take note of the corresponding task IDs and see Work with orphan tasks.

  4. Follow the instructions in Purge user data from workflow instances based on process instance IDs section to delete user data for identified process instance IDs.

Identify process instance IDs when user data is stored in primitive variables primitive

A workflow can be designed such that the user data is captured in a variable that gets stored as a blob in the database. In such cases, you can query user data only if it is stored in one of the following primitive-type variables:

  • String: Contains the user ID directly or as a substring and can be queried using SQL.
  • Numeric: Contains the user ID directly.
  • XML: Contains the user ID as a substring within the text stored as text columns in database and can be queried like strings.

Perform the following steps to determine if a workflow that stores data in primitive-type variables contains data for the user:

  1. Execute the following database command:

    code language-sql
    select database_table from omd_object_type where name='pt_<app_name>/<workflow_name>'
    

    The query returns a table name in tb_<number> format for the specified application ( app_name) and workflow ( workflow_name).

    note note
    NOTE
    The value of the name property can be complex if the workflow is nested within sub-folders inside the application. Ensure that you specify the exact full path to the workflow, which you can get from the omd_object_type database table.
  2. Review the tb_<number> table schema. The table contains variables that store user data for the specified workflow. The variables in the table correspond to the variables in the workflow.

    Identify and take note of the variable that corresponds to workflow variable containing the user ID. If the identified variable is of primitive-type, you can run a query to determine workflow instances associated with a user ID.

  3. Execute the following database command. In this command, the user_var is the primitive-type variable that contains user ID.

    code language-sql
    select process_instance_id from <tb_name> where <user_var>=<user_ID>
    

    The query returns all process instance IDs associated with the specified user_ID.

  4. Follow the instructions in Purge user data from workflow instances based on process instance IDs section to delete user data for identified process instance IDs.

Purge user data from workflow instances based on process instance IDs purge

Now that you have identified the process instance IDs associated with a user, do the following to delete user data from the respective process instances.

  1. Execute the following command to retrieve long-lived invocation ID and status for a process instance from the tb_process_instance table.

    code language-sql
    select long_lived_invocation_id, status from tb_process_instance where id='process_instance_id'
    

    The query returns the long-lived invocation ID and status for the specified process_instance_id.

  2. Create an instance of the public ProcessManager client ( com.adobe.idp.workflow.client.ProcessManager) using a ServiceClientFactory instance with the correct connection settings.

    For more information, see Java API reference for Class ProcessManager.

  3. Check the status of the workflow instance. If the status is other than 2 (COMPLETE) or 4 (TERMINATED), terminate the instance first by calling the following method:

    ProcessManager.terminateProcess(<long_lived_invocation_id>).

  4. Purge the workflow instance by calling the following method:

    ProcessManager.purgeProcessInstance(<long_lived_invocation_id>)

    The purgeProcessInstance method completely deletes all data for the specified invocation ID from AEM Forms server database and GDS, if configured.

Work with orphan tasks orphan

Orphan tasks are the tasks whose containing process has been initiated but not submitted yet. in this case, the process_instance_id is 0 (zero). Therefore, you cannot trace user data stored for orphan tasks using process instance IDs. However, you can trace it using the task ID for an orphan task. You can identify the tasks IDs from the tb_task table for a user as described in Identify process instance IDs when workflow initiator or participant is known.

Once you have the task IDs, do the following to purge the associated files and data with an orphan task from GDS and database.

  1. Execute the following command on AEM Forms server database to retrieve IDs for the identified task IDs.

    code language-sql
    select id from tb_form_data where task_id=<task_id>
    

    The query returns a list of IDs. For each ID ( fd_id) returned in the results, create a list of session ID strings as follows:

    • _ wfattach<task_id>
    • _wftask<fd_id>
    • _wftaskformid<fd_id>
  2. Depending on whether your GDS points to a file system or a database, perform one of the following steps:

    1. GDS in file system

      In the GDS file system:

      1. Search for files with the following session ID strings as their extensions:

      • _wfattach<task_id>

      • _wftask<fd_id>

      • _wftaskformid<fd_id>

        The files with these extensions are the marker files. They are stored with filenames in the following format:

        <file_name_guid>.session<session_id_string>

      1. Delete all marker files and other files with the exact filename as <file_name_guid> from the file system.
    2. GDS in database

      Execute the following commands for each session ID:

      code language-sql
      delete from tb_dm_chunk where documentid in (select documentid from tb_dm_session_reference where sessionid=<session_id>)
      delete from tb_dm_session_reference where sessionid=<session_id>
      delete from tb_dm_deletion where sessionid=<session_id>
      
  3. Execute the following commands to delete data for task IDs from the AEM Forms server database:

    code language-sql
    delete from tb_task_acl where task_id=<task_id>
    delete from tb_task_attachment where task_id=<task_id>
    delete from tb_form_data where task_id=<task_id>
    delete from tb_assignment where task_id=<task_id>
    delete from tb_task where id=<task_id>
    
recommendation-more-help
a6ebf046-2b8b-4543-bd46-42a0d77792da