Editing the Transformation Configuration File

IMPORTANT

Read more about Data Workbench’s End-of-life announcement.

Steps for editing the Transformation.cfg file for a dataset profile.

  1. While working in your dataset profile, open the Profile Manager and click Dataset to show its contents.

    For information about opening and working with the Profile Manager, see the Data Workbench User Guide.

    NOTE

    A Transformation subdirectory may exist within the Dataset directory. This subdirectory contains the Transformation Dataset Include files that have been created for one or more inherited profiles. For information about Transformation Dataset Include files, see Dataset Include Files.

  2. Right-click the check mark next to Transformation.cfg and click Make Local. A check mark for this file appears in the User column.

  3. Right-click the newly created check mark and click Open > in Workstation. The Transformation.cfg window appears.

    You can also open the Transformation.cfg file from a Transformation Dependency Map. For information about transformation dependency maps, see Dataset Configuration Tools.

  4. Edit the parameters in the configuration file using the following table as a guide.

    When editing the Transformation.cfg file within a data workbench window, you can use shortcut keys for basic editing features, including cut (Ctrl+x), copy (Ctrl+c), paste (Ctrl+v), undo (Ctrl+z), redo (Ctrl+Shift+z), select section (click+drag), and select all (Ctrl+a). In addition, you can use the shortcuts to copy and paste text from one configuration file (.cfg) to another.

    NOTE

    A Transformation Dataset Include file for an inherited profile contains a subset of the parameters described in the following table as well as some additional parameters. For information about Transformation Dataset Include files, see Dataset Include Files.

    Parameter Description
    End Time

    Optional. Filter data to include log entries with timestamps up to, but not including, this time. Adobe recommends using one of the following formats for the time:

    • January 1 2013 HH:MM:SS EDT
    • Jan 1 2013 HH:MM:SS GMT

    For example, specifying "July 29 2013 00:00:00 EDT" as the End Time includes data through July 28, 2013, at 11:59:59 PM EDT.

    You must specify a time zone. The time zone does not default to GMT if not specified. For a list of time zone abbreviations supported by the data workbench server, see Time Zone Codes.

    Note: If you specify a value for End Time, a parameter named End Time is set and applied throughout the transformation phase of dataset construction. For information about parameters, see Defining Parameters in Dataset Include Files.

    Extended Dimensions

    Optional. Adobe recommends defining extended dimensions in one or more Transformation Dataset Include files. For information, see Transformation Dataset Include Files.

    Hash Threshold

    Optional. A sampling factor for random sub-sampling of rows. If set to a number n, then only one out of each n tracking IDs enters the dataset, reducing the total number of rows in the dataset by a factor of n. To create a dataset that requires 100 percent accuracy (that is, to include all rows), you would set Hash Threshold to 1.

    If Hash Threshold is specified in both the Log Processing.cfg and Transformation.cfg files, the two values are not applied in sequence; only the larger of the two values applies.

    Log Entry Condition

    Optional. Defines the rules by which log entries output from log processing are considered for inclusion in the dataset profile. See Log Entry Condition.

    New Visitor Condition

    Optional. For use with web data. Defines the rules by which visitors are considered for inclusion in the data. The New Visitor Condition defines the first log entry for a visitor (ordered by time) that is to be used in the dataset. All subsequent log entries for this visitor are included in the dataset regardless of whether they meet this condition. See New Visitor Condition.

    Reprocess

    Optional. Any character or combination of characters can be entered here. Changing this parameter and saving the file initiates data retransformation.

    For information about reprocessing your data, see Reprocessing and Retransformation.

    Schema Checking

    True or false. If true, then the data workbench server identifies dataset corruption problems and records information about the problems in log files in the data workbench server's Trace directory. The default value is true. Adobe recommends leaving this parameter set to true at all times.

    Stages

    Optional. The names of the processing stages that can be used in Transformation Dataset Include files. Processing stages provide a way to order the transformations that are defined in Transformation Dataset Include files. This parameter is very helpful if you have defined one or more transformations within multiple Transformation Dataset Include files and you want specific transformations to be performed at specific points during transformation.

    The order in which you list the stages here determines the order in which the transformations in the Transformation Dataset Include files are executed during transformation. Preprocessing and Postprocessing are built-in stages; Preprocessing is always the first stage, and Postprocessing is always the last stage. By default, there is one named stage called Default.

    To add a new processing stage

    • In the Transformation.cfg window, right-click Stages and click Add New > Stage.
    • Enter a name for the new stage.

    To delete an existing processing stage

    • Right-click the number corresponding to the stage that you want to delete and click Remove <#stage_number>.

    Note: When you specify a Stage in a Transformation Dataset Include file, the name of the stage must exactly match the name that you enter here. For more information about dataset include files, see Dataset Include Files.

    Start Time

    Optional. Filter data to include log entries with timestamps at or after this time. Adobe recommends using one of the following formats for the time:

    • January 1 2013 HH:MM:SS EDT
    • Jan 1 2013 HH:MM:SS GMT

    For example, specifying "July 29 2013 00:00:00 EDT" as the Start Time includes data starting from July 29, 2013, at 12:00:00 AM EDT.

    You must specify a time zone. The time zone does not default to GMT if not specified. For a list of time zone abbreviations supported by the data workbench server, see Time Zone Codes.

    Note: If you specify a value for Start Time, a parameter named Start Time is set and applied throughout the transformation phase of dataset construction. For information about parameters, see Defining Parameters in Dataset Include Files.

    Transformations

    Optional. Adobe recommends defining transformations for the transformation phase of dataset construction in one or more Transformation Dataset Include files. For information, see Transformation Dataset Include Files.

    Time Zone

    Time zone of the dataset profile. Time zones are used for time conversions and for creating time dimensions. See Time Zones.

    Note: When defined in the Log Processing.cfg file, the Time Zone parameter is used for time conversions only.

  5. Right-click (modified) at the top of the window and click Save.

  6. In the Profile Manager, right-click the check mark for Transformation.cfg in the User column, then click Save to > <dataset profile name> to make the locally made changes take effect. Retransformation of the data begins after synchronization of the dataset profile.

    NOTE

    Do not save the modified configuration file to any of the internal profiles provided by Adobe, as your changes are overwritten when you install updates to these profiles.

    For information about reprocessing or retransforming your data, see Reprocessing and Retransformation.
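
The semantics of a few parameters from step 4 can be easier to follow as code. The Python sketch below is only an illustration and is not how the data workbench server is implemented: the function names (in_time_window, effective_hash_threshold, stage_order) and the fixed UTC-4 offset used for EDT are assumptions made for this example. It models Start Time as an inclusive lower bound, End Time as an exclusive upper bound, Hash Threshold as the larger of the two configured values, and the stage order with the built-in Preprocessing and Postprocessing stages.

```python
from datetime import datetime, timedelta, timezone

# Assumption for this sketch: EDT is modeled as a fixed UTC-4 offset.
# The real server resolves time zone codes itself.
EDT = timezone(timedelta(hours=-4), "EDT")


def in_time_window(timestamp, start_time=None, end_time=None):
    """Return True if timestamp is at or after start_time and before end_time."""
    if start_time is not None and timestamp < start_time:
        return False  # Start Time is inclusive: earlier entries are dropped
    if end_time is not None and timestamp >= end_time:
        return False  # End Time is exclusive: the boundary itself is dropped
    return True


def effective_hash_threshold(log_processing_value, transformation_value):
    """The larger value applies; the two values are not applied in sequence."""
    return max(log_processing_value, transformation_value)


def stage_order(named_stages):
    """Preprocessing is always first and Postprocessing is always last."""
    return ["Preprocessing", *named_stages, "Postprocessing"]


boundary = datetime(2013, 7, 29, 0, 0, 0, tzinfo=EDT)
last_second = datetime(2013, 7, 28, 23, 59, 59, tzinfo=EDT)

print(in_time_window(last_second, end_time=boundary))  # True
print(in_time_window(boundary, end_time=boundary))     # False
print(in_time_window(boundary, start_time=boundary))   # True
print(effective_hash_threshold(2, 5))                  # 5
print(stage_order(["Default"]))
```

With the default configuration (one named stage called Default), stage_order returns Preprocessing, Default, Postprocessing, which matches the ordering described for the Stages parameter.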
