To process log files as log sources, you must define a decoder within the Log Processing Dataset Include file that extracts fields of data from the log entries.
Defining text file decoder groups for log file log sources requires knowledge of the log file’s structure and contents, the data to be extracted, and the fields in which that data is stored. This section provides basic descriptions of the parameters that you can specify for decoders, but the manner in which you use any decoder depends on the log file that contains your source data.
For information about format requirements for log file log sources, see Log Files. For assistance with defining text file decoders, contact Adobe.
A text file decoder group can include regular expression decoders and delimited decoders.
A regular expression decoder identifies complex string patterns within the log entries in a log file and extracts these patterns as fields of data. For each decoder, the number of fields must equal the number of capturing sub-patterns in the regular expression. The portion of the line matching the nth capturing sub-pattern is assigned to the nth field for that line.
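The mapping from capturing sub-patterns to fields can be sketched in Python as follows. This is only an illustration of the rule described above, not Data Workbench's own implementation: the function name `decode_line`, the field names `x-user` and `x-action`, and the sample log line are all hypothetical, and the sketch assumes the expression must match the whole line.

```python
import re

def decode_line(pattern, fields, line):
    """Assign the text matched by the nth capturing group to the nth field."""
    regex = re.compile(pattern)
    # The number of fields must equal the number of capturing sub-patterns.
    if regex.groups != len(fields):
        raise ValueError("field count must equal capturing group count")
    # Assumption: the expression has to match the entire line.
    match = regex.fullmatch(line)
    if match is None:
        return None
    return dict(zip(fields, match.groups()))

# Hypothetical custom fields; custom field names begin with "x-".
fields = ["x-user", "x-action"]
pattern = r"user=(\w+) action=(\w+)"
record = decode_line(pattern, fields, "user=alice action=login")
print(record)
```

A line that the expression does not match returns `None` in this sketch, which leaves it free to be handled by another decoder.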
To add a regular expression decoder to a text file decoder group
Open the Log Processing Dataset Include file as described in Editing Existing Dataset Include Files and add a text file decoder group. See the table entry Decoder Groups.
Right-click Decoders under the newly created decoder group, then click Add new > Regular Expression.
Specify the following information:
Fields: List of the fields in the log file. If any of the fields defined here are to be passed to the transformation phase of dataset construction, those fields must be listed in the Fields parameter of one of the Log Processing Dataset Include files for the dataset. Custom field names must begin with “x-”.
Name: Optional identifier for the decoder.
Regular Expression: Regular expression used to extract the desired fields from each line in the file.
Repeat the two preceding steps for any other decoders that you want to add to the group.
To save the Log Processing Dataset Include file, right-click (modified) at the top of the window and click Save.
To make the locally made changes take effect, in the Profile Manager, right-click the check mark for the file in the User column, then click Save to > <profile name>, where profile name is the name of the dataset profile or the inherited profile to which the dataset include file belongs.
Do not save the modified configuration file to any of the internal profiles provided by Adobe, as your changes are overwritten when you install updates to these profiles.
A given log file can have multiple regular expression decoders. The order in which you define the decoders is important: the first decoder to match a line in the log file is the one used to decode that line.
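The effect of decoder order can be illustrated with a small Python sketch. It is an assumption-laden model rather than the product's behavior in detail: the decoder names, patterns, field names, and the `decode` function are hypothetical, and whole-line matching is assumed.

```python
import re

# Each decoder is modeled as (name, compiled expression, field names).
error_decoder = ("errors", re.compile(r"ERROR (\d+) (.*)"), ["x-code", "x-message"])
generic_decoder = ("generic", re.compile(r"(\w+) (.*)"), ["x-level", "x-text"])

def decode(line, decoders):
    """Return the result of the first decoder whose expression matches the line."""
    for name, regex, fields in decoders:
        match = regex.fullmatch(line)  # assumption: whole-line match
        if match:
            return name, dict(zip(fields, match.groups()))
    return None  # no decoder matched this line

line = "ERROR 404 not found"
specific_first = decode(line, [error_decoder, generic_decoder])
generic_first = decode(line, [generic_decoder, error_decoder])
print(specific_first)
print(generic_first)
```

Both expressions match the sample line, so the same line is decoded into different fields depending on which decoder is listed first. Listing the more specific decoder ahead of the more general one avoids this.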
This example illustrates the use of a regular expression decoder to extract fields of data from a tab-delimited text file. You can achieve the same result by defining a delimited decoder with a tab delimiter.
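As a rough illustration of this equivalence, the following Python sketch decodes one tab-delimited line twice, once with a regular expression and once by splitting on the tab character. The field names, the sample line, and the expression itself are hypothetical and are not taken from any shipped configuration.

```python
import re

# Hypothetical custom fields for a three-column, tab-delimited log line.
fields = ["x-date", "x-page", "x-status"]
line = "2024-01-15\t/index.html\t200"

# One capturing group per column; each group matches any run of non-tab characters.
tab_pattern = re.compile(r"([^\t]*)\t([^\t]*)\t([^\t]*)")
regex_record = dict(zip(fields, tab_pattern.fullmatch(line).groups()))

# The delimited approach: split the line on the single delimiter character.
split_record = dict(zip(fields, line.split("\t")))

print(regex_record == split_record)
```

For a strictly tab-separated file the delimited form is simpler to maintain. A regular expression becomes worthwhile when columns need validation or when the layout is not uniformly delimited.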
For more information about regular expression decoders, including terminology and syntax, see Regular Expressions.
A delimited decoder decodes a log file whose fields are delimited by a single character. The number of fields must correspond to the number of columns in the delimited file; however, not all fields need to be named. If a field is left blank, the column is still required in the log file, but the decoder ignores it.
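A minimal Python sketch of this behavior is shown below. The function `decode_delimited`, the field names, and the sample line are hypothetical. Returning `None` for a line with the wrong number of columns is an assumption made for the sketch, because the text above does not say how such lines are treated.

```python
def decode_delimited(line, fields, delimiter):
    """Split a line on a single-character delimiter and keep only named fields."""
    columns = line.split(delimiter)
    # The number of fields must correspond to the number of columns.
    if len(columns) != len(fields):
        return None  # assumption: a mismatched line is not decoded
    # A blank field name means the column must be present but is ignored.
    return {name: value for name, value in zip(fields, columns) if name}

# The second column is required in the line but is left unnamed.
fields = ["x-id", "", "x-city"]
record = decode_delimited("42|ignored|Paris", fields, "|")
print(record)
```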
To add a delimited decoder to a text file decoder group
Open the Log Processing Dataset Include file as described in Editing Existing Dataset Include Files and add a text file decoder group. See the table entry Decoder Groups.
Right-click Decoders under the newly created decoder group, then click Add new > Delimited.
Specify the following information:
Fields: List of the fields in the log file. If any of the fields defined here are to be passed to the transformation phase of dataset construction, those fields must be listed in the Fields parameter of one of the Log Processing Dataset Include files for the dataset. Custom field names must begin with “x-”.
Delimiter: Character that is used to separate fields in the log file.
Repeat the two preceding steps for any other decoders that you want to add to the group.
To save the Log Processing Dataset Include file, right-click (modified) at the top of the window and click Save.
To make the locally made changes take effect, in the Profile Manager, right-click the check mark for the file in the User column, then click Save to > <profile name>, where profile name is the name of the dataset profile or the inherited profile to which the dataset include file belongs.
Do not save the modified configuration file to any of the internal profiles provided by Adobe, as your changes are overwritten when you install updates to these profiles.
This example illustrates the use of a delimited decoder to extract fields of data from a comma-delimited text file containing data about movies.
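A stand-in for such a file can be sketched in Python. The movie rows, the field names, and the `decode_file` helper are invented for illustration and do not come from an actual dataset or configuration. As in any single-character split, a value that itself contains a comma would be broken into extra columns.

```python
import io

# Hypothetical custom fields and made-up sample rows.
FIELDS = ["x-title", "x-year", "x-genre"]
sample = io.StringIO("Casablanca,1942,Drama\nAlien,1979,Science Fiction\n")

def decode_file(handle, fields, delimiter=","):
    """Decode each line of a delimited file into a dict keyed by field name."""
    records = []
    for raw in handle:
        columns = raw.rstrip("\n").split(delimiter)
        # Assumption: lines with an unexpected column count are skipped.
        if len(columns) == len(fields):
            records.append(dict(zip(fields, columns)))
    return records

movies = decode_file(sample, FIELDS)
for movie in movies:
    print(movie)
```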