Split Processor

This processor splits a document into multiple documents according to a configurable split strategy. The currently implemented strategies are:

Split Strategies

Multi-value field

Takes a multi-value field and creates one document for each of its values.
All fields of the original document, including the multi-value field itself, are copied unchanged.
Each value in the multi-value field is copied to a configurable output field.

For example, let's assume the following document:

{
  "arrayField":["a", "b", "c"],
  "otherField": "d"
}

If arrayField is set as the split field and outputField as the output field, the processor produces three documents:

[
    {
      "arrayField":["a", "b", "c"],
      "otherField": "d",
      "outputField": ["a"]
    },
    {
      "arrayField":["a", "b", "c"],
      "otherField": "d",
      "outputField": ["b"]
    },
    {
      "arrayField":["a", "b", "c"],
      "otherField": "d",
      "outputField": ["c"]
    }
]
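
The behaviour shown in the example above can be sketched in a few lines of Python. This is only an illustration of the strategy, not the processor's actual implementation: the function name split_on_multi_value_field and its parameters are hypothetical, and the sketch assumes the split field is present and holds a list.

```python
import copy


def split_on_multi_value_field(document, split_field, output_field):
    """Hypothetical helper: return one child document per value of split_field."""
    children = []
    # Assumes document[split_field] exists and is a list of values.
    for value in document[split_field]:
        # Copy every field as it comes, including the multi-value field itself.
        child = copy.deepcopy(document)
        # Mirror the example: the output field holds a single-element list.
        child[output_field] = [value]
        children.append(child)
    return children


doc = {"arrayField": ["a", "b", "c"], "otherField": "d"}
children = split_on_multi_value_field(doc, "arrayField", "outputField")
```

Here children holds three documents, each a full copy of the original with outputField set to one of the values; the original document is left unmodified.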

Configuration

Example configuration in a processor:

{
  "strategy":"multiValueField",
  "splitField":"fieldToSplitOn",
  "outputField":"fieldAfterSplit",
  "name": "Split processor",
  "type": "split-processor"
}

Configuration parameters:

strategy

(Required, String) The split strategy to use, for example multiValueField.

splitField

(Required, String) The name of the multi-value field to split on.

outputField

(Required, String) The name of the field to which each value of the split field is copied.

prependRecordId

(Optional, Boolean) Whether the original record ID is prepended to the ID of each split child record. Defaults to true.

Note: Disabling this parameter may produce duplicate record identifiers, which might affect how PDP retrieves data across processors. If the data is not unique, multiple records in a single batch can end up with the same identifier.
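
To illustrate the note, the following Python sketch shows how identifiers could collide within a batch. It rests on assumptions that are not confirmed by this page: that a child identifier is derived from the split value, and that the original record ID is joined to it with a "-" separator. The function child_ids and the sample record IDs are hypothetical.

```python
from collections import Counter


def child_ids(records, prepend_record_id=True):
    """Hypothetical helper: build child identifiers for a batch of records.

    records is a list of (record_id, values) pairs. The "-" separator and the
    use of the split value as the child identifier are assumptions.
    """
    ids = []
    for record_id, values in records:
        for value in values:
            if prepend_record_id:
                ids.append(f"{record_id}-{value}")
            else:
                ids.append(str(value))
    return ids


# Two records in the same batch that share the value "b".
batch = [("rec1", ["a", "b"]), ("rec2", ["b", "c"])]

with_prefix = child_ids(batch, prepend_record_id=True)
without_prefix = child_ids(batch, prepend_record_id=False)

# Identifiers that occur more than once when the prefix is disabled.
duplicates = [i for i, count in Counter(without_prefix).items() if count > 1]
```

Under these assumptions, the prefixed identifiers are all distinct, while the unprefixed ones contain "b" twice because both records share that value.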