This processor splits a document into multiple documents according to a configurable split strategy. The currently implemented strategies are:
For example, let's assume the following document:
{
  "arrayField": ["a", "b", "c"],
  "otherField": "d"
}
If arrayField is set as the split field, the processor produces three documents, one per value:
[
  {
    "arrayField": ["a", "b", "c"],
    "otherField": "d",
    "outputField": ["a"]
  },
  {
    "arrayField": ["a", "b", "c"],
    "otherField": "d",
    "outputField": ["b"]
  },
  {
    "arrayField": ["a", "b", "c"],
    "otherField": "d",
    "outputField": ["c"]
  }
]
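The split illustrated above can be sketched in a few lines of Python. This is only an illustration of the behaviour shown in the example, not the processor's actual implementation; the function name `split_document` and its parameters are hypothetical.

```python
import copy


def split_document(doc, split_field, output_field):
    """Return one copy of `doc` per value found in `doc[split_field]`.

    Hypothetical helper that mirrors the example above; it is not the
    processor's real code.
    """
    children = []
    for value in doc[split_field]:
        child = copy.deepcopy(doc)     # the original fields are kept intact
        child[output_field] = [value]  # each child carries a single value
        children.append(child)
    return children


doc = {"arrayField": ["a", "b", "c"], "otherField": "d"}
children = split_document(doc, "arrayField", "outputField")
```

Each element of `children` keeps the full `arrayField` and `otherField` values and adds an `outputField` holding exactly one of the original values.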
Example configuration in a processor:
{
  "strategy": "multiValueField",
  "splitField": "fieldToSplitOn",
  "outputField": "fieldAfterSplit",
  "name": "Split processor",
  "type": "split-processor"
}
strategy
(Required, String) Split strategy to use.
splitField
(Required, String) The name of the multi-value field to split on.
outputField
(Required, String) The name of the field to which each value of the split field is copied.
prependRecordId
(Optional, Boolean) Flag indicating whether the original record ID should be prepended to the ID of each split child record. Defaults to true.
Note: Disabling this parameter may produce duplicate record identifiers, which might affect how PDP retrieves data across processors. Multiple records in a single batch can end up with the same identifier if the data is not unique.
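As a quick way to check a configuration against the parameters listed above, the following Python sketch verifies that the required parameters are present and fills in the documented default for prependRecordId. The helper `load_split_config` is hypothetical and is not part of PDP; the processor performs its own validation.

```python
REQUIRED_PARAMETERS = ("strategy", "splitField", "outputField")


def load_split_config(config):
    """Check required parameters and apply the documented default.

    Hypothetical helper for illustration only.
    """
    missing = [key for key in REQUIRED_PARAMETERS if key not in config]
    if missing:
        raise ValueError("missing required parameter(s): " + ", ".join(missing))
    resolved = dict(config)
    # prependRecordId is optional and documented as defaulting to true
    resolved.setdefault("prependRecordId", True)
    return resolved


config = load_split_config({
    "strategy": "multiValueField",
    "splitField": "fieldToSplitOn",
    "outputField": "fieldAfterSplit",
    "name": "Split processor",
    "type": "split-processor",
})
```

A configuration that omits prependRecordId resolves to `True`, while one that omits any of the three required parameters raises a `ValueError`.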