Split Processor
This processor splits a document into multiple documents according to a configurable split strategy. The currently implemented strategies are:
Split Strategies
Multi-value field
- Takes a multi-value field and creates one document for each of its values.
- All other fields are copied unchanged.
- Each value in the multi-value field is copied to a configurable output field.
For example, let's assume the following document:
{
  "arrayField": ["a", "b", "c"],
  "otherField": "d"
}
If arrayField is set as the split field and outputField as the output field, the processor produces three documents:
[
  {
    "arrayField": ["a", "b", "c"],
    "otherField": "d",
    "outputField": ["a"]
  },
  {
    "arrayField": ["a", "b", "c"],
    "otherField": "d",
    "outputField": ["b"]
  },
  {
    "arrayField": ["a", "b", "c"],
    "otherField": "d",
    "outputField": ["c"]
  }
]
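The behavior shown above can be approximated with a short Python sketch. This is only an illustration of the multi-value field strategy as described here, not the processor's actual implementation; the function name `split_multi_value` and its parameters are hypothetical.

```python
import copy


def split_multi_value(document, split_field, output_field):
    """Return one copy of `document` per value found in `document[split_field]`.

    Hypothetical helper: it mirrors the documented example, where every
    original field is kept and the single value is wrapped in a list.
    """
    children = []
    for value in document.get(split_field, []):
        child = copy.deepcopy(document)   # the rest of the fields are copied as-is
        child[output_field] = [value]     # one value per child document
        children.append(child)
    return children


doc = {"arrayField": ["a", "b", "c"], "otherField": "d"}
result = split_multi_value(doc, "arrayField", "outputField")
```

With the sample document, `result` holds three documents whose `outputField` values are `["a"]`, `["b"]`, and `["c"]`, matching the example output.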
Configuration
Example configuration in a processor:
{
  "strategy": "multiValueField",
  "splitField": "fieldToSplitOn",
  "outputField": "fieldAfterSplit",
  "name": "Split processor",
  "type": "split-processor"
}
Configuration parameters:
strategy
(Required, String) Split strategy to use.
splitField
(Required, String) The name of the multi-value field to split on.
outputField
(Required, String) The name of the field to which each value of the split field is copied.
prependRecordId
(Optional, Boolean) Whether the original record ID is prepended to the identifier of each child record produced by the split. Defaults to true.
Note: Disabling this parameter may produce duplicate record identifiers, which can affect how PDP retrieves data across processors. If the split values are not unique, multiple records in a single batch can end up with the same identifier.
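The following Python sketch illustrates why disabling prependRecordId can lead to collisions. The identifier scheme is an assumption made purely for illustration (child ID derived from the split value, parent ID joined with a hyphen); the format PDP actually uses is not specified here, and `child_id` and `ids_for_batch` are hypothetical helpers.

```python
def child_id(parent_id, value, prepend_record_id=True):
    # Hypothetical ID scheme, assumed for illustration only.
    return f"{parent_id}-{value}" if prepend_record_id else str(value)


def ids_for_batch(batch, prepend_record_id):
    """Compute the child IDs for every (parent_id, values) pair in a batch."""
    return [
        child_id(parent_id, value, prepend_record_id)
        for parent_id, values in batch
        for value in values
    ]


# Two parent records that share the value "b" in their split field.
batch = [("rec1", ["a", "b"]), ("rec2", ["b", "c"])]

with_prefix = ids_for_batch(batch, prepend_record_id=True)
without_prefix = ids_for_batch(batch, prepend_record_id=False)

# True when at least two children ended up with the same identifier.
has_duplicates = len(set(without_prefix)) < len(without_prefix)
```

Under this assumed scheme, the prefixed identifiers stay unique across the batch, while the unprefixed ones collide on the shared value.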
©2024 Pureinsights Technology Corporation