CSV Processor
CSV Connector
This connector splits a single record into multiple child records.
The incoming record must contain a reference to a CSV (comma-separated values) file.
Each line of the CSV file is read, and a new child record is created for it and enqueued on the current processing pipeline.
All child records have the same action as the parent record.
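The splitting behaviour described above can be sketched in Python as follows. This is an illustration only, not the connector's implementation (which is Java and relies on OpenCSV). The record shape (a dict with `id`, `action`, and `fields` keys), the function name `split_record`, and the child id scheme are all hypothetical, and the CSV content is passed inline rather than through a file reference.

```python
import csv
import io


def split_record(parent, csv_text):
    """Create one child record per CSV line, inheriting the parent's action.

    Hypothetical record shape: {"id": ..., "action": ..., "fields": {...}}.
    """
    children = []
    for line_number, row in enumerate(csv.reader(io.StringIO(csv_text))):
        children.append({
            "id": f"{parent['id']}-{line_number}",  # illustrative id scheme
            "action": parent["action"],             # same action as the parent
            "fields": {"values": row},              # raw column values of the line
        })
    return children


parent = {"id": "file-1", "action": "CREATE", "fields": {}}
children = split_record(parent, "a,b\nc,d\n")
for child in children:
    print(child)
```

In the real connector each child would be enqueued on the current pipeline instead of being collected in a list.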
API
OpenCSV, a simple library for reading and writing CSV in Java (version 5.5.2).
Configuration
Sample configuration in a processor:
{
  "csv": {
    "key": "fieldWithCSVContent",
    "columns": [
      {
        "name": "columnName1",
        "index": 0,
        "idColumn": true
      },
      {
        "name": "columnName2",
        "index": 1
      },
      {
        "name": "columnName3",
        "index": 2
      }
    ],
    "skipLines": 1,
    "encoding": "UTF-8",
    "csvParser": "RFC4180"
  },
  "name": "{some name}",
  "type": "csv-processor",
  "pipelineId": "{Some Id}"
}
Configuration parameters:
key
- Required, string
Name of the record's field containing the CSV file content.
skipLines
- Optional, integer
Number of lines to skip from the beginning of the CSV file. Useful for skipping a header row with column names.
encoding
- Optional, string
Encoding to use when reading the CSV file. Defaults to the JVM's default charset.
parser
- Optional, string
Name of the parser strategy to use. Supports the following values:
- RFC4180: Uses the RFC 4180 standard, which stipulates the use of CRLF pairs to denote line breaks. This avoids breaking lines on \n or other characters, even if they are surrounded by double quotes.
If this value is not set, then a default parser will be used.
columns
- Required, JSON array
List of objects describing the file's columns and how to map them to the newly created record. Each object has:
name
- Required, string. The column's name. It is used as the field name on the child record.
index
- Required, integer. 0-based index of the column's position in the file.
idColumn
- Optional, boolean. Whether the column should be treated as the child record's id. Only one column may be set to true.
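To make the column mapping concrete, here is a minimal Python sketch of how `columns`, `skipLines`, and `idColumn` could interact. It is an illustration under stated assumptions, not the connector's code: the function `map_rows` and the child record layout are hypothetical, Python's standard `csv` module stands in for OpenCSV, and keeping the id column as a regular field as well is an assumption.

```python
import csv
import io


def map_rows(csv_text, columns, skip_lines=0):
    """Map each CSV row to a child record using the column definitions."""
    records = []
    rows = list(csv.reader(io.StringIO(csv_text)))
    for row in rows[skip_lines:]:  # skipLines: drop leading rows such as a header
        record = {"id": None, "fields": {}}
        for column in columns:
            value = row[column["index"]]              # index: 0-based position in the row
            record["fields"][column["name"]] = value  # name: field name on the child
            if column.get("idColumn", False):
                record["id"] = value                  # idColumn: value is also the child id
        records.append(record)
    return records


# Column definitions taken from the sample configuration; the CSV data is made up.
columns = [
    {"name": "columnName1", "index": 0, "idColumn": True},
    {"name": "columnName2", "index": 1},
    {"name": "columnName3", "index": 2},
]
csv_text = "id,title,author\n1,Dune,Herbert\n2,Emma,Austen\n"
records = map_rows(csv_text, columns, skip_lines=1)
print(records)
```

With `skip_lines=1` the header row is ignored, so only the two data rows produce child records.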
©2024 Pureinsights Technology Corporation