...

The "pipelineId" references a pipeline that you'll create in the next step.

The "batchSize" determines the number of records processed in a single job.

Finally, the "properties" element defines global properties that any processor working on this seed can access. Here it sets an "index" property, which is used later to decide where to store the records produced by this seed.

...

[
  {
    "bulkSize": 300,
    "servers": [
      {
        "hostname": "localhost",
        "port": 9200
      }
    ],
    "name": "Persist to Search",
    "index": "${index}",
    "type": "elasticsearch-hydrator",
    "batchSize": 50
  }
]

The "type" field above references the "elasticsearch-hydrator" processor, which takes content and sends it to Elasticsearch efficiently.

Several fields, such as "bulkSize", "index" (whose value comes from the global seed properties) and "servers", are specific to this type of processor. Other processors take different configuration fields; details on configuring each one are in its respective User Guide.
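To illustrate how a placeholder such as "${index}" could be filled from the seed's global properties, here is a minimal Python sketch. It is not the product's actual resolution logic: the resolve function and the "records-index" value are hypothetical and only show the substitution idea.

```python
from string import Template

# Hypothetical seed-level properties; "records-index" is an illustrative value.
seed_properties = {"index": "records-index"}

# Processor configuration from the example above, with the "${index}" placeholder.
processor_config = {
    "bulkSize": 300,
    "servers": [{"hostname": "localhost", "port": 9200}],
    "name": "Persist to Search",
    "index": "${index}",
    "type": "elasticsearch-hydrator",
    "batchSize": 50,
}


def resolve(value, properties):
    """Recursively substitute ${name} placeholders found in string values."""
    if isinstance(value, str):
        # safe_substitute leaves unknown placeholders untouched instead of raising.
        return Template(value).safe_substitute(properties)
    if isinstance(value, dict):
        return {key: resolve(item, properties) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve(item, properties) for item in value]
    return value  # numbers, booleans and None pass through unchanged


resolved = resolve(processor_config, seed_properties)
print(resolved["index"])
```

In this sketch, a placeholder with no matching property is left as-is, which makes a missing seed property easy to spot in the resolved configuration.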

Processors also accept a "batchSize" field that overrides the batch size set in the seed configuration. This is particularly useful for processors that generate new records: once it finishes processing, a processor with its own batch size regroups the records from each job into new jobs of that size.
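The regrouping described above can be pictured with a short Python sketch. It assumes each job is handled independently and that the last new job may hold fewer records. Neither detail is stated here, and rebatch_job is a hypothetical helper, not part of the product.

```python
def rebatch_job(records, batch_size):
    """Split the records of one job into new jobs of at most batch_size records."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    return [
        records[start:start + batch_size]
        for start in range(0, len(records), batch_size)
    ]


# Illustrative numbers: a job that ends up with 120 records after processing,
# regrouped with the processor-level "batchSize" of 50 from the example above.
job_records = list(range(120))
new_jobs = rebatch_job(job_records, 50)
print([len(job) for job in new_jobs])  # prints [50, 50, 20]
```

No record is dropped or duplicated in this sketch; only the job boundaries change.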

Start an Execution

Now that we have a sample seed, pipeline and processor configured, we can start an execution.

...