Ingestion Framework

The Ingestion Framework is a custom ETL (Extract, Transform, Load) implementation. It ingests and processes content from different data sources so that the content can be made available through the Discovery API.

Terms

Record

The basic unit of content. A record usually represents a single document, a single user, or a single row in a table.

Seed

The origin of records. A seed can be, among others, a relational database, a website, or Amazon S3.

Pipeline

A finite state machine that defines the order in which processing steps are executed on records.

Processor

A unit of work performed on a record.
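
As an illustration of how a pipeline and its processors relate, the following is a minimal sketch and not the framework's actual API. It assumes that a record is a plain dictionary, that a processor is a function from record to record, and that each pipeline state runs one processor and names the state that follows it. The names `State`, `Pipeline`, `strip_title`, and `add_length` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Assumption: a record is a plain dictionary of field names to values.
Record = Dict[str, object]
# Assumption: a processor takes a record and returns the processed record.
Processor = Callable[[Record], Record]


@dataclass
class State:
    """One state of the pipeline: a processor plus the state that follows it."""
    processor: Processor
    next_state: Optional[str]  # None marks the final state


@dataclass
class Pipeline:
    """A small finite state machine that fixes the order of processing steps."""
    initial: str
    states: Dict[str, State] = field(default_factory=dict)

    def run(self, record: Record) -> Record:
        current: Optional[str] = self.initial
        while current is not None:
            state = self.states[current]
            record = state.processor(record)  # do the unit of work
            current = state.next_state        # transition to the next state
        return record


# Hypothetical processors, used only to demonstrate the flow.
def strip_title(record: Record) -> Record:
    return {**record, "title": str(record["title"]).strip()}


def add_length(record: Record) -> Record:
    return {**record, "length": len(str(record["title"]))}


pipeline = Pipeline(
    initial="clean",
    states={
        "clean": State(strip_title, "measure"),
        "measure": State(add_length, None),
    },
)

result = pipeline.run({"id": 1, "title": "  Hello  "})
print(result)
```

In this sketch the order of the steps lives in the state table rather than in the processors themselves, so each processor stays independent of the pipelines that use it.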

Cronjob

A schedule on which a given list of seeds is run.

Job

A batch of records that is handled as a single unit of work in the PDP.
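
To make the relationship between seeds, records, and jobs more concrete, here is a minimal sketch of grouping the records of one seed into jobs. This page does not say how records are assigned to jobs, so the fixed batch size, the `Job` class, the `batch_into_jobs` function, and the seed name `users_table` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Dict, Iterable, Iterator, List

# Assumption: a record is a plain dictionary of field names to values.
Record = Dict[str, object]


@dataclass
class Job:
    """Hypothetical job: a batch of records that came from one seed."""
    seed_name: str
    records: List[Record]


def batch_into_jobs(
    seed_name: str, records: Iterable[Record], batch_size: int
) -> Iterator[Job]:
    """Group a stream of records into jobs of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:  # the seed has no more records
            return
        yield Job(seed_name, batch)


# Five records from a hypothetical seed, grouped into jobs of two.
seed_records = ({"id": i} for i in range(5))
jobs = list(batch_into_jobs("users_table", seed_records, batch_size=2))
print([len(job.records) for job in jobs])
```

Reading the records lazily means a large seed never has to be held in memory all at once; only one batch is materialized at a time.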
