Ingestion Framework
A custom ETL (Extract, Transform, Load) implementation that ingests and processes content from different data sources so that it can be made available through the Discovery API.
Terms
Record
A single unit of content, usually corresponding to one document, one user, or one row in a table.
Seed
The origin of a set of records, for example a relational database, a website, or Amazon S3.
Pipeline
A finite state machine that defines the order in which processing steps are executed on records.
Processor
A unit of work performed on a record.
Cronjob
Defines a schedule to run a given list of seeds.
Job
A batch of records that forms a unit of work in the PDP.
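To make the relationship between these terms more concrete, here is a minimal Python sketch of how a pipeline could act as a finite state machine that applies processors to the records of a job. Every name in it (Record, Pipeline, lowercase_title, add_length, the state names) is hypothetical and chosen for illustration only; it is not the framework's actual API or configuration format.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# NOTE: all names below are hypothetical, not the framework's API.


@dataclass
class Record:
    """One unit of content, e.g. one document or one table row."""
    id: str
    data: Dict[str, object] = field(default_factory=dict)


# A processor takes a record and returns the (possibly modified) record.
Processor = Callable[[Record], Record]


@dataclass
class Pipeline:
    """A tiny finite state machine: each state runs one processor,
    then moves to the next state. None marks the end of the pipeline."""
    initial: str
    processors: Dict[str, Processor]
    transitions: Dict[str, Optional[str]]

    def run(self, record: Record) -> Record:
        state: Optional[str] = self.initial
        while state is not None:
            record = self.processors[state](record)
            state = self.transitions.get(state)
        return record


def lowercase_title(record: Record) -> Record:
    # Example processor: normalize the title field.
    record.data["title"] = str(record.data.get("title", "")).lower()
    return record


def add_length(record: Record) -> Record:
    # Example processor: enrich the record with a derived field.
    record.data["title_length"] = len(str(record.data.get("title", "")))
    return record


pipeline = Pipeline(
    initial="normalize",
    processors={"normalize": lowercase_title, "enrich": add_length},
    transitions={"normalize": "enrich", "enrich": None},
)

# A job is modeled here as a plain batch (list) of records.
job = [Record("1", {"title": "Hello World"}), Record("2", {"title": "ETL"})]
results = [pipeline.run(r) for r in job]
```

In this sketch the transition table is linear, but the same structure would allow branching if a processor's outcome selected the next state. The seed and cronjob are left out because they depend on scheduling and source details not described here.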