Keyword Extraction Processor

Uses the YAKE API to extract relevant keywords from a text.

Configuration

Example configuration in a processor:

{
  "servers": [
    {
      "host": "website",
      "port": 5000,
      "path": "yake"
    }
  ],
  "name": "Keyword Extraction Processor",
  "active": true,
  "type": "keyword-extraction-processor",
  "sourceField": "source",
  "language": "en",
  "maxNgramSize": 3,
  "minNgramSize": 2,
  "maxNumberOfKeywords": 20,
  "deduplication_algo": "seqm",
  "outputField": "output",
  "id": "efe35dc7-fa16-4787-9362-db23395c96e8"
}
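
As a rough illustration of how a `servers` entry like the one above might be turned into a request URL, the Python sketch below joins `host`, `port` and `path`. The `http` scheme, the helper name `build_yake_url` and the joining rule are assumptions made for illustration; the processor's actual behavior may differ.

```python
def build_yake_url(server: dict, scheme: str = "http") -> str:
    """Join host, port and path from one `servers` entry into a URL.

    Assumption: the scheme is not part of the configuration, so it is
    passed in separately (hypothetical default: "http").
    """
    host = server["host"]
    port = int(server["port"])
    # Strip leading slashes so "yake" and "/yake" give the same result.
    path = str(server["path"]).lstrip("/")
    return f"{scheme}://{host}:{port}/{path}"


server = {"host": "website", "port": 5000, "path": "yake"}
print(build_yake_url(server))  # http://website:5000/yake
```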

Configuration parameters:

servers.host

(Required, String) The host where the YAKE API is located.

servers.port

(Required, int) The port on the host where the YAKE API is located.

servers.path

(Required, String) The path on the host used to call the YAKE API.

sourceField

(Required, String) The field from which the keywords are extracted.

language

(Optional, String) The language of the text to be processed. Default: "en"

maxNgramSize

(Optional, int) Maximum n-gram size, that is, the largest number of contiguous words a keyword may contain. Default: 3

minNgramSize

(Optional, int) Minimum n-gram size, that is, the smallest number of contiguous words a keyword may contain. Default: 2
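
To make the two n-gram bounds concrete, the Python sketch below lists every contiguous word sequence whose length lies between a minimum and a maximum. It uses naive whitespace tokenization and a hypothetical helper named `candidate_ngrams`; it only illustrates what the size limits mean and is not how YAKE selects or scores its candidates.

```python
def candidate_ngrams(text: str, min_n: int = 2, max_n: int = 3) -> list[str]:
    """Return all contiguous word sequences of length min_n..max_n."""
    words = text.split()  # naive whitespace tokenization (assumption)
    ngrams = []
    for n in range(min_n, max_n + 1):
        for start in range(len(words) - n + 1):
            ngrams.append(" ".join(words[start:start + n]))
    return ngrams


print(candidate_ngrams("automatic keyword extraction tool"))
# ['automatic keyword', 'keyword extraction', 'extraction tool',
#  'automatic keyword extraction', 'keyword extraction tool']
```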

maxNumberOfKeywords

(Optional, int) Maximum number of keywords to extract. Default: 20

deduplication_algo

(Optional, String) Similarity function used to detect and discard near-duplicate keywords. Options: "leve", "jaro" and "seqm". Default: "seqm"
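
The sketch below illustrates similarity-based deduplication in general terms: a keyword is kept only if it is not too similar to one already kept. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity function. The helper name `deduplicate` and the 0.9 threshold are assumptions, and this is not YAKE's own implementation of "leve", "jaro" or "seqm".

```python
from difflib import SequenceMatcher


def deduplicate(keywords: list[str], threshold: float = 0.9) -> list[str]:
    """Keep each keyword unless it is too similar to one already kept."""
    kept: list[str] = []
    for candidate in keywords:
        is_duplicate = any(
            SequenceMatcher(None, candidate, existing).ratio() > threshold
            for existing in kept
        )
        if not is_duplicate:
            kept.append(candidate)
    return kept


print(deduplicate(["keyword extraction", "keyword extractions", "text mining"]))
# ['keyword extraction', 'text mining']
```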

outputField

(Required, String) Name of the field in which the keywords array is stored.
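
The required and optional parameters listed above can be summarized in a small validation sketch. The Python helper below, with the hypothetical name `normalize_config`, checks that the required fields are present and fills in the documented defaults. It is an illustration of the parameter list, not the processor's actual validation logic; in particular, the check that `minNgramSize` does not exceed `maxNgramSize` is an assumption.

```python
REQUIRED_FIELDS = ("servers", "sourceField", "outputField")
REQUIRED_SERVER_FIELDS = ("host", "port", "path")

# Defaults as documented for the optional parameters.
DEFAULTS = {
    "language": "en",
    "maxNgramSize": 3,
    "minNgramSize": 2,
    "maxNumberOfKeywords": 20,
    "deduplication_algo": "seqm",
}
DEDUP_OPTIONS = {"leve", "jaro", "seqm"}


def normalize_config(config: dict) -> dict:
    """Validate required fields and fill in documented defaults."""
    missing = [name for name in REQUIRED_FIELDS if name not in config]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    for server in config["servers"]:
        absent = [name for name in REQUIRED_SERVER_FIELDS if name not in server]
        if absent:
            raise ValueError(f"server entry is missing: {absent}")

    # Values given in the config override the defaults.
    merged = {**DEFAULTS, **config}
    if merged["deduplication_algo"] not in DEDUP_OPTIONS:
        raise ValueError("deduplication_algo must be 'leve', 'jaro' or 'seqm'")
    # Assumption: the minimum n-gram size should not exceed the maximum.
    if merged["minNgramSize"] > merged["maxNgramSize"]:
        raise ValueError("minNgramSize must not exceed maxNgramSize")
    return merged


config = normalize_config({
    "servers": [{"host": "website", "port": 5000, "path": "yake"}],
    "sourceField": "source",
    "outputField": "output",
})
print(config["maxNumberOfKeywords"])  # 20
```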

Additional Reference

YAKE reference


©2024 Pureinsights Technology Corporation