NLP Service Processor
Sends the content of a field to a spaCy API server for information extraction or natural language understanding, and stores the result in an output field.
Configuration
Example configuration in a processor:
{
"servers":[
{
"host": "website",
"port": 80
}
],
"connectTimeout": 1000,
"readTimeout": 1000,
"name":"NLP Service Processor",
"active":true,
"type":"nlp-service-processor",
"mode": "deps",
"sourceField":"source",
"collapse_punctuation": false,
"collapse_phrases": true,
"outputField": "output",
"id": "efe35dc7-fa16-4787-9362-db23395c96e8"
}
Configuration parameters:
servers.host
(Required, String) The host where the spaCy API is located.
servers.port
(Required, Int) The port on which the spaCy API listens.
mode
(Required, String) The spaCy API call to make. Options: ent (named entities), deps (dependency parse)
sourceField
(Required, String) The name of the field whose text is processed.
connectTimeout
(Optional, Int) Timeout for connecting to the server, in milliseconds. Default: 60000 (1 minute)
readTimeout
(Optional, Int) Timeout for reading from the server, in milliseconds. Default: 60000 (1 minute)
model
(Optional, String) The model installed on the server. Default: "en"
collapse_punctuation
(Optional, Boolean) Whether to merge punctuation onto the preceding token. Default: false
collapse_phrases
(Optional, Boolean) Whether to merge noun chunks and named entities into single tokens. Default: false
outputField
(Optional, String) Name of the field where the result is stored. Default: nlpOutput
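The required parameters and defaults listed above can be checked before a configuration is submitted. The following Python sketch is only an illustration: with_defaults is a hypothetical client-side helper, not part of the processor, and the processor's own validation may differ.

```python
from copy import deepcopy

# Defaults as documented in the parameter list above.
DEFAULTS = {
    "connectTimeout": 60000,
    "readTimeout": 60000,
    "model": "en",
    "collapse_punctuation": False,
    "collapse_phrases": False,
    "outputField": "nlpOutput",
}
REQUIRED = ("servers", "mode", "sourceField")
MODES = ("ent", "deps")


def with_defaults(config: dict) -> dict:
    """Hypothetical helper: check required keys, then fill documented defaults."""
    missing = [key for key in REQUIRED if key not in config]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    if not config["servers"]:
        raise ValueError("at least one server is required")
    for server in config["servers"]:
        if "host" not in server or "port" not in server:
            raise ValueError("each server needs a host and a port")
    if config["mode"] not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    resolved = deepcopy(DEFAULTS)
    resolved.update(deepcopy(config))  # explicit values override defaults
    return resolved


resolved = with_defaults({
    "servers": [{"host": "website", "port": 80}],
    "mode": "deps",
    "sourceField": "source",
})
print(resolved["outputField"])  # prints "nlpOutput"
```

Values supplied in the configuration always take precedence; only absent optional parameters fall back to the documented defaults.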
Input/Output examples
deps
- Processor
{
"servers":[
{
"host": "website",
"port": 80
}
],
"connectTimeout": 1000,
"readTimeout": 1000,
"name":"NLP Service Processor",
"active":true,
"type":"nlp-service-processor",
"mode": "deps",
"sourceField":"source",
"collapse_punctuation": false,
"collapse_phrases": true,
"outputField": "output",
"id": "efe35dc7-fa16-4787-9362-db23395c96e8"
}
- Input
{
"source": "They ate the pizza with anchovies"
}
- Output
{
"source": "They ate the pizza with anchovies",
"output": {
"arcs": [
{
"dir": "left",
"start": 0,
"end": 1,
"label": "nsubj"
},
{
"dir": "right",
"start": 1,
"end": 2,
"label": "dobj"
},
{
"dir": "right",
"start": 1,
"end": 3,
"label": "prep"
},
{
"dir": "right",
"start": 3,
"end": 4,
"label": "pobj"
},
{
"dir": "left",
"start": 2,
"end": 3,
"label": "prep"
}
],
"words": [
{
"tag": "PRP",
"text": "They"
},
{
"tag": "VBD",
"text": "ate"
},
{
"tag": "NN",
"text": "the pizza"
},
{
"tag": "IN",
"text": "with"
},
{
"tag": "NNS",
"text": "anchovies"
}
]
}
}
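To read the deps output, each arc can be resolved to a head word and a dependent word. The Python sketch below assumes the displaCy convention, which this page does not state explicitly: start and end are indexes into words, "dir": "left" means the dependent is at start and the head at end, and "dir": "right" means the reverse. The function name describe_arcs is hypothetical, and only the first four arcs of the output above are used to keep the example short.

```python
def describe_arcs(output: dict) -> list:
    """Turn arcs into (head, label, dependent) triples.

    Assumes displaCy-style arcs: 'left' points at the word at 'start',
    'right' points at the word at 'end'.
    """
    words = [word["text"] for word in output["words"]]
    relations = []
    for arc in output["arcs"]:
        if arc["dir"] == "left":
            dependent, head = arc["start"], arc["end"]
        else:
            head, dependent = arc["start"], arc["end"]
        relations.append((words[head], arc["label"], words[dependent]))
    return relations


# Words and the first four arcs from the deps output above.
output = {
    "arcs": [
        {"dir": "left", "start": 0, "end": 1, "label": "nsubj"},
        {"dir": "right", "start": 1, "end": 2, "label": "dobj"},
        {"dir": "right", "start": 1, "end": 3, "label": "prep"},
        {"dir": "right", "start": 3, "end": 4, "label": "pobj"},
    ],
    "words": [
        {"tag": "PRP", "text": "They"},
        {"tag": "VBD", "text": "ate"},
        {"tag": "NN", "text": "the pizza"},
        {"tag": "IN", "text": "with"},
        {"tag": "NNS", "text": "anchovies"},
    ],
}

relations = describe_arcs(output)
for head, label, dependent in relations:
    print(f"{head} --{label}--> {dependent}")
```

Because collapse_phrases is enabled in this example, "the pizza" is a single token, so word indexes refer to merged tokens rather than whitespace-separated words.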
ent
- Processor
{
"servers":[
{
"host": "website",
"port": 80,
"path": "ent"
}
],
"name":"NLP Service Processor",
"active":true,
"type":"nlp-service-processor",
"mode": "ent",
"sourceField":"source",
"outputField": "output",
"id": "efe35dc7-fa16-4787-9362-db23395c96e8"
}
- Input
{
"source": "When Sebastian Thrun started working on self-driving cars at Google in 2007, few people outside of the company took him seriously."
}
- Output
{
"source": "When Sebastian Thrun started working on self-driving cars at Google in 2007, few people outside of the company took him seriously.",
"output":
[
{
"end": 20,
"start": 5,
"type": "PERSON"
},
{
"end": 67,
"start": 61,
"type": "ORG"
},
{
"end": 75,
"start": 71,
"type": "DATE"
}
]
}
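The ent output contains only offsets and types, so the entity text has to be recovered from the source field. The Python sketch below assumes that start and end are character offsets into the source string with an exclusive end, which matches the values in the example above. The function name extract_entities is hypothetical.

```python
def extract_entities(record: dict, source_field: str = "source",
                     output_field: str = "output") -> list:
    """Return (text, type) pairs by slicing the source text with each entity's offsets."""
    text = record[source_field]
    return [
        (text[entity["start"]:entity["end"]], entity["type"])
        for entity in record[output_field]
    ]


# Record copied from the ent output above.
record = {
    "source": (
        "When Sebastian Thrun started working on self-driving cars at Google "
        "in 2007, few people outside of the company took him seriously."
    ),
    "output": [
        {"end": 20, "start": 5, "type": "PERSON"},
        {"end": 67, "start": 61, "type": "ORG"},
        {"end": 75, "start": 71, "type": "DATE"},
    ],
}

entities = extract_entities(record)
print(entities)
# [('Sebastian Thrun', 'PERSON'), ('Google', 'ORG'), ('2007', 'DATE')]
```

The field names passed to the helper should match the sourceField and outputField values of the processor configuration.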
©2024 Pureinsights Technology Corporation