Open AI Processor


This processor uses Open AI's API for embedding and chat completion features (see here).

Action: Process (Embeddings)

Uses Open AI's API embeddings feature (see here) to get a vector representation of a given input.

Configuration

Example configuration in a processor:

{
  "type": "open-ai-processor",
  "sourceField": "description",
  "output": "embeddings",
  "multiSourceFieldSeparator": " ",
  "model": "text-embedding-ada-002",
  "timeout": "PT120S",
  "user": "pdp-embeddings",
  "backoffType": "exponential",
  "backoffInitialDelay": "PT10S",
  "backoffMaxRetries": 7,
  "credentialId": "b0fa9c0f-912f-4345-88cf-1d78bdf462f1"
}

Configuration parameters:

sourceField

(Optional, String/List) Field with the input text. Defaults to cleanContent. If multiple fields are provided, they are concatenated with the value specified in multiSourceFieldSeparator.
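If sourceField is given as a list, the named fields are joined with multiSourceFieldSeparator before being sent for embedding. A minimal sketch, assuming hypothetical fields title and summary:

```json
{
  "type": "open-ai-processor",
  "sourceField": ["title", "summary"],
  "multiSourceFieldSeparator": " ",
  "output": "embeddings",
  "credentialId": "b0fa9c0f-912f-4345-88cf-1d78bdf462f1"
}
```

With this configuration, a record's title and summary values are concatenated with a single space and embedded as one input.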

multiSourceFieldSeparator

(Optional, String) The string used to concatenate multiple source fields when sourceField is a list. Defaults to

output

(Optional, String) The record field where the embeddings result will be stored. Defaults to openAiEmbedding

model

(Optional, String) The Open AI embeddings model to use in requests, defaults to text-embedding-ada-002

timeout

(Optional, String) The timeout on embeddings requests. Expressed in Duration notation, defaults to PT120S

user

(Optional, String) The user to be included in embeddings requests. If specified as null, no user will be included in requests. If not specified at all, defaults to pdp-content-processor.
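For example, to suppress the user field entirely instead of sending the default, user can be set explicitly to null. A minimal sketch (other values mirror the example configuration above):

```json
{
  "type": "open-ai-processor",
  "sourceField": "description",
  "output": "embeddings",
  "user": null,
  "credentialId": "b0fa9c0f-912f-4345-88cf-1d78bdf462f1"
}
```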

backoffType

(Optional, String) The type of backoff applied to retries when checking whether a cooldown has passed. Options are none, constant, and exponential. Defaults to constant.

backoffInitialDelay

(Optional, String) The initial delay between backoff checks for a passed cooldown. Expressed in Duration notation, defaults to PT10S

backoffMaxRetries

(Optional, Integer) Maximum number of times a processor tries to check for a passed cooldown before failing a job. Defaults to 25

ignoreExceedingInput

(Optional, Boolean) Defines whether record text that exceeds the token limit is ignored or truncated. If true, the record is ignored; otherwise, the record text is truncated. Defaults to false

Supported model generations are V1 and V2, denoted as -001 and -002 respectively in the model ID. When truncating text, V1 models limit the input to 2046 tokens, while V2 models allow up to 8192 tokens. A token is treated as equivalent to 3 characters.

If the configured model belongs to an unsupported generation, the V2 limits are applied by default.
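For instance, to skip over-long records instead of truncating them, a configuration could combine a V1 model with ignoreExceedingInput. A sketch, reusing the field names from the example above:

```json
{
  "type": "open-ai-processor",
  "sourceField": "description",
  "output": "embeddings",
  "model": "text-embedding-ada-001",
  "ignoreExceedingInput": true,
  "credentialId": "b0fa9c0f-912f-4345-88cf-1d78bdf462f1"
}
```

With this configuration, any record whose description exceeds the 2046-token V1 limit is ignored rather than truncated.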

Input/Output examples

  • Processor
{
  "name": "Embeddings processor",
  "type": "open-ai-processor",
  "sourceField": "description",
  "output": "embeddings",
  "user": "pdp-embeddings",
  "backoffType": "constant",
  "backoffInitialDelay": "PT10S",
  "backoffMaxRetries": 25,
  "ignoreExceedingInput": false,
  "credentialId": "b0fa9c0f-912f-4345-88cf-1d78bdf462f1",
  "id": "fa16215c-1ff1-4233-8519-bb763144b7e9"
}
  • Input
{
  "description": "JavaScript lies at the heart of almost every modern web application, from social apps like Twitter to browser-based game frameworks like Phaser and Babylon. Though simple for beginners to pick up and play with, JavaScript is a flexible, complex language that you can use to build full-scale applications."
}
  • Output
{
  "description": "JavaScript lies at the heart of almost every modern web application, from social apps like Twitter to browser-based game frameworks like Phaser and Babylon. Though simple for beginners to pick up and play with, JavaScript is a flexible, complex language that you can use to build full-scale applications",
  "embeddings": [
    -0.0070322175,
    0.031458195,
    0.02097213,
    ...,
    0.019307466,
    -0.037907124
  ]
}

Empty/Null text to encode

The text provided in the source fields must not be null or empty. If the text is invalid, the processor ignores it and the output is an empty array.

  • Processor
{
  "name": "Embeddings processor",
  "type": "open-ai-processor",
  "sourceField": "description",
  "output": "embeddings",
  "user": "pdp-embeddings",
  "backoffType": "constant",
  "backoffInitialDelay": "PT10S",
  "backoffMaxRetries": 25,
  "credentialId": "b0fa9c0f-912f-4345-88cf-1d78bdf462f1",
  "id": "fa16215c-1ff1-4233-8519-bb763144b7e9"
}
  • Input
{
  "description": ""
}
  • Output
{
  "description": "",
  "embeddings": []
}

Action: Completion

Uses Open AI's API chat completion feature (see here) to get a response from a given prompt.

Important Note: Regardless of the number of records configured per job, a separate request to the OpenAI API is made for each record. The number of API calls can therefore equal the number of records, which may impact your API usage and, if you are using a paid API, your billing. Consider this when configuring and running your jobs.

Configuration

Example configuration in a processor:

{
  "type": "open-ai-processor",
  "name": "Chat Completion",
  "model": "gpt-3.5-turbo",
  "promptField": "prompt",
  "output": "response",
  "temperature": "0.8",
  "maxPromptTokens": 8192,
  "user": "pdp-chat-completion",
  "timeout": "PT120S",
  "backoffType": "exponential",
  "backoffInitialDelay": "PT10S",
  "backoffMaxRetries": 7,
  "credentialId": "b0fa9c0f-912f-4345-88cf-1d78bdf462f1"
}

Configuration parameters:

promptField

(String) Field with the prepared prompt for chat completion. The prompt must contain the context if needed.

To include the contents of other fields in the prompt, use a field mapper template or a script processor to insert the values into the prompt field as needed.
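As a sketch, a record reaching this processor could already carry a fully assembled prompt, built by an earlier processor from other fields (the title field and wording below are hypothetical):

```json
{
  "title": "What is OpenSearch?",
  "prompt": "Summarize the article titled 'What is OpenSearch?' in two sentences."
}
```

The processor reads only the promptField value; other fields pass through to the output unchanged, as in the examples below.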

output

(Optional, String) The record field where the response will be stored. Defaults to chatGptMessage

model

(Optional, String) The Open AI chat completion model to use in requests, defaults to gpt-3.5-turbo. See Model endpoint compatibility (see here).

temperature

(Optional, String of a Double) What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. Defaults to "0.8"

maxPromptTokens

(Optional, Integer) The maximum number of tokens the prompt field can contain. This limit exists because prompts over 8192 tokens cause the API to return no answer even though the request is still made. Defaults to 8192.

timeout

(Optional, String) The timeout on chat completion requests. Expressed in Duration notation, defaults to PT120S

user

(Optional, String) The user to be included in chat completion requests. If specified as null, no user will be included in requests. If not specified at all, defaults to pdp-content-processor.

backoffType

(Optional, String) The type of backoff applied to retries when checking whether a cooldown has passed. Options are none, constant, and exponential. Defaults to constant.

backoffInitialDelay

(Optional, String) The initial delay between backoff checks for a passed cooldown. Expressed in Duration notation, defaults to PT10S

backoffMaxRetries

(Optional, Integer) Maximum number of times a processor tries to check for a passed cooldown before failing a job. Defaults to 25

Input/Output examples

  • Processor
{
  "name": "Chat Completion Action",
  "type": "open-ai-processor",
  "credentialId": "b0fa9c0f-912f-4345-88cf-1d78bdf462f1",
  "model": "gpt-3.5-turbo-16k",
  "promptField": "prompt",
  "output": "response",
  "temperature": 0.3,
  "user": "pdp-chat-completion-all",
  "timeout": "PT120S",
  "backoffType": "exponential",
  "backoffInitialDelay": "PT10S",
  "backoffMaxRetries": 7
}
  • Input
{
  "prompt": "For the title given 'Guest Blog: What is OpenSearch? - Pureinsights', generate questions that can be asked."
}
  • Output
{
  "prompt": "For the title given 'Guest Blog: What is OpenSearch? - Pureinsights', generate questions that can be asked.",
  "response": "1. What is the main purpose of OpenSearch?\n2. How does OpenSearch differ from other search engines?\n3. What are the key features of OpenSearch?\n4. How can businesses benefit from using OpenSearch?\n5. What are the potential challenges or limitations of implementing OpenSearch?\n6. How does OpenSearch ensure data privacy and security?\n7. Can OpenSearch be integrated with existing systems and applications?\n8. What is the process for setting up and configuring OpenSearch?\n9. How does OpenSearch handle scalability and performance issues?\n10. Are there any case studies or success stories of organizations using OpenSearch effectively?"
}

Credentials

This processor requires a special type of credentials because authentication to the Open AI API is done through an API token. The credentials also allow some configuration of the request cooldown functionality, which is explained later. The processor expects the presence of the following values in the config of the credential:

{
  "config": {
    "token": "<AnAPIToken>",
    "organizationId": "<AnOrgId>",
    "requestCooldown": "PT60S"
  }
}

Configuration parameters:

token

(Required, String) The API token used to authenticate to Open AI's API.

organizationId

(Optional, String) An organization id to be included with requests to Open AI's API. If null, or not present, no organization id will be included in requests.

requestCooldown

(Required, String) When a request made by a processor using these credentials returns a rate limit error (see here), requests using this credential (even in other processors) will be put in cooldown and rejected for the length of this Duration. The shared request cooldown is, however, limited to processors with the same timeout configuration value. Defaults to PT65S

And so a full credential that can be used with this processor looks like this:

{
  "type": "open-ai-component",
  "name": "chat-gpt personal credentials",
  "description": "Credentials for ChatGPT",
  "config": {
    "token": "<AnAPIToken>",
    "organizationId": "<AnOrgId>",
    "requestCooldown": "PT60S"
  }
}

The credential must be of type open-ai-component; otherwise, processors using it will fail.

Request cooldown

Open AI processors come with built-in request cooldown functionality to handle and minimize rate limit errors returned by the Open AI API. Once a request made by a processor returns a rate limit error, requests from all processors that share the same credential config values and timeout config value are put in cooldown for the length of time specified in the credential. The cooldown can be applied again if, once the first cooldown has passed, a new request receives a rate limit error.

Each individual processor can be configured to check for the passing of this cooldown in a different manner through its backoff-related config values. Note that this backoff only affects how many times, and how often, the check for a passed cooldown is made, not for how long requests will be rejected because of the cooldown.

It is recommended to configure the backoff so that processors only check for the end of the cooldown for a reasonable amount of time before failing due to the maximum number of retries. If a rate limit error is caused by a daily token limit or insufficient credit, the cooldown will simply be reapplied again and again, and the processor could get stuck.

Note that the request cooldown mechanism is not perfect: requests made very close together in time by two or more processors sharing a cooldown can all receive a rate limit error, possibly increasing the time Open AI's API keeps returning the same error. However, the requests have to be made at almost the same time for this to happen, so it should be uncommon. No matter how many requests receive the rate limit error, only one cooldown is applied at a time.

Because rate limiting on Open AI's side applies at the organization level first and the API token second, it is recommended that processors sharing the same API token and organization id also share the same requestCooldown value in the credential and the same timeout value in their configs. They will then share a request cooldown, and fewer rate limit errors will be returned.
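For instance, two processors referencing the same credential with identical timeout values will share one cooldown. A sketch (the names are hypothetical):

```json
[
  {
    "name": "Embeddings A",
    "type": "open-ai-processor",
    "sourceField": "description",
    "timeout": "PT120S",
    "credentialId": "b0fa9c0f-912f-4345-88cf-1d78bdf462f1"
  },
  {
    "name": "Completion B",
    "type": "open-ai-processor",
    "promptField": "prompt",
    "timeout": "PT120S",
    "credentialId": "b0fa9c0f-912f-4345-88cf-1d78bdf462f1"
  }
]
```

If either processor receives a rate limit error, requests from both are put in cooldown for the credential's requestCooldown duration.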

Additional Reference

Open AI's public API reference

©2024 Pureinsights Technology Corporation