Version 1.7.0

Release Notes

Release Date: April 25th, 2023

What’s New

We continue to improve our products. In this release, we focused on bug fixes, minor improvements, and dependency upgrades for better performance and security. Our Binary Data Storage Service now also supports Amazon S3. As always, if you have any questions or feedback, please don't hesitate to reach out to our development team.

Read on for all new features, bug fixes, and improvements.

New Features

Components

Binary Data Storage

  • Binary data storage service now supports S3.

Bugs Fixed

  • Close record collector in AbstractRequestActionExecutor for errored jobs

  • Incorrect log message from Staging Hydrator

  • Error importing zip

  • [Discovery API] Feedback Atlas default template

  • [Ingestion] Possible memory leakage in Elasticsearch connector

  • ES Connector: Checksum is not hashed before adding a record.

  • Search UI showing more search result pages than required

  • Job in the wrong state should be retried

Improvements

  • Binary data server implementation in S3

  • Change spacy parameters collapsePunctuation and collapsePhrases to booleans

  • Bump of Dependencies

  • [Discovery API] The composite endpoint is not working with target FSM endpoint type

  • [Discovery-Ingestion] Component 'config' unbounded by Elasticsearch storage mapping

Breaking changes

  • If upgrading from an older version, export the existing configuration, delete the configuration indices (ingestion: seed, processor, and pipeline / discovery: endpoint and processor), and re-import the configuration after the upgrade has been performed.

Others

  • Apache Commons Collections updated to version 4.4

  • Apache Commons IO updated to version 2.11.0

  • Apache Commons Lang updated to version 3.12.0

  • Apache Commons Pool2 updated to version 2.11.1

  • AWS Java SDK S3 updated to version 1.12.412

  • Cron Utils updated to version 9.2.0

  • Fabric8 updated to version 6.4.1

  • Groovy updated to version 3.0.14

  • Hoverfly updated to version 0.14.3

  • Jackson updated to version 2.14.1

  • JSONPath updated to version 2.7.0

  • Jython updated to version 2.7.3

  • OkHTTP3 Mock Web Server updated to version 4.10.0

  • MongoJack updated to version 4.8.0

  • MongoDB updated to version 4.8.2

  • Nashorn updated to version 15.4

  • OkHTTP3 updated to version 4.10.0

  • Quartz updated to version 2.3.2

  • RabbitMQ updated to version 5.16.0

  • Reflections updated to version 0.10.2

  • Resilience4j updated to version 2.0.2

  • Rhino updated to version 1.7.14

  • Snappy updated to version 1.1.9.0

  • SSHj updated to version 0.34.0

  • Zstd updated to version 1.5.4-2

  • EvalEx updated to version 3.0.3

  • SystemStubs Jupiter updated to version 2.0.2

  • Java JWT updated to version 4.2.2

  • JWKS RSA updated to version 0.21.3

  • Lingua updated to version 1.2.2

  • Azure Blob Storage updated to version 12.20.2

  • Azure Identity updated to version 1.8.1

  • MySQL updated to version 8.0.31

  • MySQL Test updated to version 1.17.6

  • AWS Java SDK S3 updated to version 1.12.430

  • Norconex HTTP Collector updated to version 3.0.0

  • Apache Bridge updated to version 2.7.5

  • Open CSV updated to version 5.7.1

  • JMESPath Jackson updated to version 0.5.1

  • JSoup updated to version 1.15.4

  • Lingua updated to version 1.2.2

  • Tesseract For Java updated to version 5.5.0

  • Groovy updated to version 4.0.10

  • Jython updated to version 2.7.3

  • Javascript Engine updated to version 22.3.0

  • Apache Tika Core updated to version 2.6.0

  • Neo4j Driver updated to version 5.5.0

  • OpenSearch updated to version 2.5.0

Supported Versions

Discovery provides support and bug fixes for the following versions:

  • Version 1.6.0

    • Pipeline configuration should not allow null action

    • Ingestion Admin should allow deep cloning multiple times

    • Single seed schedules should not be enqueued if the same seed is already running

    • Discovery API - Endpoints should handle an empty body request

    • Discovery API - Mongo Component should store response body as JsonNode

    • Discovery API - Snap Component not casting error message when facets field points to an array

    • Breaking changes:

      • When configuring the S3 connector, the region must be in the format AWS requires.

        • e.g. “us-east-1” instead of “US_EAST_1”

  • Version 1.5.0

    • Internal server should not error when adding item with no "body" to Staging repo

    • JsonUtils cannot substitute value properties of intNode objects when they are inside an array

  • Version 1.4.0 (If you are on this version, plan to upgrade soon)

    • Environment variables are not substituted in processors if there are no seed properties

    • Credentials Service cannot deserialize value '_id' from cache

    • Distilbert by HuggingFace Service causes node to reset
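The Version 1.6.0 breaking change above requires S3 region values in the AWS identifier format ("us-east-1") rather than the enum-style name ("US_EAST_1"). As a rough illustration, existing values could be converted before upgrading with a small helper like the one below. The class and method names are hypothetical and not part of Discovery, and the sketch assumes the old value is simply the region identifier in upper case with underscores; names that do not follow that pattern are not handled.

```java
import java.util.Locale;

// Hypothetical helper, not part of Discovery: converts an enum-style region
// name such as "US_EAST_1" into the identifier format AWS uses ("us-east-1").
final class S3RegionFormat {

    private S3RegionFormat() {
    }

    static String toAwsRegionId(String region) {
        if (region == null || region.trim().isEmpty()) {
            throw new IllegalArgumentException("region must not be blank");
        }
        // Lowercase with a fixed locale and swap underscores for hyphens.
        return region.trim().toLowerCase(Locale.ROOT).replace('_', '-');
    }

    public static void main(String[] args) {
        System.out.println(toAwsRegionId("US_EAST_1")); // prints us-east-1
        System.out.println(toAwsRegionId("us-east-1")); // already valid, printed unchanged
    }
}
```

Values that are already in the AWS format pass through unchanged, so the conversion can be applied to every stored region without checking which form it is in first.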

Deprecated Versions

Releases that are no longer recommended for use and their deprecation dates are listed below.

  • Version 1.3.0 (April 2023)

    • Discovery API’s Post Component was failing to parse the response from Elasticsearch

    • Prevent creation of seeds with duplicate name

    • Mongo credential source should default to “admin”

    • Failed to encode 'Credential'. Encoding '_id' errored with: Can't find a codec for class java.lang.Object

    • Import fails with unhelpful error message

    • Aggregation merge should support empty values and fields other than just content

  • Version 1.2.0 (March 2023)

    • SSL error when using UDEMY connector has been fixed.

    • Disk cache was not getting cleared when a crawl fails.

  • Version 1.1.0 (February 2023)

    • Breaking changes:

      • “recordData” parameter has been changed to “recordDataStrategy”.

      • To use “recordDataStrategy”, it must be set outside of the config block.

  • Version 1.0.0 or prior (December 2022)

©2024 Pureinsights Technology Corporation