Version 1.15.0

Release Notes

Release Date: May 10th, 2024

What’s New

In this version, we've improved our Website Connector to support more configuration options, adding flexibility in how data is gathered from your sites. The HTML content processor now lets you extract multiple matches from a single selector, with a choice of output formats. Finally, the Staging Hydrator now lets you select which field to use as the record ID when hydrating data into our Staging Repository.

Feel free to contact our development team with any questions, suggestions, or feedback. We value your input and are committed to continuously improving our services.

New Features

Components

Website Connector

  • Option to add custom Norconex Importer configurations.

  • Support for Forms authentication.

  • Option to set the maximum number of documents per crawl URL.

HTML Processor

  • Option to extract multiple selector matches into a single string or an array.
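The difference between the two output modes can be illustrated with a minimal Python sketch. It is not the processor's implementation or configuration syntax: the `extract` function, its `output` and `separator` parameters, and the use of a plain tag name in place of a full selector are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collects the text content of every element with a given tag name."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0      # > 0 while inside a matching element
        self.matches = []   # one entry per top-level matching element

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1
            if self.depth == 1:
                self.matches.append("")

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.matches[-1] += data


def extract(html, tag, output="array", separator=", "):
    """Hypothetical helper: return all matches as a list or a joined string."""
    collector = TagTextCollector(tag)
    collector.feed(html)
    collector.close()
    matches = [m.strip() for m in collector.matches]
    if output == "array":
        return matches
    if output == "string":
        return separator.join(matches)
    raise ValueError(f"unknown output mode: {output}")


page = "<ul><li>alpha</li><li>beta</li></ul>"
as_array = extract(page, "li", output="array")
as_string = extract(page, "li", output="string")
```

In this sketch, the array mode keeps each match as a separate value, while the string mode joins them with the separator.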

Staging Hydrator

  • The field used as the record ID when hydrating is now configurable.
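As a rough illustration of what a configurable record ID means, the Python sketch below keys each record by a caller-chosen field. The `hydrate` function, the `id_field` parameter, and the in-memory dictionary standing in for the Staging Repository are hypothetical and do not reflect the Staging Hydrator's actual configuration.

```python
def hydrate(records, id_field="id"):
    """Hypothetical sketch: store each record under the value of id_field."""
    staged = {}
    for record in records:
        if id_field not in record:
            # Assumption: a record without the chosen field is rejected.
            raise KeyError(f"record has no field '{id_field}': {record}")
        staged[str(record[id_field])] = record
    return staged


documents = [
    {"id": 1, "sku": "A-100", "title": "First"},
    {"id": 2, "sku": "B-200", "title": "Second"},
]

by_default_id = hydrate(documents)              # keyed by "id"
by_sku = hydrate(documents, id_field="sku")     # keyed by "sku"
```

Choosing a different field changes only the key under which each record is stored, not the record content.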

Bugs Fixed

  • None.

Improvements

  • Discovery API Post component timeout is now configurable.

  • Seeds can now be configured to trigger other seeds when the execution is done.

Breaking changes

  • None.

Others

  • None.

Supported Versions

Discovery provides support and bug fixing for the following versions:

  • Version 1.14.0

    • Elasticsearch connector option to crawl an aggregation.

    • Language Detector option to have multiple source fields.

  • Version 1.13.0

    • Discovery 1.12.0 was tested in AKS v1.28 with Azure CNI.

    • Discovery 1.11.0 was tested in GKE v1.28.4 with GCP.

  • Version 1.12.0

    • If upgrading from a version prior to 1.7.0, export the existing configuration, delete the configuration indices (ingestion: seed, processor, and pipeline / discovery: endpoint and processor), and re-import the configuration after the upgrade has been performed.

    • Discovery 1.12.0 was tested in AKS v1.28 with Azure CNI.

Deprecated Versions

Releases that are no longer recommended for use and their deprecation dates are listed below.

  • Version 1.11.0 (May 2024)

    • If upgrading from a version prior to 1.7.0, export the existing configuration, delete the configuration indices (ingestion: seed, processor, and pipeline / discovery: endpoint and processor), and re-import the configuration after the upgrade has been performed.

    • Trim UUIDs on API

    • Discovery 1.11.0 was tested in GKE v1.28.4 with GCP.

  • Version 1.10.0 (February 2024)

    • If upgrading from a version prior to 1.7.0, export the existing configuration, delete the configuration indices (ingestion: seed, processor, and pipeline / discovery: endpoint and processor), and re-import the configuration after the upgrade has been performed.

    • JSON exception when script processor compilation fails

    • BatchId is stored as null

    • OCR component records don't continue in the pipeline

    • Internal server error importing a zip file

    • [Search UI] Queries with special regex characters cause the page to fail

    • Scheduler entity is ignoring some properties from the configuration

    • Website Connector requires a mountpoint to work in k8s

    • Hugging Face service creates outdated file structure

  • Version 1.9.0 (January 2024)

    • If upgrading from a version prior to 1.7.0, export the existing configuration, delete the configuration indices (ingestion: seed, processor, and pipeline / discovery: endpoint and processor), and re-import the configuration after the upgrade has been performed.

    • Improvements in JsonUtils.substitute

    • Token scanner writes checksums to the scan index.

    • Staging Connector - Aggregation by group doesn't support Arrays

    • Discovery 1.9.0 was tested in AKS v1.25.11, v1.26.6, and v1.27.1

©2024 Pureinsights Technology Corporation