Vector DB Destination

Overview

The VectorDB Destination object in Astera lets you write data to vector databases for storage and retrieval. It provides configuration options for the database connection, the embedding provider, and the way records are loaded.

Getting the VectorDB Destination Object

  1. To add a VectorDB Destination object to your dataflow, go to Toolbox > Destinations > VectorDB Destination. If you are unable to see the Toolbox, go to View > Toolbox or press Ctrl + Alt + X.

  1. Drag and drop the VectorDB Destination object onto the designer.

Configuring the VectorDB Destination Object

  1. To configure the properties of the VectorDB Destination object, right-click on the header and select Properties from the context menu.

  1. This opens the Vector Database Connection window, where you define the connection to the vector database provider.

  • The provider is fixed to Pinecone.

  • Environment: Specify the environment of the vector database you want to connect to.

  • API Key: Enter the API key required for secure access to the vector database.

  1. Now, you need to configure the vector embedding provider.

  • In the Vector Embedding Connection window, the embedding provider is fixed to OpenAI.

  • API Key: Enter the API key required for secure access to the embedding provider.

  • Embedding Models: Select the embedding model to use for data storage and retrieval.

  • The model dimensions are set to 1536 by default for compatibility with the vector database.

  • You can also use the Recently Used drop-down list to connect to a recently connected provider.
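Because the vectors written to an index need to have the dimension that the index expects, it can help to check them before loading. The following is a minimal Python sketch of such a check, not Astera's implementation. `EXPECTED_DIMENSION` and `check_dimension` are hypothetical names introduced for illustration, and 1536 is the default value stated above.

```python
from typing import List, Sequence

# Default model dimension stated in the configuration above.
EXPECTED_DIMENSION = 1536


def check_dimension(vector: Sequence[float], expected: int = EXPECTED_DIMENSION) -> None:
    """Raise ValueError if the vector length differs from the expected dimension."""
    if len(vector) != expected:
        raise ValueError(
            f"vector has {len(vector)} dimensions, expected {expected}"
        )


# Hypothetical embeddings: one with the expected length, one that is too short.
good_vector: List[float] = [0.0] * 1536
bad_vector: List[float] = [0.0] * 768

check_dimension(good_vector)  # passes silently

try:
    check_dimension(bad_vector)
    dimension_error = ""
except ValueError as exc:
    # Keep the message so the mismatch can be reported.
    dimension_error = str(exc)
```

A check like this fails early, before any records are sent to the destination.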

  1. The next window is the Vector Database Pick Index window. Here, you can choose from the following options:

  • Pick Index: To append data into an existing index.

  • Create/Replace: To write data to a new index or replace an existing index.

  • Insert options:

    • Upsert: Update existing records if found, else insert new records.

    • Update: Update existing records in the Vector database.

    • Delete: Delete records from the Vector database.

  • Bulk Insert Options:

    • Bulk insert with batch size: Loads the whole dataset in batches of the specified size. Larger batch sizes typically give better transfer speeds, but the gains diminish at very large batch sizes.

    • Bulk insert with all records in one batch: Loads all records in a single batch. In this case, any database-specific error in the transfer will not be shown until the end of the transfer.

    • Use single record insert: Loads records into the destination one by one. This is the slowest of the three insert types, but any errors or warnings are displayed immediately as the transfer progresses.
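The write modes and batch options above can be illustrated with a small in-memory model. This is a sketch of the general semantics only, assuming an index behaves like a mapping from record ID to vector. It does not call Pinecone or Astera, and the names `upsert`, `update`, `delete`, and `batches` are hypothetical.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

Record = Tuple[str, List[float]]  # (record ID, embedding vector)
Index = Dict[str, List[float]]    # assumed stand-in for a vector index


def upsert(index: Index, records: Sequence[Record]) -> None:
    """Update records whose ID already exists, insert the rest."""
    for record_id, vector in records:
        index[record_id] = vector


def update(index: Index, records: Sequence[Record]) -> int:
    """Update only IDs already present; return how many were updated."""
    updated = 0
    for record_id, vector in records:
        if record_id in index:
            index[record_id] = vector
            updated += 1
    return updated


def delete(index: Index, record_ids: Sequence[str]) -> int:
    """Remove the given IDs if present; return how many were removed."""
    removed = 0
    for record_id in record_ids:
        if index.pop(record_id, None) is not None:
            removed += 1
    return removed


def batches(records: Sequence[Record], batch_size: int) -> Iterator[List[Record]]:
    """Yield consecutive slices of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield list(records[start:start + batch_size])


# Hypothetical data: five records with one-dimensional vectors.
index: Index = {}
records: List[Record] = [(f"doc-{i}", [float(i)]) for i in range(5)]

# Bulk insert with batch size 2 produces batches of 2, 2, and 1 records.
batch_sizes = []
for batch in batches(records, batch_size=2):
    batch_sizes.append(len(batch))
    upsert(index, batch)

# Update touches only the ID that already exists.
updated_count = update(index, [("doc-0", [9.0]), ("doc-99", [1.0])])
removed_count = delete(index, ["doc-4", "doc-99"])
```

In this model, a batch size of 1 corresponds to single record insert, and a batch size equal to the number of records corresponds to loading everything in one batch.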

  1. The next window is the Layout Builder. Here, you can modify the layout of the vector database destination.

  1. Once the object layout is configured, click Next. This opens the Config Parameters window, where you can define parameters for the destination.

Parameters make flows easier to deploy by eliminating hardcoded values, and they let you change multiple configurations at runtime with a single value change.

Note: Parameters left blank will use their default values assigned on the properties page.
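The fallback rule in the note can be sketched as follows. This is an illustration of the idea only, assuming parameters are plain name/value pairs. `resolve_parameters` and the parameter names `environment` and `index_name` are hypothetical and are not taken from Astera.

```python
from typing import Dict, Optional


def resolve_parameters(
    defaults: Dict[str, str], overrides: Dict[str, Optional[str]]
) -> Dict[str, str]:
    """Return defaults with every non-blank override applied."""
    resolved = dict(defaults)
    for name, value in overrides.items():
        if name not in defaults:
            raise KeyError(f"unknown parameter: {name}")
        # A blank parameter (None, empty, or whitespace) keeps the default.
        if value is not None and value.strip() != "":
            resolved[name] = value
    return resolved


# Hypothetical values set on the properties page.
properties_page = {"environment": "dev-environment", "index_name": "articles"}

# Hypothetical runtime parameters: one is set, one is left blank.
runtime_values = {"environment": "prod-environment", "index_name": ""}

resolved = resolve_parameters(properties_page, runtime_values)
```

Here only `environment` changes at runtime, while the blank `index_name` keeps the value from the properties page.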

  1. Click Next. A General Options window will appear. Here you have the following options:

  • Comments: Add comments for the object.

  • General Options: These options control how records are processed in the destination.

    • Clear Incoming Record Messages: Clears any messages coming in from objects preceding the current object.

    • Do Not Process Records With Errors: Prevents erroneous records from being processed further for the output.

    • Do Not Overwrite Default Values with Nulls: Ensures that default values are not overwritten with null values in the output.
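One plausible reading of the last two options is sketched below. The record shape (a dictionary with `values` and `errors`) and the function `process_records` are assumptions made for illustration and do not reflect how Astera represents records internally.

```python
from typing import Any, Dict, List


def process_records(
    records: List[Dict[str, Any]],
    defaults: Dict[str, Any],
    skip_records_with_errors: bool,
    keep_defaults_on_null: bool,
) -> List[Dict[str, Any]]:
    """Build output rows from incoming records under the two options."""
    output = []
    for record in records:
        # "Do Not Process Records With Errors": drop records carrying errors.
        if skip_records_with_errors and record["errors"]:
            continue
        row = dict(defaults)
        for field, value in record["values"].items():
            # "Do Not Overwrite Default Values with Nulls": keep the default
            # when the incoming value is null and a default exists.
            if value is None and keep_defaults_on_null and field in defaults:
                continue
            row[field] = value
        output.append(row)
    return output


# Hypothetical incoming records and defaults.
incoming = [
    {"values": {"id": "a", "category": None}, "errors": []},
    {"values": {"id": "b", "category": "news"}, "errors": ["invalid date"]},
]
field_defaults = {"category": "unknown"}

with_options = process_records(incoming, field_defaults, True, True)
without_options = process_records(incoming, field_defaults, False, False)
```

With both options enabled, only record `a` is written and it keeps the default category. With both disabled, both records are written and the null category replaces the default.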

  1. Click OK.

The VectorDB Destination object is now configured according to the settings made in the properties window. Run the dataflow to write data to the destination.

© Copyright 2023, Astera Software