# LLM Generate

## Overview

LLM Generate is a core component of Astera’s AI offerings, enabling users to retrieve responses from an LLM based on input prompts. It supports providers such as OpenAI, Llama, and custom models, and can be combined with other objects to build AI-powered solutions. The object can be used in any ETL/ELT pipeline by sending incoming data in the prompt and using the LLM’s responses in transformation, validation, and loading steps.

This unlocks the ability to incorporate AI-driven functions into your data pipelines, such as:

* [Data Classification](#use-case)
* Template-less data extraction
* Natural Language to SQL Generation
* Data Summarization
* Data Augmentation

## Use Case

LLM Generate supports countless use cases for building unique applications. Here, we will cover a simple one, in which LLM Generate is used to classify support tickets.

The source is an Excel spreadsheet with customer support ticket data.

<figure><img src="https://content.gitbook.com/content/zEifS4h8yurLAAwiGNX2/blobs/IlkUZVp8Zw7CPhJjxhRG/1-ExcelSource.png" alt=""><figcaption><p>Excel Source</p></figcaption></figure>

We want to add a category field to the data, which will contain one of the following tags based on the content of the *customer\_message* field:

* Billing
* Technical Issue
* Account Management
* Delivery
* Product Inquiry

This use case requires natural language understanding of the customer message to assign it a relevant category, making it an ideal fit for LLM Generate.

## How To Work with LLM Generate

1. Drag-and-drop an *Excel Workbook Source* object from the toolbox onto the dataflow, as our source data is stored in an Excel file.
2. Now we can use the *customer\_message* field from the *Excel Source* as input to the *LLM Generate* object, along with a prompt instructing the LLM to assign each message a category. \
   To do this, let's drag-and-drop the *LLM Generate* object from the AI section of the toolbox onto the dataflow designer.

<figure><img src="https://content.gitbook.com/content/zEifS4h8yurLAAwiGNX2/blobs/WlVbZzZDBhAQKuGzNae4/LLM%20Generate%202.gif" alt=""><figcaption><p>LLM Generate</p></figcaption></figure>

To use an LLM Generate object, we need to map input field(s) and define a prompt. The Output node contains the LLM's response, which we can map to downstream objects in our data pipeline.

*Other configurations of LLM Generate are set to defaults but may be adjusted if required by the use case.*

3. As the first step, we will auto-map our input fields to the *LLM Generate* object’s input node. We can map any number of input fields as required by our use case. These input fields may or may not be used inside the prompt. Any fields not used in the prompt will still pass through the object and can be used unchanged later in the flow.
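Conceptually, the pass-through behavior described above can be sketched as follows. This is an illustrative sketch, not Astera's implementation, and `llm` is a hypothetical stand-in for the configured provider:

```python
def apply_llm_generate(record: dict, prompt_template: str, llm) -> dict:
    # All mapped input fields pass through unchanged; the LLM's answer
    # is added alongside them as the Result output field.
    output = dict(record)
    output["Result"] = llm(prompt_template.format(**record))
    return output
```

In other words, mapping a field to the input node never consumes it; unreferenced fields simply travel through to the output node next to the generated *Result*.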

<figure><img src="https://content.gitbook.com/content/zEifS4h8yurLAAwiGNX2/blobs/0BFmQMj6e4OdDZmqzWj1/temp.gif" alt=""><figcaption><p>Mapping to LLM Generate</p></figcaption></figure>

4. The next step is to write a prompt that serves as instructions for the LLM to generate the desired output.\
   Right-click the *LLM Generate* object and select *Properties*, then click the *Add Prompt* button at the top of the *LLM Template* window to add a prompt.

<figure><img src="https://content.gitbook.com/content/zEifS4h8yurLAAwiGNX2/blobs/KQCilwWQQScYZmkESjcm/temp.gif" alt=""><figcaption><p>Adding Prompt</p></figcaption></figure>

5. A *Prompt* node will appear containing the *Properties* and *Text* fields.

### Prompt Properties

Prompt properties are set by default. We can click the *Properties* field to view or change these settings. The default configuration is shown in the image below:

<figure><img src="https://content.gitbook.com/content/zEifS4h8yurLAAwiGNX2/blobs/R5tLCtsekm8lSU0WRVe1/4-PromptProperties.png" alt=""><figcaption><p>Prompt Properties</p></figcaption></figure>

Let’s quickly go over what each of these options means:

* *Run Strategy Type*: Defines how the object executes based on the input.
  * *Once Per Item* means the object runs once for each input record. Use this when the input has multiple records and LLM Generate must execute for each one.
  * *Chain* means the object uses the output of one prompt as input for the next within the LLM Generate object. Use `{LLM.LastPrompt.Result}` to reference the previous prompt's output, and `{LLM.LastPrompt.Text}` to reference the previous prompt's text itself.
* *Conditional Expression*: Specifies the condition under which this prompt should be used. Useful when you want to select one prompt from several based on some criteria.
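As a rough illustration (a sketch of the semantics above, not Astera's implementation), the two run strategies behave like this, with `llm` standing in for whichever provider is configured:

```python
def run_once_per_item(records, prompt_template, llm):
    # Once Per Item: the prompt executes once for every input record.
    return [llm(prompt_template.format(**record)) for record in records]

def run_chain(prompts, llm):
    # Chain: each prompt may reference the previous prompt's result or text
    # via the {LLM.LastPrompt.Result} and {LLM.LastPrompt.Text} placeholders.
    last_result, last_text = "", ""
    for prompt in prompts:
        text = (prompt
                .replace("{LLM.LastPrompt.Result}", last_result)
                .replace("{LLM.LastPrompt.Text}", last_text))
        last_result, last_text = llm(text), text
    return last_result
```

The key difference: *Once Per Item* fans out across records, while *Chain* threads a single record's prompts together in sequence.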

For our use case, we will use the default Prompt Properties settings.

### Prompt Text

Prompt text is where we write the instructions that are sent to the LLM to generate the response in the output.

<figure><img src="https://content.gitbook.com/content/zEifS4h8yurLAAwiGNX2/blobs/p4LvDYzpk9SevaBZT1eV/5-PromptText.PNG" alt=""><figcaption><p>Prompt text</p></figcaption></figure>

In the prompt, we can include the contents of the input fields using the syntax: \
`{Input.field}`

Here, `field` is replaced with the name of the input field.
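For example, a classification prompt referencing the *customer\_message* field might look like the following (the exact wording is illustrative, not the prompt used in the screenshots):

```
Classify the following support ticket into exactly one of these categories:
Billing, Technical Issue, Account Management, Delivery, Product Inquiry.

Customer message: {Input.customer_message}

Respond with the category name only.
```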

We can also use functions to customize our prompt by clicking the functions icon.

<figure><img src="https://content.gitbook.com/content/zEifS4h8yurLAAwiGNX2/blobs/tXselpm42jNuZPVd9VUP/6-Functions.PNG" alt=""><figcaption><p>Functions</p></figcaption></figure>

For instance, the following syntax resolves to the first 1,000 characters of the input field's value in the prompt:

`{Left(Input.field,1000)}`
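Assuming `Left` follows the usual string-function semantics (an assumption based on its name rather than a published spec), it behaves like this Python slice:

```python
def left(value: str, n: int) -> str:
    # Keep only the first n characters; shorter strings pass through
    # whole, mirroring {Left(Input.field,1000)} in the prompt.
    return value[:n]
```

Truncating long fields this way is useful for keeping prompts within the model's context limits.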

6. For our use case, we will write a prompt that instructs the LLM to classify the customer message into one of the categories we have provided in the prompt.

<figure><img src="https://content.gitbook.com/content/zEifS4h8yurLAAwiGNX2/blobs/3dbQFE93P12NJZwzYzPI/7-Prompt.PNG" alt=""><figcaption><p>LLM Template</p></figcaption></figure>

7. Click *Next* to move to the next screen. This is the *LLM Generate Properties* screen.

### Properties

Let's see the options provided here:

#### General Options

* *AI Provider:* Select your desired AI provider (e.g., OpenAI, Anthropic, Llama).
* *Shared Connection:* Select the [*Shared API Connection*](https://documentation.astera.com/api-flow/api-consumption/consume/api-connection) you have configured with the provider's API key.
* *Model Type/Model:* Select the model type (if applicable) and choose the specific model you want to use.

<figure><img src="https://content.gitbook.com/content/zEifS4h8yurLAAwiGNX2/blobs/kZKH7sM64tfG6iThZ0nF/8-GeneralOptions.PNG" alt=""><figcaption><p>General Options</p></figcaption></figure>

#### AI SDK Options

AI SDK Options allow us to fine-tune the model's output and behavior. We will discuss these options in detail in the next article.

<figure><img src="https://content.gitbook.com/content/zEifS4h8yurLAAwiGNX2/blobs/uWC7BcCTBTZl7wim3TRy/9-AISDK.PNG" alt=""><figcaption><p>AI SDK Options</p></figcaption></figure>

8. For our use case of support ticket classification, we use *OpenAI* as the provider, *gpt-4o* as the model, and default configurations for the other options on the *Properties* screen.
9. Now, we'll click *OK* to complete the configuration of the *LLM Generate* object.
10. We can right-click the *LLM Generate* object and select *Preview Output* to preview the LLM response and confirm that we are getting the desired result. \
    We can see that the LLM response gives the category of the support ticket based on the customer message.

<figure><img src="https://content.gitbook.com/content/zEifS4h8yurLAAwiGNX2/blobs/hJ2NsklEbxvPSDmJBuQb/10-LLMOutput.PNG" alt=""><figcaption><p>LLM Generate Preview Output</p></figcaption></figure>

## Writing to Destination

11. We can now use the LLM's result in other objects to transform it, validate it, or simply write it out. Let's say we want to write the enriched support ticket data to a CSV destination.
12. To do this, let's drag-and-drop a [*Delimited Destination*](https://documentation.astera.com/dataflows/destinations/delimited-file-destination) from the toolbox.
13. Next, let's map the original fields to the *Delimited Destination* object and create a new field called *Category*. The LLM's *Result* field can be mapped to this *Category* field.

<figure><img src="https://content.gitbook.com/content/zEifS4h8yurLAAwiGNX2/blobs/BtrbEEvkceweLUEYOmrD/11-Destination.PNG" alt=""><figcaption><p>Delimited Destination</p></figcaption></figure>

14. Our dataflow is now configured; we can preview the output of our *Delimited Destination* to see what the final support ticket data will look like.

<figure><img src="https://content.gitbook.com/content/zEifS4h8yurLAAwiGNX2/blobs/cg1rqqZlVUv1TUeP1R3I/11a-FinalOutput.PNG" alt=""><figcaption><p>Delimited Destination Preview Output</p></figcaption></figure>

15. We can also run this dataflow to create the delimited file containing our enriched support ticket data.

<figure><img src="https://content.gitbook.com/content/zEifS4h8yurLAAwiGNX2/blobs/zrtkbh29dKOgocYkNE9F/12-JobProgress.PNG" alt=""><figcaption><p>Job Progress</p></figcaption></figure>

## Summary

LLM Generate's flexibility, taking any input and accepting natural language instructions on how to transform it into the desired output, makes it a dynamic, universal transformation object in a data pipeline. There are countless use cases for LLM Generate; we will cover some of them in the next articles.

