LLM Generate
LLM Generate is the primary object of Astera’s AI offerings. When combined logically with other objects, it can be used to create AI-powered solutions.
LLM Generate allows the user to retrieve an output from an LLM, based on the input prompt. The user can select from a choice of LLM providers, including OpenAI and Llama, and also has the option to use custom LLM models.
It has an input port and an output port.
The input port allows us to map the fields that we want to include in the input prompt sent to the LLM to generate the result.
The output port is populated with the result of LLM Generate.
LLM Generate can be used in countless use cases to generate unique applications. Here, we will cover a basic use case, where LLM Generate will be used to create an invoice extraction solution.
The source file is a PDF invoice. In the output, we want the extracted data from the invoice in a structured JSON file. We want to create a flexible extraction solution that can take invoices of various unpredictable layouts and generate the JSON output in a fixed format.
Some possible layouts of the input invoice:
Create a new dataflow. Here we will design our invoice extraction pipeline.
To read the unstructured invoice as a source in our pipeline, we can drag and drop the Text Convertor object from the Sources section of the toolbox. Configure it by providing the path of your source file.
In the output port of the source object, we have the entire content of the PDF file as a single string.
This string can now be mapped to the LLM Generate object as input, along with our instructions in the prompt to generate the output.
To do this, we will drag and drop the LLM Generate object from the AI section of the toolbox onto the dataflow designer.
To use an LLM Generate object, we need to map one or more input fields and define a prompt. In the Output node, we get the response of the LLM model, which we can map to downstream objects in our data pipeline. The other configurations of LLM Generate are set to defaults but may be adjusted if required by the use case.
As the first step, we will map our input fields to the LLM Generate object’s input node. We can map any number of input fields as required by our use case. For our use case, we will map a single input field, the invoice text from the Text Convertor. This field will have the invoice content as a string. We can rename the input fields, if needed, inside the LLM Generate object.
The next step is to write the prompt that will act as a set of instructions to the LLM for the response we would like in the output. Go into the properties of the LLM Generate object, right-click on the Prompts node, and select ‘Add Prompt’. You can also use the ‘Add Prompt’ button at the top of the layout window.
A Prompt node will appear containing the Properties and Text fields.
Prompt Properties
Properties are set by default. Clicking the Properties field opens the Prompt options. The default settings are as shown in the image below:
Run Strategy Type: Defines how the object executes based on the input.
Once Per Item means that the object will run once per input record. This option is used in cases where the input has multiple records and LLM Generate is to be executed for each record. The output of LLM Generate will have the same number of records as the input.
Chain means that the object will feed the output of one prompt as input to the subsequent prompt within the LLM Generate object. To use the result of the previous prompt within the current prompt, use the syntax {LLM.LastPrompt.Result}. To use the text of the previous prompt within the current prompt, use the syntax {LLM.LastPrompt.Text}.
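For example, a pair of chained prompts could look like the following (the input field name and wording are purely illustrative):

Prompt 1: Summarize the key details of the following invoice text: {Input.InvoiceText}

Prompt 2: Convert the following summary into a JSON object: {LLM.LastPrompt.Result}

With the Chain strategy, the result of the first prompt is substituted into the second prompt before it is sent to the model.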
Conditional Expression: Here, you can provide the condition that must be satisfied for this prompt to be used in the LLM Generate execution. It works in conjunction with multiple prompts, in cases where one of several prompts is to be used based on some criteria.
For our use case, we have used the default settings of the Prompt Properties.
Prompt Text
Prompt text allows us to write the prompt that is sent to the LLM model to get the response in the output.
In the prompt, we can include the contents of the input fields using the syntax:
{Input.field}
In the above syntax, replace field with the name of the input field.
We can also use functions to customize our prompt by clicking the functions icon.
For instance, the following syntax will resolve to the first 1000 characters of the input field value in the prompt:
{Left(Input.field,1000)}
For our use case, we will write a prompt that instructs the LLM to extract data from the provided invoice and generate the output in the JSON structure we have provided in the prompt.
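As an illustration, a prompt along these lines could be used (the input field name and the field list are examples only and should be adapted to your own layout; in the documented flow, the full JSON sample itself is placed in the prompt):

Extract the data from the invoice text below and return it only as a JSON object with the fields InvoiceNumber, InvoiceDate, VendorName, TotalAmount, and a LineItems array containing Description, Quantity, UnitPrice, and Amount. Do not include any additional commentary.

Invoice text: {Input.InvoiceText}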
Click Next to move to the next screen. This is the LLM Generate Properties screen.
General Options
We can select the AI provider and model for our object. Additionally, we can use LLM models not included in the list, or a custom LLM model, by configuring their API connection as part of a shared connection file inside the project. However, custom fine-tuned models are only supported when using the OpenAI provider.
AI SDK Options
AI SDK Options allow fine-tuning the output or behavior of the model.
Evaluation Metrics: Enabling this option introduces three additional fields in the output: OutputTokens, LogProbs, and PerplexityScore.
Output Tokens: The total number of tokens in the result generated by LLM Generate. It can help gauge the volume or length of the generated content.
LogProbs: Log-probabilities associated with each generated token. These values represent the likelihood (on a logarithmic scale) of a specific token being generated, based on the model's understanding of the input and context. They reflect the model’s confidence in generating each token.
To understand Log-Probs better, we have a second flow here that identifies the document type from the available options we have provided in the prompt.
We also want a confidence score for the AI model's result. For this, we can parse the LogProbs and calculate the exponential. This linear probability calculation is only possible when there is a single logprob value, which means the output must be set to generate a single token. It is useful for classification cases such as this one, or wherever a Boolean response is expected.
Applying the exponential to the logprob converts it into a linear probability. In the output, we will have the result and the linear probability, or confidence score. A value closer to 1 means higher confidence in the result.
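To make the calculation concrete, here is a minimal sketch of the math in plain Python (not Astera expression syntax; the logprob value is hypothetical):

import math

logprob = -0.105                        # logprob of the single generated token (hypothetical value)
linear_probability = math.exp(logprob)  # the exponential converts the log scale back to a probability
print(round(linear_probability, 3))     # ~0.9, i.e. roughly 90% confidence in the result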
PerplexityScore: It measures how well a language model predicts a sequence of words. Lower perplexity (closer to 1) means better predictions, while higher perplexity indicates greater uncertainty in predicting the next word.
Max Tokens: Limits the number of output tokens. Each model has its own maximum token limit, and we need to set our limit within that threshold.
Temperature: It controls randomness in model predictions. At temperature 0 (default), the model's output is deterministic and consistent, always choosing the most likely token. This also means the model will always produce the same output for the same input. Higher temperatures increase randomness and creativity in the output.
Top P: Also called nucleus sampling, this controls the diversity of generated output by selecting tokens from a subset of the most likely options. A Top P of 0.1 (default) means the model will only consider the smallest set of tokens whose cumulative probability is at least 10%. This significantly narrows down the possible token choices, making the output more focused and less random. Increasing the Top P results in less constrained and more creative responses.
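These options broadly correspond to the same-named parameters exposed by the underlying providers. As a rough illustration only (this is not how Astera invokes the provider), a comparable request through OpenAI's Python SDK might look like this:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Classify the document type: ..."}],
    temperature=0,   # deterministic: always pick the most likely token
    top_p=0.1,       # nucleus sampling: smallest token set with at least 10% cumulative probability
    max_tokens=1,    # cap the output length, e.g. a single token for classification
    logprobs=True,   # return per-token log-probabilities
)
print(response.choices[0].message.content)              # generated token(s)
print(response.choices[0].logprobs.content[0].logprob)  # logprob of the first generated token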
To understand these options better, click here.
For our primary use case of invoice extraction, we are using OpenAI gpt-4 and the default configurations for the other options on the Properties screen to generate the result.
Now, we’ll click OK to complete the configuration of the LLM Generate object. We can preview the output to confirm that we are getting the desired response.
Now we want to write this text output to a JSON file. We will first drag and drop a JSON Parser onto the designer. We will map the output field of the LLM Generate object to the input field of the JSON Parser object.
Open the Properties of the JSON Parser. On the Layout screen, we can create our preferred layout or provide a JSON sample to generate the layout automatically. We have copied the same layout we provided in the prompt and pasted it into the ‘Generate Layout by Providing Sample Text’ option in the JSON layout window.
Once the JSON Parser object is configured, drag and drop a JSON Destination object, configure its file path, and map all the fields from the JSON Parser output.
Our dataflow is now configured, and we can run it to create the JSON file for our invoice.
To automate the process for extracting multiple invoices, we will create a workflow. To parameterize the source and destination file paths, we will add and configure a Variables object in our dataflow.
In our workflow, we’ll configure three objects:
File System Items Source: Provide the folder path where all of our invoices are stored.
Expression: To generate an output JSON file path for each invoice using the source file name (the intended path logic is sketched after this list).
Run Dataflow: Provide the file path for the pre-configured dataflow.
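As a rough sketch of the path logic the Expression object implements (plain Python for clarity, not Astera expression syntax; the folder paths are hypothetical):

from pathlib import Path

source_path = Path("C:/Invoices/Input/invoice_001.pdf")     # hypothetical source invoice
output_folder = Path("C:/Invoices/Output")                  # hypothetical destination folder
output_path = output_folder / (source_path.stem + ".json")  # reuse the source file name with a .json extension
print(output_path)                                          # .../Output/invoice_001.json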
Once our workflow is configured, we can run it to extract data and write to JSON files for all of our invoices.
The flexibility of LLM Generate, taking any input along with natural-language instructions on how to transform it into the desired output, makes it a dynamic, universal transformation object in a data pipeline. There can be countless use cases for LLM Generate; we will cover some of these in the next documents.