# PipeLLM

`PipeLLM` is the core operator in Pipelex for leveraging Large Language Models (LLMs). It can be used for a wide range of tasks, including text generation, summarization, classification, and structured data extraction.
## How it works

At its core, `PipeLLM` constructs a detailed prompt from various inputs and templates, sends it to a specified LLM, and processes the output. It can produce simple text or complex structured data (in the form of Pydantic models).

For structured data output, `PipeLLM` employs two main strategies:

- **Direct Mode**: The LLM is prompted to directly generate a JSON object that conforms to the target Pydantic model's schema. This is fast but relies on the LLM's ability to generate well-formed JSON.
- **Preliminary Text Mode**: A more robust two-step process (see the sketch after this list):
    1. First, the LLM generates a free-form text based on the initial prompt.
    2. Second, another LLM call is made with a specific prompt designed to extract and structure the information from the generated text into the target Pydantic model.
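As a minimal sketch, the two-step strategy can be selected per pipe with the `structuring_method` parameter documented in the configuration table below (the pipe and concept names here are placeholders):

```toml
[pipe.extract_invoice_facts]
PipeLLM = "Extract structured facts from an invoice"
inputs = { invoice_text = "Text" }
output = "InvoiceFacts"  # placeholder concept backed by a Pydantic model
# Two-step strategy: generate free-form text first, then structure it
structuring_method = "preliminary_text"
prompt_template = """
List every relevant fact stated in the following invoice text:
@invoice_text
"""
```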
## Working with Images (Vision Language Models)

`PipeLLM` supports Vision Language Models (VLMs) that can process both text and images. To use images in your prompts:

### Basic Image Input

Images must be declared in the `inputs` section of your pipe definition. The image will be automatically passed to the VLM along with your text prompt.
```toml
[pipe.describe_image]
PipeLLM = "Describe an image"
inputs = { image = "Image" }
output = "VisualDescription"
prompt_template = """
Describe the provided image in great detail.
"""
```
**Important**: Do NOT reference image variables in your prompt template using `@image` or `$image`. Images are automatically passed to vision-enabled LLMs and should not be treated as text variables.
### Flexible Image Inputs

You can use any concept that refines `Image` as an input, and choose descriptive variable names that fit your use case:
```toml
[pipe.analyze_wedding]
PipeLLM = "Analyze wedding photo"
inputs = { wedding_photo = "images.Photo" }
output = "PhotoAnalysis"
prompt_template = """
Analyze this wedding photo and describe the key moments captured.
"""
```
### Images as Sub-attributes of Structured Content

When working with structured content that contains image fields (like `PageContent`, which has a `page_view` field), you need to specify the full path to the image attribute in the `inputs` section:
```toml
[pipe.analyze_page_view]
PipeLLM = "Analyze the visual layout of a page"
inputs = { "page_content.page_view" = "Image" }
output = "LayoutAnalysis"
prompt_template = """
Analyze the visual layout and design elements of this page.
Focus on typography, spacing, and overall composition.
"""
```
In this example:

- `page_content` is the input variable containing a `PageContent` object
- `page_view` is the `ImageContent` field within the `PageContent` structure
- The dot notation `page_content.page_view` tells Pipelex to extract the image from that specific field
### Multiple Images
You can include multiple images in a single prompt by listing them in the inputs:
```toml
[pipe.compare_images]
PipeLLM = "Compare two images"
inputs = { first_image = "Image", second_image = "Image" }
output = "ImageComparison"
prompt_template = """
Compare these two images and describe their similarities and differences.
"""
```
### Combining Text and Image Inputs

You can mix text and image inputs in the same pipe:
```toml
[pipe.analyze_document_with_context]
PipeLLM = "Analyze a document page with additional context"
inputs = { context = "Text", "document.page_view" = "Image" }
output = "DocumentAnalysis"
prompt_template = """
Given this context: $context
Analyze the document page shown in the image and explain how it relates to the provided context.
"""
```
## Configuration

`PipeLLM` is configured in your pipeline's `.toml` file.

### TOML Parameters
| Parameter | Type | Description | Required |
|---|---|---|---|
| `PipeLLM` | string | A descriptive name for the LLM operation. | Yes |
| `inputs` | dictionary | The input concept(s) for the LLM operation, as a dictionary mapping input names to concept codes. For images within structured content, use dot notation (e.g., `"page.image"`). | Yes |
| `output` | string | The output concept produced by the LLM operation. | Yes |
| `llm` | string or table | Specifies the LLM preset(s) to use. Can be a single preset or a table mapping generation modes (e.g., `main`, `object_direct`) to different presets. | No |
| `system_prompt` | string | A system-level prompt to guide the LLM's behavior (e.g., "You are a helpful assistant"). Can be inline text or a reference to a template file (`"file:path/to/prompt.md"`). | No |
| `prompt` | string | A simple, static user prompt. Use this when you don't need to inject any variables. | No |
| `prompt_template` | string | A template for the user prompt. Use `$` for inline variables (e.g., `$topic`) and `@` to insert the content of an entire input (e.g., `@text_to_summarize`). Note: do not use `@` or `$` for image variables. | No |
| `images` | list of strings | Deprecated: use the `inputs` section to declare image inputs instead. | No |
| `structuring_method` | string | The method for generating structured output. Can be `direct` or `preliminary_text`. Defaults to the global configuration. | No |
| `prompt_template_to_structure` | string | The prompt template for the second step in `preliminary_text` mode. | No |
| `nb_output` | integer | Specifies exactly how many outputs to generate (e.g., `nb_output = 3` for exactly 3 outputs). Use when you need a fixed number of results. Mutually exclusive with `multiple_output`. | No |
| `multiple_output` | boolean | Controls output generation mode. Default is `false` (single output). Set to `true` for variable-length list generation when you need an indeterminate number of outputs. Mutually exclusive with `nb_output`. | No |
| `output_multiplicity` | string or integer | Defines the number of outputs. Use `"list"` for a variable-length list, or an integer (e.g., `3`) for a fixed-size list. | No |
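For instance, a hypothetical pipe could combine several of the optional parameters above (the preset names, file path, and concept names are placeholders; `main` and `object_direct` are the generation modes mentioned in the `llm` row):

```toml
[pipe.extract_contract_terms]
PipeLLM = "Extract key terms from a contract"
inputs = { contract = "Text" }
output = "ContractTerms"  # placeholder concept
# A single preset, or a table mapping generation modes to presets
llm = { main = "llm_for_analysis", object_direct = "llm_for_structured_output" }
# Inline text, or a reference to a template file
system_prompt = "file:prompts/legal_analyst.md"
prompt_template = """
Extract the key terms from the following contract:
@contract
"""
```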
### Output Generation Modes

`PipeLLM` supports three different output generation modes:

- **Single Output** (default): Don't specify `nb_output` or `multiple_output`, or set `multiple_output = false`. The LLM generates exactly one result.
- **Fixed Multiple Outputs**: Use `nb_output = N` (where N is a positive integer) when you need exactly N outputs. For example, `nb_output = 3` will try to generate 3 results. The parameter `_nb_output` will be available in the prompt template, e.g. "Give me the names of $_nb_output flowers". See the sketch after this list.
- **Variable Multiple Outputs**: Use `multiple_output = true` when you need a variable-length list where the LLM determines how many outputs to generate based on the content and context.
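As an illustration, here is a minimal sketch of the fixed and variable modes (the pipe and concept names are placeholders; `nb_output`, `$_nb_output`, and `multiple_output` are the parameters documented above):

```toml
[pipe.suggest_flowers]
PipeLLM = "Suggest flower names"
output = "FlowerName"  # placeholder concept
nb_output = 3          # exactly 3 results; $_nb_output can be used in the template
prompt_template = """
Give me the names of $_nb_output flowers.
"""

[pipe.list_key_points]
PipeLLM = "List the key points of a text"
inputs = { text = "Text" }
output = "KeyPoint"     # placeholder concept
multiple_output = true  # variable-length list: the LLM decides how many
prompt_template = """
List the key points made in the following text:
@text
"""
```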
## Examples

### Simple Text Generation Example
This pipe takes no input and writes a poem.
```toml
[pipe.write_poem]
PipeLLM = "Write a short poem"
output = "Text"
llm = "llm_for_creative_writing"
prompt = """
Write a four-line poem about pipes.
"""
```
### Text-to-Text Example

This pipe summarizes an input text, using a `prompt_template` to inject the input.
```toml
[pipe.summarize_text]
PipeLLM = "Summarize a text"
inputs = { text = "TextToSummarize" }
output = "TextSummary"
prompt_template = """
Please provide a concise summary of the following text:
@text
The summary should be no longer than 3 sentences.
"""
```
### Vision (VLM) Example
This pipe takes an image of a table and uses a VLM to extract the content as an HTML table.
```toml
[pipe.extract_table_from_image]
PipeLLM = "Extract table data from an image"
inputs = { image = "TableScreenshot" }
output = "TableData"
prompt_template = """
Extract the table data from this image and format it as a structured table.
"""
```
### Structured Data Extraction Example

This pipe extracts a list of `Expense` items from a block of text.
```toml
[concept.Expense]
structure = "Expense"  # Assumes a Pydantic model 'Expense' is defined

[pipe.process_expense_report]
PipeLLM = "Process an expense report"
inputs = { report = "ExpenseReport" }
output = "ProcessedExpenseReport"
prompt_template = """
Analyze this expense report and extract the following information:
- Total amount
- Date
- Vendor
- Category
- Line items
@report
"""
```
In this example, Pipelex will instruct the LLM to return a list of objects that conform to the `Expense` structure.