LLM Settings Guide

Overview

Pipelex provides a flexible way to configure and manage your LLM (Large Language Model) integrations through the inference backend configuration system.

The system provides three main concepts for LLM configuration:

  • LLM Handles (Aliases)
  • LLM Presets
  • Model Deck

For complete details about the inference backend configuration system, see the Inference Backend Configuration documentation.

LLM Handles (Aliases)

An LLM handle can be either:

  1. A direct model name (like "gpt-4o-mini" or "claude-3-sonnet") - every model loaded by the inference backend system is automatically available under its own name
  2. An alias - a user-defined shortcut that maps to a model name, defined in the [aliases] section:

Example Alias Configurations

[aliases]
best-claude = "claude-4.1-opus"
best-gemini = "gemini-2.5-pro"
best-mistral = "mistral-large"
base-gpt = "gpt-5"

When resolving a handle, the system first looks for a direct model name, then checks aliases if no direct match is found. Routing models to the appropriate backend is handled automatically.

💡 An alias should always describe which model it is. Never use an alias to describe what the model is for or what it's good at: that's what LLM Presets are for.
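For example (alias names here are illustrative):

[aliases]
# Good: the alias names which model it is
best-claude = "claude-4.1-opus"

# Avoid: this names a task, not a model; use an LLM Preset for that
# summarizer = "claude-4.1-opus"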

LLM Settings & LLM Presets

LLM Settings combine an LLM handle with specific parameters optimized for particular tasks. They help maintain consistency across similar operations and make it easier to switch between different configurations.

An LLM Preset is simply a named LLM Settings configuration that you define once so you can reuse it in various places.

Example LLM Preset Definitions

[llm_presets]
llm_to_reason = { llm_handle = "gpt-4-turbo", temperature = 0.7, max_tokens = "auto" }
llm_to_extract = { llm_handle = "claude-4-sonnet", temperature = 0.1, max_tokens = "auto" }

Using LLM Settings in Pipelines

Here's how to use these configurations in your pipelines:

[pipe.generate_response]
type = "PipeLLM"
definition = "Generate a creative response"
inputs = { question = "Question" }
output = "Response"
llm = { llm_handle = "gpt-4-turbo", temperature = 0.8, max_tokens = "auto" }  # Using inline LLM settings
prompt = """
Generate a creative response to this question:

@question
"""

[pipe.extract_weather_data]
type = "PipeLLM"
definition = "Extract structured weather data from text"
inputs = { text = "Text" }
output = "WeatherData"
llm = "llm_to_extract"  # Using a preset
prompt = """
Extract the weather data from this text:

@text
"""

Model Deck

The Model Deck is your central configuration hub for all LLM-related settings. It's stored in the .pipelex/inference/deck/ directory and consists of:

  • base_deck.toml: Core LLM configurations including aliases and presets
  • overrides.toml: Custom overrides for specific use cases

Directory Structure

.pipelex/
└── inference/
    ├── backends.toml              # Backend configurations
    ├── routing_profiles.toml      # Model routing rules
    ├── backends/                  # Individual backend model specs
    │   ├── openai.toml
    │   ├── anthropic.toml
    │   └── ...
    └── deck/                      # Model deck configurations
        ├── base_deck.toml         # Aliases and presets
        └── overrides.toml         # Custom overrides
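For instance, overrides.toml can change what an existing alias or preset points to without editing base_deck.toml. A minimal sketch, assuming overrides follow the same [aliases] and [llm_presets] structure as the base deck (the entries here are illustrative):

[aliases]
# Point an existing alias at a different model for this project
best-claude = "claude-4-sonnet"

[llm_presets]
# Re-tune an existing preset without touching the base deck
llm_to_extract = { llm_handle = "claude-4-sonnet", temperature = 0.0, max_tokens = "auto" }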

Best Practices

  1. Consistent Naming: Use clear, descriptive names for handles and presets
  2. Task-Specific Presets: Create presets optimized for specific skills and tasks
  3. Cost Management: Consider using different models based on task complexity and cost requirements
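As a sketch of the cost-management idea, you might define presets that send routine tasks to a cheaper model and reserve a larger model for harder ones (preset and model names here are illustrative):

[llm_presets]
# Cheap and fast for routine summarization
llm_to_summarize = { llm_handle = "gpt-4o-mini", temperature = 0.3, max_tokens = "auto" }

# Larger model reserved for complex reasoning
llm_to_reason = { llm_handle = "gpt-4-turbo", temperature = 0.7, max_tokens = "auto" }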