Pipeline Discovery and Loading
When running pipelines, Pipelex needs to find your .plx bundle files. There are two approaches:
- Point to the bundle file directly - The simplest option. Just pass the path to your .plx file. No configuration needed.
- Configure library directories - For larger projects. Pipelex scans directories to discover all bundles, letting you reference pipes by code.
Most users should start with the first approach.
The Simplest Way: Use the Bundle Path Directly
If you just want to run a pipe from a single .plx file, you don't need any library configuration. Simply point to your bundle file:
# CLI: run the bundle's main_pipe
pipelex run path/to/my_bundle.plx
# CLI: run a specific pipe from the bundle
pipelex run path/to/my_bundle.plx --pipe my_pipe
# Python: run the bundle's main_pipe
pipe_output = await execute_pipeline(
    bundle_uri="path/to/my_bundle.plx",
    inputs={...},
)
# Python: run a specific pipe from the bundle
pipe_output = await execute_pipeline(
    bundle_uri="path/to/my_bundle.plx",
    pipe_code="my_pipe",
    inputs={...},
)
This is the recommended approach for newcomers and simple projects. Pipelex reads the file directly - no discovery needed.
When to use library directories
The library directory configuration below is useful when you have multiple bundles across different directories and want to reference pipes by code without specifying the bundle path each time. For most use cases, pointing to the .plx file directly is simpler.
How Pipeline Discovery Works
When you initialize Pipelex with Pipelex.make(), the system:
- Scans your project directory for all .plx files
- Discovers Python structure classes that inherit from StructuredContent
- Loads pipeline definitions including domains, concepts, and pipes
- Registers custom functions decorated with @pipe_func()
All of this happens automatically - no configuration needed.
Configuring Library Directories
When executing pipelines, Pipelex needs to know where to find your .plx files and Python structure classes. You can configure this using a 3-tier priority system that gives you flexibility from global defaults to per-execution overrides.
The 3-Tier Priority System
Pipelex resolves library directories using this priority order (highest to lowest):
| Priority | Source | When to Use |
|---|---|---|
| 1 (Highest) | Per-call library_dirs parameter | Override for specific pipeline executions |
| 2 | Instance default via Pipelex.make(library_dirs=...) | Application-wide default |
| 3 (Fallback) | PIPELEXPATH environment variable | System-wide or shell session default |
Empty List is Valid
Passing an empty list [] to library_dirs is a valid explicit value that disables directory-based library loading. This is useful when using plx_content directly without needing files from the filesystem.
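The priority order above can be sketched in plain Python. This is only an illustration of the rule, not Pipelex's actual implementation: `resolve_library_dirs` and its parameter names are hypothetical. The sketch assumes that "not provided" is represented by `None`, which is what lets an explicit empty list `[]` count as a real value.

```python
import os
from typing import List, Mapping, Optional, Sequence


def resolve_library_dirs(
    per_call: Optional[Sequence[str]] = None,
    instance_default: Optional[Sequence[str]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Hypothetical helper mirroring the 3-tier priority described above."""
    # Tier 1: a per-call value wins, even when it is an empty list.
    if per_call is not None:
        return list(per_call)
    # Tier 2: the instance default set at initialization.
    if instance_default is not None:
        return list(instance_default)
    # Tier 3: fall back to PIPELEXPATH, split PATH-style.
    env = os.environ if environ is None else environ
    raw = env.get("PIPELEXPATH", "")
    return [part for part in raw.split(os.pathsep) if part]
```

Checking against `None` rather than truthiness is the key design choice: `if per_call:` would treat `[]` as "not set" and silently fall through to the next tier.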
Using the PIPELEXPATH Environment Variable
The PIPELEXPATH environment variable provides a system-wide default for library directories. It uses PATH-style syntax:
- Unix/macOS: Colon-separated paths (:)
- Windows: Semicolon-separated paths (;)
Bash / Zsh (macOS/Linux):
# Single directory
export PIPELEXPATH="/path/to/my/pipelines"
# Multiple directories (colon-separated)
export PIPELEXPATH="/path/to/shared_pipes:/path/to/project_pipes"
# Add to your ~/.bashrc or ~/.zshrc for persistence
echo 'export PIPELEXPATH="/path/to/my/pipelines"' >> ~/.bashrc
PowerShell (Windows):
# Single directory
$env:PIPELEXPATH = "C:\path\to\my\pipelines"
# Multiple directories (semicolon-separated)
$env:PIPELEXPATH = "C:\shared_pipes;C:\project_pipes"
In a .env file:
# .env file in your project root
PIPELEXPATH=/path/to/shared_pipes:/path/to/project_pipes
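To make the PATH-style syntax concrete, here is a minimal sketch of how such a value could be split into directories. The helper name `split_pipelexpath` is hypothetical, and Pipelex's own parsing may differ (for example in how it handles empty entries or whitespace). Python's `os.pathsep` is `:` on Unix/macOS and `;` on Windows, which matches the convention described above.

```python
import os
from typing import List, Optional


def split_pipelexpath(value: str, separator: Optional[str] = None) -> List[str]:
    """Split a PATH-style string into directories, dropping empty entries."""
    # Default to the platform separator; allow an override for testing.
    sep = os.pathsep if separator is None else separator
    return [entry.strip() for entry in value.split(sep) if entry.strip()]
```

For example, `split_pipelexpath("/path/to/shared_pipes:/path/to/project_pipes", ":")` gives a two-element list, one per directory.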
Using the CLI --library-dir Option
When running pipelines from the command line, use the -L or --library-dir option. This option can be specified multiple times to include multiple directories.
# Single directory
pipelex run my_pipe -L /path/to/pipelines
# Multiple directories
pipelex run my_pipe -L /path/to/shared_pipes -L /path/to/project_pipes
# Combined with other options
pipelex run my_bundle.plx --inputs data.json -L /path/to/pipelines
# Available on multiple commands
pipelex validate all -L /path/to/pipelines
pipelex show pipes -L /path/to/pipelines
pipelex which my_pipe -L /path/to/pipelines
CLI Option Overrides PIPELEXPATH
When you use -L on the command line, it takes precedence over any PIPELEXPATH environment variable that may be set.
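If you launch the CLI from a script, the repeated -L option maps naturally onto a list of directories. The sketch below only builds the argument list; `build_run_command` is a hypothetical helper, and it uses only the `pipelex run` command and `-L` option shown above.

```python
from typing import List, Sequence


def build_run_command(target: str, library_dirs: Sequence[str]) -> List[str]:
    """Build an argument list for `pipelex run`, repeating -L per directory."""
    command = ["pipelex", "run", target]
    for directory in library_dirs:
        # Each directory gets its own -L flag, as in the examples above.
        command.extend(["-L", directory])
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)`, which avoids shell quoting issues with paths that contain spaces.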
Setting Instance Defaults with Pipelex.make()
For Python applications, you can set a default library directory when initializing Pipelex. This default will be used for all subsequent execute_pipeline calls unless overridden.
from pipelex.pipelex import Pipelex
from pipelex.pipeline.execute import execute_pipeline
# Set instance-level defaults at initialization
Pipelex.make(
    library_dirs=["/path/to/shared_pipes", "/path/to/project_pipes"]
)
# All execute_pipeline calls will use these directories by default
pipe_output = await execute_pipeline(
    pipe_code="my_pipe",
    inputs={"input": "value"},
)
Per-Call Override with execute_pipeline
For maximum flexibility, you can override library directories on each execute_pipeline call:
from pipelex.pipelex import Pipelex
from pipelex.pipeline.execute import execute_pipeline
# Initialize with default directories
Pipelex.make(
    library_dirs=["/path/to/default_pipes"]
)
# Use the default directories
output1 = await execute_pipeline(
    pipe_code="default_pipe",
    inputs={"input": "value"},
)
# Override for a specific execution
output2 = await execute_pipeline(
    pipe_code="special_pipe",
    library_dirs=["/path/to/special_pipes"],  # Overrides instance default
    inputs={"input": "value"},
)
# Disable directory loading (use only plx_content)
output3 = await execute_pipeline(
    plx_content=my_plx_string,
    library_dirs=[],  # Empty list disables directory-based loading
    inputs={"input": "value"},
)
Priority Resolution Examples
Example 1: Environment variable as fallback
# Shell: Set PIPELEXPATH
export PIPELEXPATH="/shared/pipes"
# Python: No library_dirs specified anywhere
Pipelex.make() # No library_dirs
# Uses PIPELEXPATH: /shared/pipes
output = await execute_pipeline(pipe_code="my_pipe", inputs={...})
Example 2: Instance default overrides PIPELEXPATH
# Shell: PIPELEXPATH is set
export PIPELEXPATH="/shared/pipes"
# Python: Instance default set
Pipelex.make(library_dirs=["/project/pipes"])
# Uses instance default: /project/pipes (PIPELEXPATH ignored)
output = await execute_pipeline(pipe_code="my_pipe", inputs={...})
Example 3: Per-call override takes highest priority
Pipelex.make(library_dirs=["/default/pipes"])
# Uses per-call value: /special/pipes
output = await execute_pipeline(
    pipe_code="my_pipe",
    library_dirs=["/special/pipes"],  # Highest priority
    inputs={...},
)
Best Practices
- Use PIPELEXPATH for shared libraries: Set it in your shell profile for directories that should always be available.
- Use Pipelex.make(library_dirs=...) for application defaults: When building an application, set sensible defaults at initialization.
- Use per-call library_dirs for exceptions: Override only when a specific execution needs different directories.
- Use empty list [] for isolated execution: When you want to execute only from plx_content without loading any file-based definitions.
- Include structure class directories: Remember that library_dirs must contain both .plx files AND Python files defining StructuredContent classes.
Excluded Directories
To improve performance and avoid loading unnecessary files, Pipelex automatically excludes common directories from discovery:
- .venv - Virtual environments
- .git - Git repository data
- __pycache__ - Python bytecode cache
- .pytest_cache - Pytest cache
- .mypy_cache - Mypy type checker cache
- .ruff_cache - Ruff linter cache
- node_modules - Node.js dependencies
- .env - Environment files
- results - Common output directory
Files in these directories will not be scanned, even if they contain .plx files or structure classes.
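The effect of the exclusion list can be illustrated with a short directory walk. This is a sketch under stated assumptions, not Pipelex's scanner: `find_plx_files` and `EXCLUDED_DIRS` are hypothetical names, and the sketch assumes exclusion is matched on the directory name at any depth.

```python
import os
from typing import Iterator

# Directory names taken from the exclusion list above.
EXCLUDED_DIRS = {
    ".venv", ".git", "__pycache__", ".pytest_cache", ".mypy_cache",
    ".ruff_cache", "node_modules", ".env", "results",
}


def find_plx_files(root: str) -> Iterator[str]:
    """Yield .plx file paths under root, skipping excluded directories."""
    for current_dir, dir_names, file_names in os.walk(root):
        # Pruning dir_names in place stops os.walk from descending into them.
        dir_names[:] = sorted(d for d in dir_names if d not in EXCLUDED_DIRS)
        for file_name in sorted(file_names):
            if file_name.endswith(".plx"):
                yield os.path.join(current_dir, file_name)
```

Running `list(find_plx_files("."))` from your project root is a quick way to see which bundles a name-based exclusion rule would leave visible. A bundle saved under `results/`, for instance, would not appear.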
Project Organization
Golden rule: Put .plx files where they make sense in YOUR project. Pipelex finds them automatically.
Common Patterns
1. Topic-Based (Recommended for structured projects)
Keep pipelines with related code:
your_project/
├── my_project/
│   ├── finance/
│   │   ├── models.py
│   │   ├── services.py
│   │   ├── invoices.plx          # With finance code
│   │   └── invoices_struct.py
│   └── legal/
│       ├── models.py
│       ├── contracts.plx         # With legal code
│       └── contracts_struct.py
├── .pipelex/
└── requirements.txt
Benefits:
- Related things stay together
- Easy to find the pipeline for a specific module
- Natural code organization
2. Centralized (For simpler discovery)
Group all pipelines in one place:
your_project/
├── my_project/
│   ├── pipelines/                # All pipelines here
│   │   ├── finance.plx
│   │   ├── finance_struct.py
│   │   ├── legal.plx
│   │   └── legal_struct.py
│   └── core/
└── .pipelex/
Benefits:
- All pipelines in one location
- Simple structure for small projects
Alternative Structures
Pipelex supports flexible organization. Here are other valid approaches:
Feature-Based Organization
your_project/
├── my_project/
│   ├── features/
│   │   ├── document_processing/
│   │   │   ├── extract.plx
│   │   │   └── extract_struct.py
│   │   └── image_generation/
│   │       ├── generate.plx
│   │       └── generate_struct.py
│   └── main.py
└── .pipelex/
Domain-Driven Organization
your_project/
├── my_project/
│   ├── finance/
│   │   ├── pipelines/
│   │   │   └── invoices.plx
│   │   └── invoice_struct.py
│   ├── legal/
│   │   ├── pipelines/
│   │   │   └── contracts.plx
│   │   └── contract_struct.py
│   └── main.py
└── .pipelex/
Flat Organization (Small Projects)
your_project/
├── my_project/
│   ├── invoice_processing.plx
│   ├── invoice_struct.py
│   └── main.py
└── .pipelex/
Loading Process
Pipelex loads your pipelines in a specific order to ensure dependencies are resolved correctly:
1. Domain Loading
- Loads domain definitions from all .plx files
- Each domain must be defined exactly once
- Supports system prompts and structure templates per domain
2. Concept Loading
- Loads native concepts (Text, Image, PDF, etc.)
- Loads custom concepts from .plx files
- Validates concept definitions and relationships
- Links concepts to Python structure classes by name
3. Structure Class Registration
- Discovers all classes inheriting from StructuredContent
- Registers them in the class registry
- Makes them available for structured output generation
4. Pipe Loading
- Loads pipe definitions from .plx files
- Validates pipe configurations
- Links pipes with their respective domains
- Resolves input/output concept references
5. Function Registration
- Discovers functions decorated with @pipe_func()
- Registers them in the function registry
- Makes them available for PipeFunc operators
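Steps 2 and 3 rely on matching concept names to class names. The sketch below shows one way such a name-keyed class registry could be built in plain Python. It is an assumption-laden illustration: `StructuredContentStandIn`, `collect_subclasses`, and the example classes are hypothetical and do not reflect Pipelex internals.

```python
from typing import Dict


class StructuredContentStandIn:
    """Stand-in base class for this sketch; not Pipelex's StructuredContent."""


class Invoice(StructuredContentStandIn):
    pass


class PaidInvoice(Invoice):
    pass


def collect_subclasses(base: type) -> Dict[str, type]:
    """Recursively collect all subclasses of base, keyed by class name."""
    registry: Dict[str, type] = {}
    for subclass in base.__subclasses__():
        registry[subclass.__name__] = subclass
        registry.update(collect_subclasses(subclass))
    return registry


class_registry = collect_subclasses(StructuredContentStandIn)
# A concept named "Invoice" would be linked to the class with the same name;
# a concept with no matching class simply has no structure class.
concept_names = ["Invoice", "Contract"]
linked = {name: class_registry.get(name) for name in concept_names}
```

Because the lookup is by exact name, a concept called "Invoice" will not match a class called `InvoiceModel`, which is why the naming advice later in this page matters.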
Custom Function Registration
For custom functions used in PipeFunc operators, add the @pipe_func() decorator:
from pipelex.system.registries.func_registry import pipe_func
from pipelex.core.memory.working_memory import WorkingMemory
from pipelex.core.stuffs.text_content import TextContent
@pipe_func()
async def my_custom_function(working_memory: WorkingMemory) -> TextContent:
    """
    This function is automatically discovered and registered.
    """
    input_data = working_memory.get_stuff("input_name")
    # Process data
    return TextContent(text=f"Processed: {input_data.content.text}")
# Optional: specify a custom name
@pipe_func(name="custom_processor")
async def another_function(working_memory: WorkingMemory) -> TextContent:
    # Implementation
    pass
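For readers unfamiliar with decorator-based registration, here is a generic sketch of the pattern. It is not how @pipe_func() is implemented: `register_func` and `FUNC_REGISTRY` are hypothetical names, and the sketch only shows how a decorator with an optional name can record a function for later lookup.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

FUNC_REGISTRY: Dict[str, Callable[..., Awaitable[str]]] = {}


def register_func(name: Optional[str] = None):
    """Hypothetical decorator factory: registers a function under a name."""
    def decorator(func):
        # Fall back to the function's own name when none is given.
        FUNC_REGISTRY[name or func.__name__] = func
        return func
    return decorator


@register_func()
async def shout(text: str) -> str:
    return text.upper()


@register_func(name="custom_processor")
async def whisper(text: str) -> str:
    return text.lower()


# Look the function up by its registered name and run it.
result = asyncio.run(FUNC_REGISTRY["custom_processor"]("Hello"))
```

Registration happens as a side effect of importing the module that contains the decorated function, which is why the function's file must be importable from a discoverable location.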
Validation
After making changes to your pipelines, validate them:
# Validate all pipelines
pipelex validate all
# Validate a specific pipe
pipelex validate pipe YOUR_PIPE_CODE
# Show all available pipes
pipelex show pipes
# Show details of a specific pipe
pipelex show pipe YOUR_PIPE_CODE
Best Practices
1. Organization
- Keep related concepts and pipes in the same .plx file
- Use meaningful domain codes that reflect functionality
- Match Python file names with PLX file names (finance.plx → finance.py)
- Group complex pipelines using subdirectories
2. Structure Classes
- Only create Python classes when you need structured output
- Name classes to match concept names exactly
- Use the _struct.py suffix for files containing structure classes (e.g., finance_struct.py)
- Inherit from StructuredContent or its subclasses
- Place structure class files near their corresponding .plx files
- Keep modules clean: Avoid module-level code that executes on import (Pipelex imports modules during auto-discovery)
3. Custom Functions
- Always use the @pipe_func() decorator
- Use descriptive function names
- Document function parameters and return types
- Keep functions focused and testable
- Keep modules clean: Avoid module-level code that executes on import (Pipelex imports modules during auto-discovery)
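The "keep modules clean" advice is the standard Python import-safety pattern. A minimal sketch, with hypothetical function names, looks like this:

```python
def build_report(lines):
    """Pure function: safe to import, does nothing until called."""
    return "\n".join(line.strip() for line in lines)


def main() -> None:
    # Side effects (printing, I/O, network calls) live here, not at module level.
    print(build_report(["  first  ", "  second  "]))


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    main()
```

With this layout, importing the module during discovery only defines functions. Nothing is printed, opened, or sent until `main()` is called explicitly.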
4. Validation
- Run pipelex validate all after making changes
- Check for domain code consistency
- Verify concept relationships
- Test pipes individually before composing them
Troubleshooting
Pipelines Not Found
Problem: Pipelex doesn't find your .plx files.
Solutions:
- Ensure files have the .plx extension
- Check that files are not in excluded directories
- Verify file permissions allow reading
- Run pipelex show pipes to see what was discovered
Structure Classes Not Registered
Problem: Your Python classes aren't recognized.
Solutions:
- Ensure classes inherit from StructuredContent
- Check class names match concept names exactly
- Verify files are not in excluded directories
- Make sure class definitions are valid Python
Custom Functions Not Found
Problem: PipeFunc can't find your function.
Solutions:
- Add the @pipe_func() decorator
- Ensure function signature matches requirements
- Check function is async and accepts working_memory
- Verify function is in a discoverable location
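The signature checks in the list above can be automated with the standard `inspect` module. This is a small diagnostic sketch, not part of Pipelex: `check_pipe_function` is a hypothetical helper that covers only the two conditions named above (declared with `async def`, has a `working_memory` parameter).

```python
import inspect
from typing import Callable, List


def check_pipe_function(func: Callable[..., object]) -> List[str]:
    """Return a list of problems found; empty means the basic checks pass."""
    problems = []
    if not inspect.iscoroutinefunction(func):
        problems.append("function is not declared with 'async def'")
    if "working_memory" not in inspect.signature(func).parameters:
        problems.append("function has no 'working_memory' parameter")
    return problems


async def good(working_memory):
    return working_memory


def bad(data):
    return data
```

Calling `check_pipe_function(good)` returns an empty list, while `check_pipe_function(bad)` reports both problems. Any further requirements Pipelex places on the signature are outside what this sketch checks.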
Validation Errors
Problem: pipelex validate all shows errors.
Solutions:
- Read error messages carefully - they indicate the problem
- Check concept references are spelled correctly
- Verify pipe configurations match expected format
- Ensure all required fields are present