PipeImgGen

The PipeImgGen operator generates images from a text prompt using a specified image generation model.

How it works

PipeImgGen takes a text prompt and a set of parameters, then calls an underlying image generation service (like GPT Image or Flux) to create one or more images.

The pipe can be configured to generate a single image or a list of images.

Configuration

PipeImgGen is configured in your pipeline's .mthds file.

The `prompt` Field is Required

IMPORTANT: PipeImgGen requires an explicit prompt field. When using dynamic inputs, the prompt field must reference the input variable using the $ prefix.

Wrong - Missing required prompt field:

[pipe.render_slide]
type = "PipeImgGen"
description = "Render a slide image from a prompt"
inputs = { img_prompt = "ImgGenPrompt" }
output = "Image"
model = "$gen-image"
# ERROR: Missing required field 'prompt'

Correct - prompt field references the input:

[pipe.render_slide]
type = "PipeImgGen"
description = "Render a slide image from a prompt"
inputs = { img_prompt = "ImgGenPrompt" }
output = "Image"
prompt = "$img_prompt"
model = "$gen-image"

Image Generation Models and Backend System

PipeImgGen uses the unified inference backend system to manage image generation models. This means you can:

Use different image generation providers (OpenAI GPT Image, FAL models like Flux, etc.)
Configure image generation models through the same backend system as LLMs and OCR models
Use image generation presets for consistent configurations across your pipelines
Route image generation requests to different backends based on your routing profile

Common image generation model handles:

base-img-gen: Base image generation model (alias for flux-pro/v1.1)
best-img-gen: Best quality image generation model (alias for flux-pro/v1.1-ultra)
fast-img-gen: Fast image generation model (alias for fast-lightning-sdxl)
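As a hedged illustration, one of these handles could be selected through the `model` field of a pipe. This sketch assumes a handle is written as a plain string, like the `"gpt-image-1"` example in the parameter table; whether an alias needs a prefix is not stated in this section. The pipe name, description, and prompt are hypothetical.

```toml
# Hypothetical pipe: name, description, and prompt are illustrative only.
[pipe.generate_hero_image]
type = "PipeImgGen"
description = "Generate a high-quality hero image"
output = "Image"
prompt = "A lighthouse on a rocky coast at sunrise, photorealistic."
# Assumption: the handle is passed as a plain string.
model = "best-img-gen"
```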

Image generation presets are defined in your model deck configuration and can include parameters like quality, guidance_scale, and safety_tolerance.

MTHDS Parameters

Parameter	Type	Description	Required
`type`	string	The type of the pipe: `PipeImgGen`	Yes
`description`	string	A description of the image generation operation.	Yes
`inputs`	dictionary	The input concept(s) for the image generation operation, as a dictionary mapping input names to concept codes. Required when the prompt references variables.	No
`output`	string	The output concept produced by the image generation operation. Use bracket notation for multiple images: `"Image[3]"` generates exactly 3 images.	Yes
`model`	string	The image generation model to use, given as a model name, setting, or preset (e.g., `"gpt-image-1"`). Defaults to the model specified in the global config.	No
`prompt`	string	The image generation prompt. Can be a static string or reference input variables using `$` prefix (e.g., `"$description"` or `"A sketch of: $subject"`).	Yes
`negative_prompt`	string	Optional negative prompt when supported by the selected model or provider.	No
`aspect_ratio`	string	The desired aspect ratio of the image. Valid values: `"square"`, `"landscape_4_3"`, `"landscape_3_2"`, `"landscape_16_9"`, `"landscape_21_9"`, `"portrait_3_4"`, `"portrait_2_3"`, `"portrait_9_16"`, `"portrait_9_21"`.	No
`seed`	integer or "auto"	A seed for the random number generator to ensure reproducibility. `"auto"` uses a random seed.	No
`background`	string	Optional background setting when supported by the selected provider.	No
`output_format`	string	Optional output image format when supported by the selected provider.	No
`is_raw`	boolean	Request a raw generation mode when supported by the selected provider.	No
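To show how several optional parameters might be combined, here is a minimal sketch that uses only fields listed in the table. The pipe name, prompt, and parameter values are hypothetical, and `negative_prompt` only takes effect when the selected model or provider supports it.

```toml
# Hypothetical pipe combining optional parameters from the table above.
[pipe.generate_product_shot]
type = "PipeImgGen"
description = "Generate a reproducible product shot"
inputs = { product = "ImgGenPrompt" }
output = "Image"
prompt = "Studio photograph of: $product"
model = "$gen-image"
aspect_ratio = "portrait_3_4"
# Honored only when the selected model or provider supports it.
negative_prompt = "text, watermark, blurry"
# Fixed integer seed for reproducibility; "auto" would use a random seed instead.
seed = 42
```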

Example: Generating a single image from a static prompt

This pipe generates one image of a futuristic car without requiring any input.

[pipe.generate_car_image]
type = "PipeImgGen"
description = "Generate a futuristic car image"
output = "Image"
prompt = "A sleek, futuristic sports car driving on a neon-lit highway at night."
model = "$gen-image"
aspect_ratio = "landscape_16_9"

Example: Generating an image from a dynamic prompt

This pipe takes a text prompt as input and generates an image.

[pipe.generate_from_description]
type = "PipeImgGen"
description = "Generate an image from a description"
inputs = { description = "ImgGenPrompt" }
output = "Image"
prompt = "$description"
model = "$gen-image"

Example: Dynamic prompt with prefix text

You can combine static text with dynamic variables in the prompt:

[pipe.generate_sketch]
type = "PipeImgGen"
description = "Generate a black and white sketch from a description"
inputs = { subject = "ImgGenPrompt" }
output = "Image"
prompt = "Create a black and white pencil sketch of: $subject"
model = "$gen-image"

Example: Generating multiple images

This pipe takes a text prompt as input and generates three variations of the image. The bracket notation in the `output` field (`"Image[3]"`) sets the number of images.

[pipe.generate_logo_variations]
type = "PipeImgGen"
description = "Generate three logo variations from a prompt"
inputs = { img_prompt = "ImgGenPrompt" }
output = "Image[3]"
prompt = "$img_prompt"
model = "$gen-image"
aspect_ratio = "square"

To use this pipe, first load a text prompt (e.g., "A minimalist logo for a coffee shop, featuring a stylized mountain and a coffee bean") into the input stuff as an ImgGenPrompt concept. After the run, the output is a list containing three generated ImageContent objects.

The output of PipeImgGen must be a concept compatible with the native Image concept. A concept is compatible with the Image concept if it refines it.
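The compatibility rule for output concepts can be sketched with a custom concept that refines Image. This is an assumption-laden example: the `Logo` concept, the `generate_logo` pipe, and the `refines` key used in the concept definition are illustrative and are not confirmed by this section.

```toml
# Assumption: a concept declares that it refines Image with a `refines` key.
[concept.Logo]
description = "A logo image"
refines = "Image"

# Hypothetical pipe whose output is the refined concept rather than Image itself.
[pipe.generate_logo]
type = "PipeImgGen"
description = "Generate a logo from a prompt"
inputs = { img_prompt = "ImgGenPrompt" }
output = "Logo"
prompt = "$img_prompt"
model = "$gen-image"
```

Because `Logo` refines Image, it would satisfy the compatibility requirement under these assumptions.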

Image Generation Feature - Overview of image generation capabilities
Cloud Storage - Store generated images in cloud storage