PipeImgGen

The PipeImgGen operator generates images from a text prompt using a specified image generation model.

How it works

PipeImgGen takes a text prompt and a set of parameters, then calls an underlying image generation service (like GPT Image or Flux) to create one or more images.

The pipe can be configured to generate a single image or a list of images.

Configuration

PipeImgGen is configured in your pipeline's .mthds file.

The prompt Field is Required

IMPORTANT: PipeImgGen requires an explicit prompt field. Declaring an input in inputs is not enough: when using dynamic inputs, the prompt field must reference the input variable using the $ prefix.

Wrong - Missing required prompt field:

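[pipe.render_slide]
type = "PipeImgGen"
description = "Render a slide image from a prompt"
inputs = { img_prompt = "ImgGenPrompt" }
output = "Image"
model = "$gen-image"
# ERROR: Missing required field 'prompt' (the img_prompt input is declared but never used)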

Correct - prompt field references the input:

[pipe.render_slide]
type = "PipeImgGen"
description = "Render a slide image from a prompt"
inputs = { img_prompt = "ImgGenPrompt" }
output = "Image"
prompt = "$img_prompt"
model = "$gen-image"

Image Generation Models and Backend System

PipeImgGen uses the unified inference backend system to manage image generation models. This means you can:

  • Use different image generation providers (OpenAI GPT Image, FAL models like Flux, etc.)
  • Configure image generation models through the same backend system as LLMs and OCR models
  • Use image generation presets for consistent configurations across your pipelines
  • Route image generation requests to different backends based on your routing profile

Common image generation model handles:

  • base-img-gen: Base image generation model (alias for flux-pro/v1.1)
  • best-img-gen: Best quality image generation model (alias for flux-pro/v1.1-ultra)
  • fast-img-gen: Fast image generation model (alias for fast-lightning-sdxl)

Image generation presets are defined in your model deck configuration and can include parameters like quality, guidance_scale, and safety_tolerance.

MTHDS Parameters

| Parameter | Type | Description | Required |
|---|---|---|---|
| type | string | The type of the pipe: PipeImgGen | Yes |
| description | string | A description of the image generation operation. | Yes |
| inputs | dictionary | The input concept(s) for the image generation operation, as a dictionary mapping input names to concept codes. Required when the prompt references variables. | No |
| output | string | The output concept produced by the image generation operation. Use bracket notation for multiple images: "Image[3]" generates exactly 3 images. | Yes |
| model | string | The image generation model name, setting, or preset to use (e.g., "gpt-image-1"). Defaults to the model specified in the global config. | No |
| prompt | string | The image generation prompt. Can be a static string or reference input variables using the $ prefix (e.g., "$description" or "A sketch of: $subject"). | Yes |
| negative_prompt | string | Optional negative prompt, when supported by the selected model or provider. | No |
| aspect_ratio | string | The desired aspect ratio of the image. Valid values: "square", "landscape_4_3", "landscape_3_2", "landscape_16_9", "landscape_21_9", "portrait_3_4", "portrait_2_3", "portrait_9_16", "portrait_9_21". | No |
| seed | integer or "auto" | A seed for the random number generator, to ensure reproducibility. "auto" uses a random seed. | No |
| background | string | Optional background setting, when supported by the selected provider. | No |
| output_format | string | Optional output image format, when supported by the selected provider. | No |
| is_raw | boolean | Requests a raw generation mode, when supported by the selected provider. | No |
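
Several of the optional parameters above can be combined in a single pipe. The following is a minimal sketch rather than an example from the reference itself: the pipe name generate_product_shot and the prompt text are hypothetical, and negative_prompt only applies when the selected model or provider supports it.

```toml
# Hypothetical pipe combining optional parameters from the table above.
[pipe.generate_product_shot]
type = "PipeImgGen"
description = "Generate a reproducible product photo from a description"
inputs = { product = "ImgGenPrompt" }
output = "Image"
prompt = "Studio photograph of: $product"
negative_prompt = "text, watermark, blurry"  # only when supported by the selected model or provider
aspect_ratio = "portrait_3_4"                # one of the valid values listed in the table
seed = 42                                    # fixed integer seed for reproducibility; "auto" uses a random seed
model = "$gen-image"
```

Fixing seed to an integer is useful when iterating on the prompt wording, since it removes one source of variation between runs.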

Example: Generating a single image from a static prompt

This pipe generates one image of a futuristic car without requiring any input.

[pipe.generate_car_image]
type = "PipeImgGen"
description = "Generate a futuristic car image"
output = "Image"
prompt = "A sleek, futuristic sports car driving on a neon-lit highway at night."
model = "$gen-image"
aspect_ratio = "landscape_16_9"

Example: Generating an image from a dynamic prompt

This pipe takes a text prompt as input and generates an image.

[pipe.generate_from_description]
type = "PipeImgGen"
description = "Generate an image from a description"
inputs = { description = "ImgGenPrompt" }
output = "Image"
prompt = "$description"
model = "$gen-image"

Example: Dynamic prompt with prefix text

You can combine static text with dynamic variables in the prompt:

[pipe.generate_sketch]
type = "PipeImgGen"
description = "Generate a black and white sketch from a description"
inputs = { subject = "ImgGenPrompt" }
output = "Image"
prompt = "Create a black and white pencil sketch of: $subject"
model = "$gen-image"

Example: Generating multiple images

This pipe takes a text prompt as input and generates three variations of the image. The bracket notation in output = "Image[3]" requests exactly three images.

[pipe.generate_logo_variations]
type = "PipeImgGen"
description = "Generate three logo variations from a prompt"
inputs = { img_prompt = "ImgGenPrompt" }
output = "Image[3]"
prompt = "$img_prompt"
model = "$gen-image"
aspect_ratio = "square"

The output of PipeImgGen must be the native Image concept or a concept that refines it.

To use the generate_logo_variations pipe, first load a text prompt (e.g., "A minimalist logo for a coffee shop, featuring a stylized mountain and a coffee bean") into the img_prompt input stuff (ImgGenPrompt concept). After running, the output is a list containing three generated ImageContent objects.
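
A pipe can also output a custom concept, provided that concept refines Image. The following is a minimal sketch, assuming that concepts are declared in the same .mthds file under a [concept.<Name>] table with a refines key. The Logo concept and the generate_logo pipe are hypothetical names used only for illustration.

```toml
# Hypothetical concept that refines the native Image concept.
# The [concept.Logo] table layout and the refines key are assumptions.
[concept.Logo]
description = "A logo image for a brand or product"
refines = "Image"

# Because Logo refines Image, it can be used as a PipeImgGen output.
[pipe.generate_logo]
type = "PipeImgGen"
description = "Generate a logo from a prompt"
inputs = { img_prompt = "ImgGenPrompt" }
output = "Logo"
prompt = "$img_prompt"
model = "$gen-image"
```

Using a refined concept like this lets downstream pipes declare that they expect a logo specifically, while the generated content remains an image.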