StuffArtefact & Image Rendering

StuffArtefact is a thin delegation adapter that wraps Stuff objects for Jinja2 template access. The ImageRenderable protocol enables image extraction from nested content without circular imports between the template layer and domain layer.

Design Principle

Two problems, one pattern:

Template Access: Templates need to access content fields via {{ my_stuff.field_name }} without knowing Pydantic internals
Image Extraction: The with_images filter needs to traverse content hierarchies without importing concrete types

Solution: StuffArtefact delegates attribute access to underlying content. ImageRenderable uses @runtime_checkable Protocol for duck-typed image extraction.

StuffArtefact wraps Stuff → delegates to StuffContent → Protocol enables traversal

Protocol over Inheritance

ImageRenderable uses duck typing. Any class with a render_with_images() method satisfies the protocol; no base class is required.

Template Access Patterns

Pattern	Result
`{{ doc.title }}`	Content field value
`{{ doc.pages }}`	Nested content (list, struct, etc.)
`{{ doc._stuff_name }}`	Metadata: variable name
`{{ doc._content_class }}`	Metadata: content class name
`{{ doc._concept_code }}`	Metadata: concept code
`{{ doc | with_images }}`	Text with `[Image N]` tokens
`{{ doc | tag }}`	Tagged output (no images)

Image Extraction Behavior

Content Type	`render_with_images()` Behavior
`ImageContent`	Self-registers, returns `[Image N]`
`ListContent`	Iterates items, recurses into nested
`TextAndImagesContent`	Renders text first, then images
`StuffContent` (base)	Iterates Pydantic model fields recursively
`StuffArtefact`	Delegates to underlying content

Architecture

flowchart TB
    subgraph TEMPLATE["Jinja2 Template"]
        VAR["{{ doc | with_images }}"]
    end

    subgraph FILTER["with_images Filter"]
        direction TB
        CHECK["isinstance(value, ImageRenderable)?"]
        CALL["value.render_with_images(registry, format)"]
        CHECK --> CALL
    end

    subgraph ARTEFACT["StuffArtefact"]
        DELEGATE["Delegates to _stuff.content"]
    end

    subgraph CONTENT["Content Implementations"]
        direction TB
        IMG["ImageContent: register + return token"]
        LIST["ListContent: iterate items"]
        TEXT["TextAndImagesContent: text then images"]
        BASE["StuffContent: iterate model_fields"]
    end

    subgraph REGISTRY["ImageRegistry"]
        direction TB
        STORE["Append to list (0-based)"]
        DEDUP["Skip if URL exists"]
    end

    VAR --> CHECK
    CALL --> DELEGATE
    DELEGATE --> IMG & LIST & TEXT & BASE
    IMG --> STORE
    STORE --> DEDUP

StuffArtefact Implementation

Attribute Access Priority

StuffArtefact uses __getattribute__ to intercept all attribute access:

def __getattribute__(self, key: str) -> Any:
    # 1. Passthrough: _stuff, methods, magic attributes
    if key in _PASSTHROUGH_ATTRS or key.startswith("__"):
        return object.__getattribute__(self, key)

    # Resolve the wrapped objects without re-entering this method
    stuff = object.__getattribute__(self, "_stuff")
    content = stuff.content

    # 2. Content fields (highest priority for templates)
    content_fields = type(content).model_fields
    if key in content_fields:
        return getattr(content, key)

    # 3. Metadata accessors
    match key:
        case "_stuff_name": return stuff.stuff_name
        case "_content_class": return content.__class__.__name__
        case "_concept_code": return stuff.concept.code
        case "_stuff_code": return stuff.stuff_code
        case "_content": return content

    # 4. Fallback to normal lookup (raises AttributeError for unknown names)
    return object.__getattribute__(self, key)

Content Field Priority

Content fields shadow StuffArtefact methods. If your content has a field named items, accessing artefact.items returns the field value, not the iteration method. Use artefact.iter_items() for explicit dict-like iteration.
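
The lookup order and the shadowing rule can be reproduced in a few lines. The following is a minimal sketch, not the actual implementation: DemoContent, DemoStuff and DemoArtefact are hypothetical stand-ins, dataclasses replace the Pydantic models, and the contents of the passthrough set are an assumption.

```python
from dataclasses import dataclass, fields
from typing import Any, Iterator


@dataclass
class DemoContent:
    """Hypothetical content; the real classes are Pydantic models."""
    title: str
    items: list[str]  # deliberately collides with the method name below


@dataclass
class DemoStuff:
    stuff_name: str
    content: DemoContent


# Assumption: the real passthrough set is not shown in this document.
_PASSTHROUGH_ATTRS = frozenset({"_stuff", "iter_items"})


class DemoArtefact:
    def __init__(self, stuff: DemoStuff) -> None:
        self._stuff = stuff

    def items(self) -> str:
        # Never reachable through attribute access: the content field wins.
        return "method"

    def iter_items(self) -> Iterator[tuple[str, Any]]:
        content = self._stuff.content
        return ((spec.name, getattr(content, spec.name)) for spec in fields(content))

    def __getattribute__(self, key: str) -> Any:
        # 1. Passthrough names and dunder attributes use normal lookup.
        if key in _PASSTHROUGH_ATTRS or key.startswith("__"):
            return object.__getattribute__(self, key)
        stuff = object.__getattribute__(self, "_stuff")
        content = stuff.content
        # 2. Content fields take priority over anything defined on the adapter.
        if key in {spec.name for spec in fields(content)}:
            return getattr(content, key)
        # 3. Metadata accessors (only two of them are reproduced here).
        if key == "_stuff_name":
            return stuff.stuff_name
        if key == "_content_class":
            return type(content).__name__
        # 4. Normal lookup, which raises AttributeError for unknown names.
        return object.__getattribute__(self, key)


artefact = DemoArtefact(DemoStuff("doc", DemoContent("Report", ["a", "b"])))
print(artefact.items)               # the field value, not the bound method
print(list(artefact.iter_items()))  # explicit iteration still works
```

Because iter_items is in the passthrough set, it stays reachable even when a content field is named items.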

Dict-like Access

StuffArtefact supports bracket notation and iteration:

Method	Purpose
`artefact["field"]`	Bracket access (via `__getitem__`)
`artefact.get("field", default)`	Safe access with default
`"field" in artefact`	Membership test
`artefact.iter_keys()`	Iterate field names
`artefact.iter_items()`	Iterate (key, value) pairs
`artefact.iter_values()`	Iterate values
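
As an illustration of the dict-like surface in the table above, here is a hedged sketch built on a dataclass stand-in. DictLikeArtefact and DemoContent are hypothetical names, and raising KeyError for an unknown field is an assumption about the real behaviour.

```python
from dataclasses import dataclass, fields
from typing import Any, Iterator


@dataclass
class DemoContent:
    """Hypothetical content; the real classes are Pydantic models."""
    title: str
    page_count: int


class DictLikeArtefact:
    """Hypothetical adapter exposing only the dict-like methods."""

    def __init__(self, content: DemoContent) -> None:
        self._content = content

    def _names(self) -> list[str]:
        return [spec.name for spec in fields(self._content)]

    def __getitem__(self, key: str) -> Any:
        if key not in self._names():
            raise KeyError(key)  # assumption: unknown fields raise KeyError
        return getattr(self._content, key)

    def __contains__(self, key: object) -> bool:
        return key in self._names()

    def get(self, key: str, default: Any = None) -> Any:
        return self[key] if key in self else default

    def iter_keys(self) -> Iterator[str]:
        return iter(self._names())

    def iter_values(self) -> Iterator[Any]:
        return (self[name] for name in self._names())

    def iter_items(self) -> Iterator[tuple[str, Any]]:
        return ((name, self[name]) for name in self._names())


artefact = DictLikeArtefact(DemoContent(title="Report", page_count=3))
print(artefact["title"], artefact.get("missing", "n/a"), "page_count" in artefact)
```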

ImageRenderable Protocol

Definition

from typing import Protocol, runtime_checkable

@runtime_checkable
class ImageRenderable(Protocol):
    """Protocol for types that can render with image extraction."""

    def render_with_images(
        self,
        registry: ImageRegistry,
        text_format: TextFormat,
    ) -> str:
        """Render to string, registering images to the registry.

        Returns:
            String with [Image N] tokens where images appear.
        """
        ...

Why Protocol?

Avoids circular imports: tools/jinja2/ can check types from core/stuffs/ without importing them
Duck typing: No inheritance required; any class that defines render_with_images() qualifies
Runtime checking: isinstance(value, ImageRenderable) works at runtime
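
The short sketch below shows what the runtime check does and does not verify. Caption, WrongSignature and PlainText are hypothetical classes, and the protocol is redeclared with loose types so the snippet runs on its own. One detail worth knowing: isinstance() against a @runtime_checkable protocol only checks that the method exists, not that its signature matches.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ImageRenderable(Protocol):
    def render_with_images(self, registry: Any, text_format: Any) -> str: ...


class Caption:
    """No base class, but the method is present."""

    def render_with_images(self, registry: Any, text_format: Any) -> str:
        return "[Image 1]"


class WrongSignature:
    """Same method name, incompatible parameters."""

    def render_with_images(self) -> str:
        return "never called correctly"


class PlainText:
    def __str__(self) -> str:
        return "text"


caption_ok = isinstance(Caption(), ImageRenderable)           # method present
wrong_sig_ok = isinstance(WrongSignature(), ImageRenderable)  # also passes: name only
plain_ok = isinstance(PlainText(), ImageRenderable)           # method missing
print(caption_ok, wrong_sig_ok, plain_ok)
```

A static type checker would flag WrongSignature; the runtime check would not, so a mismatched signature surfaces only when the method is called.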

Content Implementations

ImageContent

Self-registers and returns a token:

def render_with_images(self, registry, text_format) -> str:
    image_index = registry.register_image(self)
    return f"[Image {image_index + 1}]"

ListContent

Iterates items, delegating to nested ImageRenderable objects:

def render_with_images(self, registry, text_format) -> str:
    parts: list[str] = []
    for item in self.items:
        if isinstance(item, ImageRenderable):
            rendered = item.render_with_images(registry, text_format)
        else:
            rendered = str(item)
        if rendered:
            parts.append(rendered)
    return "\n".join(parts)

TextAndImagesContent

Renders text first, then registers each image:

def render_with_images(self, registry, text_format) -> str:
    parts: list[str] = []
    if self.text:
        parts.append(self.text.rendered_plain())
    if self.images:
        for image in self.images:
            image_index = registry.register_image(image)
            parts.append(f"[Image {image_index + 1}]")
    return "\n".join(parts)

StuffContent (Base)

Default implementation iterates Pydantic model fields:

def render_with_images(self, registry, text_format) -> str:
    parts: list[str] = []
    for field_name in type(self).model_fields:
        field_value = getattr(self, field_name)
        if field_value is None:
            continue
        if isinstance(field_value, ImageRenderable):
            rendered = field_value.render_with_images(registry, text_format)
        else:
            rendered = str(field_value)
        if rendered:
            parts.append(f"{field_name}: {rendered}")
    return "\n".join(parts)
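
To see how these implementations compose, the sketch below runs the same recursion over a small nested structure. Everything prefixed with Demo is a hypothetical stand-in (dataclasses instead of Pydantic models, a registry without deduplication), and render_value is a helper introduced only for this example.

```python
from dataclasses import dataclass, fields
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ImageRenderable(Protocol):
    def render_with_images(self, registry: Any, text_format: Any) -> str: ...


class DemoRegistry:
    """Hypothetical registry stand-in; deduplication is omitted for brevity."""

    def __init__(self) -> None:
        self.images: list[Any] = []

    def register_image(self, image: Any) -> int:
        self.images.append(image)
        return len(self.images) - 1  # 0-based index


def render_value(value: Any, registry: Any, text_format: Any) -> str:
    """Recurse into ImageRenderable values, stringify everything else."""
    if isinstance(value, ImageRenderable):
        return value.render_with_images(registry, text_format)
    return str(value)


@dataclass
class DemoImage:
    url: str

    def render_with_images(self, registry: Any, text_format: Any) -> str:
        return f"[Image {registry.register_image(self) + 1}]"


@dataclass
class DemoList:
    items: list[Any]

    def render_with_images(self, registry: Any, text_format: Any) -> str:
        parts = [render_value(item, registry, text_format) for item in self.items]
        return "\n".join(part for part in parts if part)


@dataclass
class DemoPage:
    caption: str
    figures: DemoList
    note: Any = None  # None fields are skipped, as in the base implementation

    def render_with_images(self, registry: Any, text_format: Any) -> str:
        parts: list[str] = []
        for spec in fields(self):
            value = getattr(self, spec.name)
            if value is None:
                continue
            rendered = render_value(value, registry, text_format)
            if rendered:
                parts.append(f"{spec.name}: {rendered}")
        return "\n".join(parts)


registry = DemoRegistry()
page = DemoPage(
    caption="Quarterly results",
    figures=DemoList([DemoImage("a.png"), "see appendix", DemoImage("b.png")]),
)
output = page.render_with_images(registry, text_format=None)
print(output)
# caption: Quarterly results
# figures: [Image 1]
# see appendix
# [Image 2]
```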

Image Numbering

Registry Behavior

Operation	Result
First image registered	Returns index `0`
Second image registered	Returns index `1`
Same URL registered again	Returns existing index (deduplication)
Display in template	`[Image {index + 1}]` (1-based)

Example

registry = ImageRegistry()

idx = registry.register_image(img_a)  # idx=0 → [Image 1]
idx = registry.register_image(img_b)  # idx=1 → [Image 2]
idx = registry.register_image(img_a)  # idx=0 → [Image 1] (same URL)

len(registry.images)  # 2 (deduplicated)
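
The registry internals are not shown here, so the following is only one way to obtain the behaviour in the table: a list for ordering plus a URL-to-index map for deduplication. DemoImage, DemoImageRegistry and the url attribute are assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DemoImage:
    """Hypothetical image type; only the URL matters for deduplication."""
    url: str


@dataclass
class DemoImageRegistry:
    """Hypothetical registry: 0-based indices, one entry per distinct URL."""
    images: list[DemoImage] = field(default_factory=list)
    _index_by_url: dict[str, int] = field(default_factory=dict)

    def register_image(self, image: DemoImage) -> int:
        existing = self._index_by_url.get(image.url)
        if existing is not None:
            return existing  # same URL: reuse the earlier index
        index = len(self.images)
        self.images.append(image)
        self._index_by_url[image.url] = index
        return index


demo_registry = DemoImageRegistry()
first = demo_registry.register_image(DemoImage("https://example.com/a.png"))
second = demo_registry.register_image(DemoImage("https://example.com/b.png"))
repeat = demo_registry.register_image(DemoImage("https://example.com/a.png"))
tokens = [f"[Image {index + 1}]" for index in (first, second, repeat)]
print(tokens)
```

The 0-based index stays an internal detail; the conversion to a 1-based token happens only where the token string is built.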

The `with_images` Filter

Execution Flow

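from typing import Any

from jinja2 import pass_context
from jinja2.runtime import Context, Undefined

# ImageRegistry, TextFormat, Jinja2ContextKey and Jinja2ContextError are project types (imports omitted)
@pass_context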
def with_images(context: Context, value: Any, _: Any = None) -> str:
    # 1. Validate input
    if isinstance(value, Undefined):
        raise Jinja2ContextError("Cannot use with_images on undefined")

    # 2. Get/create registry from context
    registry = context.get(Jinja2ContextKey.IMAGE_REGISTRY)
    if registry is None:
        registry = ImageRegistry()

    # 3. Get text format
    text_format = TextFormat(context.get(Jinja2ContextKey.TEXT_FORMAT))

    # 4. Protocol-based rendering
    if isinstance(value, ImageRenderable):
        return value.render_with_images(registry, text_format)

    # 5. Handle plain sequences
    if isinstance(value, (list, tuple)):
        return _render_sequence_with_images(value, registry, text_format)

    # 6. Reject unsupported types
    raise Jinja2ContextError(f"{type(value).__name__} does not implement ImageRenderable")
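
Step 5 relies on a helper whose body is not shown. One plausible shape, mirroring the ListContent logic, is sketched below. The function name without the leading underscore, the str() fallback for non-renderable items, and the CountingImage stand-in are all assumptions.

```python
from typing import Any, Protocol, Sequence, runtime_checkable


@runtime_checkable
class ImageRenderable(Protocol):
    def render_with_images(self, registry: Any, text_format: Any) -> str: ...


def render_sequence_with_images(values: Sequence[Any], registry: Any, text_format: Any) -> str:
    """Hypothetical helper: render each element and join the results line by line."""
    parts: list[str] = []
    for value in values:
        if isinstance(value, ImageRenderable):
            rendered = value.render_with_images(registry, text_format)
        else:
            rendered = str(value)  # assumption: other items are stringified
        if rendered:
            parts.append(rendered)
    return "\n".join(parts)


class CountingImage:
    """Stand-in image that numbers itself using a plain list as the registry."""

    def render_with_images(self, registry: list, text_format: Any) -> str:
        registry.append(self)
        return f"[Image {len(registry)}]"


seen: list = []
result = render_sequence_with_images([CountingImage(), "caption", CountingImage()], seen, None)
print(result)
# [Image 1]
# caption
# [Image 2]
```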

Filter Order Matters

with_images must receive structured data to extract images. Place it before any filter that converts to string.

Pattern	Works?
`{{ doc | with_images }}`	Yes
`{{ doc | with_images | tag }}`	Yes
`{{ doc | tag | with_images }}`	No: `tag` stringifies first
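
The reason the last pattern fails can be checked without Jinja2. In this sketch, demo_tag is a hypothetical stand-in for any filter that returns a string (the real tag output format is not shown in this document), and DemoDoc is a placeholder for a renderable value.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ImageRenderable(Protocol):
    def render_with_images(self, registry: Any, text_format: Any) -> str: ...


class DemoDoc:
    def render_with_images(self, registry: Any, text_format: Any) -> str:
        return "[Image 1]"

    def __str__(self) -> str:
        return "document text"


def demo_tag(value: Any) -> str:
    """Hypothetical stringifying filter."""
    return f"<doc>{value}</doc>"


doc = DemoDoc()
structured_ok = isinstance(doc, ImageRenderable)             # still structured data
stringified_ok = isinstance(demo_tag(doc), ImageRenderable)  # a plain str by now
print(structured_ok, stringified_ok)
```

A str is neither ImageRenderable nor a list or tuple, so the filter shown in Execution Flow would reach step 6 and raise.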

Syntax Quick Reference

Pattern	Purpose
`{{ x.field }}`	Access content field
`{{ x["field"] }}`	Bracket access
`{{ x._stuff_name }}`	Variable name metadata
`{{ x._content_class }}`	Content class name
`{{ x | with_images }}`	Extract images as tokens
`{{ x | with_images | tag }}`	Extract then wrap in tags

Files Reference

File	Purpose
`pipelex/core/stuffs/stuff_artefact.py`	Thin delegation adapter for Jinja2
`pipelex/tools/jinja2/image_renderable.py`	`@runtime_checkable` Protocol for images
`pipelex/tools/jinja2/tag_renderable.py`	`@runtime_checkable` Protocol for tagging
`pipelex/tools/jinja2/jinja2_with_images_filter.py`	`with_images` filter
`pipelex/tools/jinja2/jinja2_filters.py`	`tag` and `format` filters
`pipelex/tools/jinja2/image_registry.py`	Image tracking with deduplication
`pipelex/core/stuffs/stuff_content.py`	Base `render_with_images()`
`pipelex/core/stuffs/image_content.py`	Self-registering image token
`pipelex/core/stuffs/list_content.py`	List iteration for images
`pipelex/core/stuffs/text_and_images_content.py`	Text + images rendering