Refining Concepts

Concept refinement allows you to create more specific versions of existing concepts while inheriting their structure. This provides semantic clarity and type safety for domain-specific workflows.

What is Concept Refinement?

Refinement is the process of creating a specialized concept from a more general one. When you refine a concept, the new concept:

Inherits the structure of the base concept
Adds semantic specificity to clarify its purpose

Think of it as creating a subtype: an Invoice is a specific kind of PDF, and a Photo is a specific kind of Image.
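
Using the refines syntax described on this page, those two subtypes might be declared as follows. This is a minimal sketch: the concept names come from the sentence above, and the descriptions are illustrative placeholders.

```toml
# Invoice is a specific kind of PDF
[concept.Invoice]
description = "A commercial invoice"
refines = "PDF"

# Photo is a specific kind of Image
[concept.Photo]
description = "A photograph"
refines = "Image"
```

Neither concept declares a structure of its own; each relies on the structure of the native concept it refines.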

Why Refine Concepts?

1. Semantic Clarity

Refined concepts make your pipeline's intent explicit:

# ❌ Less clear
[pipe.process_invoice]
inputs = { invoice = "PDF" }

# ✅ Clearer
[pipe.process_invoice]
inputs = { invoice = "Invoice" }

2. Self-Documenting Code

[pipe.extract_contract_terms]
type = "PipeLLM"
description = "Extract key terms from a contract"
inputs = { contract = "Contract" }  # Clear what type of document is expected
output = "ContractTerms"

3. Domain-Specific Workflows

Build pipelines tailored to specific use cases:

domain = "finance"

[concept.Invoice]
description = "A commercial invoice"
refines = "PDF"

[concept.Receipt]
description = "Proof of payment"
refines = "PDF"

[pipe.process_invoice]
type = "PipeLLM"
inputs = { invoice = "Invoice" }
output = "InvoiceData"
# ... invoice-specific processing

[pipe.process_receipt]
type = "PipeLLM"
inputs = { receipt = "Receipt" }
output = "ReceiptData"
# ... receipt-specific processing

4. Type Validation

Using specific concept names helps catch errors early:

[pipe.analyze_invoice]
inputs = { invoice = "Invoice" }  # Only accepts Invoice
output = "Analysis"

Current Limitations

You can only refine native concepts (For Now)

Currently, you can only refine native concepts. Refining custom concepts will be supported in a future release.

No structure on refined concepts (For Now)

When you refine a concept, you cannot add an inline structure or specify a structure_class_name. This limitation will be lifted in future releases.

Not allowed:

[concept.Invoice]
description = "A commercial invoice"
refines = "PDF"
structure_class_name = "InvoiceModel"  # ❌ Not allowed

[concept.Invoice.structure]  # ❌ Not allowed
invoice_number = "Invoice ID"

Allowed:

[concept.Invoice]
description = "A commercial invoice"
refines = "PDF"  # ✅ Inherits PDFContent structure

Basic Refinement Syntax

Define a refined concept using the refines field:

[concept.ConceptName]
description = "Description of the refined concept"
refines = "NativeConceptName"

Refining PDF

[concept.Invoice]
description = "A commercial document issued by a seller to a buyer"
refines = "PDF"

[concept.Contract]
description = "A legally binding agreement between parties"
refines = "PDF"

Both concepts inherit the PDFContent structure (with a url field) but represent semantically distinct document types.

Refining Image

[concept.ProductPhoto]
description = "A photograph of a product for marketing purposes"
refines = "Image"

[concept.Screenshot]
description = "A screen capture image"
refines = "Image"

Each concept inherits the ImageContent structure (url, caption, base_64, etc.) but carries its own semantic meaning.

Refining Text

[concept.Article]
description = "A written composition on a specific topic"
refines = "Text"

[concept.Summary]
description = "A condensed version of a longer text"
refines = "Text"

Type Compatibility

Understanding how refined concepts interact with pipe inputs is crucial.

How Refinement Affects Type Checking

Key Rule

A pipe that accepts a native concept will NOT accept concepts that refine it.

[pipe.extract_text]
inputs = { document = "PDF" }  # Only accepts PDF, not Invoice or Contract

If you want a pipe to accept both a native concept and its refinements, you must explicitly define the pipe to accept the refined concepts or use a more general approach.
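
One way to read "explicitly define the pipe to accept the refined concepts" is to declare a separate pipe for each concept you need to handle. The sketch below assumes that reading. The pipe names summarize_pdf and summarize_invoice are hypothetical, and the pipes are abbreviated in the same way as the other fragments on this page.

```toml
[concept.Invoice]
description = "A commercial invoice"
refines = "PDF"

[concept.Summary]
description = "A condensed version of a longer text"
refines = "Text"

# Hypothetical pipe: accepts generic PDFs only
[pipe.summarize_pdf]
type = "PipeLLM"
inputs = { document = "PDF" }
output = "Summary"

# Hypothetical pipe: accepts Invoices only
[pipe.summarize_invoice]
type = "PipeLLM"
inputs = { invoice = "Invoice" }
output = "Summary"
```

Under the key rule above, a generic PDF would go to summarize_pdf and an Invoice to summarize_invoice, since neither pipe accepts the other's input concept.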

Practical Example

[concept.Invoice]
refines = "PDF"

[concept.Contract]
refines = "PDF"

# This pipe only accepts generic PDFs
[pipe.extract_from_pdf]
type = "PipeExtract"
inputs = { document = "PDF" }
output = "Page"

# This pipe only accepts Invoices
[pipe.process_invoice]
type = "PipeLLM"
inputs = { invoice = "Invoice" }
output = "InvoiceData"

# This pipe only accepts Contracts
[pipe.process_contract]
type = "PipeLLM"
inputs = { contract = "Contract" }
output = "ContractData"

In this setup:

extract_from_pdf expects exactly PDF (not Invoice or Contract)
process_invoice expects exactly Invoice
process_contract expects exactly Contract

Best Practices

1. Choose Meaningful Names

# ❌ Avoid generic or vague names
[concept.Document1]
refines = "PDF"

# ✅ Use specific, descriptive names
[concept.Invoice]
refines = "PDF"

2. Write Clear Descriptions

# ❌ Too vague
[concept.Invoice]
description = "A document"
refines = "PDF"

# ✅ Clear and specific
[concept.Invoice]
description = "A commercial document issued by a seller to a buyer, detailing products or services provided and payment terms"
refines = "PDF"

3. Don't Over-Refine

# ❌ Too specific, creates unnecessary complexity
[concept.SmallInvoice]
description = "An invoice under $100"
refines = "PDF"

[concept.LargeInvoice]
description = "An invoice over $1000"
refines = "PDF"

# ✅ Keep it general, handle specifics in processing logic
[concept.Invoice]
description = "A commercial invoice"
refines = "PDF"

When to Refine vs. When to Create New Concepts

Refine When:

✅ Your concept is semantically a specific type of a native concept
✅ The native structure is sufficient for your needs
✅ You want to inherit existing validation and behavior
✅ You're building domain-specific workflows with clear document/content types

Example:

[concept.Invoice]  # Clearly a type of PDF
refines = "PDF"

Create New Concept When:

✅ Your concept needs custom structure with multiple fields
✅ Your concept doesn't naturally fit any native concept
✅ You need complex validation or computed properties
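
Example:

As a rough illustration, a concept that does not refine anything can carry its own inline structure. The sketch below is an assumption-laden example: the Customer concept and its fields are hypothetical, and the field syntax simply mirrors the invoice_number line shown under Current Limitations. See the Inline Structures guide for the actual rules.

```toml
# Hypothetical custom concept: there is no `refines` field,
# so the restriction on adding a structure does not apply
[concept.Customer]
description = "A customer record with contact details"

[concept.Customer.structure]
name = "Full name of the customer"
email = "Contact email address"
```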

Related Documentation

Define Your Concepts - Introduction to concepts
Native Concepts - Complete guide to native concepts
Inline Structures - Add structure to concepts
Python StructuredContent Classes - Advanced customization