Refining Concepts
Concept refinement allows you to create more specific versions of existing concepts while inheriting their structure. This provides semantic clarity and type safety for domain-specific workflows.
What is Concept Refinement?
Refinement is the process of creating a specialized concept from a more general one. When you refine a concept, the new concept:
- Inherits the structure of the base concept
- Adds semantic specificity to clarify its purpose
Think of it as creating a subtype: an Invoice is a specific kind of PDF, and a Photo is a specific kind of Image.
Why Refine Concepts?
1. Semantic Clarity
Refined concepts make your pipeline's intent explicit:
# ❌ Less clear
[pipe.process_invoice]
inputs = { invoice = "PDF" }
# ✅ More clear
[pipe.process_invoice]
inputs = { invoice = "Invoice" }
2. Self-Documenting Code
[pipe.extract_contract_terms]
type = "PipeLLM"
description = "Extract key terms from a contract"
inputs = { contract = "Contract" } # Clear what type of document is expected
output = "ContractTerms"
3. Domain-Specific Workflows
Build pipelines tailored to specific use cases:
domain = "finance"
[concept.Invoice]
description = "A commercial invoice"
refines = "PDF"
[concept.Receipt]
description = "Proof of payment"
refines = "PDF"
[pipe.process_invoice]
type = "PipeLLM"
inputs = { invoice = "Invoice" }
output = "InvoiceData"
# ... invoice-specific processing
[pipe.process_receipt]
type = "PipeLLM"
inputs = { receipt = "Receipt" }
output = "ReceiptData"
# ... receipt-specific processing
4. Type Validation
Using specific concept names helps catch errors early:
[pipe.analyze_invoice]
inputs = { invoice = "Invoice" } # Only accepts Invoice
output = "Analysis"
Current Limitations
You can only refine native concepts (For Now)
Currently, you can only refine native concepts. Refining custom concepts will be supported in a future release
No structure on refined concepts (For Now)
When you refine a concept, you cannot add an inline structure or specify a structure_class_name. This limitation will be lifted in future releases.
Not allowed:
[concept.Invoice]
description = "A commercial invoice"
refines = "PDF"
structure_class_name = "InvoiceModel" # ❌ Not allowed
[concept.Invoice.structure] # ❌ Not allowed
invoice_number = "Invoice ID"
Allowed:
[concept.Invoice]
description = "A commercial invoice"
refines = "PDF" # ✅ Inherits PDFContent structure
Basic Refinement Syntax
Define a refined concept using the refines field:
[concept.ConceptName]
description = "Description of the refined concept"
refines = "NativeConceptName"
Refining PDF
[concept.Invoice]
description = "A commercial document issued by a seller to a buyer"
refines = "PDF"
[concept.Contract]
description = "A legally binding agreement between parties"
refines = "PDF"
Both concepts inherit the PDFContent structure (with a url field) but represent semantically distinct document types.
Refining Image
[concept.ProductPhoto]
description = "A photograph of a product for marketing purposes"
refines = "Image"
[concept.Screenshot]
description = "A screen capture image"
refines = "Image"
Each inherits ImageContent structure (url, caption, base_64, etc.) with specific semantic meaning.
Refining Text
[concept.Article]
description = "A written composition on a specific topic"
refines = "Text"
[concept.Summary]
description = "A condensed version of a longer text"
refines = "Text"
Type Compatibility
Understanding how refined concepts interact with pipe inputs is crucial.
How Refinement Affects Type Checking
Key Rule
A pipe that accepts a native concept will NOT accept concepts that refine it.
[pipe.extract_text]
inputs = { document = "PDF" } # Only accepts PDF, not Invoice or Contract
If you want a pipe to accept both a native concept and its refinements, you must explicitly define the pipe to accept the refined concepts or use a more general approach.
Practical Example
[concept.Invoice]
refines = "PDF"
[concept.Contract]
refines = "PDF"
# This pipe only accepts generic PDFs
[pipe.extract_from_pdf]
type = "PipeExtract"
inputs = { document = "PDF" }
output = "Page"
# This pipe only accepts Invoices
[pipe.process_invoice]
type = "PipeLLM"
inputs = { invoice = "Invoice" }
output = "InvoiceData"
# This pipe only accepts Contracts
[pipe.process_contract]
type = "PipeLLM"
inputs = { contract = "Contract" }
output = "ContractData"
In this setup:
extract_from_pdfexpects exactlyPDF(notInvoiceorContract)process_invoiceexpects exactlyInvoiceprocess_contractexpects exactlyContract
Best Practices
1. Choose Meaningful Names
# ❌ Avoid generic or vague names
[concept.Document1]
refines = "PDF"
# ✅ Use specific, descriptive names
[concept.Invoice]
refines = "PDF"
2. Write Clear Descriptions
# ❌ Too vague
[concept.Invoice]
description = "A document"
refines = "PDF"
# ✅ Clear and specific
[concept.Invoice]
description = "A commercial document issued by a seller to a buyer, detailing products or services provided and payment terms"
refines = "PDF"
3. Don't Over-Refine
# ❌ Too specific, creates unnecessary complexity
[concept.SmallInvoice]
description = "An invoice under $100"
refines = "PDF"
[concept.LargeInvoice]
description = "An invoice over $1000"
refines = "PDF"
# ✅ Keep it general, handle specifics in processing logic
[concept.Invoice]
description = "A commercial invoice"
refines = "PDF"
When to Refine vs. When to Create New Concepts
Refine When:
- ✅ Your concept is semantically a specific type of a native concept
- ✅ The native structure is sufficient for your needs
- ✅ You want to inherit existing validation and behavior
- ✅ You're building domain-specific workflows with clear document/content types
Example:
[concept.Invoice] # Clearly a type of PDF
refines = "PDF"
Create New Concept When:
- ✅ Your concept needs custom structure with multiple fields
- ✅ Your concept doesn't naturally fit any native concept
- ✅ You need complex validation or computed properties
Related Documentation
- Define Your Concepts - Introduction to concepts
- Native Concepts - Complete guide to native concepts
- Inline Structures - Add structure to concepts
- Python StructuredContent Classes - Advanced customization