Pipelex · lchoquel · Dec 14, 2025 · Dec 14, 2025 · Jan 11, 2026 · Jan 12, 2026
diff --git a/BLACKBOX_RULES.md → .blackboxrules b/BLACKBOX_RULES.md → .blackboxrules
diff --git a/.cursor/rules/run_pipelex.mdc b/.cursor/rules/run_pipelex.mdc
@@ -55,7 +55,7 @@ async def extract_gantt(image_url: str) -> GanttChart:
     # Run the pipe
     pipe_output = await execute_pipeline(
         pipe_code="extract_gantt_by_steps",
-        input_memory={
+        inputs={
             "gantt_chart_image": {
                 "concept": "gantt.GanttImage",
                 "content": ImageContent(url=image_url),
@@ -94,26 +94,26 @@ So here are a few concrete examples of calls to execute_pipeline with various wa
 # If you assign a string, by default it will be considered as a TextContent.
     pipe_output = await execute_pipeline(
         pipe_code="master_advisory_orchestrator",
-        input_memory={
+        inputs={
             "user_input": problem_description,
         },
     )
 
-# Here we have a single input and it's a PDF.
-# Because PDFContent is a native concept, we can use it directly as a value,
+# Here we have a single input and it's a document.
+# Because DocumentContent is a native concept, we can use it directly as a value,
 # the system knows what content it corresponds to:
     pipe_output = await execute_pipeline(
         pipe_code="power_extractor_dpe",
-        input_memory={
-            "document": PDFContent(url=pdf_url),
+        inputs={
+            "document": DocumentContent(url=pdf_url),
         },
     )
 
 # Here we have a single input and it's an Image.
 # Because ImageContent is a native concept, we can use it directly as a value:
     pipe_output = await execute_pipeline(
         pipe_code="fashion_variation_pipeline",
-        input_memory={
+        inputs={
             "fashion_photo": ImageContent(url=image_url),
         },
     )
@@ -123,7 +123,7 @@ So here are a few concrete examples of calls to execute_pipeline with various wa
 # so we must provide it using a dict with the concept and the content:
     pipe_output = await execute_pipeline(
         pipe_code="extract_gantt_by_steps",
-        input_memory={
+        inputs={
             "gantt_chart_image": {
                 "concept": "gantt.GanttImage",
                 "content": ImageContent(url=image_url),
@@ -135,7 +135,7 @@ So here are a few concrete examples of calls to execute_pipeline with various wa
     pipe_output = await execute_pipeline(
         pipe_code="retrieve_then_answer",
         dynamic_output_concept_code="contracts.Fees",
-        input_memory={
+        inputs={
             "text": load_text_from_path(path=text_path),
             "question": {
                 "concept": "answer.Question",

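Review note: the hunks above rename the `input_memory` keyword to `inputs` and `PDFContent` to `DocumentContent`. Existing call sites could be updated with a small codemod along these lines. This is a minimal sketch only: the `RENAMES` table and the `migrate_call_site` helper are hypothetical, derived from the renames visible in this diff, and a regex pass is a blunt tool that should be followed by a manual review.

```python
import re

# Renames observed in the diff above. This table is an illustrative
# assumption, not an official Pipelex migration list.
RENAMES = {
    # keyword argument `input_memory=` (but not a comparison `==`)
    r"\binput_memory\s*=(?!=)": "inputs=",
    # native content class renamed in the examples
    r"\bPDFContent\b": "DocumentContent",
}


def migrate_call_site(source: str) -> str:
    """Apply the keyword and class renames to a snippet of Python source."""
    for pattern, replacement in RENAMES.items():
        source = re.sub(pattern, replacement, source)
    return source


old_call = (
    'execute_pipeline(pipe_code="power_extractor_dpe", '
    'input_memory={"document": PDFContent(url=pdf_url)})'
)
new_call = migrate_call_site(old_call)
print(new_call)
```

The helper only rewrites text, so it can be run over rule files and example scripts alike before re-running validation.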
diff --git a/.cursor/rules/write_pipelex.mdc b/.cursor/rules/write_pipelex.mdc
@@ -8,7 +8,10 @@ globs:
 # Guide to write or edit pipelines using the Pipelex language in .plx files
 
 - Always first write your "plan" in natural language, then transcribe it in pipelex.
-- You should ALWAYS RUN the terminal command `make validate` when you are writing or editing a `.plx` file. It will ensure the pipe is runnable. If not, iterate.
+- You should ALWAYS RUN validation when you are writing or editing a `.plx` file. It will ensure the pipe is runnable. If not, iterate.
+  - For a specific file: `pipelex validate path_to_file.plx`
+  - For all pipelines: `pipelex validate all`
+  - **IMPORTANT**: Ensure the Python virtual environment is activated before running `pipelex` commands. For standard installations, the venv is named `.venv` - always check that first. The commands will not work without proper venv activation.
 - Please use POSIX standard for files. (empty lines, no trailing whitespaces, etc.)
 
 ## Pipeline File Naming
@@ -24,10 +27,10 @@ A pipeline file has three main sections:
 
 ### Domain Statement
 ```plx
-domain = "domain_name"
+domain = "domain_code"
 description = "Description of the domain" # Optional
 ```
-Note: The domain name usually matches the plx filename for single-file domains. For multi-file domains, use the subdirectory name.
+Note: The domain code usually matches the plx filename for single-file domains. For multi-file domains, use the subdirectory name.
 
 ### Concept Definitions
 
@@ -42,10 +45,10 @@ ConceptName = "Description of the concept"
 - Use PascalCase for concept names
 - Never use plurals (no "Stories", use "Story") - lists are handled implicitly by Pipelex
 - Avoid circumstantial adjectives (no "LargeText", use "Text") - focus on the essence of what the concept represents
-- Don't redefine native concepts (Text, Image, PDF, TextAndImages, Number, Page)
+- Don't redefine native concepts (Text, Image, PDF, TextAndImages, Number, Page, JSON)
 
 **Native Concepts:**
-Pipelex provides built-in native concepts: `Text`, `Image`, `PDF`, `TextAndImages`, `Number`, `Page`. Use these directly or refine them when appropriate.
+Pipelex provides built-in native concepts: `Text`, `Image`, `PDF`, `TextAndImages`, `Number`, `Page`, `JSON`. Use these directly or refine them when appropriate.
 
 **Refining Native Concepts:**
 To create a concept that specializes a native concept without adding fields:
@@ -63,7 +66,7 @@ For details on how to structure concepts with fields, see the "Structuring Model
 ## Pipe Base Definition
 
 ```plx
-[pipe.your_pipe_name]
+[pipe.your_pipe_code]
 type = "PipeLLM"
 description = "A description of what your pipe does"
 inputs = { input_1 = "ConceptName1", input_2 = "ConceptName2" }
@@ -73,7 +76,7 @@ output = "ConceptName"
 The pipes will all have at least this base definition. 
 - `inputs`: Dictionary whose keys are the variable names used in the prompts and whose values are the corresponding ConceptName. It must ALSO LIST THE INPUTS OF THE INTERMEDIATE STEPS (if PipeSequence) or of the conditional pipes (if PipeCondition).
 So if you see this error:
-`StaticValidationError: missing_input_variable • domain='expense_validator' • pipe='validate_expense' • 
+`PipeValidationError: missing_input_variable • domain='expense_validator' • pipe='validate_expense' • 
variable='['invoice']'`
 That means the pipe `validate_expense` is missing the input `invoice`, which one of its sub-pipes requires.
 
@@ -128,16 +131,16 @@ For concepts with structured fields, define them inline using TOML syntax:
 description = "A commercial document issued by a seller to a buyer"
 
 [concept.Invoice.structure]
-invoice_number = "The unique invoice identifier"
+invoice_number = "The unique invoice identifier" # This will be optional by default
 issue_date = { type = "date", description = "The date the invoice was issued", required = true }
 total_amount = { type = "number", description = "The total invoice amount", required = true }
-vendor_name = "The name of the vendor"
-line_items = { type = "list", item_type = "text", description = "List of items", required = false }
+vendor_name = "The name of the vendor" # This will be optional by default
+line_items = { type = "list", item_type = "text", description = "List of items" }
 ```
 
 **Supported inline field types:** `text`, `integer`, `boolean`, `number`, `date`, `list`, `dict`
 
-**Field properties:** `type`, `description`, `required` (default: true), `default_value`, `choices`, `item_type` (for lists), `key_type` and `value_type` (for dicts)
+**Field properties:** `type`, `description`, `required` (default: false), `default_value`, `choices`, `item_type` (for lists), `key_type` and `value_type` (for dicts)
 
 **Simple syntax** (creates an optional text field, since `required` defaults to false):
 ```plx
@@ -146,7 +149,7 @@ field_name = "Field description"
 
 **Detailed syntax** (with explicit properties):
 ```plx
-field_name = { type = "text", description = "Field description", required = false, default_value = "default" }
+field_name = { type = "text", description = "Field description", default_value = "default" }
 ```
 
 **3. Python StructuredContent Class (For Advanced Features)**
@@ -472,7 +475,7 @@ The PipeExtract operator is used to extract text and images from an image or a P
 [pipe.extract_info]
 type = "PipeExtract"
 description = "extract the information"
-inputs = { document = "PDF" } # or { image = "Image" } if it's an image. This is the only input.
+inputs = { document = "Document" } # or { image = "Image" } if it's an image. This is the only input.
 output = "Page"
 ```
 
@@ -481,7 +484,7 @@ Using Extract Model Settings:
 [pipe.extract_with_model]
 type = "PipeExtract"
 description = "Extract with specific model"
-inputs = { document = "PDF" }
+inputs = { document = "Document" }
 output = "Page"
 model = "base_extract_mistral"  # Use predefined extract preset or model alias
 ```
@@ -589,25 +592,160 @@ $sales_rep.phone | $sales_rep.email
 """
 ```
 
-### Key Parameters
+### Key Parameters (Template Mode)
 
-- `template`: Inline template string (mutually exclusive with template_name)
+- `template`: Inline template string (mutually exclusive with template_name and construct)
 - `template_name`: Name of a predefined template (mutually exclusive with template)
 - `template_category`: Template type ("llm_prompt", "html", "markdown", "mermaid", etc.)
 - `templating_style`: Styling options for template rendering
 - `extra_context`: Additional context variables for template
 
 For more control, you can use a nested `template` section instead of the `template` field:
+
 - `template.template`: The template string
 - `template.category`: Template type
 - `template.templating_style`: Styling options
 
 ### Template Variables
 
 Use the same variable insertion rules as PipeLLM:
+
 - `@variable` for block insertion (multi-line content)
 - `$variable` for inline insertion (short text)
 
+### Construct Mode (for StructuredContent Output)
+
+PipeCompose can also generate `StructuredContent` objects using the `construct` section. This mode composes field values from fixed values, variable references, templates, or nested structures.
+
+**When to use construct mode:**
+
+- You need to output a structured object (not just Text)
+- You want to deterministically compose fields from existing data
+- No LLM is needed - just data composition and templating
+
+#### Basic Construct Usage
+
+```plx
+[concept.SalesSummary]
+description = "A structured sales summary"
+
+[concept.SalesSummary.structure]
+report_title = { type = "text", description = "Title of the report" }
+customer_name = { type = "text", description = "Customer name" }
+deal_value = { type = "number", description = "Deal value" }
+summary_text = { type = "text", description = "Generated summary text" }
+
+[pipe.compose_summary]
+type = "PipeCompose"
+description = "Compose a sales summary from deal data"
+inputs = { deal = "Deal" }
+output = "SalesSummary"
+
+[pipe.compose_summary.construct]
+report_title = "Monthly Sales Report"
+customer_name = { from = "deal.customer_name" }
+deal_value = { from = "deal.amount" }
+summary_text = { template = "Deal worth $deal.amount with $deal.customer_name" }
+```
+
+#### Field Composition Methods
+
+There are four ways to define field values in a construct:
+
+**1. Fixed Value (literal)**
+
+Use a literal value directly:
+
+```plx
+[pipe.compose_report.construct]
+report_title = "Annual Report"
+report_year = 2024
+is_draft = false
+```
+
+**2. Variable Reference (`from`)**
+
+Get a value from working memory using a dotted path:
+
+```plx
+[pipe.compose_report.construct]
+customer_name = { from = "deal.customer_name" }
+total_amount = { from = "order.total" }
+street_address = { from = "customer.address.street" }
+```
+
+**3. Template (`template`)**
+
+Render a Jinja2 template with variable substitution:
+
+```plx
+[pipe.compose_report.construct]
+invoice_number = { template = "INV-$order.id" }
+summary = { template = "Deal worth $deal.amount with $deal.customer_name on {{ current_date }}" }
+```
+
+**4. Nested Construct**
+
+For nested structures, use a TOML subsection:
+
+```plx
+[pipe.compose_invoice.construct]
+invoice_number = { template = "INV-$order.id" }
+total = { from = "order.total_amount" }
+
+[pipe.compose_invoice.construct.billing_address]
+street = { from = "customer.address.street" }
+city = { from = "customer.address.city" }
+country = "France"
+```
+
+#### Complete Construct Example
+
+```plx
+domain = "invoicing"
+
+[concept.Address]
+description = "A postal address"
+
+[concept.Address.structure]
+street = { type = "text", description = "Street address" }
+city = { type = "text", description = "City name" }
+country = { type = "text", description = "Country name" }
+
+[concept.Invoice]
+description = "An invoice document"
+
+[concept.Invoice.structure]
+invoice_number = { type = "text", description = "Invoice number" }
+total = { type = "number", description = "Total amount" }
+
+[pipe.compose_invoice]
+type = "PipeCompose"
+description = "Compose an invoice from order and customer data"
+inputs = { order = "Order", customer = "Customer" }
+output = "Invoice"
+
+[pipe.compose_invoice.construct]
+invoice_number = { template = "INV-$order.id" }
+total = { from = "order.total_amount" }
+
+[pipe.compose_invoice.construct.billing_address]
+street = { from = "customer.address.street" }
+city = { from = "customer.address.city" }
+country = "France"
+```
+
+#### Key Parameters (Construct Mode)
+
+- `construct`: Dictionary mapping field names to their composition rules
+- Each field can be:
+  - A literal value (string, number, boolean)
+  - A dict with `from` key for variable reference
+  - A dict with `template` key for template rendering
+  - A nested dict for nested structures
+
+**Note:** You must use either `template` or `construct`, not both. They are mutually exclusive.
+
 ## PipeImgGen operator
 
 The PipeImgGen operator is used to generate images using AI image generation models.
@@ -821,7 +959,7 @@ Presets are meant to record the choice of an llm with its hyper parameters (temp
 
 Examples:
 ```toml
-llm_for_complex_reasoning = { model = "base-claude", temperature = 1 }
+llm_to_engineer = { model = "base-claude", temperature = 1 }
 llm_to_extract_invoice = { model = "claude-3-7-sonnet", temperature = 0.1, max_tokens = "auto" }
 ```
 
@@ -850,6 +988,10 @@ You can override the predefined llm presets by setting them in `.pipelex/inferen
 
 ---
 
-ALWAYS RUN `make validate` when you are finished writing pipelines: This checks for errors. If there are errors, iterate until it works.
+ALWAYS RUN validation when you are finished writing pipelines: This checks for errors. If there are errors, iterate until it works.
+- For a specific bundle/file: `pipelex validate path_to_file.plx`
+- For all pipelines: `pipelex validate all`
+- Remember: Ensure your Python virtual environment is activated (typically `.venv` for standard installations) before running `pipelex` commands.
+
 Then, create an example file to run the pipeline in the `examples` folder.
 But don't write documentation unless asked explicitly to.
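
Review note: the four field composition methods documented for construct mode in the write_pipelex.mdc hunk above (literal, `from`, `template`, nested) can be pictured with a toy resolver. This is a rough sketch in plain Python, not Pipelex's implementation: `lookup`, `render`, and `resolve` are hypothetical helpers, working memory is modelled as nested dicts, and only `$dotted.path` substitution is handled (no Jinja2 features).

```python
import re
from typing import Any


def lookup(memory: dict, path: str) -> Any:
    """Follow a dotted path such as 'customer.address.city' through nested dicts."""
    value: Any = memory
    for key in path.split("."):
        value = value[key]
    return value


def render(template: str, memory: dict) -> str:
    """Replace each $dotted.path occurrence with the value found in memory."""
    pattern = r"\$([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)"
    return re.sub(pattern, lambda m: str(lookup(memory, m.group(1))), template)


def resolve(construct: dict, memory: dict) -> dict:
    """Build an output dict from a construct-like mapping (toy semantics)."""
    result = {}
    for field, rule in construct.items():
        if isinstance(rule, dict) and set(rule) == {"from"}:
            result[field] = lookup(memory, rule["from"])       # variable reference
        elif isinstance(rule, dict) and set(rule) == {"template"}:
            result[field] = render(rule["template"], memory)   # template rendering
        elif isinstance(rule, dict):
            result[field] = resolve(rule, memory)              # nested construct
        else:
            result[field] = rule                               # fixed literal value
    return result


memory = {
    "order": {"id": 42, "total_amount": 99.5},
    "customer": {"address": {"street": "1 Rue de Rivoli", "city": "Paris"}},
}
construct = {
    "invoice_number": {"template": "INV-$order.id"},
    "total": {"from": "order.total_amount"},
    "billing_address": {
        "street": {"from": "customer.address.street"},
        "city": {"from": "customer.address.city"},
        "country": "France",
    },
}
invoice = resolve(construct, memory)
print(invoice)
```

The sample data mirrors the `compose_invoice` example from the diff; the order id, amount, and street values are made up for illustration.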
diff --git a/.env.example b/.env.example
@@ -1,6 +1,6 @@
 # [OPTIONAL] Free Pipelex Inference API key - Get yours on Discord: https://go.pipelex.com/discord
 # No credit card required, limited time offer
-PIPELEX_INFERENCE_API_KEY=
+PIPELEX_GATEWAY_API_KEY=
 
 # OpenAI: to use models like GPT-4o and GPT-5
 OPENAI_API_KEY=
@@ -21,10 +21,6 @@ ANTHROPIC_API_KEY=
 # To use Mistral models
 MISTRAL_API_KEY=
 
-# To use perplexity, including results from web search
-PERPLEXITY_API_KEY=
-PERPLEXITY_API_ENDPOINT=https://api.perplexity.ai
-
 # To generate images with fal.ai, which hosts image models such as FLUX by Black Forest Labs
 FAL_API_KEY=
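
Review note: this hunk renames `PIPELEX_INFERENCE_API_KEY` to `PIPELEX_GATEWAY_API_KEY`. Local scripts that read the key directly may want a transitional fallback while `.env` files are being updated. A minimal sketch, assuming a hypothetical helper `get_gateway_api_key`; nothing in this diff indicates that Pipelex itself still honors the old variable name.

```python
import os
from typing import Mapping, Optional


def get_gateway_api_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the key under its new name, falling back to the pre-rename name."""
    env = os.environ if env is None else env
    # Empty strings (as in .env.example) are treated as "not set".
    return env.get("PIPELEX_GATEWAY_API_KEY") or env.get("PIPELEX_INFERENCE_API_KEY") or None


# Usage with explicit mappings instead of the real environment:
print(get_gateway_api_key({"PIPELEX_GATEWAY_API_KEY": "new-key"}))
print(get_gateway_api_key({"PIPELEX_GATEWAY_API_KEY": "", "PIPELEX_INFERENCE_API_KEY": "old-key"}))
```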