A comprehensive overview of Instructor’s capabilities.

Core Features

Structured Output Extraction

Define a PHP class, get a populated object back:
<?php
use Cognesy\Instructor\StructuredOutput;

class Person {
    public string $name;
    public int $age;
    public string $occupation;
}

$person = (new StructuredOutput)
    ->withResponseClass(Person::class)
    ->withMessages("Extract: Sarah, 32, software architect")
    ->get();
Key capabilities:
  • Works with any PHP class with typed properties
  • Supports nested objects and arrays
  • Handles nullable fields gracefully
  • Preserves type information throughout
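
To make "typed properties" concrete, the sketch below uses PHP's built-in Reflection API to read each property's name, type, and nullability from a class. It only illustrates the kind of information a typed class exposes; it is not Instructor's schema builder, and `Contact` and `describeProperties()` are hypothetical names.

```php
<?php
// Hypothetical example class (not part of Instructor).
class Contact {
    public string $name;
    public int $age;
    public ?string $email = null;
}

/**
 * Hypothetical helper: list each public property's type and nullability.
 * Only single named types are resolved; anything else is reported as "mixed".
 */
function describeProperties(string $class): array {
    $out = [];
    $props = (new ReflectionClass($class))->getProperties(ReflectionProperty::IS_PUBLIC);
    foreach ($props as $prop) {
        $type = $prop->getType();
        $out[$prop->getName()] = [
            'type' => $type instanceof ReflectionNamedType ? $type->getName() : 'mixed',
            'nullable' => $type === null || $type->allowsNull(),
        ];
    }
    return $out;
}

print_r(describeProperties(Contact::class));
```

The nullable `?string $email` is reported with type `string` and `nullable` set to true, which is the kind of distinction a schema needs in order to mark a field optional.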

Automatic Validation

Built-in support for Symfony Validator:
<?php
use Symfony\Component\Validator\Constraints as Assert;

class User {
    #[Assert\NotBlank]
    #[Assert\Email]
    public string $email;

    #[Assert\Range(min: 18, max: 120)]
    public int $age;

    #[Assert\Choice(['active', 'inactive', 'pending'])]
    public string $status;
}
Validation features:
  • All Symfony validation constraints supported
  • Custom validators work out of the box
  • Validation errors trigger automatic retry
  • Error messages sent to LLM for self-correction

Self-Correcting Retries

When validation fails, Instructor automatically retries:
<?php
use Cognesy\Instructor\StructuredOutput;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Polyglot\Inference\LLMProvider;

$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
    ->withMaxRetries(3);

$result = (new StructuredOutput($runtime))
    ->withResponseClass(User::class)
    ->withMessages($text)
    ->get();
Retry behavior:
  1. LLM generates response
  2. Response validated against constraints
  3. On failure: errors sent back to LLM with context
  4. LLM attempts correction
  5. Repeat until valid or max retries reached
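
The steps above can be sketched as a plain PHP loop. This is a hypothetical illustration, not Instructor's implementation: `extractWithRetries()` and both callables are invented for the example, and it assumes that "max retries" means one initial attempt plus up to that many corrections.

```php
<?php
/**
 * Hypothetical retry loop (not Instructor's code).
 * $generate(array $errors): mixed  stands in for the LLM call and receives the previous errors.
 * $validate(mixed $value): array   returns a list of error strings; an empty list means valid.
 */
function extractWithRetries(callable $generate, callable $validate, int $maxRetries = 3): mixed {
    $errors = [];
    // One initial attempt plus up to $maxRetries corrections (an assumption).
    for ($attempt = 0; $attempt <= $maxRetries; $attempt++) {
        $value = $generate($errors);   // steps 1 and 4: generate, or correct using feedback
        $errors = $validate($value);   // step 2: validate against constraints
        if ($errors === []) {
            return $value;             // valid result, stop here
        }
        // step 3: $errors is handed to the next $generate() call
    }
    throw new RuntimeException('Validation failed: ' . implode('; ', $errors));
}

// Stand-in callables: the fake "model" fixes the age once it sees an error.
$generate = fn(array $errors) => ['age' => $errors === [] ? 12 : 32];
$validate = fn(array $v) => $v['age'] >= 18 ? [] : ['age must be at least 18'];

$result = extractWithRetries($generate, $validate);
echo $result['age'], PHP_EOL; // prints 32
```

Passing the error list back into the generator is the key design point: the second attempt is a correction informed by the first failure rather than a blind repeat.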

Input Flexibility

Text Input

Simple string input:
<?php
$instructor->withMessages("John is 25 years old and works at Acme Corp");

Chat Messages

OpenAI-style message arrays:
<?php
$instructor->withMessages([
    ['role' => 'system', 'content' => 'You are a data extraction expert.'],
    ['role' => 'user', 'content' => 'Extract the person: John, 25, engineer']
]);

Image Input

Process images with vision-capable models:
<?php
use Cognesy\Addons\Image\Image;

$instructor->with(messages: Image::fromFile('path/to/image.jpg')->toMessage())
    ->withPrompt("Extract all text from this document");
Supported formats: JPEG, PNG, GIF, WebP

Structured Input

Pass objects or arrays as input:
<?php
$inputData = [
    'document' => $documentText,
    'metadata' => ['source' => 'email', 'date' => '2024-01-15']
];

$result = (new StructuredOutput)
    ->withResponseClass(Analysis::class)
    ->withInput($inputData)
    ->get();

Output Modes

Tools Mode (Default)

Uses LLM function/tool calling:
<?php
use Cognesy\Instructor\Enums\OutputMode;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Polyglot\Inference\LLMProvider;

$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
    ->withOutputMode(OutputMode::Tools);
Best for: OpenAI, Anthropic, most modern models

JSON Schema Mode

Strict schema enforcement:
<?php
$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
    ->withOutputMode(OutputMode::JsonSchema);
Best for: GPT-4o and other models with strict JSON Schema support

JSON Mode

Basic JSON response format:
<?php
$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
    ->withOutputMode(OutputMode::Json);
Best for: Models supporting JSON mode without strict schemas

Markdown JSON Mode

Prompting-based extraction:
<?php
$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
    ->withOutputMode(OutputMode::MdJson);
Best for: Models without JSON mode, fallback option
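
Prompting-based extraction generally means asking the model to reply with JSON inside a Markdown code fence, then parsing that fence out of the reply. Below is a minimal sketch of the parsing side only. `extractJsonFromMarkdown()` is a hypothetical helper, not Instructor's parser, and the sample reply is made up.

```php
<?php
/**
 * Hypothetical helper: pull the first fenced JSON block out of a Markdown reply
 * and decode it into an associative array. Returns null if nothing parses.
 */
function extractJsonFromMarkdown(string $reply): ?array {
    $fence = str_repeat('`', 3); // built at runtime so this listing contains no literal fence
    $pattern = '/' . $fence . '(?:json)?\s*(.*?)\s*' . $fence . '/s';
    // Fall back to the whole reply when the model skipped the fence.
    $json = preg_match($pattern, $reply, $m) === 1 ? $m[1] : trim($reply);
    $data = json_decode($json, true);
    return is_array($data) ? $data : null;
}

$fence = str_repeat('`', 3);
$reply = "Here is the result:\n{$fence}json\n{\"name\": \"Sarah\", \"age\": 32}\n{$fence}\nDone.";

var_dump(extractJsonFromMarkdown($reply));
```

Because nothing forces the model to follow the requested format, this mode depends more heavily on validation and retries than the schema-enforced modes do.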

Response Types

Single Object

<?php
$person = (new StructuredOutput)
    ->withResponseClass(Person::class)
    ->withMessages($text)
    ->get();

Arrays of Objects

Use Sequence::of() to extract lists:
<?php
use Cognesy\Instructor\Extras\Sequence\Sequence;

$people = (new StructuredOutput)
    ->withResponseClass(Sequence::of(Person::class))
    ->withMessages($text)
    ->get();

// Iterate over results
foreach ($people as $person) {
    echo $person->name;
}

// Or use the sequence's helper methods
$first = $people->first();
$count = $people->count();
$all = $people->toArray();

Scalar Values

Extract simple types with adapters:
<?php
use Cognesy\Instructor\Extras\Scalars\Scalar;
use Cognesy\Instructor\StructuredOutput;

// Boolean
$isSpam = (new StructuredOutput)
    ->withResponseClass(Scalar::boolean('isSpam'))
    ->withMessages($text)
    ->get();

// Integer
$count = (new StructuredOutput)
    ->withResponseClass(Scalar::integer('count'))
    ->withMessages($text)
    ->get();

// String
$summary = (new StructuredOutput)
    ->withResponseClass(Scalar::string('summary'))
    ->withMessages($text)
    ->get();

Enums

<?php
use Cognesy\Instructor\Extras\Scalars\Scalar;

enum Sentiment: string {
    case Positive = 'positive';
    case Negative = 'negative';
    case Neutral = 'neutral';
}

$sentiment = (new StructuredOutput)
    ->withResponseClass(Scalar::enum(Sentiment::class, 'sentiment'))
    ->withMessages($text)
    ->get();
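
Backed enums constrain the result to a fixed set of values. The sketch below uses only PHP's built-in `tryFrom()` to show how a raw string maps onto a case and what happens with an unexpected value. It does not show how Instructor performs this mapping internally, and the raw values are stand-ins.

```php
<?php
// Same enum as in the example above, repeated so the snippet runs on its own.
enum Sentiment: string {
    case Positive = 'positive';
    case Negative = 'negative';
    case Neutral = 'neutral';
}

$raw = 'negative';                      // stand-in for a value returned by the model
$sentiment = Sentiment::tryFrom($raw);  // Sentiment::Negative
$unknown = Sentiment::tryFrom('angry'); // null: not one of the declared cases

var_dump($sentiment === Sentiment::Negative); // bool(true)
var_dump($unknown);                           // NULL
```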

Streaming

Partial Updates

Get incremental results as they arrive:
<?php
$stream = (new StructuredOutput)
    ->withResponseClass(Article::class)
    ->with(
        messages: $text,
        options: ['stream' => true]
    )
    ->stream();

foreach ($stream->partials() as $partial) {
    // $partial has incrementally populated fields
    updateUI($partial);
}

$final = $stream->finalValue();
Or subscribe to streaming events:
<?php
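use Cognesy\Instructor\Events\PartialsGenerator\PartialResponseGenerated;
use Cognesy\Instructor\StructuredOutput;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Polyglot\Inference\LLMProvider;

$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
    ->onEvent(
        PartialResponseGenerated::class,
        fn(PartialResponseGenerated $event) => updateUI($event->partialResponse),
    );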

$stream = (new StructuredOutput)
    ->withRuntime($runtime)
    ->with(
        responseModel: Article::class,
        messages: $text,
        options: ['stream' => true],
    )
    ->stream();

$article = $stream->finalValue();

Sequence Streaming

Stream sequence items as they complete:
<?php
use Cognesy\Instructor\Extras\Sequence\Sequence;

$list = (new StructuredOutput)
    ->withResponseClass(Sequence::of(Person::class))
    ->with(
        messages: $text,
        options: ['stream' => true],
    )
    ->stream();

foreach ($list->sequence() as $seq) {
    processComplete($seq->last());
}

$final = $list->finalValue();
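
The consumption pattern above can be illustrated with a plain PHP generator. This is a hypothetical stand-in for a streamed sequence, not Instructor's stream object: `streamSnapshots()` is invented for the example and yields a snapshot of all completed items each time a new one finishes.

```php
<?php
/**
 * Hypothetical stand-in for a streamed sequence: after each completed item,
 * yield a snapshot of everything completed so far.
 */
function streamSnapshots(iterable $completedItems): Generator {
    $soFar = [];
    foreach ($completedItems as $item) {
        $soFar[] = $item;
        yield $soFar;
    }
}

$processed = [];
foreach (streamSnapshots(['Ann', 'Bob', 'Cy']) as $snapshot) {
    // The newest completed item is always the last element of the snapshot.
    $processed[] = $snapshot[array_key_last($snapshot)];
}

echo implode(', ', $processed), PHP_EOL; // prints "Ann, Bob, Cy"
```

Reading only the last element on each iteration means every item is handled exactly once, as soon as it is complete.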

LLM Providers

Supported Providers

Provider         API Type
OpenAI           Native
Anthropic        Native
Google Gemini    Native
Azure OpenAI     OpenAI-compatible
Mistral          Native
Cohere           OpenAI-compatible
Groq             OpenAI-compatible
Fireworks AI     OpenAI-compatible
Together AI      OpenAI-compatible
Ollama           OpenAI-compatible
OpenRouter       OpenAI-compatible
Perplexity       OpenAI-compatible
DeepSeek         OpenAI-compatible
xAI (Grok)       OpenAI-compatible
Cerebras         OpenAI-compatible
SambaNova        OpenAI-compatible

Streaming, vision, and tool-calling support varies by provider: Mistral, Cohere, Groq, DeepSeek, xAI (Grok), Cerebras, and SambaNova are each marked as lacking one of the three, and Perplexity is marked as lacking two.

Provider Selection

<?php
use Cognesy\Instructor\StructuredOutput;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Polyglot\Inference\Config\LLMConfig;

// Use preset from config
$structuredOutput = StructuredOutput::using('anthropic');

// Or configure runtime explicitly (advanced)
$structuredOutput = (new StructuredOutput)->withRuntime(
    StructuredOutputRuntime::fromConfig(
        LLMConfig::fromDsn('preset=anthropic,model=claude-3-5-sonnet-latest')
    )
);

Schema Definition

Type-Hinted Classes

<?php
class Order {
    public string $orderId;
    public Customer $customer;
    /** @var LineItem[] */
    public array $items;
    public float $total;
    public ?string $notes;
}

PHP DocBlocks for Instructions

<?php
class Product {
    /** The product SKU, e.g., "SKU-12345" */
    public string $sku;

    /** Price in USD, without currency symbol */
    public float $price;

    /** @var string[] List of applicable categories */
    public array $categories;
}

Attributes for Detailed Control

<?php
use Cognesy\Instructor\Schema\Attributes\Description;
use Cognesy\Instructor\Schema\Attributes\Instructions;

class Analysis {
    #[Description("Sentiment score from -1.0 (negative) to 1.0 (positive)")]
    public float $sentiment;

    /** @var string[] */
    #[Instructions("Extract the 3 most important points only")]
    public array $keyPoints;
}

Dynamic Schemas with Structure

<?php
use Cognesy\Dynamic\StructureBuilder;
use Cognesy\Instructor\StructuredOutput;

$schema = StructureBuilder::define('user')
    ->string('name')
    ->int('age', required: false)
    ->collection('tags', 'string', required: false)
    ->build();

$result = (new StructuredOutput)
    ->with(
        messages: 'Extract user profile from this text...',
        responseModel: $schema,
    )
    ->get();

Advanced Features

Context Caching

Reduce costs with cached context (Anthropic):
<?php
$structuredOutput = (new StructuredOutput)
    ->withCachedContext([
        'Large document or context here...',
        'This won\'t be re-sent on retries'
    ]);

Custom Prompts

Override default extraction prompts:
<?php
use Cognesy\Instructor\Config\StructuredOutputConfig;
use Cognesy\Instructor\StructuredOutput;

$structuredOutput = (new StructuredOutput)
    ->withPrompt("Extract the following fields precisely: ...")
    ->withConfig(new StructuredOutputConfig(
        retryPrompt: "The previous attempt had errors: {errors}. Please correct."
    ));

Event System

Monitor internal processing:
<?php
use Cognesy\Instructor\Events\StructuredOutput\StructuredOutputRequestReceived;
use Cognesy\Instructor\Events\StructuredOutput\StructuredOutputResponseGenerated;

$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new());

$runtime->onEvent(StructuredOutputRequestReceived::class, function($event) {
    logger()->info('Request received', $event->toArray());
});

$runtime->onEvent(StructuredOutputResponseGenerated::class, function($event) {
    logger()->info('Response generated', $event->toArray());
});

Debug Mode

See all LLM interactions:
<?php
$runtime->wiretap(fn($event) => logger()->debug((string) $event));
Outputs:
  • Full request payloads
  • Raw LLM responses
  • Validation errors
  • Retry attempts

Framework Integration

Laravel

<?php
// Service provider auto-registers

// Use facade
use Cognesy\Instructor\Laravel\Facades\StructuredOutput;

$result = StructuredOutput::using('openai')
    ->with(
        messages: $text,
        responseModel: Person::class,
    )->get();
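<?php
// Or inject via dependency injection
use Cognesy\Instructor\StructuredOutput;
use Illuminate\Http\Request;

class MyController
{
    public function handle(Request $request, StructuredOutput $instructor)
    {
        // Assumption: the text to analyze is posted as a "text" field
        $text = $request->input('text');

        return $instructor
            ->with(messages: $text, responseModel: Person::class)
            ->get();
    }
}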

Symfony

# services.yaml
services:
    Cognesy\Instructor\StructuredOutput:
        autowire: true
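<?php
// Use in controller
use Cognesy\Instructor\StructuredOutput;
use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\Response;

class MyController extends AbstractController
{
    public function extract(Request $request, StructuredOutput $instructor): Response
    {
        // Assumption: the text to analyze arrives as the raw request body
        $text = $request->getContent();
        $result = $instructor->with(messages: $text, responseModel: Person::class)->get();
        return $this->json($result);
    }
}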

Standalone

<?php
// No framework needed
require 'vendor/autoload.php';

use Cognesy\Instructor\StructuredOutput;

$result = (new StructuredOutput)
    ->withResponseClass(Person::class)
    ->withMessages($text)
    ->get();

Observability

Token Usage

<?php
$response = (new StructuredOutput)
    ->withResponseClass(Person::class)
    ->withMessages($text)
    ->getResponse();

echo $response->usage->inputTokens;
echo $response->usage->outputTokens;
echo $response->usage->totalTokens;

Timing

<?php
echo $response->timing->total; // Total processing time

Event-Based Logging

<?php
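// Assumes $logger is a PSR-3 logger; it must be captured with use() to be visible in the closure
$runtime->onEvent('*', function($event) use ($logger) {
    $logger->info($event->name(), $event->toArray());
});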

What’s Next