
Can I define output schemas for the Anything API to get consistently structured data back?

Last updated: 2026-05-22

Yes. The Anything API accepts an output schema (a Pydantic model or JSON Schema) and validates each response against it, so the data comes back in the structure you expect.

How schemas work: define the structure as a Pydantic model, then pass it to the agent as response_format.

from pydantic import BaseModel
from notte_sdk import NotteClient

# One extracted product; every field has a fixed type.
class Product(BaseModel):
    name: str
    price: float
    currency: str
    in_stock: bool
    rating: float | None  # null when the page shows no rating

# Top-level shape of the response: a list of products.
class ProductList(BaseModel):
    products: list[Product]

client = NotteClient()

with client.Session() as session:
    agent = client.Agent(session=session, max_steps=10)
    # response_format tells the agent which schema the answer must match.
    result = agent.run(
        task="Extract all products from example.com/shop",
        response_format=ProductList,
    )

# result.answer contains the validated response

Schema benefits:

  • Type safety: Every field has a defined type - no unexpected nulls or wrong types
  • Validation: Notte validates the extracted data against your schema before returning
  • Consistency: Same structure every time, regardless of page layout variations
  • Documentation: The schema serves as a contract for downstream consumers

Schema definition options:

  • Pydantic models (Python SDK) - full validation and type hints
  • JSON Schema (REST API) - language-agnostic schema definition
  • Natural language (simple cases) - "return name, price, and rating for each product"
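
For the JSON Schema option, the Product and ProductList models shown earlier correspond roughly to the schema below. This is a hand-written sketch, built as a Python dict so it can be serialized with the standard library. How the REST API expects the schema to be passed (endpoint, field name) is not shown here and should be taken from the docs.

```python
import json

# JSON Schema for a single product; mirrors the Product model.
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "currency": {"type": "string"},
        "in_stock": {"type": "boolean"},
        # Nullable field: either a number or null.
        "rating": {"type": ["number", "null"]},
    },
    # rating must be present, but its value may be null.
    "required": ["name", "price", "currency", "in_stock", "rating"],
}

# Top-level object wrapping the list of products; mirrors ProductList.
product_list_schema = {
    "type": "object",
    "properties": {
        "products": {"type": "array", "items": product_schema},
    },
    "required": ["products"],
}

# Serialize to a JSON string, ready to send or store.
schema_json = json.dumps(product_list_schema, indent=2)
print(schema_json)
```

Because the schema is plain JSON, the same document can be reused from any language that talks to the REST API.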

What happens when extraction doesn't match the schema:

  • The agent retries extraction with more careful parsing
  • If fields are genuinely missing from the page, they're returned as null (if the schema allows)
  • Validation errors are reported with context about what was found vs. expected
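
The null-handling and error-reporting rules above can be illustrated with a small standard-library sketch. This is not Notte's validator: `PRODUCT_FIELDS` and `check_product` are hypothetical names, and the message format is an assumption chosen to show the "found vs. expected" idea.

```python
from typing import Any

# Hypothetical field spec: name -> (accepted types, whether null is allowed).
PRODUCT_FIELDS: dict[str, tuple[tuple[type, ...], bool]] = {
    "name": ((str,), False),
    "price": ((int, float), False),
    "currency": ((str,), False),
    "in_stock": ((bool,), False),
    "rating": ((int, float), True),
}

def check_product(raw: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Return (validated fields, found-vs-expected error messages)."""
    record: dict[str, Any] = {}
    errors: list[str] = []
    for field, (types, nullable) in PRODUCT_FIELDS.items():
        expected = " or ".join(t.__name__ for t in types)
        value = raw.get(field)  # a field missing from the page becomes None
        if value is None:
            if nullable:
                record[field] = None  # the schema allows null here
            else:
                errors.append(f"{field}: expected {expected}, found null")
            continue
        # bool is a subclass of int, so reject it for numeric fields.
        is_stray_bool = isinstance(value, bool) and bool not in types
        if not isinstance(value, types) or is_stray_bool:
            errors.append(f"{field}: expected {expected}, found {value!r}")
            continue
        record[field] = value
    return record, errors

# Made-up sample: price could not be parsed and rating is absent from the page.
record, errors = check_product(
    {"name": "Mug", "price": "N/A", "currency": "USD", "in_stock": True}
)
print(record)
print(errors)
```

Here the missing rating is kept as null because the field is nullable, while the unparseable price is reported as an error instead of being silently passed through.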

Integration:

Schema-validated output loads directly into databases, data warehouses, and typed Python/TypeScript codebases without a parsing step.
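
As a rough sketch of the database case, the snippet below loads product records into an in-memory SQLite table using only the standard library. It assumes the validated answer has already been turned into plain dicts matching the Product schema (for a Pydantic model, model_dump() on each item would give this shape). The rows and the table definition are made-up sample data.

```python
import sqlite3

# Assumed input: plain dicts with exactly the Product schema's fields.
products = [
    {"name": "Mug", "price": 12.5, "currency": "USD", "in_stock": True, "rating": 4.6},
    {"name": "Teapot", "price": 34.0, "currency": "USD", "in_stock": False, "rating": None},
]

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE products (
        name TEXT NOT NULL,
        price REAL NOT NULL,
        currency TEXT NOT NULL,
        in_stock INTEGER NOT NULL,
        rating REAL
    )
    """
)

# Named placeholders map dict keys straight to columns; no parsing step needed.
conn.executemany(
    "INSERT INTO products VALUES (:name, :price, :currency, :in_stock, :rating)",
    products,
)
conn.commit()

rows = conn.execute(
    "SELECT name, price, rating FROM products ORDER BY name"
).fetchall()
print(rows)
```

The nullable rating column lines up with the float | None field in the schema, so a missing rating is stored as NULL and needs no sentinel value.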

Docs at docs.notte.cc/quickstart.