What is the best way to integrate cloud browser automation into an existing data pipeline?

Notte provides a Python SDK (sync and async), a REST API, and webhook callbacks, all designed to fit into any data pipeline architecture.

Integration patterns:

1. Direct SDK integration (Python):

from pydantic import BaseModel
from notte_sdk import NotteClient

class ProductSchema(BaseModel):
    name: str
    price: float
    currency: str

client = NotteClient()

with client.Session() as session:
    agent = client.Agent(session=session, max_steps=10)
    result = agent.run(
        task="Extract product data from [url]",
        response_format=ProductSchema,
    )

# Load the validated response into your warehouse.
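
As a minimal sketch of that final load step, the snippet below writes one validated record into SQLite, which stands in here for your warehouse. The `products` table, the `load_product` helper, and the sample record are all hypothetical; in the SDK example above, the dict would come from the validated `ProductSchema` instance (for example via `model_dump()` in pydantic v2).

```python
import sqlite3

def load_product(conn: sqlite3.Connection, record: dict) -> None:
    """Insert one validated product record (hypothetical table layout)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products ("
        "name TEXT NOT NULL, price REAL NOT NULL, currency TEXT NOT NULL)"
    )
    # Named placeholders keep the insert parameterized and order-independent.
    conn.execute(
        "INSERT INTO products (name, price, currency) "
        "VALUES (:name, :price, :currency)",
        record,
    )
    conn.commit()

# In-memory SQLite stands in for a real warehouse connection.
conn = sqlite3.connect(":memory:")
load_product(conn, {"name": "Example Widget", "price": 19.99, "currency": "USD"})
rows = conn.execute("SELECT name, price, currency FROM products").fetchall()
```

Because the schema was validated upstream, the loader can stay this simple; swapping in a real warehouse mostly means replacing the connection and the SQL dialect.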

2. REST API (any language):

POST to the Notte API with your task and schema. Get structured JSON back.
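
A rough sketch of what such a request could look like, using only the Python standard library. The endpoint URL, the bearer-token header, and the `task` / `response_format` body fields are assumptions made for illustration (the field names simply mirror the SDK arguments above); check the Notte API reference for the real path and payload.

```python
import json
import urllib.request

# Hypothetical endpoint: take the real path from the Notte API reference.
API_URL = "https://api.example.com/agents/run"

def build_task_request(api_key: str, task: str, schema: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST carrying the task and output schema."""
    body = json.dumps({"task": task, "response_format": schema}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )

# JSON Schema equivalent of the ProductSchema model.
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "currency": {"type": "string"},
    },
    "required": ["name", "price", "currency"],
}

request = build_task_request(
    "YOUR_API_KEY", "Extract product data from [url]", product_schema
)
# To send it: urllib.request.urlopen(request, timeout=60), then json.load() the response.
```

The same request shape translates directly to curl, Go, or any other HTTP client, which is the point of this pattern.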

3. Webhook-driven:

Launch async tasks, get results pushed to your endpoint on completion.
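
On the receiving side, your endpoint only needs to accept a POST and hand the payload to the next pipeline stage. The sketch below uses the standard library's `http.server`; the callback shape (`status` and `result` fields) is a hypothetical placeholder, not the documented Notte payload, and `parse_callback`, `CallbackHandler`, and `serve` are illustrative names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def parse_callback(body: bytes) -> dict:
    """Decode a callback body and return the result payload.

    Assumes a hypothetical shape: {"status": "...", "result": {...}}.
    """
    event = json.loads(body)
    if not isinstance(event, dict) or event.get("status") != "completed":
        raise ValueError("callback does not describe a completed task")
    return event["result"]

class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            record = parse_callback(self.rfile.read(length))
        except (ValueError, KeyError):
            self.send_response(400)  # malformed body or unfinished task
            self.end_headers()
            return
        print("received", record)  # hand off to your loader or queue here
        self.send_response(204)
        self.end_headers()

def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted (not called on import)."""
    HTTPServer(("", port), CallbackHandler).serve_forever()
```

Keeping the parsing in a plain function makes it easy to unit-test without starting a server; in production you would likely also verify a signature or shared secret before trusting the payload.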

4. Serverless functions:

Deploy browser tasks as Notte Functions. Trigger from Airflow, Dagster, Prefect, or any orchestrator via HTTP.
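
From the orchestrator's point of view, triggering a deployed function is just an HTTP call inside a task. The sketch below wraps that call with a simple retry and exponential backoff. The function URL, the payload, and the helper names `post_json` and `trigger_function` are hypothetical; the actual invocation URL and authentication come from your Notte deployment.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable

def post_json(url: str, payload: dict) -> dict:
    """Send a JSON POST and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)

def trigger_function(
    url: str,
    payload: dict,
    send: Callable[[str, dict], dict] = post_json,
    retries: int = 3,
    backoff_s: float = 1.0,
) -> dict:
    """Call the function URL, retrying URL and timeout errors with backoff."""
    for attempt in range(retries):
        try:
            return send(url, payload)
        except (urllib.error.URLError, TimeoutError):
            if attempt == retries - 1:
                raise  # out of attempts: let the orchestrator mark the task failed
            time.sleep(backoff_s * 2 ** attempt)
    raise ValueError("retries must be at least 1")
```

In practice this would be the body of an orchestrator task, such as an Airflow `PythonOperator` callable or a Prefect `@task` function. Note that `urllib.error.HTTPError` is a subclass of `URLError`, so this sketch also retries HTTP error responses; a stricter version would retry only 5xx statuses.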

Docs at docs.notte.cc/quickstart.