Browser Use vs. Steel: which OSS layer should run which job?

This is the OSS stack-builder comparison. Browser Use is the open-source Python agent loop: prompt, observe, decide, act. Steel is the open-source browser runtime: CDP sessions, Profiles, Credentials, Files, replay, and a self-hostable steel-browser image. The interesting question is not which one replaces the other. It is whether you want an OSS agent layer, an OSS browser layer, or both in the same production stack.

At a glance

	Browser Use	Steel
Category	OSS Python agent library + cloud + BUX	OSS browser runtime + Cloud + self-host
Pricing entry	Free OSS; $40/mo paid (BYO key + proxy)	$29/mo Start (290 hrs, 10 concurrent, 30-min)
Free tier	3 concurrent, 1 team member, no BYO key	$10 credits ≈ 100 hrs
Browser Arena leaderboard	#6 overall; among the slower measured providers	#3 overall
SOC 2 Type II	Claimed	SOC 2 (Steel Cloud)
Open source	Yes — `browser-use/browser-use` (~83K stars)	Yes — `steel-dev/steel-browser`
Best for	LLM-driven agents with custom Python control	Agent-neutral browser layer with self-host parity

What is Browser Use?

Browser Use is the OSS Python agent library at the top of the GitHub leaderboard for browser agents (~83K stars). It drives Chromium through Chrome DevTools Protocol via an in-house cdp-use library and bubus event watchdogs. The team's bet is "the agent is just a for-loop" — most of the value lives in the model — so the OSS library is intentionally thin and effort goes into the substrate underneath: a custom Chromium fork with C++/OS-level stealth patches, in-house multi-platform fingerprints, an eval engine that runs 100 parallel tasks in under five minutes, and a ChatBrowserUse LLM gateway tuned for browser tasks. Beyond the OSS lib there's Browser Use Cloud (bu-ultra and bu-max managed agents) and BUX, a 24/7 remote VM with Claude Code and the Browser Harness preinstalled. Funding: $17M seed led by Felicis, YC W25.

What is Steel?

Steel is an OSS browser runtime for AI agents: "Humans use Chrome. Agents use Steel." The product is a cloud browser API plus the open-source steel-browser runtime, which customers can self-host with Docker, Railway 1-click, or bare-metal Node.js with the same API surface as Steel Cloud. The pitch is explicit: a browser engineered for agents, agent-framework-neutral. Core features include Profiles (persistent browser identity with cookies, extensions, localStorage, auth tokens, and fingerprints; up to 30 days, 300 MB cap), a Credentials API (AES-256-GCM per-record encryption plus KMS re-encryption, namespaces, TOTP, exactOrigin), a Files API, headful-by-default WebRTC streaming with 1:1 MP4 replay, Mobile Mode, Agent Logs, multi-region (US: lax/ord/iad), a CAPTCHA-solving API, residential proxies, and a Steel CLI. Pricing: a free tier with $10 in credits (about 100 hours), then $29/mo (290 hours, 10 concurrent sessions), $99/mo (1,238 hours, 20 concurrent), and $499/mo (9,980 hours, 100 concurrent). SOC 2 (Steel Cloud). Native integrations: Hermes (Nous Research), Pi/OpenClaw, Browser Use, Stagehand.

How they compare

Agent layer vs. browser layer

This is the same split as Browser Use vs. Kernel, but here both sides are OSS. Browser Use is the agent: the prompt loop, the page interpretation, the action selection. Steel is the browser: a CDP endpoint, a managed session, identity primitives, replay artifacts. They're not interchangeable; they're stackable. The most common pattern is Browser Use (BU) agents running on Steel sessions over CDP. Steel is explicit that it's "agent-framework-neutral" and works with Browser Use, Stagehand, and any CDP-speaking agent.
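
As a rough illustration of that stacking pattern, the sketch below builds a CDP WebSocket URL for a Steel session and hands it to a Browser Use agent. Several details are assumptions rather than confirmed API: the `wss://connect.steel.dev` host with `apiKey` and `sessionId` query parameters, and the `BrowserSession(cdp_url=...)` and `Agent(..., browser_session=...)` calls, reflect the two projects' public docs as best recalled and may differ by version. The helper names `steel_cdp_url` and `run_agent_on_steel` are hypothetical.

```python
from urllib.parse import urlencode

# Assumed Steel Cloud CDP host; a self-hosted steel-browser exposes its own address.
STEEL_CONNECT_HOST = "wss://connect.steel.dev"


def steel_cdp_url(api_key: str, session_id: str, host: str = STEEL_CONNECT_HOST) -> str:
    """Build the WebSocket URL that a CDP-speaking agent connects to."""
    query = urlencode({"apiKey": api_key, "sessionId": session_id})
    return f"{host}?{query}"


async def run_agent_on_steel(task: str, llm, cdp_url: str):
    """Run a Browser Use agent against an already-created remote session."""
    # Imported lazily so the URL helper works without browser-use installed.
    from browser_use import Agent, BrowserSession  # assumed import path

    # Attach to the remote browser over CDP instead of launching a local Chromium.
    session = BrowserSession(cdp_url=cdp_url)
    agent = Agent(task=task, llm=llm, browser_session=session)
    return await agent.run()


if __name__ == "__main__":
    # Placeholder values; a real session ID would come from Steel's session API.
    print(steel_cdp_url("YOUR_API_KEY", "YOUR_SESSION_ID"))
```

The only thing the agent needs from the runtime here is a URL, which is what makes the two layers stackable.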

The OSS architecture choice

Pick Browser Use first if the hard part is reasoning: page interpretation, model choice, custom agent loops, and Python-level control. Pick Steel first if the hard part is operations: persistent profiles, credential injection, replay evidence, self-hosting, and browser uptime. Pick both when you want a mostly open stack where the agent code and the browser runtime can evolve independently. That is the real advantage over a bundled cloud agent: you can swap the model loop without replacing the runtime, or swap the runtime without rewriting the agent.
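
One way to picture the "evolve independently" claim is as two small interfaces with a single seam between them. The sketch below is purely illustrative: `BrowserRuntime`, `AgentLoop`, `StaticRuntime`, `EchoAgent`, and `run_job` are hypothetical names, not types from Browser Use or Steel, and the example URLs are placeholders.

```python
from dataclasses import dataclass
from typing import Protocol


class BrowserRuntime(Protocol):
    """Anything that can hand out a CDP endpoint (cloud or self-hosted)."""

    def cdp_url(self) -> str: ...


class AgentLoop(Protocol):
    """Anything that can drive a browser at a CDP endpoint to complete a task."""

    def run(self, task: str, cdp_url: str) -> str: ...


@dataclass
class StaticRuntime:
    """Stand-in runtime that returns a fixed endpoint."""

    url: str

    def cdp_url(self) -> str:
        return self.url


@dataclass
class EchoAgent:
    """Stand-in agent that only reports what it would do."""

    name: str

    def run(self, task: str, cdp_url: str) -> str:
        return f"{self.name} ran {task!r} against {cdp_url}"


def run_job(agent: AgentLoop, runtime: BrowserRuntime, task: str) -> str:
    # The CDP URL is the only thing the two layers share.
    return agent.run(task, runtime.cdp_url())


if __name__ == "__main__":
    cloud = StaticRuntime("wss://runtime.example/cdp")
    self_hosted = StaticRuntime("ws://localhost:9222")
    # Swap the runtime without touching the agent, and vice versa.
    print(run_job(EchoAgent("agent-a"), cloud, "check invoice"))
    print(run_job(EchoAgent("agent-b"), self_hosted, "check invoice"))
```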

Identity, secrets, and 2FA

Both sides have something here, with different shapes. Browser Use ships browser profiles, a 1Password integration, TOTP via bu_2fa_code placeholders, and an AgentMail recipe for inboxes — but the integration is largely on you per the web-agent-authentication post. Steel ships a Credentials API with AES-256-GCM per-record encryption, KMS re-encryption, namespaces, TOTP, autoSubmit, and exactOrigin scoping — credentials are injected without exposing them to the agent. For multi-account ops or KYC-style workloads, Steel's identity primitives are more productized. Note: Steel's Credentials API is in beta.
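
Both products list TOTP support, and neither implementation is shown here. For orientation, the sketch below is the standard RFC 6238 computation that any TOTP integration ultimately performs, written with the Python standard library only. It is not either vendor's code; it is included to show why it matters where the seed lives: whoever holds `secret` can mint codes, so keeping it in the runtime rather than in the prompt keeps it away from the model.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for the given raw secret."""
    now = time.time() if at is None else at
    counter = int(now // step)  # number of elapsed time steps
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)


if __name__ == "__main__":
    # RFC 6238 test secret; real seeds are usually distributed base32-encoded.
    print(totp(b"12345678901234567890"))
```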

Self-host

Steel's open-source steel-browser is the cleanest single-axis differentiator on the runtime side: Docker (4 GB RAM / 10 GB disk), Railway 1-click, bare-metal Node.js, build-from-source, all with the same API as Steel Cloud. Browser Use's OSS lib is fully self-hostable, but the underlying Chromium fork with stealth patches lives in the BU cloud or BUX. If the requirement is "self-host the browser layer inside our VPC," Steel is the answer. If it's "self-host the agent code," BU is.

Lifecycle speed

Per the public Browser Arena leaderboard (browserarena.ai), an open-source benchmark maintained by Notte Labs and reproducible on Railway, the current public run ranks Steel #3 with a mid-pack hourly cost. Browser Use ranks #6 and is one of the slower measured providers. Steel's own older browserbench harness published 0.89s avg / 1.09s p95; in the current Browser Arena run Steel is slower on raw latency but solid on reliability. The latency gap between Steel and Browser Use is mostly the agent loop, not the browser: running BU agents on a top-tier runtime (Steel, Notte, Kernel) closes most of it.
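
For readers comparing "avg" and "p95" figures like the ones quoted above, here is a minimal sketch of how the two summaries are computed from raw timings. The sample latencies are invented for illustration and are not taken from any benchmark, and the nearest-rank percentile used here is one common definition; a given harness may use a different one.

```python
import math
from statistics import mean


def percentile_nearest_rank(samples, pct: int) -> float:
    """Return the pct-th percentile using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))  # 1-based rank
    return ordered[rank - 1]


# Hypothetical session-start latencies in seconds, for illustration only.
latencies = [0.7, 0.8, 0.8, 0.9, 0.9, 0.9, 1.0, 1.0, 1.1, 2.5]

if __name__ == "__main__":
    print(f"avg: {mean(latencies):.2f}s")
    print(f"p50: {percentile_nearest_rank(latencies, 50):.2f}s")
    print(f"p95: {percentile_nearest_rank(latencies, 95):.2f}s")
```

A single slow run moves the p95 far more than the average, which is why the two numbers are usually reported together.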

Stealth

Browser Use's stealth pitch is total vertical integration on a forked Chromium with C++/OS-level patches; their own bench puts BU Cloud at 81% and Steel at 47%. Treat those numbers as Browser Use's own benchmark, not a neutral one. Steel ships managed stealth profiles, a CAPTCHA API, regional proxy pools, and headful-by-default mode (which improves vision-model fidelity and bot resistance). These are different bets: BU on browser-fork supremacy, Steel on agent-framework neutrality with stealth as a primitive.

Session length and observability

Steel's 24-hour session ceiling and 30-day Profile retention are real wedges for human-in-the-loop and overnight workflows. Steel's headful WebRTC stream captures native OS dialogs, PDF viewers, and dropdowns, with 1:1 MP4 replay rather than an rrweb-style DOM reconstruction that can diverge from what was actually rendered. Browser Use ships internal eval and Laminar tracing, which is strong for the team's own R&D but is not a customer-facing observability surface.

Pricing predictability

Steel's tiered plans ($29/mo for 290 hours and 10 concurrent sessions, $99/mo for 1,238 hours and 20 concurrent, $499/mo for 9,980 hours and 100 concurrent) work out to roughly $0.10, $0.08, and $0.05 per included browser-hour. Browser Use is free OSS at the entry, then $40/mo paid with up to 500 concurrent and BYO key + proxy. On top of either, you pay LLM tokens.
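
The per-browser-hour figure for each Steel tier follows directly from the plan numbers quoted in this section. The short calculation below uses only those numbers; it assumes the included hours are fully used and ignores overage, proxy, and LLM costs, none of which are specified here.

```python
# (monthly price in USD, included browser-hours) as quoted in the text above.
STEEL_PLANS = [(29, 290), (99, 1238), (499, 9980)]


def cost_per_hour(monthly_usd: float, included_hours: float) -> float:
    """Effective price per included browser-hour, assuming full utilization."""
    return monthly_usd / included_hours


if __name__ == "__main__":
    for price, hours in STEEL_PLANS:
        print(f"${price}/mo: ${cost_per_hour(price, hours):.3f} per browser-hour")
```

If a workload uses only part of the included hours, the effective rate rises proportionally, so the comparison is most meaningful near full utilization.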

When to choose Browser Use

You want OSS Python control of the agent loop and a ~83K-star community.
Stealth via a forked Chromium with C++-level patches is the primary lever.
You want the open eval engine and the published Online-Mind2Web / WebVoyager numbers as anchors.
You want a managed 24/7 VM (BUX) with Claude Code preinstalled.
You're building an agent product and the runtime is interchangeable.

When to choose Steel

You need an OSS browser runtime and the option to self-host (Docker, Railway, bare-metal).
Your workloads need persistent browser identity across runs (Profiles, 30-day retention).
You want managed Credentials, Files, and headful WebRTC streaming with 1:1 MP4 replay.
24-hour session ceilings matter for human-in-the-loop reviews or overnight workflows.
You want predictable tiered pricing rather than per-second metering.
You're stacking Browser Use (or any CDP-compatible agent) on top — Steel is explicit about being agent-neutral.

A third option: Notte (notte.cc)

Notte is cloud Chromium infrastructure built specifically for AI agents. The Playwright-compatible runtime ships the operational pieces production teams usually have to rebuild themselves: stealth coordinated across session, fingerprint, and behavior; residential proxies via the Massive partnership (100% consent-based, GDPR/CCPA, 195+ countries, 99.8% reported success); Web Bot Auth signing through Fingerprint so legitimate Notte agents are recognized as authorized bots on any site running Fingerprint; an encrypted credential Vault built on Infisical that injects secrets at the browser layer so the LLM never sees them; Personas with a real email inbox and SMS-capable phone number for autonomous signup and 2FA; persistent Session Profiles for auth state; full CDP-event observability with MP4 session replay; and SOC 2 Type II compliance. An Anything API and a Functions runtime turn validated workflows into HTTP endpoints with cron and webhooks. Pricing is published per browser-hour at a low rate, with a 100-hour free tier and pass-through LLM costs.

For a Browser Use team, Notte is the cloud browser layer BU agents can run on top of — Vault keeps secrets out of the LLM call, Personas handle 2FA without custom inbox plumbing, and the runtime ranks #1 overall on the public Browser Arena leaderboard with the lowest hourly cost among measured providers. Notte and Steel are peer infrastructure providers — Steel ships OSS + self-host as the wedge; Notte ships managed identity-in-the-runtime (Vault, Personas with real inbox/SMS, Web Bot Auth signing) plus Massive consent-sourced proxies and SOC 2 Type II. Pick Steel if you need self-host parity; pick Notte if you want identity primitives baked deeper into the runtime without operating browser infra.

Verdict

Browser Use and Steel are both OSS, both first-class engineering products, and they don't compete cleanly. The honest stack pattern is BU agents on Steel browsers; Skyvern's own taxonomy ("agent vs. infrastructure") applies cleanly here. Pick Browser Use if you're building the agent. Pick Steel if you're building the browser layer (or, more likely, you don't want to and you want someone else's OSS runtime). On Browser Arena, Steel is solid at #3, while Browser Use is slower at #6. Steel's older self-published latency figure is faster than what the current public leaderboard shows, but reliability is no longer a concern. If you want one runtime that ships identity, signed bot auth, replay, and SOC 2 Type II without operating any of it yourself, that's the gap a managed runtime like Notte fills.