Skyvern and Steel show up next to each other in OSS AI-automation buyer journeys, but they solve different halves of the problem. Steel is browser infrastructure: an open-source runtime (steel-browser) plus a managed Cloud, both exposing the same API, that ships Profiles, Credentials, Files, Agent Logs, and replay as primitives for any agent that speaks CDP. Skyvern is a workflow platform on top of someone else's browser: an open-source repo plus a hosted Cloud, both running a Vision-LLM planner-actor-validator agent loop driven by YAML workflows. They overlap on "open source" and "AI" and almost nothing else. The cleanest signal of the split is which public benchmark each provider appears in: Steel sits on the Browser Arena leaderboard as a measured browser runtime; Skyvern is intentionally absent because workflow platforms layer on top of browsers rather than being browsers themselves.
At a glance
| | Skyvern | Steel |
|---|---|---|
| Category | Open-source vision-agent workflow platform (RPA-replacement) | OSS-first AI-agent browser runtime (managed parity) |
| Primary surface | YAML workflows + hosted UI + Cloud API | CDP browser sessions + Profiles + Credentials + Files |
| Pricing entry | OSS self-host free; Cloud ~$0.10/step | Free $10 credits/mo (~100 hrs); Start $29/mo |
| Browser Arena leaderboard | Not measured (workflow platform, not browser runtime) | #3 overall |
| SOC 2 Type II | Yes (+ HIPAA) | Yes (Steel Cloud) |
| Open source | Yes (full repo, self-hostable) | steel-browser runtime (Docker, Railway, bare-metal) |
| Best for | Ops teams: cross-portal RPA replacement | Engineering teams: agent-neutral browser primitives |
What is Skyvern?
Skyvern is an open-source AI browser-automation platform that uses Vision-LLMs plus a planner-actor-validator agent loop to drive websites the way a human would, instead of relying on XPath/CSS selectors. Workflows are defined in YAML — "describe the goal, not the clicks" — and a single workflow runs across many vendor portals without per-site code. The 2.0 architecture introduces a Validator that inspects the screen after each action to detect "fake successes." Skyvern reports 85.85% on WebVoyager (vs. ~45% for 1.0) with the full eval published openly at eval.skyvern.com, and they've co-built Web Bench (5,750 tasks / 452 websites) with Halluminate to test write-heavy tasks.
The product ships native 2FA/TOTP, CAPTCHA solving, an anti-bot/proxy network with ZIP-code-level geographic targeting, and a credential vault with integrations to Bitwarden, 1Password, and Azure Key Vault. A "Route Memorization" / compile-to-code engine lets the LLM solve a workflow once and then compile it into a fast, deterministic Playwright script that self-heals when it breaks. Pricing is OSS self-host (free) or Skyvern Cloud at "~$0.10 per step / per page." Skyvern is SOC 2 Type 2 (Aug 2025) and HIPAA compliant. Skyvern raised a $2.7M seed round; the ICP is ops teams in healthcare/EHR, government, insurance, procurement, payroll, and mortgage, that is, buyers replacing UiPath, Automation Anywhere, or Power Automate across vendor portals.
What is Steel?
Steel is an open-source browser API positioned for AI agents — "Humans use Chrome. Agents use Steel." It ships a managed cloud and an open-source steel-browser runtime that customers can self-host (Docker, Railway 1-click, bare-metal Node.js, build-from-source) with the same API as Steel Cloud. The opinionated design choices map to AI-agent needs: Profiles for persistent identity (cookies, extensions, localStorage, fingerprints, up to 30 days, 300 MB cap), a Credentials API with AES-256-GCM per-record + KMS re-encryption (TOTP, blur, autoSubmit, exactOrigin), a Files API for artifacts, MP4/HLS replay (replaced rrweb), live view via WebRTC at 25fps capturing OS-level dialogs, mobile mode with real touch and viewport, and Agent Logs that tie tool calls to the replay timeline.
Pricing is tiered by concurrency, browser-hours, and retention rather than metered per second: Free, $10 in credits (≈ 100 browser-hours); Start, $29/mo (290 hrs, 10 concurrent); Developers, $99/mo (1,238 hrs, 20 concurrent); Pro/Startups, $499/mo (9,980 hrs, 100 concurrent). Steel Cloud is SOC 2-compliant; Steel Local is the self-host option. The runtime is agent-framework-neutral: it works with Playwright, Puppeteer, Selenium, Browser Use, Stagehand, and any agent that speaks CDP. Native integrations include Hermes (Nous Research), Pi/OpenClaw, and Browser Use.
How they compare
Category: workflow platform vs. browser runtime
This is the load-bearing distinction and the right place to start. Skyvern is a complete workflow platform — you describe goals in YAML, the Vision-LLM agent does the planning, the validator catches fake successes, native 2FA/CAPTCHA/credentials are handled inside the platform, and the result is a YAML file or a Cloud API call that runs across many sites. The browser is a substrate Skyvern manages for you; you don't get a Playwright-compatible session you can drop into your own code.
Steel is a browser runtime — you get a CDP endpoint and the agent-shaped primitives (Profiles, Credentials, Files, Live View, replay, Agent Logs) that production agents need, but the planning, the multi-step workflow, the cross-site logic, the layout-resilience reasoning — all of that is your code or a third-party agent framework you put on top. Steel's blog framing names this honestly: "ships primitives not planners, so you own orchestration logic."
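To make "you get a CDP endpoint" concrete, here is a minimal sketch of attaching Playwright to a remote browser over CDP. It uses Playwright's `connect_over_cdp`, which is a real API; the `CDP_WS_URL` environment variable and both helper names are hypothetical, and the endpoint URL is whatever your runtime (Steel Cloud, a self-hosted steel-browser, or any other CDP browser) hands you. Everything after the connection, including planning and retries, is your own code.

```python
import os


def cdp_endpoint_from_env(var: str = "CDP_WS_URL") -> str:
    """Read a CDP endpoint from the environment (variable name is hypothetical)."""
    url = os.environ.get(var, "")
    if not url.startswith(("ws://", "wss://", "http://", "https://")):
        raise ValueError(f"{var} must hold a ws(s):// or http(s):// CDP endpoint")
    return url


def fetch_title(cdp_url: str, page_url: str) -> str:
    """Attach to an already-running remote browser and return a page title."""
    # Imported lazily so this module loads even where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(cdp_url)
        # Reuse the default context if the remote browser exposes one.
        context = browser.contexts[0] if browser.contexts else browser.new_context()
        page = context.new_page()
        page.goto(page_url)
        title = page.title()
        browser.close()
        return title
```

A call such as `fetch_title(cdp_endpoint_from_env(), "https://example.com")` is the whole integration surface at the runtime layer; a workflow platform hides this step and exposes goals instead.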
Skyvern's review of Steel makes the platform-vs-runtime trade-off explicit ("Steel scripts break with website layout changes... missing capabilities for production workloads include workflow orchestration with conditional logic"); Steel's review of Skyvern in skyvern-vs-steel-vs-rpa.md inverts the reading (Skyvern bundles agent reasoning; Steel keeps the runtime agent-neutral). Both readings are correct because they're describing different products.
Where they show up in public benchmarks
The clearest evidence of the category split is which public benchmark each provider appears in. Per the public Browser Arena leaderboard (browserarena.ai), Steel ranks #3 of seven measured browser runtimes, with a mid-pack hourly cost and strong reliability. Browser Arena is maintained by Notte Labs but is open-source and reproducible on Railway. Steel's own historical browserbench harness reports 0.89s avg / 1.09s p95, but those figures are stale relative to the current independent run.
Skyvern is intentionally not on Browser Arena because it is a workflow platform that runs on top of someone else's browser, not a browser runtime. Browser Arena measures the browser-runtime lifecycle (create → connect → navigate → release); that methodology fits Steel and not Skyvern. The absence is itself informative: Skyvern publishes WebVoyager (85.85%) and Web Bench results because those are the right metrics for a vision-agent workflow platform.
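The create → connect → navigate → release lifecycle is easy to picture as a timing loop. The sketch below is not Browser Arena's harness; it is an illustrative stand-in that times four caller-supplied callables, one per phase. The function name and the `steps` dictionary shape are assumptions made for this example.

```python
import time
from typing import Callable, Dict

# Phase names taken from the lifecycle described in the article.
PHASES = ("create", "connect", "navigate", "release")


def time_lifecycle(steps: Dict[str, Callable[[], None]]) -> Dict[str, float]:
    """Run each phase in order and return per-phase and total wall-clock seconds."""
    missing = [phase for phase in PHASES if phase not in steps]
    if missing:
        raise KeyError(f"missing lifecycle phases: {missing}")

    timings: Dict[str, float] = {}
    for phase in PHASES:
        start = time.perf_counter()
        steps[phase]()  # provider-specific work goes here
        timings[phase] = time.perf_counter() - start
    timings["total"] = sum(timings[phase] for phase in PHASES)
    return timings
```

A measurement like this only makes sense for a product that exposes those four phases directly, which is why it applies to a browser runtime and not to a workflow platform that manages the browser for you.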
For a buyer, the practical read: browser-runtime benchmark numbers are the right metric for picking a browser runtime (Steel) but not for picking a workflow platform (Skyvern). Compare Skyvern on per-step cost, WebVoyager accuracy, and time-to-deploy a new portal workflow; compare Steel on Browser Arena ranking, primitive completeness, and OSS self-host posture.
Identity, auth, and credentials at different layers
Skyvern's credentialing is workflow-layer: native 2FA/TOTP, native CAPTCHA solving, integrations with Bitwarden/1Password/Azure Key Vault, and credentials never sent to the LLM. The credential vault is a workflow object that the platform passes through to the underlying browser. Steel's Credentials API is runtime-layer: AES-256-GCM per-record + KMS re-encryption, TOTP, blur, autoSubmit, exactOrigin — primitives the agent calls directly, not via a hosted UI flow.
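Both products list TOTP as a credential feature. For readers unfamiliar with what that involves, here is a standard-library sketch of the TOTP algorithm from RFC 6238 (HMAC-SHA1 variant). It is not either vendor's implementation, and the function name and parameters are chosen for this example only.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for a raw shared secret."""
    now = time.time() if at is None else at
    counter = int(now // step)  # number of time steps since the Unix epoch
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


if __name__ == "__main__":
    # RFC 6238 Appendix B test vector: ASCII secret, T = 59 s, 8 digits.
    print(totp(b"12345678901234567890", at=59, digits=8))  # 94287082
```

Whichever layer generates the code, the shared secret is the sensitive part, which is why both vendors keep it in a vault rather than exposing it to the LLM.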
Both are credible. Skyvern's model is the right shape for ops teams managing a fixed roster of portal credentials. Steel's model is the right shape for engineering teams that want to inject credentials programmatically at the browser layer without a workflow-platform intermediary. Neither ships request-level cryptographic signing (Web Bot Auth via RFC 9421) as a built-in.
Open source: same word, different meanings
Both are SOC 2 Type 2 and both ship a self-hostable open-source artifact, which the Skyvern review of Steel concedes ("Steel and Skyvern are the only open-source options in the comparison"). The two flavors of OSS aren't equivalent, though. Skyvern's repo is the entire workflow platform: you self-host the planner, the actor, the validator, and the Cloud-equivalent feature set, and you take on the operational lift that comes with them. Steel's steel-browser is the runtime only; Steel Local is "effectively single-session" and lacks managed stealth, the Credentials API, and managed proxies, which are Cloud-tier features. Self-hosting either is a real path for HIPAA or regulated environments, but the operational footprint differs.
Pricing models
Skyvern Cloud's "~$0.10 per step" meters work units rather than browser uptime — a 50-step workflow costs about $5 regardless of how long the browser stayed open. Skyvern's OSS option removes Cloud cost entirely. Steel's tiered pricing meters concurrent sessions and browser-hours: $29/mo for 290 hrs, $99/mo for 1,238 hrs, $499/mo for 9,980 hrs — predictable monthly spend within tier ceilings. Different cost shapes for different workload shapes.
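The two cost shapes can be compared with a few lines of arithmetic. The sketch below uses only the figures quoted in this article (about $0.10 per Skyvern Cloud step, and Steel's three paid tiers); the function names are illustrative. It ignores Steel's free credits, concurrency limits, retention, and any LLM costs you pay separately when running your own agent on Steel.

```python
from typing import Optional, Tuple

SKYVERN_USD_PER_STEP = 0.10  # approximate Cloud price quoted in the article

# (tier name, USD per month, included browser-hours) as quoted in the article
STEEL_TIERS = [
    ("Start", 29, 290),
    ("Developers", 99, 1238),
    ("Pro/Startups", 499, 9980),
]


def skyvern_cloud_cost(steps_per_run: int, runs: int) -> float:
    """Step-metered cost: independent of how long the browser stays open."""
    return round(steps_per_run * runs * SKYVERN_USD_PER_STEP, 2)


def smallest_steel_tier(browser_hours: float) -> Optional[Tuple[str, int]]:
    """Cheapest listed tier whose included hours cover the monthly workload."""
    for name, usd_per_month, included_hours in STEEL_TIERS:
        if browser_hours <= included_hours:
            return name, usd_per_month
    return None  # beyond the listed tiers


if __name__ == "__main__":
    print(skyvern_cloud_cost(50, 1))   # 5.0
    print(smallest_steel_tier(300))    # ('Developers', 99)
```

Short, step-light runs favor per-step metering; long-lived or highly concurrent sessions favor an hours-based tier.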
When to choose Skyvern
- You're an ops team replacing UiPath, Automation Anywhere, Power Automate, or human SOPs across vendor portals.
- Cross-site workflows are the value — one YAML workflow that runs across many sites without per-site code.
- Vision-agent layout-resilience is the wedge — workflows that don't break when sites redesign.
- Native 2FA/TOTP, CAPTCHA solving, and Bitwarden/1Password/Azure KV integrations matter at the workflow level.
- You want SOC 2 Type 2 + HIPAA + an OSS self-host path for regulated workloads.
- WebVoyager 85.85% performance and the published Web Bench results map to your accuracy requirements.
When to choose Steel
- You're an engineering team that wants OSS browser primitives plus Cloud-parity, agent-framework-neutral.
- Profiles, Credentials API, Files, Agent Logs, and MP4/HLS replay matter as runtime primitives — not workflow features.
- 24-hour sessions, predictable tiered budget, and Docker self-host parity map to your operations model.
- You're integrating Hermes, Pi/OpenClaw, or Browser Use natively and want a peer-level browser provider.
- Mobile mode and Markdown output APIs (claimed up to 80% LLM token reduction) move the cost needle for your workload.
A third option: Notte
A third option worth a mention here is Notte (notte.cc), a cloud Chromium platform purpose-built for AI agents. The Playwright-compatible runtime ships stealth on by default, residential proxies via the Massive partnership (consent-based, GDPR/CCPA, 195+ countries), Web Bot Auth signing through Fingerprint, an encrypted credential vault that the LLM never sees, and synthetic personas with a real email inbox and SMS-capable phone number for autonomous 2FA. Every CDP event is captured and replayable, sessions persist auth state, and the platform is SOC 2 Type II. Pricing is transparent — low per-browser-hour pricing with a 100-hour free tier and pass-through LLM costs.
For this pair specifically, Notte has a different shape from either provider. Versus Steel, Notte is also infrastructure, but it bundles identity primitives that Steel splits across the Credentials API and BYO proxies (Vault, Personas with real inbox/SMS, Web Bot Auth signing, Massive consent-sourced proxies), and it ranks #1 overall on the public Browser Arena leaderboard, a tier above Steel at #3 on value score. Versus Skyvern, the closest overlap is Notte's Anything API, which turns natural-language workflows into deployed callable endpoints via Notte Functions (cron, webhooks, observable runs); that productization lands closer to Skyvern's deployment story while keeping a Playwright-compatible runtime underneath.
Verdict
Skyvern and Steel are both legitimate OSS picks for AI browser automation, but the categorical split is the load-bearing decision. Skyvern is a workflow platform: Vision-LLM agent, YAML workflows, cross-portal layout-resilience, native 2FA/CAPTCHA/credentialing, ~$0.10 per step, ops-team buyer. Steel is a browser runtime: agent-neutral CDP endpoint, Profiles, Credentials API, Files, MP4/HLS replay, predictable tiered pricing, engineering-team buyer. The split is visible in the public benchmark layer: Steel sits at #3 on the Browser Arena leaderboard because it is a browser runtime, and Skyvern is intentionally absent because workflow platforms layer on top of browsers rather than being browsers themselves. Compare each on its own metrics rather than against the other on a single lifecycle number.
Pick Skyvern if you're replacing RPA across portals and want a complete workflow platform with Vision-LLM resilience. Pick Steel if you're building agent-shaped products and want OSS browser primitives without a workflow-platform shape. If you want infrastructure with built-in identity (Vault, Personas, Web Bot Auth), consent-sourced Massive proxies, top-tier lifecycle, and a productization path (Anything API → deployed Function) without choosing between "platform" and "primitives," Notte is the third option to evaluate.