Claude Code in Your Actual Browser? Meet FoxPilot.

Bala Kumar, Senior Software Engineer

Ever wished Claude Code could just take over the browser tab you're already staring at and fill that form, scrape that page, or automate that workflow without spinning up a whole new sandboxed browser?

That's exactly what FoxPilot does.

The Problem

Claude's native browser tool is impressive, but it has real limitations:

  • Bot detection blocks it constantly - sites see the automation signals and serve CAPTCHAs or block outright
  • Different sessions - it runs in an isolated browser, so you're not logged into your actual accounts
  • Browser extensions are flaky - existing solutions break on page refreshes, miss state changes, or just stop responding

Every time I wanted Claude to help with a form on a site I was already logged into, I'd hit a wall. The native browser couldn't see my session. Extensions would lose connection mid-task.

The Solution

FoxPilot is an MCP server paired with a Firefox/Chrome extension that lets AI assistants drive your actual browser - the one with your logins, your cookies, your history.

What it gives you out of the box:

Feature | What it means
--- | ---
Tab & window management | Open, close, group, reorder tabs; resize windows
History search | Search your actual browsing history
Page reading | Read page text and links (with your consent)
Text finding | Find and highlight text in tabs
Accessibility snapshots | Tagged interactive elements for precise targeting
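
To make the "accessibility snapshot" idea concrete, here is a minimal sketch of how interactive elements could be tagged with short references that an assistant can target later. This is an illustration under assumptions, not FoxPilot's actual implementation: the dict-based tree, the `tag_interactive` function, and the `e1`, `e2` reference format are all hypothetical.

```python
# Hypothetical sketch: FoxPilot's real snapshot format may differ.
# Roles below are standard ARIA roles for elements a user can act on.
INTERACTIVE_ROLES = {"button", "link", "textbox", "checkbox", "combobox"}


def tag_interactive(node, tagged=None):
    """Walk a simplified accessibility tree depth-first and tag interactive nodes.

    Each node is a dict with an optional "role", "name", and "children".
    Returns a flat list of {"ref", "role", "name"} entries in document order.
    """
    if tagged is None:
        tagged = []
    if node.get("role") in INTERACTIVE_ROLES:
        ref = f"e{len(tagged) + 1}"  # short, stable-within-snapshot handle
        tagged.append({"ref": ref, "role": node["role"], "name": node.get("name", "")})
    for child in node.get("children", []):
        tag_interactive(child, tagged)
    return tagged


page = {
    "role": "form",
    "children": [
        {"role": "heading", "name": "Sign up"},
        {"role": "textbox", "name": "Email"},
        {"role": "button", "name": "Submit"},
    ],
}
snapshot = tag_interactive(page)
```

The point of the flat list is that the assistant can say "fill e1" or "click e2" instead of guessing at CSS selectors.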

And when you flip on Automation Mode:

Feature | What it means
--- | ---
Page interaction | Click, hover, fill fields, type text, press keys, drag elements
Form automation | Multi-step form filling in one shot
File uploads | Upload files into file inputs
JavaScript evaluation | Run JS in the page and get results back
Screenshots | Viewport, full page, or single element
Console & network capture | See logs and requests
Dialog handling | Handle native browser dialogs

Why this matters

Low overhead. FoxPilot runs locally and communicates over stdio, so there's no remote API round trip for each browser action. The MCP server is right there on your machine.
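
For a sense of what "communicates over stdio" means in practice: MCP messages are JSON-RPC 2.0, and the stdio transport sends each one as a single newline-delimited line. The sketch below frames one `tools/call` request that way. The helper names and the tool name `list_tabs` are assumptions for illustration, not FoxPilot's documented tool names.

```python
import json


def frame_tool_call(request_id: int, tool_name: str, arguments: dict) -> bytes:
    """Encode one MCP `tools/call` request as a single newline-terminated JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the payload itself
    # never contains a raw newline and one message is exactly one line.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def parse_line(line: bytes) -> dict:
    """Decode one line read from the peer's stdout back into a message dict."""
    return json.loads(line.decode("utf-8"))


# "list_tabs" is a hypothetical tool name used only for this example.
frame = frame_tool_call(1, "list_tabs", {"window": "current"})
```

A client would write `frame` to the server process's stdin and read the response line from its stdout, with no network hop in between.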

Privacy first. Page interaction, scripting, screenshots - all off by default. You explicitly enable Automation Mode when you need it.
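
The off-by-default behavior can be pictured as a simple gate in front of every tool call. This is a rough sketch, assuming a hypothetical `Session` class and made-up tool names; how FoxPilot actually enforces the switch is not shown here.

```python
from dataclasses import dataclass

# Hypothetical tool names, grouped the way the feature tables above group them.
READ_ONLY_TOOLS = {"list_tabs", "search_history", "read_page", "find_text", "snapshot"}
AUTOMATION_TOOLS = {"click", "fill_form", "upload_file", "evaluate_js", "screenshot"}


@dataclass
class Session:
    # Automation Mode starts disabled; the user has to switch it on.
    automation_enabled: bool = False

    def is_allowed(self, tool: str) -> bool:
        """Return True only if the tool may run in the current mode."""
        if tool in READ_ONLY_TOOLS:
            return True
        if tool in AUTOMATION_TOOLS:
            return self.automation_enabled
        return False  # unknown tools are denied rather than guessed at


session = Session()
blocked = session.is_allowed("click")      # False until the user opts in
session.automation_enabled = True
permitted = session.is_allowed("click")    # True once Automation Mode is on
```

Denying unknown tools by default is a choice made for this sketch; it keeps a newly added capability from slipping past the gate unreviewed.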

Your session, your browser. This isn't a throwaway automation browser. It's Firefox or Chrome with your profile, your extensions, your bookmarks. Claude works with the web as you experience it.

Real use cases

  • "Close all tabs I haven't touched in 24 hours"
  • "Find that article about Milford Track in my history and reopen it"
  • "Go to Hacker News, open the top story, read it and the comments, tell me if they agree"
  • "Search Google Scholar for L-theanine papers from the last 3 years, open the 3 most cited, summarize them"
  • "Fill out this job application form using my resume"
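
The first request in that list boils down to a timestamp filter. Here is a minimal sketch of the selection step, assuming a simplified `Tab` record with a last-accessed time; the record shape and the `stale_tabs` helper are hypothetical stand-ins, not FoxPilot's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Tab:
    """Simplified stand-in for the tab data a browser extension can report."""
    id: int
    title: str
    last_accessed: datetime


def stale_tabs(tabs, now, max_age=timedelta(hours=24)):
    """Return the tabs whose last access is older than `max_age` before `now`."""
    cutoff = now - max_age
    return [tab for tab in tabs if tab.last_accessed < cutoff]


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
tabs = [
    Tab(1, "Inbox", now - timedelta(hours=1)),
    Tab(2, "Old docs", now - timedelta(hours=30)),
    Tab(3, "Milford Track guide", now - timedelta(days=3)),
]
to_close = [tab.id for tab in stale_tabs(tabs, now)]
```

The assistant would then pass the resulting ids to whatever tab-closing tool the server exposes.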

Get it

FoxPilot is available now from the repository linked below.

Open source, MIT licensed. Give it a spin and let me know what you automate first.

Source: github.com/balakumardev/foxpilot