How to Control the Browser with OpenClaw: Complete Beginner Tutorial

Prerequisites
- Node.js 22+ (recommended to install the latest LTS using nvm)
- Google Chrome or Chromium installed (latest stable version)
- Administrator privileges (for global installation and daemon)
- Basic command line proficiency
- (Optional) Telegram / Discord / Slack or other messaging channels for Agent interaction
- (Linux users) Google Chrome .deb package (to avoid Snap restrictions)
OpenClaw is an open-source AI Agent framework that enables true browser automation through the Chrome DevTools Protocol (CDP): opening web pages, clicking, typing, scrolling, taking screenshots, executing JS, and more. It supports two core modes: Managed Browser (isolated instances) and Relay Extension (controlling your real browser).
Step 1: Install OpenClaw
- Globally install the latest version:
npm install -g openclaw@latest
# or pnpm add -g openclaw@latest
- Run the initial setup (recommended):
openclaw onboard --install-daemon
This will automatically install the system service (macOS launchd / Linux systemd).
- Verify installation:
openclaw --version
openclaw gateway status
Seeing the version number and Gateway running status means success.
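If the install or verification fails, the Node.js 22+ prerequisite is a common first thing to rule out. The following is a minimal sketch of such a check; the helper names (node_major, meets_minimum, check_prereqs) are hypothetical and are not part of OpenClaw.

```shell
# Sketch only: helper names are hypothetical, not OpenClaw commands.

# Extract the major version from a string such as "v22.11.0".
node_major() {
  local v="${1#v}"   # drop the leading "v"
  echo "${v%%.*}"    # keep everything before the first dot
}

# Succeed only if the version string meets the minimum major version
# (second argument, defaulting to 22 as listed in the prerequisites).
meets_minimum() {
  [ "$(node_major "$1")" -ge "${2:-22}" ]
}

# Report whether the node binary on PATH satisfies the prerequisite.
check_prereqs() {
  if ! command -v node >/dev/null 2>&1; then
    echo "node not found on PATH" >&2
    return 1
  fi
  if ! meets_minimum "$(node --version)"; then
    echo "Node.js 22+ required, found $(node --version)" >&2
    return 1
  fi
  echo "Node.js $(node --version) OK"
}
```

Calling check_prereqs before npm install -g openclaw@latest gives an early, readable error instead of a failed install.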
Step 2: Enable Managed Browser
This is the safest and most isolated approach. OpenClaw will launch an independent Chromium instance.
- Create and edit the configuration file:
mkdir -p ~/.openclaw
cat > ~/.openclaw/openclaw.json << 'EOF'
{
  "browser": {
    "enabled": true,
    "headless": false,
    "defaultProfile": "openclaw",
    "profiles": {
      "openclaw": {
        "cdpPort": 18800
      }
    }
  }
}
EOF
- Start the browser instance:
openclaw browser --browser-profile openclaw start
- Test CLI operations:
openclaw browser --browser-profile openclaw open https://www.example.com
openclaw browser --browser-profile openclaw snapshot
Expected output: Browser window opens and the console returns the page snapshot as JSON.
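Note that the cat > redirect used in this step overwrites any existing ~/.openclaw/openclaw.json. If onboarding already created that file, it may be worth keeping a copy first. A minimal sketch, where backup_config is a hypothetical helper name:

```shell
# Sketch only: backup_config is a hypothetical helper, not an OpenClaw command.
# Copies the given file to <file>.bak.<timestamp> and prints the backup path.
# Does nothing (and succeeds) if the file does not exist yet.
backup_config() {
  local cfg="$1"
  [ -f "$cfg" ] || return 0
  local bak="$cfg.bak.$(date +%Y%m%d%H%M%S)"
  cp -p "$cfg" "$bak" && echo "$bak"
}
```

Running backup_config ~/.openclaw/openclaw.json before writing the new config leaves the previous version next to it for comparison or rollback.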
Step 3: Configure Relay Extension (Control Already Logged-in Browser)
Perfect for scenarios needing cookies and logged-in states (e.g., Taobao, banking sites).
- Install the extension files (must run first):
openclaw browser extension install
openclaw browser extension path
Copy the output path (usually ~/.openclaw/browser/chrome-extension).
- Load in Chrome:
  - Open chrome://extensions/
  - Enable "Developer mode"
  - Click "Load unpacked" → select the path above
- Configure the extension (enter the Gateway Token in the popup window):
  - The Token is usually the default (or run openclaw gateway token to check)
  - Confirm connection success (extension shows "Connected").
- Add an existing-session profile to the config file:
{
  "browser": {
    "profiles": {
      "user": {
        "driver": "existing-session",
        "attachOnly": true
      }
    }
  }
}
- Restart the Gateway and test:
openclaw browser --browser-profile user status
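The profile fragment in this step is meant to be added to the existing config file, not to replace it. Assuming the Step 2 settings and the new "user" profile simply combine under browser.profiles, the merged file might be written as follows. The helper name write_merged_config is hypothetical, and the target path is passed explicitly so nothing is overwritten by accident.

```shell
# Sketch only: write_merged_config is a hypothetical helper.
# Writes a config that contains both the managed "openclaw" profile from
# Step 2 and the "user" profile from Step 3 to the path given as $1.
write_merged_config() {
  cat > "$1" << 'EOF'
{
  "browser": {
    "enabled": true,
    "headless": false,
    "defaultProfile": "openclaw",
    "profiles": {
      "openclaw": {
        "cdpPort": 18800
      },
      "user": {
        "driver": "existing-session",
        "attachOnly": true
      }
    }
  }
}
EOF
}
```

For example, write_merged_config ~/.openclaw/openclaw.json replaces the file with the merged version; any other settings already in that file would need to be carried over by hand.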
Step 4: Let the Agent Control the Browser with Natural Language
OpenClaw's most powerful feature is that the AI Agent understands natural language instructions (including Chinese), so no coding is needed.
- Start the Agent channel (Telegram example, assuming botToken is configured):
openclaw gateway start
- Send example commands in the chat:
- "Open https://baidu.com and search for OpenClaw"
- "Click the login button on the current page, then enter my email address"
- "Scroll to the bottom of the page and save a screenshot as report.png"
- "Search for iPhone on Taobao and tell me the prices of the first three products"
Typical Agent replies:
Opened the Baidu homepage and am now searching for OpenClaw... Found 1,234,567 results. The first one is the GitHub repository... Operation completed. Screenshot has been saved.
The Agent will automatically use the browser toolchain (navigate / click / type / snapshot, etc.).
Step 5: Advanced Operations & CLI Quick Commands
Debug directly with the CLI (no Agent needed):
# Navigate
openclaw browser navigate https://github.com/openclaw/openclaw
# Click element (using ref ID)
openclaw browser click 12
# Type text
openclaw browser type 23 "hello world"
# Full-page screenshot
openclaw browser screenshot --full-page
# Generate PDF
openclaw browser pdf
Use --browser-profile user to switch to your real browser.
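Typing the profile flag on every command gets repetitive when switching between the managed and real browser. A small shell wrapper can fill it in. This is a sketch: ocb is a hypothetical function name, and OPENCLAW_PROFILE is a variable invented here for illustration, not one that OpenClaw itself reads.

```shell
# Sketch only: ocb and OPENCLAW_PROFILE are hypothetical names.
# Forwards all arguments to "openclaw browser", inserting the
# --browser-profile flag with the chosen profile (default: openclaw).
ocb() {
  openclaw browser --browser-profile "${OPENCLAW_PROFILE:-openclaw}" "$@"
}

# Usage:
#   ocb snapshot                        # managed profile
#   OPENCLAW_PROFILE=user ocb status    # real browser via the "user" profile
```

The flag is placed between browser and the subcommand, matching the form used in Steps 2 and 3.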
Common Issues & Troubleshooting
- Browser fails to start (Failed to start Chrome CDP): Linux users should install the Google Chrome .deb package and set "executablePath": "/usr/bin/google-chrome-stable" and "noSandbox": true.
- "Bundled Chrome extension is missing": Reinstall globally (npm install -g openclaw@latest) or check permissions.
- "No Chrome tabs found": Ensure Chrome has at least one tab open, or switch to a managed profile.
- Relay won't connect: Restart extension + Gateway, verify Token, and allow localhost:18791 in firewall.
- Black screen in headless mode: Set "headless": true for managed mode; remote services (like Browserbase) need separate configuration.
- Permission issues: Use --no-sandbox on Linux; check user data directory on macOS.
Most problems are fixed by restarting the Gateway: openclaw gateway restart.
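When a restart does not help, it can be useful to check whether anything is listening on the ports mentioned in this tutorial (18791 for the relay, 18800 for the managed profile's cdpPort). The sketch below assumes bash, since it relies on bash's /dev/tcp redirection; port_open and report_port are hypothetical helper names.

```shell
# Sketch only (bash): port_open and report_port are hypothetical helpers.

# Succeeds if a TCP connection to host:port can be opened.
# The subshell closes the descriptor again as soon as it exits.
port_open() {
  local host="$1" port="$2"
  (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null
}

# Print a one-line status for a local port.
report_port() {
  if port_open 127.0.0.1 "$1"; then
    echo "port $1: listening"
  else
    echo "port $1: closed"
  fi
}

# Usage:
#   report_port 18791   # relay port named in the troubleshooting list
#   report_port 18800   # cdpPort from the Step 2 config
```

A "closed" result only shows that no local process accepted the connection; it does not say why.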
Next Steps
- Integrate remote browser services (Browserless or Browserbase) for 24/7 cloud operation.
- Add more Skills (Agent Browser Skill packages for better context).
- Deploy to VPS / Docker for fully automated operation.
- Explore multiple profile switching (separate work/personal profiles).
Run openclaw browser start now and start your browser automation journey!