Browser Hooks
Agent-facing API for the in-app CEF browser, covering page reading, navigation, typed input, and raw eval. 20 hooks across 3 permission scopes.
Quick start
// 1. Find a browser.
const { browsers } = await rp.browser.list({});
const { id: browserId } = browsers[0];
// 2. Read the page.
const { markdown } = await rp.browser.snapshot({ browserId });
// [a1 h1] Pricing
// [a2 button] Start free trial
// [a3 input email] placeholder="work email"
// 3. Act.
await rp.browser.fill({ browserId, elementId: "a3", value: "user@example.com" });
await rp.browser.click({ browserId, elementId: "a2", returnSnapshot: true });
Reading
browser.snapshot
Capture a tagged, agent-friendly view of the page. Visible interactive and textual elements get a short synthetic ID (e.g. a3) written to the DOM as data-rp-id="a3" so input hooks can address them.
Params: { browserId, includeHidden?, maxElements?, viewportOnly? }
Returns: { markdown, url, title, elementCount, truncated }
Permission: browser:read
Format:
- Most elements: [id tag role?] text
- Form inputs: [id input type] placeholder="..." value="..."
- Anchors: [id link href] text
- Nested frames (not traversed in v1): [id iframe] src=...
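The line formats above are regular enough to parse on the caller's side. The following is a minimal sketch, not part of the API: parseSnapshotLine and findElementId are hypothetical helpers, and the regular expression assumes that IDs and tags contain no spaces and that the bracketed part contains no closing bracket.

```javascript
// Hypothetical helpers (not part of the browser hooks API).
// Assumes each element line looks like "[id tag qualifier?] rest".
function parseSnapshotLine(line) {
  const match = /^\[(\S+)\s+([^\s\]]+)(?:\s+([^\]]+))?\]\s*(.*)$/.exec(line.trim());
  if (!match) return null; // not an element line
  const [, id, tag, qualifier, rest] = match;
  return { id, tag, qualifier: qualifier ?? null, rest };
}

// Return the ID of the first element with the given tag whose text contains `text`.
function findElementId(markdown, tag, text) {
  for (const line of markdown.split("\n")) {
    const el = parseSnapshotLine(line);
    if (el && el.tag === tag && el.rest.includes(text)) return el.id;
  }
  return null;
}
```

With the Quick start snapshot, findElementId(markdown, "button", "Start free trial") would yield "a2", which can then be passed to an input hook as elementId.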
browser.axTree
Return the full accessibility tree via CDP Accessibility.getFullAXTree. Use it when snapshot omits elements that it wrongly treats as hidden.
Params: { browserId }
Returns: { nodes } - raw AX tree nodes
Permission: browser:read
browser.screenshot
Capture a PNG of the browser as a base64 data URL.
Params: { browserId, fullPage?, quality? }
Returns: { dataUrl, width, height }
Permission: browser:read
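Since the screenshot comes back as a base64 data URL, callers that need raw bytes have to decode it. A minimal sketch, assuming the usual data:<mime>;base64,<payload> shape and a runtime that provides the standard atob global; dataUrlToBytes is a hypothetical helper, not part of the API.

```javascript
// Hypothetical helper: decode a base64 data URL into its MIME type and raw bytes.
// Assumes the form "data:<mime>;base64,<payload>" and a global atob().
function dataUrlToBytes(dataUrl) {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) throw new Error("Not a base64 data URL");
  const binary = atob(match[2]); // one character per byte
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return { mimeType: match[1], bytes };
}
```

The resulting bytes can be written to disk or handed to an image library, depending on what the host runtime allows.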
Navigation
browser.navigate
Navigate to a URL and wait for the load condition. waitUntil accepts "load", "domcontentloaded", or "networkidle".
Params: { browserId, url, waitUntil?, timeoutMs? } | Returns: { url, title } | browser:write
browser.goBack / browser.goForward / browser.reload
Move back or forward in history, or reload the page, then wait for the load condition.
Params: { browserId, waitUntil?, timeoutMs? } | Returns: { url, title } | browser:write
browser.waitForLoad
Wait until the browser reaches the given ready state.
Params: { browserId, waitUntil?, timeoutMs? } | Returns: { url, title } | browser:read
browser.waitForElement
Wait until an element matching a CSS selector appears and, optionally, is visible.
Params: { browserId, selector, visible?, timeoutMs? } | Returns: { elementId } | browser:read
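The navigation hooks compose naturally with the input hooks. The sketch below navigates, waits for a selector, and clicks the element it resolves to. openAndClick is a hypothetical helper, the timeout values are arbitrary, and it assumes the elementId returned by waitForElement is accepted by click. The rp object is passed in as a parameter so the function can be exercised against a stub.

```javascript
// Sketch: navigate, wait for a selector, then click the matched element.
// `rp` is passed in rather than read from a global so a stub can stand in for it.
async function openAndClick(rp, browserId, url, selector) {
  await rp.browser.navigate({
    browserId, url, waitUntil: "domcontentloaded", timeoutMs: 15000,
  });
  const { elementId } = await rp.browser.waitForElement({
    browserId, selector, visible: true, timeoutMs: 5000,
  });
  // Assumption: the returned elementId is valid for input hooks such as click.
  return rp.browser.click({ browserId, elementId, returnSnapshot: true });
}
```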
Input
Every input hook supports returnSnapshot?: boolean - when true, a fresh snapshot is appended to the response.
browser.click
Click an element by elementId or at raw (x, y) coordinates. Passing both raises InvalidParams.
Params: { browserId, elementId?, x?, y?, button?, clickCount?, modifiers?, returnSnapshot? } | Returns: { clicked, coords, snapshot? } | browser:write
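Because elementId and (x, y) are mutually exclusive, it can help to validate the target before the call. A minimal sketch follows; clickParams is a hypothetical helper, and rejecting a target with neither form is an assumption based on the InvalidParams description.

```javascript
// Hypothetical helper: build click params for exactly one addressing mode.
function clickParams(browserId, target, options = {}) {
  const hasId = typeof target.elementId === "string";
  const hasCoords = typeof target.x === "number" && typeof target.y === "number";
  if (hasId === hasCoords) {
    // Either both forms were given or neither was.
    throw new Error("Pass either elementId or both x and y");
  }
  return hasId
    ? { browserId, elementId: target.elementId, ...options }
    : { browserId, x: target.x, y: target.y, ...options };
}
```

Usage would look like rp.browser.click(clickParams(browserId, { elementId: "a2" }, { returnSnapshot: true })).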
browser.type
Type text into an element or, if elementId is omitted, into the focused control. Sends per-character key events.
Params: { browserId, elementId?, text, clearFirst?, returnSnapshot? } | Returns: { ok, snapshot? } | browser:write
browser.fill
Set an input/textarea/select’s value and dispatch input+change events. React-safe via native setter.
Params: { browserId, elementId, value, returnSnapshot? } | Returns: { ok, snapshot? } | browser:write
Use fill for React/Vue form fields; use type for terminals / search-as-you-type.
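"React-safe via native setter" most likely refers to a widely used DOM technique: invoking the value setter defined on the element's prototype rather than assigning to the instance, then dispatching the events frameworks listen for. The sketch below illustrates that general technique as it would run inside a page. It is not the hook's actual implementation, which this page does not show, and fillNative is a hypothetical name.

```javascript
// Illustration of the general "native setter" technique, not the hook's source.
// Frameworks such as React may shadow `value` on the element instance, so the
// setter is looked up on the prototype and invoked directly.
function fillNative(el, value) {
  const proto = Object.getPrototypeOf(el); // e.g. HTMLInputElement.prototype
  const descriptor = Object.getOwnPropertyDescriptor(proto, "value");
  if (!descriptor || !descriptor.set) {
    throw new Error("Element has no native value setter");
  }
  descriptor.set.call(el, value);
  // Bubbling events let listeners attached higher in the tree observe the change.
  el.dispatchEvent(new Event("input", { bubbles: true }));
  el.dispatchEvent(new Event("change", { bubbles: true }));
}
```

Per-character typing, by contrast, goes through key events, which is why type suits terminals and search-as-you-type fields.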
browser.pressKey
Press a key combination. Keys use DOM KeyboardEvent names (Enter, Escape, ArrowDown, a, …). Modifiers: any subset of ["ctrl", "shift", "alt", "meta"].
Params: { browserId, elementId?, key, modifiers?, returnSnapshot? } | Returns: { ok, snapshot? } | browser:write
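Callers that store shortcuts as strings can convert them into the key and modifiers params. A minimal sketch: parseKeyCombo and the "ctrl+shift+ArrowDown" string syntax are hypothetical conveniences, not part of the API, and the helper does not handle "+" itself as the key.

```javascript
// Hypothetical helper: turn "ctrl+shift+ArrowDown" into pressKey params.
const MODIFIERS = new Set(["ctrl", "shift", "alt", "meta"]);

function parseKeyCombo(combo) {
  const parts = combo.split("+").map((p) => p.trim()).filter(Boolean);
  if (parts.length === 0) throw new Error("Empty key combination");
  const key = parts.pop(); // the last token is the key; its case is preserved
  const modifiers = [...new Set(parts.map((p) => p.toLowerCase()))];
  for (const m of modifiers) {
    if (!MODIFIERS.has(m)) throw new Error(`Unknown modifier: ${m}`);
  }
  return { key, modifiers };
}

// Usage: await rp.browser.pressKey({ browserId, ...parseKeyCombo("ctrl+Enter") });
```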
browser.scroll
Scroll the page or an element by delta, or to top/bottom.
Params: { browserId, elementId?, deltaX?, deltaY?, toTop?, toBottom?, returnSnapshot? } | Returns: { ok, snapshot? } | browser:write
browser.hover
Move cursor over an element without clicking (triggers hover state).
Params: { browserId, elementId, durationMs?, returnSnapshot? } | Returns: { ok, snapshot? } | browser:write
browser.focus
Focus an element without clicking (useful before type when clicking would fire side effects).
Params: { browserId, elementId, returnSnapshot? } | Returns: { ok, snapshot? } | browser:write
Discovery
browser.list
List browsers in the caller’s scope. No filter = caller’s active place only.
Params: { placeId?, projectId?, resourceId? } | Returns: { browsers: [{ id, url, title, placeId?, resourceId?, visible, focused }] } | browser:read
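The Quick start takes browsers[0], which fails when the list is empty and ignores the visible and focused flags. A slightly more defensive sketch is shown below; pickBrowser is a hypothetical helper, and the preference order (focused, then visible, then first) is an assumption rather than documented behavior.

```javascript
// Hypothetical helper: choose a browser from browser.list results.
// Preference order (an assumption): focused, then visible, then first listed.
function pickBrowser(browsers) {
  if (browsers.length === 0) return null; // caller decides how to handle "none"
  return (
    browsers.find((b) => b.focused) ??
    browsers.find((b) => b.visible) ??
    browsers[0]
  );
}
```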
browser.getUrl / browser.getTitle
Cheap, cached lookups of the current URL and title.
Params: { browserId } | Returns: { url } / { title } | browser:read
Raw eval
browser.eval
Run a JavaScript expression in the browser’s top frame and return its JSON-serializable result. High-trust: requires browser:eval scope + per-origin user consent on first use.
Params: { browserId, script, timeoutMs? } (timeoutMs defaults to 5000)
Returns: { value }
Permission: browser:eval
Reaches cookies, localStorage, page DOM - treat as full-page access to that origin.
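Since script is a single expression string, passing data into it by string concatenation is error-prone. One option, sketched here, is to serialize a self-contained function together with JSON-encoded arguments. buildEvalScript is a hypothetical helper; it only works for arrow functions or function expressions that do not close over outer variables, and for arguments that are JSON-serializable.

```javascript
// Hypothetical helper: serialize a function and its arguments into one expression.
// JSON.stringify keeps argument values from being interpreted as code.
function buildEvalScript(fn, args = []) {
  return `(${fn.toString()})(...${JSON.stringify(args)})`;
}

// Usage (inside an async function):
// const script = buildEvalScript((sel) => document.querySelectorAll(sel).length, ["a"]);
// const { value } = await rp.browser.eval({ browserId, script });
```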
Permission scopes
| Scope | Hooks |
|---|---|
| browser:read | list, snapshot, axTree, screenshot, getUrl, getTitle, waitForLoad, waitForElement |
| browser:write | navigate, goBack, goForward, reload, click, type, fill, pressKey, scroll, hover, focus |
| browser:eval | eval (isolated - rarely needed) |
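The scope table can be turned into a lookup that derives the capabilities a manifest needs from the hooks an applet calls. A minimal sketch, with the table above copied in as data; requiredCapabilities is a hypothetical helper and hook names are written without the browser. prefix.

```javascript
// Scope table from this page, keyed by scope.
const HOOKS_BY_SCOPE = {
  "browser:read": ["list", "snapshot", "axTree", "screenshot", "getUrl", "getTitle", "waitForLoad", "waitForElement"],
  "browser:write": ["navigate", "goBack", "goForward", "reload", "click", "type", "fill", "pressKey", "scroll", "hover", "focus"],
  "browser:eval": ["eval"],
};

// Inverted index: hook name -> scope.
const SCOPE_BY_HOOK = new Map(
  Object.entries(HOOKS_BY_SCOPE).flatMap(([scope, hooks]) => hooks.map((h) => [h, scope]))
);

// Hypothetical helper: the sorted, de-duplicated scopes needed for a set of hooks.
function requiredCapabilities(hooks) {
  const scopes = new Set();
  for (const hook of hooks) {
    const scope = SCOPE_BY_HOOK.get(hook);
    if (!scope) throw new Error(`Unknown hook: ${hook}`);
    scopes.add(scope);
  }
  return [...scopes].sort();
}
```

For an applet that calls snapshot, fill, and click, this yields the two capabilities browser:read and browser:write.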
Errors
| Error | When |
|---|---|
| BrowserNotFound | browserId doesn’t resolve to a live browser |
| ElementNotFound | [data-rp-id="X"] not in DOM - re-snapshot and retry |
| ElementNotVisible | Element is offscreen / zero-size |
| ElementDisabled | Input is disabled or pointer-events: none |
| EvalTimeout | Eval exceeded its timeout |
| NavigateTimeout | Navigation didn’t finish in timeoutMs |
| InvalidParams | Required arg missing or mutually-exclusive args both set |
| EvalConsentDenied | User rejected the first-call eval prompt |
| CapabilityDenied | Caller manifest lacks the required scope |
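The "re-snapshot and retry" advice for ElementNotFound can be wrapped in a small loop. This page does not say how errors are surfaced, so the sketch below assumes a failed hook rejects with an error whose name matches the table entry; adjust isElementNotFound to whatever your runtime actually provides. clickWithResnapshot and its locate callback are hypothetical.

```javascript
// Assumption: a failed hook rejects with an error whose `name` is the table entry.
const isElementNotFound = (err) => Boolean(err) && err.name === "ElementNotFound";

// Sketch: snapshot, locate the element, click; on ElementNotFound, take a fresh
// snapshot and try again, up to maxAttempts times.
async function clickWithResnapshot(rp, browserId, locate, maxAttempts = 3) {
  let lastError = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const { markdown } = await rp.browser.snapshot({ browserId });
    const elementId = locate(markdown); // returns an ID such as "a2", or null
    if (elementId === null) throw new Error("Element not present in snapshot");
    try {
      return await rp.browser.click({ browserId, elementId });
    } catch (err) {
      if (!isElementNotFound(err)) throw err; // other errors are not retried
      lastError = err;
    }
  }
  throw lastError;
}
```

Synthetic IDs are reassigned by each snapshot, which is why the loop locates the element again instead of reusing the stale ID.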
Manifest
{
  "capabilities": ["browser:read", "browser:write"]
}
Applets that need browser.eval must also declare "browser:eval" - users see a high-trust warning at install time and are prompted per-origin on first use.