================================================================================
XGODO DOCUMENTATION (FULL)
Generated: 2026-05-05T07:04:33.032Z
================================================================================
This file contains the Xgodo documentation for API and Automation.
Use the search index at /docs/search-index.json for structured queries.
================================================================================
SECTION: API DOCUMENTATION
================================================================================

## Authentication

Bearer Token Authentication. All API endpoints require authentication using a Bearer token. Include the following header in your requests (placeholder token shown):

    Authorization: Bearer <YOUR_API_TOKEN>

--------------------------------------------------------------------------------

## Jobs API

Manage job postings, retrieve applicants, and handle job-related operations.

### POST /api/v2/jobs/applicants
Title: Retrieve job applicants
Description: Retrieves a list of job applicants for a specified job. Allows for sorting, pagination, and searching within job applicants.

### PUT /api/v2/jobs/applicants
Title: Update job applicants
Description: Updates the status of specified job tasks. It supports changing the status, adding comments, and handles referral bonuses if the task is confirmed.

### GET /api/v2/jobs/details
Title: Get job details by ID
Description: Returns job details for a specific job ID. When the job is linked to an automation (`automation_id` is set), the response also includes the automation details.

### POST /api/v2/jobs/check-uniquenes
Title: Check for uniqueness of job proofs
Description: This endpoint checks if a given search term exists in the job proofs. You can specify the search option to look for exact words or substrings.

### POST /api/v2/jobs/myjobs
Title: Retrieve a list of jobs that you have posted
Description: This endpoint allows a user to retrieve a paginated list of jobs associated with their account.
The results can be filtered and sorted based on various parameters.

### POST /api/v2/jobs/search
Title: Search and filter jobs
Description: Returns a paginated list of jobs matching the provided filters. Supports search, category, price, location, and other filters.

### POST /api/v2/jobs/submit
Title: Create a new job posting
Description: This endpoint allows users to create a new job posting. It requires specific details about the job, including title, description, category, number of positions, job price, and duration.

### PUT /api/v2/jobs/update-status
Title: Update job status
Description: Updates the status of a specific job.

--------------------------------------------------------------------------------

## Tasks API

Manage job tasks, apply for tasks, and submit task completions.

### GET /api/v2/tasks/apply
Title: Apply for task submission to a job
Description: This API must be called before the tasks/submit API for jobs that have job variables. This endpoint allows workers to apply to an active job. Job variables (if any) will be returned.

### POST /api/v2/tasks/details
Title: Get recent tasks or details of a specific job task
Description: Returns the list of recent tasks of the user. If the task_id query param is provided, it will return details of a single task.

### POST /api/v2/tasks/submit
Title: Submit a task to a job
Description: Submits a completed task to a job. For jobs with job variables, call tasks/apply first.

### GET /api/v2/planned_tasks
Title: List unassigned planned tasks for a job
Description: Returns planned tasks for a job that have not yet been assigned to a job task (i.e. job_task_id is null). Supports search, pagination, and sort order. Only the job owner (or a user the job is shared with) can list its planned tasks.

### POST /api/v2/planned_tasks/submit
Title: Submit planned tasks for a job
Description: This endpoint allows users to submit multiple planned tasks for a specific job. Planned tasks are pre-defined inputs that will be used when workers apply for the job.

### DELETE /api/v2/tasks/delete
Title: Delete a job task by ID
Description: This endpoint allows employers to delete a job task. Accepts either a job_task_id or planned_task_id. Only tasks with certain statuses can be deleted to maintain data integrity (failed, declined, confirmed). When deleting via a planned_task_id, the planned task is preserved (with job_task_id set to null) so it can be reused, while only the job task is deleted.

### DELETE /api/v2/planned_tasks
Title: Delete one or more planned tasks
Description: Deletes unassigned planned tasks by their IDs.
Planned tasks already assigned to a pending or successful job task cannot be deleted — reinitiate or delete the job task first. All IDs must belong to jobs owned by (or shared with edit permission to) the caller.

### POST /api/v2/planned_tasks/reinitiate
Title: Re-initiate failed/declined tasks
Description: This endpoint allows employers to re-initiate one or more failed or declined tasks. Accepts either a single task_id or an array of task_ids. This makes the planned tasks available again for workers to apply.

--------------------------------------------------------------------------------

## Payments API

Handle payment operations including withdrawals.

### POST /api/v2/payments/withdraw
Title: Withdraw user balance
Description: Initiates a withdrawal to a wallet address.

### GET /api/v2/payments/withdrawals
Title: Get recent withdrawal details
Description: Returns the list & details of recent withdrawals made by the user.

--------------------------------------------------------------------------------

## User API

User account management and balance information.

### GET /api/v2/user/balance
Title: Get user balance
Description: Returns the user's balance.

--------------------------------------------------------------------------------

## Devices API

Device management and information retrieval endpoints.

### GET /api/v2/devices
Title: Get debuggable devices
Description: Retrieves a list of devices that are either owned by the user or rented with live payment type.

### GET /api/v2/devices/market
Title: List market devices (action / online)
Description: Returns devices that are listed on the market with payment_type `action` or `online`.

### GET /api/v2/devices/verified-phone-number
Title: Get the device's verified phone number
Description: Agent-scoped endpoint.
Identifies the device from the automation agent token and returns the phone number stored on the device record only when phoneNumberVerified is true. Returns null otherwise. Backs `agent.info.getVerifiedPhoneNumber` in the automation bootstrap.

--------------------------------------------------------------------------------

## Proxy API

Manage SOCKS5/HTTP proxy credentials. A connection is one credential pair (`connectionId` + `password`) routed through one of your devices. Re-pointing the association to a different device is the IP-rotation primitive — new sessions exit through the new device within ~100 ms; in-flight sessions stay on the old device until they close. Endpoints are scoped to devices the caller owns; the admin endpoint requires the admin token.

### POST /api/v2/devices/connections/list
Title: List my connections
Description: All connections owned by, or currently routed through, devices the caller owns. The `password` is plaintext — RMS owns the credential.

### POST /api/v2/devices/connections/create
Title: Create a connection
Description: Generates a fresh `connectionId` (UUID) and 16-char password, associates it with the chosen device, and syncs to the orchestrator + legacy mirror. The plaintext password is in the response — save it now.

### POST /api/v2/devices/connections/delete
Title: Delete a connection
Description: Cascades on RMS, orchestrator, and the legacy mirror. Active sessions are terminated.

### POST /api/v2/devices/connections/password/rotate
Title: Rotate password
Description: Replaces the connection's password.

### POST /api/v2/devices/connections/association/set
Title: Move connection to another device (IP rotation)
Description: Hot-swap the exit device. Owner must own both the source connection and the target device.

### POST /api/v2/devices/connections/association/delete
Title: Remove association
Description: Connection stays alive but cannot authenticate until re-associated.

### POST /api/v2/devices/connections/blacklist/list
Title: List per-connection blacklist
Description: Per-connection block list (in addition to global + per-device).

### POST /api/v2/devices/connections/blacklist/add
Title: Add domains to blacklist
Description: Adds one or more domains to the connection's blacklist.

### POST /api/v2/devices/connections/blacklist/remove
Title: Remove domains from blacklist
Description: Removes one or more domains from the connection's blacklist.

### POST /api/v2/devices/proxy-logs/list
Title: Query request logs
Description: One log entry per SOCKS5 / HTTP CONNECT attempt. Backed by ClickHouse; results are post-filtered to devices owned by the caller.

### POST /api/v2/devices/proxy-bandwidth
Title: Query bandwidth aggregates
Description: Per-(device, connection) byte counts, optionally grouped by hour or day. When `deviceId` is omitted the backend issues one query per owned device and merges.

### GET /api/v2/admin/devices/proxies
Title: List all proxy connections (admin)
Description: Returns every proxy connection across all users, joined with the associated device.

--------------------------------------------------------------------------------

## Bucket API

Per-(job, device) JSON bucket — the persistent state automations attach to a single hire on a single device. The endpoints below are the **employer-facing** surface, gated by job ownership / share permission. They mirror the same model as the **Buckets** tab on the Job details page.
(The agent surface used from inside automations is `agent.utils.bucket` — see the automation docs.)

### GET /api/v2/bucket/:job_id
Title: List buckets for a job
Description: Returns a paginated list of every bucket attached to this job, with the device display name resolved for each.

### DELETE /api/v2/bucket/:job_id
Title: Delete all buckets for a job
Description: Wipes every bucket associated with this job. Caller must have `edit` access (owner or shared-edit).

### GET /api/v2/bucket/:job_id/:device_id
Title: Read a single bucket
Description: Returns the bucket for this `(job_id, device_id)` pair. 404 when none exists or the job is inaccessible.

### PUT /api/v2/bucket/:job_id/:device_id
Title: Write a bucket
Description: Replaces the bucket payload for this `(job_id, device_id)` pair (no merge — `data` becomes the new bucket). Validated against the automation's `bucket_schema`.

### DELETE /api/v2/bucket/:job_id/:device_id
Title: Delete a bucket
Description: Removes the bucket for this `(job_id, device_id)` pair. Caller must have `edit` access.

--------------------------------------------------------------------------------

## Device Bucket API

Per-device JSON bucket, shared across every job and automation running on the device. Useful for cross-job state — login cookies, account tokens, app config. Distinct from the per-job bucket: there is no `bucket_schema` validation, and no job context is required, so direct-run automations can use it too. The agent calls these via `agent.utils.deviceBucket`.

### POST /api/v2/device-bucket
Title: Read device bucket
Description: Reads the device-scoped bucket. The `remote_device_id` is taken from the token when it carries a device context; non-agent callers must supply it in the body.

### POST /api/v2/device-bucket/set
Title: Merge into device bucket
Description: Merges the supplied object into the existing device bucket. Existing keys not in the payload are preserved.

--------------------------------------------------------------------------------

## Other API

File upload and temporary file access endpoints.

### POST /api/v2/files/upload
Title: Upload temporary file
Description: Uploads a file to temporary storage.
Files are automatically deleted after 15 minutes. Maximum file size is 50 MB. Use multipart/form-data.

### GET /api/v2/files/temp/:filename
Title: Get uploaded file
Description: Retrieves a previously uploaded temporary file. Files expire after 15 minutes from upload time.

--------------------------------------------------------------------------------

================================================================================
SECTION: AUTOMATION API DOCUMENTATION
================================================================================

### GUIDE

## Guide / Automation API Documentation
Path: /docs/automation
Description: Complete reference for automating Android devices. Build powerful automation scripts with full access to device controls, screen content, file operations, and more.
Sections:
- Guide
- Reference
- Quick Links
- Getting Started

Guide: Learn how to build automations with step-by-step tutorials and best practices.
Reference: Complete API reference for all interfaces, methods, types, and constants.
Quick Links:
- Agent Actions: All automation actions: tap, swipe, screenshot, screenContent, launchApp, and more
- AndroidNode: Working with the accessibility tree - properties and methods for finding and interacting with UI elements
- AndroidNodeFilter: Builder pattern for complex node queries - isButton(), hasText(), isClickable(), and more
- File Operations: Read, write, list, and manage files on the device

--------------------------------------------------------------------------------

## Guide / Guide
Path: /docs/automation/guide
Description: Learn how to build automations with tutorials and best practices
Sections:
- Getting Started
- Core Concepts
- Running & Tutorial
- Looking for API details?

This guide walks you through everything you need to know to create powerful Android automations, from setting up your first project to building production-ready scripts with proper error handling.

Topics:
- Getting Started: Create your first automation project and learn the IDE
- Configuration: Set up parameters, job variables, requirements, and sharing
- Writing Scripts: TypeScript basics, Agent API, and common patterns
- Screen States: Detect and respond to different UI states
- Stages: Organize automation into phases with step limits
- Task Submission: Submit results, collect data, and handle job tasks
- Error Handling: Handle crashes, dialogs, network issues, and recovery
- Running Automations: Execute automations on devices and collect results
- Full Tutorial (MySocial Auto-Responder): Complete example with stages, screen states, and data collection

Looking for API details? See the API Reference.

--------------------------------------------------------------------------------

## Guide / Getting Started
Path: /docs/automation/guide/getting-started
Description: Create your first automation project and learn the IDE
Sections:
- Creating a New Project
- The IDE
- Project Structure
- Your First Automation
- Keyboard Shortcuts
- TypeScript Compilation
- ES6 Imports
- Next Steps

### Creating a New Project

To create a new automation project:
1. Make sure you are logged in
2. Navigate to AUTOMATIONS in your dashboard
3. Click the New Automation button
4. Enter a project name and optional description
5. Click Create Project

### The IDE

The automation IDE provides a full development environment with:
- Code Editor: Full TypeScript support with syntax highlighting, autocomplete, and error checking. The editor automatically compiles TypeScript to JavaScript on save.
- File Explorer: Manage your project files and folders. Create, rename, and delete files. Organize your code across multiple TypeScript files with ES6 imports.
- Git Integration: Built-in version control with commit history, diff viewer, and revert functionality. Track changes and roll back to previous versions when needed.
- Options Panel: Configuration

### Project Structure

A typical automation project structure is shown below.

### Your First Automation

Here's a simple automation that launches an app and takes a screenshot (see the example code below).

### Keyboard Shortcuts

Shortcut | Action
Ctrl/Cmd + S | Save current file
Ctrl/Cmd + Space | Trigger autocomplete
Ctrl/Cmd + / | Toggle line comment
Ctrl/Cmd + F | Find in file

### TypeScript Compilation

TypeScript (.ts) files are automatically compiled to JavaScript (.js) on save.

Compilation Errors: If your TypeScript has errors, they'll be displayed when you save. Fix the errors and save again to update the compiled JavaScript.

### ES6 Imports

Organize your code across multiple files using ES6 imports.

Import Extension: Always use the .js extension when importing .ts files. This is because the runtime executes the compiled JavaScript.
### Next Steps

- Configuration → Set up parameters, job variables, and sharing
- Writing Scripts → Learn the Agent API and common patterns

Project Structure:

```
my-automation/
├── main.ts          # Entry point (required)
├── stages.ts        # Stage enum definitions
├── screenStates.ts  # Screen state enum
├── detection.ts     # Screen detection logic
├── handlers.ts      # Stage handlers
└── utils.ts         # Shared utilities
```

Your First Automation:

```typescript
// Simple automation example
const PACKAGE_NAME = "com.example.myapp";

async function main() {
  try {
    // Launch the app
    await agent.actions.launchApp(PACKAGE_NAME);

    // Wait for app to load
    await sleep(3000);

    // Get screen content
    const screen = await agent.actions.screenContent();

    // Find all text on screen
    const allNodes = getAllNodes(screen);
    const textNodes = allNodes.filter(node => node.text);
    console.log("Found text nodes:", textNodes.length);

    // Take a screenshot
    const screenshot = await agent.actions.screenshot(1080, 1920, 80);
    console.log("Screenshot taken!");

    // Submit success
    await agent.utils.job.submitTask("success", { textNodesFound: textNodes.length });
  } catch (error) {
    console.error("Automation failed:", error);
    await agent.utils.job.submitTask("failed", { error: String(error) });
  } finally {
    // Always stop the automation
    stopCurrentAutomation();
  }
}

// Helper function for delays
function sleep(ms: number) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

// Start the automation
main();
```

ES6 Imports example:

```typescript
// Export enum from stages.ts
export enum Stage {
  Initialize = "Initialize",
  LaunchApp = "LaunchApp",
  ProcessData = "ProcessData",
  Complete = "Complete",
}

// Import in main.ts
import { Stage } from "./stages.js";

let currentStage = Stage.Initialize;

// Use the imported enum
if (currentStage === Stage.Initialize) {
  // ...
}
```

--------------------------------------------------------------------------------

## Guide / Configuration
Path: /docs/automation/guide/configuration
Description: Set up parameters, job variables, requirements, and sharing
Sections:
- Metadata
- Requirements
- Automation Parameters
- Job Variables
- Sharing
- Auto-Generated Types
- Next Steps

Access configuration options through the Options panel on the right side of the IDE. These settings control how your automation behaves and who can access it.

### Metadata

Basic information about your automation:

Field | Description | Limits
Name | Display name for your automation | Max 225 characters
Description | Explain what the automation does | Max 10,100 characters
Icon | Visual identifier for the automation | 512x512 WebP, max 2MB

### Requirements

Set minimum version requirements to ensure compatibility:

Minimum Android Version: The lowest Android version your automation supports. Options range from Android 7.0 (API 24) to Android 16 (API 36). Tip: actions such as `dpad()` and `inputKey()` require newer Android versions (D-pad navigation is Android 13+ only).

Minimum App Version: The minimum version of the Xgodo app required on the device. Select from available versions sorted by version code.

### Automation Parameters

Define configuration options that users can set when running your automation. These are static values set once before execution.
Supported Field Types:

Type | Validations | Use Case
string | minLength, maxLength, pattern, enum | Text input, selections
number | min, max, integer | Counts, limits, delays
boolean | — | Feature toggles
array | minItems, maxItems, item schema | Lists of values
object | Nested properties | Complex configs
any | — | Flexible input

### Job Variables

Define runtime variables provided for each job task. Unlike automation parameters, job variables can change between tasks in the same job.

Parameters vs Job Variables:
- Automation Parameters: Set once, the same for all tasks in a job (e.g., retry count, feature flags)
- Job Variables: Different per task (e.g., account credentials, target data)

### Sharing

Control who can access your automation. Only project owners can manage sharing settings.

- Execution Access: Users who can run the automation.
- Editor Access: Users who can edit the automation's code. They have full access to the IDE and can modify files.
- Logs Access: Users who can view logs from automation runs. Useful for debugging and monitoring without edit access.

Managing Access, for each sharing type:
1. Open the Sharing section in the Options panel
2. Enter a username in the input field
3. Click Add or press Enter
4. To remove access, click the remove button next to a user

### Auto-Generated Types

When you define parameter schemas, TypeScript types are automatically generated. These provide full autocomplete and type checking in the IDE.

Type Safety: The IDE will show errors if you access non-existent properties or use wrong types. This helps catch bugs before running the automation.
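The validations listed in the field-type table behave like standard JSON-Schema-style constraints. As a rough illustration of the string rules only, here is a sketch of a checker; this is for intuition, not Xgodo's actual (server-side) validator, and the `StringConstraints` / `validateString` names are made up for this example:

```typescript
// Illustrative only: demonstrates the string-constraint semantics from the
// table above (minLength, maxLength, pattern, enum). Not Xgodo's validator.
interface StringConstraints {
  minLength?: number;
  maxLength?: number;
  pattern?: string;  // regular expression the value must match
  enum?: string[];   // allowed values
}

function validateString(value: string, c: StringConstraints): string[] {
  const errors: string[] = [];
  if (c.minLength !== undefined && value.length < c.minLength)
    errors.push(`shorter than minLength ${c.minLength}`);
  if (c.maxLength !== undefined && value.length > c.maxLength)
    errors.push(`longer than maxLength ${c.maxLength}`);
  if (c.pattern !== undefined && !new RegExp(c.pattern).test(value))
    errors.push(`does not match pattern ${c.pattern}`);
  if (c.enum !== undefined && !c.enum.includes(value))
    errors.push(`not one of ${c.enum.join(", ")}`);
  return errors;
}
```

For example, `validateString("alice", { minLength: 3, pattern: "^[a-z]+$" })` returns an empty error list, while a value outside an `enum` produces one error per violated constraint.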
### Next Steps

- Writing Scripts → Learn the Agent API and common patterns
- API Reference → Complete documentation of all available methods

Example: Automation Parameters Schema

```typescript
// Accessible via agent.arguments.automationParameters
interface AutomationParameters {
  targetUsername: string;       // Required string
  maxRetries: number;           // Number with validation
  enableNotifications: boolean; // Toggle option
  messageTemplate: string;      // Text with default value
}
```

Accessing Parameters in Code:

```typescript
// Access automation parameters
const { targetUsername, maxRetries, enableNotifications } =
  agent.arguments.automationParameters;

console.log("Target:", targetUsername);
console.log("Max retries:", maxRetries);

if (enableNotifications) {
  await agent.actions.showNotification("Started", "Automation is running");
}
```

Example: Job Variables Schema

```typescript
// Accessible via agent.arguments.jobVariables
interface JobVariables {
  email: string;     // Account to process
  password: string;  // Credentials
  proxyUrl?: string; // Optional proxy
}
```

Accessing Job Variables:

```typescript
// Get current task data
const task = await agent.utils.job.getCurrentTask();

if (task.success) {
  const { email, password } = agent.arguments.jobVariables;

  // Use the job variables
  await processAccount(email, password);

  // Submit results
  await agent.utils.job.submitTask("success", {
    email,
    processedAt: new Date().toISOString()
  });

  // Stop the automation
  stopCurrentAutomation();
}
```

Generated Type Definitions:

```typescript
// Auto-generated from your schema
interface AutomationParameters {
  /** Target user to process */
  targetUsername: string;
  /** Maximum retry attempts */
  maxRetries: number;
  /** Enable push notifications */
  enableNotifications: boolean;
}

interface JobVariables {
  /** Account email */
  email: string;
  /** Account password */
  password: string;
}

interface AgentArguments {
  automationParameters: AutomationParameters;
  jobVariables: JobVariables;
}

// Available on agent object
interface Agent {
  arguments: AgentArguments;
  // ... other properties
}
```

--------------------------------------------------------------------------------

## Guide / Writing Scripts
Path: /docs/automation/guide/writing-scripts
Description: TypeScript basics, Agent API, and common patterns
Sections:
- The Agent Object
- Common Actions
- Screen Content
- Screenshots
- Common Patterns
- Logging
- File Operations
- Network Monitoring
- Best Practices
- Next Steps

### The Agent Object

The globally available `agent` object groups all automation APIs.

### Common Actions

Touch Gestures, Text Input, Navigation, and App Management.

### Screen Content

Get the current screen content as an accessibility tree. Helper Functions: these helper functions are globally available.

### Common Patterns

Sleep/Wait Function, Wait for Screen State, and Retry with Backoff.

### Best Practices

- Use Random Delays: sleepRandom(1000, 2000)
- Use randomClick for Taps: node.randomClick()
- Prefer performAction over tap: node.performAction()
- Handle Unknown Screens: Always have fallback logic for screens you don't recognize. Log unknown states and retry after a short delay.
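The guide names "Wait for Screen State" and "Retry with Backoff" as common patterns (with default values of a 10000 ms timeout and 3 attempts / 1000 ms base delay). A minimal sketch of plausible implementations follows; the helper names `waitForScreenState` and `retryWithBackoff`, the poll interval, and the function bodies are assumptions for illustration, not the guide's originals:

```typescript
// Sketch only: names and bodies are reconstructions, not the original helpers.

// Poll a condition until it returns true or the timeout elapses.
async function waitForScreenState(
  check: () => Promise<boolean> | boolean,
  timeout: number = 10000,
  pollInterval: number = 500
): Promise<boolean> {
  const deadline = Date.now() + timeout;
  while (Date.now() < deadline) {
    if (await check()) return true;
    await new Promise(resolve => setTimeout(resolve, pollInterval));
  }
  return false; // timed out without seeing the state
}

// Retry an async operation with exponential backoff between attempts.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts: number = 3,
  baseDelay: number = 1000
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // baseDelay, 2*baseDelay, 4*baseDelay, ... between attempts
      const delay = baseDelay * 2 ** attempt;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

In an automation these would typically wrap agent calls, e.g. polling `agent.actions.screenContent()` inside the `check` callback, or `retryWithBackoff(() => agent.actions.launchApp(pkg))` for flaky launches.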
Next Steps Screen States → Detect and respond to different UI states Stages → Organize automation into phases with step limits agent = { actions: { ... }, // Device interactions (tap, swipe, type, etc.) utils: { ... }, // Utilities (random helpers, job tasks, files) info: { ... }, // Device information control: { ... }, // Automation control (pause, delay) display: { ... }, // Display settings email: { ... }, // Email utilities notifications: { ... }, // Notification callbacks constants: { ... }, // Action constants arguments: { ... }, // Parameters and job variables } // Simple tap at coordinates await agent.actions.tap(500, 1000); // Swipe from point A to B over 500ms await agent.actions.swipe(500, 1500, 500, 500, 500); // Long press for 2 seconds await agent.actions.hold(500, 1000, 2000); // Double tap with 100ms interval await agent.actions.doubleTap(500, 1000, 100); // Human-like random tap within node bounds const button = screen.findTextOne("Submit"); if (button) { button.randomClick(); } // Type text (requires focused input field) await agent.actions.writeText("Hello, World!"); // Copy text to clipboard await agent.actions.copyText("Text to copy"); // Paste from clipboard await agent.actions.paste(); // Hide keyboard await agent.actions.hideKeyboard(); // System navigation await agent.actions.goBack(); await agent.actions.goHome(); await agent.actions.recents(); // D-pad navigation (Android 13+ only) await agent.actions.dpad("down"); await agent.actions.dpad("center"); // Launch app by package name await agent.actions.launchApp("com.example.myapp"); // Launch fresh (tries closing the existing app first) await agent.actions.launchApp("com.example.myapp", true); // Open URL on in-app browser await agent.actions.browse("https://example.com"); // List all installed apps const apps = await agent.actions.listApps(); console.log(apps["com.android.chrome"]); // "Chrome" // Get current screen content const screen = await agent.actions.screenContent(); // Get all 
nodes recursively const allNodes = getAllNodes(screen); // Find nodes by text const buttons = allNodes.filter(node => node.text?.toLowerCase().includes("submit") ); // Find nodes by ID const loginBtn = allNodes.find(node => node.viewId === "com.example:id/login_button" ); // Find nodes by class const editTexts = allNodes.filter(node => node.className === "android.widget.EditText" ); // Get all descendant nodes recursively const allNodes = getAllNodes(screen); // Find nodes by viewId (resourceId) const nodes = findNodesById(screen, "com.example:id/button"); // Find nodes by exact text const textNodes = findNodesByText(screen, "Submit"); // Check if node has specific text const hasText = nodeHasText(node, "Continue"); // Find a button and tap its center const button = allNodes.find(n => n.text === "Submit" && n.clickable); if (button) { const { left, top, right, bottom } = button.boundsInScreen; const centerX = (left + right) / 2; const centerY = (top + bottom) / 2; await agent.actions.tap(centerX, centerY); } // Or use accessibility action for more reliable clicks await button.performAction(agent.constants.ACTION_CLICK); // Scroll a node await scrollableNode.performAction(agent.constants.ACTION_SCROLL_FORWARD); // Focus an input field await inputField.performAction(agent.constants.ACTION_FOCUS); // Take a screenshot (maxWidth, maxHeight, quality) const screenshot = await agent.actions.screenshot(1080, 1920, 80); // Result contains base64 image data const { base64, width, height } = screenshot; // Use for debugging or storing console.log(\`Screenshot: \${width}x\${height}\`); // Simple sleep function function sleep(ms: number): Promise<void> { return new Promise(resolve => setTimeout(resolve, ms)); } // Sleep with random range (more human-like) function sleepRandom(min: number, max: number): Promise<void> { const ms = Math.floor(Math.random() * (max - min) + min); return new Promise(resolve => setTimeout(resolve, ms)); } // Usage await sleep(2000); // Wait 2 seconds await
sleepRandom(1000, 3000); // Wait 1-3 seconds randomly // Wait for a specific element to appear async function waitForElement( condition: (nodes: AndroidNode[]) => boolean, timeout: number = 10000 ): Promise<boolean> { const startTime = Date.now(); while (Date.now() - startTime < timeout) { const screen = await agent.actions.screenContent(); const allNodes = getAllNodes(screen); if (condition(allNodes)) { return true; } await sleep(500); } return false; } // Usage: Wait for "Home" text to appear const found = await waitForElement(nodes => nodes.some(n => n.text === "Home") ); async function withRetry<T>( fn: () => Promise<T>, maxAttempts: number = 3, baseDelay: number = 1000 ): Promise<T> { let lastError: Error | undefined; for (let attempt = 1; attempt <= maxAttempts; attempt++) { try { return await fn(); } catch (error) { lastError = error as Error; console.log(\`Attempt \${attempt} failed: \${lastError.message}\`); if (attempt < maxAttempts) { const delay = baseDelay * Math.pow(2, attempt - 1); await sleep(delay); } } } throw lastError; } // Standard console methods are available console.log("Info message"); console.warn("Warning message"); console.error("Error message"); // Log objects console.log("Screen content:", { nodeCount: allNodes.length }); // Debug with context console.log(\`[Stage: \${currentStage}] Processing screen...\`); // Check if file exists const exists = agent.utils.files.exists("/sdcard/Download/data.json"); // Read file content const content = agent.utils.files.readFullFile("/sdcard/Download/data.json"); const data = JSON.parse(content); // List directory const files = agent.utils.files.list("/sdcard/Download"); for (const file of files) { console.log(file.name, file.isDirectory); } // Save file to device await agent.actions.saveFile("result.json", JSON.stringify(data)); // Track network connectivity let isOnline = true; agent.utils.setNetworkCallback((networkAvailable) => { isOnline = networkAvailable; if (!networkAvailable) { console.warn("Network connection
lost!"); } }); // Check before network operations if (!isOnline) { console.log("Waiting for network..."); await sleep(5000); } // Refresh mobile IP (airplane mode toggle) await agent.actions.airplane(); Writing Scripts TypeScript basics, Agent API, and common patterns Tapping and Swiping Typing and Clipboard Navigation Actions Launching Apps Reading Screen Content Node Helper Functions Node Interactions Taking Screenshots Delay Utilities Waiting for Conditions Retry Pattern Console Logging Reading and Writing Files Network Callback -------------------------------------------------------------------------------- ## Guide / Screen States Path: /docs/automation/guide/screen-states Description: Detect and respond to different UI states Sections: - The ScreenState Pattern - Detection Strategies - Complete Detection Function - Handling Unknown States - Stage-Specific Detection - Best Practices - Next Steps - 1. Text-Based Detection - 2. ID-Based Detection - 3. Composite Detection - 4. Package Name Filtering - Check System States First - Use Multiple Conditions - Order Matters - Filter by Package Name - Store Unknown Screens - Stages → - Error Handling → Content: Screen state detection is the foundation of robust automation. By identifying what's currently on screen, your automation can respond appropriately to each situation. The ScreenState Pattern Define all possible screen states as an enum: Detection Strategies 1. Text-Based Detection The simplest approach - look for specific text on screen: 2. ID-Based Detection More reliable - use resource IDs which don't change with language: 3. Composite Detection Combine multiple conditions for accuracy: 4. Package Name Filtering Filter by app package to avoid false positives from system UI: Complete Detection Function Handling Unknown States Unknown screens will happen. 
Handle them gracefully: Stage-Specific Detection Different stages may need different detection logic: Best Practices Check System States First Always check for crashes, dialogs, and system UI before app-specific states. These can appear at any time and need immediate handling. Use Multiple Conditions Don't rely on a single text match. Combine multiple checks (text + ID + class + package) for reliable detection. Order Matters Check more specific states before general ones. For example, check "LoginEnterPassword" before "LoginScreen". Filter by Package Name Use package name filtering to avoid false positives from system UI, notifications, or other apps. Store Unknown Screens Use agent.utils.outOfSteps.storeScreen() for unknown states. This helps debug what screens you missed. Next Steps Stages → Organize automation into phases with step limits Error Handling → Handle crashes, dialogs, and recovery export enum ScreenState { // System states Unknown = "Unknown", Crash = "Crash", PhoneDialog = "PhoneDialog", NoInternet = "NoInternet", NotificationShade = "NotificationShade", // App states SplashScreen = "SplashScreen", LoginScreen = "LoginScreen", LoginEnterPassword = "LoginEnterPassword", HomeScreen = "HomeScreen", ProfileScreen = "ProfileScreen", SettingsScreen = "SettingsScreen", // Dialog states PermissionDialog = "PermissionDialog", ConfirmDialog = "ConfirmDialog", ErrorDialog = "ErrorDialog", // Loading states Loading = "Loading", // Success/failure states Success = "Success", RateLimited = "RateLimited", } function detectScreenState(screen: AndroidNode): ScreenState { const allNodes = getAllNodes(screen); // Check for specific text if (allNodes.find(n => n.text === "Sign in")) { return ScreenState.LoginScreen; } if (allNodes.find(n => n.text === "Enter your password")) { return ScreenState.LoginEnterPassword; } if (allNodes.find(n => n.text?.toLowerCase() === "home")) { return ScreenState.HomeScreen; } return ScreenState.Unknown; } function 
detectScreenState(screen: AndroidNode): ScreenState { const allNodes = getAllNodes(screen); // Check for specific view IDs if (findNodesById(screen, "com.example:id/login_form").length) { return ScreenState.LoginScreen; } if (findNodesById(screen, "com.example:id/home_feed").length) { return ScreenState.HomeScreen; } if (findNodesById(screen, "android:id/aerr_close").find(n => n.clickable)) { return ScreenState.Crash; } return ScreenState.Unknown; } function detectScreenState(screen: AndroidNode): ScreenState { const allNodes = getAllNodes(screen); // Login screen: has email field AND sign-in button const hasEmailField = allNodes.find(n => n.className === "android.widget.EditText" && (n.hintText?.toLowerCase()?.includes("email") || n.viewId?.includes("email")) ); const hasSignInButton = allNodes.find(n => n.clickable && n.text?.toLowerCase() === "sign in" ); if (hasEmailField && hasSignInButton) { return ScreenState.LoginScreen; } // Password screen: has password field (isPassword = true) const hasPasswordField = allNodes.find(n => n.className === "android.widget.EditText" && n.isPassword ); if (hasPasswordField && !hasEmailField) { return ScreenState.LoginEnterPassword; } return ScreenState.Unknown; } const APP_PACKAGE = "com.example.myapp"; function detectScreenState(screen: AndroidNode): ScreenState { const allNodes = getAllNodes(screen); // Only consider nodes from our target app const appNodes = allNodes.filter(n => n.packageName === APP_PACKAGE); // Check for phone dialog (different package) if (allNodes.every(n => n.packageName === "com.android.phone")) { return ScreenState.PhoneDialog; } // Check for system crash dialog if (allNodes.find(n => n.viewId === "android:id/aerr_close" && n.packageName === "android" )) { return ScreenState.Crash; } // Now check app-specific screens if (appNodes.find(n => n.viewId?.includes("login"))) { return ScreenState.LoginScreen; } return ScreenState.Unknown; } import { ScreenState } from "./screenStates.js"; const APP_PACKAGE 
= "com.mysocial.app"; export function detectScreenState(screen: AndroidNode): ScreenState { const allNodes = getAllNodes(screen); // === SYSTEM STATES (check first) === // Crash dialog if (findNodesById(screen, "android:id/aerr_close").find(n => n.clickable)) { return ScreenState.Crash; } // Phone dialog if (allNodes.every(n => n.packageName === "com.android.phone")) { return ScreenState.PhoneDialog; } // No internet if (allNodes.find(n => n.text?.toLowerCase()?.includes("no internet") || n.text?.toLowerCase()?.includes("you're offline") )) { return ScreenState.NoInternet; } // Loading (only loading indicator visible) if (allNodes.length <= 3 && allNodes.find(n => n.className?.includes("ProgressBar") )) { return ScreenState.Loading; } // === APP-SPECIFIC STATES === // Login screen if (allNodes.find(n => n.className === "android.widget.EditText" && n.hintText?.toLowerCase()?.includes("email") && n.packageName === APP_PACKAGE )) { return ScreenState.LoginScreen; } // Password entry if (allNodes.find(n => n.isPassword && n.packageName === APP_PACKAGE )) { return ScreenState.LoginEnterPassword; } // Home screen (has bottom navigation with Home selected) const homeTab = allNodes.find(n => n.description === "Home" && n.isSelected && n.packageName === APP_PACKAGE ); if (homeTab) { return ScreenState.HomeScreen; } // Profile screen if (allNodes.find(n => n.description === "Profile" && n.isSelected && n.packageName === APP_PACKAGE )) { return ScreenState.ProfileScreen; } // Permission dialog if (allNodes.find(n => n.packageName === "com.android.permissioncontroller" || (n.text?.toLowerCase()?.includes("allow") && n.clickable) )) { return ScreenState.PermissionDialog; } return ScreenState.Unknown; } let unknownScreenCount = 0; const MAX_UNKNOWN_SCREENS = 3; async function getCurrentScreenState(): Promise<{ state: ScreenState; screen: AndroidNode; }> { let screen = await agent.actions.screenContent(); let state = detectScreenState(screen); // Retry a few times if unknown if 
(state === ScreenState.Unknown) { for (let i = 0; i < 3; i++) { await sleep(2000); screen = await agent.actions.screenContent(); state = detectScreenState(screen); if (state !== ScreenState.Unknown) { unknownScreenCount = 0; break; } } } // Track consecutive unknown screens if (state === ScreenState.Unknown) { unknownScreenCount++; console.warn(\`Unknown screen #\${unknownScreenCount}\`); // Store for debugging await agent.utils.outOfSteps.storeScreen( screen, "unknown", "Unknown", MAX_UNKNOWN_SCREENS - unknownScreenCount, ScreenshotRecord.HIGH_QUALITY ); if (unknownScreenCount >= MAX_UNKNOWN_SCREENS) { throw new Error("Too many unknown screens"); } } else { unknownScreenCount = 0; } return { state, screen }; } import { Stage } from "./stages.js"; import { ScreenState } from "./screenStates.js"; export function detectScreenState( screen: AndroidNode, currentStage: Stage ): ScreenState { // Always check system states first const systemState = detectSystemStates(screen); if (systemState !== ScreenState.Unknown) { return systemState; } // Stage-specific detection switch (currentStage) { case Stage.Login: return detectLoginStates(screen); case Stage.NavigateToMessages: case Stage.ProcessMessages: return detectMessageStates(screen); case Stage.NavigateToFeed: case Stage.LikePosts: return detectFeedStates(screen); default: return detectCommonStates(screen); } } function detectSystemStates(screen: AndroidNode): ScreenState { // ... crash, phone dialog, no internet } function detectLoginStates(screen: AndroidNode): ScreenState { // ... login-specific screens } function detectMessageStates(screen: AndroidNode): ScreenState { // ... 
message-specific screens } Screen States Detect and respond to different UI states screenStates.ts Text-based detection ID-based detection Composite detection Package name filtering detection.ts Unknown state handling Stage-aware detection -------------------------------------------------------------------------------- ## Guide / Stages Path: /docs/automation/guide/stages Description: Organize automation into phases with step limits Sections: - The Stage Pattern - Stage State Management - Max Steps Per Stage - Stage Transitions - Out of Steps Handling - Dynamic Step Limits - Same Screen Detection - Complete Example - Best Practices - Next Steps - Reset Step Counter on Stage Change - Report Progress on Stage Change - Store Screens Continuously - Detect Stuck States - Task Submission → - Error Handling → Content: Stages help organize complex automations into logical phases. Each stage has its own step limit, making it easier to debug issues and track progress. The Stage Pattern Define your automation stages as an enum: Stage State Management Track the current stage and step count: Max Steps Per Stage Step limits prevent infinite loops and help identify stuck automations: Stage Transitions Transition between stages based on progress: Out of Steps Handling When the automation runs out of steps, submit data for analysis: Dynamic Step Limits Some stages may need more steps than others: Same Screen Detection Detect when stuck on the same screen: Complete Example Best Practices Reset Step Counter on Stage Change maxSteps when entering a new stage. This gives each phase its own budget of steps. Report Progress on Stage Change submitTask("running", ...) Store Screens Continuously storeScreen() on every iteration. This creates a trail for debugging when things go wrong. Detect Stuck States Track same-screen counts and implement recovery logic. Don't let the automation spin on the same screen forever. 
Next Steps Task Submission → Submit results, collect data, and handle job tasks Error Handling → Handle crashes, dialogs, and recovery export enum Stage { Initialize = "Initialize", LaunchApp = "LaunchApp", HandleLogin = "HandleLogin", NavigateToMessages = "NavigateToMessages", SelectUnreadChat = "SelectUnreadChat", ProcessMessages = "ProcessMessages", NavigateToFeed = "NavigateToFeed", LikePosts = "LikePosts", Complete = "Complete", } import { Stage } from "./stages.js"; const MAX_STEPS_PER_STAGE = 48; let currentStage: Stage = Stage.Initialize; let maxSteps = MAX_STEPS_PER_STAGE; async function setCurrentStage(newStage: Stage) { currentStage = newStage; maxSteps = MAX_STEPS_PER_STAGE; // Reset step counter console.log(\`[Stage] Entering: \${newStage}\`); // Report progress to server (files ignored when finish=false) await agent.utils.job.submitTask( "running", { stage: newStage, timestamp: Date.now() }, false // Don't finish the task ); } async function runAutomation() { await setCurrentStage(Stage.Initialize); // Main automation loop while (maxSteps-- > 0) { const { state, screen } = await getCurrentScreenState(); // Store screen for debugging await agent.utils.outOfSteps.storeScreen( screen, currentStage, state, maxSteps, state === ScreenState.Unknown ? 
ScreenshotRecord.HIGH_QUALITY : ScreenshotRecord.LOW_QUALITY ); // Handle the current screen state await handleScreenState(state, screen); // Check if we've completed if (currentStage === Stage.Complete) { break; } } // Out of steps - submit for analysis if (maxSteps < 0) { await agent.utils.outOfSteps.submit("outOfSteps"); throw new Error("OUT_OF_STEPS"); } } async function handleScreenState( state: ScreenState, screen: AndroidNode ) { // Handle system states first (any stage) if (await handleSystemStates(state, screen)) { return; } // Stage-specific handling switch (currentStage) { case Stage.Initialize: await handleInitialize(state, screen); break; case Stage.LaunchApp: await handleLaunchApp(state, screen); break; case Stage.HandleLogin: await handleLogin(state, screen); break; case Stage.NavigateToMessages: await handleNavigateToMessages(state, screen); break; // ... other stages } } async function handleInitialize(state: ScreenState, screen: AndroidNode) { // Check if app is installed const apps = await agent.actions.listApps(); if (!apps[APP_PACKAGE]) { throw new Error("App not installed"); } // Set up callbacks agent.utils.setNetworkCallback(handleNetworkChange); // Move to next stage await setCurrentStage(Stage.LaunchApp); } async function handleLaunchApp(state: ScreenState, screen: AndroidNode) { if (state === ScreenState.HomeScreen) { // Already on home screen, skip to messages await setCurrentStage(Stage.NavigateToMessages); return; } if (state === ScreenState.SplashScreen || state === ScreenState.Loading) { // Wait for app to load await sleep(2000); return; } if (state === ScreenState.LoginScreen) { // Need to login first await setCurrentStage(Stage.HandleLogin); return; } // Launch the app await agent.actions.launchApp(APP_PACKAGE); await sleep(3000); } // Store screens during automation await agent.utils.outOfSteps.storeScreen( screen, currentStage, // Which stage screenState, // What screen state maxSteps, // Remaining steps 
ScreenshotRecord.LOW_QUALITY // Screenshot quality ); // Submit when out of steps if (maxSteps < 0) { const result = await agent.utils.outOfSteps.submit("outOfSteps"); if (result.success) { console.log("Out of steps report ID:", result.id); } // Fail the task await agent.utils.job.submitTask("failed", { reason: "OUT_OF_STEPS", stage: currentStage, outOfStepsId: result.success ? result.id : null }); // Always stop the automation stopCurrentAutomation(); } function getMaxStepsForStage(stage: Stage): number { switch (stage) { // Login might need more steps (captchas, 2FA, etc.) case Stage.HandleLogin: return 100; // Message processing depends on conversation length case Stage.ProcessMessages: return 60; // Most stages need fewer steps default: return 48; } } async function setCurrentStage(newStage: Stage) { currentStage = newStage; maxSteps = getMaxStepsForStage(newStage); console.log(\`[Stage] \${newStage} (max steps: \${maxSteps})\`); await agent.utils.job.submitTask( "running", { stage: newStage }, false ); } let lastScreenState: ScreenState | null = null; let sameScreenCount = 0; const MAX_SAME_SCREEN = 5; async function checkSameScreen(state: ScreenState) { if (state === lastScreenState) { sameScreenCount++; if (sameScreenCount >= MAX_SAME_SCREEN) { console.warn(\`Stuck on \${state} for \${sameScreenCount} iterations\`); // Try recovery actions if (state === ScreenState.Loading) { // Wait longer for loading await sleep(5000); } else { // Try going back await agent.actions.goBack(); await sleep(1000); } // Reset counter after recovery attempt sameScreenCount = 0; } } else { lastScreenState = state; sameScreenCount = 0; } } import { Stage } from "./stages.js"; import { ScreenState } from "./screenStates.js"; import { detectScreenState } from "./detection.js"; const MAX_STEPS_PER_STAGE = 48; let currentStage = Stage.Initialize; let maxSteps = MAX_STEPS_PER_STAGE; let lastScreenState: ScreenState | null = null; let sameScreenCount = 0; async function 
setCurrentStage(newStage: Stage) { currentStage = newStage; maxSteps = MAX_STEPS_PER_STAGE; lastScreenState = null; sameScreenCount = 0; console.log(\`=== Stage: \${newStage} ===\`); await agent.utils.job.submitTask( "running", await collectData(), false ); } async function collectData() { return { stage: currentStage, timestamp: new Date().toISOString(), // Add other data you want to track }; } async function main() { try { await setCurrentStage(Stage.Initialize); while (maxSteps-- > 0) { // Get current screen const screen = await agent.actions.screenContent(); const state = detectScreenState(screen); // Store for out-of-steps analysis await agent.utils.outOfSteps.storeScreen( screen, currentStage, state, maxSteps, state === ScreenState.Unknown ? ScreenshotRecord.HIGH_QUALITY : ScreenshotRecord.LOW_QUALITY ); // Check for stuck state await checkSameScreen(state); // Handle the state await handleScreenState(state, screen); // Check completion if (currentStage === Stage.Complete) { await agent.utils.job.submitTask("success", await collectData()); return; } // Small delay between iterations await sleepRandom(500, 1000); } // Out of steps await agent.utils.outOfSteps.submit("outOfSteps"); throw new Error("OUT_OF_STEPS"); } catch (error) { console.error("Automation failed:", error); await agent.utils.job.submitTask("failed", { ...await collectData(), error: String(error) }); } finally { // Always stop the automation stopCurrentAutomation(); } } main(); Stages Organize automation into phases with step limits stages.ts Stage tracking Main loop with step counting Stage handler with transitions Out of steps submission Dynamic step limits Same screen detection Full stage management -------------------------------------------------------------------------------- ## Guide / Task Submission Path: /docs/automation/guide/tasks Description: Submit results, collect data, and handle job tasks Sections: - Job Task Overview - Submitting Task Results - File Attachments - Getting Current
Task - Data Collection Pattern - Error Handling - Best Practices - Next Steps - Status Values - Reporting Progress - Completing Tasks - Report Progress Regularly - Include Useful Data - Attach Screenshots on Failure - Use "declined" for Invalid Data - Error Handling → - Running Automations → Content: Job tasks allow your automation to process multiple items (accounts, orders, etc.) in sequence. You can submit results, collect data, and request new tasks. Job Task Overview The `agent.utils.job` API exposes `submitTask()`, `useAnotherTask()`, and `getCurrentTask()`. Submitting Task Results Use `submitTask()`. Status Values

| Status | Description | Finish? |
| --- | --- | --- |
| `"running"` | Task is in progress, reporting intermediate data | false |
| `"success"` | Task completed successfully | true |
| `"failed"` | Task failed (error, blocked, etc.) | true |
| `"declined"` | Task was declined (invalid data, etc.) | true |

Reporting Progress Completing Tasks File Attachments Attach files (screenshots, logs, etc.) to task submissions: Getting Current Task Get information about the current job task: Data Collection Pattern A common pattern for collecting and submitting data throughout the automation: Error Handling Best Practices Report Progress Regularly Submit `"running"` updates as the automation progresses. Include Useful Data Include timestamps, stage info, and any data that helps understand what happened. This is invaluable for debugging failed tasks. Attach Screenshots on Failure When a task fails, try to capture a screenshot of the current screen. This makes it much easier to understand what went wrong.
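`useAnotherTask()` appears in the overview above but is not demonstrated in this guide. Assuming it resolves with a success flag like `getCurrentTask()` does (an assumption; verify against the API reference), a multi-item loop might look like the following sketch. `JobApi`, `processAllTasks`, and `handle` are illustrative names, not SDK identifiers.

```typescript
// Hedged sketch: process tasks one by one until no more are available.
// The JobApi shape below is assumed from the overview, not confirmed.
interface JobApi {
  getCurrentTask(): Promise<{ success: boolean }>;
  useAnotherTask(): Promise<{ success: boolean }>;
  submitTask(status: string, data: Record<string, unknown>): Promise<void>;
}

async function processAllTasks(
  job: JobApi,
  handle: (task: { success: boolean }) => Promise<Record<string, unknown>>,
  maxTasks: number = 10,
): Promise<number> {
  let processed = 0;
  let task = await job.getCurrentTask();
  while (task.success && processed < maxTasks) {
    const result = await handle(task);
    await job.submitTask("success", result); // finish this task
    processed++;
    task = await job.useAnotherTask();       // request the next item
  }
  return processed; // number of tasks completed
}
```

Each iteration finishes its task with `submitTask` before requesting the next one, since a finished task cannot be submitted again.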
Use "declined" for Invalid Data "declined" "failed" Next Steps Error Handling → Handle crashes, dialogs, and recovery Running Automations → Execute automations on devices and collect results agent.utils.job = { submitTask(status, data, finish, files): Promise; useAnotherTask(): Promise; getCurrentTask(): Promise; } await agent.utils.job.submitTask( status, // "running" | "success" | "failed" | "declined" data, // Record - data to store finish?, // boolean - true to complete the task (default: true) files? // File[] - files to upload (default: []) ); // Important notes: // - Once finish=true is passed, you cannot submit again for the same task // - The files parameter is ignored when finish=false // Report progress without finishing // Note: files parameter is ignored when finish=false await agent.utils.job.submitTask( "running", { stage: "login", progress: 25, timestamp: Date.now() }, false // Don't finish - must specify when not finishing! ); // Later, report more progress await agent.utils.job.submitTask( "running", { stage: "processing", progress: 75, itemsProcessed: 10 }, false // Must be false for progress updates ); // Complete with success (files and finish default to [] and true) await agent.utils.job.submitTask("success", { email: "user@example.com", orderId: "12345", completedAt: new Date().toISOString(), itemsProcessed: 15 }); // Don't forget to stop the automation! 
stopCurrentAutomation(); // Complete with failure await agent.utils.job.submitTask("failed", { email: "user@example.com", error: "Account locked", stage: "login", failedAt: new Date().toISOString() }); stopCurrentAutomation(); // Collect files during automation const files: { name: string; extension: string; base64Data: string }[] = []; // Take a screenshot const screenshot = await agent.actions.screenshot(1080, 1920, 80); files.push({ name: "final_screen", extension: "jpg", base64Data: screenshot.base64 }); // Submit with files (finish defaults to true) await agent.utils.job.submitTask( "success", { email: "user@example.com", orderPlaced: true }, true, // Final submission files // Pass files array ); stopCurrentAutomation(); const task = await agent.utils.job.getCurrentTask(); if (task.success) { // Task data is available const { job_proof, ...otherData } = task; console.log("Job proof:", job_proof); // Job variables are also available const { email, password } = agent.arguments.jobVariables; // Process the task... 
} else { console.log("No task available or error:", task.error); } // Global data object const collectedData: Record<string, any> = { startedAt: new Date().toISOString(), stages: [], errors: [], metrics: { screensProcessed: 0, actionsPerformed: 0 } }; function recordStage(stage: string, data?: object) { collectedData.stages.push({ stage, timestamp: Date.now(), ...data }); } function recordError(error: string, context?: object) { collectedData.errors.push({ error, timestamp: Date.now(), ...context }); } function recordMetric(metric: string, value: number) { collectedData.metrics[metric] = (collectedData.metrics[metric] || 0) + value; } // Usage during automation recordStage("login_started"); recordMetric("screensProcessed", 1); // Submit progress (files ignored when finish=false) await agent.utils.job.submitTask( "running", { ...collectedData, currentStage: "login" }, false // Don't finish for progress updates ); // Submit final collectedData.completedAt = new Date().toISOString(); await agent.utils.job.submitTask("success", collectedData); // Always stop the automation when done stopCurrentAutomation(); async function runTask() { const files: File[] = []; try { // Get task info const task = await agent.utils.job.getCurrentTask(); if (!task.success) { throw new Error("No task available"); } // Process...
const result = await processAutomation(); // Success await agent.utils.job.submitTask("success", result, true, files); } catch (error) { console.error("Task failed:", error); // Take failure screenshot try { const screenshot = await agent.actions.screenshot(1080, 1920, 80); files.push({ name: "error_screen", extension: "jpg", base64Data: screenshot.base64 }); } catch (e) { // Ignore screenshot errors } // Determine failure type const errorMessage = String(error); let status: "failed" | "declined" = "failed"; if (errorMessage.includes("invalid") || errorMessage.includes("not found")) { status = "declined"; // Invalid input data } // Submit failure await agent.utils.job.submitTask(status, { error: errorMessage, stage: currentStage, timestamp: Date.now() }, true, files); } finally { // Always stop the automation stopCurrentAutomation(); } } Task Submission Submit results, collect data, and handle job tasks submitTask Parameters Progress updates Success and failure Submitting with files getCurrentTask Data collection Robust task handling -------------------------------------------------------------------------------- ## Guide / Error Handling Path: /docs/automation/guide/error-handling Description: Handle crashes, dialogs, network issues, and recovery Sections: - System Interruptions - Network Connectivity - Complete Error Handler - Retry Strategies - Recovery Actions - Global Error Handler - Best Practices - Next Steps - Crash Dialogs - Phone Dialogs - Notification Shade - Simple Retry - Conditional Retry - Always Check System States First - Set Up Network Callback Early - Capture Screenshots on Failure - Use Meaningful Error Reasons - Clean Up Resources - Running Automations → - Full Tutorial → Content: Robust error handling is critical for production automations. This guide covers common error scenarios and how to handle them gracefully. System Interruptions System dialogs can appear at any time. Check for and handle them before processing app screens. 
Crash Dialogs App crashes show a system dialog with options to close or report. Phone Dialogs Incoming calls or phone-related dialogs can interrupt automation. Notification Shade Dismiss expanded notifications that might be blocking the UI. Network Connectivity Complete Error Handler Put it all together in a system error handler. Retry Strategies Simple Retry Conditional Retry Recovery Actions Global Error Handler Best Practices Always Check System States First Before processing any app screen, check for crashes, dialogs, and notification shade. These can appear at any time and block automation. Set Up Network Callback Early Register the network callback at the start of your automation. Track total downtime for reporting. Capture Screenshots on Failure Always try to capture a screenshot when an error occurs. This is invaluable for debugging what went wrong. Use Meaningful Error Reasons Categorize errors into meaningful reasons (OUT_OF_STEPS, NETWORK_ERROR, etc.). This makes it easier to analyze failure patterns. Clean Up Resources Use try/finally to clean up callbacks and resources. This prevents memory leaks and unexpected behavior.
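The Conditional Retry code survives in this dump only as a signature fragment. The following is a hedged reconstruction consistent with that fragment and with the `withRetry` pattern from the Writing Scripts guide; the `withConditionalRetry` and `shouldRetry` names are assumed.

```typescript
// Hedged reconstruction: retry only when shouldRetry says the error is
// transient; rethrow immediately on permanent failures.
async function withConditionalRetry<T>(
  fn: () => Promise<T>,
  shouldRetry: (error: Error) => boolean,
  maxAttempts: number = 3,
): Promise<T> {
  let lastError: Error | undefined;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error as Error;
      if (!shouldRetry(lastError)) throw lastError; // permanent: give up now
      if (attempt < maxAttempts) {
        // short fixed delay between transient retries
        await new Promise(resolve => setTimeout(resolve, 250));
      }
    }
  }
  throw lastError;
}
```

This complements simple retry: network timeouts get retried, while errors like "account locked" fail fast and reach the task-failure path sooner.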
Next Steps
Running Automations → Execute automations on devices and collect results
Full Tutorial → Complete example putting everything together

// Detect crash dialog
function isCrashDialog(screen: AndroidNode): boolean {
  return !!findNodesById(screen, "android:id/aerr_close")
    .find(n => n.clickable);
}

// Handle crash dialog
async function handleCrashDialog(screen: AndroidNode): Promise<boolean> {
  const closeButton = findNodesById(screen, "android:id/aerr_close")
    .find(n => n.clickable);
  if (closeButton) {
    console.warn("Crash dialog detected, closing...");
    await closeButton.performAction(agent.constants.ACTION_CLICK);
    await sleep(2000);
    return true;
  }
  return false;
}

const PHONE_PACKAGE = "com.android.phone";
const PHONE_DIALOG_IDS = [
  "com.android.phone:id/floating_end_call_action_button",
  "com.android.phone:id/declineButton",
  "com.android.phone:id/dismiss_button",
];

function isPhoneDialog(screen: AndroidNode): boolean {
  const allNodes = getAllNodes(screen);
  // All nodes are from phone package
  return allNodes.every(n => n.packageName === PHONE_PACKAGE);
}

async function handlePhoneDialog(screen: AndroidNode): Promise<boolean> {
  const allNodes = getAllNodes(screen);
  // Look for dismiss/decline button
  const dismissButton = allNodes.find(n =>
    PHONE_DIALOG_IDS.includes(n.viewId || "")
  );
  if (dismissButton) {
    console.warn("Phone dialog detected, dismissing...");
    if (dismissButton.actions.includes(agent.constants.ACTION_CLICK)) {
      await dismissButton.performAction(agent.constants.ACTION_CLICK);
    } else {
      // Fallback to random tap within bounds
      dismissButton.randomClick();
    }
    await sleep(2000);
    return true;
  }
  return false;
}

async function dismissNotificationShade(): Promise<boolean> {
  // Get all screens (including notification shade)
  const screens = await agent.actions.allScreensContent();
  const allNodes = screens.flatMap(s => getAllNodes(s));
  // Look for notification rows
  const notificationRow = allNodes.find(n =>
    n.viewId === "com.android.systemui:id/expandableNotificationRow" &&
    n.actions.includes(agent.constants.ACTION_DISMISS)
  );
  if (notificationRow) {
    console.log("Dismissing notification...");
    await notificationRow.performAction(agent.constants.ACTION_DISMISS);
    await sleep(500);
    return true;
  }
  // Check if shade is open (swipe down to close)
  const shadePanel = allNodes.find(n =>
    n.viewId === "com.android.systemui:id/notification_panel"
  );
  if (shadePanel) {
    console.log("Closing notification shade...");
    await agent.actions.swipe(540, 1500, 540, 500, 300);
    await sleep(500);
    return true;
  }
  return false;
}

let isNetworkAvailable = true;
let networkDownTime = 0;
let networkDownTimestamp: number | undefined;

// Set up network callback
agent.utils.setNetworkCallback((available) => {
  isNetworkAvailable = available;
  if (!available && networkDownTimestamp === undefined) {
    networkDownTimestamp = Date.now();
    console.warn("Network disconnected!");
  } else if (available && networkDownTimestamp) {
    networkDownTime += Date.now() - networkDownTimestamp;
    networkDownTimestamp = undefined;
    console.log("Network restored");
  }
});

// Handle no internet screen
async function handleNoInternet(screen: AndroidNode): Promise<boolean> {
  const allNodes = getAllNodes(screen);
  const noInternetIndicator = allNodes.find(n =>
    n.text?.toLowerCase()?.includes("no internet") ||
    n.text?.toLowerCase()?.includes("you're offline") ||
    n.text?.toLowerCase()?.includes("check your connection")
  );
  if (noInternetIndicator || !isNetworkAvailable) {
    console.warn("No internet, waiting...");
    // Wait for network to come back
    for (let i = 0; i < 30; i++) {
      if (isNetworkAvailable) {
        // Try to refresh
        const retryButton = allNodes.find(n =>
          n.clickable &&
          (n.text?.toLowerCase() === "retry" ||
           n.text?.toLowerCase() === "try again")
        );
        if (retryButton) {
          await retryButton.performAction(agent.constants.ACTION_CLICK);
        }
        await sleep(2000);
        return true;
      }
      await sleep(2000);
    }
    // Network didn't come back
    throw new Error("NETWORK_TIMEOUT");
  }
  return false;
}

// Call this at the start of each main loop iteration
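The downtime bookkeeping inside the network callback can be isolated into a small, testable helper. This is a sketch; `NetworkTracker` is a name invented here, not part of the Xgodo API, and it takes timestamps as arguments so the logic can be exercised without real clock time.

```typescript
// Accumulates total network downtime across disconnect/reconnect cycles,
// mirroring the networkDownTime/networkDownTimestamp bookkeeping above.
class NetworkTracker {
  private downSince: number | undefined;
  private totalDownMs = 0;

  // Feed this from agent.utils.setNetworkCallback((available) => ...)
  onChange(available: boolean, now: number): void {
    if (!available && this.downSince === undefined) {
      this.downSince = now; // first notification of an outage
    } else if (available && this.downSince !== undefined) {
      this.totalDownMs += now - this.downSince; // close the outage
      this.downSince = undefined;
    }
  }

  // Total downtime so far, including a still-open outage.
  downtime(now: number): number {
    const open = this.downSince !== undefined ? now - this.downSince : 0;
    return this.totalDownMs + open;
  }
}

const tracker = new NetworkTracker();
tracker.onChange(false, 1000); // disconnected at t=1000
tracker.onChange(true, 4000);  // restored at t=4000 -> 3000 ms down
tracker.onChange(false, 9000); // down again, still open
console.log(tracker.downtime(10000)); // 3000 + 1000 = 4000
```

Keeping the arithmetic out of the callback makes it easy to report `downtime(Date.now())` in the failure payload even when the automation dies mid-outage.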
async function handleSystemErrors(screen: AndroidNode): Promise<boolean> {
  // Check and dismiss notification shade first
  if (await dismissNotificationShade()) {
    return true; // Screen changed, re-check
  }
  // Check for crash dialog
  if (await handleCrashDialog(screen)) {
    return true;
  }
  // Check for phone dialog
  if (await handlePhoneDialog(screen)) {
    return true;
  }
  // Check for no internet
  if (await handleNoInternet(screen)) {
    return true;
  }
  return false; // No system errors found
}

// Usage in main loop
async function mainLoop() {
  while (maxSteps-- > 0) {
    const screen = await agent.actions.screenContent();
    // Handle system errors first
    if (await handleSystemErrors(screen)) {
      continue; // Re-check after handling
    }
    // Get screen state and process...
    const state = detectScreenState(screen);
    await handleScreenState(state, screen);
  }
}

async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts: number = 3
): Promise<T> {
  let lastError: Error | undefined;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error as Error;
      console.warn(`Attempt ${attempt} failed: ${lastError.message}`);
      if (attempt < maxAttempts) {
        await sleep(1000 * attempt); // Increasing delay
      }
    }
  }
  throw lastError;
}

async function retryOnCondition<T>(
  fn: () => Promise<T>,
  shouldRetry: (error: Error) => boolean,
  maxAttempts: number = 3
): Promise<T> {
  let lastError: Error | undefined;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error as Error;
      if (!shouldRetry(lastError) || attempt >= maxAttempts) {
        throw lastError;
      }
      console.warn(`Retrying after: ${lastError.message}`);
      await sleep(2000);
    }
  }
  throw lastError;
}

// Usage: Only retry on network errors
await retryOnCondition(
  () => performNetworkOperation(),
  (error) => error.message.includes("network") ||
             error.message.includes("timeout"),
  5
);

async function attemptRecovery(
  state: ScreenState,
  screen: AndroidNode
): Promise<boolean> {
  console.log(`Attempting recovery from ${state}`);
  switch (state) {
    case ScreenState.Unknown:
      // Try going back
      await agent.actions.goBack();
      await sleep(1000);
      return true;
    case ScreenState.Loading:
      // Wait longer for loading
      await sleep(5000);
      return true;
    case ScreenState.NoInternet:
      // Toggle airplane mode to refresh connection
      await agent.actions.airplane();
      await sleep(5000);
      return true;
    case ScreenState.RateLimited:
      // Wait before retrying
      console.log("Rate limited, waiting 60 seconds...");
      await sleep(60000);
      return true;
    case ScreenState.ErrorDialog: {
      // Try to dismiss error
      const okButton = getAllNodes(screen).find(n =>
        n.clickable &&
        (n.text?.toLowerCase() === "ok" ||
         n.text?.toLowerCase() === "dismiss")
      );
      if (okButton) {
        await okButton.performAction(agent.constants.ACTION_CLICK);
        await sleep(1000);
        return true;
      }
      break;
    }
  }
  return false;
}

async function runAutomation() {
  const files: File[] = [];
  try {
    // Set up callbacks
    agent.utils.setNetworkCallback(handleNetworkChange);
    agent.notifications.setNotificationCallback(handleNotification);

    // Run main automation
    await main();

    // Success
    await agent.utils.job.submitTask("success", await collectFinalData(), true, files);
  } catch (error) {
    console.error("Automation error:", error);

    // Capture error screenshot
    try {
      const screenshot = await agent.actions.screenshot(1080, 1920, 80);
      files.push({ name: "error_screenshot", extension: "jpg", base64Data: screenshot.base64 });
    } catch (e) {
      console.error("Failed to capture screenshot:", e);
    }

    // Determine failure type
    const errorMessage = String(error);
    let failureReason = "UNKNOWN_ERROR";
    if (errorMessage.includes("OUT_OF_STEPS")) {
      failureReason = "OUT_OF_STEPS";
    } else if (errorMessage.includes("NETWORK")) {
      failureReason = "NETWORK_ERROR";
    } else if (errorMessage.includes("not found")) {
      failureReason = "ELEMENT_NOT_FOUND";
    }

    // Submit failure
    await agent.utils.job.submitTask("failed", {
      error: errorMessage,
      reason: failureReason,
      stage: currentStage,
      networkDownTime
    }, true, files);
  } finally {
    // Clean up callbacks
    agent.utils.setNetworkCallback(null);
    agent.notifications.setNotificationCallback(null);
    // Always stop the automation
    stopCurrentAutomation();
  }
}

// Start automation
runAutomation();

Error Handling
Handle crashes, dialogs, network issues, and recovery.

--------------------------------------------------------------------------------

## Guide / Running Automations

Path: /docs/automation/guide/running
Description: Execute automations on devices and collect results
Sections:
- The Automation Runner
- Execution Flow
- Parameter Validation
- Data Collection
- Start and Stop Commands
- Multi-Device Execution
- Status Updates
- Monitoring logs
- Debugging Tips
- Requirements Check
- Next Steps
- Select Devices
- Configure Automation Parameters
- Provide Job Variables
- Monitor & Collect Results
- Downloading Results
- Same Parameters
- Same Job Variables
- Independent Execution
- Check Device Connection
- Review Parameter Values
- Check Console Output
- Review Submitted Data
- Version Requirements
- Full Tutorial →
- API Reference →

Content:
Once your automation is ready, you can run it on connected devices through the dashboard. This guide covers the execution flow and data collection.

The Automation Runner
Access the runner from the AUTOMATIONS section of the dashboard.

Select Devices
Choose one or more connected devices to run the automation on. Each device will execute the automation independently.

Configure Automation Parameters
Fill in automation parameters defined in your project. These are validated before execution starts.

Provide Job Variables
Job variables are exposed to the script as agent.arguments.jobVariables.

Monitor & Collect Results
View collected data, download results, and track automation status in the data section.
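A sketch of how a script might read job variables and automation parameters. The `agent` object here is a local stub standing in for the runtime-provided global, and the variable names mirror those used in the tutorial later in this document (`email`, `password`, `responseMessage`, `maxLikes`).

```typescript
// Illustrative stub; in a real automation, `agent` is supplied by the runtime.
const agent = {
  arguments: {
    jobVariables: { email: "user@example.com", password: "secret" },
    automationParameters: { responseMessage: "", maxLikes: 5 },
  },
};

// Job variables carry per-task data; automation parameters carry per-run
// configuration. Apply defaults for optional parameters.
const { email, password } = agent.arguments.jobVariables;
const { responseMessage, maxLikes } = agent.arguments.automationParameters;
const message = responseMessage || "Thanks for your message!";

console.log(email, message, maxLikes); // user@example.com Thanks for your message! 5
```

Because all selected devices share the same job variables, anything device-specific must come from a separate run with a different job variable set.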
Execution Flow
1. Select an automation from your available automations
2. Select devices to run on (must be connected)
3. Fill parameters as defined in your automation schema
4. Fill job variables for the specific task
5. Click Start to begin execution
6. Monitor progress via status updates
7. View/download results when complete

Parameter Validation
Parameters are validated against your schema before execution.

Data Collection
Data submitted with submitTask() appears in the runner's data section.

Downloading Results
Collected data can be downloaded as CSV for analysis: click the Download button to export the data submitted via submitTask().

Start and Stop Commands

Multi-Device Execution
Running on multiple devices simultaneously:

Same Parameters: All selected devices receive the same automation parameters. Use this for consistent configuration across devices.
Same Job Variables: Job variables are also shared. If you need different data per device, you will need to run separate automation jobs with different job variable sets.
Independent Execution: Each device executes independently. One device failing doesn't affect others. Results are collected separately per device.

Status Updates
The runner shows submitted data for each execution (click refresh to update).

| Status   | Description                               |
|----------|-------------------------------------------|
| Starting | Command sent, waiting for device to begin |
| Running  | Automation is executing on device         |
| Success  | Automation completed successfully         |
| Failed   | Automation encountered an error           |
| Stopped  | Automation was manually stopped           |

Monitoring logs

Debugging Tips
Check Device Connection: Ensure devices show as "Connected" before starting. Disconnected devices won't receive the automation command.
Review Parameter Values: Double-check parameter values before starting. Invalid values will cause validation errors or unexpected behavior.
Check Console Output: Review console.log() output from the device while debugging.
Review Submitted Data: Check the collected data table for insights. Failed tasks often include error details in the submitted data.
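The schema-driven parameter validation described above can be sketched as follows. This is not Xgodo's actual validator, just equivalent checks for the rule kinds shown in the example schema (required, min/max, integer, pattern); the error strings match the ones the docs show being surfaced to the user.

```typescript
// Field definition mirroring the documented schema shape.
type Field = {
  name: string;
  type: "number" | "string";
  required?: boolean;
  min?: number;
  max?: number;
  integer?: boolean;
  pattern?: string;
};

// Returns one message per violated rule, in schema order.
function validate(fields: Field[], values: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const f of fields) {
    const v = values[f.name];
    if (v === undefined || v === null || v === "") {
      if (f.required) errors.push(`${f.name}: Value is required`);
      continue; // nothing else to check on an absent value
    }
    if (f.type === "number" && typeof v === "number") {
      if (f.min !== undefined && v < f.min)
        errors.push(`${f.name}: Value must be at least ${f.min}`);
      if (f.max !== undefined && v > f.max)
        errors.push(`${f.name}: Value must be at most ${f.max}`);
      if (f.integer && !Number.isInteger(v))
        errors.push(`${f.name}: Value must be an integer`);
    }
    if (f.type === "string" && typeof v === "string") {
      if (f.pattern && !new RegExp(f.pattern).test(v))
        errors.push(`${f.name}: Value does not match pattern`);
    }
  }
  return errors;
}

const fields: Field[] = [
  { name: "maxRetries", type: "number", required: true, min: 1, max: 10, integer: true },
  { name: "targetUrl", type: "string", required: true, pattern: "^https?://.*" },
];

console.log(validate(fields, { maxRetries: 0, targetUrl: "ftp://x" }));
// ["maxRetries: Value must be at least 1", "targetUrl: Value does not match pattern"]
```

Validating before dispatch is what lets the runner refuse to send the start command at all when a value is out of range.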
Requirements Check
Before execution, the system verifies:
- Device meets minimum Android version requirement (only when running through a job)
- Device has minimum app version installed (only when running through a job)
- User has execution permission for the automation
- All required parameters are provided and valid

Version Requirements
If your automation uses dpad() or inputKey(), set the minimum Android version to 13+ (API 33) in your project configuration.

Next Steps
Full Tutorial → Complete MySocial example putting everything together
API Reference → Complete documentation of all available methods

// Your schema definition
{
  "fields": [
    {
      "name": "maxRetries",
      "type": "number",
      "required": true,
      "min": 1,
      "max": 10,
      "integer": true
    },
    {
      "name": "targetUrl",
      "type": "string",
      "required": true,
      "pattern": "^https?://.*"
    }
  ]
}

// Validation errors shown to user:
// - "maxRetries: Value must be at least 1"
// - "targetUrl: Value does not match pattern"

// In your automation
await agent.utils.job.submitTask("success", {
  email: "user@example.com",
  orderId: "12345",
  itemsProcessed: 10,
  completedAt: new Date().toISOString()
});

// Don't forget to stop the automation!
stopCurrentAutomation();

// This data appears in the runner as:
// | Device     | email            | orderId | itemsProcessed | completedAt          |
// |------------|------------------|---------|----------------|----------------------|
// | Device-001 | user@example.com | 12345   | 10             | 2024-01-15T10:30:00Z |

// When you click "Start":
POST /api/v1/device/automation
{
  "device_ids": ["device-001", "device-002"],
  "automationId": "automation-uuid",
  "command": "start",
  "automationParameters": { ... },
  "jobVariables": { ... }
}

// When you click "Stop":
POST /api/v1/device/automation
{
  "device_ids": ["device-001", "device-002"],
  "automationId": "automation-uuid",
  "command": "stop"
}

Running Automations
Execute automations on devices and collect results.

--------------------------------------------------------------------------------

## Guide / Full Tutorial: MySocial Auto-Responder

Path: /docs/automation/guide/tutorial
Description: Build a complete automation from scratch
Sections:
- Step 1: Define Stages
- Step 2: Define Screen States
- Step 3: Utility Functions
- Step 4: Screen Detection
- Step 5: Stage Handlers
- Step 6: Main Entry Point
- Configuration Schema
- Key Patterns Used
- Running the Automation
- Next Steps
- Project Structure
- Stage-Based Organization
- Comprehensive Screen Detection
- Error Recovery
- Data Collection
- Out-of-Steps Handling
- ES6 Module Organization
- Congratulations!
- API Reference →
- Actions Reference →
- AndroidNode Reference →

Content:
This tutorial walks through building a complete automation for a fictional social media app called "MySocial".
The automation will:
- Launch the app and verify login state
- Navigate to the Messages section
- Process unread conversations
- Send automated responses
- Like posts in the feed
- Log all interactions to task data

Project Structure
This example uses ES6 imports to organize code across multiple files.

Step 1: Define Stages
First, define the automation stages.

Step 2: Define Screen States
Define all possible screen states.

Step 3: Utility Functions
Create shared utility functions.

Step 4: Screen Detection
Implement screen state detection.

Step 5: Stage Handlers
Implement handlers for each stage.

Step 6: Main Entry Point
Finally, create the main entry point that ties everything together.

Configuration Schema
Configure the automation parameters and job variables in the Options panel.

Key Patterns Used
Stage-Based Organization: The automation is divided into clear stages (Initialize, LaunchApp, HandleLogin, etc.) with step counters reset at each transition.
Comprehensive Screen Detection: Detection handles system states (crash, phone, permissions) before app states, using multiple conditions for reliable identification.
Error Recovery: Stuck state detection with automatic recovery (go back), notification dismissal, and network monitoring throughout execution.
Data Collection: Progress is reported with submitTask("running", ...), and final results include all collected metrics.
Out-of-Steps Handling: Screens are recorded with storeScreen(), and out-of-steps reports are submitted for debugging.
ES6 Module Organization: Code is organized into separate files with clean imports, making it easy to maintain and test individual components.

Running the Automation
1. Save all files in the IDE
2. Configure automation parameters and job variables schemas
3. Go to Devices → Automation Runner
4. Select your automation and target device(s)
5. Fill in parameters (response message, max likes)
6. Fill in job variables (email, password)
7. Click Start and monitor progress

Congratulations!
You've built a complete automation with all the essential patterns.
Use this as a template for your own automations, adapting the stages, screen states, and handlers for your target app. Next Steps API Reference → Explore all available methods and types Actions Reference → All touch, navigation, and input actions AndroidNode Reference → Screen content structure and filtering // Define all automation stages export enum Stage { Initialize = "Initialize", LaunchApp = "LaunchApp", HandleLogin = "HandleLogin", NavigateToMessages = "NavigateToMessages", SelectUnreadChat = "SelectUnreadChat", ProcessMessages = "ProcessMessages", NavigateToFeed = "NavigateToFeed", LikePosts = "LikePosts", Complete = "Complete", } // System states (can appear anytime) export enum ScreenState { Unknown = "Unknown", Crash = "Crash", PhoneDialog = "PhoneDialog", NoInternet = "NoInternet", Loading = "Loading", // Login states SplashScreen = "SplashScreen", LoginScreen = "LoginScreen", LoginEnterPassword = "LoginEnterPassword", LoginTwoFactor = "LoginTwoFactor", LoginError = "LoginError", // Main app states HomeTab = "HomeTab", SearchTab = "SearchTab", MessagesTab = "MessagesTab", NotificationsTab = "NotificationsTab", ProfileTab = "ProfileTab", // Message states ChatList = "ChatList", ChatListEmpty = "ChatListEmpty", ChatConversation = "ChatConversation", NewMessageDialog = "NewMessageDialog", // Feed states PostDetail = "PostDetail", CommentSheet = "CommentSheet", // Dialog states PermissionDialog = "PermissionDialog", UpdateRequired = "UpdateRequired", RateLimited = "RateLimited", ErrorDialog = "ErrorDialog", } // App package name export const APP_PACKAGE = "com.mysocial.app"; // Sleep with optional random range export function sleep(min: number, max?: number): Promise { const ms = max ? 
Math.floor(Math.random() * (max - min) + min) : min; return new Promise(resolve => setTimeout(resolve, ms)); } // Random number in range export function randomRange(min: number, max: number): number { return Math.floor(Math.random() * (max - min) + min); } // Collected data storage export const collectedData: { messagesResponded: number; postsLiked: number; errors: string[]; stages: { stage: string; timestamp: number }[]; } = { messagesResponded: 0, postsLiked: 0, errors: [], stages: [] }; export function recordStage(stage: string) { collectedData.stages.push({ stage, timestamp: Date.now() }); } export function recordError(error: string) { collectedData.errors.push(error); console.error(error); } import { ScreenState } from "./screenStates.js"; import { APP_PACKAGE } from "./utils.js"; export function detectScreenState(screen: AndroidNode): ScreenState { const allNodes = getAllNodes(screen); // === SYSTEM STATES (check first) === // Crash dialog if (findNodesById(screen, "android:id/aerr_close").find(n => n.clickable)) { return ScreenState.Crash; } // Phone dialog if (allNodes.every(n => n.packageName === "com.android.phone")) { return ScreenState.PhoneDialog; } // No internet if (allNodes.find(n => n.text?.toLowerCase()?.includes("no internet") || n.text?.toLowerCase()?.includes("offline") )) { return ScreenState.NoInternet; } // Loading screen (minimal nodes with progress indicator) if (allNodes.length <= 5 && allNodes.find(n => n.className?.includes("ProgressBar") )) { return ScreenState.Loading; } // Permission dialog if (allNodes.find(n => n.packageName === "com.android.permissioncontroller" )) { return ScreenState.PermissionDialog; } // === APP STATES === const appNodes = allNodes.filter(n => n.packageName === APP_PACKAGE); // Splash screen if (appNodes.find(n => n.viewId === \`\${APP_PACKAGE}:id/splash_logo\` )) { return ScreenState.SplashScreen; } // Login screen (email field) if (appNodes.find(n => n.className === "android.widget.EditText" && 
(n.hintText?.toLowerCase()?.includes("email") || n.hintText?.toLowerCase()?.includes("username")) )) { return ScreenState.LoginScreen; } // Password screen if (appNodes.find(n => n.isPassword)) { return ScreenState.LoginEnterPassword; } // Two-factor screen if (appNodes.find(n => n.text?.toLowerCase()?.includes("verification code") || n.text?.toLowerCase()?.includes("2fa") )) { return ScreenState.LoginTwoFactor; } // Check bottom navigation tabs const homeTab = appNodes.find(n => n.description === "Home" && n.className === "android.widget.Button" ); const messagesTab = appNodes.find(n => n.description === "Messages" && n.className === "android.widget.Button" ); // Messages tab selected if (messagesTab?.isSelected) { // Chat conversation (has message input) if (appNodes.find(n => n.hintText?.toLowerCase()?.includes("message") && n.className === "android.widget.EditText" )) { return ScreenState.ChatConversation; } // Chat list (has conversation items) if (appNodes.find(n => n.viewId === \`\${APP_PACKAGE}:id/chat_list\` )) { const hasUnread = appNodes.find(n => n.viewId === \`\${APP_PACKAGE}:id/unread_badge\` ); return hasUnread ? 
ScreenState.ChatList : ScreenState.ChatListEmpty; } return ScreenState.MessagesTab; } // Home tab selected if (homeTab?.isSelected) { return ScreenState.HomeTab; } // Profile tab if (appNodes.find(n => n.description === "Profile" && n.isSelected )) { return ScreenState.ProfileTab; } // Error dialog if (appNodes.find(n => n.text?.toLowerCase()?.includes("error") || n.text?.toLowerCase()?.includes("something went wrong") ) && appNodes.find(n => n.text?.toLowerCase() === "ok" && n.clickable )) { return ScreenState.ErrorDialog; } // Rate limited if (appNodes.find(n => n.text?.toLowerCase()?.includes("rate limit") || n.text?.toLowerCase()?.includes("try again later") )) { return ScreenState.RateLimited; } return ScreenState.Unknown; } import { Stage } from "./stages.js"; import { ScreenState } from "./screenStates.js"; import { APP_PACKAGE, sleep, collectedData, recordError } from "./utils.js"; // State variables let currentStage: Stage = Stage.Initialize; let maxSteps = 48; let isNetworkAvailable = true; export function getCurrentStage() { return currentStage; } export function getMaxSteps() { return maxSteps; } export async function setCurrentStage(newStage: Stage) { currentStage = newStage; maxSteps = 48; // Reset step counter console.log(\`=== Stage: \${newStage} ===\`); await agent.utils.job.submitTask( "running", { stage: newStage, ...collectedData }, false // Don't finish the task ); } // Network callback agent.utils.setNetworkCallback((available) => { isNetworkAvailable = available; if (!available) { recordError("Network disconnected"); } }); // === SYSTEM HANDLERS === export async function handleCrash(screen: AndroidNode): Promise { const closeBtn = findNodesById(screen, "android:id/aerr_close") .find(n => n.clickable); if (closeBtn) { console.log("Closing crash dialog..."); await closeBtn.performAction(agent.constants.ACTION_CLICK); await sleep(2000); return true; } return false; } export async function handlePhoneDialog(screen: AndroidNode): Promise { const 
allNodes = getAllNodes(screen); const dismissBtn = allNodes.find(n => n.viewId?.includes("dismiss") || n.viewId?.includes("decline") ); if (dismissBtn) { console.log("Dismissing phone dialog..."); dismissBtn.randomClick(); await sleep(2000); return true; } return false; } export async function handlePermissionDialog(screen: AndroidNode): Promise { const allNodes = getAllNodes(screen); const allowBtn = allNodes.find(n => n.text?.toLowerCase()?.includes("allow") && n.clickable ); if (allowBtn) { console.log("Granting permission..."); await allowBtn.performAction(agent.constants.ACTION_CLICK); await sleep(1000); return true; } return false; } export async function handleErrorDialog(screen: AndroidNode): Promise { const allNodes = getAllNodes(screen); const okBtn = allNodes.find(n => n.text?.toLowerCase() === "ok" && n.clickable ); if (okBtn) { console.log("Dismissing error dialog..."); await okBtn.performAction(agent.constants.ACTION_CLICK); await sleep(1000); return true; } return false; } // === STAGE HANDLERS === export async function handleInitialize() { // Verify app is installed const apps = await agent.actions.listApps(); if (!apps[APP_PACKAGE]) { throw new Error("MySocial app not installed"); } console.log("App installed, proceeding..."); await setCurrentStage(Stage.LaunchApp); } export async function handleLaunchApp( state: ScreenState, screen: AndroidNode ) { if (state === ScreenState.HomeTab || state === ScreenState.MessagesTab) { // Already in app await setCurrentStage(Stage.NavigateToMessages); return; } if (state === ScreenState.LoginScreen || state === ScreenState.LoginEnterPassword) { await setCurrentStage(Stage.HandleLogin); return; } if (state === ScreenState.SplashScreen || state === ScreenState.Loading) { await sleep(2000, 3000); return; } // Launch the app console.log("Launching MySocial..."); await agent.actions.launchApp(APP_PACKAGE); await sleep(3000); } export async function handleLogin( state: ScreenState, screen: AndroidNode ) { const 
allNodes = getAllNodes(screen); const { email, password } = agent.arguments.jobVariables; if (state === ScreenState.LoginScreen) { // Enter email const emailField = allNodes.find(n => n.className === "android.widget.EditText" && n.hintText?.toLowerCase()?.includes("email") ); if (emailField) { await emailField.performAction(agent.constants.ACTION_FOCUS); await sleep(500); await agent.actions.writeText(email); await sleep(500); // Click next/continue const nextBtn = allNodes.find(n => n.clickable && (n.text?.toLowerCase() === "next" || n.text?.toLowerCase() === "continue") ); if (nextBtn) { await nextBtn.performAction(agent.constants.ACTION_CLICK); } await sleep(2000); } return; } if (state === ScreenState.LoginEnterPassword) { // Enter password const passwordField = allNodes.find(n => n.isPassword); if (passwordField) { await passwordField.performAction(agent.constants.ACTION_FOCUS); await sleep(500); await agent.actions.writeText(password); await sleep(500); await agent.actions.hideKeyboard(); // Click login const loginBtn = allNodes.find(n => n.clickable && (n.text?.toLowerCase() === "login" || n.text?.toLowerCase() === "sign in") ); if (loginBtn) { await loginBtn.performAction(agent.constants.ACTION_CLICK); } await sleep(3000); } return; } if (state === ScreenState.HomeTab || state === ScreenState.MessagesTab) { console.log("Login successful!"); await setCurrentStage(Stage.NavigateToMessages); } } export async function handleNavigateToMessages( state: ScreenState, screen: AndroidNode ) { if (state === ScreenState.MessagesTab || state === ScreenState.ChatList) { await setCurrentStage(Stage.SelectUnreadChat); return; } // Click messages tab const allNodes = getAllNodes(screen); const messagesTab = allNodes.find(n => n.description === "Messages" && n.className === "android.widget.Button" && n.clickable ); if (messagesTab) { console.log("Navigating to messages..."); await messagesTab.performAction(agent.constants.ACTION_CLICK); await sleep(2000); } } export async 
function handleSelectUnreadChat( state: ScreenState, screen: AndroidNode ) { if (state === ScreenState.ChatConversation) { await setCurrentStage(Stage.ProcessMessages); return; } if (state === ScreenState.ChatListEmpty) { console.log("No unread messages, moving to feed..."); await setCurrentStage(Stage.NavigateToFeed); return; } const allNodes = getAllNodes(screen); // Find unread chat const unreadChat = allNodes.find(n => n.viewId === \`\${APP_PACKAGE}:id/chat_item\` && getAllNodes(n).find(child => child.viewId === \`\${APP_PACKAGE}:id/unread_badge\` ) ); if (unreadChat) { console.log("Opening unread chat..."); unreadChat.randomClick(); await sleep(2000); } else { // No more unread, move to feed console.log("No more unread chats, moving to feed..."); await setCurrentStage(Stage.NavigateToFeed); } } export async function handleProcessMessages( state: ScreenState, screen: AndroidNode ) { const allNodes = getAllNodes(screen); const { responseMessage } = agent.arguments.automationParameters; // Find message input const messageInput = allNodes.find(n => n.hintText?.toLowerCase()?.includes("message") && n.className === "android.widget.EditText" ); if (messageInput) { // Type response await messageInput.performAction(agent.constants.ACTION_FOCUS); await sleep(500); await agent.actions.writeText(responseMessage || "Thanks for your message!"); await sleep(500); // Send message const sendBtn = allNodes.find(n => (n.description?.toLowerCase() === "send" || n.viewId?.includes("send")) && n.clickable ); if (sendBtn) { await sendBtn.performAction(agent.constants.ACTION_CLICK); collectedData.messagesResponded++; console.log(\`Message sent! 
handlers.ts (continued):

```typescript
      Total: ${collectedData.messagesResponded}`);
      await sleep(1500);
    }
  }

  // Go back to chat list
  await agent.actions.goBack();
  await sleep(1000);
  await setCurrentStage(Stage.SelectUnreadChat);
}

export async function handleNavigateToFeed(
  state: ScreenState,
  screen: AndroidNode
) {
  if (state === ScreenState.HomeTab) {
    await setCurrentStage(Stage.LikePosts);
    return;
  }
  const allNodes = getAllNodes(screen);
  const homeTab = allNodes.find(n =>
    n.description === "Home" &&
    n.className === "android.widget.Button" &&
    n.clickable
  );
  if (homeTab) {
    console.log("Navigating to feed...");
    await homeTab.performAction(agent.constants.ACTION_CLICK);
    await sleep(2000);
  }
}

export async function handleLikePosts(
  state: ScreenState,
  screen: AndroidNode
) {
  const { maxLikes } = agent.arguments.automationParameters;
  if (collectedData.postsLiked >= (maxLikes || 5)) {
    console.log("Reached max likes, completing...");
    await setCurrentStage(Stage.Complete);
    return;
  }
  const allNodes = getAllNodes(screen);
  // Find like button (not already liked)
  const likeBtn = allNodes.find(n =>
    n.viewId === `${APP_PACKAGE}:id/like_button` &&
    n.description !== "Unlike" &&
    n.clickable
  );
  if (likeBtn) {
    console.log("Liking post...");
    await likeBtn.performAction(agent.constants.ACTION_CLICK);
    collectedData.postsLiked++;
    await sleep(1000, 2000);
    // Scroll to next post
    await agent.actions.swipe(540, 1500, 540, 800, 500);
    await sleep(1000);
  } else {
    // Scroll to find more posts
    await agent.actions.swipe(540, 1500, 540, 500, 500);
    await sleep(1500);
  }
}
```

main.ts:

```typescript
import { Stage } from "./stages.js";
import { ScreenState } from "./screenStates.js";
import { detectScreenState } from "./detection.js";
import {
  getCurrentStage, getMaxSteps, setCurrentStage,
  handleCrash, handlePhoneDialog, handlePermissionDialog, handleErrorDialog,
  handleInitialize, handleLaunchApp, handleLogin,
  handleNavigateToMessages, handleSelectUnreadChat, handleProcessMessages,
  handleNavigateToFeed, handleLikePosts
} from "./handlers.js";
import { sleep, collectedData, recordError } from "./utils.js";

// File storage for screenshots
const files: { name: string; extension: string; base64Data: string }[] = [];

// Dismiss notification shade if visible
async function dismissNotifications(): Promise<boolean> {
  const screens = await agent.actions.allScreensContent();
  const allNodes = screens.flatMap(s => getAllNodes(s));
  const notif = allNodes.find(n =>
    n.viewId === "com.android.systemui:id/expandableNotificationRow" &&
    n.actions.includes(agent.constants.ACTION_DISMISS)
  );
  if (notif) {
    await notif.performAction(agent.constants.ACTION_DISMISS);
    await sleep(500);
    return true;
  }
  return false;
}

// Handle system-level interruptions
async function handleSystemStates(
  state: ScreenState,
  screen: AndroidNode
): Promise<boolean> {
  // First check notifications
  if (await dismissNotifications()) {
    return true;
  }
  switch (state) {
    case ScreenState.Crash:
      return await handleCrash(screen);
    case ScreenState.PhoneDialog:
      return await handlePhoneDialog(screen);
    case ScreenState.PermissionDialog:
      return await handlePermissionDialog(screen);
    case ScreenState.ErrorDialog:
      return await handleErrorDialog(screen);
    case ScreenState.NoInternet:
      console.log("Waiting for network...");
      await sleep(5000);
      return true;
    case ScreenState.Loading:
      await sleep(2000);
      return true;
    case ScreenState.RateLimited:
      console.log("Rate limited, waiting 60 seconds...");
      await sleep(60000);
      return true;
  }
  return false;
}

// Main screen state handler
async function handleScreenState(
  state: ScreenState,
  screen: AndroidNode
) {
  // Handle system states first
  if (await handleSystemStates(state, screen)) {
    return;
  }
  const currentStage = getCurrentStage();
  switch (currentStage) {
    case Stage.Initialize:
      await handleInitialize();
      break;
    case Stage.LaunchApp:
      await handleLaunchApp(state, screen);
      break;
    case Stage.HandleLogin:
      await handleLogin(state, screen);
      break;
    case Stage.NavigateToMessages:
      await handleNavigateToMessages(state, screen);
      break;
    case Stage.SelectUnreadChat:
      await handleSelectUnreadChat(state, screen);
      break;
    case Stage.ProcessMessages:
      await handleProcessMessages(state, screen);
      break;
    case Stage.NavigateToFeed:
      await handleNavigateToFeed(state, screen);
      break;
    case Stage.LikePosts:
      await handleLikePosts(state, screen);
      break;
  }
}

// Main automation function
async function main() {
  console.log("=== MySocial Auto-Responder Starting ===");
  try {
    await setCurrentStage(Stage.Initialize);
    let maxSteps = 48;
    let sameStateCount = 0;
    let lastState: ScreenState | null = null;

    while (maxSteps-- > 0) {
      // Get current screen
      const screen = await agent.actions.screenContent();
      const state = detectScreenState(screen);
      console.log(`[Step ${48 - maxSteps}] State: ${state}, Stage: ${getCurrentStage()}`);

      // Store screen for debugging
      await agent.utils.outOfSteps.storeScreen(
        screen,
        getCurrentStage(),
        state,
        maxSteps,
        state === ScreenState.Unknown
          ? ScreenshotRecord.HIGH_QUALITY
          : ScreenshotRecord.LOW_QUALITY
      );

      // Check for stuck state
      if (state === lastState) {
        sameStateCount++;
        if (sameStateCount >= 5) {
          console.warn(`Stuck on ${state}, attempting recovery...`);
          await agent.actions.goBack();
          await sleep(1000);
          sameStateCount = 0;
        }
      } else {
        lastState = state;
        sameStateCount = 0;
      }

      // Handle the screen state
      await handleScreenState(state, screen);

      // Check if complete
      if (getCurrentStage() === Stage.Complete) {
        break;
      }

      // Delay between iterations
      await sleep(500, 1000);
    }

    // Check completion status
    if (getCurrentStage() === Stage.Complete) {
      console.log("=== Automation Complete! ===");
      console.log(`Messages responded: ${collectedData.messagesResponded}`);
      console.log(`Posts liked: ${collectedData.postsLiked}`);
      await agent.utils.job.submitTask("success", {
        ...collectedData,
        completedAt: new Date().toISOString()
      }, true, files);
    } else {
      // Out of steps
      console.warn("Out of steps!");
      const result = await agent.utils.outOfSteps.submit("outOfSteps");
      await agent.utils.job.submitTask("failed", {
        ...collectedData,
        error: "OUT_OF_STEPS",
        outOfStepsId: result.success ? result.id : null
      }, true, files);
    }
  } catch (error) {
    console.error("Automation error:", error);
    // Capture error screenshot
    try {
      const { screenshot } = await agent.actions.screenshot(1080, 1920, 80);
      if (screenshot) {
        files.push({
          name: "error_screenshot",
          extension: "jpg",
          base64Data: screenshot
        });
      }
    } catch (e) {
      // Ignore screenshot errors
    }
    await agent.utils.job.submitTask("failed", {
      ...collectedData,
      error: String(error),
      stage: getCurrentStage()
    }, true, files);
  } finally {
    // Always stop the automation
    stopCurrentAutomation();
  }
}

// Start the automation
main();
```

Full Tutorial: MySocial Auto-Responder. Build a complete automation from scratch.
Files: stages.ts, screenStates.ts, utils.ts, detection.ts, handlers.ts, main.ts

Automation Parameters Schema:

```json
{
  "fields": [
    {
      "name": "responseMessage",
      "type": "string",
      "required": false,
      "description": "Message to send as response",
      "defaultValue": "Thanks for reaching out!"
    },
    {
      "name": "maxLikes",
      "type": "number",
      "required": false,
      "description": "Maximum posts to like",
      "defaultValue": 5,
      "min": 1,
      "max": 20
    }
  ]
}
```

Job Variables Schema:

```json
{
  "fields": [
    {
      "name": "email",
      "type": "string",
      "required": true,
      "description": "MySocial account email"
    },
    {
      "name": "password",
      "type": "string",
      "required": true,
      "description": "MySocial account password"
    }
  ]
}
```

--------------------------------------------------------------------------------
### REFERENCE
## Reference / API Reference
Path: /docs/automation/reference
Description: Complete reference for all automation interfaces, methods, and types
Sections:
- Agent Namespaces
- Screen Content
- Other
Content: The Automation API is accessed through the global agent object. It provides comprehensive control over Android devices including touch gestures, text input, app management, file operations, and screen content access.
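A common pattern with this API is polling the screen until an element appears before acting on it. The sketch below wraps the documented screenContent() / findTextOne() / performAction() surface in a small retry helper; `AgentLike`, `clickWhenVisible`, and the `fakeAgent` stub are illustrative assumptions for this example, not part of the Xgodo API.

```typescript
// Minimal structural types mirroring the documented surface (assumption).
interface NodeLike {
  performAction(action: number): Promise<boolean>;
}
interface ScreenLike {
  findTextOne(text: string): NodeLike | null;
}
interface AgentLike {
  constants: { ACTION_CLICK: number };
  actions: { screenContent(): Promise<ScreenLike> };
}

// Poll the screen until a node with the given text appears, then click it.
async function clickWhenVisible(
  a: AgentLike,
  text: string,
  retries = 5,
  delayMs = 50
): Promise<boolean> {
  for (let i = 0; i < retries; i++) {
    const screen = await a.actions.screenContent();
    const node = screen.findTextOne(text);
    if (node) {
      return node.performAction(a.constants.ACTION_CLICK);
    }
    await new Promise(r => setTimeout(r, delayMs));
  }
  return false; // never appeared within the retry budget
}

// Stub agent for demonstration: "Submit" shows up on the second poll.
let polls = 0;
const clicks: string[] = [];
const fakeAgent: AgentLike = {
  constants: { ACTION_CLICK: 16 },
  actions: {
    async screenContent(): Promise<ScreenLike> {
      polls++;
      return {
        findTextOne: (text) =>
          polls >= 2 && text === "Submit"
            ? { performAction: async () => { clicks.push(text); return true; } }
            : null,
      };
    },
  },
};

clickWhenVisible(fakeAgent, "Submit").then(ok => {
  console.log(ok, clicks.length); // true 1
});
```

In a real automation script the stub is unnecessary: the global agent already satisfies this shape, so the helper can be called as `clickWhenVisible(agent, "Submit")`.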
Agent Namespaces

```typescript
// The agent object is globally available in automation scripts
declare const agent: Agent;

interface Agent {
  constants: AgentConstants;         // Accessibility action constants
  actions: AgentActions;             // Device automation methods
  utils: AgentUtils;                 // Utility functions & file operations
  info: AgentInfo;                   // Device & automation metadata
  control: AgentControl;             // Automation control
  display: AgentDisplay;             // HTML overlay display
  email: AgentEmail;                 // Email operations
  notifications: AgentNotifications; // Notification handling
}
```

Entry Point:
- Constants: 44 accessibility action constants for nodeAction() operations
- Actions: 26 methods for device interaction: tap, swipe, screenshot, screenContent, launchApp, and more
- Utils & Files: Utility functions and 14 file operation methods
- Info: Get automation and device metadata
- Control: Stop automation execution
- Display: Show HTML overlays on screen
- Email: Read emails via IMAP
- Notifications: Handle system notifications

Screen Content:
- AndroidNode: Accessibility tree nodes with 35+ properties and 11 methods for traversing and querying UI elements
- AndroidNodeFilter: Builder pattern with 28 chainable methods for finding nodes: isButton(), hasText(), isClickable(), and more

Other:
- Types: All supporting types: FileInfo, Email, OCR types, callbacks, and more
- Helper Functions: 5 standalone utility functions for working with nodes

--------------------------------------------------------------------------------
## Reference / Agent
Path: /docs/automation/reference/agent
Description: The main entry point for all automation functionality
Sections:
- Namespaces
- Quick Example
Content: The agent object is globally available in all automation scripts. It provides access to device controls, screen content, file operations, and more through its namespaced properties.
Namespaces

```typescript
interface Agent {
  /** Accessibility action constants */
  constants: AgentConstants;
  /** All automation actions */
  actions: AgentActions;
  /** Utility functions */
  utils: AgentUtils;
  /** Device and automation info */
  info: AgentInfo;
  /** Automation control */
  control: AgentControl;
  /** HTML overlay display */
  display: AgentDisplay;
  /** Email operations */
  email: AgentEmail;
  /** Notification handling */
  notifications: AgentNotifications;
  /** Record usage statistics */
  recordUsage(type: string, usage: number): void;
}
```

Quick Example

```typescript
// Tap at coordinates
await agent.actions.tap(100, 200);

// Get screen content and find elements
const screen = await agent.actions.screenContent();
const button = screen.findTextOne("Submit");

// Perform action on a node
if (button) {
  await button.performAction(agent.constants.ACTION_CLICK);
}

// Get device info
const device = agent.info.getDeviceInfo();
console.log(device.brand, device.model);

// Read a file
const content = agent.utils.files.readFullFile("/sdcard/data.txt");
```

Agent Interface:
- agent.constants: 44 accessibility action constants used with nodeAction()
- agent.actions: 26 methods for device interaction: touch gestures, text input, app management, screenshots, and more
- agent.utils: Utility functions including randomClick, randomSwipe, callbacks, and file operations
- agent.info: Get automation and device metadata
- agent.control: Stop automation execution
- agent.display: Display HTML overlays on screen
- agent.email: Read emails via IMAP protocol
- agent.notifications: Handle and respond to system notifications

--------------------------------------------------------------------------------
## Reference / AgentConstants
Path: /docs/automation/reference/agent/constants
Description: Accessibility action constants for use with performAction()
Sections:
- Action Constants
- Argument Constants
Content: These constants represent Android accessibility actions that can be performed on UI nodes. Use them with the node.performAction() method.
Argument Constants: These constants are used as keys in the data parameter of performAction() to pass additional arguments.

Usage Example

```typescript
// Click a node using accessibility action
const screen = await agent.actions.screenContent();
const button = screen.findTextOne("Submit");
if (button) {
  await button.performAction(agent.constants.ACTION_CLICK);
}

// Set text on an input field
const input = screen.findAdvanced(f => f.isEditText());
if (input) {
  await input.performAction(
    agent.constants.ACTION_SET_TEXT,
    { [agent.constants.ACTION_ARGUMENT_SET_TEXT_CHARSEQUENCE]: "Hello World" }
  );
}
```

Constant groups: Basic Actions, Navigation Actions, Scroll Actions, Editing Actions, Expand/Collapse Actions, Advanced Actions, Drag Actions

--------------------------------------------------------------------------------
## Reference / AgentActions
Path: /docs/automation/reference/agent/actions
Description: All automation actions for device interaction
Sections:
- Action Categories
- Quick Example
Content: Access these methods through agent.actions. All action methods are asynchronous and return Promises.
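The quick examples in this reference tap a node by averaging the corners of its boundsInScreen rectangle before calling tap(). A tiny pure helper (a sketch for illustration, not part of the Xgodo API) keeps that arithmetic in one place:

```typescript
// Screen rectangle of an accessibility node, as in boundsInScreen.
interface Bounds {
  left: number;
  top: number;
  right: number;
  bottom: number;
}

// Center point of the rectangle: the usual tap target.
function centerOf(b: Bounds): { x: number; y: number } {
  return { x: (b.left + b.right) / 2, y: (b.top + b.bottom) / 2 };
}

const c = centerOf({ left: 100, top: 200, right: 300, bottom: 260 });
console.log(c.x, c.y); // 200 230
// ...then, in an automation script: await agent.actions.tap(c.x, c.y);
```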
Action Categories Quick Example interface AgentActions { // Touch Gestures tap(x: number, y: number): Promise; swipe(x1: number, y1: number, x2: number, y2: number, duration: number): Promise; hold(x: number, y: number, duration: number): Promise; doubleTap(x: number, y: number, interval: number): Promise; multiTap(sequence: MultiTapSequenceItem[]): Promise; swipePoly(startX: number, startY: number, sequence: Point[], duration: number, bezier?: boolean): Promise; // Navigation (dpad requires Android 13+)
goHome(): Promise; goBack(): Promise; recents(): Promise; dpad(direction: "up" | "down" | "left" | "right" | "center"): Promise; // Text Input (inputKey requires Android 13+) writeText(text: string): Promise; copyText(text: string): Promise; paste(): Promise; reverseCopy(): Promise; hideKeyboard(): Promise; inputKey(keyCode: number, duration?: number, state?: "down" | "up" | null): Promise; // App Management launchApp(packageName: string, clearExisting?: boolean): Promise; launchIntent(...): Promise; listApps(): Promise<{[packageName: string]: string}>; browse(url: string, clearExistingData?: boolean): Promise; // Screen Operations screenContent(): Promise; allScreensContent(): Promise; screenshot(maxWidth: number, maxHeight: number, quality: number, ...): Promise; nodeAction(node: AndroidNode, actionInt: number, data?: object): Promise<{actionPerformed: boolean}>; showNotification(title: string, message: string): Promise; // File Operations saveFile(fileName: string, data: string, base64?: boolean): Promise; // Network airplane(): Promise; // Image Recognition recognizeText(imageBase64: string): Promise; // ADB Actions (app version 2.141+) adb: AgentAdbActions; } // Get the current screen content const screen = await agent.actions.screenContent(); // Find a button with text "Submit" const submitBtn = screen.findTextOne("Submit"); // Tap the button if (submitBtn) { const { left, top, right, bottom } = submitBtn.boundsInScreen; await agent.actions.tap((left + right) / 2, (top + bottom) / 2); } // Or use performAction for accessibility-based click await submitBtn.performAction(agent.constants.ACTION_CLICK); AgentActions All automation actions for device interaction AgentActions Interface Overview Touch Actions tap, swipe, hold, doubleTap, multiTap, swipePoly Navigation goHome, goBack, recents, dpad (Android 13+) Text Input writeText, copyText, paste, reverseCopy, hideKeyboard, inputKey (Android 13+) App Management launchApp, launchIntent, listApps, browse Screen 
Operations screenContent, allScreensContent, screenshot, nodeAction, showNotification File Operations saveFile - save files to device storage Network airplane - toggle airplane mode to refresh IP on mobile network Image Recognition recognizeText - OCR using ML Kit ADB Actions ADB shell-based alternatives: tap, swipe, hold, goHome, screenContent, listApps, and more (app v2.141+) -------------------------------------------------------------------------------- ## Reference / Touch Actions Path: /docs/automation/reference/agent/actions/touch Description: Touch gestures for device interaction Methods: ### tap() Signature: tap(x: number, y: number): Promise<void> Performs a single tap at the specified screen coordinates. ### swipe() Signature: swipe(x1: number, y1: number, x2: number, y2: number, duration: number): Promise<void> Performs a swipe gesture from one point to another. ### hold() Signature: hold(x: number, y: number, duration: number): Promise<void> Performs a long press at the specified coordinates. ### doubleTap() Signature: doubleTap(x: number, y: number, interval: number): Promise<void> Performs a double tap at the specified coordinates. ### multiTap() Signature: multiTap(sequence: MultiTapSequenceItem[]): Promise<void> Performs multiple taps in sequence with configurable delays. ### swipePoly() Signature: swipePoly(startX: number, startY: number, sequence: {x: number, y: number, duration?: number}[], duration: number, bezier?: boolean): Promise<void> Performs a multi-point swipe through a sequence of coordinates. Each point can optionally specify its own segment duration for fine-grained timing control. Content: Access these methods through agent.actions . All touch methods are asynchronous and return Promises. Touch Actions Touch gestures for device interaction tap Performs a single tap at the specified screen coordinates. swipe Performs a swipe gesture from one point to another. hold Performs a long press at the specified coordinates. 
doubleTap: Performs a double tap at the specified coordinates.
multiTap: Performs multiple taps in sequence with configurable delays.
swipePoly: Performs a multi-point swipe through a sequence of coordinates. Each point can optionally specify its own segment duration for fine-grained timing control.
--------------------------------------------------------------------------------
## Reference / Navigation Actions
Path: /docs/automation/reference/agent/actions/navigation
Description: Device navigation and system button actions
Methods:
### goHome()
Signature: goHome(): Promise<void>
Returns to the home screen.
### goBack()
Signature: goBack(): Promise<void>
Presses the system back button.
### recents()
Signature: recents(): Promise<void>
Opens the recent apps screen.
### dpad()
Signature: dpad(direction: "up" | "down" | "left" | "right" | "center"): Promise<void>
Sends a D-pad navigation event. Useful for navigating lists and menus. Requires Android 13+ (SDK level 33+).
Content: Access these methods through agent.actions. Navigate between screens and interact with system buttons. Note: dpad() requires Android 13+ (SDK level 33+). Check the device SDK version using agent.info.getDeviceInfo().sdkVersion
Navigation Actions: Device navigation and system button actions
goHome: Returns to the home screen.
goBack: Presses the system back button.
recents: Opens the recent apps screen.
dpad: Sends a D-pad navigation event. Useful for navigating lists and menus. Requires Android 13+ (SDK level 33+).
--------------------------------------------------------------------------------
## Reference / Text Input Actions
Path: /docs/automation/reference/agent/actions/text
Description: Keyboard input and clipboard operations
Methods:
### writeText()
Signature: writeText(text: string): Promise<void>
Types text using keyboard input. The keyboard must be visible.
### copyText()
Signature: copyText(text: string): Promise<void>
Copies the specified text to the system clipboard.
### paste()
Signature: paste(): Promise<void>
Pastes content from the clipboard at the current cursor position.
### reverseCopy()
Signature: reverseCopy(): Promise<{text: string, data?: any, files?: {uri: string, mimeType: string, name: string, dataBase64: string}[]}>
Gets the current clipboard content including text, data, and files.
### hideKeyboard()
Signature: hideKeyboard(): Promise<void>
Hides the software keyboard if it is visible.
### inputKey()
Signature: inputKey(keyCode: number, duration?: number, state?: "down" | "up" | null): Promise<void>
Sends a raw key event by Android KeyEvent code. Requires Android 13+ (SDK level 33+) and only works when the on-screen keyboard is visible.
Content: Access these methods through agent.actions. Type text, manage clipboard, and handle keyboard interactions. Note: inputKey() requires Android 13+ (SDK level 33+) and will only work when the on-screen keyboard is visible. Tap on an input field first to show the keyboard.
Text Input Actions: Keyboard input and clipboard operations
writeText: Types text using keyboard input. The keyboard must be visible.
copyText: Copies the specified text to the system clipboard.
paste: Pastes content from the clipboard at the current cursor position.
reverseCopy: Gets the current clipboard content including text, data, and files.
hideKeyboard: Hides the software keyboard if it is visible.
inputKey: Sends a raw key event by Android KeyEvent code. Requires Android 13+ (SDK level 33+) and only works when the on-screen keyboard is visible.
--------------------------------------------------------------------------------
## Reference / App Management
Path: /docs/automation/reference/agent/actions/apps
Description: Launch apps, manage intents, and browse URLs
Methods:
### launchApp()
Signature: launchApp(packageName: string, clearExisting?: boolean): Promise<void>
Launches an app by its package name.
### launchIntent()
Signature: launchIntent(intentName: string, packageName: string | null, data: string | null, type: string | null, extras: object | null, flags: number, component:
Launches an Android Intent with full configuration options.
### listApps()
Signature: listApps(): Promise<{[packageName: string]: string}>
Gets a list of all installed apps.
### browse()
Signature: browse(url: string, clearExistingData?: boolean): Promise<void>
Opens a URL in the default browser.
Content: Access these methods through agent.actions. Launch and manage applications on the device.
App Management: Launch apps, manage intents, and browse URLs
launchApp: Launches an app by its package name.
launchIntent: Launches an Android Intent with full configuration options.
listApps: Gets a list of all installed apps.
browse: Opens a URL in the default browser.
--------------------------------------------------------------------------------
## Reference / Screen Actions
Path: /docs/automation/reference/agent/actions/screen
Description: Screen content, screenshots, and node interactions
Methods:
### screenContent()
Signature: screenContent(): Promise<AndroidNode>
Gets the accessibility tree of the currently focused window. Returns an AndroidNode representing the root of the UI hierarchy.
### allScreensContent()
Signature: allScreensContent(): Promise<AndroidNode[]>
Gets the accessibility trees from all visible windows (useful for dialogs, overlays).
### screenshot()
Signature: screenshot(maxWidth: number, maxHeight: number, quality: number, cropX1?: number, cropY1?: number, cropX2?: number, cropY2?: number): Promise<{screenshot: string | null, compressedWidth: number, compressedHeight: number, originalWidth: number, originalHeight: number}>
Takes a screenshot with optional scaling and cropping.
### nodeAction()
Signature: nodeAction(node: AndroidNode | object, actionInt: number, data?: object, fieldsToIgnore?: string[]): Promise<{actionPerformed: boolean}>
Performs an accessibility action on a node.
### showNotification()
Signature: showNotification(title: string, message: string): Promise<void>
Shows a system notification.
Content: Access these methods through agent.actions. Get screen content, take screenshots, and interact with UI nodes.
Screen Actions: Screen content, screenshots, and node interactions
screenContent: Gets the accessibility tree of the currently focused window. Returns an AndroidNode representing the root of the UI hierarchy.
allScreensContent: Gets the accessibility trees from all visible windows (useful for dialogs, overlays).
screenshot: Takes a screenshot with optional scaling and cropping.
nodeAction: Performs an accessibility action on a node.
showNotification: Shows a system notification.
--------------------------------------------------------------------------------
## Reference / File Actions
Path: /docs/automation/reference/agent/actions/files
Description: Save files to the device
Methods:
### saveFile()
Signature: saveFile(fileName: string, data: string, base64?: boolean): Promise<void>
Saves data to a file in the Downloads folder.
Content: Access these methods through agent.actions. Save data to files on the device. Note: For additional file operations, see Utils > Files.

```typescript
async function exportResults(results: any[]) {
  // Save as JSON
  await agent.actions.saveFile(
    "results.json",
    JSON.stringify(results, null, 2)
  );

  // Save as CSV
  const headers = Object.keys(results[0]);
  const csvRows = [
    headers.join(","),
    ...results.map(r => headers.map(h => r[h]).join(","))
  ];
  await agent.actions.saveFile("results.csv", csvRows.join("\n"));

  // Take a confirmation screenshot
  const { screenshot } = await agent.actions.screenshot(1080, 1920, 80);
  if (screenshot) {
    await agent.actions.saveFile("confirmation.jpg", screenshot, true);
  }

  console.log("Results exported to Downloads folder");
}
```

File Actions: Save files to the device
saveFile: Saves data to a file in the Downloads folder.
Complete Example: Export Automation Results -------------------------------------------------------------------------------- ## Reference / Network Actions Path: /docs/automation/reference/agent/actions/network Description: Network and connectivity operations Methods: ### airplane() Signature: airplane(): Promise<void> Toggles airplane mode on and off to refresh the mobile network connection. This method turns airplane mode ON, waits briefly, then turns it OFF. Primarily used for changing IP address when connected to a mobile/cellular network. Sections: - Important Notes Content: Access these methods through agent.actions . Manage network connectivity and IP address changes. Important Notes Only works when the device is connected to a mobile/cellular network Wi-Fi connections are not affected by this method The new IP address is assigned by your mobile carrier There may be a brief period of no connectivity during the toggle // When rate limited, get new IP and continue async function handleRateLimit() { console.log("Rate limited, refreshing IP..."); await agent.actions.airplane(); // Wait for network to fully reconnect const status = await agent.utils.isServerReachable(); if (!status.reachable) { await agent.control.wait(3000); } console.log("IP refreshed, continuing automation"); } Network Actions Network and connectivity operations airplane Toggles airplane mode on and off to refresh the mobile network connection. This method turns airplane mode ON, waits briefly, then turns it OFF. Primarily used for changing IP address when connected to a mobile/cellular network. Use Case: Rate Limit Bypass -------------------------------------------------------------------------------- ## Reference / Image Recognition Path: /docs/automation/reference/agent/actions/recognition Description: OCR and image analysis capabilities Methods: ### recognizeText() Signature: recognizeText(imageBase64: string): Promise<TextJSON> Performs OCR on an image using ML Kit. 
Sections:
- Return Types
Content: Access these methods through agent.actions. Extract text and analyze images using ML Kit.
Return Types

```typescript
interface TextJSON {
  text: string;            // Complete recognized text
  textBlocks: TextBlock[]; // Array of text blocks
}

interface TextBlock {
  text: string;
  boundingBox: BoundingBox;
  cornerPoints: Point[];
  recognizedLanguages: string[];
  lines: TextLine[];
}

interface TextLine {
  text: string;
  boundingBox: BoundingBox;
  cornerPoints: Point[];
  recognizedLanguages: string[];
  elements: TextElement[];
  confidence: number;
  angle: number;
}

interface TextElement {
  text: string;
  boundingBox: BoundingBox;
  cornerPoints: Point[];
  recognizedLanguages: string[];
  symbols: TextSymbol[];
  confidence: number;
  angle: number;
}

interface TextSymbol {
  text: string;
  boundingBox: BoundingBox;
  cornerPoints: Point[];
  confidence: number;
  angle: number;
}

interface BoundingBox {
  left: number;
  top: number;
  right: number;
  bottom: number;
}
```

Image Recognition: OCR and image analysis capabilities
recognizeText: Performs OCR on an image using ML Kit.
TextJSON: Root-level OCR result containing all recognized text.
TextBlock: A block of text, typically a paragraph.
TextLine: A line of text within a block.
TextElement: Individual text element (usually a word).
TextSymbol: Individual character/symbol.
BoundingBox
--------------------------------------------------------------------------------
## Reference / AgentUtils
Path: /docs/automation/reference/agent/utils
Description: Utility functions, job management, and file operations
Sections:
- Utility Categories
- Quick Examples
Content: Access these methods through agent.utils. Includes random gesture helpers, event callbacks, job task management, server connectivity, and comprehensive file operations.
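submitTask() and related job methods take file attachments shaped as { name, extension, base64Data }. The sketch below builds such an entry from text content; `toFileEntry` is an illustrative helper (not part of the Xgodo API), and Buffer comes from Node's standard library, which may differ from the automation runtime's environment.

```typescript
// Attachment shape documented for submitTask()'s files parameter.
interface FileEntry {
  name: string;
  extension: string;
  base64Data: string;
}

// Build a file entry by base64-encoding plain text content.
function toFileEntry(name: string, extension: string, content: string): FileEntry {
  return {
    name,
    extension,
    base64Data: Buffer.from(content, "utf8").toString("base64"),
  };
}

const entry = toFileEntry("results", "json", JSON.stringify({ ok: true }));
console.log(entry.name, entry.extension); // results json
// later: await agent.utils.job.submitTask("success", data, true, [entry]);
```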
Utility Categories Note: For notification events, see agent.notifications. Quick Examples interface AgentUtils { // Gesture helpers randomClick(x1: number, y1: number, x2: number, y2: number): void; randomSwipe(x1: number, y1: number, x2: number, y2: number, direction: Direction): void; // Server connectivity isServerReachable(): Promise<{ reachable: true } | { reachable: false; error: string }>; // Event callbacks (for notifications, see agent.notifications) setNetworkCallback(callback: NetworkCallback | null): void; toastCallback: ToastCallback | null; // Step tracking & debugging outOfSteps: { storeScreen(screen: AndroidNode, stage: string, screenState: string, remainingSteps: number, screenshotRecord: ScreenshotRecord): Promise; submit(type: "outOfSteps" | "timeout" | "debug"): Promise; }; // Job task management job: { submitTask(status: AutomationStatus, data: Record, finish: boolean, files: File[]): Promise; useAnotherTask(): Promise; getCurrentTask(): Promise; }; // Bucket storage (device+job scoped) bucket: { get(): Promise; set(data: Partial): Promise; }; // Device bucket storage (device-only scoped) deviceBucket: { get(): Promise; set(data: Record): Promise; }; // File operations files: AgentFiles; } // Random tap within a button area (more human-like) const button = screen.findTextOne("Submit"); if (button) { button.randomClick(); } // Get current task and submit results const task = await agent.utils.job.getCurrentTask(); if (task.success) { const proof = task.job_proof; // ... perform automation ...
await agent.utils.job.submitTask( "success", { orderId: "12345", completed: true }, true, // Final submission [] ); } // Read and process files const files = agent.utils.files.list("/sdcard/Download"); for (const file of files) { if (file.name.endsWith(".json")) { const content = agent.utils.files.readFullFile(file.path); const data = JSON.parse(content); console.log(data); } } AgentUtils Utility functions, job management, and file operations AgentUtils Interface Overview Helper Utilities randomClick, randomSwipe, isServerReachable Event Callbacks setNetworkCallback, toastCallback Job Task Management submitTask, useAnotherTask, getCurrentTask Step Tracking & Debugging storeScreen, submit - debug automation failures Bucket Storage get, set - device+job scoped persistent storage Device Bucket Storage get, set - device-scoped persistent storage across all jobs File Operations exists, readFullFile, list, getHashes, and more Human-like Interactions Job Task Processing File Operations -------------------------------------------------------------------------------- ## Reference / Helper Utilities Path: /docs/automation/reference/agent/utils/helpers Description: Random gesture helpers, server connectivity, and node waiting utilities Methods: ### randomClick() Signature: randomClick(x1: number, y1: number, x2: number, y2: number): void Performs a tap at a random position within the specified rectangle. Useful for making automations appear more human-like. ### randomSwipe() Signature: randomSwipe(x1: number, y1: number, x2: number, y2: number, direction: Direction): void Performs a swipe starting from a random position within the rectangle, moving in the specified direction. ### isServerReachable() Signature: isServerReachable(): Promise<{ reachable: true } | { reachable: false; error: string }> Checks if the server is reachable. Useful for verifying connectivity before performing server-dependent operations.
### waitForNode() Signature: waitForNode(condition: (node: AndroidNode) => boolean, durationMs?: number, intervalMs?: number): Promise<boolean> Waits for a node matching the condition to appear on screen. Polls the screen at regular intervals until the condition is met or timeout is reached. ### waitForNodeGone() Signature: waitForNodeGone(condition: (node: AndroidNode) => boolean, durationMs?: number, intervalMs?: number): Promise<boolean> Waits for a node matching the condition to disappear from screen. Polls the screen at regular intervals until the node is gone or timeout is reached. Content: Access these methods through agent.utils . Utility methods for human-like interactions, server connectivity, and waiting for UI elements. Helper Utilities Random gesture helpers, server connectivity, and node waiting utilities randomClick Performs a tap at a random position within the specified rectangle. Useful for making automations appear more human-like. randomSwipe Performs a swipe starting from a random position within the rectangle, moving in the specified direction. isServerReachable Checks if the server is reachable. Useful for verifying connectivity before performing server-dependent operations. waitForNode Waits for a node matching the condition to appear on screen. Polls the screen at regular intervals until the condition is met or timeout is reached. waitForNodeGone Waits for a node matching the condition to disappear from screen. Polls the screen at regular intervals until the node is gone or timeout is reached. -------------------------------------------------------------------------------- ## Reference / Event Callbacks Path: /docs/automation/reference/agent/utils/callbacks Description: Network and toast event handlers Methods: ### setNetworkCallback() Signature: setNetworkCallback(callback: NetworkCallback | null): void Registers a callback to receive network state changes. Pass null to unregister. 
Sections:
- Callback Types
Content: Access these methods through agent.utils. Register callbacks to receive system events during automation. Note: For app notification events, see agent.notifications.
toastCallback: Set this property to receive toast messages shown by apps.
Callback Types

```typescript
agent.utils.toastCallback = (packageName, data) => {
  console.log("Toast from", packageName + ":", data.message);
};

type NetworkCallback = (networkAvailable: boolean) => void;

type ToastCallback = (
  packageName: string,
  data: { message: string }
) => void;
```

Event Callbacks: Network and toast event handlers
setNetworkCallback: Registers a callback to receive network state changes. Pass null to unregister.
Example NetworkCallback ToastCallback
--------------------------------------------------------------------------------
## Reference / Job Task Management
Path: /docs/automation/reference/agent/utils/job
Description: Manage job tasks, submit results, and request new tasks
Methods:
### submitTask()
Signature: submitTask(automationStatus: "running" | "success" | "failed" | "declined", data: Record, finish: boolean, files: { name: string; extension: string; base64Data: string }[]): Promise<{ success: false; error: string } | { success: true }>
Submits the current task result to the server. Use this to report progress, success, or failure of job tasks. Note: Once you call submitTask with finish=true, you cannot submit again for the same task. The files parameter is ignored when finish=false.
### useAnotherTask()
Signature: useAnotherTask(): Promise<{ job_task_id: string; job_proof: string } | null>
Accesses another task
### getCurrentTask()
Signature: getCurrentTask(): Promise<{ success: false; error: string } | { success: true; parent_task_id: string; job_proof: any, timeout: number }>
Gets information about the currently assigned task, including the parent task ID and job proof data.
### getSubTasks()
Signature: getSubTasks(planned_task_ids: string[]): Promise<...>
Gets the status and details of sub-tasks by their planned task IDs. Returns a map keyed by planned_task_id. Only returns data for planned tasks whose parent_task_id matches the current task.
Status is ### setAutomationVariables() Signature: setAutomationVariables(variables: object): Promise<{ success: false; error: string } | { success: true; automation_variables: any }> Sets automation variables for the current job task. These variables can be used to coordinate between tasks in the same job. Set { waiting: true } to make the current task ### getAutomationVariables() Signature: getAutomationVariables(): Promise<{ success: false; error: string } | { success: true; automation_variables: any }> Retrieves the automation variables previously set for the current job task. Use this to check the current state of task coordination variables. Sections: - Automation Variables Content: Access these methods through agent.utils.job. Used for managing job tasks, submitting results, and requesting new tasks. addSubTasks parameter job_variables: Array of job variable objects. Each object becomes the input for one sub-task.
Uses JobVariables type (matching the automation's schema) when targeting the current job, or generic Record Automation Variables agent.utils interface JobUtils { submitTask( automationStatus: "running" | "success" | "failed" | "declined", data: Record, finish: boolean, files: { name: string; extension: string; base64Data: string }[] ): Promise<{ success: false; error: string } | { success: true }>; submitTaskToAnotherJob( job_id: string, data: Record, status?: "pending" | "failed" | "declined", files?: { name: string; extension: string; base64Data: string }[] ): Promise< { success: false; error: string } | { success: true; job_task_id: string; job_id: string } >; useAnotherTask(): Promise<{ job_task_id: string; job_proof: string } | null>; getCurrentTask(): Promise< { success: false; error: string } | { success: true; parent_task_id: string; job_proof: any, timeout: number } >; addSubTasks( job_variables: JobVariables[], run_immediately?: boolean ): Promise< { success: false; error: string } | { success: true; inserted_ids: { planned_task_id: string; input: string }[]; assignment_results?: { planned_task_id: string; assigned: boolean }[]; } >; addSubTasks( job_variables: Record[], run_immediately: boolean, job_id: string ): Promise< { success: false; error: string } | { success: true; inserted_ids: { planned_task_id: string; input: string }[]; assignment_results?: { planned_task_id: string; assigned: boolean }[]; } >; addSubTasks(options: { job_variables: Record[]; run_immediately?: boolean; job_id?: string; remote_device_id?: string; }): Promise< { success: false; error: string } | { success: true; inserted_ids: { planned_task_id: string; input: string }[]; assignment_results?: { planned_task_id: string; assigned: boolean }[]; } >; getSubTasks(planned_task_ids: string[]): Promise< { success: false; error: string } | { success: true; tasks: Record; } >; } // On agent.utils (for task coordination) interface AgentUtils { setAutomationVariables(variables: object): 
Promise< { success: false; error: string } | { success: true; automation_variables: any } >; getAutomationVariables(): Promise< { success: false; error: string } | { success: true; automation_variables: any } >; } Job Task Management Manage job tasks, submit results, and request new tasks -------------------------------------------------------------------------------- ## Reference / Step Tracking & Debugging Path: /docs/automation/reference/agent/utils/out-of-steps Description: Track automation progress and debug failures Methods: ### storeScreen() Signature: storeScreen(screen: AndroidNode, stage: string, screenState: string, remainingSteps: number, screenshotRecord: ScreenshotRecord): Promise<void> Stores the current screen state for debugging purposes. Call this periodically during automation to track progress and help diagnose issues when automations fail.
### submit() Signature: submit(type: "outOfSteps" | "timeout" | "debug"): Promise<{ success: false; error: string } | { success: true; id: string }> Submits the collected screen states for analysis. Call this when the automation ends unexpectedly or for debugging. Sections: - Supporting Types - Usage Pattern Content: Access step tracking through agent.utils.outOfSteps. Used for tracking automation progress and debugging when automations run out of steps or timeout. Supporting Types Usage Pattern interface OutOfStepsUtils { storeScreen( screen: AndroidNode, stage: string, screenState: string, remainingSteps: number, screenshotRecord: ScreenshotRecord ): Promise; submit( type: "outOfSteps" | "timeout" | "debug" ): Promise<{ success: false; error: string } | { success: true; id: string }>; } enum ScreenshotRecord { HIGH_QUALITY, // Full quality screenshot LOW_QUALITY, // Compressed screenshot (faster, smaller) NONE // No screenshot } async function runAutomation() { let remainingSteps = 100; while (remainingSteps > 0) { const screen = await agent.actions.screenContent(); // Store screen state periodically await agent.utils.outOfSteps.storeScreen( screen, getCurrentStage(), describeScreen(screen), remainingSteps, ScreenshotRecord.LOW_QUALITY ); // Perform automation step const result = await performStep(screen); if (!result.success) { // Submit debug data on failure await agent.utils.outOfSteps.submit("outOfSteps"); throw new Error("Automation failed"); } remainingSteps--; } // Submit on timeout if loop exits without success await agent.utils.outOfSteps.submit("timeout"); } Step Tracking & Debugging Track automation progress and debug failures ScreenshotRecord Enum for screenshot quality settings.
Complete Debugging Flow -------------------------------------------------------------------------------- ## Reference / File Operations Path: /docs/automation/reference/agent/utils/files Description: Read, write, list, upload, and manage files on the device Methods: ### exists() Signature: exists(path: string): boolean Checks if a file or directory exists at the specified path. ### getSize() Signature: getSize(filePath: string): number Gets the size of a file in bytes. ### getStorageRoot() Signature: getStorageRoot(): string Gets the root storage path for the device. ### readFullFile() Signature: readFullFile(filePath: string): string Reads the entire file content as UTF-8 text. ### readFullFileBase64() Signature: readFullFileBase64(filePath: string): string Reads the entire file content as a Base64-encoded string. Useful for binary files. ### readFileAsBlob() Signature: readFileAsBlob(filePath: string, mimeType?: string): Blob | null Reads a file directly as a Blob object. ### openStream() Signature: openStream(filePath: string): string Opens a file stream for reading large files in chunks. ### readChunk() Signature: readChunk(streamId: string, chunkSize: number): number[] Reads a chunk of data from an open file stream. ### closeStream() Signature: closeStream(streamId: string): void Closes an open file stream. ### list() Signature: list(dirPath: string): FileInfo[] Lists all files and directories in the specified directory. ### getPathInfo() Signature: getPathInfo(path: string): DirectoryInfo | FilePathInfo | PathNotFoundInfo Gets detailed information about a path, including whether it is a file, a directory, or does not exist. ### deleteFile() Signature: deleteFile(path: string): boolean Deletes a file at the specified path. Only works on files, not directories. Returns false if the path does not exist or is not a file. ### deleteDir() Signature: deleteDir(path: string): boolean Deletes a directory and all its contents recursively. Only works on directories, not files.
Returns false if the path does not exist or is not a directory. ### rename() Signature: rename(oldPath: string, newPath: string): boolean Renames or moves a file or directory from oldPath to newPath. Parent directories for the new path are created automatically. Fails if the source does not exist. ### getDirPath() Signature: getDirPath(type: string): string Gets the external public directory path for a given type. ### startDownload() Signature: startDownload(url: string, localPath: string, options?: DownloadRequestOptions): string Starts downloading a file from a URL to a local path on the device. Returns a unique download ID that can be used to track progress with getDownloadStatus. Parent directories are created automatically. Supports resume on retry. Optionally specify HTTP method, headers, and body. ### getDownloadStatus() Signature: getDownloadStatus(id: string): DownloadStatusInfo Returns the current status of a download. Status can be "downloading", "success", or "failed". ### retryDownload() Signature: retryDownload(id: string): boolean Retries a failed download. Only works if the download is in failed state. The download resumes from where it left off if the server supports Range requests. ### fetch2() Signature: fetch2(url: string, options?: Fetch2Options): Promise<{ success: true; content: string | Blob | Uint8Array; size: number } | { success: false; error: string }> Downloads a file from a URL and returns its content directly. Internally downloads to a temporary file, reads the content, then deletes the temp file. On failure, automatically retries using resume-capable retryDownload. ### getHashes() Signature: getHashes(filePath: string): FileHashes | FileHashError Calculates MD5, SHA-1, and SHA-256 hashes for a file. ### uploadTempFile() Signature: uploadTempFile(filename: string, base64Data: string): Promise<UploadTempFileResult | { success: false; error: string }> Uploads a file to the server as a temporary file. The file will be automatically deleted after 15 minutes. This overload accepts base64-encoded file data.
### uploadTempFile (local file)() Signature: uploadTempFile(localFilePath: string): Promise<UploadTempFileResult | { success: false; error: string }> Uploads a local file from the device to the server as a temporary file. The file will be automatically deleted after 15 minutes. This overload reads the file in chunks to handle large files efficiently. ### base64ToBytes() Signature: base64ToBytes(base64: string): Uint8Array Converts a Base64-encoded string to a Uint8Array. ### bytesToBlob() Signature: bytesToBlob(bytes: number[] | Uint8Array, mimeType?: string): Blob | null Converts a byte array to a Blob object. Sections: - Basic Operations - Reading Files - Streaming Large Files - Directory Operations - File Management - File Download - Download & Read - File Integrity - File Upload - Data Conversion - Types Content: Access file operations through agent.utils.files agent.utils.uploadTempFile ; uploadTempFile(localFilePath: string): Promise Basic Operations Reading Files Streaming Large Files Directory Operations File Management File Download Download & Read File Integrity File Upload Data Conversion Types interface AgentFiles { exists(path: string): boolean; getSize(filePath: string): number; readFullFileBase64(filePath: string): string; readFullFile(filePath: string): string; openStream(filePath: string): string; readChunk(streamId: string, chunkSize: number): number[]; closeStream(streamId: string): void; list(dirPath: string): FileInfo[]; getPathInfo(path: string): DirectoryInfo | FilePathInfo | PathNotFoundInfo; getStorageRoot(): string; getHashes(filePath: string): FileHashes | FileHashError; deleteFile(path: string): boolean; // Since 2.138 deleteDir(path: string): boolean; // Since 2.138 rename(oldPath: string, newPath: string): boolean; // Since 2.138 getDirPath(type: string): string; // Since 2.138 startDownload(url: string, localPath: string, options?: DownloadRequestOptions): string; // Since 2.138 getDownloadStatus(id: string): DownloadStatusInfo; // 
Since 2.138 retryDownload(id: string): boolean; // Since 2.138 base64ToBytes(base64: string): Uint8Array; bytesToBlob(bytes: number[] | Uint8Array, mimeType?: string): Blob | null; readFileAsBlob(filePath: string, mimeType?: string): Blob | null; } // On agent.utils (for file upload) interface AgentUtils { uploadTempFile(filename: string, base64Data: string): Promise; uploadTempFile(localFilePath: string): Promise; } interface DownloadStatusInfo { id: string; // Unique download ID status: "downloading" | "success" | "failed"; error: string | null; // Error message if failed bytesDownloaded: number; // Bytes downloaded so far totalBytes: number; // Total size (-1 if unknown) fileSize: number; // Final file size on success (-1 otherwise) filePath: string | null; // Absolute path on success } interface DownloadRequestOptions { method?: string; // HTTP method (default: "GET") headers?: Record; // HTTP headers body?: string; // Request body (for POST, PUT, etc.) } interface Fetch2Options { readAs?: "text" | "base64" | "blob" | "bytes"; // Default: "text" (UTF-8) timeoutMs?: number; // Max wait time in ms (default: 120000) maxRetries?: number; // Retry attempts on failure (default: 2) method?: string; // HTTP method (default: "GET") headers?: Record; // HTTP headers body?: string; // Request body (for POST, PUT, etc.) 
} interface FileInfo { name: string; // File or directory name path: string; // Full path isDirectory: boolean; isFile: boolean; size: number; // Size in bytes (0 for directories) lastModified: number; // Timestamp } interface DirectoryInfo { exists: true; path: string; name: string; isDirectory: true; isFile: false; lastModified: number; canRead: boolean; canWrite: boolean; fileCount: number; // Number of files directoryCount: number; // Number of subdirectories totalItems: number; // Total items } interface FilePathInfo { exists: true; path: string; name: string; isDirectory: false; isFile: true; lastModified: number; canRead: boolean; canWrite: boolean; size: number; } interface PathNotFoundInfo { exists: false; error?: string; } interface FileHashes { md5: string; sha1: string; sha256: string; size: number; } interface UploadTempFileResult { success: true; message: string; // "File uploaded successfully" data: { filename: string; // Server-assigned filename (e.g., "1234567890_example.pdf") originalName: string; // Original filename provided size: number; // File size in bytes url: string; // Full URL to access the file expiresAt: string; // ISO 8601 expiration timestamp (15 minutes from upload) }; } File Operations Read, write, list, upload, and manage files on the device
DownloadStatusInfo Status information for a file download. DownloadRequestOptions Optional HTTP request options for startDownload. Fetch2Options Options for the fetch2 utility method.
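The startDownload/getDownloadStatus pairing amounts to a bounded polling loop. A minimal sketch follows, where the mock `getDownloadStatus` simulates a download that finishes on the third poll; the field names follow the DownloadStatusInfo type, while the sample values and file path are invented:

```typescript
// Hedged sketch: mock status source following the DownloadStatusInfo shape.
interface DownloadStatusInfo {
  id: string;
  status: "downloading" | "success" | "failed";
  error: string | null;
  bytesDownloaded: number;
  totalBytes: number;
  fileSize: number;
  filePath: string | null;
}

let polls = 0;
function getDownloadStatus(id: string): DownloadStatusInfo {
  polls++;
  const done = polls >= 3; // simulate completion on the third poll
  return {
    id,
    status: done ? "success" : "downloading",
    error: null,
    bytesDownloaded: done ? 1024 : polls * 256,
    totalBytes: 1024,
    fileSize: done ? 1024 : -1,
    filePath: done ? "/sdcard/Download/file.bin" : null, // illustrative path
  };
}

// Poll until the download leaves the "downloading" state (bounded attempts).
function waitForDownload(id: string, maxPolls: number): DownloadStatusInfo {
  let info = getDownloadStatus(id);
  while (info.status === "downloading" && --maxPolls > 0) {
    info = getDownloadStatus(id); // real code would sleep between polls
  }
  return info;
}
```

On a "failed" result, real code could call retryDownload(id) and resume polling, since downloads resume where they left off when the server supports Range requests.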
UploadTempFileResult Result returned when a file is successfully uploaded. -------------------------------------------------------------------------------- ## Reference / AgentInfo Path: /docs/automation/reference/agent/info Description: Get automation and device metadata Methods: ### getAutomationInfo() Signature: getAutomationInfo(): AutomationInfo Returns metadata about the currently running automation. ### getDeviceInfo() Signature: getDeviceInfo(): DeviceInfo Returns information about the device hardware and configuration. ### getPhoneNumber() Signature: getPhoneNumber(): string | null Returns the phone number read from the device's SIM, or null if it is unavailable. ### getVerifiedPhoneNumber() Signature: getVerifiedPhoneNumber(): Promise Returns the phone number stored on the device record on the server, only if it has been marked verified there. Resolves to null when the device has no verified phone number, the agent token is missing (i.e. the automation is not running through a job/agent), or the request fails. Differs from getPhoneNumber, which reads the SIM directly on the device. Backed by GET /api/v2/devices/verified-phone-number. Sections: - Methods - Types - Usage Example Content: Access these methods through agent.info. Provides information about the current automation and the device it's running on.
Methods Types Usage Example interface AgentInfo { getAutomationInfo(): AutomationInfo; getDeviceInfo(): DeviceInfo; getPhoneNumber(): string | null; // Since 2.138 getVerifiedPhoneNumber(): Promise; } interface AutomationInfo { name: string; // Automation name description: string; // Automation description launchId: string; // Unique ID for this launch agent?: { // Optional agent details id: string; commitId: string; token: string; jobTaskId: string; }; serverBaseUrl: string; // Server URL for API calls timeout?: number; // Optional timeout in minutes; available since app v2.123 (135) } interface DeviceInfo { id: string; // Unique device ID brand: string; // Device brand (e.g., "Samsung") model: string; // Device model (e.g., "SM-G991B") sdkVersion: number; // Android SDK version (e.g., 33) processor: string; // CPU info numberOfCores: number; // CPU core count ramMb: number; // RAM in megabytes country: string; // Device country isEmulator: boolean; // true if running on emulator width: number; // Screen width in pixels height: number; // Screen height in pixels } // Adapt automation to device capabilities const device = agent.info.getDeviceInfo(); // Calculate center of screen const centerX = device.width / 2; const centerY = device.height / 2; // Check Android version for feature availability if (device.sdkVersion >= 33) { // Use Android 13+ features } // Log automation context const automation = agent.info.getAutomationInfo(); console.log(`Running "${automation.name}" on ${device.brand} ${device.model}`); AgentInfo Get automation and device metadata AgentInfo Interface getAutomationInfo Returns metadata about the currently running automation. getDeviceInfo Returns information about the device hardware and configuration. getPhoneNumber Returns the phone number of the device. getVerifiedPhoneNumber Returns the phone number stored on the device record on the server, only if it has been marked verified there.
Resolves to null when the device has no verified phone number, the agent token is missing (i.e. the automation is not running through a job/agent), or the request fails. Differs from getPhoneNumber, which reads the SIM directly on the device. Backed by GET /api/v2/devices/verified-phone-number. AutomationInfo Contains metadata about the running automation. AutomationInfo Properties DeviceInfo Contains hardware and configuration information about the device. DeviceInfo Properties -------------------------------------------------------------------------------- ## Reference / AgentControl Path: /docs/automation/reference/agent/control Description: Control automation execution Methods: ### stopCurrentAutomation() Signature: stopCurrentAutomation(): void Immediately stops the current automation execution. Use this to gracefully terminate an automation when a condition is met or an error occurs. Sections: - Methods - Note Content: Access control methods through agent.control . Provides methods to control the automation execution flow. Methods Note When stopCurrentAutomation() is called, any code after the call will not execute. The automation terminates immediately. interface AgentControl { stopCurrentAutomation(): void; } AgentControl Control automation execution AgentControl Interface stopCurrentAutomation Immediately stops the current automation execution. Use this to gracefully terminate an automation when a condition is met or an error occurs. -------------------------------------------------------------------------------- ## Reference / AgentDisplay Path: /docs/automation/reference/agent/display Description: Display HTML overlays on screen Methods: ### displayHTMLCode() Signature: displayHTMLCode(htmlCode: string, x1: number, y1: number, x2: number, y2: number, opacity: number): void Displays an HTML overlay on the screen within the specified rectangle. ### hideHTMLCode() Signature: hideHTMLCode(): void Hides any currently displayed HTML overlay. 
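A typical displayHTMLCode use is a status banner built from a plain HTML string. The helper below is a hypothetical sketch: its styling, and the coordinates in the commented call, are assumptions rather than part of the documented API:

```typescript
// Hedged sketch: build overlay HTML for a status banner with a progress bar.
function statusOverlayHtml(message: string, percent: number): string {
  const pct = Math.max(0, Math.min(100, Math.round(percent))); // clamp to 0-100
  return (
    `<div style="background:#222;color:#fff;padding:8px;font-family:sans-serif">` +
    `<p>${message}</p>` +
    `<div style="background:#444"><div style="background:#4caf50;height:6px;width:${pct}%"></div></div>` +
    `</div>`
  );
}

// On a device this would render the banner in a strip at the top of the screen
// (coordinates and opacity below are illustrative):
// agent.display.displayHTMLCode(statusOverlayHtml("Step 3 of 10", 30), 0, 0, 1080, 160, 0.9);
// ...and later: agent.display.hideHTMLCode();
```

Because overlays do not intercept touch events, a banner like this can stay visible while the automation keeps interacting with the UI underneath.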
Sections: - Methods - Use Cases - Status Display - Progress Indicators - Error Messages - Debug Information - Tip Content: Access these methods through agent.display. Allows you to render HTML content as an overlay on the device screen. Methods Use Cases Status Display Show the current status of the automation to the user, especially for long-running tasks. Progress Indicators Display progress bars or percentage completion for multi-step processes. Error Messages Show error messages or warnings that require user attention. Debug Information Display debug info during development to understand automation behavior. Tip HTML overlays are rendered on top of the screen content but don't intercept touch events. The overlay is purely visual and won't affect automation interactions with the underlying UI. interface AgentDisplay { displayHTMLCode(htmlCode: string, x1: number, y1: number, x2: number, y2: number, opacity: number): void; hideHTMLCode(): void; } AgentDisplay Display HTML overlays on screen AgentDisplay Interface displayHTMLCode Displays an HTML overlay on the screen within the specified rectangle. hideHTMLCode Hides any currently displayed HTML overlay. -------------------------------------------------------------------------------- ## Reference / AgentEmail Path: /docs/automation/reference/agent/email Description: Read emails via IMAP protocol Methods: ### readIMAPEmails() Signature: readIMAPEmails(email: string, password: string, host?: string, port?: number, skip?: number, limit?: number, proxyHost?: string, proxyPort?: number, proxyUser?: string, proxyPassword?: string): Promise Reads emails from an IMAP server. Useful for automations that need to verify email content (e.g., verification codes, confirmation emails). Sections: - Methods - Email Type - Security Note Content: Access these methods through agent.email. Allows reading emails from IMAP-enabled email accounts.
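A common readIMAPEmails pattern is scanning recent messages for a verification code. A hedged sketch: the Email fields mirror the documented type, while the subject filter and the 6-digit pattern are assumptions for illustration:

```typescript
// Hedged sketch: scan Email-shaped records for a verification code.
interface Email {
  subject: string;
  from: string;
  date: number;   // Unix ms
  body: string;
  isHtml: boolean;
}

function findVerificationCode(emails: Email[]): string | null {
  const sorted = [...emails].sort((a, b) => b.date - a.date); // newest first
  for (const mail of sorted) {
    if (!/verif/i.test(mail.subject)) continue;   // assumed subject filter
    const match = mail.body.match(/\b\d{6}\b/);   // first 6-digit run
    if (match) return match[0];
  }
  return null;
}

// On a device the input would come from:
// const emails = await agent.email.readIMAPEmails(address, appPassword);
```

For HTML messages (`isHtml: true`) real code would strip tags before matching; the regex above assumes a plain-text body.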
Methods Email Type Security Note App Password interface AgentEmail { readIMAPEmails( email: string, password: string, host?: string, port?: number, skip?: number, limit?: number, proxyHost?: string, proxyPort?: number, proxyUser?: string, proxyPassword?: string ): Promise; } interface Email { id: string; // Unique email ID subject: string; // Email subject from: string; // Sender email address fromName: string; // Sender display name to: string[]; // Recipients cc: string[]; // CC recipients bcc: string[]; // BCC recipients date: number; // Timestamp (Unix ms) body: string; // Email body content isHtml: boolean; // true if body is HTML isRead: boolean; // Read status hasAttachments: boolean; // Has attachments attachmentNames: string[]; // Attachment file names } AgentEmail Read emails via IMAP protocol AgentEmail Interface readIMAPEmails Reads emails from an IMAP server. Useful for automations that need to verify email content (e.g., verification codes, confirmation emails). Email Represents an email message. Email Properties -------------------------------------------------------------------------------- ## Reference / AgentSMS Path: /docs/automation/reference/agent/sms Description: Read and send SMS messages from the device Methods: ### readSMS() Signature: readSMS(options?: ReadSMSOptions): Promise Reads SMS messages from the device. Useful for automations that need to read verification codes or other SMS content. All options are optional - calling with no arguments returns the 20 most recent messages. ### sendSMS() Signature: sendSMS(phoneNumber: string, message: string): Promise Sends an SMS message from the device. Long messages are automatically split into multiple parts. Requires SEND_SMS permission. Sections: - Methods - Types - Permissions Note Content: agent.sms . Allows reading and sending SMS messages on the device. Requires SMS permissions to be granted. 
Methods Types Permissions Note SMS operations require the READ_SMS and SEND_SMS permissions to be granted on the device. The app will request these permissions automatically during setup. If permissions are denied, the corresponding methods will return an error. interface AgentSMS { readSMS(options?: ReadSMSOptions): Promise; sendSMS(phoneNumber: string, message: string): Promise; } interface ReadSMSOptions { phoneNumber?: string; // Filter by phone number (partial match) limit?: number; // Max messages to return (default: 20) skip?: number; // Messages to skip (default: 0) sortOrder?: "asc" | "desc"; // Sort by date (default: "desc") type?: number; // SMS type: 1=inbox, 2=sent, 3=draft minDate?: number; // Only after this timestamp (Unix ms) maxDate?: number; // Only before this timestamp (Unix ms) } interface SmsMessage { id: string; // Unique message ID address: string; // Phone number body: string; // Message text content date: number; // Timestamp (Unix ms) type: number; // 1=inbox, 2=sent, 3=draft read: boolean; // Has been read seen: boolean; // Has been seen } AgentSMS Read and send SMS messages from the device
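The ReadSMSOptions semantics (partial number match, type filter, sort order, skip/limit) can be sketched as a pure filter over SmsMessage-shaped records. The helper itself and the sample data are illustrative, not part of the API:

```typescript
// Hedged sketch of ReadSMSOptions filtering applied to an in-memory inbox.
interface SmsMessage { id: string; address: string; body: string; date: number; type: number; }
interface ReadSMSOptions { phoneNumber?: string; limit?: number; skip?: number; sortOrder?: "asc" | "desc"; type?: number; }

function filterSms(inbox: SmsMessage[], options: ReadSMSOptions = {}): SmsMessage[] {
  const { phoneNumber, limit = 20, skip = 0, sortOrder = "desc", type } = options;
  return inbox
    .filter((m) => phoneNumber === undefined || m.address.includes(phoneNumber)) // partial match
    .filter((m) => type === undefined || m.type === type)                        // 1=inbox, 2=sent, 3=draft
    .sort((a, b) => (sortOrder === "desc" ? b.date - a.date : a.date - b.date))
    .slice(skip, skip + limit);
}
```

On a device the equivalent call would be `await agent.sms.readSMS({ phoneNumber: "0002", sortOrder: "desc" })`, with the filtering performed for you.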
ReadSMSOptions Options for filtering and paginating SMS messages. ReadSMSOptions Properties SmsMessage Represents an SMS message. SmsMessage Properties -------------------------------------------------------------------------------- ## Reference / AgentNotifications Path: /docs/automation/reference/agent/notifications Description: Handle system notifications Methods: ### setNotificationCallback() Signature: setNotificationCallback(callback: NotificationCallback | null): void Registers a callback to receive system notifications. Pass null to unregister. ### onProcessed() Signature: onProcessed(notificationId: string, shouldOpenNotification: boolean): void Call this after processing a notification to indicate it has been handled. Optionally open the notification. Sections: - Methods - Complete Example - Callback Types - Callback Parameters - Tip Content: Access notification methods through agent.notifications . Register callbacks to receive notifications and process them. Methods Complete Example Callback Types Callback Parameters:
- id (string): Unique notification identifier
- packageName (string): App that sent the notification
- channelId (string): Notification channel ID
- extras (any): Notification extras (title, text, etc.)
Tip Remember to call agent.notifications.setNotificationCallback(null) when you no longer need to receive notifications to avoid memory leaks and unnecessary processing.
interface AgentNotifications { setNotificationCallback(callback: NotificationCallback | null): void; onProcessed(notificationId: string, shouldOpenNotification: boolean): void; } // Track notifications for a specific task const receivedCodes = []; // Set up listener agent.notifications.setNotificationCallback((id, packageName, channelId, extras) => { console.log("=== Notification Received ==="); console.log("Package:", packageName); console.log("Channel:", channelId); console.log("Title:", extras.title); console.log("Text:", extras.text); // Look for OTP codes const text = (extras.title || "") + " " + (extras.text || ""); const otpMatch = text.match(/\b\d{6}\b/); if (otpMatch) { receivedCodes.push({ code: otpMatch[0], from: packageName, time: Date.now() }); } // Mark as processed agent.notifications.onProcessed(id, false); }); // Later, when you need the code: if (receivedCodes.length > 0) { const latestCode = receivedCodes[receivedCodes.length - 1]; console.log("Using code:", latestCode.code); } // Clean up when done agent.notifications.setNotificationCallback(null); type NotificationCallback = ( id: string, packageName: string, channelId: string, extras: { title?: string; text?: string; [key: string]: any; } ) => void; AgentNotifications Handle system notifications AgentNotifications Interface setNotificationCallback Registers a callback to receive system notifications. Pass null to unregister. onProcessed Call this after processing a notification to indicate it has been handled. Optionally open the notification. NotificationCallback -------------------------------------------------------------------------------- ## Reference / AndroidNode Path: /docs/automation/reference/android-node Description: Accessibility tree nodes for UI element interaction Methods: ### allNodes() Signature: allNodes(): AndroidNode[] Returns all nodes in the subtree as a flat array, including this node and all descendants.
### findById() Signature: findById(id: string): AndroidNode[] Finds all nodes with the matching viewId. ### findByIdOne() Signature: findByIdOne(id: string): AndroidNode | null Finds the first node with the matching viewId. ### findText() Signature: findText(textOrRegex: string | RegExp): AndroidNode[] Finds all nodes containing the text in description, text, or hintText (case-insensitive). Supports strings and regular expressions. ### findTextOne() Signature: findTextOne(textOrRegex: string | RegExp): AndroidNode | null Finds the first node containing the text (case-insensitive). ### find() Signature: find(predicate: (node: AndroidNode) => boolean): AndroidNode | null Finds the first node matching a custom predicate function. ### filter() Signature: filter(predicate: (node: AndroidNode) => boolean): AndroidNode[] Filters all nodes matching a custom predicate function. ### findAdvanced() Signature: findAdvanced(filterBuilder: (f: AndroidNodeFilter) => AndroidNodeFilter): AndroidNode | null Finds the first node using the AndroidNodeFilter builder pattern. ### filterAdvanced() Signature: filterAdvanced(filterBuilder: (f: AndroidNodeFilter) => AndroidNodeFilter): AndroidNode[] Filters all nodes using the AndroidNodeFilter builder pattern. ### matches() Signature: matches(filterBuilder: (f: AndroidNodeFilter) => AndroidNodeFilter): boolean Checks if this node matches the given filter. ### toJSON() Signature: toJSON(): object Converts the node to a plain JSON object. Useful for debugging or serialization. ### performAction() Signature: performAction(actionInt: number, data?: object, fieldsToIgnore?: string[]): Promise<{ actionPerformed: boolean }> Performs an accessibility action on this node. This is the preferred way to interact with UI elements as it uses the accessibility system. 
### adbPerformAction() Signature: adbPerformAction(actionInt: number, data?: object, fieldsToIgnore?: string[]): Promise<{ actionPerformed: boolean }> Performs an accessibility action on this node via ADB. This is the ADB equivalent of performAction(). Uses UiAutomation through the ADB touch server instead of the accessibility service. ### randomClick() Signature: randomClick(): void Performs a tap at a random position within this node ### randomSwipe() Signature: randomSwipe(direction: Performs a swipe gesture within this node ### randomClickAdb() Signature: randomClickAdb(): void Performs a tap at a random position within this node ### randomSwipeAdb() Signature: randomSwipeAdb(direction: Performs a swipe gesture within this node Sections: - Properties - Methods - Action Methods - Common Patterns - Identification - Layout - Text Content - State - Collection Info - Hierarchy - Actions Content: AndroidNode represents a node in the Android accessibility tree. Each node corresponds to a UI element on screen and contains properties like text, bounds, and state. Use methods like screenContent() to get the root node and traverse the tree. Properties Identification Layout Text Content State Collection Info Hierarchy Actions Methods Action Methods These methods allow you to perform actions directly on nodes. Available since app version 2.119. Note: randomClickAdb() and randomSwipeAdb() require app version 2.141 (153)+. They use ADB shell commands instead of the accessibility service, which can be more reliable in some scenarios. 
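Since `findText()` and `findTextOne()` accept regular expressions as well as strings, the matching rule can be illustrated with a plain predicate. This is a sketch only: `matchesNodeText` and its node shape are simplified stand-ins written for this example, not the agent runtime's actual implementation.

```typescript
// Pure predicate mirroring findText's documented rule: match against
// text, description, or hintText; strings are case-insensitive,
// RegExps are applied as given. Illustrative only.
function matchesNodeText(
  node: { text?: string; description?: string; hintText?: string },
  pattern: string | RegExp
): boolean {
  return [node.text, node.description, node.hintText].some(field => {
    if (field === undefined) return false;
    return typeof pattern === "string"
      ? field.toLowerCase().includes(pattern.toLowerCase())
      : pattern.test(field);
  });
}

console.log(matchesNodeText({ text: "Your code is 493021" }, /\b\d{6}\b/)); // true
console.log(matchesNodeText({ hintText: "Enter EMAIL" }, "email"));         // true

// Against a live screen, the equivalent lookups would be:
// const screen = await agent.actions.screenContent();
// const codeNode = screen.findTextOne(/\b\d{6}\b/);
// const emailInput = screen.findTextOne("email");
```

A regex is useful when the target text varies, such as OTP codes or prices, where an exact-string search would miss.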
Common Patterns // Get the accessibility tree const screen = await agent.actions.screenContent(); // Find elements const button = screen.findTextOne("Submit"); const allInputs = screen.filterAdvanced(f => f.isEditText()); // Interact with elements if (button) { const { left, top, right, bottom } = button.boundsInScreen; await agent.actions.tap((left + right) / 2, (top + bottom) / 2); } const button = screen.findTextOne("Submit"); if (button) { button.randomClick(); } const button = screen.findAdvanced(f => f.isButton().hasText("OK")); if (button) { await button.performAction(agent.constants.ACTION_CLICK); } const input = screen.findAdvanced(f => f.isEditText().isEditable()); if (input) { await input.performAction(agent.constants.ACTION_CLICK); await agent.actions.writeText("Hello World"); } const scrollView = screen.findAdvanced(f => f.isScrollable()); if (scrollView) { // Using accessibility action await scrollView.performAction(agent.constants.ACTION_SCROLL_FORWARD); // Or using random swipe for more natural scrolling scrollView.randomSwipe("up"); } AndroidNode Accessibility tree nodes for UI element interaction Getting Screen Content AndroidNodeFilter Learn about the builder pattern for complex node queries allNodes Returns all nodes in the subtree as a flat array, including this node and all descendants. findById Finds all nodes with the matching viewId. findByIdOne Finds the first node with the matching viewId. findText Finds all nodes containing the text in description, text, or hintText (case-insensitive). Supports strings and regular expressions. findTextOne Finds the first node containing the text (case-insensitive). find Finds the first node matching a custom predicate function. filter Filters all nodes matching a custom predicate function. findAdvanced Finds the first node using the AndroidNodeFilter builder pattern. filterAdvanced Filters all nodes using the AndroidNodeFilter builder pattern. matches Checks if this node matches the given filter. 
toJSON Converts the node to a plain JSON object. Useful for debugging or serialization. performAction Performs an accessibility action on this node. This is the preferred way to interact with UI elements as it uses the accessibility system. adbPerformAction Performs an accessibility action on this node via ADB. This is the ADB equivalent of performAction(). Uses UiAutomation through the ADB touch server instead of the accessibility service. randomClick Performs a tap at a random position within this node randomSwipe Performs a swipe gesture within this node randomClickAdb Performs a tap at a random position within this node randomSwipeAdb Performs a swipe gesture within this node Click a button (random position) Click using accessibility action Type in an input field Scroll a list -------------------------------------------------------------------------------- ## Reference / AndroidNodeFilter Path: /docs/automation/reference/android-node/filter Description: Builder pattern for complex node queries Methods: ### isButton() Signature: isButton(): AndroidNodeFilter Matches nodes with className ### isText() Signature: isText(): AndroidNodeFilter Matches nodes with className ### isImage() Signature: isImage(): AndroidNodeFilter Matches nodes with className ### isEditText() Signature: isEditText(): AndroidNodeFilter Matches nodes with className ### isCheckBox() Signature: isCheckBox(): AndroidNodeFilter Matches nodes with className ### isRadioButton() Signature: isRadioButton(): AndroidNodeFilter Matches nodes with className ### isSwitch() Signature: isSwitch(): AndroidNodeFilter Matches nodes with className ### isSeekBar() Signature: isSeekBar(): AndroidNodeFilter Matches nodes with className ### isViewGroup() Signature: isViewGroup(): AndroidNodeFilter Matches nodes with className ### is() Signature: is(className: string): AndroidNodeFilter Matches nodes with a custom className. 
### isClickable() Signature: isClickable(): AndroidNodeFilter Matches nodes that are clickable. ### isScrollable() Signature: isScrollable(): AndroidNodeFilter Matches nodes that are scrollable. ### isSelected() Signature: isSelected(): AndroidNodeFilter Matches nodes that are currently selected. ### isEditable() Signature: isEditable(): AndroidNodeFilter Matches nodes that allow text editing. ### hasText() Signature: hasText(text: string): AndroidNodeFilter Matches nodes with exact text content. ### text() Signature: text(condition: (text: string | undefined) => boolean): AndroidNodeFilter Matches nodes based on a custom text condition. ### hasDescription() Signature: hasDescription(description: string): AndroidNodeFilter Matches nodes with exact content description (accessibility label). ### descriptionContains() Signature: descriptionContains(part: string): AndroidNodeFilter Matches nodes whose description contains the given text. ### description() Signature: description(condition: (desc: string | undefined) => boolean): AndroidNodeFilter Matches nodes based on a custom description condition. ### hasChildWithText() Signature: hasChildWithText(text: string): AndroidNodeFilter Matches nodes that have a direct child with the specified text. ### hasId() Signature: hasId(id: string): AndroidNodeFilter Matches nodes with the specified viewId. ### hasPackageName() Signature: hasPackageName(packageName: string): AndroidNodeFilter Matches nodes belonging to the specified package. ### hasChildWithId() Signature: hasChildWithId(id: string): AndroidNodeFilter Matches nodes that have a direct child with the specified viewId. ### and() Signature: and(filterBuilder: (f: AndroidNodeFilter) => void): AndroidNodeFilter Combines with another filter using AND logic. The node must match both filters. ### or() Signature: or(filterBuilder: (f: AndroidNodeFilter) => void): AndroidNodeFilter Combines with another filter using OR logic. The node must match either filter. 
### parent() Signature: parent(filterBuilder: (f: AndroidNodeFilter) => void): AndroidNodeFilter Matches nodes whose immediate parent matches the given filter. ### anyParent() Signature: anyParent(filterBuilder: (f: AndroidNodeFilter) => void): AndroidNodeFilter Matches nodes that have any ancestor matching the given filter. Sections: - Class Matchers - State Matchers - Content Matchers - Identity Matchers - Logical Operators - Hierarchy Matchers - Complex Examples Content: AndroidNodeFilter provides a fluent builder interface for constructing complex queries to find nodes in the accessibility tree. All methods are chainable and return the filter instance. Class Matchers Match nodes based on their Android class type. State Matchers Match nodes based on their current state. Content Matchers Match nodes based on their text content. Identity Matchers Match nodes based on their identity attributes. Logical Operators Combine multiple conditions using logical operators. Hierarchy Matchers Match nodes based on their position in the node hierarchy. 
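The `and()`/`or()` combinators compose filters as boolean predicates. As a sketch of that semantics, here is a deliberately simplified model with plain functions; the node shape and helper names are hypothetical, and the real `AndroidNodeFilter` is a chainable builder supplied by the runtime.

```typescript
// Simplified model of filter composition (hypothetical; for illustration only).
type Pred<T> = (n: T) => boolean;
const and = <T>(a: Pred<T>, b: Pred<T>): Pred<T> => n => a(n) && b(n);
const or = <T>(a: Pred<T>, b: Pred<T>): Pred<T> => n => a(n) || b(n);

interface UINode { className?: string; text?: string; clickable?: boolean }
const hasText = (t: string): Pred<UINode> => n => n.text === t;
const isClickable: Pred<UINode> = n => n.clickable === true;

// "clickable AND (text is OK OR text is Confirm)"
const confirmish = and(isClickable, or(hasText("OK"), hasText("Confirm")));

console.log(confirmish({ text: "OK", clickable: true }));     // true
console.log(confirmish({ text: "Cancel", clickable: true })); // false

// The equivalent live query with the real builder would look like:
// const btn = screen.findAdvanced(f =>
//   f.isClickable().and(a => a.or(o => o.hasText("OK")).or(o => o.hasText("Confirm")))
// );
```

Chained calls on one filter already AND together, so explicit `and()` is mainly useful for grouping an `or()` alternative set, as above.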
Complex Examples // Find a submit button const button = screen.findAdvanced(f => f.isButton().hasText("Submit")); // Find all editable text fields const inputs = screen.filterAdvanced(f => f.isEditText().isEditable()); // Complex query with multiple conditions const node = screen.findAdvanced(f => f.isClickable() .hasPackageName("com.example.app") .anyParent(p => p.hasId("com.example:id/main_container")) ); const loginBtn = screen.findAdvanced(f => f.isButton() .hasText("Login") .isClickable() .anyParent(p => p.hasId("com.example:id/login_form")) ); const prices = screen.filterAdvanced(f => f.isText() .text(t => /^\$\d+/.test(t || "")) .anyParent(p => p.hasId("com.example:id/product_card")) ); for (const price of prices) { console.log("Price:", price.text); } const emptyInput = screen.findAdvanced(f => f.isEditText() .isEditable() .text(t => !t || t.length === 0) ); if (emptyInput) { await emptyInput.performAction(agent.constants.ACTION_CLICK); await agent.actions.writeText("Hello"); } const confirmBtn = screen.findAdvanced(f => f.isButton() .isClickable() .or(o => o.hasText("OK")) .or(o => o.hasText("Confirm")) .or(o => o.hasText("Yes")) .or(o => o.hasText("Accept")) ); AndroidNodeFilter Builder pattern for complex node queries Basic Usage isButton Matches nodes with className isText Matches nodes with className isImage Matches nodes with className isEditText Matches nodes with className isCheckBox Matches nodes with className isRadioButton Matches nodes with className isSwitch Matches nodes with className isSeekBar Matches nodes with className isViewGroup Matches nodes with className is Matches nodes with a custom className. isClickable Matches nodes that are clickable. isScrollable Matches nodes that are scrollable. isSelected Matches nodes that are currently selected. isEditable Matches nodes that allow text editing. hasText Matches nodes with exact text content. text Matches nodes based on a custom text condition. 
hasDescription Matches nodes with exact content description (accessibility label). descriptionContains Matches nodes whose description contains the given text. description Matches nodes based on a custom description condition. hasChildWithText Matches nodes that have a direct child with the specified text. hasId Matches nodes with the specified viewId. hasPackageName Matches nodes belonging to the specified package. hasChildWithId Matches nodes that have a direct child with the specified viewId. and Combines with another filter using AND logic. The node must match both filters. or Combines with another filter using OR logic. The node must match either filter. parent Matches nodes whose immediate parent matches the given filter. anyParent Matches nodes that have any ancestor matching the given filter. Find login button in a specific form Find all product prices Find first empty input field Find any confirmation button -------------------------------------------------------------------------------- ## Reference / Types Path: /docs/automation/reference/types Description: All supporting types used throughout the Automation API Sections: - File Types - Info Types - Email Type - OCR Types - Callback Types - Other Types - Usage Examples Content: This page documents all supporting types, interfaces, and type aliases used by the Automation API. File Types Types used by agent.utils.files methods. Info Types Types used by agent.info methods. Email Type OCR Types Hierarchical types returned by agent.actions.recognizeText() . The structure follows: TextJSON → TextBlockJSON → LineJSON → ElementJSON → SymbolJSON. 
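The OCR hierarchy described above (TextJSON → TextBlockJSON → LineJSON → ElementJSON) can be walked with plain nested loops. The sketch below uses simplified local types and made-up sample data to show one common pattern: collecting only words recognized above a confidence threshold. The `confidentWords` helper is written for this example and is not part of the API.

```typescript
// Simplified local mirrors of the documented OCR types (illustrative only).
interface Elem { text: string; confidence: number }
interface Line { text: string; confidence: number; elements: Elem[] }
interface Block { text: string; lines: Line[] }
interface TextResult { text: string; textBlocks: Block[] }

// Collect words whose per-element confidence meets a threshold.
function confidentWords(result: TextResult, minConfidence: number): string[] {
  const words: string[] = [];
  for (const block of result.textBlocks)
    for (const line of block.lines)
      for (const el of line.elements)
        if (el.confidence >= minConfidence) words.push(el.text);
  return words;
}

// Made-up sample data standing in for a recognizeText() result.
const sample: TextResult = {
  text: "Pay now",
  textBlocks: [{
    text: "Pay now",
    lines: [{
      text: "Pay now",
      confidence: 0.95,
      elements: [
        { text: "Pay", confidence: 0.97 },
        { text: "now", confidence: 0.62 },
      ],
    }],
  }],
};
console.log(confidentWords(sample, 0.9)); // [ 'Pay' ]

// On-device, the result would come from:
// const { screenshot } = await agent.actions.screenshot(1080, 1920, 90);
// const result = await agent.actions.recognizeText(screenshot);
```

Filtering by confidence is a practical way to ignore OCR noise before acting on recognized text.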
Callback Types Function types for event callbacks in agent.utils Other Types Usage Examples interface FileInfo { name: string; // File or directory name path: string; // Full absolute path isDirectory: boolean; // true if directory isFile: boolean; // true if file size: number; // Size in bytes (0 for directories) lastModified: number; // Last modified timestamp } interface DirectoryInfo { exists: true; path: string; name: string; isDirectory: true; isFile: false; lastModified: number; canRead: boolean; canWrite: boolean; fileCount: number; // Number of files directoryCount: number; // Number of subdirectories totalItems: number; // Total items (files + directories) } interface FilePathInfo { exists: true; path: string; name: string; isDirectory: false; isFile: true; lastModified: number; canRead: boolean; canWrite: boolean; size: number; } interface PathNotFoundInfo { exists: false; error?: string; // Optional error message } interface FileHashes { md5: string; // MD5 hash sha1: string; // SHA-1 hash sha256: string; // SHA-256 hash size: number; // File size } interface FileHashError { error: string; } interface AutomationInfo { name: string; // Automation name description: string; // Automation description launchId: string; // Unique ID for this execution agent?: { // Optional agent details id: string; commitId: string; token: string; jobTaskId: string; }; serverBaseUrl: string; // Server URL for API calls timeout?: number; // Optional timeout in minutes available since app v2.123 (135) } interface DeviceInfo { id: string; // Unique device ID brand: string; // Device brand (e.g., "Samsung") model: string; // Device model sdkVersion: number; // Android SDK version processor: string; // CPU info numberOfCores: number; // CPU cores ramMb: number; // RAM in MB country: string; // Device country isEmulator: boolean; // true if emulator width: number; // Screen width in pixels height: number; // Screen height in pixels } interface Email { id: string; // Unique email ID 
subject: string; // Email subject from: string; // Sender email address fromName: string; // Sender display name to: string[]; // Recipients cc: string[]; // CC recipients bcc: string[]; // BCC recipients date: number; // Timestamp (Unix ms) body: string; // Email body content isHtml: boolean; // true if body is HTML isRead: boolean; // Read status hasAttachments: boolean; // Has attachments attachmentNames: string[]; // Attachment file names } interface TextJSON { text: string; // Complete recognized text textBlocks: TextBlockJSON[]; // Array of text blocks } interface TextBlockJSON { text: string; // Text content of the block recognizedLanguage: string; // Detected language boundingBox?: RectJSON; // Bounding box on screen lines: LineJSON[]; // Lines within the block } interface LineJSON { text: string; // Text content of the line angle: number; // Rotation angle confidence: number; // Recognition confidence (0-1) recognizedLanguage: string; // Detected language boundingBox?: RectJSON; // Bounding box on screen elements: ElementJSON[]; // Words/elements in the line } interface ElementJSON { text: string; // Text content (usually a word) angle: number; // Rotation angle confidence: number; // Recognition confidence (0-1) recognizedLanguage: string; // Detected language boundingBox?: RectJSON; // Bounding box on screen symbols: SymbolJSON[]; // Individual characters } interface SymbolJSON { text: string; // Single character angle: number; // Rotation angle confidence: number; // Recognition confidence (0-1) recognizedLanguage: string; // Detected language boundingBox?: RectJSON; // Bounding box on screen } interface RectJSON { left: number; // Left edge top: number; // Top edge right: number; // Right edge bottom: number; // Bottom edge } type NotificationCallback = ( id: string, // Notification ID packageName: string, // Source app package channelId: string, // Notification channel extras: any // Notification data (title, text, etc.) 
) => void; type NetworkCallback = ( networkAvailable: boolean // true if network is available ) => void; type ToastCallback = ( packageName: string, // Source app package data: { message: string } // Toast message ) => void; interface MultiTapSequenceItem { x: number; // X coordinate to tap y: number; // Y coordinate to tap delay: number; // Delay in ms after this tap } const info = agent.utils.files.getPathInfo("/sdcard/Download"); if (info.exists) { if (info.isDirectory) { // TypeScript knows this is DirectoryInfo console.log("Contains", info.totalItems, "items"); } else { // TypeScript knows this is FilePathInfo console.log("File size:", info.size); } } else { // TypeScript knows this is PathNotFoundInfo console.log("Path not found:", info.error); } const { screenshot } = await agent.actions.screenshot(1080, 1920, 90); const result = await agent.actions.recognizeText(screenshot); // Access full text console.log("Full text:", result.text); // Iterate through hierarchy for (const block of result.textBlocks) { console.log("Block:", block.text); for (const line of block.lines) { console.log(" Line:", line.text, "confidence:", line.confidence); // Get bounding box for the line if (line.boundingBox) { const { left, top, right, bottom } = line.boundingBox; console.log(" Position:", left, top, "to", right, bottom); } } } Types All supporting types used throughout the Automation API FileInfo Basic file or directory information returned by list(). DirectoryInfo Detailed information about a directory. FilePathInfo Detailed information about a file. PathNotFoundInfo Returned when a path doesn't exist. FileHashes Hash values for a file. FileHashError Error returned when hashing fails. AutomationInfo Metadata about the running automation. DeviceInfo Hardware and configuration info about the device. Email Email message returned by readIMAPEmails(). TextJSON Root OCR result containing all recognized text. TextBlockJSON A block of text (paragraph or distinct region). 
LineJSON A single line of text. ElementJSON A word or element within a line. SymbolJSON A single character. RectJSON Bounding box coordinates. NotificationCallback Callback for receiving system notifications. NetworkCallback Callback for network state changes. ToastCallback Callback for receiving toast messages. MultiTapSequenceItem Item in a multi-tap sequence for agent.actions.multiTap(). Working with file types Working with OCR results -------------------------------------------------------------------------------- ## Reference / Helper Functions Path: /docs/automation/reference/helpers Description: Standalone utility functions for working with AndroidNode trees Methods: ### getAllNodes() Signature: getAllNodes(rootNode: AndroidNode | null | undefined): AndroidNode[] Returns all nodes in the tree as a flat array. Equivalent to calling rootNode.allNodes() but handles null/undefined input safely. ### buildNodePath() Signature: buildNodePath(rootNode: AndroidNode | null | undefined, targetNode: AndroidNode | null | undefined): AndroidNode[] Builds the path from the root node to the target node as an array of nodes. Useful for understanding the hierarchy or debugging. ### findNodesById() Signature: findNodesById(rootNode: AndroidNode | null | undefined, id: string): AndroidNode[] Finds all nodes with the matching viewId. Equivalent to rootNode.findById(id) but handles null/undefined input. ### findNodesByText() Signature: findNodesByText(rootNode: AndroidNode | null | undefined, text: string): AndroidNode[] Finds all nodes containing the specified text in their text, description, or hintText properties. Case-insensitive search. ### findParentOf() Signature: findParentOf(rootNode: AndroidNode | null | undefined, targetNode: AndroidNode | null | undefined): AndroidNode | null Finds the parent of the target node within the tree. Alternative to accessing targetNode.parent directly, with null-safe handling. 
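`buildNodePath()` returns the chain of nodes from the root down to a target, which is handy when debugging why a match sits where it does in the tree. The sketch below reimplements the same idea on a minimal hypothetical tree shape so the behavior is concrete; the real helper operates on `AndroidNode` instances.

```typescript
// Minimal tree and path builder mirroring what buildNodePath returns
// (hypothetical node shape, for illustration only).
interface SimpleNode { id: string; children: SimpleNode[] }

function pathTo(root: SimpleNode, targetId: string): string[] | null {
  if (root.id === targetId) return [root.id];
  for (const child of root.children) {
    const sub = pathTo(child, targetId);
    if (sub) return [root.id, ...sub];
  }
  return null; // target not in this subtree
}

const tree: SimpleNode = {
  id: "root",
  children: [{ id: "form", children: [{ id: "submit", children: [] }] }],
};
console.log(pathTo(tree, "submit")); // [ 'root', 'form', 'submit' ]

// With the real helper, against a live screen:
// const screen = await agent.actions.screenContent();
// const btn = screen.findTextOne("Submit");
// const path = buildNodePath(screen, btn); // null-safe on missing input
```

Logging each node's `viewId` or text along such a path makes it easy to spot which ancestor container to anchor a filter on.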
Sections: - Functions - When to Use Helper Functions vs Methods - Use Helper Functions When: - Use Instance Methods When: Content: These global helper functions provide additional ways to work with the accessibility tree. They are available alongside the methods on AndroidNode instances. Functions When to Use Helper Functions vs Methods Use Helper Functions When: The root node might be null or undefined You want consistent null-safe behavior You prefer a functional programming style Working with nodes from potentially failed operations Use Instance Methods When: You have a guaranteed non-null AndroidNode You're chaining operations fluently You need the advanced filter builder pattern Writing more concise code // All helper functions are globally available declare function getAllNodes(rootNode: AndroidNode | null | undefined): AndroidNode[]; declare function buildNodePath(rootNode: AndroidNode | null | undefined, targetNode: AndroidNode | null | undefined): AndroidNode[]; declare function findNodesById(rootNode: AndroidNode | null | undefined, id: string): AndroidNode[]; declare function findNodesByText(rootNode: AndroidNode | null | undefined, text: string): AndroidNode[]; declare function findParentOf(rootNode: AndroidNode | null | undefined, targetNode: AndroidNode | null | undefined): AndroidNode | null; // Using helper function (null-safe) const allNodes = getAllNodes(screen); const byId = findNodesById(screen, "com.example:id/btn"); const byText = findNodesByText(screen, "Submit"); // Using instance methods (requires non-null node) const allNodes2 = screen.allNodes(); const byId2 = screen.findById("com.example:id/btn"); const byText2 = screen.findText("Submit"); // Instance methods also provide advanced filtering const filtered = screen.filterAdvanced(f => f.isButton().isClickable()); Helper Functions Standalone utility functions for working with AndroidNode trees Overview getAllNodes Returns all nodes in the tree as a flat array. 
Equivalent to calling rootNode.allNodes() but handles null/undefined input safely. buildNodePath Builds the path from the root node to the target node as an array of nodes. Useful for understanding the hierarchy or debugging. findNodesById Finds all nodes with the matching viewId. Equivalent to rootNode.findById(id) but handles null/undefined input. findNodesByText Finds all nodes containing the specified text in their text, description, or hintText properties. Case-insensitive search. findParentOf Finds the parent of the target node within the tree. Alternative to accessing targetNode.parent directly, with null-safe handling. Comparison -------------------------------------------------------------------------------- ## Reference / API Reference Path: /docs/automation/reference/api Description: REST API endpoints for automation control Sections: - Usage Example - Error Handling - Authentication - Launch Automation - Launch Direct Automation - Stop All Automations - Get Automation Logs - Write Project File - Read Project File - Commit Project Changes - Get Out of Steps Records - Mark Out of Steps Status - 401 Unauthorized - 400 Bad Request - Device Offline Content: These API endpoints allow you to programmatically control automations on devices. All endpoints require authentication via Bearer token in the Authorization header. Authentication All API requests must include an Authorization header with a valid Bearer token: Authorization: Bearer your_api_token POST /api/v2/devices/launchAutomation Launch Automation Launches a saved automation project on one or more devices. Use this to run automations that have been created and saved in the IDE. 
Request Parameters Parameter Type Required Description device_ids string[] Yes Array of device IDs to run the automation on automationId string Yes The ID of the saved automation project command string Yes Command to execute: "start" or "stop" automationParameters object No Parameters defined in the automation schema jobVariables object No Job-specific variables for the automation Responses 200 Success 400 Bad Request POST /api/v2/devices/launchDirectAutomation Launch Direct Automation Launches automation code directly on a device without saving it as a project. You provide a unique launch ID (UUID) to track the automation and retrieve logs. Request Parameters Parameter Type Required Description device_id string Yes The ID of the device to run automation on launch_id string (UUID) Yes Unique identifier for this automation launch (generate a UUID) command string Yes Command to execute: "start" or "stop" code string Yes JavaScript/TypeScript code to execute on the device automationParameters object No Available in the automation as agent.arguments.automationParameters jobVariables object No Available in the automation as agent.arguments.jobVariables Responses 200 Success 400 Bad Request Using the launch_id The launch_id you provide can later be passed to automationLogs to retrieve logs for this automation. POST /api/v2/devices/stopAllAutomations Stop All Automations Stops all running automations on the specified device. Request Parameters Parameter Type Required Description device_id string Yes The ID of the device to stop automations on Responses 200 Success 400 Bad Request Invalid device_id or device not found POST /api/v2/devices/automationLogs Get Automation Logs Retrieves console logs from running automations. Only returns logs from the last 10 minutes. Use the from parameter for pagination to get logs after a specific log ID. 
Request Parameters Parameter Type Required Description device_id string Yes The ID of the device to get logs from automation_ids string[] Yes Array of automation IDs or launch IDs (for direct automations) to get logs for from string No Log ID to get logs after (for pagination) automation_ids for Direct Automations When using launchDirectAutomation, include the launch_id in the automation_ids array to retrieve logs for that specific automation. Responses 200 Success Log Levels Level Description log Standard console.log output warn Warning messages from console.warn error Error messages from console.error PUT /api/v2/automation/file/:id Write Project File Creates or updates a file within an automation project. For TypeScript files, the content is automatically compiled to JavaScript. URL Parameters Parameter Type Description id string The automation project ID Request Body Parameter Type Description path string The file path within the project, e.g. main.ts or utils/helpers.ts content string The file content (max ~50MB) Responses 200 Success 400 TypeScript Compilation Error 404 Not Found Automation project not found or access denied TypeScript Compilation When writing .ts files, the TypeScript is automatically compiled and both the source ( .ts ) and compiled ( .js ) files are saved. You cannot directly edit .js files that have a paired .ts source file. GET /api/v2/automation/file/:id Read Project File Reads the content of a file within an automation project. URL Parameters Parameter Type Description id string The automation project ID Query Parameters Parameter Type Description path string The file path to read, e.g. main.ts Responses 200 Success 404 Not Found File or automation project not found POST /api/v2/automation/commit/:id Commit Project Changes Creates a new commit with all current changes in the automation project. The project must have uncommitted changes for this to succeed. 
URL Parameters Parameter Type Description id string The automation project ID Request Body Parameter Type Description message string The commit message describing the changes (1-1000 characters) Responses 201 Created 400 No Changes 404 Not Found Automation project not found or access denied Version Control Each automation project has built-in version control. After making changes with the Write Project File API, use this endpoint to commit those changes. The commit hash can be used to track versions and revert if needed. GET /api/v2/automation-project/out-of-steps Get Out of Steps Records Retrieves out of steps records for automations where you are the owner or a shared editor. These records contain screenshots and UI data captured when an automation encounters an unknown screen or runs out of steps. Returns one record at a time with cursor-based pagination. Query Parameters Parameter Type Description cursor string Direct link to a specific record by ID before_cursor string Get records older than this cursor (for "next" pagination) after_cursor string Get records newer than this cursor (for "prev" pagination) automation_id string Filter by automation ID type string Filter by type: outOfSteps, timeout, debug, or crash status string Filter by status: pending, solved, or skipped partial string Filter by partial: true or false Responses 200 Success Screen Object Fields uiUrl - Full URL to the UI JSON file containing the accessibility tree screenshot.screenshotUrl - URL to the captured screenshot image nextStage - The stage the automation was attempting to reach screenState - The identified screen state (or "unknown") maxSteps - Remaining steps when captured (0 = out of steps) PATCH /api/v2/automation-project/out-of-steps/:id/status Mark Out of Steps Status Mark an out of steps record as solved or skipped. Only automation editors (owner or shared_editor) can mark status. Once marked, status cannot be changed. 
URL Parameters Parameter Type Description id string The out of steps record ID Request Body Parameter Type Required Description status string Yes The status to set: "solved" or "skipped" Responses 200 Success 400 Already Marked 403 Forbidden You don't have permission to mark this out of steps record Status is Immutable Once a status is set to solved or skipped, it cannot be changed. Make sure you've addressed the issue before marking it. Usage Example Here's a complete example of launching a direct automation and polling for logs: Error Handling 401 Unauthorized The API token is missing, invalid, or expired. Ensure you're passing a valid Bearer token in the Authorization header. 400 Bad Request The request parameters are invalid or missing. Check that all required fields are provided with correct types. Device Offline The target device is not connected. Ensure the device is online and has the Xgodo app running before launching automations. { "device_ids": ["device_123456", "device_789012"], "automationId": "83ba7359-d8eb-4374-8ecb-055af01fddfe", "command": "start", "automationParameters": { "maxRetries": 3, "timeout": 30000 }, "jobVariables": { "email": "user@example.com", "password": "secret123" } } { "success": true, "message": "Action performed successfully" } { "success": false, "message": "Some devices offline or not found." } { "device_id": "device_123456", "launch_id": "550e8400-e29b-41d4-a716-446655440000", "command": "start", "code": "console.log('Hello from automation!');", "automationParameters": { "maxRetries": 3 }, "jobVariables": { "targetUrl": "https://example.com" } } { "success": true, "message": "Action performed successfully" } { "success": false, "message": "Some devices offline or not found." 
Example request (stop all automations on a device):

```json
{ "device_id": "device_123456" }
```

Response:

```json
{ "success": true, "message": "All automations stopped successfully" }
```

Example request (fetch automation logs, resuming from a known log ID):

```json
{
  "device_id": "device_123456",
  "automation_ids": ["550e8400-e29b-41d4-a716-446655440000"],
  "from": "e0e1a3ef-7b2a-42b5-8f21-b845fa41c8b9"
}
```

Response:

```json
{
  "logs": [
    {
      "id": "e0e1a3ef-7b2a-42b5-8f21-b845fa41c8b9",
      "automationId": "550e8400-e29b-41d4-a716-446655440000",
      "message": "Processing screen content...",
      "lineNumber": 42,
      "messageLevel": "log",
      "timestamp": 1755811585870
    },
    {
      "id": "f1f2b4ff-8c3b-53c6-9g32-c956gb52d9ca",
      "automationId": "550e8400-e29b-41d4-a716-446655440000",
      "message": "Button clicked successfully",
      "lineNumber": 45,
      "messageLevel": "log",
      "timestamp": 1755811586120
    }
  ]
}
```

Example request (write a project file):

```json
{
  "path": "main.ts",
  "content": "async function main() {\n console.log('Hello!');\n stopCurrentAutomation();\n}\n\nmain();"
}
```

Success response:

```json
{ "success": true, "path": "main.ts", "compiledPath": "main.js", "compiled": true }
```

Error response:

```json
{
  "success": false,
  "message": "TypeScript compilation failed",
  "errors": ["Error at 5:10: Property 'foo' does not exist on type 'string'."]
}
```

Example request (read a project file):

```shell
curl "https://xgodo.com/api/v2/automation/file/83ba7359-d8eb-4374-8ecb-055af01fddfe?path=main.ts" \
  -H "Authorization: Bearer your_api_token"
```

Response:

```json
{
  "success": true,
  "content": "async function main() {\n console.log('Hello!');\n stopCurrentAutomation();\n}\n\nmain();",
  "path": "main.ts"
}
```

Example request (commit changes):

```json
{ "message": "Add error handling to main automation flow" }
```

Success response:

```json
{
  "success": true,
  "commit": {
    "hash": "a1b2c3d4e5f6789012345678901234567890abcd",
    "message": "Add error handling to main automation flow"
  }
}
```

Error response:

```json
{ "success": false, "message": "No changes to commit" }
```

Example response (out of steps record):

```json
{
  "success": true,
  "result": {
    "device_info": { "brand": "Samsung", "model": "Galaxy S21", ... },
    "automation_name": "My Automation",
    "out_of_steps_id": "507f1f77bcf86cd799439011",
    "automation_id": "507f1f77bcf86cd799439012",
    "type": "outOfSteps",
    "status": "pending",
    "screens": [
      {
        "_id": "507f1f77bcf86cd799439013",
        "uiUrl": "https://server.com/outofsteps/2024/01/15/.../ui_....json",
        "screenshot": {
          "screenshotUrl": "https://server.com/outofsteps/2024/01/15/.../screenshot_....jpeg",
          "compressedWidth": 540,
          "compressedHeight": 1200,
          "originalWidth": 1080,
          "originalHeight": 2400
        },
        "nextStage": "main",
        "screenState": "unknown",
        "maxSteps": 0,
        "timestamp": 1705334400000
      }
    ],
    "added": "2024-01-15T12:00:00.000Z"
  },
  "total": 100,
  "currentPage": 1,
  "currentCursor": "507f1f77bcf86cd799439011",
  "hasNext": true,
  "hasPrev": false
}
```

Example request (mark an out of steps record):

```json
{ "status": "solved" }
```

Success response:

```json
{ "success": true, "message": "Out of steps marked as solved", "status": "solved" }
```

Error response:

```json
{ "success": false, "message": "Status already marked as \"solved\". Cannot change status once marked." }
```

Example: Launch and Monitor Direct Automation

```typescript
const API_BASE = "https://xgodo.com";
const TOKEN = "your_api_token";

// Generate a unique launch ID
function generateUUID(): string {
  return crypto.randomUUID();
}

async function runDirectAutomation(deviceId: string, code: string) {
  const launchId = generateUUID();

  // Launch the automation
  const launchRes = await fetch(`${API_BASE}/api/v2/devices/launchDirectAutomation`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${TOKEN}`
    },
    body: JSON.stringify({
      device_id: deviceId,
      launch_id: launchId,
      command: "start",
      code: code,
      automationParameters: { maxRetries: 3 },
      jobVariables: { targetUrl: "https://example.com" }
    })
  });
  const launchData = await launchRes.json();
  if (!launchData.success) {
    throw new Error(launchData.message);
  }
  console.log("Automation started with launch ID:", launchId);

  // Poll for logs using the launch_id
  let lastLogId: string | undefined;
  const pollLogs = async () => {
    const logsRes = await fetch(`${API_BASE}/api/v2/devices/automationLogs`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${TOKEN}`
      },
      body: JSON.stringify({
        device_id: deviceId,
        automation_ids: [launchId], // Use launch_id here
        from: lastLogId
      })
    });
    const logsData = await logsRes.json();
    for (const log of logsData.logs) {
      console.log(`[${log.messageLevel}] ${log.message}`);
      lastLogId = log.id;
    }
  };

  // Poll every 2 seconds
  const interval = setInterval(pollLogs, 2000);

  // Stop after 60 seconds
  setTimeout(async () => {
    clearInterval(interval);

    // Stop the automation
    await fetch(`${API_BASE}/api/v2/devices/stopAllAutomations`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${TOKEN}`
      },
      body: JSON.stringify({ device_id: deviceId })
    });
    console.log("Automation stopped");
  }, 60000);
}

// Usage
runDirectAutomation("device_123456", `
  const params = agent.arguments.automationParameters;
  const vars = agent.arguments.jobVariables;
  console.log("Max retries:", params.maxRetries);
  console.log("Target URL:", vars.targetUrl);
  await agent.actions.goHome();
  console.log("Done!");
  stopCurrentAutomation();
`);
```

--------------------------------------------------------------------------------

================================================================================
END OF DOCUMENTATION
================================================================================