# Directory Structure
```
├── .editorconfig
├── .github
│ ├── ISSUE_TEMPLATE
│ │ └── bug_report.md
│ └── workflows
│ └── build.yml
├── .gitignore
├── .husky
│ └── pre-commit
├── .mocharc.yml
├── .npmignore
├── CHANGELOG.md
├── eslint.config.mjs
├── LICENSE
├── mobile-mcp.png
├── package-lock.json
├── package.json
├── README.md
├── SECURITY.md
├── server.json
├── src
│ ├── android.ts
│ ├── image-utils.ts
│ ├── index.ts
│ ├── ios.ts
│ ├── iphone-simulator.ts
│ ├── logger.ts
│ ├── mobilecli.ts
│ ├── png.ts
│ ├── robot.ts
│ ├── server.ts
│ └── webdriver-agent.ts
├── test
│ ├── android.ts
│ ├── ios.ts
│ ├── iphone-simulator.ts
│ └── png.ts
└── tsconfig.json
```
# Files
--------------------------------------------------------------------------------
/.mocharc.yml:
--------------------------------------------------------------------------------
```yaml
timeout: 60s
```
--------------------------------------------------------------------------------
/.npmignore:
--------------------------------------------------------------------------------
```
**/*
README.md
LICENSE
!index.*
!lib/**/*.js
```
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
```
**/.DS_Store
node_modules
lib
.nyc_output
.vscode
```
--------------------------------------------------------------------------------
/.editorconfig:
--------------------------------------------------------------------------------
```
[*]
indent_style = tab
indent_size = 8
tab_width = 8
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
max_line_length = 150
```
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
```markdown
# Mobile Next - MCP server for Mobile Development and Automation | iOS, Android, Simulator, Emulator, and Real Devices
This is a [Model Context Protocol (MCP) server](https://github.com/modelcontextprotocol) that enables scalable mobile automation and development through a platform-agnostic interface, eliminating the need for separate iOS or Android expertise. You can run it on emulators, simulators, and real devices (iOS and Android).
This server allows Agents and LLMs to interact with native iOS/Android applications and devices through structured accessibility snapshots or coordinate-based taps based on screenshots.
<h4 align="center">
<a href="https://github.com/mobile-next/mobile-mcp">
<img src="https://img.shields.io/github/stars/mobile-next/mobile-mcp" alt="Mobile Next Stars" />
</a>
<a href="https://github.com/mobile-next/mobile-mcp">
<img src="https://img.shields.io/github/contributors/mobile-next/mobile-mcp?color=green" alt="Mobile Next Contributors" />
</a>
<a href="https://www.npmjs.com/package/@mobilenext/mobile-mcp">
<img src="https://img.shields.io/npm/dm/@mobilenext/mobile-mcp?logo=npm&style=flat&color=red" alt="npm" />
</a>
<a href="https://github.com/mobile-next/mobile-mcp/releases">
<img src="https://img.shields.io/github/release/mobile-next/mobile-mcp" />
</a>
<a href="https://github.com/mobile-next/mobile-mcp/blob/main/LICENSE">
<img src="https://img.shields.io/badge/license-Apache 2.0-blue.svg" alt="Mobile MCP is released under the Apache-2.0 License" />
</a>
<a href="https://insiders.vscode.dev/redirect?url=vscode%3Amcp%2Finstall%3F%7B%22name%22%3A%22mobile-mcp%22%2C%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40mobilenext%2Fmobile-mcp%40latest%22%5D%7D">
<img src="https://img.shields.io/badge/VS_Code-VS_Code?style=flat-square&label=Install%20Server&color=0098FF" alt="Install in VS Code" />
</a>
</h4>
<h4 align="center">
<a href="https://github.com/mobile-next/mobile-mcp/wiki">
<img src="https://img.shields.io/badge/documentation-wiki-blue" alt="wiki" />
</a>
<a href="http://mobilenexthq.com/join-slack">
<img src="https://img.shields.io/badge/join-Slack-blueviolet?logo=slack&style=flat" alt="join on Slack" />
</a>
</h4>
https://github.com/user-attachments/assets/c4e89c4f-cc71-4424-8184-bdbc8c638fa1
<p align="center">
<a href="https://github.com/mobile-next/">
<img alt="mobile-mcp" src="https://raw.githubusercontent.com/mobile-next/mobile-next-assets/refs/heads/main/mobile-mcp-banner.png" width="600" />
</a>
</p>
### 🚀 Mobile MCP Roadmap: Building the Future of Mobile
Join us on our journey as we continuously enhance Mobile MCP!
Check out our detailed roadmap to see upcoming features, improvements, and milestones. Your feedback is invaluable in shaping the future of mobile automation.
👉 [Explore the Roadmap](https://github.com/orgs/mobile-next/projects/3)
### Main use cases
How we help to scale mobile automation:
- 📲 Native app automation (iOS and Android) for testing or data-entry scenarios.
- 📝 Scripted flows and form interactions without manually controlling simulators/emulators or real devices (iPhone, Samsung, Google Pixel etc)
- 🧭 Automating multi-step user journeys driven by an LLM
- 👆 General-purpose mobile application interaction for agent-based frameworks
- 🤖 Enables agent-to-agent communication for mobile automation use cases and data extraction
## Main Features
- 🚀 **Fast and lightweight**: Uses native accessibility trees for most interactions, or screenshot-based coordinates where a11y labels are not available.
- 🤖 **LLM-friendly**: No computer vision model required in Accessibility (Snapshot).
- 🧿 **Visual Sense**: Evaluates and analyses what’s actually rendered on screen to decide the next action. If accessibility data or view-hierarchy coordinates are unavailable, it falls back to screenshot-based analysis.
- 📊 **Deterministic tool application**: Reduces ambiguity found in purely screenshot-based approaches by relying on structured data whenever possible.
- 📺 **Extract structured data**: Enables you to extract structured data from anything visible on screen.
## 🏗️ Mobile MCP Architecture
<p align="center">
<a href="https://raw.githubusercontent.com/mobile-next/mobile-next-assets/refs/heads/main/mobile-mcp-arch-1.png">
<img alt="mobile-mcp" src="https://raw.githubusercontent.com/mobile-next/mobile-next-assets/refs/heads/main/mobile-mcp-arch-1.png" width="600">
</a>
</p>
## 📚 Wiki page
More details in our [wiki page](https://github.com/mobile-next/mobile-mcp/wiki) for setup, configuration and debugging related questions.
## Installation and configuration
**Standard config** works in most of the tools:
```json
{
"mcpServers": {
"mobile-mcp": {
"command": "npx",
"args": ["-y", "@mobilenext/mobile-mcp@latest"]
}
}
}
```
<details>
<summary>Cline</summary>
To set up Cline, just add the JSON above to your MCP settings file.
[More in our wiki](https://github.com/mobile-next/mobile-mcp/wiki/Cline)
</details>
<details>
<summary>Claude Code</summary>
Use the Claude Code CLI to add the Mobile MCP server:
```bash
claude mcp add mobile-mcp -- npx -y @mobilenext/mobile-mcp@latest
```
</details>
<details>
<summary>Cursor</summary>
#### Click the button to install:
[<img src="https://cursor.com/deeplink/mcp-install-dark.svg" alt="Install in Cursor">](https://cursor.com/en/install-mcp?name=Mobile%20MCP&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBtb2JpbGVuZXh0L21vYmlsZS1tY3BAbGF0ZXN0Il19)
#### Or install manually:
Go to `Cursor Settings` -> `MCP` -> `Add new MCP Server`. Name it to your liking and use the `command` type with the command `npx -y @mobilenext/mobile-mcp@latest`. You can also verify the config or add command-line arguments by clicking `Edit`.
</details>
<details>
<summary>Gemini CLI</summary>
Use the Gemini CLI to add the Mobile MCP server:
```bash
gemini mcp add mobile-mcp npx -y @mobilenext/mobile-mcp@latest
```
</details>
<details>
<summary>Goose</summary>
#### Click the button to install:
[Install in Goose](https://block.github.io/goose/extension?cmd=npx&arg=-y&arg=%40mobilenext%2Fmobile-mcp%40latest&id=mobile-mcp&name=Mobile%20MCP&description=Mobile%20automation%20and%20development%20for%20iOS%2C%20Android%2C%20simulators%2C%20emulators%2C%20and%20real%20devices)
#### Or install manually:
Go to `Advanced settings` -> `Extensions` -> `Add custom extension`. Name to your liking, use type `STDIO`, and set the `command` to `npx -y @mobilenext/mobile-mcp@latest`. Click "Add Extension".
</details>
<details>
<summary>Qodo Gen</summary>
Open [Qodo Gen](https://docs.qodo.ai/qodo-documentation/qodo-gen) chat panel in VSCode or IntelliJ → Connect more tools → + Add new MCP → Paste the standard config above.
Click <code>Save</code>.
</details>
[Read more in our wiki](https://github.com/mobile-next/mobile-mcp/wiki)! 🚀
### 🛠️ How to Use 📝
After adding the MCP server to your IDE/Client, you can instruct your AI assistant to use the available tools.
For example, in Cursor's agent mode, you could use the prompts below to quickly validate, test, and iterate on UI interactions, read information from the screen, and go through complex workflows.
Be descriptive and straight to the point.
### ✨ Example Prompts
#### Workflows
You can specify detailed workflows in a single prompt, verify business logic, and set up automations. You can go crazy:
**Search for a video, comment, like and share it.**
```
Find the video called " Beginner Recipe for Tonkotsu Ramen" by Way of
Ramen, click on like video, after liking write a comment " this was
delicious, will make it next Friday", share the video with the first
contact in your whatsapp list.
```
**Download a successful step counter app, register, setup workout and 5-star the app**
```
Find and Download a free "Pomodoro" app that has more than 1k stars.
Launch the app, register with my email, after registration find how to
start a pomodoro timer. When the pomodoro timer started, go back to the
app store and rate the app 5 stars, and leave a comment how useful the
app is.
```
**Search in Substack, read, highlight, comment and save an article**
```
Open Substack website, search for "Latest trends in AI automation 2025",
open the first article, highlight the section titled "Emerging AI trends",
and save article to reading list for later review, comment a random
paragraph summary.
```
**Reserve a workout class, set timer**
```
Open ClassPass, search for yoga classes tomorrow morning within 2 miles,
book the highest-rated class at 7 AM, confirm reservation,
setup a timer for the booked slot in the phone
```
**Find a local event, setup calendar event**
```
Open Eventbrite, search for AI startup meetup events happening this
weekend in "Austin, TX", select the most popular one, register and RSVP
yes to the event, setup a calendar event as a reminder.
```
**Check weather forecast and send a Whatsapp/Telegram/Slack message**
```
Open Weather app, check tomorrow's weather forecast for "Berlin", and
send the summary via Whatsapp/Telegram/Slack to contact "Lauren Trown",
thumbs up their response.
```
**Schedule a meeting in Zoom and share invite via email**
```
Open Zoom app, schedule a meeting titled "AI Hackathon" for tomorrow at
10AM with a duration of 1 hour, copy the invitation link, and send it via
Gmail to contacts "[email protected]".
```
[More prompt examples can be found here.](https://github.com/mobile-next/mobile-mcp/wiki/Prompt-Example-repo-list)
## Prerequisites
What you will need to connect MCP with your agent and mobile devices:
- [Xcode command line tools](https://developer.apple.com/xcode/resources/)
- [Android Platform Tools](https://developer.android.com/tools/releases/platform-tools)
- [node.js](https://nodejs.org/en/download/) v22+
- [MCP](https://modelcontextprotocol.io/introduction) supported foundational models or agents, like [Claude MCP](https://modelcontextprotocol.io/quickstart/server), [OpenAI Agent SDK](https://openai.github.io/openai-agents-python/mcp/), [Copilot Studio](https://www.microsoft.com/en-us/microsoft-copilot/blog/copilot-studio/introducing-model-context-protocol-mcp-in-copilot-studio-simplified-integration-with-ai-apps-and-agents/)
### Simulators, Emulators, and Real Devices
When launched, Mobile MCP can connect to:
- iOS Simulators on macOS
- Android Emulators on Linux/Windows/macOS
- iOS or Android real devices (requires proper platform tools and drivers)
Make sure you have your mobile platform SDKs (Xcode, Android SDK) installed and configured properly before running Mobile Next Mobile MCP.
### Running in "headless" mode on Simulators/Emulators
When you do not have a real device connected to your machine, you can run Mobile MCP with an emulator or simulator in the background.
For example, on Android:
1. Start an emulator (avdmanager / emulator command).
2. Run Mobile MCP with the desired flags.
On iOS, you'll need Xcode and to run the Simulator before using Mobile MCP with that simulator instance.
- `xcrun simctl list`
- `xcrun simctl boot "iPhone 16"`
# Thanks to all contributors ❤️
### We appreciate everyone who has helped improve this project.
<a href = "https://github.com/mobile-next/mobile-mcp/graphs/contributors">
<img src = "https://contrib.rocks/image?repo=mobile-next/mobile-mcp"/>
</a>
```
--------------------------------------------------------------------------------
/SECURITY.md:
--------------------------------------------------------------------------------
```markdown
# Security Policy
## Supported Versions
All versions of this project are currently being supported with security updates.
| Version | Supported |
| ------- | ------------------ |
| All | :white_check_mark: |
## Reporting a Vulnerability
To report a security vulnerability, please join our Slack channel at http://mobilenexthq.com/join-slack and DM the moderators with details of the vulnerability.
We take all security reports seriously and will respond as quickly as possible.
```
--------------------------------------------------------------------------------
/tsconfig.json:
--------------------------------------------------------------------------------
```json
{
"compilerOptions": {
"target": "ESNext",
"skipLibCheck": true,
"esModuleInterop": true,
"moduleResolution": "node",
"strict": true,
"module": "CommonJS",
"outDir": "./lib"
},
"include": [
"src",
],
}
```
--------------------------------------------------------------------------------
/src/png.ts:
--------------------------------------------------------------------------------
```typescript
export interface PngDimensions {
width: number;
height: number;
}
export class PNG {
public constructor(private readonly buffer: Buffer) {
}
public getDimensions(): PngDimensions {
const pngSignature = Buffer.from([137, 80, 78, 71, 13, 10, 26, 10]);
if (!this.buffer.subarray(0, 8).equals(pngSignature)) {
throw new Error("Not a valid PNG file");
}
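// IHDR is the first chunk: after the 8-byte signature come a 4-byte length and the
// 4-byte "IHDR" type, so width sits at offset 16 and height at offset 20 (big-endian).
const width = this.buffer.readUInt32BE(16);
const height = this.buffer.readUInt32BE(20);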
return { width, height };
}
}
```
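The fixed offsets read by `getDimensions()` can be checked without a real image. The following is a minimal sketch, not code from the repository: `makePngHeader` and `readPngDimensions` are hypothetical helpers, and the buffer they produce holds only the first 24 bytes of a PNG (signature plus the start of IHDR), so it is not a decodable image.

```typescript
import { Buffer } from "node:buffer";

// The 8-byte PNG signature, same bytes as in src/png.ts.
const PNG_SIGNATURE = Buffer.from([137, 80, 78, 71, 13, 10, 26, 10]);

// Hypothetical helper: builds the signature plus the first 16 bytes of the IHDR chunk.
function makePngHeader(width: number, height: number): Buffer {
	const ihdr = Buffer.alloc(16);
	ihdr.writeUInt32BE(13, 0); // IHDR data length is always 13 bytes
	ihdr.write("IHDR", 4, "ascii"); // chunk type
	ihdr.writeUInt32BE(width, 8); // absolute offset 16
	ihdr.writeUInt32BE(height, 12); // absolute offset 20
	return Buffer.concat([PNG_SIGNATURE, ihdr]);
}

// Hypothetical standalone version of the offset-based read.
function readPngDimensions(buffer: Buffer): { width: number; height: number } {
	if (!buffer.subarray(0, 8).equals(PNG_SIGNATURE)) {
		throw new Error("Not a valid PNG file");
	}
	return { width: buffer.readUInt32BE(16), height: buffer.readUInt32BE(20) };
}

const dims = readPngDimensions(makePngHeader(1080, 2400));
console.log(dims);
```

Because only the header is inspected, this approach stays cheap even for large screenshots, at the cost of not validating the rest of the file.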
--------------------------------------------------------------------------------
/src/logger.ts:
--------------------------------------------------------------------------------
```typescript
import { appendFileSync } from "node:fs";
type LogLevel = "INFO" | "ERROR";
// Appends a timestamped line to LOG_FILE when it is set, and always echoes the
// message to stderr so that stdout stays free for the stdio transport.
const writeLog = (level: LogLevel, message: string) => {
if (process.env.LOG_FILE) {
const logfile = process.env.LOG_FILE;
const timestamp = new Date().toISOString();
const logMessage = `[${timestamp}] ${level} ${message}`;
appendFileSync(logfile, logMessage + "\n");
}
console.error(message);
};
export const trace = (message: string) => {
writeLog("INFO", message);
};
export const error = (message: string) => {
writeLog("ERROR", message);
};
```
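To see what ends up in the file named by `LOG_FILE`, the line format used by `writeLog` can be reproduced in isolation. This is a rough sketch under assumptions: `formatLogLine` is a hypothetical helper, and the temporary directory prefix and file name are made up for the example.

```typescript
import { appendFileSync, mkdtempSync, readFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper mirroring the "[timestamp] LEVEL message" layout of writeLog.
function formatLogLine(level: string, message: string, now: Date = new Date()): string {
	return `[${now.toISOString()}] ${level} ${message}`;
}

// Write one line to a throwaway file, using a fixed date so the output is reproducible.
const logfile = join(mkdtempSync(join(tmpdir(), "mobile-mcp-log-")), "server.log");
appendFileSync(logfile, formatLogLine("INFO", "server started", new Date(0)) + "\n");

const contents = readFileSync(logfile, "utf8");
console.log(contents);
```

In normal use the same effect comes from starting the server with the `LOG_FILE` environment variable pointing at a writable path.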
--------------------------------------------------------------------------------
/server.json:
--------------------------------------------------------------------------------
```json
{
"$schema": "https://static.modelcontextprotocol.io/schemas/2025-10-17/server.schema.json",
"name": "io.github.mobile-next/mobile-mcp",
"description": "MCP server for iOS and Android Mobile Development, Automation and Testing",
"status": "active",
"repository": {
"url": "https://github.com/mobile-next/mobile-mcp",
"source": "github"
},
"version": "{{VERSION}}",
"packages": [
{
"registryType": "npm",
"registryBaseUrl": "https://registry.npmjs.org",
"identifier": "@mobilenext/mobile-mcp",
"version": "{{VERSION}}",
"transport": {
"type": "stdio"
},
"environmentVariables": [
{
}
]
}
]
}
```
--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/bug_report.md:
--------------------------------------------------------------------------------
```markdown
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''
---
**Describe the bug**
A clear and concise description of what the bug is.
**Configuration (please complete the following information):**
- Agent: [e.g, Claude Desktop, Cursor]
- OS: [e.g, Mac, Linux, Windows]
- Device used: [e.g. Android, iOS, iOS Simulator]
- Device version: [e.g, 18.3.2]
- Device model: [e.g., Samsung Galaxy S25]
**To Reproduce**
Steps to reproduce the behavior:
1. Use prompt '...'
2. Then do '...'
3. See error
**Expected behavior**
A clear and concise description of what you expected to happen.
**Screenshots**
If applicable, add screenshots to help explain your problem.
```
--------------------------------------------------------------------------------
/test/png.ts:
--------------------------------------------------------------------------------
```typescript
import assert from "node:assert";
import { PNG } from "../src/png";
describe("png", async () => {
it("should be able to parse png", () => {
const buffer = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGNgYAAAAAMAAWgmWQ0AAAAASUVORK5CYII=";
const png = new PNG(Buffer.from(buffer, "base64"));
assert.ok(png.getDimensions().width === 1);
assert.ok(png.getDimensions().height === 1);
});
it("should be able to detect an invalid png", done => {
try {
const buffer = btoa("IAMADUCKIAMADUCKIAMADUCKIAMADUCKIAMADUCK");
const png = new PNG(Buffer.from(buffer, "base64"));
png.getDimensions();
done(new Error("should have thrown an error"));
} catch (error) {
done();
}
});
});
```
--------------------------------------------------------------------------------
/test/ios.ts:
--------------------------------------------------------------------------------
```typescript
import assert from "node:assert";
import { IosManager, IosRobot } from "../src/ios";
import { PNG } from "../src/png";
describe("ios", async () => {
const manager = new IosManager();
const devices = await manager.listDevices();
const hasOneDevice = devices.length === 1;
const robot = new IosRobot(devices?.[0]?.deviceId || "");
it("should be able to get screenshot", async function() {
hasOneDevice || this.skip();
const screenshot = await robot.getScreenshot();
// a black screenshot (screen is off) still consumes over 30KB
assert.ok(screenshot.length > 128 * 1024);
// must be a valid png image that matches the screen size
const image = new PNG(screenshot);
const pngSize = image.getDimensions();
const screenSize = await robot.getScreenSize();
// wda returns screen size as points, round up
assert.equal(Math.ceil(pngSize.width / screenSize.scale), screenSize.width);
assert.equal(Math.ceil(pngSize.height / screenSize.scale), screenSize.height);
});
});
```
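The last two assertions in the test above rely on the relationship between screenshot pixels and the point-based screen size reported by WDA. A small sketch of that arithmetic follows; `pixelsToPoints` is a hypothetical helper and the 1179x2556 at 3x values are illustrative inputs, not data taken from the repository.

```typescript
interface Dimensions {
	width: number;
	height: number;
}

// Hypothetical helper: convert a pixel size to points, rounding up as the test does.
function pixelsToPoints(pixels: Dimensions, scale: number): Dimensions {
	return {
		width: Math.ceil(pixels.width / scale),
		height: Math.ceil(pixels.height / scale),
	};
}

// Illustrative values: a 1179x2556 pixel screenshot on a 3x display.
const points = pixelsToPoints({ width: 1179, height: 2556 }, 3);
console.log(points); // { width: 393, height: 852 }
```

Rounding up matters when the pixel size is not an exact multiple of the scale factor, which is why the test uses `Math.ceil` rather than plain division.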
--------------------------------------------------------------------------------
/.github/workflows/build.yml:
--------------------------------------------------------------------------------
```yaml
name: Build
permissions:
id-token: write
contents: write
on:
push:
branches:
- main
tags:
- "*.*.*"
pull_request:
branches:
- main
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Java
uses: actions/setup-java@v4
with:
distribution: 'temurin'
java-version: '17'
- name: Use Node.js 22
uses: actions/setup-node@v4
with:
node-version: '22'
cache: 'npm'
- name: Install dependencies
run: npm install
- name: Audit
run: npm audit --audit-level high
- name: Lint
run: npm run lint
- name: Update version
if: github.ref_type == 'tag'
run: |
npm version "${{ github.ref_name }}" --no-git-tag-version
npm update
- name: Build
run: |
npm run build
- name: Publish
if: github.ref_type == 'tag'
env:
NPM_AUTH_TOKEN: ${{ secrets.NPM_AUTH_TOKEN }}
run: |
echo "//registry.npmjs.org/:_authToken=$NPM_AUTH_TOKEN" >> ~/.npmrc
npm publish
- name: Publish on github mcp registry
if: github.ref_type == 'tag'
run: |
# get homebrew in path
eval "$(/home/linuxbrew/.linuxbrew/bin/brew shellenv)"
brew install mcp-publisher
mcp-publisher login github-oidc
# patch version
sed -i "s/{{VERSION}}/${{ github.ref_name }}/g" server.json
mcp-publisher publish
mcp-publisher logout
```
--------------------------------------------------------------------------------
/src/index.ts:
--------------------------------------------------------------------------------
```typescript
#!/usr/bin/env node
import { SSEServerTransport } from "@modelcontextprotocol/sdk/server/sse.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { createMcpServer, getAgentVersion } from "./server";
import { error } from "./logger";
import express from "express";
import { program } from "commander";
const startSseServer = async (port: number) => {
const app = express();
const server = createMcpServer();
let transport: SSEServerTransport | null = null;
app.post("/mcp", (req, res) => {
if (transport) {
transport.handlePostMessage(req, res);
}
});
app.get("/mcp", (req, res) => {
if (transport) {
transport.close();
}
transport = new SSEServerTransport("/mcp", res);
server.connect(transport);
});
app.listen(port, () => {
error(`mobile-mcp ${getAgentVersion()} sse server listening on http://localhost:${port}/mcp`);
});
};
const startStdioServer = async () => {
try {
const transport = new StdioServerTransport();
const server = createMcpServer();
await server.connect(transport);
error("mobile-mcp server running on stdio");
} catch (err: any) {
console.error("Fatal error in main():", err);
error("Fatal error in main(): " + JSON.stringify(err.stack));
process.exit(1);
}
};
const main = async () => {
program
.version(getAgentVersion())
.option("--port <port>", "Start SSE server on this port")
.option("--stdio", "Start stdio server (default)")
.parse(process.argv);
const options = program.opts();
if (options.port) {
await startSseServer(+options.port);
} else {
await startStdioServer();
}
};
main().then();
```
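In `main()` the `--port` value is converted with unary plus, which yields `NaN` for non-numeric input. One possible way to reject bad values before calling `app.listen` is sketched below; `parsePort` is a hypothetical helper and is not part of the repository.

```typescript
// Hypothetical helper: parse a decimal TCP port string and validate its range.
function parsePort(value: string): number {
	// Accept plain decimal digits only, so "0x50", "80.5" and "" are rejected.
	if (!/^\d+$/.test(value)) {
		throw new Error(`Invalid port: ${value}`);
	}
	const port = Number(value);
	if (port < 1 || port > 65535) {
		throw new Error(`Port out of range: ${value}`);
	}
	return port;
}

console.log(parsePort("8080"));
```

With such a helper, the call site could become `await startSseServer(parsePort(options.port))`, failing fast with a readable message instead of passing `NaN` to the HTTP server.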
--------------------------------------------------------------------------------
/src/mobilecli.ts:
--------------------------------------------------------------------------------
```typescript
import { existsSync } from "node:fs";
import { dirname, join, sep } from "node:path";
export const getMobilecliPath = (): string => {
if (process.env.MOBILECLI_PATH) {
return process.env.MOBILECLI_PATH;
}
const arch = process.arch;
const platform = process.platform;
let binaryName = "mobilecli";
switch (platform) {
case "darwin":
if (arch === "arm64") {
binaryName += "-darwin-arm64";
} else {
binaryName += "-darwin-amd64";
}
break;
case "linux":
if (arch === "arm64") {
binaryName += "-linux-arm64";
} else {
binaryName += "-linux-amd64";
}
break;
case "win32":
binaryName += "-windows-amd64.exe";
break;
default:
throw new Error(`Unsupported platform: ${platform}`);
}
// Check if mobile-mcp is installed as a package
const currentPath = __filename;
const pathParts = currentPath.split(sep);
const lastNodeModulesIndex = pathParts.lastIndexOf("node_modules");
if (lastNodeModulesIndex !== -1) {
// We're inside node_modules, go to the last node_modules in the path
const nodeModulesParts = pathParts.slice(0, lastNodeModulesIndex + 1);
const lastNodeModulesPath = nodeModulesParts.join(sep);
const mobilecliPath = join(lastNodeModulesPath, "@mobilenext", "mobilecli", "bin", binaryName);
if (existsSync(mobilecliPath)) {
return mobilecliPath;
}
}
// Not in node_modules, look one directory up from current script
const scriptDir = dirname(__filename);
const parentDir = dirname(scriptDir);
const mobilecliPath = join(parentDir, "node_modules", "@mobilenext", "mobilecli", "bin", binaryName);
if (existsSync(mobilecliPath)) {
return mobilecliPath;
}
throw new Error(`Could not find mobilecli binary for platform: ${platform}`);
};
```
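The platform switch in `getMobilecliPath` reads `process.platform` and `process.arch` directly, which makes it awkward to test on a single machine. As a sketch only, the same mapping can be written as a pure function; `mobilecliBinaryName` is a hypothetical name, and the returned strings are the ones that appear in the source above.

```typescript
// Hypothetical pure variant of the platform switch, testable without touching process.*.
function mobilecliBinaryName(platform: string, arch: string): string {
	// As in the source, anything other than arm64 falls back to amd64.
	const suffix = arch === "arm64" ? "arm64" : "amd64";
	switch (platform) {
		case "darwin":
			return `mobilecli-darwin-${suffix}`;
		case "linux":
			return `mobilecli-linux-${suffix}`;
		case "win32":
			// The source maps every Windows architecture to the amd64 binary name.
			return "mobilecli-windows-amd64.exe";
		default:
			throw new Error(`Unsupported platform: ${platform}`);
	}
}

console.log(mobilecliBinaryName("darwin", "arm64"));
console.log(mobilecliBinaryName("linux", "x64"));
```

Setting `MOBILECLI_PATH` bypasses this lookup entirely, which is the simplest option when the binary lives in a non-standard location.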
--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------
```json
{
"name": "@mobilenext/mobile-mcp",
"mcpName": "io.github.mobile-next/mobile-mcp",
"version": "0.0.1",
"description": "Mobile MCP",
"repository": {
"type": "git",
"url": "git+https://github.com/mobile-next/mobile-mcp.git"
},
"engines": {
"node": ">=18"
},
"license": "Apache-2.0",
"scripts": {
"build": "tsc && chmod +x lib/index.js",
"lint": "eslint .",
"fixlint": "eslint . --fix",
"test": "nyc mocha --require ts-node/register test/*.ts",
"watch": "tsc --watch",
"clean": "rm -rf lib",
"prepare": "husky"
},
"files": [
"lib"
],
"dependencies": {
"@modelcontextprotocol/sdk": "1.13.2",
"commander": "14.0.0",
"express": "5.1.0",
"fast-xml-parser": "5.2.5",
"zod-to-json-schema": "3.24.6"
},
"optionalDependencies": {
"@mobilenext/mobilecli": "0.0.33"
},
"devDependencies": {
"@eslint/eslintrc": "^3.2.0",
"@eslint/js": "^9.19.0",
"@stylistic/eslint-plugin": "^3.0.1",
"@types/commander": "^2.12.0",
"@types/express": "^5.0.3",
"@types/mocha": "^10.0.10",
"@types/node": "^22.13.10",
"@typescript-eslint/eslint-plugin": "^8.28.0",
"@typescript-eslint/parser": "^8.26.1",
"@typescript-eslint/utils": "^8.26.1",
"eslint": "^9.19.0",
"eslint-plugin": "^1.0.1",
"eslint-plugin-import": "^2.31.0",
"eslint-plugin-notice": "^1.0.0",
"husky": "^9.1.7",
"mocha": "^11.1.0",
"nyc": "^17.1.0",
"ts-node": "^10.9.2",
"typescript": "^5.8.2"
},
"main": "index.js",
"bin": {
"mcp-server-mobile": "lib/index.js"
},
"directories": {
"lib": "lib"
},
"author": "",
"bugs": {
"url": "https://github.com/mobile-next/mobile-mcp/issues"
},
"homepage": "https://github.com/mobile-next/mobile-mcp#readme"
}
```
--------------------------------------------------------------------------------
/eslint.config.mjs:
--------------------------------------------------------------------------------
```javascript
import typescriptEslint from "@typescript-eslint/eslint-plugin";
import tsParser from "@typescript-eslint/parser";
import stylistic from "@stylistic/eslint-plugin";
import importRules from "eslint-plugin-import";
const plugins = {
"@stylistic": stylistic,
"@typescript-eslint": typescriptEslint,
import: importRules,
};
export const baseRules = {
"@typescript-eslint/no-unused-vars": [
2,
{args: "none", caughtErrors: "none"},
],
/**
* Enforced rules
*/
// syntax preferences
"object-curly-spacing": ["error", "always"],
quotes: [
2,
"double",
{
avoidEscape: true,
allowTemplateLiterals: true,
},
],
"jsx-quotes": [2, "prefer-single"],
"no-extra-semi": 2,
"@stylistic/semi": [2],
"comma-style": [2, "last"],
"wrap-iife": [2, "inside"],
"spaced-comment": [
2,
"always",
{
markers: ["*"],
},
],
eqeqeq: [2],
"accessor-pairs": [
2,
{
getWithoutSet: false,
setWithoutGet: false,
},
],
"brace-style": [2, "1tbs", {allowSingleLine: true}],
curly: [2, "all"],
"new-parens": 2,
"arrow-parens": [2, "as-needed"],
"prefer-const": 2,
"quote-props": [2, "consistent"],
"nonblock-statement-body-position": [2, "below"],
// anti-patterns
"no-var": 2,
"no-with": 2,
"no-multi-str": 2,
"no-caller": 2,
"no-implied-eval": 2,
"no-labels": 2,
"no-new-object": 2,
"no-octal-escape": 2,
"no-self-compare": 2,
"no-shadow-restricted-names": 2,
"no-cond-assign": 2,
"no-debugger": 2,
"no-dupe-keys": 2,
"no-duplicate-case": 2,
"no-empty-character-class": 2,
"no-unreachable": 2,
"no-unsafe-negation": 2,
radix: 2,
"valid-typeof": 2,
"no-implicit-globals": [2],
"no-unused-expressions": [
2,
{allowShortCircuit: true, allowTernary: true, allowTaggedTemplates: true},
],
"no-proto": 2,
// es2015 features
"require-yield": 2,
"template-curly-spacing": [2, "never"],
// spacing details
"space-infix-ops": 2,
"space-in-parens": [2, "never"],
"array-bracket-spacing": [2, "never"],
"comma-spacing": [2, {before: false, after: true}],
"space-before-function-paren": [
2,
{
anonymous: "never",
named: "never",
asyncArrow: "always",
},
],
"no-whitespace-before-property": 2,
"keyword-spacing": [
2,
{
overrides: {
if: {after: true},
else: {after: true},
for: {after: true},
while: {after: true},
do: {after: true},
switch: {after: true},
return: {after: true},
},
},
],
"arrow-spacing": [
2,
{
after: true,
before: true,
},
],
"@stylistic/func-call-spacing": 2,
"@stylistic/type-annotation-spacing": 2,
// file whitespace
"no-multiple-empty-lines": [2, {max: 2, maxEOF: 0}],
"no-mixed-spaces-and-tabs": 2,
"no-trailing-spaces": 2,
"linebreak-style": [process.platform === "win32" ? 0 : 2, "unix"],
indent: [
2,
"tab",
{SwitchCase: 1, CallExpression: {arguments: "first"}, MemberExpression: 1},
],
"key-spacing": [
2,
{
beforeColon: false,
},
],
"eol-last": 2,
};
const languageOptions = {
parser: tsParser,
ecmaVersion: 9,
sourceType: "module",
};
export default [
{
files: ["**/*.ts"],
plugins,
languageOptions,
rules: baseRules,
},
];
```
--------------------------------------------------------------------------------
/src/robot.ts:
--------------------------------------------------------------------------------
```typescript
export interface Dimensions {
width: number;
height: number;
}
export interface ScreenSize extends Dimensions {
scale: number;
}
export interface InstalledApp {
packageName: string;
appName: string;
}
export type SwipeDirection = "up" | "down" | "left" | "right";
export type Button = "HOME" | "BACK" | "VOLUME_UP" | "VOLUME_DOWN" | "ENTER" | "DPAD_CENTER" | "DPAD_UP" | "DPAD_DOWN" | "DPAD_LEFT" | "DPAD_RIGHT";
export interface ScreenElementRect {
x: number;
y: number;
width: number;
height: number;
}
export interface ScreenElement {
type: string;
label?: string;
text?: string;
name?: string;
value?: string;
identifier?: string;
rect: ScreenElementRect;
// currently only on android tv
focused?: boolean;
}
export class ActionableError extends Error {
constructor(message: string) {
super(message);
}
}
export type Orientation = "portrait" | "landscape";
export interface Robot {
/**
* Get the screen size of the device in pixels.
*/
getScreenSize(): Promise<ScreenSize>;
/**
* Swipe in a direction.
*/
swipe(direction: SwipeDirection): Promise<void>;
/**
* Swipe from a specific coordinate in a direction.
*/
swipeFromCoordinate(x: number, y: number, direction: SwipeDirection, distance?: number): Promise<void>;
/**
* Get a screenshot of the screen. Returns a Buffer that contains
* a PNG image of the screen. Will be same dimensions as getScreenSize().
*/
getScreenshot(): Promise<Buffer>;
/**
* List all installed apps on the device. Returns an array of InstalledApp entries, each with
* the app's package name (bundle identifier on iOS) and display name.
*/
listApps(): Promise<InstalledApp[]>;
/**
* Launch an app.
*/
launchApp(packageName: string): Promise<void>;
/**
* Terminate an app. If the app was already terminated (or does not exist), this
* is a no-op.
*/
terminateApp(packageName: string): Promise<void>;
/**
* Install an app on the device from a file path.
*/
installApp(path: string): Promise<void>;
/**
* Uninstall an app from the device.
*/
uninstallApp(bundleId: string): Promise<void>;
/**
* Open a URL in the device's web browser. Can be an https:// url, or a
* custom scheme (e.g. "myapp://").
*/
openUrl(url: string): Promise<void>;
/**
* Send keys to the device, simulating keyboard input.
*/
sendKeys(text: string): Promise<void>;
/**
* Press a button on the device, simulating a physical button press.
*/
pressButton(button: Button): Promise<void>;
/**
* Tap on a specific coordinate on the screen.
*/
tap(x: number, y: number): Promise<void>;
/**
* Double tap on a specific coordinate on the screen.
*/
doubleTap(x: number, y: number): Promise<void>;
/**
* Long press on a specific coordinate on the screen.
*/
longPress(x: number, y: number): Promise<void>;
/**
* Get all elements on the screen. Works only on native apps (not webviews). Will
* return a filtered list of elements that make sense to interact with.
*/
getElementsOnScreen(): Promise<ScreenElement[]>;
/**
* Change the screen orientation of the device.
* @param orientation The desired orientation ("portrait" or "landscape")
*/
setOrientation(orientation: Orientation): Promise<void>;
/**
* Get the current screen orientation.
*/
getOrientation(): Promise<Orientation>;
}
```
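As a rough illustration of how calling code might program against `Robot`, the sketch below uses a cut-down local copy of two of its methods. `TapOnlyRobot`, `FakeRobot`, and `tapCenter` are hypothetical names that do not exist in the repository, and the `ScreenSize` shape (`width`, `height`, `scale`) is an assumption based on what the tests check.

```typescript
// Assumed shape of ScreenSize: the tests expect exactly these three properties.
interface ScreenSize {
	width: number;
	height: number;
	scale: number;
}

// Hypothetical subset of Robot, just large enough for this example.
interface TapOnlyRobot {
	getScreenSize(): Promise<ScreenSize>;
	tap(x: number, y: number): Promise<void>;
}

// Hypothetical helper: tap the middle of the screen on any Robot-like object.
async function tapCenter(robot: TapOnlyRobot): Promise<{ x: number; y: number }> {
	const { width, height } = await robot.getScreenSize();
	const x = Math.floor(width / 2);
	const y = Math.floor(height / 2);
	await robot.tap(x, y);
	return { x, y };
}

// In-memory fake that records taps, so the helper can run without a device.
class FakeRobot implements TapOnlyRobot {
	public taps: Array<{ x: number; y: number }> = [];

	public async getScreenSize(): Promise<ScreenSize> {
		return { width: 1080, height: 2400, scale: 1 };
	}

	public async tap(x: number, y: number): Promise<void> {
		this.taps.push({ x, y });
	}
}
```

Because the helper depends only on the interface, the same code would presumably work with `AndroidRobot`, `IosRobot`, or `Simctl`, all of which implement `Robot`.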
--------------------------------------------------------------------------------
/src/image-utils.ts:
--------------------------------------------------------------------------------
```typescript
import { execFileSync, spawnSync } from "child_process";
import os from "node:os";
import fs from "node:fs";
import path from "node:path";
import { trace } from "./logger";
const DEFAULT_JPEG_QUALITY = 75;
export class ImageTransformer {
private newWidth: number = 0;
private newFormat: "jpg" | "png" = "png";
private jpegOptions: { quality: number } = { quality: DEFAULT_JPEG_QUALITY };
constructor(private buffer: Buffer) {}
public resize(width: number): ImageTransformer {
this.newWidth = width;
return this;
}
public jpeg(options: { quality: number }): ImageTransformer {
this.newFormat = "jpg";
this.jpegOptions = options;
return this;
}
public png(): ImageTransformer {
this.newFormat = "png";
return this;
}
public toBuffer(): Buffer {
if (isSipsInstalled()) {
try {
return this.toBufferWithSips();
} catch (error) {
trace(`Sips failed, falling back to ImageMagick: ${error}`);
}
}
try {
return this.toBufferWithImageMagick();
} catch (error) {
trace(`ImageMagick failed: ${error}`);
throw new Error("Image scaling unavailable (requires Sips or ImageMagick).");
}
}
private qualityToSips(q: number): "low" | "normal" | "high" | "best" {
if (q >= 90) {
return "best";
}
if (q >= 75) {
return "high";
}
if (q >= 50) {
return "normal";
}
return "low";
}
private toBufferWithSips(): Buffer {
const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), "image-"));
const inputFile = path.join(tempDir, "input");
const outputFile = path.join(tempDir, `output.${this.newFormat === "jpg" ? "jpg" : "png"}`);
try {
fs.writeFileSync(inputFile, this.buffer);
const args = ["-s", "format", this.newFormat === "jpg" ? "jpeg" : "png"];
if (this.newFormat === "jpg") {
args.push("-s", "formatOptions", this.qualityToSips(this.jpegOptions.quality));
}
args.push("-Z", `${this.newWidth}`);
args.push("--out", outputFile);
args.push(inputFile);
trace(`Running sips command: /usr/bin/sips ${args.join(" ")}`);
const proc = spawnSync("/usr/bin/sips", args, {
maxBuffer: 8 * 1024 * 1024
});
if (proc.status !== 0) {
throw new Error(`Sips failed with status ${proc.status}`);
}
const outputBuffer = fs.readFileSync(outputFile);
trace("Sips returned buffer of size: " + outputBuffer.length);
return outputBuffer;
} finally {
try {
fs.rmSync(tempDir, { recursive: true, force: true });
} catch (error) {
// Ignore cleanup errors
}
}
}
private toBufferWithImageMagick(): Buffer {
const magickArgs = ["-", "-resize", `${this.newWidth}x`, "-quality", `${this.jpegOptions.quality}`, `${this.newFormat}:-`];
trace(`Running magick command: magick ${magickArgs.join(" ")}`);
const proc = spawnSync("magick", magickArgs, {
maxBuffer: 8 * 1024 * 1024,
input: this.buffer
});
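// spawnSync does not throw on failure, so surface errors explicitly
if (proc.error) {
throw proc.error;
}
if (proc.status !== 0) {
throw new Error(`ImageMagick failed with status ${proc.status}`);
}
return proc.stdout;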
}
}
export class Image {
constructor(private buffer: Buffer) {}
public static fromBuffer(buffer: Buffer): Image {
return new Image(buffer);
}
public resize(width: number): ImageTransformer {
return new ImageTransformer(this.buffer).resize(width);
}
public jpeg(options: { quality: number }): ImageTransformer {
return new ImageTransformer(this.buffer).jpeg(options);
}
}
const isDarwin = (): boolean => {
return os.platform() === "darwin";
};
export const isSipsInstalled = (): boolean => {
if (!isDarwin()) {
return false;
}
try {
execFileSync("/usr/bin/sips", ["--version"]);
return true;
} catch (error) {
return false;
}
};
export const isImageMagickInstalled = (): boolean => {
try {
return execFileSync("magick", ["--version"])
.toString()
.split("\n")
.filter(line => line.includes("Version: ImageMagick"))
.length > 0;
} catch (error) {
return false;
}
};
export const isScalingAvailable = (): boolean => {
return isImageMagickInstalled() || isSipsInstalled();
};
```
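The way `ImageTransformer` turns a numeric JPEG quality into a `sips` argument list can be sketched as two pure functions. `qualityToSips` copies the thresholds from the class above; `buildSipsArgs` is a hypothetical standalone helper (not in the repository) that mirrors the argument order used in `toBufferWithSips()`.

```typescript
type SipsQuality = "low" | "normal" | "high" | "best";

// Same thresholds as ImageTransformer.qualityToSips in src/image-utils.ts.
function qualityToSips(q: number): SipsQuality {
	if (q >= 90) {
		return "best";
	}
	if (q >= 75) {
		return "high";
	}
	if (q >= 50) {
		return "normal";
	}
	return "low";
}

// Hypothetical helper: builds the same argument list as toBufferWithSips(),
// without touching the file system or spawning a process.
function buildSipsArgs(
	format: "jpg" | "png",
	width: number,
	quality: number,
	inputFile: string,
	outputFile: string
): string[] {
	const args = ["-s", "format", format === "jpg" ? "jpeg" : "png"];
	if (format === "jpg") {
		// quality only applies to JPEG output
		args.push("-s", "formatOptions", qualityToSips(quality));
	}
	args.push("-Z", `${width}`, "--out", outputFile, inputFile);
	return args;
}
```

For instance, `buildSipsArgs("jpg", 800, 75, "input", "output.jpg")` yields the arguments `-s format jpeg -s formatOptions high -Z 800 --out output.jpg input`.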
--------------------------------------------------------------------------------
/test/android.ts:
--------------------------------------------------------------------------------
```typescript
import assert from "node:assert";
import { PNG } from "../src/png";
import { AndroidRobot, AndroidDeviceManager } from "../src/android";
const manager = new AndroidDeviceManager();
const devices = manager.getConnectedDevices();
const hasOneAndroidDevice = devices.length === 1;
describe("android", () => {
const android = new AndroidRobot(devices?.[0]?.deviceId || "");
it("should be able to get the screen size", async function() {
hasOneAndroidDevice || this.skip();
const screenSize = await android.getScreenSize();
assert.ok(screenSize.width > 1024);
assert.ok(screenSize.height > 1024);
assert.ok(screenSize.scale === 1);
assert.equal(Object.keys(screenSize).length, 3, "screenSize should have exactly 3 properties");
});
it("should be able to take screenshot", async function() {
hasOneAndroidDevice || this.skip();
const screenSize = await android.getScreenSize();
const screenshot = await android.getScreenshot();
assert.ok(screenshot.length > 64 * 1024);
// must be a valid png image that matches the screen size
const image = new PNG(screenshot);
const pngSize = image.getDimensions();
assert.equal(pngSize.width, screenSize.width);
assert.equal(pngSize.height, screenSize.height);
});
it("should be able to list apps", async function() {
hasOneAndroidDevice || this.skip();
const apps = await android.listApps();
const packages = apps.map(app => app.packageName);
assert.ok(packages.includes("com.android.settings"));
});
it("should be able to open a url", async function() {
hasOneAndroidDevice || this.skip();
await android.adb("shell", "input", "keyevent", "HOME");
await android.openUrl("https://www.example.com");
});
it("should be able to list elements on screen", async function() {
hasOneAndroidDevice || this.skip();
await android.terminateApp("com.android.chrome");
await android.adb("shell", "input", "keyevent", "HOME");
await android.openUrl("https://www.example.com");
const elements = await android.getElementsOnScreen();
// make sure title (TextView) is present
const foundTitle = elements.find(element => element.type === "android.widget.TextView" && element.text?.startsWith("This domain is for use in illustrative examples in documents"));
assert.ok(foundTitle, "Title element not found");
// make sure navbar (EditText) is present
const foundNavbar = elements.find(element => element.type === "android.widget.EditText" && element.label === "Search or type URL" && element.text === "example.com");
assert.ok(foundNavbar, "Navbar element not found");
// this is an icon, but has accessibility label
const foundNewTabIcon = elements.find(element => element.type === "android.widget.ImageButton" && element.text === "" && element.label === "New tab");
assert.ok(foundNewTabIcon, "New tab icon not found");
});
it("should be able to send keys and tap", async function() {
hasOneAndroidDevice || this.skip();
await android.terminateApp("com.google.android.deskclock");
await android.adb("shell", "pm", "clear", "com.google.android.deskclock");
await android.launchApp("com.google.android.deskclock");
// We probably start at Clock tab
await new Promise(resolve => setTimeout(resolve, 3000));
let elements = await android.getElementsOnScreen();
const timerElement = elements.find(e => e.label === "Timer" && e.type === "android.widget.FrameLayout");
assert.ok(timerElement !== undefined);
await android.tap(timerElement.rect.x, timerElement.rect.y);
// now we're in Timer tab
await new Promise(resolve => setTimeout(resolve, 3000));
elements = await android.getElementsOnScreen();
const currentTime = elements.find(e => e.text === "00h 00m 00s");
assert.ok(currentTime !== undefined, "Expected time to be 00h 00m 00s");
await android.sendKeys("123456");
// now the title has changed with new timer
await new Promise(resolve => setTimeout(resolve, 3000));
elements = await android.getElementsOnScreen();
const newTime = elements.find(e => e.text === "12h 34m 56s");
assert.ok(newTime !== undefined, "Expected time to be 12h 34m 56s");
await android.terminateApp("com.google.android.deskclock");
});
it("should be able to launch and terminate an app", async function() {
hasOneAndroidDevice || this.skip();
// kill if running
await android.terminateApp("com.android.chrome");
await android.launchApp("com.android.chrome");
await new Promise(resolve => setTimeout(resolve, 3000));
const processes = await android.listRunningProcesses();
assert.ok(processes.includes("com.android.chrome"));
await android.terminateApp("com.android.chrome");
const processes2 = await android.listRunningProcesses();
assert.ok(!processes2.includes("com.android.chrome"));
});
it("should handle orientation changes", async function() {
hasOneAndroidDevice || this.skip();
// assume we start in portrait
const originalOrientation = await android.getOrientation();
assert.equal(originalOrientation, "portrait");
const screenSize1 = await android.getScreenSize();
// set to landscape
await android.setOrientation("landscape");
await new Promise(resolve => setTimeout(resolve, 1500));
const orientation = await android.getOrientation();
assert.equal(orientation, "landscape");
const screenSize2 = await android.getScreenSize();
// set to portrait
await android.setOrientation("portrait");
await new Promise(resolve => setTimeout(resolve, 1500));
const orientation2 = await android.getOrientation();
assert.equal(orientation2, "portrait");
// screen size should not have changed
assert.deepEqual(screenSize1, screenSize2);
});
});
```
--------------------------------------------------------------------------------
/test/iphone-simulator.ts:
--------------------------------------------------------------------------------
```typescript
import assert from "node:assert";
import { randomBytes } from "node:crypto";
import { PNG } from "../src/png";
import { SimctlManager } from "../src/iphone-simulator";
describe("iphone-simulator", () => {
const manager = new SimctlManager();
const bootedSimulators = manager.listBootedSimulators();
const hasOneSimulator = bootedSimulators.length === 1;
const simctl = manager.getSimulator(bootedSimulators?.[0]?.uuid || "");
const restartApp = async (app: string) => {
await simctl.launchApp(app);
await simctl.terminateApp(app);
await simctl.launchApp(app);
};
const restartPreferencesApp = async () => {
await restartApp("com.apple.Preferences");
};
const restartRemindersApp = async () => {
await restartApp("com.apple.reminders");
};
it("should be able to swipe", async function() {
hasOneSimulator || this.skip();
await restartPreferencesApp();
// make sure "General" is present (since it's at the top of the list)
const elements1 = await simctl.getElementsOnScreen();
assert.ok(elements1.findIndex(e => e.name === "com.apple.settings.general") !== -1);
// swipe up (bottom of screen to top of screen)
await simctl.swipe("up");
// make sure "General" is not visible now
const elements2 = await simctl.getElementsOnScreen();
assert.ok(elements2.findIndex(e => e.name === "com.apple.settings.general") === -1);
// swipe down
await simctl.swipe("down");
// make sure "General" is visible again
const elements3 = await simctl.getElementsOnScreen();
assert.ok(elements3.findIndex(e => e.name === "com.apple.settings.general") !== -1);
});
it("should be able to send keys and press enter", async function() {
hasOneSimulator || this.skip();
await restartRemindersApp();
// find new reminder element
await new Promise(resolve => setTimeout(resolve, 3000));
const elements = await simctl.getElementsOnScreen();
const newElement = elements.find(e => e.label === "New Reminder");
assert.ok(newElement !== undefined, "should have found New Reminder element");
// click on new reminder
await simctl.tap(newElement.rect.x, newElement.rect.y);
// wait for keyboard to appear
await new Promise(resolve => setTimeout(resolve, 1000));
// send keys with press button "Enter"
const random1 = randomBytes(8).toString("hex");
await simctl.sendKeys(random1);
await simctl.pressButton("ENTER");
// send keys with "\n"
const random2 = randomBytes(8).toString("hex");
await simctl.sendKeys(random2 + "\n");
const elements2 = await simctl.getElementsOnScreen();
assert.ok(elements2.findIndex(e => e.value === random1) !== -1);
assert.ok(elements2.findIndex(e => e.value === random2) !== -1);
});
it("should be able to get the screen size", async function() {
hasOneSimulator || this.skip();
const screenSize = await simctl.getScreenSize();
assert.ok(screenSize.width > 256);
assert.ok(screenSize.height > 256);
assert.ok(screenSize.scale >= 1);
assert.equal(Object.keys(screenSize).length, 3, "screenSize should have exactly 3 properties");
});
it("should be able to get screenshot", async function() {
hasOneSimulator || this.skip();
const screenshot = await simctl.getScreenshot();
assert.ok(screenshot.length > 64 * 1024);
// must be a valid png image that matches the screen size
const image = new PNG(screenshot);
const pngSize = image.getDimensions();
const screenSize = await simctl.getScreenSize();
// wda returns screen size as points, round up
assert.equal(Math.ceil(pngSize.width / screenSize.scale), screenSize.width);
assert.equal(Math.ceil(pngSize.height / screenSize.scale), screenSize.height);
});
it("should be able to open url", async function() {
hasOneSimulator || this.skip();
// simply checking that openUrl with https:// launches Safari
await simctl.openUrl("https://www.example.com");
await new Promise(resolve => setTimeout(resolve, 1000));
const elements = await simctl.getElementsOnScreen();
assert.ok(elements.length > 0);
const addressBar = elements.find(element => element.type === "TextField" && element.name === "TabBarItemTitle" && element.label === "Address");
assert.ok(addressBar !== undefined, "should have address bar");
});
it("should be able to list apps", async function() {
hasOneSimulator || this.skip();
const apps = await simctl.listApps();
const packages = apps.map(app => app.packageName);
assert.ok(packages.includes("com.apple.mobilesafari"));
assert.ok(packages.includes("com.apple.reminders"));
assert.ok(packages.includes("com.apple.Preferences"));
});
it("should be able to get elements on screen", async function() {
hasOneSimulator || this.skip();
await simctl.pressButton("HOME");
await new Promise(resolve => setTimeout(resolve, 2000));
const elements = await simctl.getElementsOnScreen();
assert.ok(elements.length > 0);
// must have News app in home screen
const element = elements.find(e => e.type === "Icon" && e.label === "News");
assert.ok(element !== undefined, "should have News app in home screen");
});
it("should be able to launch and terminate app", async function() {
hasOneSimulator || this.skip();
await restartPreferencesApp();
await new Promise(resolve => setTimeout(resolve, 2000));
const elements = await simctl.getElementsOnScreen();
const buttons = elements.filter(e => e.type === "Button").map(e => e.label);
assert.ok(buttons.includes("General"));
assert.ok(buttons.includes("Accessibility"));
// make sure app is terminated
await simctl.terminateApp("com.apple.Preferences");
const elements2 = await simctl.getElementsOnScreen();
const buttons2 = elements2.filter(e => e.type === "Button").map(e => e.label);
assert.ok(!buttons2.includes("General"));
});
it("should throw an error if button is not supported", async function() {
hasOneSimulator || this.skip();
try {
await simctl.pressButton("NOT_A_BUTTON" as any);
assert.fail("should have thrown an error");
} catch (error) {
assert.ok(error instanceof Error);
assert.ok(error.message.includes("Button \"NOT_A_BUTTON\" is not supported"));
}
});
});
```
--------------------------------------------------------------------------------
/src/ios.ts:
--------------------------------------------------------------------------------
```typescript
import { Socket } from "node:net";
import { execFileSync } from "node:child_process";
import { WebDriverAgent } from "./webdriver-agent";
import { ActionableError, Button, InstalledApp, Robot, ScreenSize, SwipeDirection, ScreenElement, Orientation } from "./robot";
const WDA_PORT = 8100;
const IOS_TUNNEL_PORT = 60105;
interface ListCommandOutput {
deviceList: string[];
}
interface VersionCommandOutput {
version: string;
}
interface InfoCommandOutput {
DeviceClass: string;
DeviceName: string;
ProductName: string;
ProductType: string;
ProductVersion: string;
PhoneNumber: string;
TimeZone: string;
}
export interface IosDevice {
deviceId: string;
deviceName: string;
}
const getGoIosPath = (): string => {
if (process.env.GO_IOS_PATH) {
return process.env.GO_IOS_PATH;
}
// fall back to the `ios` binary on PATH (e.g. installed via `npm install -g go-ios`)
return "ios";
};
export class IosRobot implements Robot {
public constructor(private deviceId: string) {
}
private isListeningOnPort(port: number): Promise<boolean> {
return new Promise((resolve, reject) => {
const client = new Socket();
client.connect(port, "localhost", () => {
client.destroy();
resolve(true);
});
client.on("error", (err: any) => {
resolve(false);
});
});
}
private async isTunnelRunning(): Promise<boolean> {
return await this.isListeningOnPort(IOS_TUNNEL_PORT);
}
private async isWdaForwardRunning(): Promise<boolean> {
return await this.isListeningOnPort(WDA_PORT);
}
private async assertTunnelRunning(): Promise<void> {
if (await this.isTunnelRequired()) {
if (!(await this.isTunnelRunning())) {
throw new ActionableError("iOS tunnel is not running, please see https://github.com/mobile-next/mobile-mcp/wiki/");
}
}
}
private async wda(): Promise<WebDriverAgent> {
await this.assertTunnelRunning();
if (!(await this.isWdaForwardRunning())) {
throw new ActionableError("Port forwarding to WebDriverAgent is not running (tunnel okay), please see https://github.com/mobile-next/mobile-mcp/wiki/");
}
const wda = new WebDriverAgent("localhost", WDA_PORT);
if (!(await wda.isRunning())) {
throw new ActionableError("WebDriverAgent is not running on device (tunnel okay, port forwarding okay), please see https://github.com/mobile-next/mobile-mcp/wiki/");
}
return wda;
}
private async ios(...args: string[]): Promise<string> {
return execFileSync(getGoIosPath(), ["--udid", this.deviceId, ...args], {}).toString();
}
public async getIosVersion(): Promise<string> {
const output = await this.ios("info");
const json = JSON.parse(output);
return json.ProductVersion;
}
private async isTunnelRequired(): Promise<boolean> {
const version = await this.getIosVersion();
const args = version.split(".");
return parseInt(args[0], 10) >= 17;
}
public async getScreenSize(): Promise<ScreenSize> {
const wda = await this.wda();
return await wda.getScreenSize();
}
public async swipe(direction: SwipeDirection): Promise<void> {
const wda = await this.wda();
await wda.swipe(direction);
}
public async swipeFromCoordinate(x: number, y: number, direction: SwipeDirection, distance?: number): Promise<void> {
const wda = await this.wda();
await wda.swipeFromCoordinate(x, y, direction, distance);
}
public async listApps(): Promise<InstalledApp[]> {
await this.assertTunnelRunning();
const output = await this.ios("apps", "--all", "--list");
return output
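.split("\n")
// skip blank lines (e.g. the trailing newline) so no empty entries are returned
.filter(line => line.trim().length > 0)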
.map(line => {
const [packageName, appName] = line.split(" ");
return {
packageName,
appName,
};
});
}
public async launchApp(packageName: string): Promise<void> {
await this.assertTunnelRunning();
await this.ios("launch", packageName);
}
public async terminateApp(packageName: string): Promise<void> {
await this.assertTunnelRunning();
await this.ios("kill", packageName);
}
public async installApp(path: string): Promise<void> {
await this.assertTunnelRunning();
try {
await this.ios("install", "--path", path);
} catch (error: any) {
const stdout = error.stdout ? error.stdout.toString() : "";
const stderr = error.stderr ? error.stderr.toString() : "";
const output = (stdout + stderr).trim();
throw new ActionableError(output || error.message);
}
}
public async uninstallApp(bundleId: string): Promise<void> {
await this.assertTunnelRunning();
try {
await this.ios("uninstall", "--bundleid", bundleId);
} catch (error: any) {
const stdout = error.stdout ? error.stdout.toString() : "";
const stderr = error.stderr ? error.stderr.toString() : "";
const output = (stdout + stderr).trim();
throw new ActionableError(output || error.message);
}
}
public async openUrl(url: string): Promise<void> {
const wda = await this.wda();
await wda.openUrl(url);
}
public async sendKeys(text: string): Promise<void> {
const wda = await this.wda();
await wda.sendKeys(text);
}
public async pressButton(button: Button): Promise<void> {
const wda = await this.wda();
await wda.pressButton(button);
}
public async tap(x: number, y: number): Promise<void> {
const wda = await this.wda();
await wda.tap(x, y);
}
public async doubleTap(x: number, y: number): Promise<void> {
const wda = await this.wda();
await wda.doubleTap(x, y);
}
public async longPress(x: number, y: number): Promise<void> {
const wda = await this.wda();
await wda.longPress(x, y);
}
public async getElementsOnScreen(): Promise<ScreenElement[]> {
const wda = await this.wda();
return await wda.getElementsOnScreen();
}
public async getScreenshot(): Promise<Buffer> {
const wda = await this.wda();
return await wda.getScreenshot();
/* alternative:
await this.assertTunnelRunning();
const tmpFilename = path.join(tmpdir(), `screenshot-${randomBytes(8).toString("hex")}.png`);
await this.ios("screenshot", "--output", tmpFilename);
const buffer = readFileSync(tmpFilename);
unlinkSync(tmpFilename);
return buffer;
*/
}
public async setOrientation(orientation: Orientation): Promise<void> {
const wda = await this.wda();
await wda.setOrientation(orientation);
}
public async getOrientation(): Promise<Orientation> {
const wda = await this.wda();
return await wda.getOrientation();
}
}
export class IosManager {
public isGoIosInstalled(): boolean {
try {
const output = execFileSync(getGoIosPath(), ["version"], { stdio: ["pipe", "pipe", "ignore"] }).toString();
const json: VersionCommandOutput = JSON.parse(output);
return json.version !== undefined && (json.version.startsWith("v") || json.version === "local-build");
} catch (error) {
return false;
}
}
public getDeviceName(deviceId: string): string {
const output = execFileSync(getGoIosPath(), ["info", "--udid", deviceId]).toString();
const json: InfoCommandOutput = JSON.parse(output);
return json.DeviceName;
}
public listDevices(): IosDevice[] {
if (!this.isGoIosInstalled()) {
console.error("go-ios is not installed, no physical iOS devices can be detected");
return [];
}
const output = execFileSync(getGoIosPath(), ["list"]).toString();
const json: ListCommandOutput = JSON.parse(output);
const devices = json.deviceList.map(device => ({
deviceId: device,
deviceName: this.getDeviceName(device),
}));
return devices;
}
}
```
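The connection checks in `IosRobot` run in a fixed order: tunnel (only on iOS 17 and later), then port forwarding, then WebDriverAgent itself. A minimal sketch of that decision logic as pure functions follows; `isTunnelRequiredFor`, `ConnectionState`, and `firstFailingLayer` are hypothetical names for illustration and are not part of the repository.

```typescript
// Mirrors IosRobot.isTunnelRequired(): compares the major version against 17.
function isTunnelRequiredFor(productVersion: string): boolean {
	const major = parseInt(productVersion.split(".")[0], 10);
	return major >= 17;
}

// Hypothetical snapshot of the probes that wda() performs one after another.
interface ConnectionState {
	tunnelRequired: boolean;
	tunnelRunning: boolean;
	forwardRunning: boolean;
	wdaRunning: boolean;
}

// Returns the first layer that would make wda() throw, or null if all pass.
function firstFailingLayer(state: ConnectionState): "tunnel" | "port-forward" | "wda" | null {
	if (state.tunnelRequired && !state.tunnelRunning) {
		return "tunnel";
	}
	if (!state.forwardRunning) {
		return "port-forward";
	}
	if (!state.wdaRunning) {
		return "wda";
	}
	return null;
}
```

Checking in this order is what lets each error message in `wda()` state which earlier layers are already known to be fine.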
--------------------------------------------------------------------------------
/src/iphone-simulator.ts:
--------------------------------------------------------------------------------
```typescript
import { execFileSync } from "node:child_process";
import { mkdtempSync, readdirSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, basename, extname } from "node:path";
import { trace } from "./logger";
import { WebDriverAgent } from "./webdriver-agent";
import { ActionableError, Button, InstalledApp, Robot, ScreenElement, ScreenSize, SwipeDirection, Orientation } from "./robot";
export interface Simulator {
name: string;
uuid: string;
state: string;
}
interface ListDevicesResponse {
devices: {
[key: string]: Array<{
state: string;
name: string;
isAvailable: boolean;
udid: string;
}>,
},
}
interface AppInfo {
ApplicationType: string;
Bundle: string;
CFBundleDisplayName: string;
CFBundleExecutable: string;
CFBundleIdentifier: string;
CFBundleName: string;
CFBundleVersion: string;
DataContainer: string;
Path: string;
}
const TIMEOUT = 30000;
const WDA_PORT = 8100;
const MAX_BUFFER_SIZE = 1024 * 1024 * 4;
export class Simctl implements Robot {
constructor(private readonly simulatorUuid: string) {}
private async isWdaInstalled(): Promise<boolean> {
const apps = await this.listApps();
return apps.map(app => app.packageName).includes("com.facebook.WebDriverAgentRunner.xctrunner");
}
private async startWda(): Promise<void> {
if (!(await this.isWdaInstalled())) {
// wda is not even installed, won't attempt to start it
return;
}
trace("Starting WebDriverAgent");
const webdriverPackageName = "com.facebook.WebDriverAgentRunner.xctrunner";
this.simctl("launch", this.simulatorUuid, webdriverPackageName);
// now we wait for wda to have a successful status
const wda = new WebDriverAgent("localhost", WDA_PORT);
// wait up to 10 seconds for wda to start
const timeout = +new Date() + 10 * 1000;
while (+new Date() < timeout) {
// cross fingers and see if wda is already running
if (await wda.isRunning()) {
trace("WebDriverAgent is now running");
return;
}
// wait 100ms before trying again
await new Promise(resolve => setTimeout(resolve, 100));
}
trace("Could not start WebDriverAgent in time, giving up");
}
private async wda(): Promise<WebDriverAgent> {
const wda = new WebDriverAgent("localhost", WDA_PORT);
if (!(await wda.isRunning())) {
await this.startWda();
if (!(await wda.isRunning())) {
throw new ActionableError("WebDriverAgent is not running on simulator, please see https://github.com/mobile-next/mobile-mcp/wiki/");
}
// was successfully started
}
return wda;
}
private simctl(...args: string[]): Buffer {
return execFileSync("xcrun", ["simctl", ...args], {
timeout: TIMEOUT,
maxBuffer: MAX_BUFFER_SIZE,
});
}
public async getScreenshot(): Promise<Buffer> {
const wda = await this.wda();
return await wda.getScreenshot();
// alternative: return this.simctl("io", this.simulatorUuid, "screenshot", "-");
}
public async openUrl(url: string) {
const wda = await this.wda();
await wda.openUrl(url);
// alternative: this.simctl("openurl", this.simulatorUuid, url);
}
public async launchApp(packageName: string) {
this.simctl("launch", this.simulatorUuid, packageName);
}
public async terminateApp(packageName: string) {
this.simctl("terminate", this.simulatorUuid, packageName);
}
private findAppBundle(dir: string): string | null {
const entries = readdirSync(dir, { withFileTypes: true });
for (const entry of entries) {
if (entry.isDirectory() && entry.name.endsWith(".app")) {
return join(dir, entry.name);
}
}
return null;
}
private validateZipPaths(zipPath: string): void {
const output = execFileSync("/usr/bin/zipinfo", ["-1", zipPath], {
timeout: TIMEOUT,
maxBuffer: MAX_BUFFER_SIZE,
}).toString();
const invalidPath = output
.split("\n")
.map(s => s.trim())
.filter(s => s)
.find(s => s.startsWith("/") || s.includes(".."));
if (invalidPath) {
throw new ActionableError(`Security violation: File path '${invalidPath}' contains invalid characters`);
}
}
public async installApp(path: string): Promise<void> {
let tempDir: string | null = null;
let installPath = path;
try {
// zip files need to be extracted prior to installation
if (extname(path).toLowerCase() === ".zip") {
trace(`Detected .zip file, validating contents`);
// before extracting, let's make sure there's no zip-slip bombs here
this.validateZipPaths(path);
tempDir = mkdtempSync(join(tmpdir(), "ios-app-"));
try {
execFileSync("unzip", ["-q", path, "-d", tempDir], {
timeout: TIMEOUT,
});
} catch (error: any) {
throw new ActionableError(`Failed to unzip file: ${error.message}`);
}
const appBundle = this.findAppBundle(tempDir);
if (!appBundle) {
throw new ActionableError("No .app bundle found in the .zip file, please visit wiki at https://github.com/mobile-next/mobile-mcp/wiki for assistance.");
}
installPath = appBundle;
trace(`Found .app bundle at: ${basename(appBundle)}`);
}
// continue with installation
this.simctl("install", this.simulatorUuid, installPath);
} catch (error: any) {
const stdout = error.stdout ? error.stdout.toString() : "";
const stderr = error.stderr ? error.stderr.toString() : "";
const output = (stdout + stderr).trim();
throw new ActionableError(output || error.message);
} finally {
// Clean up temporary directory if it was created
if (tempDir) {
try {
trace(`Cleaning up temporary directory`);
rmSync(tempDir, { recursive: true, force: true });
} catch (cleanupError) {
trace(`Warning: Failed to cleanup temporary directory: ${cleanupError}`);
}
}
}
}
public async uninstallApp(bundleId: string): Promise<void> {
try {
this.simctl("uninstall", this.simulatorUuid, bundleId);
} catch (error: any) {
const stdout = error.stdout ? error.stdout.toString() : "";
const stderr = error.stderr ? error.stderr.toString() : "";
const output = (stdout + stderr).trim();
throw new ActionableError(output || error.message);
}
}
public async listApps(): Promise<InstalledApp[]> {
const text = this.simctl("listapps", this.simulatorUuid).toString();
const result = execFileSync("plutil", ["-convert", "json", "-o", "-", "-r", "-"], {
input: text,
});
const output = JSON.parse(result.toString()) as Record<string, AppInfo>;
return Object.values(output).map(app => ({
packageName: app.CFBundleIdentifier,
appName: app.CFBundleDisplayName,
}));
}
public async getScreenSize(): Promise<ScreenSize> {
const wda = await this.wda();
return wda.getScreenSize();
}
public async sendKeys(keys: string) {
const wda = await this.wda();
return wda.sendKeys(keys);
}
public async swipe(direction: SwipeDirection): Promise<void> {
const wda = await this.wda();
return wda.swipe(direction);
}
public async swipeFromCoordinate(x: number, y: number, direction: SwipeDirection, distance?: number): Promise<void> {
const wda = await this.wda();
return wda.swipeFromCoordinate(x, y, direction, distance);
}
public async tap(x: number, y: number) {
const wda = await this.wda();
return wda.tap(x, y);
}
public async doubleTap(x: number, y: number): Promise<void> {
const wda = await this.wda();
await wda.doubleTap(x, y);
}
public async longPress(x: number, y: number) {
const wda = await this.wda();
return wda.longPress(x, y);
}
public async pressButton(button: Button) {
const wda = await this.wda();
return wda.pressButton(button);
}
public async getElementsOnScreen(): Promise<ScreenElement[]> {
const wda = await this.wda();
return wda.getElementsOnScreen();
}
public async setOrientation(orientation: Orientation): Promise<void> {
const wda = await this.wda();
return wda.setOrientation(orientation);
}
public async getOrientation(): Promise<Orientation> {
const wda = await this.wda();
return wda.getOrientation();
}
}
export class SimctlManager {
public listSimulators(): Simulator[] {
// detect if this is a mac
if (process.platform !== "darwin") {
// don't even try to run xcrun
return [];
}
try {
const text = execFileSync("xcrun", ["simctl", "list", "devices", "-j"]).toString();
const json: ListDevicesResponse = JSON.parse(text);
return Object.values(json.devices).flatMap(device => {
return device.map(d => {
return {
name: d.name,
uuid: d.udid,
state: d.state,
};
});
});
} catch (error) {
console.error("Error listing simulators", error);
return [];
}
}
public listBootedSimulators(): Simulator[] {
return this.listSimulators()
.filter(simulator => simulator.state === "Booted");
}
public getSimulator(uuid: string): Simctl {
return new Simctl(uuid);
}
}
```
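The zip-slip guard in `validateZipPaths` boils down to a string check over the entry names that `zipinfo -1` prints, one per line. The sketch below isolates that check as a pure function; `findUnsafeZipEntry` is a hypothetical name, and the listing is passed in as a string rather than read from a real archive.

```typescript
// Returns the first entry that is absolute or contains "..", or undefined
// when every entry looks safe. Same predicate as validateZipPaths().
function findUnsafeZipEntry(listing: string): string | undefined {
	return listing
		.split("\n")
		.map(entry => entry.trim())
		.filter(entry => entry.length > 0)
		.find(entry => entry.startsWith("/") || entry.includes(".."));
}

const safeListing = "Payload/\nPayload/Demo.app/\nPayload/Demo.app/Info.plist\n";
const unsafeListing = "Payload/Demo.app/Info.plist\n../../etc/passwd\n";

console.log(findUnsafeZipEntry(safeListing));
console.log(findUnsafeZipEntry(unsafeListing));
```

The check is deliberately conservative: a harmless name that merely contains two consecutive dots is rejected as well, which trades a few false positives for a simpler rule.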
--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
```markdown
## [0.0.34](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.34) (2025-11-01)
* Server: dry-run release for benchmarking how mobilecli detects devices ([#226](https://github.com/mobile-next/mobile-mcp/pull/226))
## [0.0.33](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.33) (2025-10-20)
* Server: added debug information for understanding screenshot issues on old devices ([#213](https://github.com/mobile-next/mobile-mcp/pull/213))
## [0.0.32](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.32) (2025-10-08)
* Server: fixed wrong separator when listing iOS simulators ([#208](https://github.com/mobile-next/mobile-mcp/pull/208))
* iOS: double tap at screen location ([#207](https://github.com/mobile-next/mobile-mcp/pull/207))
* Android: reduce stdout pollution by adb shell monkey ([#211](https://github.com/mobile-next/mobile-mcp/pull/211))
* Android: fix mobile_take_screenshot on very old android devices ([#204](https://github.com/mobile-next/mobile-mcp/pull/204)) by [@boulaycote](https://github.com/boulaycote)
* Android: double tap at screen location ([#194](https://github.com/mobile-next/mobile-mcp/pull/194)) by [@SakshamSahgal](https://github.com/SakshamSahgal)
## [0.0.31](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.31) (2025-10-07)
* Server: resolve mobilecli libc issues on very old linux distros ([#206](https://github.com/mobile-next/mobile-mcp/pull/206))
* Server: identify mcp-client for compatibility patches ([#205](https://github.com/mobile-next/mobile-mcp/pull/205))
## [0.0.30](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.30) (2025-10-06)
* Server: introduction of mobilecli tool, will replace imagemagick, sips, go-ios and adb in the future ([#196](https://github.com/mobile-next/mobile-mcp/pull/196))
* iOS: app installation and uninstallation ([#202](https://github.com/mobile-next/mobile-mcp/pull/202))
* Android: app installation and uninstallation ([#202](https://github.com/mobile-next/mobile-mcp/pull/202))
## [0.0.29](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.29) (2025-09-26)
* Server: bumped mcp sdk to latest version ([#199](https://github.com/mobile-next/mobile-mcp/pull/199))
* Server: locked production npm packages to specific version ([#199](https://github.com/mobile-next/mobile-mcp/pull/199))
* Server: renamed tool 'swipe_on_screen' to 'mobile_swipe_on_screen' ([#197](https://github.com/mobile-next/mobile-mcp/pull/197))
## [0.0.28](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.28) (2025-09-15)
* Server: added 'device' parameter to all tools ([#181](https://github.com/mobile-next/mobile-mcp/pull/181))
* Server: enable agents to access multiple devices at once (eg, 'explain what's on screen on all devices connected') ([#181](https://github.com/mobile-next/mobile-mcp/pull/181))
## [0.0.27](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.27) (2025-09-10)
* Server: use 'sips' image scaling on mac if found, removes requirement to install ImageMagick for image scaling ([#188](https://github.com/mobile-next/mobile-mcp/pull/188))
## [0.0.26](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.26) (2025-09-09)
* Server: support listing of mobile-mcp in github's mcp registry ([e96404e](https://github.com/mobile-next/mobile-mcp/commit/e96404e0e513e48ebcfe7956800203cc0f363526))
## [0.0.25](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.25) (2025-09-08)
* Server: install mobile-mcp in vscode with a single-click in README ([#173](https://github.com/mobile-next/mobile-mcp/pull/173))
* Android: try finding 'adb' under $HOME/Library/Android if $ANDROID_HOME is not defined ([#183](https://github.com/mobile-next/mobile-mcp/pull/183))
* Android: better escaping of text input, for improved security ([#182](https://github.com/mobile-next/mobile-mcp/pull/183))
## [0.0.24](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.24) (2025-08-24)
* iOS: new tool for long press ([#143](https://github.com/mobile-next/mobile-mcp/pull/143))
* Android: new tool for long press ([#143](https://github.com/mobile-next/mobile-mcp/pull/143))
* Android: fixed screenshot from devices with multiple displays (foldables) again ([#171](https://github.com/mobile-next/mobile-mcp/pull/171))
## [0.0.23](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.23) (2025-07-31)
* Android: fixed a bug where devices with multiple screens (such as foldables) failed to take and save screenshot ([#159](https://github.com/mobile-next/mobile-mcp/pull/159))
## [0.0.22](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.22) (2025-07-17)
* iOS: fixed detection of go-ios installation ([#132](https://github.com/mobile-next/mobile-mcp/pull/132)) by [@codeaholicguy](https://github.com/codeaholicguy)
## [0.0.21](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.21) (2025-06-27)
* Server: use node: prefixed modules (like node:fs) ([449c498](https://github.com/mobile-next/mobile-mcp/commit/449c498e6e9a3e68aab55ea82f15c296171fc05e))
* iOS: automatically start WebDriverAgent on simulator if already installed ([#126](https://github.com/mobile-next/mobile-mcp/pull/126))
* Android: fixed detection of com.mobilenext.devicekit when running mcp on windows ([c11c642](https://github.com/mobile-next/mobile-mcp/commit/c11c6427c71cb7cef6ce87005047df977f6bea8a))
## [0.0.20](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.20) (2025-06-23)
* Server: new tool `save_screenshot` which saves the screenshot to disk, to be used by other mcp servers ([#112](https://github.com/mobile-next/mobile-mcp/pull/112))
* Server: new tool `use_default_device` which picks the only device that is connected, to speed up use ([#112](https://github.com/mobile-next/mobile-mcp/pull/112))
* iOS: Use wda to grab screenshots for both real devices and simulators ([#115](https://github.com/mobile-next/mobile-mcp/pull/115))
* Android: Support for utf-8 text in sendKeys, see [wiki page]() for getting started ([#117](https://github.com/mobile-next/mobile-mcp/pull/117))
## [0.0.19](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.19) (2025-06-16)
* Server: Fixed support for Windsurf, where some tools caused a -32602 error ([#101](https://github.com/mobile-next/mobile-mcp/pull/101)) by [@amebahead](https://github.com/amebahead)
* iOS: Support for swipe left and right. Support x,y,direction,duration for custom swipes ([#92](https://github.com/mobile-next/mobile-mcp/pull/92/)) by [@benlmyers](https://github.com/benlmyers)
* Android: Support for swipe left and right. Support x,y,direction,duration for custom swipes ([#92](https://github.com/mobile-next/mobile-mcp/pull/92/)) by [@benlmyers](https://github.com/benlmyers)
* Android: Fix for get elements on screen, where uiautomator prints out warnings before the actual xml ([#86](https://github.com/mobile-next/mobile-mcp/pull/86)) by [@wenerme](https://github.com/wenerme)
## [0.0.18](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.18) (2025-06-12)
* Server: New support for SSE (Server-Sent-Events) transport, [see wiki for more information](https://github.com/mobile-next/mobile-mcp/wiki/Using-SSE-Transport) ([1b70d40](https://github.com/mobile-next/mobile-mcp/commit/1b70d403cd562a97a0723464f2b286f2fd6eee0a))
* iOS: Using plutil for `simctl listapps` parsing, which may fix some parsing issues ([cfba3aa](https://github.com/mobile-next/mobile-mcp/commit/cfba3aaac5beb66d08d1138fe42c924309ede303))
* Other: We have a new Slack server, join us at http://mobilenexthq.com/join-slack
## [0.0.17](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.17) (2025-05-16)
* iOS: Fixed parsing of simctl listapps where CFBundleDisplayName contains non-alphanumerical characters ([#59](https://github.com/mobile-next/mobile-mcp/issues/59)) ([bf19771d](https://github.com/mobile-next/mobile-mcp/pull/63/commits/bf19771dcd49444ba4841ec649e3a72a03b54c74))
## [0.0.16](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.16) (2025-05-10)
* Server: Detect if there is a new version of the mcp and notify user ([14b015f](https://github.com/mobile-next/mobile-mcp/commit/14b015f29ab47aa1f3ae122a670a58eb7ef51fd8))
* Server: Instead of returning x,y for tap, return [top,left,width,height] of elements on screen ([3169d2f](https://github.com/mobile-next/mobile-mcp/commit/3169d2f46f0c789e4c3188e137ac645d6f6eb27c))
* iOS: Fixed coordinates location for iOS with retina display after image scaledown ([3169d2f](https://github.com/mobile-next/mobile-mcp/commit/3169d2f46f0c789e4c3188e137ac645d6f6eb27c))
* iOS: Added detection of StaticText and Image in mobile_list_elements_on_screen ([debe75b](https://github.com/mobile-next/mobile-mcp/commit/debe75b5c8afcafcef8328201e9886bffdd1f128))
## [0.0.15](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.15) (2025-05-04)
* Android: Fixed broken Android screenshots on Windows because of crlf ([#53](https://github.com/mobile-next/mobile-mcp/pull/53/files) by [@hanyuan97](https://github.com/hanyuan97))
## [0.0.14](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.14) (2025-05-02)
* Server: Fix a bug where xcrun was required, now works on Linux as well ([7fddba7](https://github.com/mobile-next/mobile-mcp/commit/7fddba71af51690cfa76f81154f72c3120ab7f07))
* Server: Removed dependency on sharp which was causing issues during installation, now ImageMagick is an optional dependency
* Android: Try uiautomator-dump multiple times, in case ui hierarchy is not stable
* Android: Return more information about elements on screen for better element detection
* Android: Support for Android TV using dpad for navigation ([399443d](https://github.com/mobile-next/mobile-mcp/commit/399443d519284a54b670a1598689a73d178db2ec) by [@surajsau](https://github.com/surajsau))
## [0.0.13](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.13) (2025-04-17)
* Server: Fix a bug where 'adb' is required to even work with iOS-only ([#30](https://github.com/mobile-next/mobile-mcp/issues/30)) ([867f662](https://github.com/mobile-next/mobile-mcp/pull/35/commits/867f662ac2edc68d542519bd72d1762d3dbca18d))
* iOS: Support for orientation changes ([844dc0e](https://github.com/mobile-next/mobile-mcp/pull/28/commits/844dc0eb953169871b4cdd2a57735bf50abe721a))
* Android: Support for orientation changes (eg 'change device to landscape') ([844dc0e](https://github.com/mobile-next/mobile-mcp/pull/28/commits/844dc0eb953169871b4cdd2a57735bf50abe721a))
* Android: Improve element detection by using element name if label not found ([8e8aadf](https://github.com/mobile-next/mobile-mcp/pull/33/commits/8e8aadfd7f300ff5b7f0a7857a99d1103cd9e941) by [@tomoya0x00](https://github.com/tomoya0x00))
## [0.0.12](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.12) (2025-04-12)
* Server: When an error occurs with the tunnel, forward proxy, or WDA, a descriptive error and a link to the documentation are returned
* iOS: go-ios path can be set in env GO_IOS_PATH
* iOS: Support go-ios that was built locally (no version)
* iOS: Return bundle display name for apps for better app launch
* iOS: Fixed finding element coordinates on retina displays
* iOS: Saving temporary screenshots onto temporary directory ([#19](https://github.com/mobile-next/mobile-mcp/issues/19))
* iOS: Find elements better by removing off-screen and hidden elements
* Android: Support for 'adb' under ANDROID_HOME
* Android: Find elements better using accessibility hints and class names
## [0.0.11](https://github.com/mobile-next/mobile-mcp/releases/tag/0.0.11) (2025-04-06)
* Server: Support submit after sending text (\n)
* Server: Added support for multiple devices at the same time
* iOS: Support for iOS physical devices using go-ios ([see wiki](https://github.com/mobile-next/mobile-mcp/wiki/Getting-Started-with-iOS-Physical-Device))
* iOS: Added support for icons, search fields, and switches when getting elements on screen
```
--------------------------------------------------------------------------------
/src/webdriver-agent.ts:
--------------------------------------------------------------------------------
```typescript
import { ActionableError, SwipeDirection, ScreenSize, ScreenElement, Orientation } from "./robot";
export interface SourceTreeElementRect {
x: number;
y: number;
width: number;
height: number;
}
export interface SourceTreeElement {
type: string;
label?: string;
name?: string;
value?: string;
rawIdentifier?: string;
rect: SourceTreeElementRect;
isVisible?: string; // "0" or "1"
children?: Array<SourceTreeElement>;
}
export interface SourceTree {
value: SourceTreeElement;
}
export class WebDriverAgent {
constructor(private readonly host: string, private readonly port: number) {
}
public async isRunning(): Promise<boolean> {
const url = `http://${this.host}:${this.port}/status`;
try {
const response = await fetch(url);
const json = await response.json();
return response.status === 200 && json.value?.ready === true;
} catch (error) {
// console.error(`Failed to connect to WebDriverAgent: ${error}`);
return false;
}
}
public async createSession(): Promise<string> {
const url = `http://${this.host}:${this.port}/session`;
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({ capabilities: { alwaysMatch: { platformName: "iOS" } } }),
});
if (!response.ok) {
const errorText = await response.text();
throw new ActionableError(`Failed to create WebDriver session: ${response.status} ${errorText}`);
}
const json = await response.json();
if (!json.value || !json.value.sessionId) {
throw new ActionableError(`Invalid session response: ${JSON.stringify(json)}`);
}
return json.value.sessionId;
}
public async deleteSession(sessionId: string) {
const url = `http://${this.host}:${this.port}/session/${sessionId}`;
const response = await fetch(url, { method: "DELETE" });
return response.json();
}
public async withinSession(fn: (url: string) => Promise<any>) {
const sessionId = await this.createSession();
const url = `http://${this.host}:${this.port}/session/${sessionId}`;
try {
return await fn(url);
} finally {
// always clean up the session, even if fn throws
await this.deleteSession(sessionId);
}
}
public async getScreenSize(sessionUrl?: string): Promise<ScreenSize> {
const readScreenSize = async (baseUrl: string): Promise<ScreenSize> => {
const url = `${baseUrl}/wda/screen`;
const response = await fetch(url);
const json = await response.json();
return {
width: json.value.screenSize.width,
height: json.value.screenSize.height,
scale: json.value.scale || 1,
};
};
// reuse the caller's session if one was provided, otherwise open a temporary one
return sessionUrl ? readScreenSize(sessionUrl) : this.withinSession(readScreenSize);
}
public async sendKeys(keys: string) {
await this.withinSession(async sessionUrl => {
const url = `${sessionUrl}/wda/keys`;
await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({ value: [keys] }),
});
});
}
public async pressButton(button: string) {
const _map = {
"HOME": "home",
"VOLUME_UP": "volumeup",
"VOLUME_DOWN": "volumedown",
};
if (button === "ENTER") {
await this.sendKeys("\n");
return;
}
// Runtime check: reject buttons that have no entry in _map
if (!(button in _map)) {
throw new ActionableError(`Button "${button}" is not supported`);
}
await this.withinSession(async sessionUrl => {
const url = `${sessionUrl}/wda/pressButton`;
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
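// send the WebDriverAgent button name rather than the MCP-level constant
name: _map[button as keyof typeof _map],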
}),
});
return response.json();
});
}
public async tap(x: number, y: number) {
await this.withinSession(async sessionUrl => {
const url = `${sessionUrl}/actions`;
await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
actions: [
{
type: "pointer",
id: "finger1",
parameters: { pointerType: "touch" },
actions: [
{ type: "pointerMove", duration: 0, x, y },
{ type: "pointerDown", button: 0 },
{ type: "pause", duration: 100 },
{ type: "pointerUp", button: 0 }
]
}
]
}),
});
});
}
public async doubleTap(x: number, y: number) {
await this.withinSession(async sessionUrl => {
const url = `${sessionUrl}/actions`;
await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
actions: [
{
type: "pointer",
id: "finger1",
parameters: { pointerType: "touch" },
actions: [
{ type: "pointerMove", duration: 0, x, y },
{ type: "pointerDown", button: 0 },
{ type: "pause", duration: 50 },
{ type: "pointerUp", button: 0 },
{ type: "pause", duration: 100 },
{ type: "pointerDown", button: 0 },
{ type: "pause", duration: 50 },
{ type: "pointerUp", button: 0 }
]
}
]
}),
});
});
}
public async longPress(x: number, y: number) {
await this.withinSession(async sessionUrl => {
const url = `${sessionUrl}/actions`;
await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
actions: [
{
type: "pointer",
id: "finger1",
parameters: { pointerType: "touch" },
actions: [
{ type: "pointerMove", duration: 0, x, y },
{ type: "pointerDown", button: 0 },
{ type: "pause", duration: 500 },
{ type: "pointerUp", button: 0 }
]
}
]
}),
});
});
}
private isVisible(rect: SourceTreeElementRect): boolean {
return rect.x >= 0 && rect.y >= 0;
}
private filterSourceElements(source: SourceTreeElement): Array<ScreenElement> {
const output: ScreenElement[] = [];
const acceptedTypes = ["TextField", "Button", "Switch", "Icon", "SearchField", "StaticText", "Image"];
if (acceptedTypes.includes(source.type)) {
if (source.isVisible === "1" && this.isVisible(source.rect)) {
if (source.label !== null || source.name !== null || source.rawIdentifier !== null) {
output.push({
type: source.type,
label: source.label,
name: source.name,
value: source.value,
identifier: source.rawIdentifier,
rect: {
x: source.rect.x,
y: source.rect.y,
width: source.rect.width,
height: source.rect.height,
},
});
}
}
}
if (source.children) {
for (const child of source.children) {
output.push(...this.filterSourceElements(child));
}
}
return output;
}
public async getPageSource(): Promise<SourceTree> {
const url = `http://${this.host}:${this.port}/source/?format=json`;
const response = await fetch(url);
const json = await response.json();
return json as SourceTree;
}
public async getElementsOnScreen(): Promise<ScreenElement[]> {
const source = await this.getPageSource();
return this.filterSourceElements(source.value);
}
public async openUrl(url: string): Promise<void> {
await this.withinSession(async sessionUrl => {
await fetch(`${sessionUrl}/url`, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({ url }),
});
});
}
public async getScreenshot(): Promise<Buffer> {
const url = `http://${this.host}:${this.port}/screenshot`;
const response = await fetch(url);
const json = await response.json();
return Buffer.from(json.value, "base64");
}
public async swipe(direction: SwipeDirection): Promise<void> {
await this.withinSession(async sessionUrl => {
const screenSize = await this.getScreenSize(sessionUrl);
let x0: number, y0: number, x1: number, y1: number;
// Use 60% of the width/height for swipe distance
const verticalDistance = Math.floor(screenSize.height * 0.6);
const horizontalDistance = Math.floor(screenSize.width * 0.6);
const centerX = Math.floor(screenSize.width / 2);
const centerY = Math.floor(screenSize.height / 2);
switch (direction) {
case "up":
x0 = x1 = centerX;
y0 = centerY + Math.floor(verticalDistance / 2);
y1 = centerY - Math.floor(verticalDistance / 2);
break;
case "down":
x0 = x1 = centerX;
y0 = centerY - Math.floor(verticalDistance / 2);
y1 = centerY + Math.floor(verticalDistance / 2);
break;
case "left":
y0 = y1 = centerY;
x0 = centerX + Math.floor(horizontalDistance / 2);
x1 = centerX - Math.floor(horizontalDistance / 2);
break;
case "right":
y0 = y1 = centerY;
x0 = centerX - Math.floor(horizontalDistance / 2);
x1 = centerX + Math.floor(horizontalDistance / 2);
break;
default:
throw new ActionableError(`Swipe direction "${direction}" is not supported`);
}
const url = `${sessionUrl}/actions`;
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
actions: [
{
type: "pointer",
id: "finger1",
parameters: { pointerType: "touch" },
actions: [
{ type: "pointerMove", duration: 0, x: x0, y: y0 },
{ type: "pointerDown", button: 0 },
{ type: "pointerMove", duration: 1000, x: x1, y: y1 },
{ type: "pointerUp", button: 0 }
]
}
]
}),
});
if (!response.ok) {
const errorText = await response.text();
throw new ActionableError(`WebDriver actions request failed: ${response.status} ${errorText}`);
}
// Clear actions to ensure they complete
await fetch(`${sessionUrl}/actions`, {
method: "DELETE",
});
});
}
public async swipeFromCoordinate(x: number, y: number, direction: SwipeDirection, distance: number = 400): Promise<void> {
await this.withinSession(async sessionUrl => {
// Use simple coordinates like the working swipe method
const x0 = x;
const y0 = y;
let x1 = x;
let y1 = y;
// Calculate target position based on direction and distance
switch (direction) {
case "up":
y1 = y - distance; // Move up by specified distance
break;
case "down":
y1 = y + distance; // Move down by specified distance
break;
case "left":
x1 = x - distance; // Move left by specified distance
break;
case "right":
x1 = x + distance; // Move right by specified distance
break;
default:
throw new ActionableError(`Swipe direction "${direction}" is not supported`);
}
const url = `${sessionUrl}/actions`;
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
actions: [
{
type: "pointer",
id: "finger1",
parameters: { pointerType: "touch" },
actions: [
{ type: "pointerMove", duration: 0, x: x0, y: y0 },
{ type: "pointerDown", button: 0 },
{ type: "pointerMove", duration: 1000, x: x1, y: y1 },
{ type: "pointerUp", button: 0 }
]
}
]
}),
});
if (!response.ok) {
const errorText = await response.text();
throw new ActionableError(`WebDriver actions request failed: ${response.status} ${errorText}`);
}
// Clear actions to ensure they complete
await fetch(`${sessionUrl}/actions`, {
method: "DELETE",
});
});
}
public async setOrientation(orientation: Orientation): Promise<void> {
await this.withinSession(async sessionUrl => {
const url = `${sessionUrl}/orientation`;
await fetch(url, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
orientation: orientation.toUpperCase()
})
});
});
}
public async getOrientation(): Promise<Orientation> {
return this.withinSession(async sessionUrl => {
const url = `${sessionUrl}/orientation`;
const response = await fetch(url);
const json = await response.json();
return json.value.toLowerCase() as Orientation;
});
}
}
```
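The `tap`, `doubleTap`, `longPress`, and swipe methods above all post the same single-finger pointer envelope to `${sessionUrl}/actions` and differ only in the inner action list. Below is a sketch of how that envelope could be assembled in one place; `buildTouchActions`, `pressActions`, and `swipeActions` are hypothetical names that do not exist in the repository, and the coordinates are made up.

```typescript
// Hypothetical helpers, not part of the repository.
type PointerAction =
	| { type: "pointerMove"; duration: number; x: number; y: number }
	| { type: "pointerDown"; button: number }
	| { type: "pointerUp"; button: number }
	| { type: "pause"; duration: number };

// Wrap a list of pointer actions in the single-finger touch envelope
// used by the request bodies above.
function buildTouchActions(actions: PointerAction[]) {
	return {
		actions: [
			{
				type: "pointer",
				id: "finger1",
				parameters: { pointerType: "touch" },
				actions,
			},
		],
	};
}

// Press at (x, y) and hold for holdMs before releasing.
// 100 ms corresponds to tap() above, 500 ms to longPress().
function pressActions(x: number, y: number, holdMs: number): PointerAction[] {
	return [
		{ type: "pointerMove", duration: 0, x, y },
		{ type: "pointerDown", button: 0 },
		{ type: "pause", duration: holdMs },
		{ type: "pointerUp", button: 0 },
	];
}

// Drag from (x0, y0) to (x1, y1) over durationMs, as swipe() does.
function swipeActions(x0: number, y0: number, x1: number, y1: number, durationMs: number = 1000): PointerAction[] {
	return [
		{ type: "pointerMove", duration: 0, x: x0, y: y0 },
		{ type: "pointerDown", button: 0 },
		{ type: "pointerMove", duration: durationMs, x: x1, y: y1 },
		{ type: "pointerUp", button: 0 },
	];
}

// Request body for a tap at made-up coordinates.
const tapBody = JSON.stringify(buildTouchActions(pressActions(120, 340, 100)));
console.log(tapBody);
```

With helpers like these, each gesture method would reduce to one `fetch` call whose body is `JSON.stringify(buildTouchActions(...))`, which keeps the pause durations as the only per-gesture difference.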
--------------------------------------------------------------------------------
/src/android.ts:
--------------------------------------------------------------------------------
```typescript
import path from "node:path";
import { execFileSync } from "node:child_process";
import { existsSync } from "node:fs";
import * as xml from "fast-xml-parser";
import { ActionableError, Button, InstalledApp, Robot, ScreenElement, ScreenElementRect, ScreenSize, SwipeDirection, Orientation } from "./robot";
export interface AndroidDevice {
deviceId: string;
deviceType: "tv" | "mobile";
}
interface UiAutomatorXmlNode {
node: UiAutomatorXmlNode[];
class?: string;
text?: string;
bounds?: string;
hint?: string;
focused?: string;
"content-desc"?: string;
"resource-id"?: string;
}
interface UiAutomatorXml {
hierarchy: {
node: UiAutomatorXmlNode;
};
}
const getAdbPath = (): string => {
if (process.env.ANDROID_HOME) {
return path.join(process.env.ANDROID_HOME, "platform-tools", "adb");
}
if (process.platform === "win32" && process.env.LOCALAPPDATA) {
const windowsAdbPath = path.join(process.env.LOCALAPPDATA, "Android", "Sdk", "platform-tools", "adb.exe");
if (existsSync(windowsAdbPath)) {
return windowsAdbPath;
}
}
if (process.env.HOME) {
const defaultAndroidSdk = path.join(process.env.HOME, "Library", "Android", "sdk", "platform-tools", "adb");
if (existsSync(defaultAndroidSdk)) {
return defaultAndroidSdk;
}
}
// fallthrough, hope for the best
return "adb";
};
const BUTTON_MAP: Record<Button, string> = {
"BACK": "KEYCODE_BACK",
"HOME": "KEYCODE_HOME",
"VOLUME_UP": "KEYCODE_VOLUME_UP",
"VOLUME_DOWN": "KEYCODE_VOLUME_DOWN",
"ENTER": "KEYCODE_ENTER",
"DPAD_CENTER": "KEYCODE_DPAD_CENTER",
"DPAD_UP": "KEYCODE_DPAD_UP",
"DPAD_DOWN": "KEYCODE_DPAD_DOWN",
"DPAD_LEFT": "KEYCODE_DPAD_LEFT",
"DPAD_RIGHT": "KEYCODE_DPAD_RIGHT",
};
const TIMEOUT = 30000;
const MAX_BUFFER_SIZE = 1024 * 1024 * 4;
type AndroidDeviceType = "tv" | "mobile";
export class AndroidRobot implements Robot {
public constructor(private deviceId: string) {
}
public adb(...args: string[]): Buffer {
return execFileSync(getAdbPath(), ["-s", this.deviceId, ...args], {
maxBuffer: MAX_BUFFER_SIZE,
timeout: TIMEOUT,
});
}
public silentAdb(...args: string[]): Buffer {
return execFileSync(getAdbPath(), ["-s", this.deviceId, ...args], {
maxBuffer: MAX_BUFFER_SIZE,
timeout: TIMEOUT,
stdio: ["pipe", "pipe", "pipe"],
});
}
public getSystemFeatures(): string[] {
return this.adb("shell", "pm", "list", "features")
.toString()
.split("\n")
.map(line => line.trim())
.filter(line => line.startsWith("feature:"))
.map(line => line.substring("feature:".length));
}
public async getScreenSize(): Promise<ScreenSize> {
const screenSize = this.adb("shell", "wm", "size")
.toString()
.split(" ")
.pop();
if (!screenSize) {
throw new Error("Failed to get screen size");
}
const scale = 1;
const [width, height] = screenSize.split("x").map(Number);
return { width, height, scale };
}
public async listApps(): Promise<InstalledApp[]> {
// only apps that have a launcher activity are returned
return this.adb("shell", "cmd", "package", "query-activities", "-a", "android.intent.action.MAIN", "-c", "android.intent.category.LAUNCHER")
.toString()
.split("\n")
.map(line => line.trim())
.filter(line => line.startsWith("packageName="))
.map(line => line.substring("packageName=".length))
.filter((value, index, self) => self.indexOf(value) === index)
.map(packageName => ({
packageName,
appName: packageName,
}));
}
private async listPackages(): Promise<string[]> {
return this.adb("shell", "pm", "list", "packages")
.toString()
.split("\n")
.map(line => line.trim())
.filter(line => line.startsWith("package:"))
.map(line => line.substring("package:".length));
}
public async launchApp(packageName: string): Promise<void> {
try {
this.silentAdb("shell", "monkey", "-p", packageName, "-c", "android.intent.category.LAUNCHER", "1");
} catch (error) {
throw new ActionableError(`Failed launching app with package name "${packageName}", please make sure it exists`);
}
}
public async listRunningProcesses(): Promise<string[]> {
return this.adb("shell", "ps", "-e")
.toString()
.split("\n")
.map(line => line.trim())
.filter(line => line.startsWith("u")) // non-system processes
.map(line => line.split(/\s+/)[8]); // get process name
}
public async swipe(direction: SwipeDirection): Promise<void> {
const screenSize = await this.getScreenSize();
const centerX = screenSize.width >> 1;
let x0: number, y0: number, x1: number, y1: number;
switch (direction) {
case "up":
x0 = x1 = centerX;
y0 = Math.floor(screenSize.height * 0.80);
y1 = Math.floor(screenSize.height * 0.20);
break;
case "down":
x0 = x1 = centerX;
y0 = Math.floor(screenSize.height * 0.20);
y1 = Math.floor(screenSize.height * 0.80);
break;
case "left":
x0 = Math.floor(screenSize.width * 0.80);
x1 = Math.floor(screenSize.width * 0.20);
y0 = y1 = Math.floor(screenSize.height * 0.50);
break;
case "right":
x0 = Math.floor(screenSize.width * 0.20);
x1 = Math.floor(screenSize.width * 0.80);
y0 = y1 = Math.floor(screenSize.height * 0.50);
break;
default:
throw new ActionableError(`Swipe direction "${direction}" is not supported`);
}
this.adb("shell", "input", "swipe", `${x0}`, `${y0}`, `${x1}`, `${y1}`, "1000");
}
public async swipeFromCoordinate(x: number, y: number, direction: SwipeDirection, distance?: number): Promise<void> {
const screenSize = await this.getScreenSize();
let x0: number, y0: number, x1: number, y1: number;
// Use provided distance or default to 30% of screen dimension
const defaultDistanceY = Math.floor(screenSize.height * 0.3);
const defaultDistanceX = Math.floor(screenSize.width * 0.3);
const swipeDistanceY = distance || defaultDistanceY;
const swipeDistanceX = distance || defaultDistanceX;
switch (direction) {
case "up":
x0 = x1 = x;
y0 = y;
y1 = Math.max(0, y - swipeDistanceY);
break;
case "down":
x0 = x1 = x;
y0 = y;
y1 = Math.min(screenSize.height, y + swipeDistanceY);
break;
case "left":
x0 = x;
x1 = Math.max(0, x - swipeDistanceX);
y0 = y1 = y;
break;
case "right":
x0 = x;
x1 = Math.min(screenSize.width, x + swipeDistanceX);
y0 = y1 = y;
break;
default:
throw new ActionableError(`Swipe direction "${direction}" is not supported`);
}
this.adb("shell", "input", "swipe", `${x0}`, `${y0}`, `${x1}`, `${y1}`, "1000");
}
private getDisplayCount(): number {
return this.adb("shell", "dumpsys", "SurfaceFlinger", "--display-id")
.toString()
.split("\n")
.filter(s => s.startsWith("Display "))
.length;
}
private getFirstDisplayId(): string | null {
try {
// Try using cmd display get-displays (Android 11+)
const displays = this.adb("shell", "cmd", "display", "get-displays")
.toString()
.split("\n")
.filter(s => s.startsWith("Display id "))
// filter for state ON even though get-displays only returns turned on displays
.filter(s => s.indexOf(", state ON,") >= 0)
// another paranoia check
.filter(s => s.indexOf(", uniqueId ") >= 0);
if (displays.length > 0) {
const m = displays[0].match(/uniqueId \"([^\"]+)\"/);
if (m !== null) {
let displayId = m[1];
if (displayId.startsWith("local:")) {
displayId = displayId.substring("local:".length);
}
return displayId;
}
}
} catch (error) {
// cmd display get-displays not available on this device
}
// fallback: parse dumpsys display for display info (compatible with older Android versions)
try {
const dumpsys = this.adb("shell", "dumpsys", "display")
.toString();
// look for DisplayViewport entries with isActive=true and type=INTERNAL
const viewportMatch = dumpsys.match(/DisplayViewport\{type=INTERNAL[^}]*isActive=true[^}]*uniqueId='([^']+)'/);
if (viewportMatch) {
let uniqueId = viewportMatch[1];
if (uniqueId.startsWith("local:")) {
uniqueId = uniqueId.substring("local:".length);
}
return uniqueId;
}
// fallback: look for active display with state ON
const displayStateMatch = dumpsys.match(/Display Id=(\d+)[\s\S]*?Display State=ON/);
if (displayStateMatch) {
return displayStateMatch[1];
}
} catch (error) {
// dumpsys display also failed
}
return null;
}
public async getScreenshot(): Promise<Buffer> {
if (this.getDisplayCount() <= 1) {
// backward compatibility for android 10 and below, and for single display devices
return this.adb("exec-out", "screencap", "-p");
}
// find the first display that is turned on, and capture that one
const displayId = this.getFirstDisplayId();
if (displayId === null) {
// no idea why, but we have displayCount >= 2, yet we failed to parse
// let's go with screencap's defaults and hope for the best
return this.adb("exec-out", "screencap", "-p");
}
return this.adb("exec-out", "screencap", "-p", "-d", `${displayId}`);
}
private collectElements(node: UiAutomatorXmlNode): ScreenElement[] {
const elements: Array<ScreenElement> = [];
if (node.node) {
if (Array.isArray(node.node)) {
for (const childNode of node.node) {
elements.push(...this.collectElements(childNode));
}
} else {
elements.push(...this.collectElements(node.node));
}
}
if (node.text || node["content-desc"] || node.hint) {
const element: ScreenElement = {
type: node.class || "text",
text: node.text,
label: node["content-desc"] || node.hint || "",
rect: this.getScreenElementRect(node),
};
if (node.focused === "true") {
// only provide it if it's true, otherwise don't confuse llm
element.focused = true;
}
const resourceId = node["resource-id"];
if (resourceId !== null && resourceId !== "") {
element.identifier = resourceId;
}
if (element.rect.width > 0 && element.rect.height > 0) {
elements.push(element);
}
}
return elements;
}
public async getElementsOnScreen(): Promise<ScreenElement[]> {
const parsedXml = await this.getUiAutomatorXml();
const hierarchy = parsedXml.hierarchy;
const elements = this.collectElements(hierarchy.node);
return elements;
}
public async terminateApp(packageName: string): Promise<void> {
this.adb("shell", "am", "force-stop", packageName);
}
public async installApp(path: string): Promise<void> {
try {
this.adb("install", "-r", path);
} catch (error: any) {
const stdout = error.stdout ? error.stdout.toString() : "";
const stderr = error.stderr ? error.stderr.toString() : "";
const output = (stdout + stderr).trim();
throw new ActionableError(output || error.message);
}
}
public async uninstallApp(bundleId: string): Promise<void> {
try {
this.adb("uninstall", bundleId);
} catch (error: any) {
const stdout = error.stdout ? error.stdout.toString() : "";
const stderr = error.stderr ? error.stderr.toString() : "";
const output = (stdout + stderr).trim();
throw new ActionableError(output || error.message);
}
}
public async openUrl(url: string): Promise<void> {
this.adb("shell", "am", "start", "-a", "android.intent.action.VIEW", "-d", url);
}
private isAscii(text: string): boolean {
return /^[\x00-\x7F]*$/.test(text);
}
private escapeShellText(text: string): string {
// escape all shell special characters that could be used for injection
return text.replace(/[\\'"` \t\n\r|&;()<>{}[\]$*?]/g, "\\$&");
}
private async isDeviceKitInstalled(): Promise<boolean> {
const packages = await this.listPackages();
return packages.includes("com.mobilenext.devicekit");
}
public async sendKeys(text: string): Promise<void> {
if (text === "") {
// bailing early, so we don't run adb shell with empty string.
// this happens when you prompt with a simple "submit".
return;
}
if (this.isAscii(text)) {
// adb shell input only supports ASCII characters, and
// shell special characters have to be escaped first.
const _text = this.escapeShellText(text);
this.adb("shell", "input", "text", _text);
} else if (await this.isDeviceKitInstalled()) {
// try sending over clipboard
const base64 = Buffer.from(text).toString("base64");
// send clipboard over and immediately paste it
this.adb("shell", "am", "broadcast", "-a", "devicekit.clipboard.set", "-e", "encoding", "base64", "-e", "text", base64, "-n", "com.mobilenext.devicekit/.ClipboardBroadcastReceiver");
this.adb("shell", "input", "keyevent", "KEYCODE_PASTE");
// clear clipboard when we're done
this.adb("shell", "am", "broadcast", "-a", "devicekit.clipboard.clear", "-n", "com.mobilenext.devicekit/.ClipboardBroadcastReceiver");
} else {
throw new ActionableError("Non-ASCII text is not supported on Android, please install mobilenext devicekit, see https://github.com/mobile-next/devicekit-android");
}
}
public async pressButton(button: Button) {
if (!BUTTON_MAP[button]) {
throw new ActionableError(`Button "${button}" is not supported`);
}
const mapped = BUTTON_MAP[button];
this.adb("shell", "input", "keyevent", mapped);
}
public async tap(x: number, y: number): Promise<void> {
this.adb("shell", "input", "tap", `${x}`, `${y}`);
}
public async longPress(x: number, y: number): Promise<void> {
// a long press is a swipe with no movement and a long duration
this.adb("shell", "input", "swipe", `${x}`, `${y}`, `${x}`, `${y}`, "500");
}
public async doubleTap(x: number, y: number): Promise<void> {
await this.tap(x, y);
await new Promise(r => setTimeout(r, 100)); // short delay
await this.tap(x, y);
}
public async setOrientation(orientation: Orientation): Promise<void> {
const value = orientation === "portrait" ? 0 : 1;
// disable auto-rotation prior to setting the orientation
this.adb("shell", "settings", "put", "system", "accelerometer_rotation", "0");
this.adb("shell", "content", "insert", "--uri", "content://settings/system", "--bind", "name:s:user_rotation", "--bind", `value:i:${value}`);
}
public async getOrientation(): Promise<Orientation> {
const rotation = this.adb("shell", "settings", "get", "system", "user_rotation").toString().trim();
return rotation === "0" ? "portrait" : "landscape";
}
private async getUiAutomatorDump(): Promise<string> {
for (let tries = 0; tries < 10; tries++) {
const dump = this.adb("exec-out", "uiautomator", "dump", "/dev/tty").toString();
// note: we're not catching other errors here. maybe we should check for <?xml
if (dump.includes("null root node returned by UiTestAutomationBridge")) {
// uncomment for debugging
// const screenshot = await this.getScreenshot();
// console.error("Failed to get UIAutomator XML. Here's a screenshot: " + screenshot.toString("base64"));
continue;
}
return dump.substring(dump.indexOf("<?xml"));
}
throw new ActionableError("Failed to get UIAutomator XML");
}
private async getUiAutomatorXml(): Promise<UiAutomatorXml> {
const dump = await this.getUiAutomatorDump();
const parser = new xml.XMLParser({
ignoreAttributes: false,
attributeNamePrefix: "",
});
return parser.parse(dump) as UiAutomatorXml;
}
private getScreenElementRect(node: UiAutomatorXmlNode): ScreenElementRect {
const bounds = String(node.bounds);
const [, left, top, right, bottom] = bounds.match(/^\[(\d+),(\d+)\]\[(\d+),(\d+)\]$/)?.map(Number) || [];
return {
x: left,
y: top,
width: right - left,
height: bottom - top,
};
}
}
export class AndroidDeviceManager {
private getDeviceType(name: string): AndroidDeviceType {
const device = new AndroidRobot(name);
const features = device.getSystemFeatures();
if (features.includes("android.software.leanback") || features.includes("android.hardware.type.television")) {
return "tv";
}
return "mobile";
}
public getConnectedDevices(): AndroidDevice[] {
try {
const names = execFileSync(getAdbPath(), ["devices"])
.toString()
.split("\n")
.map(line => line.trim())
.filter(line => line !== "")
.filter(line => !line.startsWith("List of devices attached"))
.map(line => line.split("\t")[0]);
return names.map(name => ({
deviceId: name,
deviceType: this.getDeviceType(name),
}));
} catch (error) {
console.error("Could not execute adb command, maybe ANDROID_HOME is not set?");
return [];
}
}
}
```
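The `getScreenElementRect` method above parses UiAutomator `bounds` strings of the form `[left,top][right,bottom]`. When the string does not match, its fields become `NaN`, and `collectElements` then drops the element because `NaN > 0` is false. The following is a minimal standalone sketch of the same parsing with an explicit `null` result instead. The names `Rect` and `parseBounds` are hypothetical and do not exist in `src/android.ts`.

```typescript
// Hypothetical standalone helper; not part of src/android.ts.
interface Rect {
	x: number;
	y: number;
	width: number;
	height: number;
}

// Parses a UiAutomator bounds attribute such as "[0,100][1080,300]".
// Returns null when the string does not match, instead of a rect full of NaN.
function parseBounds(bounds: string): Rect | null {
	const match = bounds.match(/^\[(\d+),(\d+)\]\[(\d+),(\d+)\]$/);
	if (!match) {
		return null;
	}
	// match[0] is the whole string; the four capture groups follow it
	const [left, top, right, bottom] = match.slice(1).map(Number);
	return { x: left, y: top, width: right - left, height: bottom - top };
}
```

A caller could skip elements for which `parseBounds` returns `null`, which makes the filtering explicit rather than relying on comparisons against `NaN`.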
--------------------------------------------------------------------------------
/src/server.ts:
--------------------------------------------------------------------------------
```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { CallToolResult } from "@modelcontextprotocol/sdk/types";
import { z, ZodRawShape, ZodTypeAny } from "zod";
import fs from "node:fs";
import os from "node:os";
import crypto from "node:crypto";
import { execFileSync } from "node:child_process";
import { error, trace } from "./logger";
import { AndroidRobot, AndroidDeviceManager } from "./android";
import { ActionableError, Robot } from "./robot";
import { SimctlManager } from "./iphone-simulator";
import { IosManager, IosRobot } from "./ios";
import { PNG } from "./png";
import { isScalingAvailable, Image } from "./image-utils";
import { getMobilecliPath } from "./mobilecli";
interface MobilecliDevicesResponse {
status: "ok";
data: {
devices: Array<{
id: string;
name: string;
platform: "android" | "ios";
type: "real" | "emulator" | "simulator";
version: string;
}>;
};
}
export const getAgentVersion = (): string => {
const json = require("../package.json");
return json.version;
};
export const createMcpServer = (): McpServer => {
const server = new McpServer({
name: "mobile-mcp",
version: getAgentVersion(),
capabilities: {
resources: {},
tools: {},
},
});
// an empty object to satisfy windsurf
const noParams = z.object({});
const getClientName = (): string => {
try {
const clientInfo = server.server.getClientVersion();
const clientName = clientInfo?.name || "unknown";
return clientName;
} catch (error: any) {
return "unknown";
}
};
const tool = (name: string, description: string, paramsSchema: ZodRawShape, cb: (args: z.objectOutputType<ZodRawShape, ZodTypeAny>) => Promise<string>) => {
const wrappedCb = async (args: ZodRawShape): Promise<CallToolResult> => {
try {
trace(`Invoking ${name} with args: ${JSON.stringify(args)}`);
const response = await cb(args);
trace(`=> ${response}`);
posthog("tool_invoked", { "ToolName": name }).then();
return {
content: [{ type: "text", text: response }],
};
} catch (error: any) {
posthog("tool_failed", { "ToolName": name }).then();
if (error instanceof ActionableError) {
return {
content: [{ type: "text", text: `${error.message}. Please fix the issue and try again.` }],
};
} else {
// a real exception
trace(`Tool '${name}' failed: ${error.message} stack: ${error.stack}`);
return {
content: [{ type: "text", text: `Error: ${error.message}` }],
isError: true,
};
}
}
};
server.tool(name, description, paramsSchema, args => wrappedCb(args));
};
const posthog = async (event: string, properties: Record<string, string | number>) => {
try {
const url = "https://us.i.posthog.com/i/v0/e/";
const api_key = "phc_KHRTZmkDsU7A8EbydEK8s4lJpPoTDyyBhSlwer694cS";
const name = os.hostname() + process.execPath;
const distinct_id = crypto.createHash("sha256").update(name).digest("hex");
const systemProps: any = {
Platform: os.platform(),
Product: "mobile-mcp",
Version: getAgentVersion(),
NodeVersion: process.version,
};
const clientName = getClientName();
if (clientName !== "unknown") {
systemProps.AgentName = clientName;
}
await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json"
},
body: JSON.stringify({
api_key,
event,
properties: {
...systemProps,
...properties,
},
distinct_id,
})
});
} catch (err: any) {
// ignore
}
};
const getMobilecliVersion = (): string => {
try {
const path = getMobilecliPath();
const output = execFileSync(path, ["--version"], { encoding: "utf8" }).toString().trim();
if (output.startsWith("mobilecli version ")) {
return output.substring("mobilecli version ".length);
}
return "failed";
} catch (error: any) {
return "failed " + error.message;
}
};
const getMobilecliDevices = (): MobilecliDevicesResponse => {
const mobilecliPath = getMobilecliPath();
const mobilecliOutput = execFileSync(mobilecliPath, ["devices"], { encoding: "utf8" }).toString().trim();
return JSON.parse(mobilecliOutput) as MobilecliDevicesResponse;
};
const mobilecliVersion = getMobilecliVersion();
posthog("launch", { "MobilecliVersion": mobilecliVersion }).then();
const simulatorManager = new SimctlManager();
const getRobotFromDevice = (device: string): Robot => {
const iosManager = new IosManager();
const androidManager = new AndroidDeviceManager();
const simulators = simulatorManager.listBootedSimulators();
const androidDevices = androidManager.getConnectedDevices();
const iosDevices = iosManager.listDevices();
// Check if it's a simulator
const simulator = simulators.find(s => s.name === device);
if (simulator) {
return simulatorManager.getSimulator(device);
}
// Check if it's an Android device
const androidDevice = androidDevices.find(d => d.deviceId === device);
if (androidDevice) {
return new AndroidRobot(device);
}
// Check if it's an iOS device
const iosDevice = iosDevices.find(d => d.deviceId === device);
if (iosDevice) {
return new IosRobot(device);
}
throw new ActionableError(`Device "${device}" not found. Use the mobile_list_available_devices tool to see available devices.`);
};
tool(
"mobile_list_available_devices",
"List all available devices. This includes both physical devices and simulators. If there is more than one device returned, you need to let the user select one of them.",
{
noParams
},
async ({}) => {
const iosManager = new IosManager();
const androidManager = new AndroidDeviceManager();
const simulators = simulatorManager.listBootedSimulators();
const simulatorNames = simulators.map(d => d.name);
const androidDevices = androidManager.getConnectedDevices();
const iosDevices = await iosManager.listDevices();
const iosDeviceNames = iosDevices.map(d => d.deviceId);
const androidTvDevices = androidDevices.filter(d => d.deviceType === "tv").map(d => d.deviceId);
const androidMobileDevices = androidDevices.filter(d => d.deviceType === "mobile").map(d => d.deviceId);
if (true) {
// gilm: this is new code to verify first that mobilecli detects more or equal number of devices.
// in an attempt to make the smoothest transition from go-ios+xcrun+adb+iproxy+sips+imagemagick+wda to
// a single cli tool.
const deviceCount = simulators.length + iosDevices.length + androidDevices.length;
let mobilecliDeviceCount = 0;
try {
const response = getMobilecliDevices();
if (response.status === "ok" && response.data && response.data.devices) {
mobilecliDeviceCount = response.data.devices.length;
}
} catch (error: any) {
// if mobilecli fails, we'll just set count to 0
}
if (deviceCount === mobilecliDeviceCount) {
posthog("debug_mobilecli_same_number_of_devices", {
"DeviceCount": deviceCount,
"MobilecliDeviceCount": mobilecliDeviceCount,
}).then();
} else {
posthog("debug_mobilecli_different_number_of_devices", {
"DeviceCount": deviceCount,
"MobilecliDeviceCount": mobilecliDeviceCount,
"DeviceCountDifference": deviceCount - mobilecliDeviceCount,
}).then();
}
}
const resp = ["Found these devices:"];
if (simulatorNames.length > 0) {
resp.push(`iOS simulators: [${simulatorNames.join(",")}]`);
}
if (iosDevices.length > 0) {
resp.push(`iOS devices: [${iosDeviceNames.join(",")}]`);
}
if (androidMobileDevices.length > 0) {
resp.push(`Android devices: [${androidMobileDevices.join(",")}]`);
}
if (androidTvDevices.length > 0) {
resp.push(`Android TV devices: [${androidTvDevices.join(",")}]`);
}
return resp.join("\n");
}
);
tool(
"mobile_list_apps",
"List all the installed apps on the device",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you.")
},
async ({ device }) => {
const robot = getRobotFromDevice(device);
const result = await robot.listApps();
return `Found these apps on device: ${result.map(app => `${app.appName} (${app.packageName})`).join(", ")}`;
}
);
tool(
"mobile_launch_app",
"Launch an app on mobile device. Use this to open a specific app. You can find the package name of the app by calling mobile_list_apps.",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you."),
packageName: z.string().describe("The package name of the app to launch"),
},
async ({ device, packageName }) => {
const robot = getRobotFromDevice(device);
await robot.launchApp(packageName);
return `Launched app ${packageName}`;
}
);
tool(
"mobile_terminate_app",
"Stop and terminate an app on mobile device",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you."),
packageName: z.string().describe("The package name of the app to terminate"),
},
async ({ device, packageName }) => {
const robot = getRobotFromDevice(device);
await robot.terminateApp(packageName);
return `Terminated app ${packageName}`;
}
);
tool(
"mobile_install_app",
"Install an app on mobile device",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you."),
path: z.string().describe("The path to the app file to install. For iOS simulators, provide a .zip file or a .app directory. For Android provide an .apk file. For iOS real devices provide an .ipa file"),
},
async ({ device, path }) => {
const robot = getRobotFromDevice(device);
await robot.installApp(path);
return `Installed app from ${path}`;
}
);
tool(
"mobile_uninstall_app",
"Uninstall an app from mobile device",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you."),
bundle_id: z.string().describe("Bundle identifier (iOS) or package name (Android) of the app to be uninstalled"),
},
async ({ device, bundle_id }) => {
const robot = getRobotFromDevice(device);
await robot.uninstallApp(bundle_id);
return `Uninstalled app ${bundle_id}`;
}
);
tool(
"mobile_get_screen_size",
"Get the screen size of the mobile device in pixels",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you.")
},
async ({ device }) => {
const robot = getRobotFromDevice(device);
const screenSize = await robot.getScreenSize();
return `Screen size is ${screenSize.width}x${screenSize.height} pixels`;
}
);
tool(
"mobile_click_on_screen_at_coordinates",
"Click on the screen at given x,y coordinates. If clicking on an element, use the mobile_list_elements_on_screen tool to find the coordinates.",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you."),
x: z.number().describe("The x coordinate to click on the screen, in pixels"),
y: z.number().describe("The y coordinate to click on the screen, in pixels"),
},
async ({ device, x, y }) => {
const robot = getRobotFromDevice(device);
await robot.tap(x, y);
return `Clicked on screen at coordinates: ${x}, ${y}`;
}
);
tool(
"mobile_double_tap_on_screen",
"Double-tap on the screen at given x,y coordinates.",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you."),
x: z.number().describe("The x coordinate to double-tap, in pixels"),
y: z.number().describe("The y coordinate to double-tap, in pixels"),
},
async ({ device, x, y }) => {
const robot = getRobotFromDevice(device);
await robot.doubleTap(x, y);
return `Double-tapped on screen at coordinates: ${x}, ${y}`;
}
);
tool(
"mobile_long_press_on_screen_at_coordinates",
"Long press on the screen at given x,y coordinates. If long pressing on an element, use the mobile_list_elements_on_screen tool to find the coordinates.",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you."),
x: z.number().describe("The x coordinate to long press on the screen, in pixels"),
y: z.number().describe("The y coordinate to long press on the screen, in pixels"),
},
async ({ device, x, y }) => {
const robot = getRobotFromDevice(device);
await robot.longPress(x, y);
return `Long pressed on screen at coordinates: ${x}, ${y}`;
}
);
tool(
"mobile_list_elements_on_screen",
"List elements on screen and their coordinates, with display text or accessibility label. Do not cache this result.",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you.")
},
async ({ device }) => {
const robot = getRobotFromDevice(device);
const elements = await robot.getElementsOnScreen();
const result = elements.map(element => {
const out: any = {
type: element.type,
text: element.text,
label: element.label,
name: element.name,
value: element.value,
identifier: element.identifier,
coordinates: {
x: element.rect.x,
y: element.rect.y,
width: element.rect.width,
height: element.rect.height,
},
};
if (element.focused) {
out.focused = true;
}
return out;
});
return `Found these elements on screen: ${JSON.stringify(result)}`;
}
);
tool(
"mobile_press_button",
"Press a button on device",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you."),
button: z.string().describe("The button to press. Supported buttons: BACK (android only), HOME, VOLUME_UP, VOLUME_DOWN, ENTER, DPAD_CENTER (android tv only), DPAD_UP (android tv only), DPAD_DOWN (android tv only), DPAD_LEFT (android tv only), DPAD_RIGHT (android tv only)"),
},
async ({ device, button }) => {
const robot = getRobotFromDevice(device);
await robot.pressButton(button);
return `Pressed the button: ${button}`;
}
);
tool(
"mobile_open_url",
"Open a URL in browser on device",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you."),
url: z.string().describe("The URL to open"),
},
async ({ device, url }) => {
const robot = getRobotFromDevice(device);
await robot.openUrl(url);
return `Opened URL: ${url}`;
}
);
tool(
"mobile_swipe_on_screen",
"Swipe on the screen",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you."),
direction: z.enum(["up", "down", "left", "right"]).describe("The direction to swipe"),
x: z.number().optional().describe("The x coordinate to start the swipe from, in pixels. If not provided, uses center of screen"),
y: z.number().optional().describe("The y coordinate to start the swipe from, in pixels. If not provided, uses center of screen"),
distance: z.number().optional().describe("The distance to swipe in pixels. Defaults to 400 pixels for iOS or 30% of screen dimension for Android"),
},
async ({ device, direction, x, y, distance }) => {
const robot = getRobotFromDevice(device);
if (x !== undefined && y !== undefined) {
// Use coordinate-based swipe
await robot.swipeFromCoordinate(x, y, direction, distance);
const distanceText = distance ? ` ${distance} pixels` : "";
return `Swiped ${direction}${distanceText} from coordinates: ${x}, ${y}`;
} else {
// Use center-based swipe
await robot.swipe(direction);
return `Swiped ${direction} on screen`;
}
}
);
tool(
"mobile_type_keys",
"Type text into the focused element",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you."),
text: z.string().describe("The text to type"),
submit: z.boolean().describe("Whether to submit the text. If true, the text will be submitted as if the user pressed the enter key."),
},
async ({ device, text, submit }) => {
const robot = getRobotFromDevice(device);
await robot.sendKeys(text);
if (submit) {
await robot.pressButton("ENTER");
}
return `Typed text: ${text}`;
}
);
tool(
"mobile_save_screenshot",
"Save a screenshot of the mobile device to a file",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you."),
saveTo: z.string().describe("The path to save the screenshot to"),
},
async ({ device, saveTo }) => {
const robot = getRobotFromDevice(device);
const screenshot = await robot.getScreenshot();
fs.writeFileSync(saveTo, screenshot);
return `Screenshot saved to: ${saveTo}`;
}
);
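// registered directly on the server rather than through the tool() wrapper,
// because this handler returns image content instead of a text string
server.tool(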
"mobile_take_screenshot",
"Take a screenshot of the mobile device. Use this to understand what's on screen, if you need to press an element that is available through view hierarchy then you must list elements on screen instead. Do not cache this result.",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you.")
},
async ({ device }) => {
try {
const robot = getRobotFromDevice(device);
const screenSize = await robot.getScreenSize();
let screenshot = await robot.getScreenshot();
let mimeType = "image/png";
// validate we received a png, will throw exception otherwise
const image = new PNG(screenshot);
const pngSize = image.getDimensions();
if (pngSize.width <= 0 || pngSize.height <= 0) {
throw new ActionableError("Screenshot is invalid. Please try again.");
}
if (isScalingAvailable()) {
trace("Image scaling is available, resizing screenshot");
const image = Image.fromBuffer(screenshot);
const beforeSize = screenshot.length;
screenshot = image.resize(Math.floor(pngSize.width / screenSize.scale))
.jpeg({ quality: 75 })
.toBuffer();
const afterSize = screenshot.length;
trace(`Screenshot resized from ${beforeSize} bytes to ${afterSize} bytes`);
mimeType = "image/jpeg";
}
const screenshot64 = screenshot.toString("base64");
trace(`Screenshot taken: ${screenshot.length} bytes`);
posthog("tool_invoked", {
"ToolName": "mobile_take_screenshot",
"ScreenshotFilesize": screenshot64.length,
"ScreenshotMimeType": mimeType,
"ScreenshotWidth": pngSize.width,
"ScreenshotHeight": pngSize.height,
}).then();
return {
content: [{ type: "image", data: screenshot64, mimeType }]
};
} catch (err: any) {
error(`Error taking screenshot: ${err.message} ${err.stack}`);
return {
content: [{ type: "text", text: `Error: ${err.message}` }],
isError: true,
};
}
}
);
tool(
"mobile_set_orientation",
"Change the screen orientation of the device",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you."),
orientation: z.enum(["portrait", "landscape"]).describe("The desired orientation"),
},
async ({ device, orientation }) => {
const robot = getRobotFromDevice(device);
await robot.setOrientation(orientation);
return `Changed device orientation to ${orientation}`;
}
);
tool(
"mobile_get_orientation",
"Get the current screen orientation of the device",
{
device: z.string().describe("The device identifier to use. Use mobile_list_available_devices to find which devices are available to you.")
},
async ({ device }) => {
const robot = getRobotFromDevice(device);
const orientation = await robot.getOrientation();
return `Current device orientation is ${orientation}`;
}
);
return server;
};
```
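The `tool` wrapper in `createMcpServer` separates two kinds of failure: an `ActionableError` is returned as ordinary text so the caller can fix the problem and retry, while any other exception is returned with `isError: true`. The sketch below isolates that pattern without the MCP SDK. `ToolResult` and `runTool` are hypothetical names, and the local `ActionableError` is only a stand-in for the class exported by `src/robot.ts`, which is assumed here to extend `Error`.

```typescript
// Stand-in for the ActionableError exported by src/robot.ts (assumed to extend Error).
class ActionableError extends Error {
	constructor(message: string) {
		super(message);
		// keeps instanceof working even when compiling to an ES5 target
		Object.setPrototypeOf(this, ActionableError.prototype);
	}
}

// Hypothetical, simplified shape of a text-only tool result.
interface ToolResult {
	content: Array<{ type: "text"; text: string }>;
	isError?: boolean;
}

// Hypothetical helper mirroring the try/catch inside `tool`:
// user-fixable failures become plain text, everything else sets isError.
async function runTool(cb: () => Promise<string>): Promise<ToolResult> {
	try {
		return { content: [{ type: "text", text: await cb() }] };
	} catch (err: unknown) {
		const message = err instanceof Error ? err.message : String(err);
		if (err instanceof ActionableError) {
			return { content: [{ type: "text", text: `${message}. Please fix the issue and try again.` }] };
		}
		return { content: [{ type: "text", text: `Error: ${message}` }], isError: true };
	}
}
```

Telemetry and tracing are left out of the sketch, so it shows only how the error type decides the shape of the response.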