Hey HN! I built an open-source Python library that lets AI agents control mobile apps like humans do: tap buttons, scroll feeds, fill forms, you name it. It's heavily inspired by browser-use.
In my experience, testing mobile workflows or automating repetitive app tasks has always been a pain. You either write brittle UI automation scripts or do everything manually. After trying browser-use and seeing all the cool things people were building for the web without the pains of traditional automation, I decided I wanted to build something like it, but for mobile.
So I built App Use around Appium with a dead-simple Python interface. You just describe the task in plain English, point it at your app, and the agent uses computer vision to navigate the UI:
from app_use import Agent  # module name assumed to match the pip package
from langchain_openai import ChatOpenAI

app = ...  # your iOS/Android app config

agent = Agent(
    task="Order some tacos from the nearest restaurant",
    llm=ChatOpenAI(model="gpt-4o"),
    app=app,
)
await agent.run()
The agent takes screenshots, analyzes the UI, and performs actions step-by-step while explaining what it's doing. We've tested it on everything from shopping apps to social media—it can handle complex multi-step flows like creating accounts, making purchases, or posting content.
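Conceptually, the loop looks something like this (an illustrative sketch, not App Use's actual internals; take_screenshot, decide_next_action, and perform are hypothetical helpers):

async def run_agent(task, llm, app, max_steps=25):
    # Illustrative perception -> reasoning -> action loop.
    # take_screenshot, decide_next_action, and perform are hypothetical
    # helpers, not App Use's real API.
    history = []
    for _ in range(max_steps):
        screenshot = take_screenshot(app)  # capture the current screen
        action = await decide_next_action(llm, task, screenshot, history)
        if action.is_done:                 # the LLM decided the task is complete
            return action.result
        perform(app, action)               # tap / scroll / type via Appium
        history.append(action)             # remember what we did
    raise TimeoutError("Task not completed within max_steps")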
What makes it different from existing mobile automation tools:
Natural language tasks instead of writing XPath selectors
Works with any LLM (OpenAI, Anthropic, Gemini, etc.; see the snippet after this list)
Vision-based navigation so it adapts when UIs change
Memory system to remember app states and user preferences
Cross-platform (iOS + Android) with the same API
All the things you love about browser-use but for mobile
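For example, swapping providers should just mean changing the llm argument (a sketch, assuming the LangChain-style chat models from the example above):

from langchain_anthropic import ChatAnthropic

# Same Agent, same task; only the llm argument changes.
agent = Agent(
    task="Order some tacos from the nearest restaurant",
    llm=ChatAnthropic(model="claude-3-5-sonnet-20241022"),
    app=app,
)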
Current limitations: it's only as smart as the LLM you give it, it can be slow on complex tasks, and it occasionally gets confused by unusual UI patterns. But for most common mobile workflows, it works surprisingly well.
Install with: pip install app-use
You'll also need Appium running and your device/emulator connected. The setup takes about 5 minutes if you're familiar with mobile development.
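If you want to sanity-check the Appium side first, the standard appium-python-client (independent of App Use) can confirm the server and device are talking:

# Smoke test: can Appium reach the device? Uses plain appium-python-client,
# not App Use. Assumes an Appium 2.x server on the default port.
from appium import webdriver
from appium.options.android import UiAutomator2Options

options = UiAutomator2Options()
options.device_name = "emulator-5554"         # from `adb devices`
options.app_package = "com.android.settings"  # any installed app works here
options.app_activity = ".Settings"

driver = webdriver.Remote("http://127.0.0.1:4723", options=options)
print(driver.current_package)                 # should print com.android.settings
driver.quit()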
I'm curious what use cases this unlocks for you! Some I've been thinking about:
QA teams automating regression tests across app updates (see the pytest sketch after this list)
Vibe Debugging tools
Data collection from apps that don't have APIs
Automating tasks on mobile-first apps (stuff browser-use wouldn't be able to do)
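On the QA front, a regression test can be as thin as wrapping a task in pytest (a sketch; I'm assuming agent.run() returns something inspectable, as it does in browser-use, and that your own fixture builds the app config):

# Sketch of an agent-driven regression test (pytest + pytest-asyncio).
# Assumes `from app_use import Agent` and that agent.run() returns a
# result you can assert on; adapt to the library's actual return type.
import pytest
from app_use import Agent
from langchain_openai import ChatOpenAI

@pytest.mark.asyncio
async def test_checkout_flow(app):  # `app` comes from your own fixture
    agent = Agent(
        task="Add any item to the cart and reach the payment screen",
        llm=ChatOpenAI(model="gpt-4o"),
        app=app,
    )
    result = await agent.run()
    assert result is not None  # replace with real checks on the run history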
The whole thing is open source at https://github.com/itsericktorres/app-use. I'd love feedback, especially if you've struggled with mobile app automation before or have ideas for making this more robust.
Let me know what you think, and feel free to ask questions!
Here's a quick demo https://x.com/itsericktorres/status/1932996729458110482