Ask HN: Experience automating E2E manual testing with AI

rudderdev | 9/8/2025, 6:22:19 AM
I see lots of discussions around using AI in testing. Let's make this discussion more objective and useful by sharing concrete experiences. Here's mine: using AI to automate e2e manual testing (especially where user interaction is required).

What I’m testing: the RudderStack iOS SDK, which tracks customer event data and sends it to various product, marketing, and business tools.

The problem in my current testing workflow: Manual testing is important for quality assurance, but testing the RudderStack SDK manually requires several time-consuming and error-prone steps: planning the specific test steps, performing the interactions, reviewing long stretches of log text, and then verifying the logs, which includes comparing long IDs by eye.

The solution I experimented with: I used an LLM to plan the test steps, mobile-mcp to simulate the user interactions (clicking buttons such as track, reset, track, etc.), the LLM again to review the logs (verifying the event ID changes sent to the server), and finally to prepare a comprehensive report. All of this is packaged as an MCP server that works in my IDE (Cursor), with test cases written as plain-English prompts.
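
For context, wiring this into Cursor is just MCP server registration in .cursor/mcp.json, roughly like this (a sketch; the mobile-mcp package name and the path to my own server are assumptions to adapt, not exact values):

    {
      "mcpServers": {
        "mobile-mcp": {
          "command": "npx",
          "args": ["-y", "@mobilenext/mobile-mcp@latest"]
        },
        "e2e-test-runner": {
          "command": "node",
          "args": ["./mcp/e2e-test-runner.js"]
        }
      }
    }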

Result: the agent did click through track → reset → track and caught the anonymous ID change (which confirms the SDK's tracking behaved properly).
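
The check itself is simple to express; what the agent saves me is fishing the IDs out of the logs. A minimal sketch of the verification in Python, assuming each logged event is a JSON payload carrying RudderStack's anonymousId field:

    import json

    def anonymous_id(event_json: str) -> str:
        # Pull the anonymousId out of one logged event payload.
        return json.loads(event_json)["anonymousId"]

    # Two (fabricated) payloads as they'd appear in the SDK's verbose logs:
    before = '{"event": "track", "anonymousId": "6d5e0f2a-aaaa"}'
    after  = '{"event": "track", "anonymousId": "9f2c44b1-bbbb"}'

    # After reset(), the SDK must mint a fresh anonymous ID.
    assert anonymous_id(before) != anonymous_id(after), "anonymousId did not rotate"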

What actually worked:

- Once set up, it caught the regression correctly

- Consistent results vs. my manual testing, where I sometimes miss things

Issues I ran into:

- Had to write extremely detailed step-by-step instructions and extensive context; if I missed anything, it just failed (see the example prompt after this list)

- WebDriver setup on port 4723 was finicky (a pre-flight port check, sketched after this list, is one way to fail fast)

- It is slow: it took 2 minutes for what should be a 30-second manual test
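
To give a sense of the detail level required, a prompt for the anonymous ID test has to spell out something like this (illustrative; exact button names depend on the sample app):

    Test: anonymous ID rotation on reset
    1. Launch the sample app on the iOS simulator.
    2. Tap "Track". Capture the logged event payload.
    3. Tap "Reset".
    4. Tap "Track" again. Capture the new payload.
    5. Compare the anonymousId fields of the two payloads.
       Report PASS if they differ, FAIL otherwise, quoting both IDs.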
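
And for the port issue, a minimal pre-flight sketch in Python (4723 is Appium's default WebDriver port):

    import socket
    import sys

    def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
        # True if something is already listening on host:port.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            return s.connect_ex((host, port)) == 0

    if port_in_use(4723):
        sys.exit("Port 4723 is busy -- is a stale Appium/WebDriver session still running?")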

Biggest problem: The amount of upfront work to get it running properly. I spent more time writing instructions than I would have just testing manually.

The real value might be in consistency for regression testing, not speed. But the initial investment is rough.

What would make this useful:

I need a workflow where, based on the feature or fixes, agents automatically generate test cases (including all edge cases) targeting the code impacted by the changes, and then perform a thorough end-to-end QA.
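
The rough shape of that pipeline, as a sketch (call_llm() is a hypothetical stand-in for whatever model API is used, and the diff scope path is illustrative):

    import subprocess

    def branch_diff(base: str = "main") -> str:
        # Diff of the working branch against base; the path filter is illustrative.
        result = subprocess.run(
            ["git", "diff", base, "--", "Sources/"],
            capture_output=True, text=True, check=True,
        )
        return result.stdout

    def call_llm(prompt: str) -> str:
        # Hypothetical stand-in for the model API of your choice.
        raise NotImplementedError

    PROMPT = (
        "You are QA for a mobile event-tracking SDK. Given this diff, "
        "write end-to-end test cases (including edge cases) as numbered "
        "plain-English steps an agent can execute:\n\n{diff}"
    )

    test_cases = call_llm(PROMPT.format(diff=branch_diff()))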

Has anyone else tried automating QA with AI? How was your experience, and how did you resolve the challenges you faced? (I want to find practices I can incorporate into my workflow.)
