We need an MCP-like standard for interacting with UI components.
I want to be able to use my browser and mobile apps hands-free. I want my AI assistant to be able to do anything I can.
To do this, I suggest that every web page and every app screen declare its possible actions in an LLM-readable format like
[{
"description": "seek the video by an integer number of seconds, where positive means forward and negative means backward",
"callback": callback,
},...
]
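
Here is a rough TypeScript sketch of how a page might declare such an action and how an assistant might call it. Everything beyond "description" and "callback" is my assumption: the "name" field, the describeActions and invokeAction helpers, and the stand-in video object (used instead of a real video element so the sketch runs outside a browser) are hypothetical and not part of any existing standard.

```typescript
// Hypothetical shape of one declared action; not taken from any existing standard.
interface UiAction {
  name: string; // assumed identifier the assistant uses to pick an action
  description: string; // natural-language text the LLM reads
  callback: (arg: number) => void; // what the page actually runs
}

// Stand-in for a real video element so the sketch runs outside a browser.
const video = { currentTime: 60, duration: 300 };

const actions: UiAction[] = [
  {
    name: "seekVideo",
    description:
      "Seek the video by an integer number of seconds, where positive means forward and negative means backward.",
    callback: (seconds: number) => {
      // Clamp so the assistant cannot seek outside the video.
      const target = video.currentTime + seconds;
      video.currentTime = Math.min(Math.max(target, 0), video.duration);
    },
  },
];

// What the assistant would see: names and descriptions only,
// since callbacks cannot be serialized to JSON.
function describeActions(list: UiAction[]): string {
  return JSON.stringify(
    list.map(({ name, description }) => ({ name, description }))
  );
}

// What the assistant would call after choosing an action.
function invokeAction(list: UiAction[], name: string, arg: number): void {
  const action = list.find((a) => a.name === name);
  if (!action) throw new Error(`Unknown action: ${name}`);
  if (!Number.isInteger(arg)) throw new Error("Argument must be an integer");
  action.callback(arg);
}

console.log(describeActions(actions));
invokeAction(actions, "seekVideo", -15);
console.log(video.currentTime); // 45
```

The point of splitting it this way is that the assistant only ever gets the text descriptions, while the callbacks stay inside the page. A real standard would probably also need typed parameter schemas and some kind of permission model, but that is out of scope for this sketch.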