Show HN: Sosumi.ai – Convert Apple Developer docs to AI-readable Markdown
The problem? Apple's docs are JavaScript-rendered, so when you paste URLs into AI tools, they just see a blank page. Copy-pasting works but... c'mon.
So I built something that converts Apple Developer docs to clean markdown. Just swap developer.apple.com with sosumi.ai in any Apple docs URL and you get AI-readable content.
For example:
- Before: https://developer.apple.com/documentation/swift/double
- After: https://sosumi.ai/documentation/swift/double
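If you'd rather script the swap than edit URLs by hand, here's a minimal sketch. The function name `toSosumiUrl` is made up for illustration; only the host swap itself comes from the post.

```typescript
// Rewrite an Apple Developer docs URL to its sosumi.ai equivalent.
// Only the host changes; path, query, and fragment are preserved.
// `toSosumiUrl` is a hypothetical helper name, not part of sosumi.ai.
function toSosumiUrl(appleUrl: string): string {
  const url = new URL(appleUrl);
  if (url.hostname !== "developer.apple.com") {
    throw new Error(`Not an Apple Developer URL: ${appleUrl}`);
  }
  url.hostname = "sosumi.ai";
  return url.toString();
}

console.log(toSosumiUrl("https://developer.apple.com/documentation/swift/double"));
// prints "https://sosumi.ai/documentation/swift/double"
```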
The site itself is a small Hono app running on Cloudflare Workers. Apple's docs are actually available as structured data, but Apple doesn't make it obvious how to get it. So what this does is map the URLs, fetch the original JSON, and render as Markdown.
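Roughly, the map / fetch / render steps could be sketched like this. This is not sosumi's actual code: the JSON URL pattern and the simplified `DocJson` shape below are assumptions for illustration, and the real payload carries many more fields.

```typescript
// Assumed, heavily simplified shape of the docs JSON (hypothetical).
interface InlineText { type: string; text?: string }
interface DocJson {
  metadata: { title: string };
  abstract?: InlineText[];
}

// Step 1: map a docs page path to the (assumed) JSON data URL.
function jsonUrlFor(path: string): string {
  const clean = path.replace(/^\/+|\/+$/g, ""); // strip leading/trailing slashes
  return `https://developer.apple.com/tutorials/data/${clean}.json`;
}

// Step 3: render the parsed JSON as Markdown (title plus abstract only).
function renderMarkdown(doc: DocJson): string {
  const abstract = (doc.abstract ?? []).map((part) => part.text ?? "").join("");
  const heading = `# ${doc.metadata.title}\n`;
  return abstract ? `${heading}\n${abstract}\n` : heading;
}

// Step 2 ties the pieces together: fetch the JSON, then render it.
async function convert(path: string): Promise<string> {
  const response = await fetch(jsonUrlFor(path));
  if (!response.ok) throw new Error(`Upstream returned ${response.status}`);
  return renderMarkdown((await response.json()) as DocJson);
}
```

In a Worker, `convert` would sit behind a route handler that passes the request path through; keeping the mapping and rendering as pure functions makes them easy to test without hitting the network.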
It also provides an MCP interface that includes a tool to search the Apple developer website, which is helpful.
Anyway, please give this a try and let me know what you think!
I looked at the examples you posted and did a quick glance. For example
'''init?(exactly: Float80)'''
the tool converted it to
'''- [initexactly-63925](/documentation/Swift/Double/init(exactly:)-63925)'''
Given its goal, I would be worried that it dropped the verbatim function signature. Claude still figured it out, but for more obscure stuff that could be an issue.
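One way to keep the signature would be to use the reference's display title as the link text instead of a slug. A sketch, assuming the source data exposes a title and URL per reference (the `Reference` shape and `renderLink` name are hypothetical):

```typescript
// Hypothetical reference shape: a display title plus the target path.
interface Reference { title: string; url: string }

// Render a list item whose link text is the verbatim signature.
// Backticks keep characters like `?` and `:` from being mangled.
function renderLink(ref: Reference): string {
  return `- [\`${ref.title}\`](${ref.url})`;
}

console.log(renderLink({
  title: "init?(exactly: Float80)",
  url: "/documentation/Swift/Double/init(exactly:)-63925",
}));
```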
Because that’s the author's actual goal? To take a web page that looks fine to human eyes but is unintuitively not accessible to AI. That’s genuinely useful and valuable.
Sure it’s no different than converting it to markdown for human eyes. But it’s important to be clear about not just WHAT but also WHY.
C’mon now. This isn’t controversial or even bad.
How hard would it be to build an MCP that's basically a proxy for web search except it always tries to build the markdown version of the web pages instead of passing HTML?
Basically Sosumi.ai, but instead of working only for Apple docs it works for any web page (including every doc on the internet)
But stripping complex formats like html & pdf down to simple markdown is a hard problem. It's nearly impossible to infer what the rendered page looks like by looking at the raw html / pdf code. https://github.com/mozilla/readability helps but it often breaks down over unconventional div structures. I heard the state of the art solution is using multimodal LLM OCR to really look at the rendered page and rewrite the thing in markdown.
Which makes me wonder: how did OpenAI make their model read pdf, docx and images at all?
https://jina.ai/reader
But with these RLHF'd AIs, being confident and helpful as they are, it took me a while to realize that they couldn't actually read the Apple developer links I was giving them. Like a kid who can't read the chalkboard, but doesn't realize they need glasses.
If GitHub could support .docc files, that would be great. Otherwise, I still use Jazzy Docs.
Long live Jazzy.
Thanks!
Curious how it handles some of the concurrency stuff. Actors, async/await, etc.
- https://developer.apple.com/documentation/swift/tasklocal
- https://sosumi.ai/documentation/swiftui/button
- https://sosumi.ai/documentation/foundation/measurement
Most of the docs about the language itself live on swift.org (e.g. https://docs.swift.org/swift-book/documentation/the-swift-pr...). I've had pretty good luck getting Claude to write modern Swift code by saying things like "Use Swift 6 structured concurrency instead of GCD". But I could totally see expanding sosumi to include swift.org content, too.
Edit: Another Swift developer life hack: for new language features, copy-pasting the Swift Evolution proposal works pretty well! https://github.com/swiftlang/swift-evolution/tree/main/propo...
It’s ok to just start coding with a public repo. Code isn’t a secret.
I remember an early experience in my working career, when someone was sharing their sample code with a group, to demonstrate a particular concept. And one of those present picked them up on their use of magic numbers, as if that was at all relevant in the context.
I don't blame anyone for being wary of showing their work in progress. Painters often don't like their subjects trying to take a sneak peek at their work in progress, as another example.
- Not wanting to get roasted
- Open source = dealing with a lot of entitlement
And the list goes on. Putting code out into the world (publicly) often sets you up for future obligation of some kind (even if it’s just saying “no”).
None of this is a stance against open source, but I understand where people are coming from.
Yes, that is why I quit using Claude and swapped to ChatGPT about a year ago. I've had substantially fewer issues with GPT.
Also, Apple has started shipping docs like this, too. They are a bit hidden but you can find them here:
/Applications/Xcode-beta.app/Contents/PlugIns/IDEIntelligenceChat.framework/Versions/A/Resources/AdditionalDocumentation
I think this one would be slightly better if it rendered that Markdown as simple HTML if accessed through a real browser, but I can imagine even this version being pretty useful.
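Telling a "real browser" apart could be as simple as checking the `Accept` header. A rough sketch, not anything the site does today as far as I know; `prefersHtml` is a made-up name and this ignores q-values on purpose:

```typescript
// Return true when the Accept header explicitly lists text/html.
// Browsers typically send "text/html,application/xhtml+xml,...",
// while many CLI tools send only "*/*". Simplification: q-values are ignored.
function prefersHtml(accept: string | null): boolean {
  if (!accept) return false;
  return accept
    .split(",")
    .some((part) => (part.split(";")[0] ?? "").trim().toLowerCase() === "text/html");
}

console.log(prefersHtml("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"));
console.log(prefersHtml("*/*"));
```

A handler could then return rendered HTML when this is true and fall back to raw Markdown otherwise, so AI tools and scripts keep getting the plain version.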
I think it could also make the "Small web" crowd pretty happy too.
Personally I feel that this whole AI-induced problem shouldn't even exist in the first place. But even so, it's ridiculous that you have to query some web API to solve it. Why not just publish a set of local files, parsed and converted to .md, and be done with it?
Apple's ToS pretty explicitly forbid the kind of automation required to download everything. But even if someone did that, it'd only be a snapshot in time. And a lot can change between OS releases.
As for the hosted web app, I wanted to provide this as a public service. I plan to open source it, so anyone can self-host instead, if they're inclined.