Show HN: I built a desktop app that indexes your media locally

Comments (1)

correa_brian · 1h ago

Hey everyone, I'm Brian, one of the makers of Cosmos, a desktop app that makes your entire media collection, including external hard drives, searchable using local ML models.

With your catalog indexed, you can use existing content to generate videos (text-to-video and image-to-video) using Veo 3. To try this out you'll need to bring your own Gemini API key. Obviously this part is not private since you are using Google's AI, but the generations get saved to your desktop and imo it's less clunky than the Google Videos UI. We also added a prompt pre-processing step to enrich the original user input: we use Gemini to create a structured JSON prompt that includes detailed information on lighting, audio, characters, and mood, to name a few. In my experience this makes it easier to preserve continuity across your scenes.
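For anyone curious what that pre-processing step could look like, here is a minimal sketch, not our actual code. The key names simply mirror the four fields mentioned above, and `build_enrichment_instruction` and `parse_structured_prompt` are hypothetical helpers. The call to the model itself is left out; the sketch only builds the instruction text and checks the reply.

```python
import json

# Hypothetical key names: the post only says the structured prompt covers
# lighting, audio, characters, and mood (among others).
REQUIRED_FIELDS = ("lighting", "audio", "characters", "mood")


def build_enrichment_instruction(user_input: str) -> str:
    """Build the text asking the model to expand a short idea into JSON."""
    fields = ", ".join(REQUIRED_FIELDS)
    return (
        "Rewrite the following video idea as a JSON object with the keys "
        f"{fields}. Respond with JSON only.\n\nIdea: {user_input}"
    )


def parse_structured_prompt(raw: str) -> dict:
    """Parse the model's reply and reject it if an expected key is missing."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_FIELDS if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data
```

Validating the reply before handing it to the video model means a malformed or partial response fails early instead of silently producing a scene with, say, no audio description.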

I want to experiment with some local generation models soon so Cosmos can function 100% offline (I've read good things about Wan 2.1 and Stable Diffusion). I really like working with local models (we also use Whisper for audio-to-text transcription) and think that long-term everyone will want at least some portion of their data managed by private, offline models.

If you are curious about building something like this for yourself, below is a rough outline:

- Pick a platform or a cross-platform tool for your build (we started with Electron and eventually moved to Tauri).
- Select your ML models. There are plenty of open-source image and text embedding models (CLIP, SigLIP, Nomic).
- Design a media processing pipeline that won't fry your users' computers (pro tip: you're going to want to throttle indexing when CPU utilization gets too high).
- Experiment with well-known open-source media tools like ImageMagick and FFmpeg. These are more than enough to extract frames, clip videos, or do anything else you might want with a piece of media in your pre/post-processing.
- Database choice: there are lots of choices for DBs, but in my experience simpler is better. We started with Redis (it was overkill) and eventually migrated to SQLite with a vector embedding extension. Haven't tried Qdrant, Pinecone, or ChromaDB, but SQLite works great for this use case.
- If you want to support online AI platforms like OpenAI or Anthropic, then you'll need to manage API keys and HTTP requests to these services (or maybe MCP? Don't know much about that yet).
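To make the throttling and database points a bit more concrete, here is a rough Python sketch, not what Cosmos ships. Everything in it is an assumption for illustration: `read_cpu` is any callable returning CPU utilization in percent (something like `psutil.cpu_percent` could fill that role), `embed` stands in for your embedding model, and the search is a brute-force cosine scan over BLOBs in plain SQLite rather than a vector extension.

```python
import math
import sqlite3
import struct
import time
from typing import Callable, Iterable, List, Sequence


def wait_for_cpu(read_cpu: Callable[[], float], limit: float = 80.0,
                 pause: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> None:
    """Block until read_cpu() reports utilization at or below `limit` percent."""
    while read_cpu() > limit:
        sleep(pause)


def pack(vec: Sequence[float]) -> bytes:
    """Serialize a vector as little-endian float32 values."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack(blob: bytes) -> Sequence[float]:
    """Inverse of pack(): 4 bytes per float32."""
    return struct.unpack(f"<{len(blob) // 4}f", blob)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def index_files(db: sqlite3.Connection, paths: Iterable[str],
                embed: Callable[[str], Sequence[float]],
                read_cpu: Callable[[], float],
                sleep: Callable[[float], None] = time.sleep) -> None:
    """Embed each file, pausing before each one while the CPU is busy."""
    db.execute("CREATE TABLE IF NOT EXISTS media "
               "(path TEXT PRIMARY KEY, embedding BLOB)")
    for path in paths:
        wait_for_cpu(read_cpu, sleep=sleep)
        db.execute("INSERT OR REPLACE INTO media VALUES (?, ?)",
                   (path, pack(embed(path))))
    db.commit()


def search(db: sqlite3.Connection, query_vec: Sequence[float],
           k: int = 5) -> List[str]:
    """Return the k stored paths most similar to query_vec (brute force)."""
    rows = db.execute("SELECT path, embedding FROM media").fetchall()
    scored = [(cosine(query_vec, unpack(blob)), path) for path, blob in rows]
    return [path for _, path in sorted(scored, reverse=True)[:k]]
```

A full scan like this is fine for a small library. Once the catalog grows, that is where a vector extension (or an approximate index) starts to pay off.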

Demo https://www.youtube.com/watch?v=qHPl_n-HlP4

