Show HN: VexFS – A Kernel-Native Semantic Filesystem for AI Agents

4 points | lspecian | 3 comments | 5/26/2025, 10:24:46 AM
Hi HN,

I started building VexFS yesterday — a kernel-native filesystem that stores vector embeddings alongside files, supports semantic search (brute-force now, HNSW later), and exposes everything through a minimal IOCTL + mmap interface.

Think of it as:

- A semantic memory layer for local AI agents
- RAG without a vector DB
- Vector search as an OS primitive

It’s early. It barely works. But it boots.

Why? Because if memory’s not snapshotable, it’s not memory. And maybe, just maybe, agents deserve a /mnt/mem they can mount natively.

When I asked Gemini what it thought of the idea, it said:

“An OS-level semantic context layer like this could enable more powerful, context-aware, and efficient AI systems.”

Not sure if it's a brilliant idea or a kernel panic waiting to happen. Either way, I’d love your feedback (and flames).

https://github.com/lspecian/vexfs

Comments (3)

solomatov · 1d ago
Why should it be at the kernel level? What advantages does that give?
lspecian · 1d ago
To clarify even further how this might translate into real-world advantage: I’m not trying to displace databases or reinvent grep, I’m aiming at use cases where agents need durable, fast-access memory that doesn’t disappear when a container shuts down.

Imagine an agent running inside a microVM or container. It doesn't call Pinecone or Redis; it mounts VexFS. When it stores an embedding, it isn't sent to a remote vector store; it's written alongside the file it came from, locally and semantically indexed. The agent can reboot, restart, or crash and still recall the same memory, because that memory lives inside the OS, not in RAM or ephemeral middleware.

This also means the agent can snapshot its cognitive state: "what I knew before I re-planned," or "the thoughts I embedded before the prompt changed." These aren't just filesystem snapshots; they're points in vector space tied to contextual memory. You could even branch them, like cognitive git commits.

Even search becomes something different. Instead of listing paths or grep results, you’re getting the files most semantically relevant to a query—right at the kernel level, with mmap access or zero-copy responses.

Most importantly, all of this runs without needing an external stack. No HTTP, no gRPC, no network calls. Just a vector-native FS the agent can think through.

Still early, very early. Still rough. But if we’re going to build systems where agents operate autonomously, they’ll need more than tokens—they’ll need memory. And I think that needs to live closer to the metal.

lspecian · 1d ago
You're right to raise this.

VexFS is part of a broader idea: building infrastructure not for humans — but for agents.

Traditional filesystems, databases, and tools are all designed with human developers in mind:

- POSIX APIs
- Bash shells
- REST endpoints
- Metadata we interpret visually

But intelligent agents don't think like us. They need:

- Low-latency, associative memory
- Contextual retrieval, not paths and filenames
- Snapshotting of cognitive state, not just byte blocks
- Direct memory-mapped embeddings, not serialized APIs

So why give them human abstractions?

VexFS is an experiment in flipping that. It’s not optimized for you — it’s optimized for the agents you’re about to spawn.

Maybe we need:

- Filesystems that index vectors, not filenames
- Kernel modules that serve memory, not storage
- Logs that store intent, not just stdout

It's not about making AI fit Unix. It’s about asking:

"What would Unix look like if it evolved under the pressure of AI, not sysadmins?"

That’s what I’m trying to find out.

Would love feedback on what other parts of the stack should be rethought for agents first. VFS? IPC? Memory management?