Vectorless: open-source PDF chatbot without RAG

3 richardmeng 0 8/11/2025, 4:09:04 AM

Open-sourcing "Vectorless", a new PDF chatbot without embedding vectors.

GitHub repo: https://github.com/roe-ai/vectorless-chatbot
Demo app: https://vectorless-chatbot.vercel.app/

How it works:
1. Select best docs – Feed the LLM high-level descriptions + doc names. It picks which docs to use.
2. Select best pages – The Agent goes through the doc pages and pulls out the most relevant pages for your question.
3. Gather and answer – The Agent takes all the relevant pages from step 2 and gives you the final answer.
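The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the repo: the `LLM` callable, the `Doc` type, the prompt wording, and the per-page yes/no check are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical LLM interface: takes a prompt string, returns the model's text reply.
LLM = Callable[[str], str]


@dataclass
class Doc:
    name: str
    description: str   # high-level description shown to the LLM in step 1
    pages: list[str]   # page contents, in order


def select_docs(llm: LLM, question: str, docs: list[Doc]) -> list[Doc]:
    """Step 1: show the LLM names + descriptions and let it pick documents."""
    catalog = "\n".join(f"{d.name}: {d.description}" for d in docs)
    reply = llm(
        f"Question: {question}\nDocuments:\n{catalog}\n"
        "Reply with relevant document names, one per line."
    )
    wanted = {line.strip() for line in reply.splitlines() if line.strip()}
    return [d for d in docs if d.name in wanted]


def select_pages(llm: LLM, question: str, doc: Doc) -> list[tuple[int, str]]:
    """Step 2: walk the pages and keep the ones the LLM marks as relevant."""
    picked = []
    for number, text in enumerate(doc.pages, start=1):
        reply = llm(
            f"Question: {question}\nPage {number} of {doc.name}:\n{text}\n"
            "Is this page relevant? Answer yes or no."
        )
        if reply.strip().lower().startswith("yes"):
            picked.append((number, text))
    return picked


def answer(llm: LLM, question: str, docs: list[Doc]) -> str:
    """Step 3: gather the selected pages and ask for the final answer."""
    context = []
    for doc in select_docs(llm, question, docs):
        for number, text in select_pages(llm, question, doc):
            context.append(f"[{doc.name} p.{number}]\n{text}")
    return llm(
        f"Question: {question}\nAnswer using only these pages:\n"
        + "\n\n".join(context)
    )
```

Note that no embeddings or vector index appear anywhere: every selection decision is a prompt, which is what makes the behavior steerable and also what makes it slower than a vector lookup.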

Advantages:
1. It's more predictable than vectors. You can tell the Agent exactly how you want to analyze your files.
2. You can ask abstract questions like: “How does NVIDIA compare to AMD in terms of risk?”
3. You can ask aggregate questions like: “How many questions in this SOC 2 report are marked negative?”
4. It supports multimodal questions and documents by nature.

Disadvantages:
1. To scale, step 1 relies on high-quality metadata about the documents.
2. Step 2 can be wasteful: when the user asks a simple follow-up question, the pages are selected again even though the existing context could be reused.
3. It is slower than vector-search chat.
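One way the follow-up waste in step 2 might be avoided is to keep the last selected pages in the chat session and ask the LLM whether they already cover the new question. The sketch below is only an assumption about how that could look: `Session`, `pages_for`, the `select_pages` callback, and the prompt text are hypothetical names, not part of the repo.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical LLM interface: takes a prompt string, returns the model's text reply.
LLM = Callable[[str], str]

# A selected page: (document name, page number, page text).
Page = tuple[str, int, str]


@dataclass
class Session:
    # Pages selected for the most recent question in this chat.
    pages: list[Page] = field(default_factory=list)


def pages_for(
    llm: LLM,
    session: Session,
    question: str,
    select_pages: Callable[[str], list[Page]],
) -> list[Page]:
    """Reuse the session's pages for a follow-up if the LLM says they suffice."""
    if session.pages:
        context = "\n\n".join(text for _, _, text in session.pages)
        reply = llm(
            f"Question: {question}\nPages:\n{context}\n"
            "Can these pages answer the question? Answer yes or no."
        )
        if reply.strip().lower().startswith("yes"):
            return session.pages  # skip the expensive page scan
    # First question, or the cached pages are not enough: run step 2 again.
    session.pages = select_pages(question)
    return session.pages
```

The check costs one extra LLM call per follow-up, so it only pays off when a full page scan is substantially more expensive than that single call.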

How it will scale:
1. For step 1, we envision structured metadata retrieval via text-to-SQL to locate document paths based on the user's question.
2. Step 2 can be improved by caching. We envision that once a document has been queried, a table of contents can be stored, evolved, and leveraged as future questions come in.
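To make the text-to-SQL idea for step 1 concrete, here is a small sketch using Python's built-in `sqlite3`. The `documents` table, its columns, and the `text_to_sql` callback (standing in for an LLM that turns a question into a query) are assumptions for illustration; the post does not specify a schema.

```python
import sqlite3
from typing import Callable


def build_catalog(rows: list[tuple[str, str, str, int]]) -> sqlite3.Connection:
    """Create an in-memory metadata table: one row per document."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        "path TEXT PRIMARY KEY, company TEXT, doc_type TEXT, year INTEGER)"
    )
    conn.executemany("INSERT INTO documents VALUES (?, ?, ?, ?)", rows)
    return conn


def locate_documents(
    conn: sqlite3.Connection,
    question: str,
    text_to_sql: Callable[[str], str],
) -> list[str]:
    """Ask the (hypothetical) text-to-SQL model for a query and return paths."""
    sql = text_to_sql(question)
    # Naive guard, not a real SQL sanitizer: accept only one SELECT statement.
    body = sql.strip().rstrip(";")
    if not body.lower().startswith("select") or ";" in body:
        raise ValueError("expected a single SELECT statement")
    # The query is expected to return the document path in the first column.
    return [row[0] for row in conn.execute(body)]
```

A question such as “How does NVIDIA compare to AMD in terms of risk?” might be translated into `SELECT path FROM documents WHERE company IN ('NVIDIA', 'AMD') AND doc_type = '10-K'`, so only the matching files move on to page selection.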
