Show HN: An Open-Source E-Book Reader for Conversational Reading with an LLM
The problem: Traditional e-readers are passive. When you encounter something unclear, you have to context-switch to search for it. Your highlights and notes remain isolated, and you can't easily connect ideas across different books.
My solution: BookWith embeds an AI that maintains full context of what you're reading. It features:
- Context-aware AI chat: Ask questions about the current page/chapter and get instant answers
- AI podcast generation: Automatically converts book content into conversational podcasts using Google Cloud TTS
- Multi-layer memory system: Short-term (last 5 conversations), mid-term (summarized every 20), and long-term (vector search) memory that maintains continuity across reading sessions
- Smart annotations: 5-color highlighting system that AI can reference and analyze
Technical stack: Built as a fork of Flow (epub reader), with added LLM integration and vector database for semantic search. Supports multiple LLMs and languages (EN/JA/ZH).
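For anyone curious how the three memory layers listed above might fit together, here is a minimal sketch. Only the numbers (last 5 conversations, a summary every 20) come from the post; the `LayeredMemory` class, its method names, and the `summarize` callback are hypothetical and not taken from BookWith's code. The long-term layer is a plain list standing in for a vector database.

```python
from collections import deque
from typing import Callable, List


class LayeredMemory:
    """Hypothetical three-layer conversation memory (not BookWith's actual code)."""

    def __init__(
        self,
        summarize: Callable[[List[str]], str],
        short_term_size: int = 5,   # "last 5 conversations" from the post
        summary_every: int = 20,    # "summarized every 20" from the post
    ) -> None:
        self.summarize = summarize
        self.summary_every = summary_every
        # Short-term: a bounded queue that silently drops the oldest entry.
        self.short_term: deque = deque(maxlen=short_term_size)
        # Conversations waiting to be folded into the next summary.
        self.pending: List[str] = []
        # Mid-term: one summary per full batch of conversations.
        self.mid_term: List[str] = []
        # Long-term: assumption, a real system would embed and index these.
        self.long_term: List[str] = []

    def add(self, conversation: str) -> None:
        self.short_term.append(conversation)
        self.long_term.append(conversation)
        self.pending.append(conversation)
        if len(self.pending) == self.summary_every:
            self.mid_term.append(self.summarize(self.pending))
            self.pending = []
```

In a sketch like this, `summarize` would be an LLM call in practice; passing it in as a callback keeps the bookkeeping testable without one.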
Not to necessarily diss the work that was done on this, but the idea of actually wanting this for reading feels like it is a continuation of the lack of attention span that has seemed to get worse and worse. We already saw this with the oversimplification of television shows and movies. Many of them lean toward slapping you in the face with something instead of conveying it subtly.
I know way too many people that struggle to sit still for a half hour episode of some show now (like my partner, frustratingly) and have to be doing something else.
If you are struggling with absorbing the information you are reading, that is likely a sign you should put down the book and come back to it later; obviously your mind wants to be doing something else. If it is a continued issue, then practice reading something that you know you would like. Personally my "in" for my love of reading was reading video game books that expanded the lore and it grew from there, but I was already invested in the story so the book was easier to read.
Using this for a book feels more like a crutch than anything else. That is obviously before you get into whether or not the LLM is actually going to tell you the truth.
There is however one possible use case I could get into, but this is something that could be solved by just finding a video or something online. A refresher when it has been a long time between books coming out in a series.
What I've found interesting when doing similar experiments (feeding things like books to an LLM and asking questions) is that the output is almost always more bland than one would hope for. I suspect this may both be a result of LLMs being biased for the material they've been trained on and a reality I've suspected which is that the majority of books are mostly filler and aren't making points that are particularly profound. Most books, when you distill them down, fundamentally communicate ideas that are rather obvious, but the language around those points makes them sound a lot more profound than they really are. It's a kind of hypnosis, I think. In a sense, LLMs may be able to reveal how bereft a piece of written material is.
I disagree with the OP's statement that traditional e-readers being passive is actually a "problem". It's kind of like saying that cars are a problem because they can't fly. Maybe I'm being pedantic, but being alone with a book and one's own thoughts is hardly a problem; if anything, the problem is fewer and fewer people are comfortable without a constant barrage of thoughts other than their own.
Some people have deep knowledge, but don't have the skills to untangle context and lay out the right learning path for a reader. These people likely bell-curve around certain neurotypes, which perhaps know certain sorts of knowledge more strongly.
Right now, those people shouldn't publish. But if LLMs could augment poorly structured content (not incorrect content, just poorly structured), that perhaps opens the door for more people to share their wisdom.
Anyhow, just thinking out loud here. I'm sure there are some massive downsides that are coming to mind for ppl reading :)
At some point I'll work on better integrating Emacs's nov.el EPUB reader with gptel to approximate something like this. Books are text, and I already have the ultimate text processing environment that I've invested quite a lot of time in.
The introduction video shows how easy it is to import an epub, and then "asks the ebook" to give them the Table of Contents. While the ToC was already available... no real added value compared to RAG
Because we've been mainly targeting business and technical books, the spoiler-prevention feature is not yet implemented.
However, to make novels and other narratives comfortable to read in the future, I'll definitely consider adding a feature to limit the AI's knowledge based on your reading progress.
Thanks again for the valuable suggestion!
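One way the progress-based limit mentioned above could work is to filter passages by position before anything reaches the model. This is only a sketch under assumptions: the `Passage` type, the chapter-level granularity, and the `visible_passages` helper are all hypothetical, since the feature does not exist yet.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Passage:
    chapter: int  # hypothetical: position of the passage in the book
    text: str


def visible_passages(passages: List[Passage], current_chapter: int) -> List[Passage]:
    """Keep only passages at or before the reader's current chapter."""
    return [p for p in passages if p.chapter <= current_chapter]
```

Filtering before retrieval, rather than asking the model to avoid spoilers, means later chapters never enter the prompt at all. It would not stop the model from recalling a well-known book from its training data.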
Will definitely give it a go.
I'd love to hear your thoughts once you've had a chance to try it out. All feedback is welcome!
I would just consult a fan wiki, but that doesn't work if the title isn't popular or if the book is too new. This seems like the perfect tool if it can somehow maintain coherency across multiple books.
That said, I do understand (and share) a lot of the frustration and hesitancy that people here have around AI tools; I don't want an app that takes away the act of thinking (like that post recently about teachers using AI to make banal lesson plans, and students in turn using AI to write essays -- what is the point then?). I hope you don't take it too much to heart, and try to showcase use cases where your app can actually provide value.
Another piece of feedback is it would be great if this could be all packaged up into a docker image that would make it easy to deploy on a local machine (or like on a home server/NAS). Right now it seems there are still a lot of manual steps and scaffolding.
That seems like maybe a wee bit of an overstatement of possibilities.
What I meant from a technical perspective is that the system uses a Retrieval-Augmented Generation (RAG) approach. It has the entire book's content available in a vector database, and when you ask a question, it performs a semantic search to pull the most relevant passages in real-time to use as context for the LLM's answer.
So, from a user's perspective, the experience is designed to feel like you're conversing with an expert who can instantly recall any part of the book. I should have used more precise language. Thanks for keeping me honest!
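To make the retrieval step described above concrete, here is a toy sketch. It swaps in word-count vectors and cosine similarity for a real embedding model and vector database, so it only illustrates the shape of the flow; `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical names, not BookWith's API.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, passages: List[str], k: int = 2) -> List[str]:
    """Return the k passages most similar to the question."""
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, passages: List[str], k: int = 2) -> str:
    """Assemble the retrieved passages and the question into one LLM prompt."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return f"Answer using only these passages:\n{context}\n\nQuestion: {question}"
```

The model only ever sees the top-k passages, which is why "recalls any part of the book" holds only as far as the search surfaces the right passage.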
0 - Brandon Sanderson's Wind and Truth
You raise a really important point about the risk of lessening the reading experience. That's something I've thought about a lot. My personal experience while using it has been that it can actually deepen immersion, since I'm able to look up a word or phrase instantly without breaking my flow and switching to a browser.
You're absolutely right that this is a new kind of reading experience powered by LLMs, and there are bound to be some downsides. I hope it's an interesting experiment, and I'd be thrilled if you gave it a try.
Thanks again for the valuable perspective!
Thanks for building and sharing something.
I'm constantly finding myself pretty deep into a book and a conversation happens and I have no idea who one of the people is. I'd love a way to just ask my Kindle "who is Uriah Heep?"
Algorithmic social media has already destroyed our attention spans. ChatGPT is in the process of destroying the rest. People read less than ever and have difficulty engaging with anything that takes more effort than "grok is this real?". Do we really need to put AI into the """reading experience"""?
A large part of my motivation for creating this app was curiosity about what the reading experience would be like on an e-reader with AI functionality natively integrated into it. Another major reason was that I thought I would have to build it and use it to see whether it was really necessary.
Why is that a problem? Your statement is a bit like saying "traditional avocados are too delicious. We at YuckCo are aiming to change that!"
You can't just define something as a problem merely to help you sell a solution.
> When you encounter something unclear, you have to context-switch to search for it.
Literally every eReader I've used has a built in dictionary. I tap the word and it tells me what it means.
How is that context switching but "Hey, Siri, what does the word avocado mean?" isn't?
Are you thinking of just books like novels? There’s a lot of reading of technical or scientific or reference material.
Today's phrase that you might want to discuss with an LLM, or a real person:
de gustibus non disputandum