Context Rot: How increasing input tokens impacts LLM performance
38 points | kellyhongsn | 8 comments | 7/14/2025, 7:25:15 PM | research.trychroma.com
I work on research at Chroma, and I just published our latest technical report on context rot.
TLDR: Model performance is non-uniform across context lengths, even for state-of-the-art models such as GPT-4.1, Claude 4, Gemini 2.5, and Qwen3.
This highlights the need for context engineering. Simply having relevant information in a model's context is not enough; how that information is presented matters even more.
Here is the complete open-source codebase to replicate our results: https://github.com/chroma-core/context-rot
I've seen this especially with Gemini Pro when providing long-form textual references: putting many documents into a single context window gives worse answers than having the model summarize each document first, asking questions against the summaries only, and then providing the full text of specific sub-documents on request (RAG-style, or just a simple agent loop).
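A rough sketch of that staged pattern, for the curious (the llm() helper and the prompts are placeholders, not any particular SDK):

    # Minimal sketch of the summarize-first loop described above.
    # llm() is a placeholder for whatever chat-completion call you use.
    def llm(prompt: str) -> str:
        raise NotImplementedError("swap in your Gemini/Claude/etc. client")

    def answer_with_staged_context(question: str, documents: dict[str, str]) -> str:
        # Stage 1: summarize each document independently, keeping each call's context short.
        summaries = {
            doc_id: llm(f"Summarize this document in a few sentences:\n\n{text}")
            for doc_id, text in documents.items()
        }

        # Stage 2: show the model only the summaries and ask which documents it needs.
        catalog = "\n".join(f"[{doc_id}] {s}" for doc_id, s in summaries.items())
        wanted = llm(
            "Given these document summaries, reply with the comma-separated ids "
            f"needed to answer the question.\n\nQuestion: {question}\n\n{catalog}"
        )
        selected = [i.strip() for i in wanted.split(",") if i.strip() in documents]

        # Stage 3: answer from the full text of only the requested documents.
        context = "\n\n".join(documents[doc_id] for doc_id in selected)
        return llm(f"Answer using only this context:\n\n{context}\n\nQuestion: {question}")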
Similarly, I've personally noticed that Claude Code with Opus or Sonnet gets worse the more compactions happen. It's unclear to me whether the summary itself degrades, or whether the context window ends up with a higher proportion of less relevant data, but even clearing the context and asking it to re-read the relevant files (even ones that were mentioned and summarized in the compaction) gives better results.
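Concretely, the reset looks something like this in Claude Code (the file names are made up for illustration):

    /clear
    > Re-read src/session.py and src/auth.py, then continue fixing the token-refresh bug.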
Long story short: context engineering is still king, and RAG is not dead.
Media literacy disclaimer: Chroma is a vector DB company.