Are LLMs better suited for PR reviews than full codebases?
I’ve been thinking about this problem and wanted to share a perspective.
When evaluating LLMs for static analysis, I see four main dimensions: accuracy, coverage, context size, and cost.
On accuracy and coverage, today’s LLMs feel nowhere close to replacing dedicated SAST tools on real-world codebases. They hold up reasonably well on isolated snippets or smaller repos, but once a finding depends on deep dependency chains (say, tracing untrusted input through several layers of calls spread across files), results drop off quickly.
Context size is another bottleneck. A repo with millions of lines is orders of magnitude beyond current context windows, so you end up chunking it, which breaks exactly the cross-file reasoning you need, and the runtime becomes impractical.
That leads to cost. Running an LLM across a massive codebase can be significantly more expensive than traditional scanners, without obvious ROI.
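Rough back-of-envelope to make the scale concrete. All of these numbers are assumptions (repo size, tokens per line, context window, price per million tokens all vary by model and codebase), but the shape of the result is what matters:

```python
# Back-of-envelope: what "scan the whole repo with an LLM" looks like.
# Every constant below is an assumption, not a measurement.

LINES_OF_CODE = 5_000_000          # assumed repo size
TOKENS_PER_LINE = 10               # rough average for source code
CONTEXT_WINDOW = 128_000           # assumed model context window, in tokens
PRICE_PER_1M_INPUT_TOKENS = 3.00   # assumed USD price; varies by provider

total_tokens = LINES_OF_CODE * TOKENS_PER_LINE
chunks_needed = -(-total_tokens // CONTEXT_WINDOW)  # ceiling division
single_pass_cost = total_tokens / 1_000_000 * PRICE_PER_1M_INPUT_TOKENS

print(f"~{total_tokens:,} input tokens total")
print(f"~{chunks_needed} context-window chunks just to see the code once")
print(f"~${single_pass_cost:,.0f} in input tokens per full pass")
```

With those assumptions you get roughly 50M tokens, ~390 chunks, and ~$150 per pass. The per-pass dollar figure is the smaller problem; the bigger one is that no single chunk sees the whole repo, so you need overlapping chunks or multiple passes to recover cross-file context, and output tokens and retries multiply the bill. A traditional scanner does the same sweep for the cost of compute.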
Where they do shine is at smaller scales — reviewing PRs, surfacing potential issues in context, or even suggesting precise fixes when the input is well-scoped. That seems like the most practical application right now. Whether providers will invest in solving the big scaling problems is still an open question.
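For the well-scoped case the mechanics are simple: grab the diff, wrap it in a focused prompt, and the whole input fits comfortably in one context window. A minimal sketch, where `call_llm` is a hypothetical stand-in for whatever provider client you actually use:

```python
import subprocess

def get_pr_diff(base_branch: str = "origin/main") -> str:
    """Return the diff between the current branch and the base branch."""
    result = subprocess.run(
        ["git", "diff", f"{base_branch}...HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def build_review_prompt(diff: str) -> str:
    """Wrap the diff in a narrow review prompt so the input stays bounded."""
    return (
        "Review the following diff for security issues "
        "(injection, auth bypass, unsafe deserialization, hardcoded secrets). "
        "Report only findings you can point to in the changed lines, "
        "with file, line, and a suggested fix.\n\n"
        "Diff:\n\n" + diff
    )

# call_llm is hypothetical; swap in your provider's client call.
# review = call_llm(build_review_prompt(get_pr_diff()))
# print(review)
```

The point of the scoping isn’t the code, it’s that the model sees everything relevant to the change in one shot, which is where the precise-fix suggestions actually become reliable.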
Curious how others here think about the trade-offs between LLM-based approaches and existing SAST tools.