Scaling Judge-Time Compute with Leonard Tang

Scaling Judge-Time Compute with Leonard Tang – Weaviate Podcast

1 CShorten 0 5/12/2025, 3:14:50 PM

Scaling Judge-Time Compute!

I am SUPER EXCITED to publish the 121st episode of the Weaviate Podcast featuring Leonard Tang, Co-Founder of Haize Labs!

Evals are one of the hottest topics out there for people building AI systems. Leonard is absolutely at the cutting edge of this, and I learned so much from our chat!

The podcast covers tons of interesting nuggets around how LLM-as-Judge / Reward Model systems are evolving, including ideas such as UX for Evals, Contrastive Evaluations, Judge Ensembles, Debate Judges, Curating Eval Sets and Adversarial Testing, and of course... Scaling Judge-Time Compute!!

I highly recommend checking out their new library, `Verdict`, a declarative framework for specifying and executing compound LLM-as-Judge systems.

I hope you find the podcast useful! As always, more than happy to discuss these ideas further with you!

YouTube: https://www.youtube.com/watch?v=KFrKLkJzNDQ

Spotify: https://creators.spotify.com/pod/show/weaviate/episodes/Haize-Labs-with-Leonard-Tang---Weaviate-Podcast-121-e32mts3
