A GPU Calculator That Helps You Decide Which GPU to Use

37 points by chlobunnee | 7/24/2025, 10:14:20 PM | calculator.inference.ai

Comments (11)

zargon · 2h ago
The best VRAM calculator I have found is https://apxml.com/tools/vram-calculator. It is much more thorough than this one. For example, it understands different models' attention schemes for correct KV cache size calculation, and it supports quantization of both the model and the KV cache, as well as fine-tuning. It has its own limitations, such as only supporting specific models. In practice, though, the generic calculators are not very useful, because model architectures vary (mainly in the KV cache) and the estimates end up way off. (Not sure whether it would be better to discuss it separately, but I submitted it at https://news.ycombinator.com/item?id=44677409)
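
To make the attention-scheme point concrete, here is a minimal back-of-envelope sketch of KV-cache sizing; the layer/head counts below are illustrative assumptions, not figures from either calculator.

```python
# Rough KV-cache size estimate in bytes. Illustrative only.
def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   seq_len, batch_size, bytes_per_elem=2):
    # Factor of 2 covers keys and values; one cache per layer.
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)

# Same model shape, full multi-head attention (32 KV heads) vs.
# grouped-query attention (8 KV heads), FP16 cache, 8K context:
mha = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128,
                     seq_len=8192, batch_size=1)
gqa = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128,
                     seq_len=8192, batch_size=1)
print(f"MHA: {mha / 2**30:.1f} GiB, GQA: {gqa / 2**30:.1f} GiB")  # 4.0 vs 1.0 GiB
```

A calculator that ignores the attention scheme would treat both cases identically, which is roughly where the 4x error comes from in this example.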
zeroq · 10m ago
This one is indeed much better, and it instantly addresses the feedback I wanted to leave for the one originally posted: instead of calculating an artificial scenario, I would rather state what hardware I actually have at hand and see what I can run on it. Thanks!
funfunfunction · 3h ago
This is a cheap marketing ploy for a GPU reseller with billboards on highway 101 into SF.
LorenDB · 3h ago
Where's AMD support? I have a 9070 XT and would love to see it listed on here.
amanzi · 2h ago
I would have liked to see the RTX 5060 Ti with 16GB mentioned. I can't tell if it's omitted because it won't work or if it was excluded for some other reason.
amatecha · 1h ago
Yeah, weird miss, but maybe just because it came out more recently. It can be used for ~anything a 5070 could be used for, no? Maybe slower, but still.
chlobunnee · 3h ago
I built a calculator to help researchers and engineers pick the right GPUs for training and inference workloads!

It helps compare GPU options by taking in simple parameters (number of transformer layers, token size, etc.) and letting users know which GPUs are compatible, plus their efficiency for training vs. inference.
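
For context on the kind of check a tool like this performs, here is a minimal sketch of a VRAM fit test, assuming weight memory plus KV cache plus a flat overhead term; the function name and numbers are hypothetical, not the site's actual formula.

```python
# Hypothetical GPU-fit check: weights + KV cache + overhead vs. available VRAM.
# All figures are illustrative assumptions, not the calculator's real method.
GIB = 2**30

def fits_on_gpu(num_params_b, bytes_per_param, kv_cache_gib,
                vram_gib, overhead_gib=1.5):
    weights_gib = num_params_b * 1e9 * bytes_per_param / GIB
    needed_gib = weights_gib + kv_cache_gib + overhead_gib
    return needed_gib <= vram_gib, needed_gib

# Example: a 7B-parameter model in FP16 with a 2 GiB KV cache on a 24 GiB card.
ok, needed = fits_on_gpu(num_params_b=7, bytes_per_param=2,
                         kv_cache_gib=2.0, vram_gib=24)
print(ok, f"{needed:.1f} GiB needed")  # True, ~16.5 GiB
```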

The idea came from talking with ML researchers frustrated by slow cluster queues or wasting money on overkill GPUs.

I'd love feedback on what you feel is missing/confusing!

Some things I'm thinking about incorporating next:
- Allowing users to directly compare 2 GPUs and their specs
- Allowing users to see whether a fraction of a GPU can complete their workload

I would really appreciate your thoughts/feedback! Thanks!

timothyduong · 2h ago
Where's 3090? Or should that fall in the 4090 (24GB VRAM) category?
snvzz · 2h ago
Rather than a GPU calculator, this is an NVIDIA calculator.
nodesocket · 2h ago
In case you’ve been living in a cave, Nvidia is the de facto standard for LLM compute.
quotemstr · 3h ago
No sharding? At all?