In aggregate? Signs point to yes. For the general purpose SFT base models. We see some evidence even with RNNs vs Transformers. You're essentially finding a function that models language. Use the same optimization function, get a similar result.

However, RL, and especially RLHF, does a lot to reshape the responses, and that's potentially a lot more varied. For the training that wasn't just cribbed from ChatGPT, anyway.

Lastly, it's unlikely that you'll get the _exact same_ responses; there are too many variables at inference time alone. And as for training, we can fingerprint models by their vocabulary to a certain extent. So in practical terms there are probably always going to be some differences.

This assumes our current training approaches don't change too drastically, of course.

l33tbro · 8h ago

I'd guess no. While they have similar training data, there is plenty of novelty and unique data entering each model due to how each user is using it. This is why ideas like model collapse are fun in theory, but don't really play out due to the irregular ways LLMs are used in the real world.

I could be wrong, but I have not heard a convincing argument for what you propose.

Buttons840 · 9h ago

I wonder how much of the AI depends on its initial weights? If in coming decades we understand better how neural networks work, it would be funny to look back and realize that Google beat OpenAI because they got lucky with their initial weights or something.

joules77 · 9h ago

At a basic level it generates a probability distribution of what the next token should be.

There are a zillion questions you can ask where the resulting probability distribution is flat, i.e. multiple tokens have the same (or nearly the same) probability. Then it has to randomly pick one, and you can get large variation.
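
To make the flat-distribution point concrete, here is a minimal sketch of sampling a next token from a softmax over logits. The token list, the logit values, and the helper names (`softmax`, `sample_token`) are all made up for illustration; a real model has a vocabulary of many thousands of tokens and usually adds things like top-k or top-p filtering on top.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(tokens, logits, rng, temperature=1.0):
    """Pick one token at random, weighted by its softmax probability."""
    probs = softmax(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical four-token vocabulary and logits, for illustration only.
tokens = ["red", "blue", "green", "yellow"]
flat_logits = [1.0, 1.0, 1.0, 1.0]     # every token equally likely
peaked_logits = [10.0, 1.0, 1.0, 1.0]  # one token dominates

rng = random.Random(0)
flat_samples = [sample_token(tokens, flat_logits, rng) for _ in range(1000)]
peaked_samples = [sample_token(tokens, peaked_logits, rng) for _ in range(1000)]

print({t: flat_samples.count(t) for t in tokens})
print({t: peaked_samples.count(t) for t in tokens})
```

With the flat logits the picks spread across all four tokens, while the peaked logits return "red" almost every time. Same sampler, very different variability, depending only on the shape of the distribution.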

moomoo11 · 2h ago

There are like maybe <100 people who actually contribute actively to LLMs.

Just treat it like a commodity (like cloud infrastructure) and build cool shit using it.

If the provider can roll that feature into their offerings then you’re not actually adding any value to the world.

UltraSane · 5h ago

This is called the Platonic Representation Hypothesis

https://arxiv.org/abs/2405.07987

We argue that representations in AI models, particularly deep networks, are converging. First, we survey many examples of convergence in the literature: over time and across multiple domains, the ways by which different neural networks represent data are becoming more aligned. Next, we demonstrate convergence across data modalities: as vision models and language models get larger, they measure distance between datapoints in a more and more alike way. We hypothesize that this convergence is driving toward a shared statistical model of reality, akin to Plato's concept of an ideal reality. We term such a representation the platonic representation and discuss several possible selective pressures toward it. Finally, we discuss the implications of these trends, their limitations, and counterexamples to our analysis.
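
For a rough idea of what "measuring distance between datapoints in a more and more alike way" could mean in practice, here is a small sketch of a nearest-neighbor overlap score between two sets of embeddings of the same datapoints. This is an illustrative toy under my own assumptions, not the paper's code: the function names and the tiny hand-made embeddings are hypothetical, and real comparisons would use actual model features over many samples.

```python
import math

def knn_indices(vectors, k):
    """For each vector, return the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, v in enumerate(vectors):
        dists = [(math.dist(v, w), j) for j, w in enumerate(vectors) if j != i]
        dists.sort()
        neighbors.append({j for _, j in dists[:k]})
    return neighbors

def mutual_knn_alignment(reps_a, reps_b, k):
    """Average fraction of shared k-nearest neighbors across two embedding spaces.

    reps_a[i] and reps_b[i] must describe the same datapoint; the two spaces
    may have different dimensionality.
    """
    nn_a = knn_indices(reps_a, k)
    nn_b = knn_indices(reps_b, k)
    overlaps = [len(a & b) / k for a, b in zip(nn_a, nn_b)]
    return sum(overlaps) / len(overlaps)

# Hypothetical embeddings of six datapoints from three "models".
reps_a = [(0, 0), (1, 0), (0, 1.5), (5, 5), (6, 5), (5, 6.5)]  # 2-D, two clusters
reps_b = [(0,), (2,), (3,), (100,), (102,), (103,)]            # 1-D, same clusters
reps_c = [(0,), (100,), (2,), (102,), (3,), (103,)]            # 1-D, clusters scrambled

print(mutual_knn_alignment(reps_a, reps_b, k=2))
print(mutual_knn_alignment(reps_a, reps_c, k=2))
```

The first pair scores a perfect overlap even though the two spaces have different dimensions and scales, because both agree on which points are close to which. The scrambled embedding scores much lower. The convergence claim, read this way, is that scores like this go up as models get larger.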

allears · 9h ago

Not an expert, but I believe it's just the opposite. Even with the same LLM and the same training data, responses diverge. And that can be a problem.

Ask HN: Will AI models over time converge into the same system?

Comments (8)