
This has a lot of caveats and limitations. However, the model is available for download via a script in the repo, and the exact benchmarks I used are available. The white paper gets into theory and application, and reveals a lot of limitations and interesting differences from transformers in terms of training and prompting behavior. It also includes extensive appendices (over 100 pages) covering the training datasets used and performance on the ~260 (I think?) NIV2 tasks in its validation dataset.

Running inference for the DSRU model + BGE embedding model together takes a bit shy of 10GB of VRAM, and the reference comparison model -- Zephyr 7B -- takes about 15GB of VRAM.

Comments (5)

tripplyons · 7h ago

How does this model compare to just using a linear classifier trained on BGE embeddings?

orderone_ai · 40m ago

Thank you for your question!

Because I'm not sure exactly what you're looking for when you say 'compares to' -- whether accuracy, speed, or architecture -- I'll hit all 3, but sorry if it's a bit much.

1. Accuracy: For simple tasks (like sentiment analysis on straightforward examples), it won't be much more accurate than a classical linear classifier, if at all.

1a. Accuracy on more diverse or challenging tasks: Because a linear classifier is just so damned simplistic, it simply cannot handle anything even resembling a reasoning task. Meanwhile, when specifically trained, this architecture managed to get 8/10 on textual entailment tasks, which are generally considered a sort of entry-level gold standard for reasoning ability.

2. Speed: It's slower than a classical classifier, given the ~1B params it's pushing. They're both still pretty much blazing fast, but the tiny classical classifier will definitely be faster.

3. Architecture: Here's where it gets interesting.

The architecture of the core model here differs significantly from a classical linear classifier:

Classical Classifier:
Input: BGE embedding (in this hypothetical)
Output: Class labels through softmax
Internal architecture: No nonlinearity, no hidden layers, direct projection

General Classifier:
Input: BGE embedding
Output: Class labels through nearest neighbor cosine similarity search of vocabulary
Internal architecture: An input projection sparse layer, a layer for combining the 3 inputs after their upwards projection, and 14 hidden layers with nonlinearity (GELU), layernorms, skip connections -- all of the standard stuff you'd expect in an LLM, but...not in an LLM.
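To make the contrast concrete, here is a rough toy sketch in plain Python. It is not the actual DSRU code: the function names, the tiny 2-dimensional vectors, and the layernorm -> linear -> GELU ordering inside the block are all assumptions made for illustration. It only shows the difference between a direct softmax projection and decoding an output vector by cosine-similarity nearest neighbor over a label vocabulary, plus what one hidden layer with a skip connection might look like.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear_classify(embedding, weights, biases):
    """Classical classifier: one direct projection, then softmax."""
    logits = [sum(w * x for w, x in zip(row, embedding)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_label(output_vec, vocab):
    """Decode by picking the vocabulary entry with the highest cosine similarity."""
    return max(vocab, key=lambda label: cosine(output_vec, vocab[label]))

def gelu(x):
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def layer_norm(v, eps=1e-5):
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]

def residual_block(v, weights):
    """One hidden layer (assumed ordering): layernorm -> linear -> GELU, plus skip."""
    normed = layer_norm(v)
    hidden = [gelu(sum(w * x for w, x in zip(row, normed))) for row in weights]
    return [x + h for x, h in zip(v, hidden)]

# Hypothetical toy data, not real BGE embeddings.
probs = linear_classify([1.0, 0.0], [[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0])
vocab = {"positive": [1.0, 0.1], "negative": [-1.0, 0.2]}
label = nearest_label([0.9, 0.0], vocab)
```

In this sketch, adding a new label to the second approach means adding a vector to `vocab`, whereas the softmax head would need a new weight row and retraining.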

I hope that clears up your questions! If not, I'm happy to tell you more.

throwawayffffas · 10h ago

Can I ask why you have a single model for all these tasks?

Wouldn't it be easier and more ergonomic for users to have dedicated models for each of these tasks?

orderone_ai · 10h ago

Thank you for the question!

I would say that ease of use and deployment is actually a good reason to have a single model.

We don't train 20 LLMs for different purposes - we train one (or, I guess 3-4 in practice, each with their own broad specialization), and then prompt it for different tasks.

This simplifies deployment, integration, upgrading, etc.

This model is basically the same: it isn't restricted to single-task classification. This means that a user can complete new tasks using a new prompt, not a new model.

throwawayffffas · 9h ago

While I agree with the general reasoning, isn't it harder for the user to prompt the model correctly as opposed to selecting a specialized model that they wish to use?

That's the feeling I have when I try to use LLMs for more general language processing.

Have you run into cases where the model "forgets" the task at hand and switches to another mid-text stream?

Regardless of all of the above, it looks to me that your choice of reasoning and problem solving in the latent space is a great one and where we should be collectively focusing our efforts. Keep up the good work.

Show HN: A reasoning model that infers over whole tasks in 1ms in latent space
