OpenSearch 3.0 Released

104 kmaliszewski 25 5/7/2025, 3:38:46 PM opensearch.org ↗

Comments (25)

wqtz · 7h ago
I feel sad for this project. This was a reactionary project to elasticsearch's license change to say, heck with it, I will open my own elastic spinoff with AWS.

The vibe of the project's community is pretty much reminiscent of a dead multiplayer game. The community is not thriving, which is essential for an OSS project, and Elasticsearch is virtually irreplaceable in this space. I do not know any enterprise customers using it because it is unproven and they have failed to show they are going to stick around for the long run.

Then every other SIEM platform is spinning up its own search platform. Heck, I even saw Cribl in their own partner list, and it has its own search platform now. And Elastic has a SIEM platform now with Elastic Security. I'm not sure what the purpose of this project is now. Elastic just won the battle and then later virtue signaled to everyone by saying "we are open source again, y'all", because even if we come around and slap your engineers who said they are not going to touch proprietary code, your management is not going to pay for a migration to an untested fork with no long-term commitment that was essentially made out of spite.

simple10 · 19h ago
Just learning about OpenSearch. Looks like it's a fork of Elasticsearch from 2021 when Elasticsearch changed licensing model. https://github.com/opensearch-project/OpenSearch

Anyone know if it's still a drop in replacement for Elasticsearch? And how does it compare on performance and features?

jillesvangurp · 18h ago
I maintain a Kotlin client for both Elasticsearch and Opensearch (jillesvangurp/kt-search). There are some differences but they are mostly still API compatible for most of the commonly used features.

There are some exceptions to this and vector search would be one of those. The feature was added post fork. There are a few other things of course. E.g. search_after works slightly differently on both. My client works around that. And there are a lot of newer features on both sides that are annoyingly different. Both have some SQL querying capabilities now but they both have their own take on that.

Elastic still has the edge on features IMHO. Especially Kibana has a lot more features than Amazon's fork. And on the aggregation front, Elastic has done quite a bit of feature and optimization work in the last few years (that's what powers the dashboards). For performance it depends what you do. But they both heavily lean on Lucene which remains the open source search library both products use. Elastic cloud is a bit better than opensearch in AWS from what I've seen. If you self host and tune, both should be very similar.

Elastic also just tagged version 9.0, which uses the same new version of Lucene as Opensearch 3.0. I have support for both new versions in my client already (added that a few weeks ago). It now works with Elasticsearch v7, 8, and 9 and Opensearch 1, 2, and 3.

A lot of my consulting clients seem to prefer Opensearch lately. That's mainly because of the less complicated licensing and the AWS support. If you have a legacy Elasticsearch setup switching it to Opensearch should be doable (depending on what you use). But expect to reindex all your data. I don't think a direct migration is possible. If you use Elastic's client libraries, you may need to switch to Opensearch specific ones. This is generally a bit painful (package names, feature differences, etc.). That's why I created kt-search a few years ago.

Salgat · 18h ago
That's what we ended up doing for our migrations. We actually had a bunch of old Elasticsearch 2.3 databases (ancient), so we stood up an OpenSearch database in parallel for each and on service startup did a one-time automatic index and bulk copy over of all the data. So far very happy with OpenSearch.
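
For anyone wondering what the "bulk copy" step looks like on the wire, here is a minimal sketch of building a `_bulk` request body, which is newline-delimited JSON with one action line and one source line per document. The index name `orders-v2` and the sample documents are made up for illustration, and reading from the old cluster and the HTTP POST to the new one are left out.

```python
import json

def to_bulk_ndjson(index_name, docs):
    """Build a _bulk body: one action line plus one source line per document."""
    lines = []
    for doc_id, source in docs:
        # Action line: index this document under the given id.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"

# Hypothetical documents, as if read from the old cluster.
body = to_bulk_ndjson(
    "orders-v2",
    [("1", {"sku": "a-1", "qty": 2}), ("2", {"sku": "b-7", "qty": 1})],
)
```

In practice you would send this in batches rather than as one giant request, and check the per-item results in the response, since a bulk call can partially fail.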
simple10 · 17h ago
Ah thanks for the detail! Super useful comment.
Macha · 13h ago
One thing that Opensearch misses that would have been very nice to have on a recent project is enrich processors (https://www.elastic.co/docs/manage-data/ingest/transform-enr...)

If you're just using the standard document ingestion and search stuff, yeah, they're mostly compatible. But the fancier features that were part of the paid version in the past or have been recently developed are either not compatible or missing.

blueelephanttea · 16h ago
> Anyone know if it's still a drop in replacement for Elasticsearch?

As you point out it was forked a number of years ago so it started from the same place (7.10). Elasticsearch is now on 9.0+ and has 27,000 more commits than OpenSearch. So I doubt it is a drop-in replacement anymore.

I have no idea how many of those 27K commits are key features, but it is a clear divergence.

Y-bar · 19h ago
It's worth noting that in September 2024 Elasticsearch once again returned to a fully open source license (AGPLv3).
Salgat · 18h ago
Fool me once...
jsiepkes · 8h ago
But Elasticsearch is still open core. So certain "enterprise" functionality will never make it into the OSS version (unlike in OpenSearch).
__s · 19h ago
It is not a drop in replacement (but almost is)

1.x is compatible with ES 7.10

lockhead · 19h ago
It's slower on the same hardware, but fine. Stay away if you need the UI: the Kibana fork is hellishly slow and riddled with bugs.
darkamaul · 19h ago
It’s slightly more complex than this. Both OpenSearch and Elasticsearch have workflows where they excel.

My company did a fairly comprehensive benchmark of the two products [0] if you are interested in comparing performance.

[0] https://blog.trailofbits.com/2025/03/06/benchmarking-opensea...

ignoramous · 15h ago
> Just learning about OpenSearch. Looks like ...

OpenSearch was once a personal search results aggregator conceived at A9 (Amazon's Silicon Valley subsidiary): https://github.com/dewitt/opensearch

Blackthorn · 13h ago
Sometimes, the same name refers to multiple things.
aabhay · 19h ago
Does anyone use OpenSearch for its knn and vector capabilities? Is it any good? It’s always hard to know with systems like this whether it works at scale until your team is fighting fires.
seanhunter · 18h ago
Irrespective of opensearch, if the dimension of your vector embedding is reasonably large you'll probably want an approximate nearest neighbours approach like HNSW rather than knn itself

https://docs.opensearch.org/docs/1.2/search-plugins/knn/appr...

For whatever an endorsement from a random stranger is worth, we've been using opensearch as a vectordb for hybrid search across text and multimodal embeddings as well as traditional metadata, and it's been great. We're not "full production" yet, so I can't really speak to scale, but it's opensearch, so I expect the scale to be fine most probably.

binarymax · 16h ago
I use it all the time. Whether it’s “good” depends more on your model for embeddings, but you do need to know a bit to tune the index. Whatever algo you choose, read the paper.

If you’re using Lucene HNSW, it will scale but will eat lots and lots of heap RAM. If you’re using the FAISS or nmslib plugins, keep an eye on JNI RAM consumption as well, as it’s outside the heap.

Overall, I’d say that it is a challenge to easily scale ANN past 100M vectors unless it’s given significant attention from the team.

antirez · 18h ago
I don't know about the OpenSearch implementation, but recently I implemented Vector Sets for Redis from scratch using HNSW as the data structure, and there are many other stores that use the same data structure. When HNSWs are well implemented, you can rest assured they scale very well for the task at hand, but you can expect insertion speed only on the order of a few thousand per second if you are hitting a single HNSW. Reads are much faster: in Redis I get 80k/s easily (but it uses multiple cores).

So if you want to build a very, very large index using HNSWs, you have to understand whether you normally have many writes that accumulate evenly, or whether your index is a mostly read-only thing that is rebuilt from time to time. Mass insertion the first time is going to be very slow. You can parallelize it if you build N parallel HNSWs, since the searches can be composed as the union of the results (sorted by cosine similarity). But often the bottleneck is the embedding model itself.
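
To make the "union of the results" idea concrete, here is a minimal sketch of the merge step only. It assumes each of the N HNSWs has already returned its own hits as `(similarity, doc_id)` pairs; the shard names and scores are invented, and this is not how Redis or OpenSearch implement it internally.

```python
import heapq

def merge_top_k(shard_results, k):
    """Merge per-shard hits into a global top-k, highest cosine similarity first.

    shard_results: list of lists of (similarity, doc_id) tuples, one list per HNSW.
    """
    all_hits = (hit for hits in shard_results for hit in hits)
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[0])

# Hypothetical results from three independently built graphs.
shard_a = [(0.91, "a1"), (0.80, "a2")]
shard_b = [(0.95, "b1"), (0.60, "b2")]
shard_c = [(0.85, "c1")]

top3 = merge_top_k([shard_a, shard_b, shard_c], k=3)
```

For the merged list to be a faithful global top-k, each shard should be asked for at least k candidates, since in the worst case all of the best matches live in a single shard.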

What is really not super scalable is the size of HNSWs. Their use of memory is big (Redis by default uses 8 bit quantization for this reason), and on disk they require seeks. If you have large vectors, like 1024 components, quantization is a must.
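
As a rough illustration of why quantization helps, here is a sketch of simple symmetric 8-bit scalar quantization. This is a generic textbook scheme chosen for the example, not necessarily what Redis does; the function names and the sample vector are made up.

```python
from array import array

def quantize_int8(vec):
    """Scale components by the max absolute value into the range [-127, 127]."""
    scale = max(abs(x) for x in vec) or 1.0  # avoid dividing by zero for an all-zero vector
    codes = array("b", (round(x / scale * 127) for x in vec))
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float components from the 8-bit codes."""
    return [c * scale / 127 for c in codes]

vec = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(vec)
approx = dequantize(codes, scale)

# Storage for one 1024-component vector:
float32_bytes = 1024 * 4  # 4 bytes per float32 component
int8_bytes = 1024 * 1     # 1 byte per code, plus one float for the scale
```

That is a 4x reduction in vector storage at the cost of a small, bounded rounding error per component, which is usually acceptable for similarity ranking.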

alex_duf · 18h ago
It works with some caveats. I've seen it handle searches with millions of documents no problem, but the KNN search requires loading the entire embedding graph into memory. So watch your RAM consumption.

The quality of your results will depend mostly on the quality of your embeddings.
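
On the RAM point, a back-of-the-envelope calculation helps before you commit to hardware. The sketch below uses the rule of thumb of roughly 1.1 * (4 * dimension + 8 * M) bytes per vector for an HNSW graph over float32 vectors, which is my recollection of the estimate in the OpenSearch k-NN documentation; treat the formula, the default `m=16`, and the sample sizes as assumptions and check the docs for your engine and version.

```python
def hnsw_memory_bytes(num_vectors, dimension, m=16):
    """Approximate memory for an HNSW graph over float32 vectors.

    Assumed rule of thumb: 1.1 * (4 * dimension + 8 * m) bytes per vector,
    where m is the number of bidirectional links per node.
    """
    per_vector = 1.1 * (4 * dimension + 8 * m)
    return per_vector * num_vectors

# Hypothetical workload: one million 256-dimensional vectors.
estimate_gb = hnsw_memory_bytes(1_000_000, 256, m=16) / 1e9
```

Under these assumptions that works out to a bit under 1.3 GB, before replicas and before anything else the node needs memory for.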

unethical_ban · 19h ago
I just want a quick log ingestion tool that can parse syslog easily and graph/search fields for me.

Setting up a simple log ingestion on Opensearch or ELK felt like a true journey, in a bad way.
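
For a sense of what "parse syslog into fields" involves, here is a minimal sketch for classic BSD-style (RFC 3164) lines. The regex is an assumption that only covers the common `<PRI>Mmm dd hh:mm:ss host tag[pid]: message` shape; real-world syslog is messier, and the field names are my own choice rather than any tool's schema.

```python
import re

# Assumed shape: <PRI>Mmm dd hh:mm:ss host tag[pid]: message
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^:\[\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)

def parse_syslog(line):
    """Return a dict of fields, or None if the line does not match."""
    m = SYSLOG_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    # PRI encodes facility * 8 + severity.
    pri = int(fields.pop("pri"))
    fields["facility"], fields["severity"] = divmod(pri, 8)
    return fields

doc = parse_syslog(
    "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
)
```

The resulting dict can be indexed as a JSON document as-is, which is roughly the job an ingest pipeline or a log shipper does for you once it is configured.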

binarymax · 18h ago
It’s surprising how challenging this is for both Elastic and Opensearch. The problem is that it’s all configuration and no convention, so you need to roll everything yourself. There should be prescribed recipes to make this simpler. If you’re using something like OpenTelemetry you can find help more easily, but it’s still annoying.
dbacar · 16h ago
I think both these tools are on the easier side to set up if you follow their guidelines. You can be up and running very quickly. The problems arise when you need some custom logic in processing log files. If you have simple shipping requirements you can bypass logstash altogether. Elastic and opensearch are not the right tool for application metrics though, in my opinion; for that use case just use prometheus and grafana.
nullify88 · 17h ago
It's possible but you need to buy in to the Elastic ecosystem. Stuff like *beats, logstash, etc, they can configure all sorts of index templates, and ingest pipelines depending on what you've configured it to receive.

These days, getting data in and out of Elasticsearch is quite easy with dynamic field mapping. It's keeping it performant which is tricky.

wingmanjd · 13h ago
Have you tried out Graylog? Their core product does pretty decently at my $DAYJOB.