> In the old cluster, we used Java 8 to run Elasticsearch which was the latest Java version supported at the time. In the new cluster, we use the bundled Java 18 version which allows us to utilize a more modern garbage collector that works better with larger heaps. We used CMS in the old cluster and use G1 in the new one.
/me cries in legacy Java 8 application.
rickette 7 days ago [-]
Makes me wonder what's holding back an upgrade?
adql 7 days ago [-]
I'm guessing they were too far behind to do a nice incremental one. Pretty much every major version upgrade in the past meant reindexing and some functionality changes.
rickette 7 days ago [-]
In the Meltwater case yes, but I was interested in the parent's reason.
sedro 7 days ago [-]
You can use G1 on Java 8 with -XX:+UseG1GC
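To confirm which collector the JVM actually picked up after passing that flag, one option is to list the GC MXBeans at startup. This is only a minimal sketch: the class name `GcCheck` and the helper `collectorNames` are made up for illustration, and it says nothing about whether G1 is a good idea for a given workload.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcCheck {
    // Returns the names of the garbage collectors the running JVM reports.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Run as: java -XX:+UseG1GC GcCheck
        System.out.println(collectorNames());
    }
}
```

With G1 enabled, the reported names should mention G1 (young and old generation) rather than ParNew/ConcurrentMarkSweep.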
mirashii 7 days ago [-]
I wouldn't recommend it, however, especially with Elasticsearch. I spent a lot of time with their team debugging issues that came down to G1GC issues in those days.
aorth 6 days ago [-]
On a related note, for years I had been following ShawnHeisey's wiki notes on Java / Solr. For a long time there was a HUGE warning about G1GC + Lucene (which is the underlying technology in both Solr and Elasticsearch).
Just upgraded an AWS cluster from Elasticsearch 5.5 to OpenSearch 1.3 - which really is just an ES 7.10 fork. Those are 4 years apart with many releases in between. I was expecting some significant cost reduction, but not at all: it took the exact same number of VMs for the cluster to stay above water.
I would have expected some memory savings due to the indexes being lighter on the heap. Perhaps your workloads are more CPU-heavy than memory-heavy?
flockonus 6 days ago [-]
Quite likely yes. I can't share the graph, but memory pressure didn't ease at all as I'd hoped it would. We'll be experimenting with different VM types next.
karlney 6 days ago [-]
Shameless plug, but one of the other blog posts in the above series explains how we addressed high and unpredictable memory load in relation to wildcard searches.
Since those changes were made we haven't had any major issues with heap spikes. But it depends on a lot of factors: how your data looks, how your queries look and what aggregations are made, so it's very hard to give any always-applicable advice.
chrsig 7 days ago [-]
It definitely seems like it's gotten more robust since the 0.9.x/1.x/2.x days. And a lot of stuff that had to be done externally is starting to be baked into the service itself (rolling over indices, rolling up based on a query)
I'm a bit cautious using 8.x, because opensearch forked at 7.x. I don't really have any desire to keep track of any divergences between the two :-/ It's really unfortunate that there's so much bad blood between elastic and aws.
olavgg 7 days ago [-]
What do people go for these days? I am working at a startup, and I need to pick either OpenSearch or Elasticsearch. So far I've concluded that Solr is not that good in a cluster setup. The last option is using FTS in PostgreSQL, which is by far the easiest solution and I can always "upgrade" later.
clone1018 6 days ago [-]
I've had really poor experiences so far from Elastic.co. Their core ElasticSearch offering seems to be on the backburner while they focus on "App Search" and other unrelated products like their APM offering. This leads to pretty sub-par knowledge and support for the ElasticSearch product. Even simple things like paginating results is an unknown, and you really have to pay deep into their ecosystem to get real implementation support.
Oddly enough too, due to the way their cloud hosting infrastructure works it seems they double dip into bandwidth pricing that should otherwise be free (eg AWS => AWS). It's also treated less as a "Software" as a service, and more a "platform" as a service, so you are left ultimately with the responsibility of managing, maintaining, and updating your Elastic cluster, without any ability to SSH into the servers it's running on.
Then there's the unprofessionalism of laying off your account manager during a client call, or telling your client that you're sorry their ElasticSearch infrastructure is down, but they can't really help you unless you're on their "Sapphire Extreme Plan" [sic]...
Even if OpenSearch is just a cog in Amazon's offering, at least it's of a known quantity.
cuteboy19 6 days ago [-]
Their documentation is especially confusing when you don't know what you are looking for
vosper 7 days ago [-]
I’m in a similar boat, and my gut says to go with ES over OpenSearch. I’ve no idea how committed Amazon are to OpenSearch, and no idea of the quality or size of the team working on it. For Elastic, ES is their core product. At AWS, OpenSearch is one minor product amongst many.
jansommer 6 days ago [-]
I've installed Solr in a 3x5 cluster setup (3 data centers, 5 instances in each) using CDCR (cross data center replication). It has been working flawlessly for years.
But definitely go for Postgres FTS for simplicity as long as you can.
caseydm 7 days ago [-]
Our startup uses managed elasticsearch through elastic. We're a very small team so support/consulting advice is important to us and they are good at that.
flockonus 7 days ago [-]
At this point the fork is recent enough that I haven't noticed much of a split, although I'd agree it tends to compound.
Might want to look at the licenses and support for where you plan to host it.
ddorian43 6 days ago [-]
Check out vespa.ai
adql 7 days ago [-]
Yeah, back then it seemed like every month or two they had some bug in their own clustering stuff that they wrote from scratch instead of going with tested ones like Raft or Paxos
chrsig 6 days ago [-]
Yeah, I'd be really curious how it fares on Jepsen now. Back then it did...poorly
https://cwiki.apache.org/confluence/display/solr/shawnheisey
Those warnings have gone away recently. Make of that what you will.
https://underthehood.meltwater.com/blog/2022/11/25/how-we-up...
https://aphyr.com/posts/323-jepsen-elasticsearch-1-5-0
They have gotten A LOT better since the 1.x days