What Is New in Elasticsearch 8.11
Elasticsearch 8.11 delivers significant enhancements across search, analytics, and data management. This version focuses on improving developer experience, query performance, and the overall resilience of the cluster.
| Category | Key Updates |
|---|---|
| New Features | Semantic Text, ELSER Model Deployment, _terms_enum API, Synthetic Source GA |
| Search & Query | Faster terms aggregations, Concurrent Lucene segments, Improved cross-cluster search |
| Analytics & Aggregations | Faster top-k queries, Optimized composite agg execution |
| Data Management | Searchable Snapshot Caching, Data stream lifecycle, Downsampling API |
| Resilience & Operations | Shard-level allocation awareness, Disk threshold defaults |
How does semantic search work in 8.11?
The semantic text feature introduces a field type that stores text together with vector embeddings generated from it, enabling semantic search. This allows you to find documents based on conceptual meaning rather than just keyword matching.
You can now deploy and use the ELSER v2 model directly within your cluster to generate these embeddings. ELSER is a sparse retrieval model, so its output is a set of weighted tokens rather than a dense vector. In practice, this means you can build a semantic search system entirely within the Elastic ecosystem without relying on external services for vectorization.
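To make the idea concrete, here is a minimal Python sketch that builds the body of a search request scoring documents against ELSER-generated tokens with the `text_expansion` query. It is an illustration under assumptions, not a reference implementation: the field name `ml.tokens`, the model ID `.elser_model_2`, and the helper name `build_elser_query` are placeholders that do not come from this article, and the query type available to you depends on your Elasticsearch version.

```python
import json


def build_elser_query(query_text, tokens_field="ml.tokens",
                      model_id=".elser_model_2", size=10):
    """Return a search body that matches query_text against ELSER token weights.

    tokens_field and model_id are assumptions; replace them with the field
    your ingest pipeline writes to and the model ID deployed in your cluster.
    """
    if not query_text.strip():
        raise ValueError("query_text must not be empty")
    return {
        "size": size,
        "query": {
            "text_expansion": {
                tokens_field: {
                    "model_id": model_id,
                    "model_text": query_text,
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON you would send to POST /<index>/_search.
    print(json.dumps(build_elser_query("how to reset a forgotten password"), indent=2))
```

The function only assembles the request body, so it can be reused with whichever HTTP client or Elasticsearch client library you already have in place.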
What query performance improvements were made?
This release tackles performance from multiple angles. The _terms_enum API provides an efficient way to discover the terms in a field that begin with a given string, which is a common requirement for autocomplete-style features.
For aggregations, the terms aggregation is now faster, especially on time-series datasets. Top-k queries and composite aggregations also benefit from execution path optimizations, so analytical queries return results sooner.
Example: Using the _terms_enum API
GET /my-index/_terms_enum
{
  "field": "user.id",
  "string": "al"
}
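The same request can be prepared from application code. The Python sketch below uses only the standard library to build, but not send, a `_terms_enum` request; the prefix goes in the `string` body parameter. The base URL `http://localhost:9200` and the helper name `terms_enum_request` are assumptions for illustration, and authentication is left out.

```python
import json
import urllib.request


def terms_enum_request(base_url, index, field, prefix, size=10):
    """Build (without sending) a _terms_enum request for terms starting with prefix."""
    body = {"field": field, "string": prefix, "size": size}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_terms_enum",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = terms_enum_request("http://localhost:9200", "my-index", "user.id", "al")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To execute it against a reachable cluster, pass req to
    # urllib.request.urlopen(); the matching terms are returned
    # in the "terms" array of the JSON response.
```

Keeping `size` small suits autocomplete, where only a handful of suggestions are shown at a time.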
Are there new ways to manage data lifecycle?
Yes, the data stream lifecycle (DSL) feature moves to general availability, providing a built-in and simplified alternative to Index Lifecycle Management (ILM) for managing data streams. It automates rollovers and retention based on your configured rules.
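As a rough sketch of what configuring retention looks like, the Python helper below assembles the method, path, and body for setting a lifecycle on a data stream through the `_data_stream/<name>/_lifecycle` endpoint. The data stream name `logs-app-default`, the `7d` retention, and the helper name `lifecycle_request` are hypothetical; the validation is a simplification of Elasticsearch time values.

```python
import re

# Simplified pattern for Elasticsearch time values such as "7d" or "12h".
_TIME_VALUE = re.compile(r"^\d+(nanos|micros|ms|s|m|h|d)$")


def lifecycle_request(data_stream, retention):
    """Return (method, path, body) for setting retention on a data stream."""
    if not _TIME_VALUE.match(retention):
        raise ValueError(f"not a valid time value: {retention!r}")
    path = f"/_data_stream/{data_stream}/_lifecycle"
    return "PUT", path, {"data_retention": retention}


if __name__ == "__main__":
    method, path, body = lifecycle_request("logs-app-default", "7d")
    print(method, path, body)
```

Rollover itself is handled automatically by the lifecycle, so retention is typically the main value you choose.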
Searchable snapshots get a boost from a cache on the local node. Data that has already been fetched from the snapshot repository is served from local disk, which reduces the latency of repeated queries on frozen-tier data and narrows the gap with hot data for many use cases.
How is cluster resilience improved?
Shard-level allocation awareness lets you tag nodes with custom attributes, such as a rack ID or availability zone, and have Elasticsearch take those attributes into account when placing shards. Copies of the same shard are then spread across different attribute values, so a single rack or zone failure is less likely to take out both a primary and its replicas.
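For illustration, the Python sketch below builds a cluster settings body using the standard allocation awareness settings (`cluster.routing.allocation.awareness.attributes` and the optional forced-awareness values). The attribute name `zone`, the zone names, and the helper name `awareness_settings` are assumptions; each node would also need a matching `node.attr.zone` value in its own configuration.

```python
def awareness_settings(attribute, forced_values=None):
    """Return a body for PUT /_cluster/settings that enables allocation awareness.

    attribute must match a custom node attribute (node.attr.<attribute>).
    forced_values, if given, lists every expected value so that replicas are
    not all packed into the surviving zone when another zone is down.
    """
    settings = {"cluster.routing.allocation.awareness.attributes": attribute}
    if forced_values:
        key = f"cluster.routing.allocation.awareness.force.{attribute}.values"
        settings[key] = ",".join(forced_values)
    return {"persistent": settings}


if __name__ == "__main__":
    print(awareness_settings("zone", ["zone-a", "zone-b"]))
```

Forced awareness is optional; without it, Elasticsearch may allocate all copies to the remaining zone when one zone is lost.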
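Disk-based shard allocation is governed by watermarks: by default the low watermark is 85% disk usage, the high watermark is 90%, and the flood stage is 95%. Once a node crosses the low watermark, Elasticsearch stops allocating new shards to it, so a low watermark that is set too conservatively can halt allocation prematurely during normal disk usage fluctuations.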
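The effect of the disk thresholds can be sketched as a small classification function. This is a simplified model for reasoning about the settings, not how Elasticsearch implements them: it ignores headroom settings and per-shard rules, the function name `disk_allocation_state` is hypothetical, and the threshold values in the usage lines are illustrative inputs that you should replace with your cluster's actual settings.

```python
def disk_allocation_state(used_percent, low, high, flood_stage):
    """Classify a node's disk usage against the three disk watermarks.

    Returns one of:
      "ok"          - the node can receive new shards
      "low"         - no new shards are allocated to the node
      "high"        - Elasticsearch tries to move shards off the node
      "flood_stage" - indices with shards on the node get a write block
    """
    if not 0 <= used_percent <= 100:
        raise ValueError("used_percent must be between 0 and 100")
    if not low <= high <= flood_stage:
        raise ValueError("expected low <= high <= flood_stage")
    if used_percent > flood_stage:
        return "flood_stage"
    if used_percent > high:
        return "high"
    if used_percent > low:
        return "low"
    return "ok"


if __name__ == "__main__":
    # Illustrative thresholds; pass the values configured on your cluster.
    for usage in (80, 87, 92, 96):
        print(usage, disk_allocation_state(usage, low=85, high=90, flood_stage=95))
```

Walking a few usage values through the function shows why the gap between the low and high thresholds matters: it is the band in which a node keeps its shards but receives no new ones.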
FAQ
Is the ELSER model free to use in 8.11?
No, ELSER is a paid feature. You need a subscription level that includes the machine learning features to deploy the model and use it to generate embeddings within your cluster.
Should I use data stream lifecycle (DSL) instead of ILM now?
For new data streams, DSL is the recommended approach as it's simpler and more integrated. For existing indices managed by ILM, you can continue using it; there's no requirement to migrate immediately.
What's the main benefit of the synthetic _source?
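It removes the need to store the original _source on disk by reconstructing it from doc values and stored fields when a document is retrieved. This is now generally available and can lead to significant storage savings, especially for time-series data. The reconstructed document can differ cosmetically from what was indexed, for example in field order.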
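As a minimal sketch, synthetic _source is enabled in the index mapping by setting the `_source` mode to `synthetic`. The Python helper below builds such an index creation body; the field names `@timestamp` and `host.name` and the helper name `synthetic_source_index_body` are hypothetical, and which field types are supported depends on your Elasticsearch version.

```python
def synthetic_source_index_body(properties):
    """Return a create-index body whose mapping uses synthetic _source."""
    return {
        "mappings": {
            "_source": {"mode": "synthetic"},
            "properties": dict(properties),
        }
    }


if __name__ == "__main__":
    body = synthetic_source_index_body({
        "@timestamp": {"type": "date"},
        "host.name": {"type": "keyword"},
    })
    print(body)  # send as the body of PUT /<index-name>
```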
How does the _terms_enum API differ from the _search API?
The _terms_enum API is purpose-built for term discovery and prefix matching. It's far more efficient for this specific task than using a _search request with a terms aggregation, which is a much heavier operation.
Does the shard-level awareness setting affect existing indices?
Yes. Allocation awareness is configured at the cluster level, so once the awareness attributes are set, Elasticsearch applies them to the shards of existing indices as well and may relocate shard copies to satisfy the new rules.