What Is New in Elasticsearch 6.1
Elasticsearch 6.1 delivers significant enhancements in cross-cluster search, indexing performance, and data management. This release focuses on making distributed operations smoother and improving the core experience for developers.
| Category | Key Updates |
|---|---|
| New Features | Cross-cluster search (beta), Index Lifecycle Management (beta), Painless scripting improvements |
| Performance | Faster indexing throughput, Reduced memory overhead for sparse fields |
| Mapping & Analysis | New search_quote_analyzer setting, Improved ICU analysis plugin |
| Bug Fixes | Over 100 issues resolved across clustering, search, and indexing |
How does cross-cluster search work in 6.1?
Cross-cluster search is in beta in 6.1 and lets you query one or more remote clusters from a single coordinating cluster. You register each remote cluster under an alias in elasticsearch.yml, then address its indices by prefixing the index name with that alias and a colon (cluster_alias:index).
For example, GET /remote_cluster:index/_search runs the query against index on the cluster registered as remote_cluster and returns the hits through the coordinating node. This lets you search geographically distributed data without writing fan-out and merge logic in the application.
Requests to remote clusters travel over the transport layer, the same protocol nodes use to talk to each other, rather than over HTTP, which keeps inter-cluster communication efficient. It's a solid foundation for the distributed search patterns we've been building manually for years.
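To make the alias-prefixed syntax concrete, here is a minimal Python sketch that builds the remote-cluster setting and the search path without contacting a cluster. The setting key search.remote.<alias>.seeds is an assumption based on the 6.x naming and should be checked against the documentation for your release. The seed address 10.0.0.5:9300 and the index name local_index are hypothetical.

```python
def remote_cluster_settings(alias, seeds):
    """Return the flat setting that registers a remote cluster under `alias`.

    Assumption: the key is `search.remote.<alias>.seeds`; verify the
    setting name for the release you are running.
    """
    return {f"search.remote.{alias}.seeds": list(seeds)}


def search_path(targets):
    """Build a _search path from (cluster_alias, index) pairs.

    Pass None as the alias to address an index on the local cluster.
    """
    parts = [f"{alias}:{index}" if alias else index for alias, index in targets]
    return "/" + ",".join(parts) + "/_search"


# Hypothetical seed node; 9300 is the default transport port.
settings = remote_cluster_settings("remote_cluster", ["10.0.0.5:9300"])
path = search_path([("remote_cluster", "index"), (None, "local_index")])

print(settings)
print("GET " + path)
```

Mixing a prefixed and an unprefixed index in one path is how a single request can cover both a remote cluster and the local one.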
What indexing improvements should I expect?
Indexing throughput improves, most noticeably for use cases involving sparse fields. The engine now handles documents in which many fields are missing far more efficiently, reducing memory overhead and CPU usage.
This matters because real-world data is often messy and incomplete. If you're ingesting documents with varying schemas or many optional fields, you'll likely see a noticeable performance improvement and lower memory pressure on your data nodes.
The release also optimizes how internal data structures are built and managed during indexing. For bulk indexing jobs, this can translate to faster completion times and more consistent performance under heavy load.
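As an illustration of what a sparse bulk workload looks like, the sketch below serializes documents into the newline-delimited body that the _bulk endpoint expects, dropping fields that have no value. The index name events, the type name doc, and the field names are hypothetical. The helper only builds the request body and does not send it.

```python
import json


def bulk_body(index, docs, doc_type="doc"):
    """Serialize (id, source) pairs into a newline-delimited _bulk body.

    Fields whose value is None are dropped, so each document carries only
    the fields it actually has (a sparse document).
    """
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        sparse = {k: v for k, v in source.items() if v is not None}
        lines.append(json.dumps(action))
        lines.append(json.dumps(sparse))
    # The bulk format requires a newline after the final line as well.
    return "\n".join(lines) + "\n"


body = bulk_body(
    "events",
    [
        ("1", {"user": "a", "ip": None}),
        ("2", {"user": None, "ip": "10.0.0.1"}),
    ],
)
print(body)
```

Each document ends up with a different subset of fields, which is the shape of data the sparse-field improvements target.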
Are there new Painless scripting capabilities?
Yes, Painless scripting gets more powerful with support for the continue keyword and new methods for list and map manipulation. You can now use continue within loops for better control flow in your scripts.
The release adds handy methods like List.indexOf and Map.toString, which simplify data manipulation directly within your queries or ingest pipelines. This makes scripted fields and conditional logic easier to write and maintain.
For developers, this means less pre-processing of data outside of Elasticsearch. You can handle more complex transformation logic inline, which is great for real-time data enrichment and custom scoring.
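The following sketch shows how continue and List.indexOf might be combined in a script field, wrapped in a Python helper that builds the search request body. The field name tags, the script field name kept_tag_count, and the skip list are hypothetical. The small Python function at the end mirrors the script's logic so the intended behavior can be checked without a cluster.

```python
# Painless source: count a document's values for params.field,
# skipping any value that appears in params.skip.
PAINLESS_SOURCE = """
int kept = 0;
for (def tag : doc[params.field]) {
  if (params.skip.indexOf(tag) >= 0) {
    continue;
  }
  kept += 1;
}
return kept;
"""


def script_field_request(field, skip):
    """Build a search body that evaluates the script for every hit."""
    return {
        "query": {"match_all": {}},
        "script_fields": {
            "kept_tag_count": {
                "script": {
                    "lang": "painless",
                    "source": PAINLESS_SOURCE.strip(),
                    "params": {"field": field, "skip": list(skip)},
                }
            }
        },
    }


def kept_tag_count(tags, skip):
    """Pure-Python mirror of the Painless loop above."""
    kept = 0
    for tag in tags:
        if tag in skip:
            continue
        kept += 1
    return kept


request = script_field_request("tags", ["internal", "draft"])
print(kept_tag_count(["news", "draft", "sports"], ["internal", "draft"]))
```

Passing the field name and skip list through params keeps the script source constant across requests, so it can be compiled once and reused.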
What's new for text analysis?
The ICU analysis plugin has been upgraded with new tokenizers and filters, providing better support for multilingual text. This is crucial for correctly parsing and analyzing languages with complex grammatical structures.
A new search_quote_analyzer setting allows you to specify a different analyzer for phrase searches. This helps in scenarios where you want exact phrase matching to behave differently from standard term analysis, improving the relevance of quoted search results.
In practice, this gives you finer control over how search queries are interpreted. You can now tailor the analysis chain specifically for phrase matching, which is something we've been working around with custom solutions for a while.
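As a sketch of how the three analyzer settings fit together on one field, the Python snippet below builds an index definition in which ordinary queries drop English stop words but quoted phrases keep them. The analyzer names my_analyzer and my_stop_analyzer, the filter name english_stop, the type name doc, and the title field are all hypothetical.

```python
def index_definition(doc_type="doc"):
    """Build settings and mappings for a text field with a separate
    analyzer for quoted (phrase) searches."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    # Keeps stop words: used at index time and for phrases.
                    "my_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase"],
                    },
                    # Removes stop words: used for non-phrase queries.
                    "my_stop_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "english_stop"],
                    },
                },
                "filter": {
                    "english_stop": {"type": "stop", "stopwords": "_english_"}
                },
            }
        },
        "mappings": {
            doc_type: {
                "properties": {
                    "title": {
                        "type": "text",
                        "analyzer": "my_analyzer",
                        "search_analyzer": "my_stop_analyzer",
                        "search_quote_analyzer": "my_analyzer",
                    }
                }
            }
        },
    }


definition = index_definition()
print(definition["mappings"]["doc"]["properties"]["title"])
```

With this layout, a quoted query such as "the quick brown fox" is analyzed without stop-word removal, so the phrase has to match the indexed tokens including "the".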
How is data management improved?
Index Lifecycle Management (ILM) enters beta, providing a native way to automate index management policies. You can define policies that automatically handle rollovers, force merges, and shrink operations based on index size or age.
This is a step up from using Curator for these tasks, as it's built directly into the cluster and managed through APIs. You can define a policy that moves indices from hot to warm to cold storage tiers automatically.
For operations teams, this means less manual intervention and more consistent management of the index lifecycle. It's particularly useful for time-series data like logs and metrics where retention policies are critical.
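A hot-to-warm-to-cold policy of the kind described above could be sketched as the following Python helper, which only builds the policy body. The layout follows the policy format in the official ILM API documentation (submitted with PUT _ilm/policy/<name>). Whether your release exposes that endpoint and these action names is an assumption to confirm in the documentation. The thresholds and the box_type node attribute are hypothetical.

```python
def lifecycle_policy(rollover_size="50gb", rollover_age="30d",
                     warm_after="7d", cold_after="30d"):
    """Build a lifecycle policy body with hot, warm, and cold phases.

    Assumption: phase and action names follow the documented ILM policy
    format; verify them for the release you are running.
    """
    return {
        "policy": {
            "phases": {
                # Hot: roll over to a new index on size or age.
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_size": rollover_size,
                            "max_age": rollover_age,
                        }
                    }
                },
                # Warm: compact the index once it is no longer written to.
                "warm": {
                    "min_age": warm_after,
                    "actions": {
                        "forcemerge": {"max_num_segments": 1},
                        "shrink": {"number_of_shards": 1},
                    },
                },
                # Cold: move shards to nodes tagged with a hypothetical attribute.
                "cold": {
                    "min_age": cold_after,
                    "actions": {"allocate": {"require": {"box_type": "cold"}}},
                },
            }
        }
    }


policy = lifecycle_policy()
print(sorted(policy["policy"]["phases"]))
```

Keeping the thresholds as function arguments makes it easy to generate one policy per retention class, for example shorter ages for logs than for metrics.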
FAQ
Is cross-cluster search production-ready?
It's marked as beta in 6.1, so while it is functional, it is best suited to testing and non-critical workloads. The core mechanics work well, but expect some rough edges in complex failure scenarios.
Do I need to reindex to benefit from the indexing improvements?
No, the performance benefits for sparse fields are applied at the engine level. Existing indices will automatically benefit from the reduced memory overhead during segment merges and general operation.
Can I use the new Painless features on existing indices?
Yes, the scripting enhancements are runtime features. You can immediately use continue and the new collection methods in your queries and scripts without any index changes.
How do I enable Index Lifecycle Management?
ILM is available through the API in beta. You'll need to define lifecycle policies and then apply them to your indices. Check the documentation for the specific API endpoints and policy definitions.
What's the upgrade path from 5.x to 6.1?
You do not need to stop at 6.0 on the way. From 5.6 you can perform a rolling upgrade directly to 6.1; from earlier 5.x releases, a full cluster restart is required. In either case, review the breaking changes in the 6.0 release notes, especially around mapping types and index templates.