What Is New in Elasticsearch 1.6
Elasticsearch 1.6 delivers significant enhancements across core functionality, scripting, and cluster operations. This release focuses on stability, performance, and more powerful tools for data manipulation.
| Category | Key Changes |
|---|---|
| New Features | Painless Scripting Language, Document Versioning in Realtime GET, Wait for Refresh API |
| Improvements | Aggregation Performance, Snapshot/Restore Process, Query Execution |
| Bug Fixes | Resolved issues in Geo, Percolator, and Node Discovery |
| Deprecations | Deprecated Lucene Expressions, Groovy Scripting (in certain contexts) |
What scripting improvements make 1.6 stand out?
The headline feature is the introduction of the Painless scripting language. Painless is designed to be safe, fast, and simple, addressing performance and security concerns with inline Groovy scripts.
It's enabled by default for inline and stored scripts. In practice, this means your aggregations and document updates can run significantly faster without the overhead of the Groovy sandbox. Groovy scripting is now deprecated for inline and stored scripts, so migrating to Painless is recommended.
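As a rough illustration of what a migrated inline script might look like, the sketch below builds an update request body in Python. The body layout (`script`, `lang`, `inline`, `params`), the `counter` field, and the helper name `build_painless_update` are assumptions made for illustration, not details from the release notes; check the scripting reference for the exact request shape in your version.

```python
import json


def build_painless_update(count):
    """Return a hypothetical update body that increments a `counter` field.

    Assumption: the script is sent inline with an explicit `lang`, and the
    increment is passed through `params` instead of being pasted into the
    script text.
    """
    return {
        "script": {
            "lang": "painless",
            "inline": "ctx._source.counter += params.count",
            "params": {"count": count},
        }
    }


body = build_painless_update(4)
print(json.dumps(body, indent=2))
```

Passing the value through `params` keeps the script text identical from call to call, so only the parameters change between requests.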
How did real-time GET operations get better?
Realtime GET requests now honor document versioning. Previously, a realtime GET issued while a document was being indexed could return an older version even though a newer one had just been committed.
This change gives you strong consistency: a realtime GET always returns the latest version of the document, even in a high-throughput environment. It removes a subtle edge case that could confuse applications relying on immediate data visibility after an index operation.
What new control do we have over index refreshes?
The new Wait for Refresh API (_refresh?wait_for) lets you block until a refresh has made indexed data available for search. This differs from the forced refresh API, which triggers a refresh immediately.
Use this when your application logic requires that a just-indexed document must be immediately searchable, such as in a test suite or a synchronous import process. It provides a way to synchronize indexing and searching without hammering the cluster with manual refresh calls.
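The difference between forcing a refresh and waiting for one can be sketched with a toy in-memory model. Nothing here talks to Elasticsearch: `ToyIndex` and both helper functions are hypothetical names invented for this illustration, and the "scheduled" refresh is triggered by hand to keep the example deterministic.

```python
class ToyIndex:
    """Tiny in-memory stand-in for an index. Not Elasticsearch code."""

    def __init__(self):
        self.buffer = []      # indexed, not yet searchable
        self.searchable = []  # visible to search
        self.refreshes = 0    # number of refreshes performed

    def index(self, doc):
        self.buffer.append(doc)

    def refresh(self):
        # A refresh moves every buffered document into the searchable set.
        self.searchable.extend(self.buffer)
        self.buffer.clear()
        self.refreshes += 1


def index_with_forced_refresh(idx, docs):
    # Forcing: every write pays for its own refresh.
    for doc in docs:
        idx.index(doc)
        idx.refresh()


def index_and_wait(idx, docs):
    # Waiting: writes accumulate, and callers would block here until the
    # periodic refresh fires. The toy model fires it once, by hand.
    for doc in docs:
        idx.index(doc)
    idx.refresh()


forced, waited = ToyIndex(), ToyIndex()
docs = ["a", "b", "c"]
index_with_forced_refresh(forced, docs)
index_and_wait(waited, docs)
print(forced.refreshes, waited.refreshes)  # 3 1
```

Both indexes end up with the same searchable documents, but the forced variant performs one refresh per write, which is the cost the waiting approach is meant to avoid.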
Were there any notable aggregation enhancements?
Yes, the terms aggregation saw a major performance boost, especially for large datasets. The optimization reduces memory usage and improves query latency when bucketing on high-cardinality fields.
We also got the new bucket selector aggregation, which lets you filter out buckets from the results based on a script. This is powerful for post-aggregation logic, like only keeping buckets where the average value exceeds a certain threshold.
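To make the "average above a threshold" case concrete, here is a minimal sketch that builds such a request body in Python and mimics the filtering locally. The field names (`category`, `price`), the aggregation names, and both helper functions are assumptions for illustration; the exact script syntax accepted by your version may differ.

```python
def build_bucket_selector_query(threshold):
    """Hypothetical search body: keep term buckets whose average price
    exceeds `threshold`. Field and aggregation names are made up."""
    return {
        "size": 0,
        "aggs": {
            "by_category": {
                "terms": {"field": "category"},
                "aggs": {
                    "avg_price": {"avg": {"field": "price"}},
                    "only_expensive": {
                        "bucket_selector": {
                            # Expose the sibling metric to the script.
                            "buckets_path": {"avgPrice": "avg_price"},
                            "script": f"params.avgPrice > {threshold}",
                        }
                    },
                },
            }
        },
    }


def keep_buckets(buckets, threshold):
    """Local imitation of the selector: drop buckets at or below threshold."""
    return [b for b in buckets if b["avg_price"] > threshold]


query = build_bucket_selector_query(100)
sample = [
    {"key": "books", "avg_price": 18.5},
    {"key": "cameras", "avg_price": 420.0},
]
print(keep_buckets(sample, 100))
```

The selector runs after the buckets are computed, so it changes which buckets are returned, not which documents are counted in them.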
How was the snapshot/restore process improved?
The snapshot process is now more resilient. It better handles cases where a node drops out during the snapshot, allowing the operation to continue on the remaining nodes instead of failing completely.
Restore operations gained the ability to rename indices on the fly. You can now apply index name patterns during a restore, making it much easier to clone data from a backup into a new index for testing or data recovery purposes.
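A small sketch of how pattern-based renaming during a restore could be prepared and previewed from Python. It assumes the restore request accepts `indices`, `rename_pattern`, and `rename_replacement` fields with `$1`-style group references; the index names and both helper functions are hypothetical.

```python
import re


def build_restore_body(indices, rename_pattern, rename_replacement):
    """Assemble a restore request body (field names are an assumption)."""
    return {
        "indices": ",".join(indices),
        "rename_pattern": rename_pattern,
        "rename_replacement": rename_replacement,
    }


def preview_renames(indices, rename_pattern, rename_replacement):
    """Show locally what each index would be called after the restore.

    The replacement uses $1-style group references, while Python's `re`
    expects \\1, so the references are converted first.
    """
    py_replacement = re.sub(r"\$(\d+)", r"\\\1", rename_replacement)
    return {
        name: re.sub(rename_pattern, py_replacement, name) for name in indices
    }


indices = ["logs-2015.06", "metrics"]
body = build_restore_body(indices, "logs-(.+)", "restored-logs-$1")
print(preview_renames(indices, "logs-(.+)", "restored-logs-$1"))
```

Previewing the mapping before sending the request helps catch a pattern that would leave a name unchanged and collide with an existing index.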
FAQ
Is Painless scripting backwards compatible with my existing Groovy scripts?
No, Painless has its own syntax. You will need to rewrite your existing inline and stored Groovy scripts to use the Painless language. The documentation provides a guide for this migration.
Does the realtime GET versioning change affect performance?
There is a minor performance cost because the operation now must check the index's versioning state. However, the trade-off for strong consistency is worth it for most use cases.
When should I use the Wait for Refresh API instead of a manual refresh?
Use it when you need to wait for the next scheduled refresh to happen naturally, rather than forcing one. Forcing a refresh (_refresh) is more resource-intensive and should be used sparingly.
Are there any settings I need to change to enable Painless?
No, Painless is enabled by default for all inline and stored scripts. The dynamic scripting setting for Groovy (script.groovy.sandbox.enabled) is now deprecated as part of the move to Painless.
What happens to my percolator queries from previous versions?
Several bug fixes were applied to the percolator to make it more stable. You should not need to change anything, and the percolator should be more reliable, especially when registering many queries.