Latest in branch 2.3: 2.3.5, released 27 Jul 2016 (9 years ago)
Software: Elasticsearch
Version: 2.3
Status: End of life
Initial release: 2.3.0, 25 Mar 2016 (10 years ago)
Latest release: 2.3.5, 27 Jul 2016 (9 years ago)
End of life: Unavailable
Release notes: https://www.elastic.co/guide/en/elasticsearch/reference/2.3/release-notes.html
Source code: https://github.com/elastic/elasticsearch/tree/2.3
Documentation: https://www.elastic.co/guide/en/elasticsearch/reference/2.3/
Download: https://www.elastic.co/downloads/elasticsearch

What Is New in Elasticsearch 2.3

Elasticsearch 2.3 delivers significant enhancements to query performance, aggregation capabilities, and cluster management. This release focuses on making complex operations faster and more resource-efficient for large-scale deployments.

Category | Key Updates
New Features | Pipeline Aggregations, Doc Values by default, Reindex API
Performance | Faster terms aggregations, Improved global ordinals
Mapping & Querying | GeoPoint as multi-field, Percolator updates, Query validation
Cluster Management | Shard allocation filtering, Disk-based allocation settings
Deprecations | Scripting language deprecations, Old percolator syntax

How did pipeline aggregations change data analysis?

Pipeline aggregations, part of the 2.x series since 2.0, compute new metrics from the output of other aggregations rather than from documents. That makes calculations such as moving averages or derivatives possible directly within a single aggregation request.

Instead of post-processing results client-side, you can now chain aggregations together. For example, you can calculate the cumulative sum of a daily metric or find the bucket with the maximum value. This keeps the entire computational load on the Elasticsearch cluster.

Example Usage

{
  "aggs": {
    "sales_per_month": {
      "date_histogram": {"field": "date", "interval": "month"},
      "aggs": {
        "sales": {"sum": {"field": "price"}},
        "cumulative_sales": {"cumulative_sum": {"buckets_path": "sales"}}
      }
    }
  }
}
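
To make the per-bucket semantics concrete, here is a minimal plain-Python sketch of what a cumulative sum and a derivative compute over histogram buckets. It does not call Elasticsearch; the monthly figures and the helper names are made up for illustration.

```python
from itertools import accumulate

# Made-up monthly "sales" sums, standing in for date_histogram buckets.
monthly_sales = [
    ("2016-01", 550.0),
    ("2016-02", 60.0),
    ("2016-03", 375.0),
]


def cumulative_sum(values):
    """Running total per bucket, as a cumulative sum aggregation reports it."""
    return list(accumulate(values))


def derivative(values):
    """Change from the previous bucket; the first bucket has no predecessor."""
    if not values:
        return []
    return [None] + [curr - prev for prev, curr in zip(values, values[1:])]


sales = [value for _, value in monthly_sales]
totals = cumulative_sum(sales)
deltas = derivative(sales)

for (month, value), total, delta in zip(monthly_sales, totals, deltas):
    print(month, value, total, delta)
```

Each output row corresponds to one bucket, which is why such aggregations are evaluated against the buckets of an enclosing histogram.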

Why are doc values now the default?

Doc values are enabled by default for all fields except analyzed strings, as they have been since 2.0. Because they are a columnar data structure stored on disk, aggregations and sorts depend far less on heap memory.

In practice, this means your aggregations and sorts will be faster and more memory-efficient out of the box. The old fielddata caching mechanism, which was heap-heavy, is no longer the default for most use cases. You'll see less garbage collection pressure.

Analyzed string fields cannot use doc values, so in 2.3 aggregating or sorting on one still loads fielddata into the heap the first time the field is used. To keep heavy analytical workloads off the heap, add a not_analyzed sub-field and aggregate or sort on that instead.
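
One common way to keep aggregations on doc values is to pair an analyzed string with a not_analyzed sub-field. The sketch below only builds the request bodies as Python dictionaries, using 2.x mapping syntax; the document type `product`, the field `title`, and the sub-field name `raw` are assumptions chosen for illustration.

```python
import json


def string_field_with_raw(field_name):
    """Mapping for an analyzed string plus a not_analyzed sub-field.

    The sub-field name "raw" is a convention picked for this sketch.
    """
    return {
        field_name: {
            "type": "string",  # analyzed: full-text search, no doc values
            "fields": {
                "raw": {
                    "type": "string",
                    "index": "not_analyzed",  # exact values, backed by doc values
                }
            },
        }
    }


# Hypothetical index body with a single document type called "product".
index_body = {
    "mappings": {"product": {"properties": string_field_with_raw("title")}}
}

# Aggregate on the sub-field so the terms are read from doc values.
search_body = {"size": 0, "aggs": {"titles": {"terms": {"field": "title.raw"}}}}

print(json.dumps(index_body, indent=2))
print(json.dumps(search_body, indent=2))
```

Searches would still target `title`, while aggregations and sorts target `title.raw`.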

What makes the Reindex API so useful?

The Reindex API provides a built-in way to copy documents from one index to another. This is essential for index maintenance, mapping changes, and data migration tasks that were previously cumbersome.

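You can use it to change an index's shard count or apply updated mappings by copying documents into a new index created with the desired settings. It handles the scrolling, bulk indexing, and version conflict management for you, which makes it more reliable than custom scripts. The API is marked experimental in 2.3, and reindexing from a remote cluster is only supported in later major versions.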

Basic Reindex Command

POST _reindex
{
  "source": {"index": "old_index"},
  "dest": {"index": "new_index"}
}
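
The same request can be prepared from a script. This is a minimal sketch using only the Python standard library; the helper name, the cluster URL `http://localhost:9200`, and the sample query are assumptions, and the request is built but not sent.

```python
import json
import urllib.request


def build_reindex_request(base_url, source_index, dest_index,
                          query=None, proceed_on_conflicts=False):
    """Build (but do not send) a POST /_reindex request."""
    body = {"source": {"index": source_index}, "dest": {"index": dest_index}}
    if query is not None:
        # Copy only the documents matching this query.
        body["source"]["query"] = query
    if proceed_on_conflicts:
        # Count version conflicts instead of aborting on the first one.
        body["conflicts"] = "proceed"
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_reindex_request(
    "http://localhost:9200",  # assumed local cluster address
    "old_index",
    "new_index",
    query={"term": {"user": "kimchy"}},  # illustrative filter
    proceed_on_conflicts=True,
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To run it against a reachable cluster: urllib.request.urlopen(request)
```

Separating request construction from sending makes it easy to inspect the body before touching a production index.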

How were geo queries improved?

GeoPoint fields can now be configured as multi-fields. This allows you to index the same geographic point multiple times with different precisions or for different purposes, like a high-precision search and a low-precision aggregation.

The percolator has also been updated to support geo queries. You can now register percolator queries that use geo bounding box, distance, or polygon filters, opening up new real-time alerting scenarios based on location.
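
As an illustration of location-based alerting, the sketch below builds a geo distance query, the 2.x-style registration of that query under the `.percolator` type, and a percolate request for an incoming document. It only constructs the requests; the index `alerts`, document type `event`, field `location`, and the coordinates are hypothetical.

```python
import json


def geo_distance_query(field, lat, lon, distance):
    """Match documents whose geo point lies within `distance` of (lat, lon)."""
    return {"geo_distance": {"distance": distance, field: {"lat": lat, "lon": lon}}}


def percolator_registration(index, query_id, query):
    """Return (method, path, body) for storing a query under .percolator."""
    return "PUT", "/{}/.percolator/{}".format(index, query_id), {"query": query}


def percolate_request(index, doc_type, doc):
    """Return (method, path, body) for matching a document against stored queries."""
    return "GET", "/{}/{}/_percolate".format(index, doc_type), {"doc": doc}


# Hypothetical alert: fire for events within 5km of a point in Amsterdam.
register = percolator_registration(
    "alerts", "near-office", geo_distance_query("location", 52.37, 4.89, "5km")
)
check = percolate_request("alerts", "event", {"location": {"lat": 52.38, "lon": 4.90}})

for method, path, body in (register, check):
    print(method, path)
    print(json.dumps(body))
```

The `location` field would need to be mapped as a geo_point in the index for the stored query to be usable.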

What should I know about the deprecations?

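Scripting deserves attention mainly because of security concerns. In 2.3, Groovy is still the default scripting language and the JavaScript language plugin is still supported. Their deprecation in favor of Painless, a scripting language designed for safety and performance, arrives with the 5.x series, so Painless is not available on a 2.3 cluster. If you plan to upgrade, keep scripts simple and few so that they are easy to port later.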

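Percolation is also due to change. In 2.3, queries are registered under the .percolator type and matched through the percolate API; later major versions replace this with a dedicated percolator field type and a percolate query. Keeping your percolation setup isolated in application code will make that migration easier.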

Always check the deprecation logs after upgrading to identify any changes needed in your application code or cluster configuration.

FAQ

Do I need to reindex to benefit from doc values being default?
It depends on when the index was created. Indices created on 2.x already use doc values by default for every field except analyzed strings. Indices carried over from earlier versions keep their existing mappings, so to enable doc values there you would need to reindex with an updated mapping.

Can I still use Groovy scripts in 2.3?
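Yes. Groovy is the default scripting language in 2.3 and is not yet deprecated, although inline and indexed scripts only run if they are enabled in the elasticsearch.yml configuration file. Painless is not available in 2.3; it replaces Groovy in later major versions, so expect to migrate your scripts when you upgrade.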

What is the main advantage of pipeline aggregations?
They allow for complex, multi-stage analytics entirely on the server side. This eliminates the need to transfer large datasets to a client application for post-processing, reducing network overhead and simplifying application code.

Is the Reindex API suitable for large indices?
Yes, but monitor it carefully. The API uses scrolling and bulk indexing, so for very large indices you may need to tune the scroll batch size to balance throughput against timeouts. The slices parameter for parallel reindexing was added in a later release and is not available in 2.3.

How do I enable fielddata now that it's not the default?
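For analyzed string fields, fielddata needs no extra configuration in 2.3: it is loaded into the heap the first time the field is used for aggregations or sorting. It is only unavailable if the mapping sets "fielddata": {"format": "disabled"}, in which case removing that setting turns it back on. It's generally recommended to aggregate or sort on a not_analyzed sub-field instead, since that uses doc values.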

Releases In Branch 2.3

Version | Release date
2.3.5 | 27 Jul 2016 (9 years ago)
2.3.4 | 30 Jun 2016 (9 years ago)
2.3.3 | 17 May 2016 (10 years ago)
2.3.2 | 21 Apr 2016 (10 years ago)
2.3.1 | 01 Apr 2016 (10 years ago)
2.3.0 | 25 Mar 2016 (10 years ago)