jaegertracing · yurishkuro · Aug 27, 2024
@@ -0,0 +1,17 @@
+# es-index-cleaner
+
+It is common to only keep observability data for a limited time.
+However, Elasticsearch does not support expiring old data via TTL.
+To help with this task, `es-index-cleaner` can be used to purge
+old Jaeger indices. For example, to delete indices older than 14 days:
+
+```
+docker run -it --rm --net=host -e ROLLOVER=true \
+  jaegertracing/jaeger-es-index-cleaner:latest \
+  14 \
+  http://localhost:9200
+```
+
+Another alternative is to use [Elasticsearch Curator][curator].
+
+[curator]: https://www.elastic.co/guide/en/elasticsearch/client/curator/current/about.html
@@ -5,33 +5,24 @@ This provides a storage backend for Jaeger using [Elasticsearch](https://www.ela
 ## Indices
 Indices will be created depending on the spans timestamp. i.e., a span with
 a timestamp on 2017/04/21 will be stored in an index named `jaeger-2017-04-21`.
-ElasticSearch also has no support for TTL, so there exists a script `./esCleaner.py`
-that deletes older indices automatically. The [Elastic Curator](https://www.elastic.co/guide/en/elasticsearch/client/curator/current/about.html)
-can also be used instead to do a similar job.
 
-### Using `./esCleaner.py`
-The script is using `python3`. All dependencies can be installed with: `python3 -m pip install elasticsearch elasticsearch-curator`.
-
-Parameters:
- * Environment variable TIMEOUT that sets the timeout in seconds for indices deletion (default: 120)
- * Optional environment variable ES_USERNAME and ES_PASSWORD
- * a number that will delete any indices older than that number in days
- * ElasticSearch hostnames
- * Example usage: `TIMEOUT=120 ./esCleaner.py 4 localhost:9200`
+It is common to only keep observability data for a limited time.
+However, Elasticsearch does not support expiring old data via TTL.
+To purge old Jaeger indices, use [jaeger-es-index-cleaner](../../../cmd/es-index-cleaner/).
 
 ### Timestamps
 Because ElasticSearch's `Date` datatype has only millisecond granularity and Jaeger
 requires microsecond granularity, Jaeger spans' `StartTime` is saved as a long type.
 The conversion is done automatically.
 
 ### Nested fields (tags)
-`Tags` are [nested](https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html) fields in the 
+`Tags` are [nested](https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html) fields in the
 ElasticSearch schema used for Jaeger. This allows for better search capabilities and data retention. However, because
 ElasticSearch creates a new document for every nested field, there is currently a limit of 50 nested fields per document.
 
 ### Shards and Replicas
-Number of shards and replicas per index can be specified as parameters to the writer and/or through configs under 
-`./pkg/es/config/config.go`. If not specified, it defaults to ElasticSearch defaults: 5 shards and 1 replica. 
+Number of shards and replicas per index can be specified as parameters to the writer and/or through configs under
+`./pkg/es/config/config.go`. If not specified, it defaults to ElasticSearch defaults: 5 shards and 1 replica.
 [This article](https://qbox.io/blog/optimizing-elasticsearch-how-many-shards-per-index) goes into more information
 about choosing how many shards should be chosen for optimization.
 
@@ -42,7 +33,7 @@ This plugin queries against spans. This means that all tags in a query must lie
 query to successfully return a trace.
 
 ### Case-sensitivity
-Queries are case-sensitive. For example, if a document with service name `ABC` is searched using a query `abc`, 
+Queries are case-sensitive. For example, if a document with service name `ABC` is searched using a query `abc`,
 the document will not be retrieved.
 
 ## Testing
@@ -57,6 +48,6 @@ and that script be run from the top folder to integration test ElasticSearch as
 This script requires Docker to be running.
 
 ### Adding tests
-Integration test framework for storage lie under `../integration`. 
+The integration test framework for storage lies under `../integration`.
 Add to `../integration/fixtures/traces/*.json` and `../integration/fixtures/queries.json` to add more
 trace cases.