Remove the description of esCleaner.py from plugin/storage/es/README.md #5891

Merged (2 commits) on Aug 27, 2024
17 changes: 17 additions & 0 deletions cmd/es-index-cleaner/README.md
@@ -0,0 +1,17 @@
# es-index-cleaner

It is common to only keep observability data for a limited time.
However, Elasticsearch does not support expiring old data via TTL.
To help with this task, `es-index-cleaner` can be used to purge
old Jaeger indices. For example, to delete indices older than 14 days:

```
docker run -it --rm --net=host -e ROLLOVER=true \
jaegertracing/jaeger-es-index-cleaner:latest \
14 \
http://localhost:9200
```

Another alternative is to use [Elasticsearch Curator][curator].

[curator]: https://www.elastic.co/guide/en/elasticsearch/client/curator/current/about.html
25 changes: 8 additions & 17 deletions plugin/storage/es/README.md
@@ -5,33 +5,24 @@ This provides a storage backend for Jaeger using [Elasticsearch](https://www.ela
## Indices
Indices are created based on each span's timestamp; e.g., a span with
a timestamp on 2017/04/21 will be stored in an index named `jaeger-2017-04-21`.
ElasticSearch also has no support for TTL, so there exists a script `./esCleaner.py`
that deletes older indices automatically. The [Elastic Curator](https://www.elastic.co/guide/en/elasticsearch/client/curator/current/about.html)
can also be used instead to do a similar job.

### Using `./esCleaner.py`
The script is using `python3`. All dependencies can be installed with: `python3 -m pip install elasticsearch elasticsearch-curator`.

Parameters:
* Environment variable TIMEOUT that sets the timeout in seconds for indices deletion (default: 120)
* Optional environment variable ES_USERNAME and ES_PASSWORD
* a number that will delete any indices older than that number in days
* ElasticSearch hostnames
* Example usage: `TIMEOUT=120 ./esCleaner.py 4 localhost:9200`
It is common to only keep observability data for a limited time.
However, Elasticsearch does not support expiring old data via TTL.
To purge old Jaeger indices, use [jaeger-es-index-cleaner](../../../cmd/es-index-cleaner/).
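
To make the retention idea concrete, here is a minimal sketch of how daily index names could be derived from a span timestamp and how indices older than a given number of days could be selected. It assumes the `jaeger-YYYY-MM-DD` naming shown above and UTC dates; the helper names `index_for` and `expired_indices` are hypothetical, and the real `es-index-cleaner` may treat boundaries, prefixes, and rollover indices differently.

```python
from datetime import date, datetime, timedelta, timezone

INDEX_PREFIX = "jaeger-"  # naming taken from the example above


def index_for(ts: datetime) -> str:
    """Return the daily index name for a timezone-aware timestamp (UTC date assumed)."""
    return INDEX_PREFIX + ts.astimezone(timezone.utc).strftime("%Y-%m-%d")


def expired_indices(names, today: date, max_age_days: int):
    """Return the index names whose date is more than max_age_days before today."""
    cutoff = today - timedelta(days=max_age_days)
    expired = []
    for name in names:
        if not name.startswith(INDEX_PREFIX):
            continue  # not a Jaeger daily index
        try:
            day = datetime.strptime(name[len(INDEX_PREFIX):], "%Y-%m-%d").date()
        except ValueError:
            continue  # suffix is not a date, leave the index alone
        if day < cutoff:
            expired.append(name)
    return expired
```

In this sketch an index that is exactly `max_age_days` old is kept; only strictly older ones are returned for deletion.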

### Timestamps
Because ElasticSearch's `Date` datatype has only millisecond granularity and Jaeger
requires microsecond granularity, Jaeger spans' `StartTime` is saved as a long type.
The conversion is done automatically.
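
The following sketch illustrates why a millisecond-granularity date loses information that an integer microsecond count preserves. It assumes the long value holds microseconds since the Unix epoch, which this README does not state explicitly; the function names are hypothetical and not part of Jaeger.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_micros(ts: datetime) -> int:
    """Convert a timezone-aware datetime to integer microseconds since the epoch."""
    return (ts - EPOCH) // timedelta(microseconds=1)


def from_epoch_micros(micros: int) -> datetime:
    """Convert integer microseconds since the epoch back to a UTC datetime."""
    return EPOCH + timedelta(microseconds=micros)


start = datetime(2017, 4, 21, 12, 0, 0, 123456, tzinfo=timezone.utc)
micros = to_epoch_micros(start)

# Truncating to milliseconds drops the last three digits of the microsecond count.
truncated = (micros // 1000) * 1000
```

Round-tripping through `to_epoch_micros` and `from_epoch_micros` returns the original timestamp, while `truncated` no longer equals `micros`.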

### Nested fields (tags)
`Tags` are [nested](https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html) fields in the
ElasticSearch schema used for Jaeger. This allows for better search capabilities and data retention. However, because
ElasticSearch creates a new document for every nested field, there is currently a limit of 50 nested fields per document.
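
As a rough illustration of what querying a nested field involves, the sketch below builds an Elasticsearch `nested` query body for a single tag. The field path `tags` and the sub-fields `key` and `value` are assumptions made for this example, not names confirmed by this README, and the count check only mirrors the limit of 50 mentioned above.

```python
MAX_NESTED_FIELDS = 50  # limit quoted in the text above


def nested_tag_query(key, value, path="tags"):
    """Build a nested query that matches a key and a value on the same nested object."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "must": [
                        {"match": {f"{path}.key": key}},
                        {"match": {f"{path}.value": value}},
                    ]
                }
            },
        }
    }


def check_tag_count(tags):
    """Raise ValueError if a document would carry more nested tags than the limit."""
    if len(tags) > MAX_NESTED_FIELDS:
        raise ValueError(f"{len(tags)} tags exceeds the limit of {MAX_NESTED_FIELDS}")
    return tags
```

Because both conditions sit inside one `nested` clause, they must hold for the same tag object rather than for any two tags in the document.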

### Shards and Replicas
Number of shards and replicas per index can be specified as parameters to the writer and/or through configs under
`./pkg/es/config/config.go`. If not specified, it defaults to ElasticSearch defaults: 5 shards and 1 replica.
[This article](https://qbox.io/blog/optimizing-elasticsearch-how-many-shards-per-index) discusses
how to choose the number of shards per index.
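
For illustration, the shard and replica counts end up as index settings in Elasticsearch. The sketch below builds a settings body using the `number_of_shards` and `number_of_replicas` setting names, with the defaults quoted above. The helper `index_settings` is hypothetical; how Jaeger actually passes these values through `config.go` is not shown here.

```python
def index_settings(num_shards=5, num_replicas=1):
    """Build an index settings body; defaults follow the values quoted above."""
    if num_shards < 1:
        raise ValueError("an index needs at least one shard")
    if num_replicas < 0:
        raise ValueError("replica count cannot be negative")
    return {
        "settings": {
            "number_of_shards": num_shards,
            "number_of_replicas": num_replicas,
        }
    }
```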

@@ -42,7 +33,7 @@ This plugin queries against spans. This means that all tags in a query must lie
query to successfully return a trace.

### Case-sensitivity
Queries are case-sensitive. For example, if a document with service name `ABC` is searched using a query `abc`,
the document will not be retrieved.
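
The behavior can be pictured with a small in-memory stand-in for an exact-match lookup. This is only an analogy: the field name `serviceName` and the helper `find_by_service` are assumptions for the example, not the plugin's query code.

```python
def find_by_service(docs, service_name):
    """Return documents whose service name matches exactly, including case."""
    return [d for d in docs if d["serviceName"] == service_name]


docs = [{"serviceName": "ABC"}, {"serviceName": "abc-worker"}]
upper_hits = find_by_service(docs, "ABC")  # matches the first document
lower_hits = find_by_service(docs, "abc")  # matches nothing
```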

## Testing
@@ -57,6 +48,6 @@ and that script be run from the top folder to integration test ElasticSearch as
This script requires Docker to be running.

### Adding tests
The integration test framework for storage lives under `../integration`.
Add to `../integration/fixtures/traces/*.json` and `../integration/fixtures/queries.json` to add more
trace cases.