Added advanced index actions guide & sample code file. (opensearch-project#541)

Signed-off-by: Djcarrillo6 <[email protected]>
Signed-off-by: roma2023 <[email protected]>
Djcarrillo6 authored and roma2023 committed Dec 28, 2023
1 parent af0ae87 commit 9d7cd43
Showing 4 changed files with 198 additions and 2 deletions.
4 changes: 2 additions & 2 deletions CHANGELOG.md
@@ -54,7 +54,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
- Added support for the security plugin ([#399](https://github.com/opensearch-project/opensearch-py/pull/399))
- Supports OpenSearch 2.1.0 - 2.6.0 ([#381](https://github.com/opensearch-project/opensearch-py/pull/381))
- Added `allow_redirects` to `RequestsHttpConnection#perform_request` ([#401](https://github.com/opensearch-project/opensearch-py/pull/401))
- Enhanced YAML test runner to use OpenSearch `rest-api-spec` YAML tests ([#414](https://github.com/opensearch-project/opensearch-py/pull/414)
- Enhanced YAML test runner to use OpenSearch `rest-api-spec` YAML tests ([#414](https://github.com/opensearch-project/opensearch-py/pull/414))
- Added `Search#collapse` ([#409](https://github.com/opensearch-project/opensearch-py/issues/409))
- Added support for the ISM API ([#398](https://github.com/opensearch-project/opensearch-py/pull/398))
- Added `trust_env` to `AIOHttpConnection` ([#398](https://github.com/opensearch-project/opensearch-py/pull/438))
@@ -152,4 +152,4 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
[2.2.0]: https://github.com/opensearch-project/opensearch-py/compare/v2.1.1...v2.2.0
[2.3.0]: https://github.com/opensearch-project/opensearch-py/compare/v2.2.0...v2.3.0
[2.3.1]: https://github.com/opensearch-project/opensearch-py/compare/v2.3.0...v2.3.1
[2.3.2]: https://github.com/opensearch-project/opensearch-py/compare/v2.3.1...v2.3.2
[2.3.2]: https://github.com/opensearch-project/opensearch-py/compare/v2.3.1...v2.3.2
1 change: 1 addition & 0 deletions USER_GUIDE.md
@@ -155,6 +155,7 @@ print(response)
- [Using a Proxy](guides/proxy.md)
- [Working with Snapshots](guides/snapshot.md)
- [Index Templates](guides/index_template.md)
- [Advanced Index Actions](guides/advanced_index_actions.md)
- [Connection Classes](guides/connection_classes.md)

## Plugins
113 changes: 113 additions & 0 deletions guides/advanced_index_actions.md
@@ -0,0 +1,113 @@
# Advanced Index Actions Guide
- [Advanced Index Actions](#advanced-index-actions)
  - [Setup](#setup)
  - [API Actions](#api-actions)
    - [Clear Index Cache](#clear-index-cache)
    - [Flush Index](#flush-index)
    - [Refresh Index](#refresh-index)
    - [Open or Close Index](#open-or-close-index)
    - [Force Merge Index](#force-merge-index)
    - [Clone Index](#clone-index)
    - [Split Index](#split-index)
  - [Cleanup](#cleanup)


# Advanced Index Actions
In this guide, we will look at some advanced index actions that are not covered in the [Index Lifecycle](index_lifecycle.md) guide.

## Setup
Let's create a client instance, and an index named `movies`:

```python
from opensearchpy import OpenSearch

# verify_certs=False and the default admin credentials suit a local demo cluster only
client = OpenSearch(
    hosts=['https://localhost:9200'],
    use_ssl=True,
    verify_certs=False,
    http_auth=('admin', 'admin')
)
client.indices.create(index='movies')
```
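
Creating an index that already exists raises an error, so rerunning the setup fails on the second attempt. A minimal sketch of an idempotent variant, assuming the `client` from above; `ensure_index` is a hypothetical helper name, not part of opensearch-py:

```python
def ensure_index(client, name):
    """Create `name` only if it is missing; return True when it was created."""
    # `ensure_index` is an illustrative helper, not an opensearch-py function.
    if client.indices.exists(index=name):
        return False
    client.indices.create(index=name)
    return True
```

With the client from the setup, `ensure_index(client, 'movies')` could replace the bare `client.indices.create` call.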

## API Actions
### Clear index cache
You can clear the cache of an index or indices by using the `indices.clear_cache` API action. The following example clears the cache of the `movies` index:

```python
client.indices.clear_cache(index='movies')
```

By default, the `indices.clear_cache` API action clears all types of cache. To clear only specific types of cache, pass the `query`, `fielddata`, or `request` parameter to the API action:

```python
client.indices.clear_cache(index='movies', query=True)
client.indices.clear_cache(index='movies', fielddata=True, request=True)
```
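
When the cache types to clear come from configuration or user input, it can help to build the keyword arguments in one place. The sketch below is one possible approach; `CACHE_TYPES` and `cache_flags` are hypothetical names introduced here, and only the three parameter names come from this guide:

```python
# The three cache parameters named in this guide.
CACHE_TYPES = ("query", "fielddata", "request")


def cache_flags(*cache_types):
    """Build keyword arguments for `indices.clear_cache` from cache type names."""
    unknown = set(cache_types) - set(CACHE_TYPES)
    if unknown:
        raise ValueError(f"unknown cache type(s): {sorted(unknown)}")
    return {name: True for name in cache_types}


# Intended use, given the `client` from Setup:
#   client.indices.clear_cache(index='movies', **cache_flags('fielddata', 'request'))
print(cache_flags("fielddata", "request"))  # {'fielddata': True, 'request': True}
```

Rejecting unknown names early keeps a typo such as `field_data` from reaching the cluster as an unexpected parameter.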

### Flush index
Flushing an index or indices makes sure that all data in the transaction log is persisted to the index. To flush an index or indices, use the `indices.flush` API action. The following example flushes the `movies` index:

```python
client.indices.flush(index='movies')
```
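
The call returns a response body that can be inspected. The sketch below assumes the response carries the usual `_shards` summary with a `failed` count; `failed_shards` is a hypothetical helper, and the `sample` dictionary is made-up data shaped like such a response:

```python
def failed_shards(response):
    """Return the number of shards that reported a failure (0 if no summary is present)."""
    return response.get("_shards", {}).get("failed", 0)


# Made-up response for illustration; a real one would come from
#   client.indices.flush(index='movies')
sample = {"_shards": {"total": 2, "successful": 1, "failed": 0}}
print(failed_shards(sample))  # 0
```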

### Refresh index
You can refresh an index or indices to make sure that all changes are available for search. To refresh an index or indices, use the `indices.refresh` API action:

```python
client.indices.refresh(index='movies')
```
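
A newly indexed document may not be visible to search until the next refresh. The sketch below shows an explicit refresh between a write and a read; `index_and_count` is a hypothetical helper, and it assumes the client's `index` and `count` methods behave as in opensearch-py:

```python
def index_and_count(client, index, document):
    """Index one document, refresh, and return the document count visible to search."""
    client.index(index=index, body=document)
    # Without this refresh the count below might not include the new document yet.
    client.indices.refresh(index=index)
    return client.count(index=index)["count"]
```

For example, `index_and_count(client, 'movies', {'title': 'Sample Movie'})` with a made-up document would return the count including that document.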

### Open or close index
You can close an index to block read and write operations on it. A closed index does not have to maintain certain data structures that an open index requires, which reduces the memory and disk space required by the index. The following example closes and reopens the `movies` index:

```python
client.indices.close(index='movies')
client.indices.open(index='movies')
```
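
If work has to happen while the index is closed, for example changing a setting that can only be updated on a closed index, an exception in between would leave the index closed. One way to guard against that is a context manager; `closed_index` below is a hypothetical helper written for this guide, not a library feature:

```python
from contextlib import contextmanager


@contextmanager
def closed_index(client, index):
    """Close `index` for the duration of the block and reopen it afterwards."""
    client.indices.close(index=index)
    try:
        yield
    finally:
        # Reopen even if the body raised, so the index is not left closed.
        client.indices.open(index=index)
```

Usage would look like `with closed_index(client, 'movies'): ...`, with the work on the closed index placed inside the block.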

### Force merge index
You can force merge an index or indices to reduce the number of segments in the index. This is useful when an index has accumulated a large number of small segments, because merging segments reduces the memory footprint of the index. Note that this action is resource intensive and is recommended only for read-only indices. The following example force merges the `movies` index:

```python
client.indices.forcemerge(index='movies')
```
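
The target number of segments can also be stated explicitly. The sketch below assumes `indices.forcemerge` accepts the `max_num_segments` parameter, as the force merge API does; `merge_to_segments` is a hypothetical wrapper name:

```python
def merge_to_segments(client, index, segments=1):
    """Force merge `index` down to at most `segments` segments per shard."""
    if segments < 1:
        raise ValueError("segments must be at least 1")
    return client.indices.forcemerge(index=index, max_num_segments=segments)
```

Calling `merge_to_segments(client, 'movies')` would request a merge down to a single segment per shard, which is the most expensive case.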

### Clone index
You can clone an index to create a new index with the same mappings, data, and most of the settings. The source index must be in a read-only state for cloning. The following example blocks write operations on the `movies` index, clones it into a new index named `movies_clone`, then re-enables writes:

```python
client.indices.put_settings(index='movies', body={'index': {'blocks': {'write': True}}})
client.indices.clone(index='movies', target='movies_clone')
client.indices.put_settings(index='movies', body={'index': {'blocks': {'write': False}}})
```
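
If the clone call fails, the three-step sequence above leaves the source index write-blocked. A minimal sketch that always lifts the block, using only the calls shown in this section; `write_block` and `clone_index` are hypothetical helper names:

```python
def write_block(enabled):
    """Settings body that turns the write block on or off."""
    return {'index': {'blocks': {'write': enabled}}}


def clone_index(client, source, target):
    """Clone `source` into `target`, restoring write access to `source` afterwards."""
    client.indices.put_settings(index=source, body=write_block(True))
    try:
        return client.indices.clone(index=source, target=target)
    finally:
        # Runs on success and on failure, so `source` never stays read-only.
        client.indices.put_settings(index=source, body=write_block(False))
```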

### Split index
You can split an index into another index with more primary shards. The source index must be in a read-only state for splitting. The following example creates the read-only `books` index with 5 primary shards and 30 routing shards (30 is divisible by 5), splits it into `bigger_books` with 10 shards (10 is a multiple of 5 and a factor of 30), then re-enables writes on `books`:

```python
client.indices.create(
    index='books',
    body={'settings': {
        'index': {
            'number_of_shards': 5,
            'number_of_routing_shards': 30,
            'blocks': {'write': True}}}})

client.indices.split(
    index='books',
    target='bigger_books',
    body={'settings': {'index': {'number_of_shards': 10}}})

client.indices.put_settings(index='books', body={'index': {'blocks': {'write': False}}})
```
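
The shard arithmetic can be checked before calling the API. The sketch below assumes the rule is that the target shard count must be a multiple of the source shard count and a factor of `number_of_routing_shards`; `valid_split_targets` is a hypothetical helper that performs no cluster calls:

```python
def valid_split_targets(source_shards, routing_shards):
    """List the shard counts an index with `source_shards` shards could be split into."""
    return [
        n
        # Candidate targets: multiples of the source count, larger than the source.
        for n in range(source_shards * 2, routing_shards + 1, source_shards)
        # Keep only those that divide the routing shard count evenly.
        if routing_shards % n == 0
    ]


print(valid_split_targets(5, 30))  # [10, 15, 30]
```

Under that assumption, the `books` index from the example could be split into 10, 15, or 30 shards, but not into 20 or 25.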

## Cleanup

Let's delete all the indices we created in this guide:

```python
client.indices.delete(index=['movies', 'books', 'movies_clone', 'bigger_books'])
```
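
If an earlier step failed, some of these indices may not exist and the delete call can raise an error. A hedged sketch, assuming `indices.delete` accepts the `ignore_unavailable` parameter so that missing indices are skipped; `delete_indices` is a hypothetical helper name:

```python
def delete_indices(client, names):
    """Delete the given indices, skipping any that do not exist."""
    # ignore_unavailable is assumed to make missing indices a no-op rather than an error.
    return client.indices.delete(index=list(names), ignore_unavailable=True)
```

For this guide the call would be `delete_indices(client, ['movies', 'books', 'movies_clone', 'bigger_books'])`.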

# Sample Code
See [advanced_index_actions_sample.py](/samples/advanced_index_actions/advanced_index_actions_sample.py) for a working sample of the concepts in this guide.
82 changes: 82 additions & 0 deletions samples/advanced_index_actions/advanced_index_actions_sample.py
@@ -0,0 +1,82 @@
from opensearchpy import OpenSearch
import time


# For cleaner output, uncomment the two lines below to disable warnings and informational messages
# import urllib3
# urllib3.disable_warnings()


def test_opensearch_examples():
    # Set up
    client = OpenSearch(
        hosts=['https://localhost:9200'],
        use_ssl=True,
        verify_certs=False,
        http_auth=('admin', 'admin')
    )
    client.indices.create(index='movies')
    print("'movies' index created!")

    # Test Clear Index Cache
    client.indices.clear_cache(index='movies')
    print("Cache for 'movies' index cleared!")
    client.indices.clear_cache(index='movies', query=True)
    print("Query cache for 'movies' index cleared!")
    client.indices.clear_cache(index='movies', fielddata=True, request=True)
    print("Field data and request cache for 'movies' index cleared!")

    # Test Flush Index
    client.indices.flush(index='movies')
    print("'movies' index flushed!")

    # Test Refresh Index
    client.indices.refresh(index='movies')
    print("'movies' index refreshed!")

    # Test Close or Open Index
    client.indices.close(index='movies')
    print("'movies' index closed!")
    time.sleep(2)  # add sleep to ensure the index has time to close
    client.indices.open(index='movies')
    print("'movies' index opened!")

    # Test Force Merge Index
    client.indices.forcemerge(index='movies')
    print("'movies' index force merged!")

    # Test Clone
    client.indices.put_settings(index='movies', body={'index': {'blocks': {'write': True}}})
    print("Write operations blocked for 'movies' index!")
    time.sleep(2)
    client.indices.clone(index='movies', target='movies_clone')
    print("'movies' index cloned to 'movies_clone'!")
    client.indices.put_settings(index='movies', body={'index': {'blocks': {'write': False}}})
    print("Write operations enabled for 'movies' index!")

    # Test Split
    client.indices.create(
        index='books',
        body={'settings': {
            'index': {'number_of_shards': 5, 'number_of_routing_shards': 30, 'blocks': {'write': True}}}}
    )
    print("'books' index created!")
    time.sleep(2)  # add sleep to ensure the index has time to become read-only
    client.indices.split(
        index='books',
        target='bigger_books',
        body={'settings': {'index': {'number_of_shards': 10}}}
    )
    print("'books' index split into 'bigger_books'!")
    client.indices.put_settings(index='books', body={'index': {'blocks': {'write': False}}})
    print("Write operations enabled for 'books' index!")

    # Cleanup
    client.indices.delete(index=['movies', 'books', 'movies_clone', 'bigger_books'])
    print("All indices deleted!")


if __name__ == "__main__":
    test_opensearch_examples()
