add spo dls to internal knowledge search #141

Merged
merged 21 commits into from
Jan 12, 2024
3 changes: 3 additions & 0 deletions example-apps/internal-knowledge-search/.flaskenv
@@ -0,0 +1,3 @@
FLASK_APP=api/app.py
FLASK_RUN_PORT=3001
FLASK_DEBUG=1
7 changes: 7 additions & 0 deletions example-apps/internal-knowledge-search/.gitignore
@@ -0,0 +1,7 @@
frontend/build
frontend/node_modules
api/__pycache__
.venv
venv
.DS_Store
.env
179 changes: 179 additions & 0 deletions example-apps/internal-knowledge-search/README.md
@@ -0,0 +1,179 @@
# Elastic Internal Knowledge Search App

This is a sample app that demonstrates how to build an internal knowledge search application with document-level security on top of Elasticsearch.

**Requires at least 8.11.0 of Elasticsearch.**


## Download the Project

Download the project from GitHub and extract the `internal-knowledge-search` folder.

```bash
curl https://codeload.github.com/elastic/elasticsearch-labs/tar.gz/main | \
tar -xz --strip=2 elasticsearch-labs-main/example-apps/internal-knowledge-search
```

## Installing and connecting to Elasticsearch

### Install Elasticsearch

There are a number of ways to install Elasticsearch. Elastic Cloud is the best fit for most use cases. See the [Install Elasticsearch](https://www.elastic.co/search-labs/tutorials/install-elasticsearch) tutorial for more information.

### Connect to Elasticsearch

This app requires the following environment variables to be set to connect to Elasticsearch:

```sh
export ELASTICSEARCH_URL=...
export ELASTIC_USERNAME=...
export ELASTIC_PASSWORD=...
```

You can add these to a `.env` file for convenience. See the `env.example` file for a .env file template.

You can also set the `ELASTIC_CLOUD_ID` instead of the `ELASTICSEARCH_URL` if you're connecting to a cloud instance and prefer to use the cloud ID.
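
As a rough illustration of how these variables could be turned into client settings, here is a minimal sketch that assumes the official `elasticsearch` Python client (8.x). The helper name `client_kwargs_from_env` is hypothetical, and preferring the cloud ID when both values are set is an assumption; the app's own `elasticsearch_client` module may do this differently.

```python
import os


def client_kwargs_from_env(env=os.environ):
    """Build keyword arguments for elasticsearch.Elasticsearch from env vars.

    Hypothetical helper for illustration only; not part of this app.
    """
    kwargs = {"basic_auth": (env["ELASTIC_USERNAME"], env["ELASTIC_PASSWORD"])}
    cloud_id = env.get("ELASTIC_CLOUD_ID")
    if cloud_id:
        # Assumption: the cloud ID wins when both values are set.
        kwargs["cloud_id"] = cloud_id
    else:
        kwargs["hosts"] = [env["ELASTICSEARCH_URL"]]
    return kwargs


# Usage (requires `pip install elasticsearch`):
# from elasticsearch import Elasticsearch
# client = Elasticsearch(**client_kwargs_from_env())
```

Keeping the helper free of network calls makes it easy to check the configuration before a client is created.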

# Workplace Search Reference App

This application shows you how to build an application using [Elastic Search Applications](https://www.elastic.co/guide/en/enterprise-search/current/search-applications.html) for a Workplace Search use case.
![img.png](img.png)

The application uses the [Search Application Client](https://github.com/elastic/search-application-client). Refer to this [guide](https://www.elastic.co/guide/en/enterprise-search/current/search-applications-search.html) for more information.

## Running the application

### Configuring mappings (subject to change in the near future)

The application uses two mapping files (will be replaced with a corresponding UI in the near future).
One specifies the mapping of the documents in your indices to the rendered search result.
The other one maps a source index to a corresponding logo.

#### Data mapping

The data mappings are located inside [config/documentsToSearchResultMappings.json](src/config/documentsToSearchResultMappings.json).
Each entry maps the fields of the documents to the search result UI component for a specific index. The mapping expects `title`, `created`, `previewText`, `fullText`, and `link` as keys.
Specify a field name of the document you want to map for each key.

##### Example:

Content document:

````json
{
  "name": "Document name",
  "_timestamp": "2342345934",
  "summary": "Some summary",
  "description": "Some full text",
  "listing_url": "some listing url"
}
````

Mapping:
````json
{
  "search-mongo": {
    "title": "name",
    "created": "_timestamp",
    "previewText": "summary",
    "fullText": "description",
    "link": "listing_url"
  }
}
````
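
The lookup this mapping describes can be sketched in a few lines. The app applies the mapping in its frontend, so the Python below is only an illustration of the idea; the function name `to_search_result` and the sample values are hypothetical.

```python
def to_search_result(index_name, source, mappings):
    """Map a document's fields to the keys the search result UI expects."""
    field_map = mappings[index_name]
    # Look up each configured document field; missing fields become None.
    return {key: source.get(field) for key, field in field_map.items()}


mappings = {
    "search-mongo": {
        "title": "name",
        "created": "_timestamp",
        "previewText": "summary",
        "fullText": "description",
        "link": "listing_url",
    }
}
document = {
    "name": "Document name",
    "_timestamp": "2342345934",
    "summary": "Some summary",
    "description": "Some full text",
    "listing_url": "https://example.com/listing",
}
result = to_search_result("search-mongo", document, mappings)
print(result["title"])  # Document name
```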

#### Logo mapping
You can specify a logo for each index behind the search application. Place your logo inside [data-source-logos](public/data-source-logos) and configure
your mapping as follows:

````json
{
"search-index-1": "data-source-logos/some_logo.png",
"search-index-2": "data-source-logos/some_other_logo.webp"
}
````

### Configuring the search application

To be able to use the index filtering and sorting in the UI you should update the search template of your search application:

`PUT _application/search_application/{YOUR_SEARCH_APPLICATION_NAME}`
````json
{
  "indices": [{YOUR_INDICES_USED_BY_THE_SEARCH_APPLICATION}],
  "template": {
    "script": {
      "lang": "mustache",
      "source": """
      {
        "query": {
          "bool": {
            "must": [
              {{#query}}
              {
                "query_string": {
                  "query": "{{query}}"
                }
              }
              {{/query}}
            ],
            "filter": {
              "terms": {
                "_index": {{#toJson}}indices{{/toJson}}
              }
            }
          }
        },
        "from": {{from}},
        "size": {{size}},
        "sort": {{#toJson}}sort{{/toJson}}
      }
      """,
      "params": {
        "query": "",
        "size": 10,
        "from": 0,
        "sort": [],
        "indices": []
      }
    }
  }
}
````
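
A caller then only has to supply values for the template's `params`. The sketch below shows one way to assemble them in Python; the helper name `build_search_params`, the zero-based `page` argument, and the `_timestamp` sort field are assumptions made for illustration.

```python
def build_search_params(query="", indices=None, page=0, size=10, sort=None):
    """Build the `params` object for the search template above.

    `page` is zero-based; it is converted to the `from` offset.
    """
    return {
        "query": query,
        "indices": list(indices or []),
        "from": page * size,
        "size": size,
        "sort": list(sort or []),
    }


params = build_search_params(
    "vacation policy",
    indices=["search-sharepoint"],
    page=2,
    sort=[{"_timestamp": "desc"}],  # hypothetical sort field
)
# The request body sent to the search application is then {"params": params}.
```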

### Setting the search app variables

Set the search application name and the search application endpoint in the UI. You'll get these values when [creating a search application](https://www.elastic.co/guide/en/enterprise-search/current/search-applications.html). For the endpoint, use only the scheme and host of your deployment, without the `/_application/search_application/{application_name}/_search` path.
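
If you only have the full search URL at hand, the part to enter in the UI can be derived as in this small sketch. The function name `endpoint_base` and the example URL are hypothetical.

```python
from urllib.parse import urlsplit


def endpoint_base(url):
    """Return scheme://host[:port] with any path, query, or fragment removed."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


full_url = (
    "https://my-deployment.es.example.com:443"
    "/_application/search_application/my-app/_search"
)
print(endpoint_base(full_url))  # https://my-deployment.es.example.com:443
```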

### Enable CORS

By default, Elasticsearch is configured to disallow cross-origin resource requests. To call Elasticsearch from the browser, you will need to [enable CORS on your Elasticsearch deployment](https://www.elastic.co/guide/en/elasticsearch/reference/current/behavioral-analytics-cors.html#behavioral-analytics-cors-enable-cors-elasticsearch).

If you don't feel comfortable enabling CORS on your Elasticsearch deployment, you can set the search endpoint in the UI to `http://localhost:3001/api/search_proxy`. Change the host if you're running the backend elsewhere. This will make the backend act as a proxy for the search calls, which is what you're most likely going to do in production.
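
To make the proxy option concrete, the sketch below builds the URL and headers a client might send. It relies on the `/api/search_proxy/<path>` route in `api/app.py`, which forwards the request body and the `Authorization` header to Elasticsearch, and on the `ApiKey` authorization scheme that Elasticsearch uses for encoded API keys. The helper name `proxied_search_request` and the sample values are hypothetical.

```python
def proxied_search_request(proxy_endpoint, search_app_name, encoded_api_key):
    """Return the URL and headers for a search sent through the backend proxy."""
    path = f"_application/search_application/{search_app_name}/_search"
    url = f"{proxy_endpoint.rstrip('/')}/{path}"
    headers = {
        # The proxy passes this header through to Elasticsearch unchanged.
        "Authorization": f"ApiKey {encoded_api_key}",
        "Content-Type": "application/json",
    }
    return url, headers


url, headers = proxied_search_request(
    "http://localhost:3001/api/search_proxy",
    "my-search-app",      # hypothetical search application name
    "ZXhhbXBsZTprZXk=",   # hypothetical encoded API key
)
# Usage (requires `pip install requests` and a running backend):
# import requests
# requests.post(url, headers=headers, json={"params": {"query": "onboarding"}})
```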


### Set up DLS with SharePoint Online (SPO)
1. Create a connector in Kibana named `search-sharepoint`.
2. Start connectors-python, if you are using connector clients.
3. Enable document-level security (DLS).
4. Run an access control sync.
5. Run a full sync.
6. Define the mappings, as described above in this README.
7. Create a search application.
8. Enable CORS: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-application-security.html#search-application-security-cors-elasticsearch
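
Once the access control sync has run, the backend in `api/app.py` looks for an index named `.search-acl-filter-<index>` for each index of the search application and reads the personas from it. The sketch below mirrors that lookup on plain Python collections instead of a live cluster; the function name `find_identities_index` is hypothetical, and unlike the backend it returns `None` rather than raising when nothing is found.

```python
ACL_FILTER_PREFIX = ".search-acl-filter-"


def find_identities_index(app_indices, existing_indices):
    """Return the access-control index of the first secured index, or None."""
    for index_name in app_indices:
        candidate = ACL_FILTER_PREFIX + index_name
        if candidate in existing_indices:
            return candidate
    return None


existing = {".search-acl-filter-search-sharepoint", "search-sharepoint"}
print(find_identities_index(["search-sharepoint"], existing))
# .search-acl-filter-search-sharepoint
```

If this lookup finds nothing for your search application, the access control sync has most likely not completed for any of its indices.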

### Change your API host

By default, this app will run on `http://localhost:3000` and the backend on `http://localhost:3001`. If you are running the backend in a different location, set the environment variable `REACT_APP_API_HOST` to wherever you're hosting your backend, plus the `/api` path.


### Run API and frontend

```sh
# Launch API app
flask run

# In a separate terminal launch frontend app
cd frontend && npm install && npm run start
```

You can now access the frontend at http://localhost:3000. Changes are automatically reloaded.
178 changes: 178 additions & 0 deletions example-apps/internal-knowledge-search/api/app.py
@@ -0,0 +1,178 @@
from flask import Flask, jsonify, request, Response, current_app
from flask_cors import CORS
from elasticsearch_client import elasticsearch_client
import os
import sys
import requests

app = Flask(__name__, static_folder="../frontend/build", static_url_path="/")
CORS(app)


def get_identities_index(search_app_name):
    search_app = elasticsearch_client.search_application.get(
        name=search_app_name)
    identities_indices = elasticsearch_client.indices.get(
        index=".search-acl-filter*")
    # Keep only the application's indices that have a matching ACL filter index.
    secured_index = [
        app_index
        for app_index in search_app["indices"]
        if ".search-acl-filter-" + app_index in identities_indices
    ]
    if len(secured_index) > 0:
        identities_index = ".search-acl-filter-" + secured_index[0]
        return identities_index
    else:
        raise ValueError(
            f"Could not find identities index for search application {search_app_name}"
        )


@app.route("/")
def api_index():
    return app.send_static_file("index.html")


@app.route("/api/default_settings", methods=["GET"])
def default_settings():
    return {
        "elasticsearch_endpoint": os.getenv("ELASTICSEARCH_URL") or "http://localhost:9200"
    }


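@app.route("/api/search_proxy/<path:text>", methods=["POST"])
def search(text):
    # Forward the request body and the caller's Authorization header to Elasticsearch.
    # Fall back to the same default URL that default_settings() reports.
    elasticsearch_url = os.getenv("ELASTICSEARCH_URL") or "http://localhost:9200"
    response = requests.request(
        method="POST",
        url=elasticsearch_url + '/' + text,
        data=request.get_data(),
        allow_redirects=False,
        headers={"Authorization": request.headers.get(
            "Authorization"), "Content-Type": "application/json"}
    )

    return response.content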


@app.route("/api/persona", methods=["GET"])
def personas():
    try:
        search_app_name = request.args.get("app_name")
        identities_index = get_identities_index(search_app_name)
        response = elasticsearch_client.search(
            index=identities_index, size=1000)
        hits = response["hits"]["hits"]
        # Each document ID in the identities index is one persona.
        personas = [x["_id"] for x in hits]
        personas.append("admin")
        return personas

    except Exception as e:
        current_app.logger.warning(
            "Encountered error %s while fetching personas, returning default persona", e
        )
        return ["admin"]


@app.route("/api/indices", methods=["GET"])
def indices():
    try:
        search_app_name = request.args.get("app_name")
        search_app = elasticsearch_client.search_application.get(
            name=search_app_name)
        return search_app['indices']

    except Exception as e:
        current_app.logger.warning(
            "Encountered error %s while fetching indices, returning no indices", e
        )
        return []


@app.route("/api/api_key", methods=["GET"])
def api_key():
    search_app_name = request.args.get("app_name")
    role_name = search_app_name + "-key-role"
    # Role for the "admin" persona: read access without a DLS query.
    default_role_descriptor = {}
    default_role_descriptor[role_name] = {
        "cluster": [],
        "indices": [
            {
                "names": [search_app_name],
                "privileges": ["read"],
                "allow_restricted_indices": False,
            }
        ],
        "applications": [],
        "run_as": [],
        "metadata": {},
        "transient_metadata": {"enabled": True},
        "restriction": {"workflows": ["search_application_query"]},
    }
    identities_index = get_identities_index(search_app_name)
    try:
        persona = request.args.get("persona")
        # Reject both a missing and an empty persona parameter.
        if not persona:
            raise ValueError("No persona specified")
        role_descriptor = {}

        if persona == "admin":
            role_descriptor = default_role_descriptor
        else:
            identity = elasticsearch_client.get(
                index=identities_index, id=persona)
            permissions = identity["_source"]["query"]["template"]["params"][
                "access_control"
            ]
            # Restrict reads to documents without access control, or to
            # documents that list one of the persona's permissions.
            role_descriptor = {
                "dls-role": {
                    "cluster": ["all"],
                    "indices": [
                        {
                            "names": [search_app_name],
                            "privileges": ["read"],
                            "query": {
                                "template": {
                                    "params": {"access_control": permissions},
                                    "source": """{
                                        "bool": {
                                            "filter": {
                                                "bool": {
                                                    "should": [
                                                        {
                                                            "bool": {
                                                                "must_not": {
                                                                    "exists": {
                                                                        "field": "_allow_access_control"
                                                                    }
                                                                }
                                                            }
                                                        },
                                                        {
                                                            "terms": {
                                                                "_allow_access_control.enum": {{#toJson}}access_control{{/toJson}}
                                                            }
                                                        }
                                                    ]
                                                }
                                            }
                                        }
                                    }""",
                                }
                            },
                        }
                    ],
                    "restriction": {"workflows": ["search_application_query"]},
                }
            }
        api_key = elasticsearch_client.security.create_api_key(
            name=search_app_name+"-internal-knowledge-search-example-"+persona, expiration="1h", role_descriptors=role_descriptor)
        return {"api_key": api_key['encoded']}

    except Exception as e:
        current_app.logger.warning(
            "Encountered error %s while fetching api key", e)
        raise


if __name__ == "__main__":
    app.run(port=3001, debug=True)