Merge pull request #256 from grycap/dev-gmolto
Update script to prevent overwrite and Documentation Improvements
catttam authored Oct 24, 2024
2 parents 81e92d4 + a1fc547 commit 7316c7c
Showing 8 changed files with 44 additions and 11 deletions.
4 changes: 4 additions & 0 deletions docs/api.md
@@ -4,4 +4,8 @@ OSCAR exposes a secure REST API available at the Kubernetes master's node IP
through an Ingress Controller. This API has been described following the
[OpenAPI Specification](https://www.openapis.org/) and it is available below.

> ℹ️
>
> The bearer token used to run a service can be either the OSCAR [service access token](invoking-sync.md#service-access-tokens) or the [user's Access Token](integration-egi.md#obtaining-an-access-token) if the OSCAR cluster is integrated with EGI Check-in.
!!swagger api.yaml!!
Binary file added docs/images/oidc/egi-checkin-token-portal.png
16 changes: 15 additions & 1 deletion docs/integration-egi.md
@@ -67,7 +67,7 @@ grant access for all users from that VO.

The static web interface of OSCAR has been integrated with EGI Check-in and
published in [ui.oscar.grycap.net](https://ui.oscar.grycap.net) to facilitate
the authorization of users. To login through EGI Checkín using OIDC tokens,
the authorization of users. To log in through EGI Check-In using OIDC tokens,
users only have to enter the endpoint of their OSCAR cluster and click on the
"EGI CHECK-IN" button.

@@ -87,3 +87,17 @@ create a new account configuration for the
After that, clusters can be
added with the command [`oscar-cli cluster add`](oscar-cli.md#add) specifying
the oidc-agent account name with the `--oidc-account-name` flag.

### Obtaining an Access Token

Once logged in via EGI Check-In, you can obtain an Access Token with one of these approaches:

* From the command-line, using `oidc-agent` with the following command:

```sh
oidc-token <account-short-name>
```
where `account-short-name` is the name of your account configuration.
* From the EGI Check-In Token Portal: [https://aai.egi.eu/token](https://aai.egi.eu/token)

![egi-checkin-token-portal.png](images/oidc/egi-checkin-token-portal.png)
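
As a hedged sketch of how the Access Token obtained above might be used against the OSCAR API: the helper names `auth_header` and `list_services` are hypothetical, the account name `egi` and the endpoint URL are placeholders, and `/system/services` is assumed to be the service-listing path described in the API specification.

```sh
# Sketch only: helper names and the endpoint value are illustrative assumptions.

# Build the HTTP Authorization header for a given bearer token.
auth_header() {
    printf 'Authorization: Bearer %s' "$1"
}

# List the services of an OSCAR cluster using the user's Access Token.
# $1: oidc-agent account short name
# $2: OSCAR endpoint, e.g. https://my-oscar.example.org (placeholder)
list_services() {
    token=$(oidc-token "$1") || return 1
    curl -sS -H "$(auth_header "$token")" "$2/system/services"
}

# Usage (requires a running oidc-agent and a reachable cluster):
#   list_services egi https://my-oscar.example.org
```

Keeping the header construction in its own function makes it easy to swap in a service access token instead of the user's Access Token.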
8 changes: 4 additions & 4 deletions docs/invoking-async.md
@@ -2,11 +2,11 @@

For event-driven file processing, OSCAR automatically manages the creation
and [notification system](https://docs.min.io/minio/baremetal/monitoring/bucket-notifications/bucket-notifications.html#minio-bucket-notifications)
of MinIO buckets in order to allow the event-driven invocation of services
using asynchronous requests, generating a Kubernetes job for every file to be
processed.

of MinIO buckets. This allows the event-driven invocation of services
using asynchronous requests: every file uploaded to the bucket generates a Kubernetes job to process it.

![oscar-async.png](images/oscar-async.png)

These jobs will be queued up in the Kubernetes scheduler and will be processed whenever there are resources available. If the OSCAR cluster has been deployed as an elastic Kubernetes cluster (see [Deployment with IM](https://docs.oscar.grycap.net/deploy-im-dashboard/)), then new Virtual Machines will be provisioned (up to the maximum number of nodes defined) in the underlying Cloud platform and seamlessly integrated in the Kubernetes cluster to proceed with the execution of jobs. These nodes will be terminated as the workload is reduced. Notice that the output files can be stored in MinIO or in any other storage back-end supported by the [FaaS supervisor](oscar-service.md#faas-supervisor).

If you want to process a large number of data files, consider using [OSCAR Batch](https://github.com/grycap/oscar-batch), a tool designed to perform batch-based processing in OSCAR clusters. It includes a coordinator tool where the user provides a MinIO bucket containing files for processing. This service calculates the optimal number of parallel service invocations that can be accommodated within the cluster, according to its current status, and distributes the image processing workload accordingly among the service invocations. This is mainly intended to process large amounts of files, for example, historical data.
4 changes: 2 additions & 2 deletions docs/invoking-sync.md
@@ -83,8 +83,8 @@ base64 input.png | curl -X POST -H "Authorization: Bearer <TOKEN>" \

## Service access tokens

As detailed in the [API specification](api.md), invocation paths require the
service access token in the request header for authentication. Service access
As detailed in the [API specification](api.md), invocation paths require a bearer token
in the request header for authentication: either the service access token or, when the cluster is integrated with EGI Check-in, the user's Access Token (both are valid). Service access
tokens are auto-generated in service creation and update, and MinIO eventing
system is automatically configured to use them for event-driven file
processing. Tokens can be obtained through the API, using the
13 changes: 11 additions & 2 deletions docs/invoking.md
@@ -2,7 +2,16 @@

OSCAR services can be executed:

- [Synchronously](invoking-sync.md), so that the invocation to the service blocks the client until the response is obtained. Useful for short-lived service invocations.
- [Synchronously](invoking-sync.md), so that the invocation to the service blocks the client until the response is obtained.
- [Asynchronously](invoking-async.md), typically in response to a file upload to MinIO or via the OSCAR API.
- As an [exposed service](exposed-services.md), where the application executed already provides its own API or user interface (e.g. a Jupyter Notebook)
- As an [exposed service](exposed-services.md), where the application executed already provides its own API or user interface (e.g. Jupyter Notebook)


After reading about the different service execution types, take the following considerations into account to decide which one best fits your use case:

* **Scalability**: Asynchronous invocations provide the best throughput when dealing with multiple concurrent data processing requests, since these are processed by independent jobs which are managed by the Kubernetes scheduler. A two-level elasticity approach is used (increase in the number of pods and increase in the number of Virtual Machines, if the OSCAR cluster was configured to be elastic). This is the recommended approach when each processing request takes longer than a few tens of seconds.

* **Reduced Latency**: Synchronous invocations are oriented to short-lived (< tens of seconds), bursty requests. A certain number of containers can be configured to be kept alive to avoid the performance penalty of spawning new ones, while an upper bound limits how many are created (see [`min_scale` and `max_scale` in the FDL](fdl.md#synchronoussettings)), at the expense of always consuming resources in the OSCAR cluster. If the file to be processed is in the order of several MBytes, it may not fit in the payload of the HTTP request.

* **Easy Access**: For services that provide their own user interface or their own API, exposed services provide the ability to execute them in OSCAR and benefit from an auto-scaled configuration in case they are [stateless](https://en.wikipedia.org/wiki/Service_statelessness_principle). This way, users can access the service directly through its well-known interfaces.
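
The considerations above can be sketched as a small decision helper. This is only an illustration: `suggest_execution_type` is a hypothetical function, and the cut-offs of 30 seconds and 5 MB are assumptions chosen to stand in for "tens of seconds" and "several MBytes", not values defined by OSCAR.

```sh
# Hypothetical helper: suggest an OSCAR execution type.
# $1: expected processing time per request, in seconds
# $2: input size, in megabytes
# $3: "yes" if the application ships its own API or UI, otherwise "no"
suggest_execution_type() {
    duration_s=$1
    input_mb=$2
    own_interface=$3
    # Assumed cut-offs; the documentation gives only orders of magnitude.
    max_sync_s=30
    max_sync_mb=5
    if [ "$own_interface" = "yes" ]; then
        echo "exposed"
    elif [ "$duration_s" -le "$max_sync_s" ] && [ "$input_mb" -le "$max_sync_mb" ]; then
        echo "synchronous"
    else
        echo "asynchronous"
    fi
}

suggest_execution_type 2 1 no      # prints: synchronous
suggest_execution_type 120 1 no    # prints: asynchronous
suggest_execution_type 2 50 no     # prints: asynchronous
suggest_execution_type 5 1 yes     # prints: exposed
```

Adjust the two cut-offs to the actual limits of your cluster before relying on the result.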

8 changes: 7 additions & 1 deletion docs/oscar-service.md
@@ -15,7 +15,7 @@ is in charge of:



### Input/Output
### FaaS Supervisor

[FaaS Supervisor](https://github.com/grycap/faas-supervisor), the component in
charge of managing the input and output of services, allows JSON or base64
Expand All @@ -37,6 +37,12 @@ The output of synchronous invocations will depend on the application itself:

This way users can adapt OSCAR's services to their own needs.

The FaaS Supervisor supports the following storage back-ends:
* [MinIO](https://min.io)
* [Amazon S3](https://aws.amazon.com/s3/)
* WebDAV (and, therefore, [dCache](https://dcache.org))
* Onedata (and, therefore, [EGI DataHub](https://www.egi.eu/service/datahub/))

### Container images

Container images on asynchronous services use the tag `imagePullPolicy: Always`, which means that Kubernetes will check for the image digest on the image registry and download it if it is not present.
2 changes: 1 addition & 1 deletion examples/plants-classification-tensorflow/script.sh
@@ -1,6 +1,6 @@
#!/bin/bash

IMAGE_NAME=`basename "$INPUT_FILE_PATH"`
IMAGE_NAME=`basename "$INPUT_FILE_PATH" | cut -d. -f1`
OUTPUT_FILE="$TMP_OUTPUT_DIR/output.json"

deepaas-cli predict --files "$INPUT_FILE_PATH" 2>&1 | grep -Po '{.*}' > "$OUTPUT_FILE"
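
To illustrate what the changed `IMAGE_NAME` line does, here is a minimal sketch. The function names `stem_first_dot` and `stem_last_dot` and the sample paths are hypothetical; the second function is an alternative shown for comparison and is not part of this commit.

```sh
# Same pipeline as in script.sh: keep the text before the FIRST dot.
stem_first_dot() {
    basename "$1" | cut -d. -f1
}

# Alternative (not part of the commit): strip only the LAST extension.
stem_last_dot() {
    name=$(basename "$1")
    printf '%s\n' "${name%.*}"
}

stem_first_dot /tmp/input/rose.png      # prints: rose
stem_first_dot /tmp/input/rose.v2.png   # prints: rose
stem_last_dot  /tmp/input/rose.v2.png   # prints: rose.v2
```

The two approaches differ only for file names that contain more than one dot.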