Replace fsnotify with an HTTP request to trigger actions. (#11)
Most shared filesystems are networked in some manner and don't share events
across nodes; this means that fsnotify doesn't actually work. Polling is too
taxing so instead we ask the client to ping an API to indicate that a request
body has been written to the staging directory and is ready for execution. 

We don't ask the client to provide the request details in the API as it's
unauthenticated. We're still relying on the Unix file owner to tell us who is
making the request; the ping just tells us that the file is ready, which
happily eliminates the need for the retry loop. The output is also returned
as an HTTP response, which eliminates the need for the responses directory.

Some extra work is involved in making sure that the correct HTTP status codes
are reported for the different errors. We also mandate Go >= 1.22.1 now.
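
One way to attach a status code to an error is a small wrapper type. The sketch below is an assumption about how a helper like `newHttpError` could be defined, since its definition is not among the files shown here; the `httpError` type, its fields, and `statusFor` are illustrative names, and the fallback to 500 for untagged errors is a guess rather than confirmed behaviour.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// httpError pairs an underlying error with the HTTP status to report.
// Assumed shape: the real definition is not shown in this commit excerpt.
type httpError struct {
	Status int
	Reason error
}

func (e *httpError) Error() string { return e.Reason.Error() }

// Unwrap keeps the original error reachable through errors.Is and errors.As.
func (e *httpError) Unwrap() error { return e.Reason }

// newHttpError tags reason with the given status code.
func newHttpError(status int, reason error) error {
	return &httpError{Status: status, Reason: reason}
}

// statusFor picks the status code for a handler's result, falling back to
// 500 for errors that were never tagged with a status (an assumption).
func statusFor(err error) int {
	if err == nil {
		return http.StatusOK
	}
	var tagged *httpError
	if errors.As(err, &tagged) {
		return tagged.Status
	}
	return http.StatusInternalServerError
}

func main() {
	forbidden := newHttpError(http.StatusForbidden, errors.New("user is not authorized"))
	wrapped := fmt.Errorf("request failed; %w", forbidden)

	// The tag survives wrapping with %w; untagged errors fall back to 500.
	fmt.Println(statusFor(wrapped), statusFor(errors.New("disk error")), statusFor(nil))
	// prints "403 500 200"
}
```

Because the tag is recovered with `errors.As`, handlers can keep wrapping errors with `%w` without losing the status code.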
LTLA authored Apr 10, 2024
1 parent 8aa7631 commit 976446a
Showing 16 changed files with 207 additions and 309 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/build.yaml
@@ -24,7 +24,7 @@ jobs:
     - name: Setup Go
       uses: actions/setup-go@v4
       with:
-        go-version: '1.20'
+        go-version: '1.22'
         cache-dependency-path: go.sum

     - name: Install dependencies
39 changes: 16 additions & 23 deletions README.md
@@ -9,7 +9,6 @@
 The Gobbler implements [**gypsum**](https://github.com/ArtifactDB/gypsum-worker)-like storage of ArtifactDB-managed files on a shared filesystem.
 This replaces cloud storage with a world-readable local directory, reducing costs and improving efficiency by avoiding network traffic for uploads/downloads.
 We simplify authentication by using Unix file permissions to determine ownership, avoiding the need for a separate identity provider like GitHub.
-In fact, no HTTP requests are required at all, as the user and Gobbler communicate solely through filesystem events.

 This document is intended for system administrators who want to spin up their own instance or developers of new clients to the Gobbler service.
 Users should never have to interact with the Gobbler directly, as this should be mediated by client packages in relevant frameworks like R or Python.
@@ -135,22 +134,13 @@ Each project's current usage is tracked in `{project}/..usage`, which contains a

 ### General instructions

-Gobbler uses filesystem events to watch a "staging directory", a world-writeable directory on the shared filesystem.
-Users submit requests to the Gobbler by simply writing a JSON file with the request parameters inside the staging directory.
+The Gobbler requires a "staging directory", a world-writeable directory on the shared filesystem.
+Users submit requests to the Gobbler by writing a JSON file with the request parameters inside the staging directory.
 Each request file's name should have a prefix of `request-<ACTION>-` where `ACTION` specifies the action to be performed.
-Upon creation of a request file, the Gobbler will parse it and execute the request with the specified parameters.
-
-After completing the request, the Gobbler will write a JSON response to the `responses` subdirectory of the staging directory.
-This has the same name as the initial request file, so users can easily poll the subdirectory for the existence of this file.
-Each response will have at least the `status` property (either `SUCCESS` or `FAILED`).
+Once this file is written, users should perform a POST request to the Gobbler API to trigger execution;
+this will return a JSON response that has at least the `status` property (either `SUCCESS` or `FAILED`).
 For failures, this will be an additional `reason` string property to specify the reason;
-for successes, additional proeprties may be present depending on the request action.
-
-When writing the request file, it is recommended to use the write-and-rename paradigm.
-Specifically, users should write the JSON request body to a file inside the staging directory that does _not_ have the `request-<ACTION>-` prefix.
-Once the write is complete, this file can be renamed to a file with said prefix.
-This ensures that the Gobbler does not read a partially-written file.
-(That said, a direct write to the final file can still be performed, in which case the Gobbler will perform a few retries to avoid errors from parsing an incomplete file.)
+for successes, additional properties may be present depending on the request action.

 ### Creating projects (admin)
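
The response format described under "General instructions" (a `status` that is always present, a `reason` for failures, optional extra properties for successes) can be sketched as a small helper. This is an illustration only: `buildResponse`, `errExample`, and the `version` property are hypothetical and do not come from the Gobbler source.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// buildResponse assembles the JSON body described in the README:
// "status" is always present, "reason" only for failures, and any
// action-specific properties only for successes.
// Hypothetical helper, not taken from the Gobbler source.
func buildResponse(extra map[string]any, err error) map[string]any {
	if err != nil {
		return map[string]any{"status": "FAILED", "reason": err.Error()}
	}
	out := map[string]any{"status": "SUCCESS"}
	for k, v := range extra {
		if k != "status" { // never let an extra property mask the status
			out[k] = v
		}
	}
	return out
}

// errExample stands in for whatever error a request handler returns.
var errExample = errors.New("project \"demo\" already exists")

func main() {
	bodies := []map[string]any{
		buildResponse(map[string]any{"version": "v1"}, nil), // placeholder property
		buildResponse(nil, errExample),
	}
	for _, body := range bodies {
		encoded, err := json.Marshal(body)
		if err != nil {
			fmt.Println("failed to encode response:", err)
			continue
		}
		fmt.Println(string(encoded))
	}
}
```

A client only needs to branch on `status`: read `reason` when it is `FAILED`, and look for action-specific properties otherwise.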

@@ -318,27 +308,30 @@ cd gobbler && go build
 ```

 Then, set up a staging directory with global read/write permissions.
-
-- The staging directory should be on a filesystem supported by the [`fsnotify`](httsp://github.com/fsnotify/fsnotify) package.
-- All parent directories of the staging directory should be at least globally executable.
+All parent directories of the staging directory should be at least globally executable.

 ```sh
 mkdir STAGING
 chmod 777 STAGING
 ```

-Then, set up a registry directory with global read-only permissions.
-
-- The registry and staging directories do not need to be on the same filesystem (e.g., for mounted shares), as long as both are accessible to users.
+Next, set up a registry directory with global read-only permissions.
+Note that the registry and staging directories do not need to be on the same filesystem (e.g., for mounted shares), as long as both are accessible to users.

 ```sh
 mkdir REGISTRY
 chmod 755 REGISTRY
 ```

-The Gobbler can then be started by running the binary with a few arguments, including the UIDs of administrators:
+Finally, start the Gobbler by running the binary with a few arguments, including the UIDs of administrators:

 ```sh
-./gobbler -staging STAGING -registry REGISTRY -admin ADMIN1,ADMIN2
+./gobbler \
+    -staging STAGING \
+    -registry REGISTRY \
+    -admin ADMIN1,ADMIN2 \
+    -port PORT
 ```

+For requests, clients should write to `STAGING` and hit the API at `PORT` (or any equivalent alias).
+All registered files can be read from `REGISTRY`.
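
The client side of this flow (write the request file to `STAGING`, then ping the API) might look like the sketch below. Several details are assumptions rather than documented behaviour: the `/new/<file name>` route, the `create_project` action name, and the need to make the file world-readable are all placeholders, and a local `httptest` server stands in for the Gobbler so that the example runs without a real instance.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
)

// submitRequest writes body as JSON into the staging directory and then
// pings the API so that the Gobbler knows the file is ready.
// The "/new/<file name>" route is a placeholder, not a documented endpoint.
func submitRequest(apiURL, staging, action string, body any) (map[string]any, error) {
	payload, err := json.Marshal(body)
	if err != nil {
		return nil, err
	}

	// CreateTemp replaces "*" with a random string, giving a unique name
	// that carries the required request-<ACTION>- prefix.
	f, err := os.CreateTemp(staging, "request-"+action+"-*")
	if err != nil {
		return nil, err
	}
	// Assumption: the Gobbler runs as another user and needs read access.
	if err := f.Chmod(0o644); err != nil {
		f.Close()
		return nil, err
	}
	if _, err := f.Write(payload); err != nil {
		f.Close()
		return nil, err
	}
	if err := f.Close(); err != nil {
		return nil, err
	}

	// Only the file name is sent; the request details stay on disk.
	resp, err := http.Post(apiURL+"/new/"+filepath.Base(f.Name()), "application/json", nil)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var out map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, fmt.Errorf("failed to parse response (HTTP %d); %w", resp.StatusCode, err)
	}
	return out, nil
}

// demo runs submitRequest against a stand-in server that only checks
// whether the named request file exists in the staging directory.
func demo() (string, error) {
	staging, err := os.MkdirTemp("", "staging-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(staging)

	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		if _, err := os.Stat(filepath.Join(staging, filepath.Base(r.URL.Path))); err != nil {
			w.WriteHeader(http.StatusBadRequest)
			fmt.Fprint(w, `{"status":"FAILED","reason":"no such request file"}`)
			return
		}
		fmt.Fprint(w, `{"status":"SUCCESS"}`)
	}))
	defer srv.Close()

	out, err := submitRequest(srv.URL, staging, "create_project", map[string]string{"project": "demo"})
	if err != nil {
		return "", err
	}
	status, _ := out["status"].(string)
	return status, nil
}

func main() {
	status, err := demo()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("status:", status)
}
```

The file is closed before the POST is sent, so the server never sees a partially written request; this is the property that lets the ping replace the old retry loop.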
15 changes: 8 additions & 7 deletions create.go
@@ -8,6 +8,7 @@ import (
     "encoding/json"
     "strconv"
     "errors"
+    "net/http"
 )

 func createProjectHandler(reqpath string, globals *globalConfiguration) error {
@@ -16,7 +17,7 @@ func createProjectHandler(reqpath string, globals *globalConfiguration) error {
         return fmt.Errorf("failed to find owner of %q; %w", reqpath, err)
     }
     if !isAuthorizedToAdmin(req_user, globals.Administrators) {
-        return fmt.Errorf("user %q is not authorized to create a project", req_user)
+        return newHttpError(http.StatusForbidden, fmt.Errorf("user %q is not authorized to create a project", req_user))
     }

     request := struct {
@@ -27,15 +28,15 @@ func createProjectHandler(reqpath string, globals *globalConfiguration) error {
     // Reading in the request.
     handle, err := os.ReadFile(reqpath)
     if err != nil {
-        return &readRequestError{ Cause: fmt.Errorf("failed to read %q; %w", reqpath, err) }
+        return fmt.Errorf("failed to read %q; %w", reqpath, err)
     }
     err = json.Unmarshal(handle, &request)
     if err != nil {
-        return &readRequestError{ Cause: fmt.Errorf("failed to parse JSON from %q; %w", reqpath, err) }
+        return newHttpError(http.StatusBadRequest, fmt.Errorf("failed to parse JSON from %q; %w", reqpath, err))
     }

     if request.Project == nil {
-        return &readRequestError{ Cause: fmt.Errorf("expected a 'project' property in %q", reqpath) }
+        return newHttpError(http.StatusBadRequest, fmt.Errorf("expected a 'project' property in %q", reqpath))
     }
     project := *(request.Project)

@@ -45,13 +46,13 @@ func createProjectHandler(reqpath string, globals *globalConfiguration) error {
 func createProject(project string, inperms *unsafePermissionsMetadata, req_user string, globals *globalConfiguration) error {
     err := isBadName(project)
     if err != nil {
-        return fmt.Errorf("invalid project name; %w", err)
+        return newHttpError(http.StatusBadRequest, fmt.Errorf("invalid project name; %w", err))
     }

     // Creating a new project from a pre-supplied name.
     project_dir := filepath.Join(globals.Registry, project)
     if _, err = os.Stat(project_dir); !errors.Is(err, os.ErrNotExist) {
-        return fmt.Errorf("project %q already exists", project)
+        return newHttpError(http.StatusBadRequest, fmt.Errorf("project %q already exists", project))
     }

     // No need to lock before MkdirAll, it just no-ops if the directory already exists.
@@ -72,7 +73,7 @@ func createProject(project string, inperms *unsafePermissionsMetadata, req_user
     if inperms != nil && inperms.Uploaders != nil {
         san, err := sanitizeUploaders(inperms.Uploaders)
         if err != nil {
-            return fmt.Errorf("invalid 'permissions.uploaders' in the request details; %w", err)
+            return newHttpError(http.StatusBadRequest, fmt.Errorf("invalid 'permissions.uploaders' in the request details; %w", err))
         }
         perms.Uploaders = san
     } else {
31 changes: 16 additions & 15 deletions delete.go
@@ -7,6 +7,7 @@ import (
     "path/filepath"
     "time"
     "errors"
+    "net/http"
 )

 func deleteProjectHandler(reqpath string, globals *globalConfiguration) error {
@@ -15,7 +16,7 @@ func deleteProjectHandler(reqpath string, globals *globalConfiguration) error {
         return fmt.Errorf("failed to find owner of %q; %w", reqpath, err)
     }
     if !isAuthorizedToAdmin(req_user, globals.Administrators) {
-        return fmt.Errorf("user %q is not authorized to delete a project", req_user)
+        return newHttpError(http.StatusForbidden, fmt.Errorf("user %q is not authorized to delete a project", req_user))
     }

     incoming := struct {
@@ -24,17 +25,17 @@ func deleteProjectHandler(reqpath string, globals *globalConfiguration) error {
     {
         handle, err := os.ReadFile(reqpath)
         if err != nil {
-            return &readRequestError{ Cause: fmt.Errorf("failed to read %q; %w", reqpath, err) }
+            return fmt.Errorf("failed to read %q; %w", reqpath, err)
         }

         err = json.Unmarshal(handle, &incoming)
         if err != nil {
-            return &readRequestError{ Cause: fmt.Errorf("failed to parse JSON from %q; %w", reqpath, err) }
+            return newHttpError(http.StatusBadRequest, fmt.Errorf("failed to parse JSON from %q; %w", reqpath, err))
         }

         err = isMissingOrBadName(incoming.Project)
         if err != nil {
-            return fmt.Errorf("invalid 'project' property in %q; %w", reqpath, err)
+            return newHttpError(http.StatusBadRequest, fmt.Errorf("invalid 'project' property in %q; %w", reqpath, err))
         }
     }

@@ -65,7 +66,7 @@ func deleteAssetHandler(reqpath string, globals *globalConfiguration) error {
         return fmt.Errorf("failed to find owner of %q; %w", reqpath, err)
     }
     if !isAuthorizedToAdmin(req_user, globals.Administrators) {
-        return fmt.Errorf("user %q is not authorized to delete a project", req_user)
+        return newHttpError(http.StatusForbidden, fmt.Errorf("user %q is not authorized to delete a project", req_user))
     }

     incoming := struct {
@@ -75,22 +76,22 @@ func deleteAssetHandler(reqpath string, globals *globalConfiguration) error {
     {
         handle, err := os.ReadFile(reqpath)
         if err != nil {
-            return &readRequestError{ Cause: fmt.Errorf("failed to read %q; %w", reqpath, err) }
+            return fmt.Errorf("failed to read %q; %w", reqpath, err)
         }

         err = json.Unmarshal(handle, &incoming)
         if err != nil {
-            return &readRequestError{ Cause: fmt.Errorf("failed to parse JSON from %q; %w", reqpath, err) }
+            return newHttpError(http.StatusBadRequest, fmt.Errorf("failed to parse JSON from %q; %w", reqpath, err))
         }

         err = isMissingOrBadName(incoming.Project)
         if err != nil {
-            return fmt.Errorf("invalid 'project' property in %q; %w", reqpath, err)
+            return newHttpError(http.StatusBadRequest, fmt.Errorf("invalid 'project' property in %q; %w", reqpath, err))
         }

         err = isMissingOrBadName(incoming.Asset)
         if err != nil {
-            return fmt.Errorf("invalid 'asset' property in %q; %w", reqpath, err)
+            return newHttpError(http.StatusBadRequest, fmt.Errorf("invalid 'asset' property in %q; %w", reqpath, err))
         }
     }

@@ -150,7 +151,7 @@ func deleteVersionHandler(reqpath string, globals *globalConfiguration) error {
         return fmt.Errorf("failed to find owner of %q; %w", reqpath, err)
     }
     if !isAuthorizedToAdmin(req_user, globals.Administrators) {
-        return fmt.Errorf("user %q is not authorized to delete a project", req_user)
+        return newHttpError(http.StatusForbidden, fmt.Errorf("user %q is not authorized to delete a project", req_user))
     }

     incoming := struct {
@@ -161,25 +162,25 @@ func deleteVersionHandler(reqpath string, globals *globalConfiguration) error {
     {
         handle, err := os.ReadFile(reqpath)
         if err != nil {
-            return &readRequestError{ Cause: fmt.Errorf("failed to read %q; %w", reqpath, err) }
+            return fmt.Errorf("failed to read %q; %w", reqpath, err)
         }

         err = json.Unmarshal(handle, &incoming)
         if err != nil {
-            return &readRequestError{ Cause: fmt.Errorf("failed to parse JSON from %q; %w", reqpath, err) }
+            return newHttpError(http.StatusBadRequest, fmt.Errorf("failed to parse JSON from %q; %w", reqpath, err))
         }

         err = isMissingOrBadName(incoming.Project)
         if err != nil {
-            return fmt.Errorf("invalid 'project' property in %q; %w", reqpath, err)
+            return newHttpError(http.StatusBadRequest, fmt.Errorf("invalid 'project' property in %q; %w", reqpath, err))
         }
         err = isMissingOrBadName(incoming.Asset)
         if err != nil {
-            return fmt.Errorf("invalid 'asset' property in %q; %w", reqpath, err)
+            return newHttpError(http.StatusBadRequest, fmt.Errorf("invalid 'asset' property in %q; %w", reqpath, err))
         }
         err = isMissingOrBadName(incoming.Version)
         if err != nil {
-            return fmt.Errorf("invalid 'version' property in %q; %w", reqpath, err)
+            return newHttpError(http.StatusBadRequest, fmt.Errorf("invalid 'version' property in %q; %w", reqpath, err))
         }
     }

7 changes: 1 addition & 6 deletions go.mod
@@ -1,8 +1,3 @@
 module gobbler

-go 1.20
-
-require (
-    github.com/fsnotify/fsnotify v1.7.0 // indirect
-    golang.org/x/sys v0.4.0 // indirect
-)
+go 1.22.1
4 changes: 0 additions & 4 deletions go.sum
@@ -1,4 +0,0 @@
-github.com/fsnotify/fsnotify v1.7.0 h1:8JEhPFa5W2WU7YfeZzPNqzMP6Lwt7L2715Ggo0nosvA=
-github.com/fsnotify/fsnotify v1.7.0/go.mod h1:40Bi/Hjc2AVfZrqy+aj+yEI+/bRxZnMJyTJwOpGvigM=
-golang.org/x/sys v0.4.0 h1:Zr2JFtRQNX3BCZ8YtxRE9hNJYC8J6I1MVbMg6owUp18=
-golang.org/x/sys v0.4.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
11 changes: 6 additions & 5 deletions latest.go
@@ -7,6 +7,7 @@ import (
     "path/filepath"
     "time"
     "strings"
+    "net/http"
 )

 type latestMetadata struct {
@@ -90,7 +91,7 @@ func refreshLatestHandler(reqpath string, globals *globalConfiguration) (*latest
     }

     if !isAuthorizedToAdmin(source_user, globals.Administrators) {
-        return nil, fmt.Errorf("user %q is not authorized to refresh the latest version (%q)", source_user, reqpath)
+        return nil, newHttpError(http.StatusForbidden, fmt.Errorf("user %q is not authorized to refresh the latest version (%q)", source_user, reqpath))
     }

     incoming := struct {
@@ -100,22 +101,22 @@ func refreshLatestHandler(reqpath string, globals *globalConfiguration) (*latest
     {
         handle, err := os.ReadFile(reqpath)
         if err != nil {
-            return nil, &readRequestError{ fmt.Errorf("failed to read %q; %w", reqpath, err) }
+            return nil, fmt.Errorf("failed to read %q; %w", reqpath, err)
         }

         err = json.Unmarshal(handle, &incoming)
         if err != nil {
-            return nil, &readRequestError{ fmt.Errorf("failed to parse JSON from %q; %w", reqpath, err) }
+            return nil, newHttpError(http.StatusBadRequest, fmt.Errorf("failed to parse JSON from %q; %w", reqpath, err))
         }

         err = isMissingOrBadName(incoming.Project)
         if err != nil {
-            return nil, fmt.Errorf("invalid 'project' property in %q; %w", reqpath, err)
+            return nil, newHttpError(http.StatusBadRequest, fmt.Errorf("invalid 'project' property in %q; %w", reqpath, err))
         }

         err = isMissingOrBadName(incoming.Asset)
         if err != nil {
-            return nil, fmt.Errorf("invalid 'asset' property in %q; %w", reqpath, err)
+            return nil, newHttpError(http.StatusBadRequest, fmt.Errorf("invalid 'asset' property in %q; %w", reqpath, err))
         }
     }
