Restructure doc
mraspaud committed Apr 30, 2024
1 parent 38c820c commit 2fc72b6
Showing 6 changed files with 131 additions and 130 deletions.
27 changes: 27 additions & 0 deletions docs/source/backends.rst
@@ -0,0 +1,27 @@
Available backends
==================

Local watcher
-------------
.. automodule:: pytroll_watchers.local_watcher
   :members:

Minio bucket notification watcher
---------------------------------
.. automodule:: pytroll_watchers.minio_notification_watcher
   :members:

Copernicus dataspace watcher
----------------------------
.. automodule:: pytroll_watchers.dataspace_watcher
   :members:

EUMETSAT datastore watcher
--------------------------
.. automodule:: pytroll_watchers.datastore_watcher
   :members:

DHuS watcher
------------
.. automodule:: pytroll_watchers.dhus_watcher
   :members:
21 changes: 21 additions & 0 deletions docs/source/cli.rst
@@ -0,0 +1,21 @@
CLI
***

The command-line tool can be used by invoking `pytroll-watcher <config-file>`. An example config-file can be::

  backend: minio
  fs_config:
    endpoint_url: my_endpoint.pytroll.org
    bucket_name: satellite-data-viirs
    storage_options:
      profile: profile_for_credentials
  publisher_config:
    name: viirs_watcher
  message_config:
    subject: /segment/viirs/l1b/
    atype: file
    data:
      sensor: viirs
    aliases:
      platform_name:
        npp: Suomi-NPP
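
To illustrate the ``aliases`` section of the config above, here is a minimal sketch of how such a table could be applied to message metadata. The function name ``apply_aliases_sketch`` and the lookup logic are assumptions made for this example; they are not taken from the pytroll-watchers code.

```python
def apply_aliases_sketch(aliases, metadata):
    """Return a copy of metadata with aliased values replaced.

    Hypothetical helper: ``aliases`` maps a metadata key to a table of
    {original value: replacement value}, mirroring the config layout.
    """
    result = {}
    for key, value in metadata.items():
        table = aliases.get(key, {})
        # Fall back to the original value when no alias is defined for it.
        result[key] = table.get(value, value)
    return result


aliases = {"platform_name": {"npp": "Suomi-NPP"}}
metadata = {"platform_name": "npp", "sensor": "viirs"}
renamed = apply_aliases_sketch(aliases, metadata)
```

Under these assumptions, ``platform_name`` is rewritten to ``Suomi-NPP`` while ``sensor`` is left untouched.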
134 changes: 5 additions & 129 deletions docs/source/index.rst
@@ -10,137 +10,13 @@ Welcome to pytroll-watchers's documentation!
   :maxdepth: 2
   :caption: Contents:

Pytroll-watcher is a library and command-line tool to detect changes on a local or remote file system.

At the moment we support local filesystems and Minio S3 buckets through bucket notifications.

CLI
***

The command-line tool can be used by invoking `pytroll-watcher <config-file>`. An example config-file can be::

  backend: minio
  fs_config:
    endpoint_url: my_endpoint.pytroll.org
    bucket_name: satellite-data-viirs
    storage_options:
      profile: profile_for_credentials
  publisher_config:
    name: viirs_watcher
  message_config:
    subject: /segment/viirs/l1b/
    atype: file
    data:
      sensor: viirs
    aliases:
      platform_name:
        npp: Suomi-NPP

Published messages
******************

The published messages contain the information needed to access the advertised resource. The following parameters are
present in each message.

uid
---

This is the unique identifier for the resource. In general, it is the basename for the file/objects, since we assume
that two files with the same name will have the same content. In some cases it can include the containing directory.

Examples of uids:

- `SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5`
- `S3B_OL_1_EFR____20240415T074029_20240415T074329_20240415T094236_0179_092_035_1620_PS2_O_NR_003.SEN3/Oa02_radiances.nc`

uri
---

This is the URI that can be used to access the resource. The URI can be composed as fsspec allows for more complex cases.

Examples of uris:

- `s3://viirs-data/sdr/SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5`
- `zip://sdr/SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5::s3://viirs-data/viirs_sdr_npp_d20240408_t1006227_e1007469_b64498.zip`
- `https://someplace.com/files/S3B_OL_1_EFR____20240415T074029_20240415T074329_20240415T094236_0179_092_035_1620_PS2_O_NR_003.SEN3/Oa02_radiances.nc`

filesystem
----------

Sometimes the URI is not enough to gain access to the resource, for example when the hosting service requires
authentication. This is why pytroll-watchers will also provide the filesystem and the path items. The filesystem
parameter is the fsspec JSON representation of the filesystem. This can be used on the recipient side using e.g.::

  fsspec.AbstractFileSystem.from_json(json.dumps(fs_info))

where `fs_info` is the content of the filesystem parameter.

To pass authentication parameters to the filesystem, use the `storage_options` configuration item.


Example of filesystem:

- `{"cls": "s3fs.core.S3FileSystem", "protocol": "s3", "args": [], "profile": "someprofile"}`

.. warning::

   Pytroll-watchers tries to prevent publishing of sensitive information such as passwords and secret keys, and will
   raise an error in most cases when this is done. However, always double-check your pytroll-watchers configuration so
   that secrets are not passed to the library to start with.
   Solutions include ssh-agent for ssh-based filesystems and storing credentials in .aws config files for s3
   filesystems. For http-based filesystems implemented in pytroll-watchers, the username and password are used to
   generate a token prior to publishing, and will thus not be published.

path
----

This parameter is the companion to `filesystem` and gives the path to the resource within the filesystem.

Examples of paths:

- `/viirs-data/sdr/SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5`
- `/sdr/SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5`
- `/files/S3B_OL_1_EFR____20240415T074029_20240415T074329_20240415T094236_0179_092_035_1620_PS2_O_NR_003.SEN3/Oa02_radiances.nc`


API
***

Main interface
--------------
.. automodule:: pytroll_watchers
   :members:

Local watcher
-------------
.. automodule:: pytroll_watchers.local_watcher
   :members:

Minio bucket notification watcher
---------------------------------
.. automodule:: pytroll_watchers.minio_notification_watcher
   :members:

Copernicus dataspace watcher
----------------------------
.. automodule:: pytroll_watchers.dataspace_watcher
   :members:

EUMETSAT datastore watcher
--------------------------
.. automodule:: pytroll_watchers.datastore_watcher
   :members:

DHuS watcher
------------
.. automodule:: pytroll_watchers.dhus_watcher
   :members:
   cli
   published
   backends
   other_api

Pytroll-watcher is a library and command-line tool to detect changes on a local or remote file system.

Testing utilities
-----------------
.. automodule:: pytroll_watchers.testing
   :members:

Indices and tables
==================
12 changes: 12 additions & 0 deletions docs/source/other_api.rst
@@ -0,0 +1,12 @@
Common API
**********

Main interface
--------------
.. automodule:: pytroll_watchers
   :members:

Testing utilities
-----------------
.. automodule:: pytroll_watchers.testing
   :members:
65 changes: 65 additions & 0 deletions docs/source/published.rst
@@ -0,0 +1,65 @@
Published messages
******************

The published messages contain the information needed to access the advertised resource. The following parameters are
present in each message.

uid
---

This is the unique identifier for the resource. In general, it is the basename for the file/objects, since we assume
that two files with the same name will have the same content. In some cases it can include the containing directory.

Examples of uids:

- `SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5`
- `S3B_OL_1_EFR____20240415T074029_20240415T074329_20240415T094236_0179_092_035_1620_PS2_O_NR_003.SEN3/Oa02_radiances.nc`
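
As a rough illustration of the two uid styles listed above, the following sketch derives a uid from a slash-separated path, optionally keeping the containing directory. The helper name ``uid_sketch`` and its ``include_parent`` flag are hypothetical and only serve this example; the library may build uids differently.

```python
import posixpath


def uid_sketch(path, include_parent=False):
    """Derive a uid from a slash-separated path (hypothetical helper)."""
    parent, name = posixpath.split(path.rstrip("/"))
    if include_parent:
        # Keep the containing directory, e.g. for files inside a .SEN3 folder.
        return posixpath.join(posixpath.basename(parent), name)
    return name


plain_uid = uid_sketch(
    "/viirs-data/sdr/SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5"
)
nested_uid = uid_sketch(
    "/files/S3B_OL_1_EFR____20240415T074029_20240415T074329_20240415T094236_0179_092_035_1620_PS2_O_NR_003.SEN3/Oa02_radiances.nc",
    include_parent=True,
)
```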

uri
---

This is the URI that can be used to access the resource. The URI can be composed as fsspec allows for more complex cases.

Examples of uris:

- `s3://viirs-data/sdr/SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5`
- `zip://sdr/SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5::s3://viirs-data/viirs_sdr_npp_d20240408_t1006227_e1007469_b64498.zip`
- `https://someplace.com/files/S3B_OL_1_EFR____20240415T074029_20240415T074329_20240415T094236_0179_092_035_1620_PS2_O_NR_003.SEN3/Oa02_radiances.nc`
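
The composed form in the second example uses ``::`` to chain filesystems, as fsspec does. The sketch below splits such a URI into its links using plain string handling; it is a simplified illustration, and ``split_chained_uri`` is a hypothetical name, not an fsspec or pytroll-watchers function. fsspec's own parsing handles more cases than this.

```python
def split_chained_uri(uri):
    """Split a chained URI into (protocol, path) pairs, outermost first."""
    links = []
    for link in uri.split("::"):
        # Each link looks like "<protocol>://<path>".
        protocol, _, path = link.partition("://")
        links.append((protocol, path))
    return links


links = split_chained_uri(
    "zip://sdr/SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5"
    "::s3://viirs-data/viirs_sdr_npp_d20240408_t1006227_e1007469_b64498.zip"
)
```

Here the first link names the member inside the zip archive and the second link names where the archive itself is stored.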

filesystem
----------

Sometimes the URI is not enough to gain access to the resource, for example when the hosting service requires
authentication. This is why pytroll-watchers will also provide the filesystem and the path items. The filesystem
parameter is the fsspec JSON representation of the filesystem. This can be used on the recipient side using e.g.::

  fsspec.AbstractFileSystem.from_json(json.dumps(fs_info))

where `fs_info` is the content of the filesystem parameter.

To pass authentication parameters to the filesystem, use the `storage_options` configuration item.


Example of filesystem:

- `{"cls": "s3fs.core.S3FileSystem", "protocol": "s3", "args": [], "profile": "someprofile"}`

.. warning::

   Pytroll-watchers tries to prevent publishing of sensitive information such as passwords and secret keys, and will
   raise an error in most cases when this is done. However, always double-check your pytroll-watchers configuration so
   that secrets are not passed to the library to start with.
   Solutions include ssh-agent for ssh-based filesystems and storing credentials in .aws config files for s3
   filesystems. For http-based filesystems implemented in pytroll-watchers, the username and password are used to
   generate a token prior to publishing, and will thus not be published.
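
For double-checking a filesystem description before it goes out, a small scan like the one below may help. It is only a sketch: the list of suspicious words and the function ``find_suspicious_keys`` are assumptions made for this example and do not reflect how pytroll-watchers performs its own check.

```python
# Words that hint at a secret; assumed for illustration only.
SUSPICIOUS_WORDS = ("password", "secret")


def find_suspicious_keys(fs_info):
    """Return dotted names of keys in fs_info that look like secrets."""
    found = []

    def visit(node, prefix=""):
        if isinstance(node, dict):
            for key, value in node.items():
                name = f"{prefix}{key}"
                if any(word in str(key).lower() for word in SUSPICIOUS_WORDS):
                    found.append(name)
                visit(value, name + ".")
        elif isinstance(node, list):
            for index, item in enumerate(node):
                visit(item, f"{prefix}{index}.")

    visit(fs_info)
    return found


safe = {"cls": "s3fs.core.S3FileSystem", "protocol": "s3", "args": [], "profile": "someprofile"}
unsafe = dict(safe, secret="not-for-publishing")
safe_hits = find_suspicious_keys(safe)
unsafe_hits = find_suspicious_keys(unsafe)
```

A profile-based description such as ``safe`` yields no hits, whereas ``unsafe`` is flagged because it carries a ``secret`` entry.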

path
----

This parameter is the companion to `filesystem` and gives the path to the resource within the filesystem.

Examples of paths:

- `/viirs-data/sdr/SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5`
- `/sdr/SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5`
- `/files/S3B_OL_1_EFR____20240415T074029_20240415T074329_20240415T094236_0179_092_035_1620_PS2_O_NR_003.SEN3/Oa02_radiances.nc`
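
Putting the four parameters together, the sketch below assembles them for the first example and shows how a recipient might use them. The flat dictionary layout is an assumption for illustration; the fsspec calls are shown as comments because they need fsspec and s3fs installed, plus valid credentials.

```python
import json
import posixpath

# Assumed layout: the four parameters side by side in the message data.
message_data = {
    "uid": "SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5",
    "uri": "s3://viirs-data/sdr/SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5",
    "filesystem": {"cls": "s3fs.core.S3FileSystem", "protocol": "s3", "args": [], "profile": "someprofile"},
    "path": "/viirs-data/sdr/SVM13_npp_d20240408_t1006227_e1007469_b64498_c20240408102334392250_cspp_dev.h5",
}

# The filesystem entry is serialised back to a JSON string before being handed to fsspec.
fs_json = json.dumps(message_data["filesystem"])

# With fsspec and s3fs available, the recipient could then open the resource:
#   fs = fsspec.AbstractFileSystem.from_json(fs_json)
#   with fs.open(message_data["path"]) as fobj:
#       payload = fobj.read()

# In this example the uid is simply the basename of the path.
uid_matches_path = posixpath.basename(message_data["path"]) == message_data["uid"]
```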
2 changes: 1 addition & 1 deletion tests/test_datastore.py
@@ -102,7 +102,7 @@ def test_datastore_file_generator(tmp_path, search_params):
expected_token = "eceba4e1-95e6-3526-8c42-c3c9dc14ff5c" # noqa

assert len(features) == 5
path, mda = features[0]
path, _ = features[0]
assert str(path) == "https://api.eumetsat.int/data/download/1.0.0/collections/EO%3AEUM%3ADAT%3A0407/products/S3B_OL_2_WFR____20240416T104217_20240416T104517_20240417T182315_0180_092_051_1980_MAR_O_NT_003.SEN3"
assert expected_token in path.storage_options["client_kwargs"]["headers"]["Authorization"]
