Skip to content

Latest commit

 

History

History
326 lines (235 loc) · 11.8 KB

create.md

File metadata and controls

326 lines (235 loc) · 11.8 KB

Guide to Creating a tiled Server

There are a few steps to create a tiled server for bluesky data.

Jan Ilavsky has created an article about reading data from such a server.

CONTENTS

Download the template

The download steps must be done on a workstation that can reach the public network. Use the same account that will be used to run the tiled server.

The tiled server should run on a workstation that has access to the controls subnet and any relevant filesystems with data to be served.

cd your/projects/directory
git clone https://github.com/BCDA-APS/tiled-template ./tiled-server
cd ./tiled-server

TODO: What about changing the cloned repo origin?

Create conda environment

conda env create --force -n tiled -f environment.yml --solver=libmamba
conda activate tiled

This may install a few hundred packages, including databroker v2+.

Might seem slow...

In a networked scenario like the APS, with many filesystems provided by NFS exports and file backup & cache automation, processes that write many files to NFS filesystems (such as creating a conda environment) may be very slow. It could take 5-10 minutes to create this conda environment. Compare with the procedures for creating a conda environment for bluesky operations. Many of the same advisories apply here, too.

Configure

config.yml

Create your tiled configuration file from the template provided.

cp config.yml.template config.yml

Keep in mind, YAML, like Python uses indentation as syntax.

databroker catalogs

Edit config.yml for your databroker catalog information:

  • path: name of this catalog (use this name from your bluesky sessions); can be found in:
    • bluesky/instrument/iconfig.yml
    • catalog name is at the end of line ~8: DATABROKER_CATALOG: &databroker_catalog some_catalog_name
  • uri : address of your MongoDB catalog; in mongodb://DB_SERVER.xray.aps.anl.gov:27017/45id_instrument-bluesky replace:
    • DB_SERVER with db_host_name (can be found in 2nd column of APS list table)
    • 45id_instrument with catalog_name
  • In line 4 http://SERVER.xray.aps.anl.gov:8020/:
    • replace SERVER with the host name (computer running the tiled server)
    • make sure the port number is consistent with the ./start-tiled.sh script

WARNING: consider whether you want this information publicly available or not (i.e. host tiled repo on aps gitlab or github)

Repeat this block if you have more than one catalog to be served (such as retired catalogs). A comment section of the template shows how to add addtional catalogs.

Sharp-eyed observers will note that the databroker configuration details specified for tiled are different than the ones they have been using with databroker v1.2. The config for tiled uses the same info but in databroker v2 format.

databroker v1.2 format

  example:
    args:
      asset_registry_db: mongodb://mymongoserver.localdomain:27017/example
      metadatastore_db: mongodb://mymongoserver.localdomain:27017/example
    driver: bluesky-mongo-normalized-catalog

same content in tiled format

  - path: example
    tree: databroker.mongo_normalized:Tree.from_uri
    args:
      uri: mongodb://mymongoserver.localdomain:27017/example

Why the change? databroker is moving away from the intake library (the one that reads the v1.2 format). intake seems to be slow to load (you see that when importing databroker v1.2). New databroker v2 does not use intake. And is faster to import. (Other improvements under the hood.)

EPICS Area Detector data files

If your files are written by EPICS area detector during bluesky runs, you do not need to add file directories to your tiled server configuration if these conditions are met:

  • Lightweight references to the file(s) and image(s) were written in databroker (standard ophyd practice).
  • Referenced files are available to the tiled server when their data is requested by a client of the tiled server.
Missing files...

If a client requests data that comes from a referenced file and that file is not available at the time of the request, the tiled server will return a 500 Internal Server Error to the client. For security reasons, a more detailed answer is not provided to the tiled client. The tiled server console will usually provide the detail that the file could not be found.

local data files

Skip this section if you are just getting started.

If you want tiled to serve data files, the config file becomes longer. The config.yml file has examples. Each file directory tree (including all its subdirectories) is a separate entry in the config.yml file and a separate SQLite file.

Note: Very likely that details are missing in this section. Ask for help or create an issue.

Steps to add a directory tree

Note: This is documentation is preliminary.

For each directory tree, these steps:

  1. Identify a data file directory tree to be served by tiled.

    1. Create a new block in config.yml for the tree.
    2. Assign a name (like a catalog name) to identify the directory tree.
  2. Recognize files by mimetype.

    1. Prepare Python code that recognizes new file types and assigns mimetype to each.
      1. Recognized by common file extension (such as .mda or .xml).
      2. Recognized by content analysis (such as NeXus, SPEC, or XML).
    2. Prepare Python tiled adapter code for each new mimetype.
    3. Add line(s) for each new mimetype to config.yml.
  3. Create an SQLite catalog for the directory tree.

    1. Shell script recreate_sampler.sh
      1. SQL_CATALOG=dev_sampler.sql: name of SQLite file to be (re)created
      2. FILE_DIR=./dev_sampler : directory to be served
    2. Example (hypothetical) local directory
      1. Directory: ./dev_sampler (does not exist in template here)
      2. Contains these types of file: MDA, NeXus, SPEC, images, XML, HDF4, text
  4. Add SQLite file details to config.yml file:

    args:
        uri: ./dev_sampler.sql
        readable_storage:
        - ./dev_sampler
Details

You specify data files by providing their directory (which includes all subdirectories within).

Files are recognized by mimetype. The configuration template has several examples. Here is an example for a SPEC data file:

    text/x-spec_data: spec_data:read_spec_data

The mimetype is text/x-spec_data. The adapter is the read_spec_data() function in file spec_data.py (in the same directory as the config.yml).

Custom mimetypes, such as text/x-spec_data are assigned in function detect_mimetype() (in local file custom.py). This code identifies SPEC, NeXus, and (non-NeXus) HDF5 files.

Well-known file types, such as JPEG, TIFF, PNG, plain text, are recognized by library functions called by the tiled server library code.

For the SQLite file (at least at APS beamlines), keep in mind that NFS file access is noticeably slower than local file access. It is recommended to store the SQLite file on a local filesystem for the tiled server.

Run the server

A bash shell script is available to run your tiled server. Take note of two important environment variables:

  • HOST: What client IP numbers will this server respond to? If 0.0.0.0, the server will respond to clients from any IP number. If 127.0.0.1, the server will only respond to clients on this workstation (localhost).
  • PORT: What port will this server listen to? Your choice here. The default choice here is arbitrary yet advised. Port 8000 is common but may be used by some other local web server software. We choose port 8020 to avoid this possibility.

Once the config.yml and start-tiled.sh (and any configured SQLite) files are prepared, start the tiled server for testing:

$ ./start-tiled.sh

Here is the output from my tiled server as it starts:

Using configuration from /home/beams1/JEMIAN/Documents/projects/BCDA-APS/tiled-template/config.yml

    Tiled server is running in "public" mode, permitting open, anonymous access
    for reading. Any data that is not specifically controlled with an access
    policy will be visible to anyone who can connect to this server.


    Navigate a web browser or connect a Tiled client to:

    http://0.0.0.0:8020?api_key=d8edc247909a0246b4e2dd8ca8d75443f87f2c5facd627b703d6635284e2f2fc


    Because this server is public, the '?api_key=...' portion of
    the URL is needed only for _writing_ data (if applicable).


INFO:     Started server process [2033851]
INFO:     Waiting for application startup.
OBJECT CACHE: Will use up to 1_190_568_960 bytes (15% of total physical RAM)
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8020 (Press CTRL+C to quit)

Note: In this example, the api_key is randomly chosen by the server as it starts. With this option for tiled server startup, a new key is generated each time. A local installation should make a different choice, to provide its own key to allow authorized clients to write data (as bluesky documents from a RunEngine subscription).

Enter the server URL (above: http://0.0.0.0:8020, not https) in a web browser to test the server responds. Observe the server's console output each time the web browser makes a new request.

Press ^C to quit the server.

Enable auto (re)start

A bash shell script is available to help you manage the tiled server. It runs the tiled server in a screen sessions (so the server does not quit when you logout). The help command shows the commands available:

$ ./tiled-manage.sh help
Usage: ./tiled-manage.sh {start|stop|restart|checkup|status}

For example, this linux command shows the server status on my workstation:

./tiled-manage.sh status
# [2023-12-08T11:06:36-06:00 ./tiled-manage.sh] running fine, so it seems

Launch the server (for regular use):

./tiled-manage.sh start

The checkup command may be used to (re)start the server. For example, to enable automatic (re)start, add this line to your linux cron tasks.

*/5 * * * * /full/path/to/your/tiled-server/tiled-manage.sh checkup 2>&1 > /dev/null

Linux command crontab -e will open an editor where you can paste this line. The tiled-manage.sh checkup task will run every 5 minutes (9:10, 9:15, 9:20, ...). Within 5 minutes of a workstation reboot, the tiled server will be started.

Clients

Enter the server URL (above: http://0.0.0.0:8020, not https) in a web browser to test the server responds. Observe the server's console output each time the web browser makes a new request.

You can use a web browser or find it more convenient to develop your own code that makes requests using either URIs or Python tiled.client calls.

Gemviz, a Python Qt5 GUI program, is being developed to browse and visualize data from your databroker catalogs.