Default/shared filesystem paths #3956

Open
philrz opened this issue Jun 29, 2022 · 6 comments · Fixed by #4758

Comments

@philrz
Contributor

philrz commented Jun 29, 2022

The Problem

The Zed/Zui/Brimcap tools as they've existed to date have caused user confusion because they rely on a combination of default paths and/or environment variables to locate their configuration, state, and storage data. Some examples:

  • After a user has installed/launched the Zui app, an app-managed Zed lake is stored in a platform-specific directory (paths described here). If they later install/run standalone Zed CLI commands, they would need to have a ZED_LAKE environment variable pointed at this same path or localhost network endpoint if they want their app and CLI experiences to converge on the same lake.

  • The same problem occurs in reverse if the user has first run the Zed CLI tools to create/query a Zed lake, then they install the Zui app: They're likely to end up with two wholly separate lakes.

  • A similar set of problems also exists for config state, such as the cached record of the most recent pool/branch selection made with zed use.

  • In order to manage its Zed lake, the Zui app has always shipped with its own Zed binaries that live in yet another platform-specific directory (paths described here). If the user is unaware of the presence/location of these embedded binaries, they may install redundant Zed CLI commands, which wastes storage and has the potential to create versioning issues if the "a la carte" CLI commands are out-of-sync with the lake storage format as of the app's ship date.

  • Once again in reverse, if the user already had the Zed CLI tools installed and in their PATH, they are exposed to a similar problem if they later install the Zui app: Redundant binaries and possible versioning problems.

  • A similar set of problems exists for the "Brimcap root", as the Zui app traditionally ships with an embedded brimcap binary and manages a packet index in its own "Brimcap root", but the Brimcap tools could also be installed/run "a la carte" outside the app and potentially point to the same or a different Brimcap root (see brimcap#154, "Brimcap & Zui could share a default brimcap root"). For simplicity, the remainder of this issue will speak only of the Zed lake example, with the assumption that a similar approach could be applied to solving the problem for Brimcap if we choose.

Goals

In brainstorming solutions within the Dev team, there's been general consensus around a few goals.

  1. No tool/app should require users to set environment variables before storing/querying data. By default, different tools that use a Zed lake should find/leverage the same lake in well-known locations.

  2. Zed CLI commands should no longer default to attempting to find a local lake on a localhost service endpoint, but rather should rely on default filesystem paths.

  3. Users should not need to reference detailed documents to find storage/config files on their filesystem, and these paths should be platform-independent whenever possible. One example would be short pathnames relative to a user's "home directory".

  4. Customized overrides to these paths (such as via environment variables for CLI tooling or prompts in a GUI installer for an app) should be possible, but most users are expected to stick with defaults, so the majority of effort should focus on sensible default behavior.

  5. If choosing a convention to follow, behavior can lean more toward behaving like developer tools (e.g., git) rather than desktop apps (e.g., a photo editor).

  6. To allow tools to safely use shared storage/config and binaries, Zed versioning conventions should be fully implemented and strictly adhered to as part of addressing this problem. This will allow notification of users when upgrades are necessary.
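
As a rough illustration of the distinction between a service endpoint and a filesystem path that the goals above rely on, here is a minimal Python sketch. The function name classify_lake_location and the rule "only http/https counts as a service" are assumptions made for this example; they are not taken from the Zed code.

```python
from urllib.parse import urlsplit


def classify_lake_location(value: str) -> str:
    """Classify a lake location string as "service" or "path".

    Hypothetical helper: only http/https URLs are treated as service
    endpoints, so a Windows drive letter such as "M:" is not mistaken
    for a URL scheme.
    """
    scheme = urlsplit(value).scheme.lower()
    return "service" if scheme in ("http", "https") else "path"
```

With a rule like this, a tool could accept either kind of value in an override such as ZED_LAKE while still defaulting to a filesystem path when nothing is set.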

Precedent

In seeking inspiration to solve this problem in a way that may click with users, the XDG specification is worthy of study. In brief, it attempts to solve this problem for Linux desktop apps similarly to how it's believed to have been addressed for macOS (Apple's docs, and XDG comparison) and Windows.

It's worthy of note that, probably due to its use of Electron, the Zui app has ended up following at least some of the XDG convention on Linux, since running the app results in the creation of a $HOME/.config/Zui path, which could also be described as $XDG_CONFIG_HOME/Zui.

$ ls -l ~/.config/Zui
total 68
-rw-rw-r-- 1 phil phil  1781 Aug 22 17:06  appState.json
drwx------ 3 phil phil  4096 Aug 22 17:06  blob_storage
drwx------ 3 phil phil  4096 Aug 22 17:06  Cache
drwx------ 4 phil phil  4096 Aug 22 17:06 'Code Cache'
-rw------- 1 phil phil 20480 Aug 22 17:06  Cookies
-rw------- 1 phil phil     0 Aug 22 17:06  Cookies-journal
drwx------ 2 phil phil  4096 Aug 22 17:06  Crashpad
drwx------ 2 phil phil  4096 Aug 22 17:06  DawnCache
drwx------ 2 phil phil  4096 Aug 22 17:06  Dictionaries
-rw-rw-r-- 1 phil phil     0 Aug 22 17:06  first-run
drwx------ 2 phil phil  4096 Aug 22 17:06  GPUCache
drwxr-xr-x 4 phil phil  4096 Aug 22 17:06  lake
drwx------ 3 phil phil  4096 Aug 22 17:06 'Local Storage'
drwxrwxr-x 2 phil phil  4096 Aug 22 17:06  logs
drwxrwxr-x 4 phil phil  4096 Aug 22 17:06  plugins
lrwxrwxrwx 1 phil phil    20 Aug 22 17:06  SingletonCookie -> 15310173946632788756
lrwxrwxrwx 1 phil phil    20 Aug 22 17:06  SingletonLock -> phil-VirtualBox-3507
lrwxrwxrwx 1 phil phil    37 Aug 22 17:06  SingletonSocket -> /tmp/scoped_dir0Zwzpo/SingletonSocket

That said, this does not wholly match what XDG advocates: the specification calls for a separate $XDG_DATA_HOME for data files, so Zui's placement of the lake/ directory below ~/.config goes against convention.
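
To make the config/data split concrete, here is a minimal Python sketch of how the two XDG base directories are resolved, following the spec's stated defaults ($HOME/.config and $HOME/.local/share) and its rule that unset, empty, or relative values are ignored. The function name xdg_dirs and its parameters are illustrative only.

```python
import os
from pathlib import Path


def xdg_dirs(env=None, home=None):
    """Return the XDG config and data base directories.

    Illustrative helper: env and home are injectable so the lookup can
    be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    def base(var, default):
        value = env.get(var, "")
        # The spec says unset, empty, or relative values must be ignored
        # in favor of the default below the home directory.
        if value and os.path.isabs(value):
            return Path(value)
        return home / default

    return {
        "config": base("XDG_CONFIG_HOME", ".config"),
        "data": base("XDG_DATA_HOME", ".local/share"),
    }
```

Under this convention, app settings would sit below the "config" directory while lake storage would sit below the "data" directory rather than under ~/.config.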

Once again, XDG only seeks to solve this problem for Linux. While the links provided above show developers following their instincts to try to apply the same approach on macOS/Windows (and hence leverage a universal cross-platform convention), the consensus response seems to point users back at the traditional Apple/Microsoft-provided conventions, which leads back to the long/ugly paths we wanted to avoid in our 3rd goal above. That said, if we apply our 5th goal from above about leaning toward developer-centric behavior, some precedent appears. One example is Git's global .gitconfig, which is assumed to live below the user's home directory even on macOS and Windows (though note the Git docs also mention ~/.config/git/config as an alternative, as if making room for the XDG-style behavior). Another is NPM configstore, which similarly relies on "$XDG_CONFIG_HOME or ~/.config", hence a .config path below the user's home directory on both macOS and Windows.

Note a "home directory" is only assumed to be defined in a $HOME env var on Linux and macOS, whereas the Windows equivalent is %USERPROFILE%.
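
A minimal Python sketch of that per-platform lookup follows. The helper home_dir is hypothetical and exists only to show which variable is consulted on which platform; Python's own pathlib.Path.home() already performs a comparable lookup.

```python
import os
import sys


def home_dir(env=None, platform=None):
    """Return the home directory from the platform's usual variable.

    Illustrative helper: uses USERPROFILE on Windows and HOME elsewhere
    (Linux, macOS). Returns None if the variable is not set.
    """
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    # sys.platform is "win32" on Windows, "linux" on Linux, "darwin" on macOS.
    var = "USERPROFILE" if platform.startswith("win") else "HOME"
    return env.get(var)
```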

@mattnibs
Collaborator

mattnibs commented Jul 1, 2022

@philrz I'm not an XDG expert but I've seen other projects following XDG place config in $HOME/.config and data in $HOME/.local/share.

@philrz
Contributor Author

philrz commented Jul 5, 2022

@mattnibs: Yes, what you describe is what I saw in the spec, so it makes sense that you've seen other projects do the same.  I was trying to highlight two things in the examples I cited:

  1. The fact that Zui has always ended up putting config under ~/.config happens to line up with XDG convention, but the fact Zui puts data storage below there does not.
  2. The way I saw Git or NPM configstore use ~/.config on macOS and Windows happens to line up with XDG convention, even though XDG is only described as targeted at Linux.

@philrz
Contributor Author

philrz commented Jul 25, 2022

A community user @cn-fairy recently mentioned in a separate issue brimdata/zui#2371 the desire to customize the installation path, which is separate from the config/storage paths described above, but perhaps this should be rolled into the same effort. In their original wording:

Windows platform:
Hope to customize the installation path or provide a portable installation package, easy to put on the U disk can be used anywhere (currently the installed files moved to other paths can be used, no problems found at the moment)

@philrz changed the title from "Default/shared filesystem paths for config & storage" to "Default/shared filesystem paths" on Jul 25, 2022
@philrz
Contributor Author

philrz commented Dec 6, 2022

Another community user asked in a Slack thread about customizing the location for lake storage.

I have two drives on my PC. One is C:\ (OS) and the other is M:\ (storage). Is there a config file or other means to point Zui to a lake on the M:\ drive, outside of the default C:\Users\me\AppData\Roaming\Zui - Insiders? The only way I know to do this at the moment is with zed serve -lake <filepath>. Trying to see if there's a way to do this without having to start the zed service.

I was able to confirm an effective one-off workaround by creating a directory junction. However, a more direct/documented approach would surely have helped.

@philrz
Contributor Author

philrz commented Apr 22, 2023

Another community Zui user asked in brimdata/zui#2753 about wanting to make some customizations to where their lake data is stored. In their own words:

Changing the storage location of pools

In some materials where the data is very large, for example 20-30 GB, currently all data is stored in the following folder.
The problem of lack of hard disk space occurs
C:\Users\*\AppData\Roaming\Zui\lake

Is it possible to store lake data in another location? Possibility of optional selection for the storage location of each pool

For now I've pointed them at the same "directory junction" trick cited in the previous comment. Note that they implied a desire to maybe change storage location on a pool-by-pool basis rather than just whole lake.

@philrz
Contributor Author

philrz commented Aug 23, 2023

@mccanne recently once again confronted the aspects of this issue related to lake storage. Specifically, he already had some data he'd loaded into Zed lakes using the zed CLI commands, then he started Zui in Dev mode hoping that it would automatically find his existing lake data but saw that it did not. He was aware he could start zed serve -lake ... & specify his pre-existing storage path before starting Zui, but this is not his preference.

During group brainstorming on the topic, @mattnibs proposed & established consensus on an approach he's seen in other tools: zed serve could look for lakes in an ordered set of storage locations, including (but not necessarily limited to) the XDG & common OS-convention paths such as those described above in this issue. Since Zui already starts a zed serve process, this would have made @mccanne's use case work as expected as long as the CLI tooling/docs had steered him toward one of the storage locations that the zed serve started behind Zui would have found. For the case of a new user that has no prior lake storage, no lakes would be found among the ordered set of storage locations and therefore the current Zui behavior would still apply where the zed serve would initialize an empty lake at one of these locations. Also, an env variable (e.g., ZED_LAKE) could still override the check of all these paths, if set.
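
The lookup order described above could be sketched roughly as follows in Python. This is not how zed serve is implemented; the function name find_lake, the "a directory exists" test for detecting a lake, and the choice to initialize a new lake at the first candidate are all assumptions made for illustration.

```python
import os
from pathlib import Path


def find_lake(candidates, env=None, env_var="ZED_LAKE"):
    """Pick a lake location and say why it was picked.

    Hypothetical sketch: returns (location, reason) where reason is
    "env" (override set), "existing" (first candidate found on disk),
    or "new" (nothing found, so the first candidate is chosen).
    """
    env = os.environ if env is None else env

    # An environment override skips the search entirely.
    override = env.get(env_var)
    if override:
        return override, "env"

    # Assumption: an existing directory counts as an existing lake.
    # A real check would inspect the lake's own metadata.
    for candidate in candidates:
        if Path(candidate).is_dir():
            return str(candidate), "existing"

    # Nothing found: fall back to the first candidate for a new lake.
    return str(candidates[0]), "new"
```

Returning the reason alongside the location leaves room for the caller to tell the user which location ended up in use.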

As I wasn't part of the brainstorming session, I'll just add on a plea for consideration: Can we please implement this such that Zui is able to report to the user which location it ended up using? This way if a user happens to briefly use the Zed CLI tooling to create a scratch lake in another location and forgot they left it behind, they won't be shocked by Zui finding & starting to use it instead the next time they launch the app. Perhaps such a notification could be of the variety that has a "Don't show this again" checkbox.
