---
title: Cluster Administration
layout: main
---
Welcome to the "hidden" wiki of administering software installations on the Totient cluster! This document serves as a catalog of how Totient was set up to use the glorious, terrifying, mystifying, and stupendously fantastic package manager `spack`.
To protect his identity, we endearingly refer to the individual who assisted with many aspects of this setup as SpackMan. His word is law, even if we did not obey all of his suggestions. Thank you, SpackMan.
TL;DR
: 1. First, get the Admin Shell Configurations set up.
  2. Next, try and install things. The Syntax section should have enough for you to start blindly trying to install things ;)
  3. NEVER install anything before checking the spec (see Commands).
  4. Load the appropriate compiler module for what you just installed, and then execute the `update_totient_lmod_db` command.
Notice:
: Parts of this document are better written than others. This is generally proportional to how well I understood what the "right" thing to do was.
- Spack Cheat Sheet
- Totient and Spack
- Totient and Spack and Lmod
- Totient and Spack and Python
- Debugging
- Factory Reset
- Future TODO
The most important thing to understand about `spack` is what goes where. Generally speaking, you should never modify anything manually underneath the `spack` root directory. The only manual changes I have made are for site-specific configurations, and some hacks to get the Intel 2015 compiler to work.

The second most important thing to understand about `spack`: do not, under any circumstances, execute `git pull` if you have installed anything. Though you may be able to acquire new packages that others have added, if anything from the default variants of a package to the dependencies of a given package changes, your installations may become orphaned, deleted, replaced, or even worse: `spack` may get so confused that you have no choice but to delete it and start over. You have been warned.
On a fresh clone of `spack`, the directory structure is:
spack/
├── bin
│   ├── sbang
│   ├── spack
│   └── spack-python
├── etc
│   └── spack
│       └── defaults
├── lib
│   └── spack
│       ├── docs
│       ├── env
│       ├── external
│       ├── llnl
│       └── spack
├── share
│   └── spack
│       ├── csh
│       ├── logo
│       ├── qa
│       ├── setup-env.csh
│       ├── setup-env.sh
│       └── spack-completion.bash
└── var
    └── spack
        ├── gpg
        ├── gpg.mock
        ├── mock_configs
        └── repos
spack/bin
: The main executable(s). You can simply run `./bin/spack`, or get your shell set up so that `spack` is available as a command. More on that in the Admin Shell Configurations section.

spack/etc
: This is where the site-specific YAML configurations go.

spack/etc/defaults
: The default configurations; never change these.

spack/lib
: The core `spack` library. See `spack/var` below. This is the code for general purpose I/O, concretization, etc.

spack/share
: This is where shell utilities, module files, etc are (or get symlinked to from an installation).

spack/opt (not shown)
: Where all of the installations end up after compilation.

spack/var
: The relevant point here is that this is where the staging area for packages gets symlinked, so that you can go find out why some package did not install. The folder `spack/var/spack/repos/builtin/packages` is where all of the package definitions are.
Spack allows for three levels of configuration. The order in which they are loaded determines the overall result (where conflicts are concerned).

1. First, the default configurations from `spack/etc/defaults` are loaded. Some are global defaults, some are specific to a given operating system.
2. Next, any `*.yaml` files found in `spack/etc` are loaded. These will be referred to as site-specific configurations. In our case, we are going to have a custom `config.yaml` (overall configs, only used for changing where the staging area is), `compilers.yaml` (what compilers are available and where they are), and `packages.yaml` (default variants for the packages we care about).
3. Last, Spack looks in `$HOME/.spack`.

So in the event that a user has specific customizations that override our site-specific configurations, the user configurations take precedence. Hence I highly encourage keeping all Spack configurations site-specific, to avoid hard-to-understand discrepancies.
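The precedence above can be sketched in miniature. This is illustrative shell only, with plain variable assignments standing in for Spack's YAML scope merging; it is not real Spack code:

```shell
# Illustrative only: like Spack's configuration scopes, the value loaded
# last wins when the same setting appears in multiple scopes.
compiler="intel"   # spack/etc/defaults (loaded first)
compiler="gcc"     # spack/etc site-specific config (loaded second)
compiler="clang"   # $HOME/.spack user config (loaded last, wins)
echo "$compiler"   # -> clang
```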
It's worth quickly mentioning the lifecycle of a `spack install` command.

1. The specification of the package to install (hereby called spec) is concretized, and dependencies are determined.
2. All dependencies are built (if needed).
3. The source code for the package is downloaded and staged. Staging typically means extracting the `.tar.gz` and running `cmake`, `configure`, etc.
4. The `install` phase (as printed on the command line) is then typically going to first run `make` and then `make install`.

This is important for us because `/share` has limited space, so I have customized where the staging area is (since `/tmp` is also very small). `make install` will put the files somewhere underneath `spack/opt`, and likely also create symlinks to module files and put them underneath `spack/share`.
There are many features of `spack` that are only available if you perform the full shell setup. Basics such as installing or checking specifications do not need this. So for example, if you just want to see what was installed in which Spack instance, you could
$ cd /share/apps/spack
# See what the compilers instance had installed
$ ./spack_compilers/bin/spack find
# See what the all instance had installed
$ ./spack_all/bin/spack find
On the other hand, things like loading modules, convenience functions for going to a failed build's stage, etc, require the full shell integration. Thankfully it's easy: just set the `SPACK_ROOT` variable and source a script. I'll show you an interactive version, for the `spack_all` instance, but of course you would put this in your `~/.bash_profile` (you only need to source `setup-env.sh` once, so don't do it in the `~/.bashrc` unless you want every `tmux` pane to source it).
$ export SPACK_ROOT="/share/apps/spack/spack_all"
$ source $SPACK_ROOT/share/spack/setup-env.sh
# Now `spack` is available as a regular old command
# This should give you the same results as the raw
# ./spack_all/bin/spack find above
$ spack find
There are two files you should be sourcing, from your `~/.bashrc` and `~/.bash_profile` respectively.

Put somewhere in your `~/.bash_profile`:
# Because ~/.bash* is shared across all systems, make sure you only
# try and load this from totient.
if [[ $(hostname -s) =~ totient ]]; then
source /share/apps/spack/totient_spack_configs/admin_configs_bash_profile.sh
fi
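For reference, the `=~` guard above can be exercised against a stand-in string. The hostname `totient-login-01` below is made up for illustration, and `configs_loaded` is just a marker for where the real `source` line would run:

```shell
# Stand-in for $(hostname -s); the real guard queries the system.
short_host="totient-login-01"
if [[ $short_host =~ totient ]]; then
  configs_loaded="yes"   # the real file sources admin_configs_bash_profile.sh here
else
  configs_loaded="no"
fi
echo "$configs_loaded"   # -> yes
```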
It sets `SPACK_ROOT` to be `/share/apps/spack/spack_all`, and sources `setup-env.sh`.

- As it informs you upon sourcing, this means that `spack` refers to `spack_all`.
- To use `spack_compilers`, `cd /share/apps/spack/spack_compilers` and then execute `./bin/spack`.
We don't want to (indirectly) source `setup-env.sh` for every shell (e.g. when using `tmux`), as the script takes a little while to run.
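If you did want something safe to source repeatedly, a sentinel variable is the usual trick. This is a hypothetical sketch; the sentinel name `TOTIENT_SETUP_ENV_SOURCED` and the `fake_setup` function are made up, not part of the actual configs:

```shell
# Made-up sentinel guard: even if this snippet is sourced repeatedly
# (e.g. once per tmux pane), the expensive work runs only once.
run_count=0
fake_setup() {
  # stand-in for: source $SPACK_ROOT/share/spack/setup-env.sh
  run_count=$((run_count + 1))
}
for pane in 1 2 3; do  # simulate three tmux panes re-sourcing the file
  if [ -z "$TOTIENT_SETUP_ENV_SOURCED" ]; then
    TOTIENT_SETUP_ENV_SOURCED=1
    fake_setup
  fi
done
echo "$run_count"   # -> 1
```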
Put somewhere in your `~/.bashrc`:
# Because ~/.bash* is shared across all systems, make sure you only
# try and load this from totient.
if [[ $(hostname -s) =~ totient ]]; then
source /share/apps/spack/totient_spack_configs/admin_configs_bashrc.sh
fi
Functions are treated specially by your shell and need to be defined in the `~/.bashrc` file. The functions will not be available if you source them from your `~/.bash_profile`.
- It defines the convenience function `spack_node_install`, which will launch the job script `/share/apps/spack/totient_spack_configs/node_compile.pbs`. Example usage:

      # Make sure it will install what you expect, as well as
      # use dependencies you have compiled.
      $ spack spec -I boost %gcc@7.2.0
      # Launch the job script.
      $ spack_node_install boost %gcc@7.2.0
      # Wait for it to finish
      $ qstat
      # The installation log gets put here
      $ cd $HOME/spack_install_logs
      # I've kept the colors, hence `less -R`
      # Assuming it gave job number 34239
      $ less -R spack_node_install.o34239

- It defines the convenience function `update_totient_lmod_db`. If you are only installing things with `spack_node_install` (no compilers), just run it afterwards and the module for the package you just installed should now be available.
  - If you were installing for `gcc@7.2.0`, you may need to do

        # Load the `gcc/7.2.0` module so that the `spack_all` directory shows up in
        # the $MODULEPATH
        $ module load gcc/7.2.0
        # Now update the database
        $ update_totient_lmod_db

- More on why that's good for you (and ONLY you, the admin) to do in the Debugging section.
Spack introduces a wide range of syntax and options; this should be enough for you to simply compile and install packages.
The primary commands that you will use:
spack list
: Lists all packages available for installing with `spack`. Best served with a side of `grep -i` for what you actually want to install.

spack info X
: Pulls up the information on package `X`. It is very important that you look at this page before trying to install package `X`, as there may be variants of the package that you will want to include that are turned off by default. See the command-line syntax in the next section for an example.

spack spec -I X
: Performs the concretization of what it will take to install package `X` on this system. ALWAYS ALWAYS ALWAYS run `spec -I` before trying to install package `X`. Let's consider the `boost` package. It has a variant that allows you to compile `boost.python`. Now suppose that you have also installed `python` using `spack`, but you installed `python+tk` (so that you can have `matplotlib`). By default, `spack` will re-install a `python~tk` and use that as the dependency of `boost`.

In short: by running the `spec`, you will be able to determine if the dependency you want to use to build something will actually be used.

spack install X
: Installs the package! Note that for non-compilers, we will never run this directly (the job script will do this). This is so that when the compiler is optimizing the code, it is tailored to the compute nodes rather than the login node.

spack uninstall X
: Uninstalls the package `X`. If another package `Y` was built using package `X`, `spack` will fail out explaining this. If you really want to force it, you can `spack uninstall --dependents X`. Exercise extreme caution.

spack find [package]
: Allows you to view what you have installed. The version of this command I use most frequently is `spack find -ldfv X`, which provides me with a description of the variants package `X` was installed with, what its dependencies were, and most importantly the hashes of each. The hashes also come into play when trying to force the concretizer to use something as a dependency.
The syntax elements we will focus on will be compilers, versions, dependencies, and variants. We'll be using `python` and `boost` as example packages, since generally speaking `python` is actually the hardest part to get right.

Note: the syntaxes shown are for use on the command-line (or by argument to the job script). This admonition exists to call to your attention that command-line specifications take the highest precedence, superseding any site-specific configurations present in the various YAML files I have created. The YAML side is explained later.
Every package in Spack has at least one version, and every package in Spack also has a preferred version. In some cases the preferred version is simply the latest stable version; in some cases it could be an earlier one due to community preferences and/or bugs found. For example, Python's preferred version is 2.7.13, because even though it's 2017... (ok, I'll spare the rant, it's HPC). A better example is Boost --- at the time of writing this, the latest "stable" is 1.64.0, but the preferred is 1.63.0. This is because there are a lot of bugs when `boost+mpi` is desired...

Some examples:

- `spack spec -I python` -> `python@2.7.13` (inherits preferred version)
- `spack spec -I python@3` gives us 3.6.2.
- `spack spec -I python@3.6.2`

Simply use `@` followed by the version number.

Tip:
: View the versions for package `X` by reading `spack info X`.
Spack and `lmod` get along so well because they treat compilers specially. The general idea is that if you have two compilers, say `gcc` and `llvm`, you do not want to try and compile package `X` using `gcc` and build it with a dependency that was compiled with `llvm`. ABI compatibility aside, even just two different versions of `gcc` can lead to difficult bugs. You can view the compilers available to you with
$ spack compiler list
==> Available compilers
-- gcc rhel6-x86_64 ---------------------------------------------
gcc@4.4.7  gcc@4.9.2  gcc@6.4.0  gcc@7.2.0
-- intel rhel6-x86_64 -------------------------------------------
intel@15.0.3
Right now, if you check out `$SPACK_ROOT/etc/spack/packages.yaml` you will see that at the bottom the default compiler for all packages is `intel`. So if we do
$ spack spec -I python
Input spec
--------------------------------
python
Normalized
--------------------------------
python
^bzip2
^ncurses
^pkg-config
^openssl
^zlib
^readline
^sqlite
Concretized
--------------------------------
[email protected]%[email protected]+shared~tk~ucs4 arch=linux-rhel6-x86_64
^[email protected]%[email protected]+shared arch=linux-rhel6-x86_64
^[email protected]%[email protected]~symlinks arch=linux-rhel6-x86_64
^[email protected]%[email protected]+internal_glib arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected]+pic+shared arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
The `packages.yaml` had some very specific impacts here. If we instead wanted to do things with `gcc`:
$ spack spec -I python %[email protected]
Input spec
--------------------------------
python%[email protected]
Normalized
--------------------------------
python%[email protected]
^bzip2
^ncurses
^pkg-config
^openssl
^zlib
^readline
^sqlite
Concretized
--------------------------------
[email protected]%[email protected]+shared~tk~ucs4 arch=linux-rhel6-x86_64
^[email protected]%[email protected]+shared arch=linux-rhel6-x86_64
^[email protected]%[email protected]~symlinks arch=linux-rhel6-x86_64
^[email protected]%[email protected]+internal_glib arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected]+pic+shared arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
Note:
: We see here that Spack will want to use the same compiler for every dependency. This cannot be changed (nor should it be).

Tip:
: You can just do `%intel`, for example, if it's the only one. Generally, you just need to provide Spack with enough information for it to complete. It's very smart.
Variants typically are just what components of a package you want to build. In the Python case, the variant that defaults to off is `tk`, since Spack is generally designed for clusters, which typically don't have graphical logins. If you take a closer look at the previous `%gcc` output, you'll see that Python defaulted to `python+shared~tk~ucs4`:

- Build shared libs
- Do not build with `tk`
- Do not build with (wide) unicode strings

If we on the other hand wanted to install it with `tk`:
$ spack spec -I python+tk
Input spec
--------------------------------
python+tk
Normalized
--------------------------------
python+tk
^bzip2
^ncurses
^[email protected]:
^openssl
^zlib
^readline
^sqlite
^tcl
^tk
^libx11
^inputproto
^util-macros
^kbproto
^[email protected]:
^libpthread-stubs
^[email protected]:
^[email protected]:
^libxdmcp
^xcb-proto
^xextproto
^xtrans
Concretized
--------------------------------
[email protected]%[email protected]+shared+tk~ucs4 arch=linux-rhel6-x86_64
^[email protected]%[email protected]+shared arch=linux-rhel6-x86_64
^[email protected]%[email protected]~symlinks arch=linux-rhel6-x86_64
^[email protected]%[email protected]+internal_glib arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected]+pic+shared arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
^[email protected]%[email protected] arch=linux-rhel6-x86_64
Of course, we can still use `gcc` if we want. I won't include the output, but the command would be

$ spack spec -I python %gcc +tk
The reason you should always check `spack spec -I X` is because the package defaults may be more relaxed, and Spack will install the relaxed version. Boost is an excellent test case to see what happens.

Basically, if you check `spack spec -I boost` you will see that it picks up the `python` that is already compiled. In order for you to compile `boost.python` with a specific `python`, you have to do something really obscene. Traditionally, you should be able to do `spack spec -I boost ^python@<version>`, but there is a known concretizer bug and it will simply say "Boost does not depend on Python". Even though our `packages.yaml` is specifically asking for `+python`, this basically gets discarded.

So to actually install it, copy-paste the variants from the `packages.yaml` and then incorporate the `^python` constraint:

# woah
$ spack spec -I boost +atomic+chrono+date_time+filesystem+graph+iostreams+locale+log+math+mpi+multithreaded+program_options+python+random+regex+serialization+shared+signals+system+test+thread+timer+wave %gcc ^python@<version>
It's also worth mentioning that you can specify the "hash" explicitly. This will not solve the needing-to-paste-the-full-variants-list problem, but can be helpful if you are trying to, say, force `boost` as a dependency and don't want to type all that. Keeping with the above example, where we wanted to force a specific `python` as the dependency, you can use
$ spack find -ldfv python
-- linux-rhel6-x86_64 / [email protected] -------------------------------
55rmf7o [email protected]%gcc+shared~tk~ucs4
vbjrkfg ^[email protected]%gcc+shared
vpld45l ^[email protected]%gcc~symlinks
ljl2c6v ^[email protected]%gcc
52qgokm ^[email protected]%gcc+pic+shared
tnmwqjt ^[email protected]%gcc
srs4rn5 ^[email protected]%gcc
mmc4fw7 [email protected]%gcc+shared~tk~ucs4
vbjrkfg ^[email protected]%gcc+shared
vpld45l ^[email protected]%gcc~symlinks
ljl2c6v ^[email protected]%gcc
52qgokm ^[email protected]%gcc+pic+shared
tnmwqjt ^[email protected]%gcc
srs4rn5 ^[email protected]%gcc
So instead of spelling out the `python` version, we could also use `^/mmc4fw7`.

Note:
: The syntax for specifying a hash in `spack` is `/<hash>`, with the `/` character being the thing that tells `spack` "this is a hash". This is true for all commands, not just checking `spec -I` or `install`.
One final note about dependencies is that you may end up in situations where if you try and specify the dependencies manually, the concretizer will say "cannot depend on X twice." This is a frustrating situation to be in, and usually involves being clever. Do they have a common dependency that you can specify instead? Usually, if you ended up in this scenario, they do. Happy hunting.
Since compiling the compilers is something we only want to do once (if possible), the approach suggested in the Spack docs of keeping a separate `spack` instance was employed. The directory structure:
/share/apps/spack/
├── spack_all
├── spack_compilers
├── totient_spack_configs
└── zzz_install_logs
spack_all
: The `spack_all` directory is where all non-compilers are installed, except for `lmod` and `tmux`.

spack_compilers
: The `spack_compilers` directory is where all compilers are installed. The `lmod` and `tmux` packages have also been installed here, as we want to ensure these packages remain available regardless of what happens with `spack_all`.
Note: all compilers were compiled using the `gcc` provided by the RHEL `devtoolset/3` package. Traditionally you would use the host compiler, but the host `gcc` (at least as installed on Totient) is insufficient, lacking `libatomic`. This is particularly relevant for the configurations of each `spack` instance described next.
totient_spack_configs
: See Totient Specific Spack Configurations.

zzz_install_logs
: Where I put all of the completed job script outputs for reference. Not very well organized.
Note:
: SpackMan explained to me that although the official docs describe this approach, he and the majority of the other leads strongly disagree with this tactic. Amusingly, the proponent of this tactic is somebody I generally disagree with.
In this instance, though, I wholeheartedly agree with the approach. Why? It means that you can `./spack_all/bin/spack uninstall -a` and completely start from scratch if you want, without having to worry about obliterating the compilers.
It can be very easy to perform actions that leave Spack in a confused, conflicted, and ultimately broken state. Because we are `qsub`-ing jobs for compiling things, you need to be EXCEPTIONALLY CAREFUL about doing this blindly.

Consider wanting to compile two versions of `python`, say one `python@2` and one `python@3`. If you check the specs, you will see that they pretty much share all of the same dependencies. The only thing that really differs is downloading a different Python source tarball and compiling it. If you `qsub` jobs to compile both at the same time, and none of the dependencies have been compiled yet, you just made what I will call a compilation race condition. Spack may be able to notice this, it may not. Heisenbugs, blood, and tears.

Solution:
: Do not ever try and compile two things that will need to compile the same dependency at the same time. ALWAYS check `spack spec -I`.
If you look in the `/share/spack/totient_spack_configs/setup/spack_yaml` folder, you will find the various configurations. There are two things worth explicitly noting:

- When you try and build `llvm`, you'll likely need to change the `config.yaml` and un-comment the part that changes where the `stage` goes (so that you have enough space to actually compile it, because it's HUGE).
- The `all_packages.yaml` gets symlinked into `spack_all`, and `cc_packages.yaml` into `spack_compilers`. Though separating the spack instances makes things cumbersome when the `lmod` stuff comes into play, this alone is worth it. We want to compile everything in `spack_compilers` with the `devtoolset/3` `gcc`.
The `config.yaml`, `modules.yaml`, and `compilers.yaml` get symlinked into both. See the `create_and_verify_links.sh` script.
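I have not reproduced `create_and_verify_links.sh` here, but the core of it is plain symlinking. A throwaway sketch (temp directory, made-up layout, not the real paths) of why `ln -sfn` makes re-running such a script harmless:

```shell
# Throwaway demo in a temp dir; none of these paths are the real ones.
demo=$(mktemp -d)
mkdir -p "$demo/setup/spack_yaml" "$demo/spack_all/etc/spack"
touch "$demo/setup/spack_yaml/config.yaml"
# -f replaces an existing destination and -n avoids descending into an
# existing link, so running the same command twice is safe.
ln -sfn "$demo/setup/spack_yaml/config.yaml" "$demo/spack_all/etc/spack/config.yaml"
ln -sfn "$demo/setup/spack_yaml/config.yaml" "$demo/spack_all/etc/spack/config.yaml"
target=$(readlink "$demo/spack_all/etc/spack/config.yaml")
echo "$target"
rm -rf "$demo"
```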
The `modules.yaml` file is what controls how module files are generated, and which ones are excluded. You may decide that you want to split them into an `all_modules.yaml` and a `cc_modules.yaml`.
After making changes to `modules.yaml`, e.g. to add more things to the blacklist so as not to confuse students, you should follow the directions in the Spack Module Setup section.
The setup for `lmod` is a little convoluted; it was set up this way partially to support custom configurations, and partly because this is what worked. AKA I make no claims as to this being the officially correct way to configure it all, but it at least works.
Lmod operates hierarchically:

- First, a user is expected to `module load some_compiler`.
- Only after a specific compiler module has been loaded will modules compiled with that compiler be available.
- The next layer is for the MPI implementation used. In general I do not think this applies directly to Totient, because we only have OpenMPI or Intel's MPI (?). But if different MPI implementations are used in the future, that becomes relevant.

You can enable deeper hierarchies, e.g. for LAPACK. This feature seems to still be in development, and was not employed in the Totient configurations. See the Spack documentation on extended hierarchies.
- Official Lmod Documentation
  - Lmod User Guide
  - Writing and converting Modulefiles
  - Site specific customizations
    - Lmod Customization Using `lmodrc.lua`
      - Link broken? Search in google cache: http://lmod.readthedocs.io/en/latest/155_lmodrc.html
    - Assigning Properties to Modules
    - Loading Default Modulefiles for all Users
  - Setting up the `lmod` spider cache
- Spack and Modules Documentation
  - Spack and Lmod
- Related issues that helped create the Totient Setup
The Lmod startup and configurations are paired: first the "traditional" Lmod startup files are used (as described in the Lmod Installation Guide). We then add another startup script that configures the important aspects of Totient's module system.

In order for every user to have access to Lmod, the following links must be created (you will need to submit a ticket to IT, since we don't have root):

1. The standard Lmod startup files, which enable the `module` command for users.

       # This works for bourne shells and zsh
       $ ln -s /share/apps/spack/totient_spack_configs/setup/login/z00_lmod.sh /etc/profile.d/z00_lmod.sh
       # For csh and tcsh users
       $ ln -s /share/apps/spack/totient_spack_configs/setup/login/z00_lmod.csh /etc/profile.d/z00_lmod.csh

   The files `/share/apps/spack/totient_spack_configs/setup/login/z00_lmod.[c]sh` are in turn links to the actual `lmod` installation files. This enables you to, if you so desire, re-install `lmod` and change where these point to.

2. Now that the Lmod startup files have been linked, we need to link the site-specific startup files:

       # This is for the bourne shells and zsh
       $ ln -s /share/apps/spack/totient_spack_configs/setup/login/z01_lmod_Totient.sh /etc/profile.d/z01_lmod_Totient.sh
       # For csh and tcsh users
       $ ln -s /share/apps/spack/totient_spack_configs/setup/login/z01_lmod_Totient.csh /etc/profile.d/z01_lmod_Totient.csh
Warning:
: The naming of the files is important! The `z00` files provide the `module` command, which is then directly used in the `z01` files. Since files in `/etc/profile.d` are sourced using a glob (`*.sh` in `/etc/profile.d` for bourne shells, `*.csh` for tcsh and csh), files get sourced in alphabetical order. `z00` will get sourced before `z01`.
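You can convince yourself of that ordering with a throwaway directory (nothing here touches the real `/etc/profile.d`):

```shell
# A shell glob expands in sorted order, so z00_* is seen before z01_*
# regardless of the order the files were created in.
demo=$(mktemp -d)
touch "$demo/z01_lmod_Totient.sh" "$demo/z00_lmod.sh"   # created "out of order"
order=""
for f in "$demo"/*.sh; do
  order="$order$(basename "$f") "
done
echo "$order"   # -> z00_lmod.sh z01_lmod_Totient.sh
rm -rf "$demo"
```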
The `z01` files are the first form of site-specific customization. Let's take a look at the `.sh` file:
# Disable the system modules from showing up (hack)
# Links to both spack instances
TOTIENT_SPACK_ROOT="/share/apps/spack"
CC_ROOT="$TOTIENT_SPACK_ROOT/spack_compilers"
CORE="share/spack/lmod/linux-rhel6-x86_64/Core"
CC_CORE="$CC_ROOT/$CORE"
# Links to the default Totient, psxe/2015, and devtoolset/3
TOTIENT_MODULES_PREFIX="$TOTIENT_SPACK_ROOT/totient_spack_configs/modules"
TOTIENT_CORE="$TOTIENT_MODULES_PREFIX/lmod/Core"
export MODULEPATH="$TOTIENT_CORE:$CC_CORE"
export LMOD_RC="$TOTIENT_SPACK_ROOT/totient_spack_configs/setup/lmod/lmodrc.lua"
# See documentation:
# http://lmod.readthedocs.io/en/latest/070_standard_modules.html
if [ -z "$__Init_Default_Modules" ]; then
  export __Init_Default_Modules=1;
  ## ability to predefine elsewhere the default list
  LMOD_SYSTEM_DEFAULT_MODULES=${LMOD_SYSTEM_DEFAULT_MODULES:-"Totient"}
  export LMOD_SYSTEM_DEFAULT_MODULES
  module --initial_load --no_redirect restore
else
  module refresh
fi
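The `${LMOD_SYSTEM_DEFAULT_MODULES:-"Totient"}` line relies on the shell's default-value expansion: an existing value is kept, and `Totient` is only used as a fallback. In isolation (toy variable name, and the `Totient StandardCC` list is hypothetical):

```shell
# `:-` expansion keeps an existing value and only falls back when the
# variable is unset or empty.
unset DEMO_DEFAULT_MODULES
first=${DEMO_DEFAULT_MODULES:-"Totient"}    # unset -> falls back
DEMO_DEFAULT_MODULES="Totient StandardCC"   # predefined elsewhere (hypothetical)
second=${DEMO_DEFAULT_MODULES:-"Totient"}   # set -> kept
echo "$first / $second"   # -> Totient / Totient StandardCC
```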
Recalling that we have two separate Spack instances, this part is important. We are explicitly overriding any system-populated `MODULEPATH` and setting it to instead be

/share/apps/spack/totient_spack_configs/modules/lmod/Core
/share/apps/spack/spack_compilers/share/spack/lmod/linux-rhel6-x86_64/Core

The default module `/share/apps/spack/totient_spack_configs/Core/Totient.lua` is what gets loaded by default for every user, because we set this explicitly in the `z01` files being sourced. Edit the file to load in whatever specific modules you would like!
We have deliberately excluded the `spack_all` module path from being included in the `MODULEPATH` of the users. This is the one tricky part of the setup that has the capability of being broken (the fix is described in the next section). Remember that crazy script `create_and_verify_links.sh`? The very bottom of the script is the relevant section here:
###############################################################################
# Try and automate the spack_compiler -> spack_all lmod hacks #
###############################################################################
vsep "Attempting 'spack_compiler' -> 'spack_all' patches READ THE OUTPUT"
the_patches=( gcc/6.4.0.patch gcc/7.2.0.patch )
p_dir="$HERE/patches/spack_modules"
for patch in "${the_patches[@]}"; do
  echo '*** Executing: patch -d / -N -p 1 --reject-file="-" -i "'"$p_dir/$patch"'"'
  patch -d / -N -p 1 --reject-file="-" -i "$p_dir/$patch"
  echo -e "\n\n"
done
Let's take a look at the patch `/share/apps/spack/totient_spack_configs/setup/patches/spack_modules/gcc/6.4.0.patch`:
--- a/share/apps/spack/spack_compilers/share/spack/lmod/linux-rhel6-x86_64/Core/gcc/6.4.0.lua
+++ b/share/apps/spack/spack_compilers/share/spack/lmod/linux-rhel6-x86_64/Core/gcc/6.4.0.lua
@@ -11,7 +11,7 @@ family("compiler")
-- MODULEPATH modifications
-prepend_path("MODULEPATH", "/share/apps/spack/spack_compilers/share/spack/lmod/linux-rhel6-x86_64/gcc/6.4.0")
+prepend_path("MODULEPATH", "/share/apps/spack/spack_all/share/spack/lmod/linux-rhel6-x86_64/gcc/6.4.0")
-- END MODULEPATH modifications
We have to tell the `spack_compilers` modules to look in the `spack_all` directory. This must be done for every compiler you get set up. SpackMan told me they plan on rewriting the module generation stuff, so these patches may become stale and need to be re-written.
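The `-N` flag in that invocation is what makes re-running the script safe: an already-applied patch is detected and skipped rather than applied twice. A tiny self-contained demo of the same invocation style (toy file and diff, not the real modulefile patch):

```shell
# Toy demonstration of the `patch -N` invocation style used above.
demo=$(mktemp -d)
cd "$demo"
printf 'prepend_path("MODULEPATH", "OLD")\n' > module.lua
cat > fix.patch <<'EOF'
--- a/module.lua
+++ b/module.lua
@@ -1 +1 @@
-prepend_path("MODULEPATH", "OLD")
+prepend_path("MODULEPATH", "NEW")
EOF
patch -N -p1 --reject-file=- -i fix.patch           # first run: applies
patch -N -p1 --reject-file=- -i fix.patch || true   # second run: skipped, exits nonzero
result=$(cat module.lua)
echo "$result"
cd / && rm -rf "$demo"
```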
So let's say that you've got a new core module you want to load automatically for the students. In this case, I just installed a newer copy of `vim`. Recall that core utilities such as compilers, `tmux`, `vim`, etc are compiled using `spack_compilers`.

So I've just gone through and successfully compiled `./spack_compilers/bin/spack install vim`, but right now there is no `vim` module generated underneath `spack_compilers/share/spack/lmod/Core`. This is because of our `modules.yaml` file. The relevant excerpts:
modules:
  enable::
    - lmod
  lmod:
    core_compilers:
      - 'gcc@4.9.2'
    hash_length: 0
    whitelist:
      - cmake
      - curl
      - gcc
      - git
      # NOTE: DO NOT PUT `lmod` HERE! It comes from login/z01_Totient.[c]sh
      - lua
      - tmux
    blacklist:
      # NOTE: spack generated module file infinite recursion.
      # made custom module file
      - intel-parallel-studio
      - '%gcc@4.9.2'
I need to add `vim` to the whitelist, since the `spack_compilers` instance is set up (via its `packages.yaml`) to always use the `gcc` from `devtoolset/3`. Because we specifically added that compiler's `%gcc` entry to the blacklist, this says "blacklist anything that is not explicitly in the whitelist and was compiled with the `devtoolset/3` `gcc`." This is overall very desirable, since we don't want all of the dependencies of the compilers to show up for students (because it's a lot, and would be very confusing).

So now I've added `vim` to the whitelist, but we've already installed it! Thankfully Spack is very lenient on this; you can regenerate all of the modules.
# Go to `spack_compilers`
$ cd /share/apps/spack/spack_compilers
# Regenerate the modules
$ ./bin/spack module refresh --module-type lmod --delete-tree -y
==> Regenerating lmod module files
The `vim` module file is now generated, but is not available yet. Before we try and make it available, though, we need to re-patch the modulefiles to point to `spack_all`.
# Go to the configs setup directory
$ cd /share/apps/spack/totient_spack_configs/setup
# Execute the linkage script, which patches at the end
# Only including relevant patch output
$ ./create_and_verify_links.sh
...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>> Attempting 'spack_compiler' -> 'spack_all' patches READ THE OUTPUT
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
*** Executing: patch -d / -N -p 1 --reject-file="-" -i "/share/apps/spack/totient_spack_configs/setup/patches/spack_modules/gcc/6.4.0.patch"
patching file share/apps/spack/spack_compilers/share/spack/lmod/linux-rhel6-x86_64/Core/gcc/6.4.0.lua
*** Executing: patch -d / -N -p 1 --reject-file="-" -i "/share/apps/spack/totient_spack_configs/setup/patches/spack_modules/gcc/7.2.0.patch"
patching file share/apps/spack/spack_compilers/share/spack/lmod/linux-rhel6-x86_64/Core/gcc/7.2.0.lua
In the previous section I showed what has to happen when you install a new
compiler or default core utility with `spack_compilers`. After making sure to
re-patch the `spack_compilers` compiler modules so that they point to what is
compiled with `spack_all`, we need to update the spider cache. Assuming you've
followed the directions in the Admin Shell Configurations section, you should have
the `update_totient_lmod_db` function available to you.

In the output below, `vim` will show up in the top-right after we run the update.
```
$ module av

/share/apps/spack/spack_compilers/share/spack/lmod/linux-rhel6-x86_64/Core
   cmake/3.9.0 (L)    gcc/7.2.0  (D)    tmux/2.4 (L)
   curl/7.54.0 (L)    git/2.13.0 (L)
   gcc/6.4.0          lua/5.3.4  (L)

---- /share/apps/spack/totient_spack_configs/modules/lmod/Core -----
   Totient      (I,L)    devtoolset/3 (I)    psxe/2015 (I)
   TotientAdmin (I)      intel/15.0.3

  Where:
   L:  Module is loaded
   I:  Ignore - not intended for direct use.
   D:  Default Module

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible
modules matching any of the "keys".

$ update_totient_lmod_db
---> Loading the 'TotientAdmin' module.
---> Running 'update_lmod_system_cache_files', this may take a while...
---> Unloading the 'TotientAdmin' module.

$ module av

/share/apps/spack/spack_compilers/share/spack/lmod/linux-rhel6-x86_64/Core
   cmake/3.9.0 (L)    gcc/7.2.0  (D)    tmux/2.4 (L)
   curl/7.54.0 (L)    git/2.13.0 (L)    vim/8.0.0503
   gcc/6.4.0          lua/5.3.4  (L)

---- /share/apps/spack/totient_spack_configs/modules/lmod/Core -----
   Totient      (I,L)    devtoolset/3 (I)    psxe/2015 (I)
   TotientAdmin (I)      intel/15.0.3

  Where:
   L:  Module is loaded
   I:  Ignore - not intended for direct use.
   D:  Default Module

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible
modules matching any of the "keys".
```
Tip:
: You only have to re-patch things when installing into `spack_compilers`. When using
`spack_all` (e.g. via `spack_node_install`), simply re-update the database.

Note:
: The above example went start-to-finish through getting a default module, from
install to showing up in the `lmod` setup. The last step, if you want it loaded on
login for everybody, is to add it to `Totient.lua`.
For `python`, something a little non-standard was done. Because we created the
`Totient_intel`, `Totient_clang`, and `Totient_gcc` modules for students to load, we
can be pretty confident that, where things like `numpy` are concerned, a BLAS library
will be available somewhere.

The mechanism is something called `spack activate`. The idea is that for a given
python library such as `py-numpy`, you have two ways to make it available:
1. Enable users to load the specific module.
   - Main problem: the generated `lmod` modules do not load in the dependencies,
     e.g. `openblas` for `py-numpy`.
   - To do this, you must either write a module for every python package by first
     doing `spack module loads --dependents py-numpy`, or write your own custom
     module for the students to load.
2. More convenient: simply `spack activate py-numpy`.
   - It creates a symlink to the `site-packages` directory so that `python` can use it.
   - An added benefit is that `pip` will think that those dependencies are satisfied.
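The symlink mechanics behind option 2 can be demonstrated without Spack at all; every directory name below is fabricated purely for the illustration:

```shell
#!/usr/bin/env bash
# Toy model of `spack activate`: link an extension's package directory into
# the interpreter's site-packages so python's normal search path finds it.
# All paths here are made up for the demo.
set -eu

demo=$(mktemp -d)
ext="$demo/py-numpy/lib/python2.7/site-packages"
py="$demo/python/lib/python2.7/site-packages"
mkdir -p "$ext/numpy" "$py"
echo "# pretend numpy lives here" > "$ext/numpy/__init__.py"

# "Activate": one symlink, and python (and pip) now see the package.
ln -s "$ext/numpy" "$py/numpy"
ls -l "$py"
```

Deactivating is just removing the symlink, which is part of why activation only ever needs to happen once per installation.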
So the idea here is that we've mirrored every python installation; they all have the
same things installed (except Intel, because it's broken). When a user does a
`module switch` to a different compiler, their `python` will change. Since everything
we want is installed and activated, the only module that needs to get loaded is
`python` (as opposed to `py-numpy` etc.). So basically, by using `activate` we get to:

- Use `pip` to install `jupyter` (this is the only `pip`-installed package).
- Support `module switch` by default, since everything gets symlinked into the
  `site-packages` directory of the specific python installed.
How did you do this, Sir?

Well, first you want to make sure every python installation has the same stuff. Recall
the power of `spack find`, and ye shall succeed:
```
# Find all python packages for gcc
$ spack find %gcc | grep 'py-' | cut -d '@' -f 1 | sed 's/\(.*\)/spack activate \1 %gcc/g' > activate_gcc.sh
# Find all python packages for clang
$ spack find %clang | grep 'py-' | cut -d '@' -f 1 | sed 's/\(.*\)/spack activate \1 %clang/g' > activate_clang.sh
# Find all python packages for intel
$ spack find %intel | grep 'py-' | cut -d '@' -f 1 | sed 's/\(.*\)/spack activate \1 %intel/g' > activate_intel.sh
```
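To see what those one-liners actually produce, here is the same pipeline run against a canned three-line sample of `spack find` output (package names and versions are invented for the demo):

```shell
# Fabricated stand-in for `spack find %gcc` output.
sample='py-numpy@1.13.1
py-six@1.10.0
zlib@1.2.11'

# Keep the python packages, strip the versions, wrap each in an activate command.
out=$(echo "$sample" | grep 'py-' | cut -d '@' -f 1 \
      | sed 's/\(.*\)/spack activate \1 %gcc/g')
echo "$out"
```

This prints `spack activate py-numpy %gcc` and `spack activate py-six %gcc`; `zlib` is filtered out by the `grep 'py-'`.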
So then a healthy serving of, say, `vimdiff activate_clang.sh activate_gcc.sh` will
help you see if one list is missing packages from the other. In this instance, we
were not able to install the following for `intel`:

- `py-matplotlib`
- `py-scipy`
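If you'd rather script the comparison than eyeball a `vimdiff`, `comm` over the sorted package names does the same job. The two files below are tiny fabricated stand-ins for the generated scripts:

```shell
# Fabricated stand-ins for the generated activate scripts.
work=$(mktemp -d)
printf 'spack activate py-numpy %%clang\nspack activate py-six %%clang\n' \
    > "$work/activate_clang.sh"
printf 'spack activate py-numpy %%gcc\nspack activate py-scipy %%gcc\nspack activate py-six %%gcc\n' \
    > "$work/activate_gcc.sh"

# Field 3 is the package name; `comm -13` keeps lines unique to the second
# (sorted) list, i.e. packages missing from the clang side.
missing=$(comm -13 <(cut -d ' ' -f 3 "$work/activate_clang.sh" | sort) \
                   <(cut -d ' ' -f 3 "$work/activate_gcc.sh" | sort))
echo "$missing"
```

The process substitutions (`<(...)`) require `bash`, not plain `sh`.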
Then quickly make the scripts executable and add a shebang (`#!/usr/bin/env bash`) on
the first line of each. Run one and something like this will happen:
```
$ ./activate_clang.sh
==> Activated extension [email protected]%[email protected] arch=linux-rhel6-x86_64 /d64jp4n for [email protected]+shared~tk~ucs4%[email protected]
==> Activated extension [email protected]%[email protected] arch=linux-rhel6-x86_64 /e3rzbjo for [email protected]+shared~tk~ucs4%[email protected]
==> Activated extension [email protected]%[email protected]+blas+lapack arch=linux-rhel6-x86_64 /3ea47ve for [email protected]+shared~tk~ucs4%[email protected]
==> Activated extension [email protected]%[email protected] arch=linux-rhel6-x86_64 /m4k4gnz for [email protected]+shared~tk~ucs4%[email protected]
```
Eventually, because `activate` is very clever and makes sure to activate dependencies,
you will get some warnings such as

```
==> Error: Package py-ipython-genutils%clang/hazvtyn is already activated.
```

simply because `py-ipython` already activated it.
Final Step
: `spack activate gcc %clang` and `spack activate opencv %gcc` to use the python side!
OpenCV was also broken with `intel`, otherwise we'd activate that too.

You only ever need to `activate` once, since it's just symlinking to the directories.
While I wish you could just ignore this section, I'm sure you'll end up here. In the
customized `config.yaml` section, we explicitly moved the Spack staging area away
from `/tmp`. This is where having the shell integrations becomes particularly useful.

Suppose we were trying to compile `python` and things didn't work out. Two things to
check immediately:
1. Check the output of the job script. Assuming you were using my wrapper, it should
   have ended up in `/home/$USER/spack_install_logs`. Sometimes the error was easy
   for Spack to discern, and it will tell you what went wrong.
   - For example, you ran out of space and it gave you an
     `IOException: Out of Disk Space`.
2. The staging area is still intact! In the staging area, there are two files that
   are very useful in figuring out what went wrong:
   - `spack-build.out`: the full output of the build, from `./configure` to `make`
     and beyond. For example, this is the file where the `ld killed with signal 9`
     error showed up when trying to compile `llvm` without enough RAM / swap space
     on the login node.
   - `spack-build.env`: a full dump of the build environment that was used to compile
     this package. If you failed trying to link against something that showed up in
     your personal `$LD_LIBRARY_PATH`, chances are it wasn't in this file. This is
     because Spack aims to sanitize any and all aspects of the build environment it
     can.
Generally speaking, the `spack-build.out` file is the one you want to look at, but
it's good to know about `spack-build.env` as well.
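A real `spack-build.out` can run to tens of thousands of lines, so it pays to grep for the fatal lines rather than paging from the top. The log below is a two-line fabricated stand-in:

```shell
# Fabricated stand-in for a staged spack-build.out.
log=$(mktemp)
printf 'checking for zlib... yes\ncollect2: ld terminated with signal 9 [Killed]\n' > "$log"

# Most build deaths mention an error, a signal, or an undefined reference;
# -n keeps the line number so you can jump straight there in your editor.
hits=$(grep -nE '[Ee]rror|signal [0-9]+|undefined reference' "$log")
echo "$hits"
```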
The easiest thing you can try is to turn off variants for the package you are trying
to install. Run `spack info X` and see what dials you can turn. Many packages build
out-of-the-box, but a few choice packages are always troublesome.

Example:
: When using Fedora, I was never able to get `gcc+binutils` to build. It used to
default to `+binutils`, but so many other users encountered this issue that they
switched the default to `~binutils`. Interestingly, I could only get `gcc+binutils`
to compile on Totient!
If you really do need a specific variant, or turning them on / off does not work,
this is where making sure you have the shell integrations described in the Admin
Shell Configurations section becomes very helpful.

`spack cd X`: go to the staging directory for package `X`. This only remains
available when you actually have a failed build, or have explicitly run
`spack stage X`. In other words, if the installation was successful, `spack cd X`
does not work.
Example:

```
sjm324@en-cs-totient-01:~> spack cd python@3.6.1
sjm324@en-cs-totient-01:/share/apps/spack/spack_all/var/spack/stage/python-3.6.1-664mnn7vb6ta6q6pusoupoepeumbftod/Python-3.6.1> less spack-build.out
```

Or just remember that `spack_all/var/spack/stage` is where they are. Now do some
digging and try to figure out what went wrong.
In the `/share/apps/spack/totient_spack_configs/node_compile.pbs` script, there is a
commented-out line that uses `/share/apps/spack/spack_all/bin/spack -d install ...`.
Switch that on and comment out the other one. The `spack -d` debug mode will include
a lot more information about what went wrong where.
Because you can. However, recall that Spack has installed all of our dependencies.
So check `spack spec -I X` (I'm assuming `X` actually failed, not its dependencies);
when trying to compile manually, you will want to `spack load Y` for each dependency
`Y`. This way you can try to replicate what Spack does. Then, if it is a package like
a compression library or something, `spack unload Y` and see if it compiles now.
Maybe you tried compiling manually, maybe not. Another useful feature to know about
is that Spack can load up the package file directly: you can `spack edit X` to bust
that out and read the source code. Maybe there is a `TODO:` in there that affects you!

Pro-Tip:
: Make sure you have the `EDITOR` environment variable set to what you want; this is
what Spack will open the file in.
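For example, in your shell startup file (`vim` here is just a stand-in for your editor of choice):

```shell
# Spack opens `spack edit` targets with whatever $EDITOR names.
export EDITOR=vim
```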
DANGER ZONE:
: YOU ARE NOW EDITING SPACK DIRECTLY. I mean, look, I'm a hacker and I do that all
the time. But the more you change, the more confusing things can get. It's very easy
to change package `Y` trying to fix something, then change `Z`, then forget to
un-change `Y` and end up even more broken.

This is also why `spack_all` and `spack_compilers` were created: so that you truly
can obliterate and restart :)
The `spack_all` instance is NOT on the `develop` branch, because the 2015 Intel
compiler has a different directory structure than what SpackMan had access to.
Try this first, so that you don't have to recreate my nonsense.
```
$ cd /share/apps/spack/spack_all
# DANGER: uninstalls it ALL
$ ./bin/spack uninstall -a
# You should be on `old_intel_compilers`, which has two commits
$ git branch
# If you were editing files and don't know what you did
$ git status
# Or whatever. You know what you did.
$ git reset --hard
```
So if you straight up `rm -rf spack_all`, you will want to:

1. Fresh clone, then go back in time (Spack updates a lot).

   ```
   $ cd /share/apps/spack
   $ git clone https://github.com/LLNL/spack.git spack_all
   $ cd spack_all
   # This is the commit on develop I diverged from
   $ git checkout 99fb394ac18b7ef1bc4f9ee34242a69b42781ab8
   ```

2. Check out a new branch and apply the patch.

   ```
   $ git checkout -b old_intel_compilers
   $ git apply /share/apps/spack/totient_spack_configs/setup/patches/intel_2015.patch
   ```

If you apply the patch to a newer version of `develop`, it may be more broken than not.
- Release swap space from the failed (super sad face) `llvm` build.
- FIX THE PERMISSIONS.
  - `umask 0022` instead of the default `umask 0002` is probably what should be done
    in the admin setup scripts?
  - Only relevant if you want to enable somebody else to still `spack_node_install`,
    but that probably won't happen anymore.
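For anyone weighing that TODO, the concrete difference: `umask 0002` leaves new files group-writable, `umask 0022` does not. A quick throwaway demo:

```shell
# Create one file under each umask and compare the resulting mode bits.
dir=$(mktemp -d)
( umask 0002; touch "$dir/group_writable" )   # 666 & ~002 -> 664 (rw-rw-r--)
( umask 0022; touch "$dir/group_readonly" )   # 666 & ~022 -> 644 (rw-r--r--)
ls -l "$dir"
```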
I added a `spack_node_intel_dirty_install` to the admin shell configurations; all it
does is call `spack install --dirty`, where `--dirty` is half of the key to getting
intel to work.

RECALL
: You must manually edit `totient_spack_configs/setup/spack_yaml/packages_all.yaml`
to get it ready for anything `intel`. Search for `HERE HERE HERE` and make sure that
all of the `intel` providers etc. are set up.
The packages that were not able to build (`boost`, `opencv`, `py-scipy`,
`py-matplotlib`) all seem to share the same issue: the compiler can't find
`std::complex`, which (from internet trolling) seems to be some sort of bad `intel`
configuration problem.
The `--dirty` flag basically tells `spack` not to sanitize the install environment.
So as long as you can fix either the `intel/15.0.3` module (or `psxe/2015`) so that
the shell configuration lets the compiler find `std::complex` from `intel` correctly,
packages may build: there were a couple that did not build with a "regular"
`spack install` but did with `spack install --dirty`. But amidst the other issues,
who knows how much of the `intel` stuff is actually working...
- If I could do it again, I'd call them `spack_core` and `spack_all`. The
  `spack_compilers` one ended up with a lot more "core" stuff.
- If you upgrade to Intel 2017, maybe try to get `spack_compilers` to install it and
  then mark it as `buildable: False` in `all_packages.yaml`; the path is just
  wherever `spack` installed it under `spack_compilers/opt/spack/...`.
  - The reports online seem to indicate that `spack` works well with 2017.
- Or better yet, just use one instance to make your life easier, and just don't you
  dare ever do `spack uninstall -a` ;)
- Unclear how `MODULEPATH` is actually getting set on login (I gave up and manually
  overrode it instead of appending to...).
- wassup with `/etc/auto.share`? `/etc/exports`?
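One way to chase down the `MODULEPATH` question is to trace login-shell startup with `-x` and filter for the variable; on a machine without lmod configured, this just reports that nothing touched it:

```shell
# Run a login shell with xtrace on and keep only lines mentioning MODULEPATH.
trace=$(bash -l -x -c 'true' 2>&1 | grep -i 'modulepath' || true)
msg=${trace:-"MODULEPATH never touched during login startup"}
echo "$msg"
```

Whichever startup file shows up in the trace is the one doing the overriding.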