feat: Add pixi project configuration #227
@alexander-held My guess is that the list @eguiraud determined in #144 (comment) has changed since then. This PR currently just implements the requirements described in #199 (comment), but I assume there will be more that we will need to test with.
@alexander-held Can the `requirements.txt` be removed?
Okay, I'll want to rebase this to get it into a single commit before merge, but to run the notebook follow the README and then you're good to go, as that will also properly install the environment you need (making sure that you select the `cms-open-data-ttbar` kernel).
@matthewfeickert yes, let's remove the `requirements.txt`.
@alexander-held @oshadura I've managed to get the environment to solve but I need help debugging some issues testing it:
```python
### GLOBAL CONFIGURATION

# input files per process, set to e.g. 10 (smaller number = faster)
N_FILES_MAX_PER_SAMPLE = 5

# enable Dask
USE_DASK = True

# enable ServiceX
USE_SERVICEX = False

### ML-INFERENCE SETTINGS

# enable ML inference
USE_INFERENCE = True

# enable inference using NVIDIA Triton server
USE_TRITON = False
```

During the "Execute the data delivery pipeline" cell of the notebook, things fail with the following traceback:

---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In[7], line 29
27 t0 = time.monotonic()
28 # processing
---> 29 all_histograms, metrics = run(
30 fileset,
31 treename,
32 processor_instance=TtbarAnalysis(USE_INFERENCE, USE_TRITON)
33 )
34 exec_time = time.monotonic() - t0
36 print(f"\nexecution took {exec_time:.2f} seconds")
File ~/analysis-grand-challenge-debug/.pixi/envs/cms-open-data-ttbar/lib/python3.9/site-packages/coffea/processor/executor.py:1700, in Runner.__call__(self, fileset, treename, processor_instance)
1679 def __call__(
1680 self,
1681 fileset: Dict,
1682 treename: str,
1683 processor_instance: ProcessorABC,
1684 ) -> Accumulatable:
1685 """Run the processor_instance on a given fileset
1686
1687 Parameters
(...)
1697 An instance of a class deriving from ProcessorABC
1698 """
-> 1700 wrapped_out = self.run(fileset, processor_instance, treename)
1701 if self.use_dataframes:
1702 return wrapped_out # not wrapped anymore
File ~/analysis-grand-challenge-debug/.pixi/envs/cms-open-data-ttbar/lib/python3.9/site-packages/coffea/processor/executor.py:1848, in Runner.run(self, fileset, processor_instance, treename)
1843 closure = partial(
1844 self.automatic_retries, self.retries, self.skipbadfiles, closure
1845 )
1847 executor = self.executor.copy(**exe_args)
-> 1848 wrapped_out, e = executor(chunks, closure, None)
1849 if wrapped_out is None:
1850 raise ValueError(
1851 "No chunks returned results, verify ``processor`` instance structure.\n\
1852 if you used skipbadfiles=True, it is possible all your files are bad."
1853 )
File ~/analysis-grand-challenge-debug/.pixi/envs/cms-open-data-ttbar/lib/python3.9/site-packages/coffea/processor/executor.py:974, in DaskExecutor.__call__(self, items, function, accumulator)
967 # FIXME: fancy widget doesn't appear, have to live with boring pbar
968 progress(work, multi=True, notebook=False)
969 return (
970 accumulate(
971 [
972 work.result()
973 if self.compression is None
--> 974 else _decompress(work.result())
975 ],
976 accumulator,
977 ),
978 0,
979 )
980 except KilledWorker as ex:
981 baditem = key_to_item[ex.task]
File ~/analysis-grand-challenge-debug/.pixi/envs/cms-open-data-ttbar/lib/python3.9/site-packages/distributed/client.py:322, in Future.result(self, timeout)
320 self._verify_initialized()
321 with shorten_traceback():
--> 322 return self.client.sync(self._result, callback_timeout=timeout)
File /opt/conda/lib/python3.9/site-packages/coffea/processor/executor.py:221, in __call__()
220 def __call__(self, *args, **kwargs):
--> 221 out = self.function(*args, **kwargs)
222 return _compress(out, self.level)
File /opt/conda/lib/python3.9/site-packages/coffea/processor/executor.py:1367, in automatic_retries()
1361 break
1362 if (
1363 not skipbadfiles
1364 or any("Auth failed" in str(c) for c in chain)
1365 or retries == retry_count
1366 ):
-> 1367 raise e
1368 warnings.warn("Attempt %d of %d." % (retry_count + 1, retries + 1))
1369 retry_count += 1
File /opt/conda/lib/python3.9/site-packages/coffea/processor/executor.py:1336, in automatic_retries()
1334 while retry_count <= retries:
1335 try:
-> 1336 return func(*args, **kwargs)
1337 # catch xrootd errors and optionally skip
1338 # or retry to read the file
1339 except Exception as e:
File /opt/conda/lib/python3.9/site-packages/coffea/processor/executor.py:1572, in _work_function()
1570 item, processor_instance = item
1571 if not isinstance(processor_instance, ProcessorABC):
-> 1572 processor_instance = cloudpickle.loads(lz4f.decompress(processor_instance))
1574 if format == "root":
1575 filecontext = uproot.open(
1576 {item.filename: None},
1577 timeout=xrootdtimeout,
(...)
1580 else uproot.MultithreadedFileSource,
1581 )
ModuleNotFoundError: No module named 'servicex'

which seems to indicate that the existence of the `servicex` package is assumed even though `USE_SERVICEX = False` is set.
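One way this kind of hard dependency on an optional package can be avoided is to gate the import on the configuration flag. The sketch below is an assumption about how a pipeline could guard the optional import; the helper names `servicex_available` and `choose_delivery_backend` are hypothetical and not from the AGC code:

```python
import importlib.util

USE_SERVICEX = False  # mirrors the notebook's global configuration flag


def servicex_available():
    """Check for the optional servicex package without importing it."""
    return importlib.util.find_spec("servicex") is not None


def choose_delivery_backend(use_servicex):
    # Fall back to plain uproot/xrootd delivery when ServiceX is disabled,
    # so environments without the servicex package can still run the
    # Dask-only path instead of failing with ModuleNotFoundError.
    if use_servicex and servicex_available():
        return "servicex"
    return "uproot"


print(choose_delivery_backend(USE_SERVICEX))
```

With this pattern, the `servicex` package only needs to be present in environments that actually enable `USE_SERVICEX`.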
A follow-up question: Is there an analysis facility where the CMS ttbar open data workflow has been run with ServiceX enabled? The use of `USE_SERVICEX` in the repository

```console
$ git grep --name-only "USE_SERVICEX"
analyses/cms-open-data-ttbar/ttbar_analysis_pipeline.ipynb
analyses/cms-open-data-ttbar/ttbar_analysis_pipeline.py
analyses/cms-open-data-ttbar/utils/metrics.py
docs/facilityinstructions.rst
```

isn't particularly deep.
Now that #225 is merged, we can target the v3 API of the ServiceX frontend.
Should be https://opendataaf-servicex.servicex.coffea-opendata.casa/. As for the other question about importing, that's with your own environment? Not sure what causes this, but perhaps we can update to v3 and then debug that one.
@matthewfeickert The ServiceX instance was upgraded during the last couple of days, and now it works again.
22k lines of changes are coming from the `pixi.lock` lockfile.
I am not sure why we need to remove `requirements.txt`.
I used `requirements.txt` to create a conda environment from scratch to run my I/O tests. I'm not familiar with pixi, but if it can be used for the exact same use case, it should be fine. Otherwise, keeping a `requirements.txt` might be handy.
@sciaba I agree with you :) and I was just telling Alex about your use case.
@matthewfeickert can we keep both environments in sync? prefix-dev/pixi#1410
Okay, let me refactor this to use v3. That will be easier.
@alexander-held Yes. I don't think that having a different version of the library will matter, but we'll see.
Merci @oshadura! 🙏
@oshadura Yes, lock files are long to begin with, and this is a multi-platform and multi-environment lock file. I would suggest not trying to keep around the old `requirements.txt`.
@sciaba Yes, pixi covers that use case.
The suggested idea in that issue is going in the wrong direction.
When I rebase my PR I won't remove the `requirements.txt`.
I am suggesting removing the jupyterlab environment or making it optional. It is very confusing for users, especially power users who want to test the notebook / Python script on a facility or in a particular environment where jupyterlab is not needed.
Okay, I can refactor this into another feature + environment. Why is this confusing for users though? I would think they should be unaware of its existence.
I tried to test it.
@alexander-held @oshadura I've moved this out of draft and this is now ready for review. I've added notes for reviewers in the PR body, but all information should be clear from the additions to the README. If not, then I need to revise it.
Some high-level guiding notes if you're new to how pixi manifest files work. Feel free to ignore.
@alexander-held @oshadura If you have time to review this week that would be great. I'll also note for context here that I went with the idea of having things be at the top level for the whole project, but if it would be of more interest to have each analysis be a separate `pixi` project, let me know.
I tried to run locally and I see the following error:
2024-11-06 14:36:45,504 - distributed.worker - WARNING - Compute Failed
Key: TtbarAnalysis-5c778b8f1e703fd7fe17b7cd2972d7ed
Function: TtbarAnalysis
args: ((WorkItem(dataset='wjets__nominal', filename='https://xrootd-local.unl.edu:1094//store/user/AGC/nanoAOD/WJetsToLNu_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/cmsopendata2015_wjets_20547_PU25nsData2015v1_76X_mcRun2_asymptotic_v12_ext2-v1_10000_0004.root', treename='Events', entrystart=788276, entrystop=985345, fileuuid=b'#\x96\x8fdt\x8a\x11\xed\x8e[\xa6\xef]\x81\xbe\xef', usermeta={'process': 'wjets', 'xsec': 15487.164, 'nevts': 5913030, 'variation': 'nominal'}), b'\x04"M\x18H@{\x02"\x00\x00\x00\x00\x00a\x04\x94\x00\x00a\x80\x05\x95@A\x00\x01\x00\xe7\x8c\x17cloudpickle.\x0c\x00\xf6@\x94\x8c\x14_make_skeleton_class\x94\x93\x94(\x8c\x03abc\x94\x8c\x07ABCMeta\x94\x93\x94\x8c\rTtbarAnalysis\x94\x8c\x1acoffea.processor\n\x00D\x94\x8c\x0cP\x16\x00\xf2@ABC\x94\x93\x94\x85\x94}\x94\x8c\n__module__\x94\x8c\x08__main__\x94s\x8c c4f9f7e4f41d480e87c970e516ebf57a\x94Nt\x94R\x94h\x00\x8c\x0f\xa3\x00\xf2\x15_setstate\x94\x93\x94h\x10}\x94(h\x0ch\r\x8c\x08__init__\x94h\x00\x8c\x0e\xdb\x00\xf5Tfunction\x9
kwargs: {}
Exception: 'AttributeError("module \'setuptools\' has no attribute \'extern\'")'
I think you need to pin setuptools, but I don't know how to do this with pixi.
Also, please add to the README that to run locally you need to update `"AF": "local"` in utils/config.py.
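For what it's worth, a version constraint in a pixi manifest is a one-line dependency entry. A sketch of what pinning `setuptools` could look like in `pixi.toml` follows; the table name assumes the pin belongs to the `cms-open-data-ttbar` feature, and the actual change in this PR may differ:

```toml
# Hypothetical pin: coffea 0.7.x still relies on the setuptools "extern"
# layout, which was removed in setuptools 71.
[feature.cms-open-data-ttbar.dependencies]
setuptools = "<71"
```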
We will need to remove the coffea-casa part for now, since we don't have a solution for how to ship the pixi environment to the workers; we can try to resolve that in the next pull request.
Already running with the new kernel, I see a version mismatch between client, scheduler, and workers:
/home/cms-jovyan/agc-servicex/.pixi/envs/cms-open-data-ttbar/lib/python3.9/site-packages/distributed/client.py:1391: VersionMismatchWarning: Mismatched versions found
+---------+----------------+----------------+---------+
| Package | Client | Scheduler | Workers |
+---------+----------------+----------------+---------+
| lz4 | 4.3.3 | 4.3.2 | None |
| msgpack | 1.1.0 | 1.0.6 | None |
| python | 3.9.20.final.0 | 3.9.18.final.0 | None |
| toolz | 1.0.0 | 0.12.0 | None |
| tornado | 6.4.1 | 6.3.3 | None |
+---------+----------------+----------------+---------+
warnings.warn(version_module.VersionMismatchWarning(msg[0]["warning"]))
Why is that needed? The current workers are using the same coffea-casa environment as the coffea-casa client the user drops into at pod launch, right? You didn't ship the pixi environment to the workers, did you?
Yes, I already evaluated that having the exact scheduler versions pinned here isn't needed. We can of course match things exactly (and I did earlier in this PR), but for runtime evaluation these differences don't seem to matter.
Just to confirm, you don't see this when running locally with an environment created from analyses/cms-open-data-ttbar/requirements.txt?
The versions on client, scheduler, and workers should be exactly the same, otherwise distributed Dask usually crashes (that is why we have a warning). What is happening is that your client environment now has a different version of Python (and other packages) compared to my scheduler and worker environment on coffea-casa.
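The check behind that warning can be illustrated with a small self-contained sketch. This is a hypothetical helper, not the actual `distributed` implementation; the example versions are taken from the warning table above:

```python
def find_mismatches(versions_by_role):
    """Report packages whose versions differ between roles.

    versions_by_role maps a role name ("client", "scheduler", ...) to a
    dict of package -> version string; a package missing from a role is
    treated as None, matching the "None" worker entries in the table.
    """
    packages = set()
    for pkgs in versions_by_role.values():
        packages.update(pkgs)
    mismatched = {}
    for pkg in sorted(packages):
        seen = {role: pkgs.get(pkg) for role, pkgs in versions_by_role.items()}
        if len(set(seen.values())) > 1:
            mismatched[pkg] = seen
    return mismatched


# Versions from the VersionMismatchWarning above (workers omitted).
reported = {
    "client":    {"lz4": "4.3.3", "toolz": "1.0.0",  "python": "3.9.20.final.0"},
    "scheduler": {"lz4": "4.3.2", "toolz": "0.12.0", "python": "3.9.18.final.0"},
}
print(sorted(find_mismatches(reported)))  # all three packages disagree
```

Running this on the reported versions flags `lz4`, `python`, and `toolz`, exactly the packages the warning table shows drifting between client and scheduler.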
0.7.x coffea works only with `setuptools<71`, and I see in your environment you have a higher version.
Honestly, I think the main value of this PR is helping to run the AGC locally, since on a facility the environment is usually customized and not easy to handle in such a way (we have too many components). I would suggest dividing the PR functionality into local setup and facility setup, and following up on the facility setup in a separate PR.
Yes, that is why I tested it to find versions that wouldn't crash, and noted the precautions section

```toml
# coffea-casa precautions: keep the drift from scheduler environment small
pandas = ">=2.1.2, <2.2.4"
lz4 = ">=4.3.2, <4.3.4"
msgpack-python = ">=1.0.6, <1.1.1"
toolz = ">=0.12.0, <1.0.1"
tornado = ">=6.3.3, <6.4.2"
```

but I'll just change these to be exact versions rather than bounds. We can do the same with the CPython version, but as the versions differ only in the patch version (which is for security patches) the language feature set is the same across all the Python 3.9 releases involved.
I think this already does that. This PR is just meant to give people an environment lock file that reproduces the same runtime state as the "Coffea-casa build with coffea 0.7.21/dask 2022.05.0/HTCondor and cheese" instance. It provides a more tractable way to describe the environment than the existing `requirements.txt`.
Okay, thanks to @oshadura's work on the UNL Coffea-casa, things are running there again for me as expected on both the default Coffea-casa environment and the `cms-open-data-ttbar` pixi environment. So @oshadura and @alexander-held, this should be ready for review again. The client environment now matches the scheduler environment fully, cf. the

```toml
# coffea-casa precautions: exactly match scheduler environment
python = "3.9.18.*"
pandas = "2.1.2.*"
lz4 = "4.3.2.*"
msgpack-python = "1.0.6.*"
toolz = "0.12.0.*"
tornado = "6.3.3.*"
```

section. I've run on the UNL Coffea-casa with the `cms-open-data-ttbar` environment.
* Add pixi manifest (pixi.toml) and pixi lockfile (pixi.lock) to fully specify the project dependencies. This provides a multi-environment, multi-platform (Linux, macOS) lockfile.
* In addition to the default feature, add 'latest', 'cms-open-data-ttbar', and 'local' features and corresponding environments composed from the features. The 'cms-open-data-ttbar' feature is designed to be compatible with the Coffea Base image which uses SemVer coffea (Coffea-casa build with coffea 0.7.21/dask 2022.05.0/HTCondor and cheese).
  - The cms-open-data-ttbar feature has an 'install-ipykernel' task that installs a kernel such that the pixi environment can be used on a coffea-casa instance from a notebook.
  - The local features have the canonical 'start' task that will launch a JupyterLab session inside of the environment.
* Add use instructions for the pixi environments to the cms-open-data-ttbar README.
Thanks for all the discussion and work here, I think we're good to merge!
Thanks for all the help on this one, @oshadura and @alexander-held!
Related issues: `requirements.txt` does not work with Python 3.11 (#144); `requirements.txt` (#140)

* Add a pixi manifest (`pixi.toml`) and pixi lockfile (`pixi.lock`) to fully specify the project dependencies. This provides a multi-environment, multi-platform (Linux, macOS) lockfile.
* In addition to the default feature, add `latest`, `cms-open-data-ttbar`, and `local` pixi features and corresponding environments composed from the features. The `cms-open-data-ttbar` feature is designed to be compatible with the Coffea Base image which uses SemVer `coffea` (Coffea-casa build with coffea 0.7.21/dask 2022.05.0/HTCondor and cheese).
  - The `cms-open-data-ttbar` feature has an `install-ipykernel` task that installs a kernel such that the pixi environment can be used on a coffea-casa instance from a notebook.
  - The `local` feature has the canonical `start` task that will launch a JupyterLab session inside of the environment.

This will also be able to support the results of PR #225 after that PR is merged, with just a few updates from `pixi`. 👍

Tip — instructions for reviewers testing the PR:

* Install `pixi` if you haven't already
* Install the `ipykernel` for the `cms-open-data-ttbar` environment
* Run `analyses/cms-open-data-ttbar/ttbar_analysis_pipeline.ipynb` with the `cms-open-data-ttbar` kernel selected
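To give a feel for the manifest structure this PR introduces, here is a heavily trimmed sketch. The feature, environment, and task names come from this PR, but the dependency entries and task commands are illustrative assumptions, not the actual `pixi.toml` contents:

```toml
# Illustrative sketch of a multi-feature pixi manifest; the real
# pixi.toml pins many more packages and platforms.
[project]
name = "analysis-grand-challenge"
channels = ["conda-forge"]
platforms = ["linux-64", "osx-64", "osx-arm64"]

# Feature matching the Coffea-casa scheduler/worker environment
[feature.cms-open-data-ttbar.dependencies]
python = "3.9.18.*"
coffea = "0.7.21.*"

[feature.cms-open-data-ttbar.tasks]
# Register the environment as a Jupyter kernel for use on coffea-casa
install-ipykernel = "python -m ipykernel install --user --name cms-open-data-ttbar"

[feature.local.tasks]
# Canonical task to launch JupyterLab inside the environment
start = "jupyter lab"

# Environments are composed from one or more features
[environments]
cms-open-data-ttbar = ["cms-open-data-ttbar"]
local = ["cms-open-data-ttbar", "local"]
```

Composing environments from features this way is what lets a single lockfile serve both the coffea-casa kernel use case and the local JupyterLab use case discussed above.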