Cell dependency graph #1175

nvdv · 2016-03-05T14:16:41Z

At present all Notebook cells are executed linearly:

Cell 1
   |
Cell 2
   |
Cell 3

but sometimes there's no need to calculate Cell 2 in order to get the result from Cell 3, and calculating Cell 2 might be time-consuming.
Defining a cell dependency graph in some way would resolve this issue.


takluyver · 2016-03-05T15:20:33Z

Have a look at ipycache if you have long-running cells that you don't always want to re-run. I don't think we want to get into defining a DAG of cells.

Carreau · 2016-03-07T17:31:53Z

There is a long thread we had a few years[*] ago about that on the mailing list.

[*] OMG I'm old now.

JamiesHQ · 2017-04-27T01:05:25Z

@nvdv : We're doing a little housekeeping on our issue log and noticed this thread from 2016. Has this issue been resolved to your satisfaction and can it be closed? thanks!

nvdv · 2017-04-27T05:57:25Z

It is a feature request. I am not sure it was implemented, but it's up to you to close it if you think it's out of scope.

adam-m-jcbs · 2018-10-03T16:42:10Z

The long thread discussing this, linked above by @Carreau , is unreachable for me. So apologies if I'm rehashing things discussed there.

I certainly agree managing a DAG of cells is not desirable. But it would be cool if there was a built-in cell magic for declaring which cells should be run automatically before the current cell. Naively, this doesn't seem to be too burdensome a feature to implement, but I'm mostly a Jupyter notebook user, not a developer, so I could be wrong. Does there exist any such cell magic, or a cell magic that could be used for this purpose?
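To make the "run these cells first" idea concrete, here is a minimal pure-Python sketch of the logic such a magic could wrap. It is not an existing Jupyter or IPython feature: the `CELLS` table, the labels, and `run_with_prerequisites` are all hypothetical names made up for illustration, and there is no cycle detection.

```python
# Hypothetical notebook: label -> (source, labels that must run first).
CELLS = {
    "load": ("data = [1, 2, 3]", []),
    "slow": ("unused = sum(range(10))", ["load"]),
    "total": ("result = sum(data)", ["load"]),
}


def run_with_prerequisites(label, namespace, done=None):
    """Run the prerequisites of `label` depth first, then `label` itself.

    Returns the labels in the order they were executed.
    Note: a cyclic dependency would recurse forever in this sketch.
    """
    done = set() if done is None else done
    executed = []
    if label in done:
        return executed
    source, requires = CELLS[label]
    for dep in requires:
        executed += run_with_prerequisites(dep, namespace, done)
    exec(source, namespace)  # all cells share one namespace, as in a kernel
    done.add(label)
    executed.append(label)
    return executed


ns = {}
order = run_with_prerequisites("total", ns)
print(order)
```

Asking for "total" runs "load" and then "total", and never touches the unrelated "slow" cell.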

mxxun · 2018-10-17T12:09:37Z

For future reference: the long thread was moved.

nickurak · 2018-11-22T22:53:08Z

Conversely, while a dependency graph might tell you you don't need to evaluate/re-evaluate cell B just because A changed, it might also tell you that you're going to have a bad time trying to evaluate C if C depends on A.

In accordance with https://jupyter-notebook.readthedocs.io/en/stable/security.html , if someone tried to execute a cell that depended on another, I wonder if it would make sense to do so automatically?

At a minimum, it might be helpful to have some visual feedback to indicate that the cell isn't runnable until some particular cell above satisfies its dependencies.
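As a rough sketch of how a frontend might decide which cells to flag, assuming the dependencies are already known: the `DEPENDS_ON` table and `stale_after_change` below are hypothetical and only illustrate the bookkeeping, not any existing Jupyter API.

```python
from collections import deque

# Hypothetical dependency edges: cell -> cells it depends on.
DEPENDS_ON = {"A": [], "B": [], "C": ["A"], "D": ["C"]}


def stale_after_change(changed, depends_on):
    """Return every cell that directly or indirectly depends on `changed`."""
    # Invert the edges so we can walk from a cell to its dependents.
    dependents = {cell: [] for cell in depends_on}
    for cell, deps in depends_on.items():
        for dep in deps:
            dependents[dep].append(cell)

    stale, queue = set(), deque([changed])
    while queue:
        for child in dependents[queue.popleft()]:
            if child not in stale:
                stale.add(child)
                queue.append(child)
    return stale


stale = stale_after_change("A", DEPENDS_ON)
print(sorted(stale))
```

Here a change to A marks C and D as needing a re-run, while B is left alone.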

pedrovgp · 2019-11-22T14:57:10Z

@takluyver, is there any reason for a DAG of cells to be out of the question? Visualising cells in a graph would certainly make cell dependencies clearer and improve storytelling capabilities, since non-linear (branching) stories are hard to tell within today's notebooks.

For a simple concrete example: imagine a notebook to evaluate three real estate expansion plans for a given city. The first node of cells loads the current real estate data and describes the current state of affairs. From there, you get three branches, each of them following similar logic but different scenario premises and arriving at comparable (but different) end results.

Today, this analysis could be done using a chapter for each scenario, but that still requires scrolling up and down to compare, leaves it unclear which cells to run before scenario A, makes it easy to (accidentally) re-run scenario A before B (run all is sooo easy to click on), etc.

jasongrout · 2019-11-22T16:01:40Z

I think using a magic (or cell metadata) to explicitly define dependencies for a DAG of cells is a very interesting idea. I think automatically coming up with the DAG on the front end is probably prohibitively hard, given that we have a number of kernels of different languages. There was some work from a CalPoly group of students on a kernel that would keep track of a DAG, IIRC, somewhat like ObservableHQ.
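For a sense of why automatic detection is language-specific, here is a deliberately naive, Python-only sketch using the standard `ast` module. The helper names are hypothetical, and the approach ignores imports, attribute mutation, `exec`, and local variables inside functions, so it should not be read as how any of the projects mentioned in this thread work.

```python
import ast


def names_defined_and_used(source):
    """Return (defined, used) sets of names for one cell's source."""
    defined, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.add(node.id)
    return defined, used


def infer_dependencies(cells):
    """Map each cell index to the earlier cells whose names it uses."""
    last_definer = {}  # name -> index of the most recent cell defining it
    deps = {}
    for index, source in enumerate(cells):
        defined, used = names_defined_and_used(source)
        deps[index] = sorted({last_definer[n] for n in used if n in last_definer})
        for name in defined:
            last_definer[name] = index
    return deps


cells = ["x = 1", "y = 2", "z = x + 1"]
deps = infer_dependencies(cells)
print(deps)
```

In this toy notebook the third cell depends only on the first, so the second could be skipped. A kernel for another language would need its own parser to do the same.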

nickurak · 2019-11-22T22:09:56Z

Because it's been a year, and this idea has been bouncing around my head a little -- here's a sketch of a thought in this area:

I'd be really interested in a world where the cells run in actual scopes, and cells were more explicit about what they were pulling in from each other. This might be reasonably easy in Python, but maybe tricky in different languages.

label_cell("utility")
def func_that_makes_a_df():
   <code>

<Some markdown explaining that function>

label_cell("get_pf")
from cell("utility") import func_that_makes_a_df
df = func_that_makes_a_df()

<Some markdown that talk about a dataframe>

from cell("get_pf") import df as plottable_df
import plotly

plotly.plot_something(plottable_df)

Making the only things that are shared between cells super-explicit might help:

reduce all kinds of unexpected behavior and unexpected side-effects of scope mixing
allow Jupyter to reason about the dependencies
give good errors when the dependencies are missing
automatically execute cells as they're needed.

I haven't really thought at all about what this might look like outside of the Python world.

nickurak · 2019-11-22T22:11:45Z

In that world, attempting to refer to func_that_makes_a_df in a cell that isn't explicitly importing it from another cell would, for example, fail, with a NameError: name 'func_that_makes_a_df' is not defined exception.
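The `from cell(...) import ...` syntax above is not valid Python today, but the scoping behaviour can be approximated in plain Python. The sketch below is only an illustration: `run_cell`, the `namespaces` registry, and the `(from_label, name, alias)` triples are hypothetical, and a list stands in for the dataframe.

```python
# Each labelled cell gets its own namespace; nothing is shared implicitly.
namespaces = {}


def run_cell(label, source, imports=()):
    """Run `source` in a fresh scope seeded only with explicit imports.

    `imports` is a sequence of (from_label, name, alias) triples.
    """
    scope = {}
    for from_label, name, alias in imports:
        scope[alias] = namespaces[from_label][name]
    exec(source, scope)
    namespaces[label] = scope  # only reached if the cell ran without error


run_cell("utility", "def func_that_makes_a_df():\n    return [1, 2, 3]")
run_cell(
    "get_pf",
    "df = func_that_makes_a_df()",
    imports=[("utility", "func_that_makes_a_df", "func_that_makes_a_df")],
)

# A cell that forgets the explicit import fails, as described above.
try:
    run_cell("plot", "df = func_that_makes_a_df()")
    error = None
except NameError as exc:
    error = str(exc)
print(error)
```

Because the list of imports is data rather than code, a frontend could read it to work out dependencies without parsing the cell body.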

pedrovgp · 2019-11-24T21:19:54Z

@nickurak , I can see other use cases for that, but the use case you've described could be solved by establishing cell dependencies and splitting code into different cells accordingly. That would be a more generic approach as well, since it could apply to other languages.

Your example would be something like:

Label cell 1 as "utility"
Label cell 2 as "get_pf"
Add "depends on 'utility'" to cell "get_pf"
Add "depends on 'get_pf'" to cell 3 (which plots something)

If you need a function (but not another) that is defined in a given cell, simply split it into two cells and add the dependency only to the one you need.
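The label and "depends on" steps above could be sketched as follows. The `name` key is the optional cell name from the notebook format, while `depends_on` is a hypothetical metadata key invented here and is not part of nbformat. The ordering uses `graphlib.TopologicalSorter` from the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical cell metadata; "depends_on" is NOT part of nbformat.
cells = [
    {"metadata": {"name": "utility"}},
    {"metadata": {"name": "get_pf", "depends_on": ["utility"]}},
    {"metadata": {"name": "plot", "depends_on": ["get_pf"]}},
    {"metadata": {"name": "unrelated"}},
]


def execution_order(cells, target):
    """Return the labels needed for `target`, dependencies first."""
    graph = {
        c["metadata"]["name"]: c["metadata"].get("depends_on", [])
        for c in cells
    }
    # Collect the target and everything it transitively depends on.
    needed, stack = set(), [target]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(graph[name])
    # TopologicalSorter takes node -> predecessors and yields predecessors
    # first; it raises graphlib.CycleError if the labels form a cycle.
    subgraph = {name: graph[name] for name in needed}
    return list(TopologicalSorter(subgraph).static_order())


order = execution_order(cells, "plot")
print(order)
```

Nothing here is Python-specific apart from the helper itself, which fits the point that explicit labels would work for any kernel language.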

pedrovgp · 2019-11-25T21:18:04Z

I have worked on a (quick and dirty) visual proposition of how to use cell dependencies to facilitate storytelling and organize notebook flows. It probably makes more sense in the JupyterLab project, but anyway, this is what I envision: https://docs.google.com/presentation/d/1nWAjvuCZb4MEu9SiTy-QWfMWBThpDpZFnuKNp1S_fHs/edit?usp=sharing

Any comments are appreciated.

toobaz · 2019-11-25T21:21:07Z

> If you need a function (but not another) that is defined in a given cell, simply split it into two cells and add the dependency only to the one you need.

A question, which I see as a prerequisite for this discussion: is there already in any Jupyter plugin a standard, or at least popular, way to uniquely identify cells?

pedrovgp · 2019-11-26T14:47:46Z

Seems like it is going to be a part of JupyterLab Core [https://github.com/jupyterlab/jupyterlab-celltags]

jasongrout · 2019-11-26T16:07:07Z

> A question, which I see as a prerequisite for this discussion: is there already in any Jupyter plugin a standard, or at least popular, way to uniquely identify cells?

Yes. In the Jupyter official notebook format, a cell can have an optional unique name in its metadata: https://nbformat.readthedocs.io/en/latest/format_description.html#cell-metadata
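As a small illustration, a tool could index cells by that optional `name` key using nothing but the standard `json` module. The notebook content and the `cells_by_name` helper below are made up for this sketch. The duplicate check is a choice made here, since the key is optional and it is up to whoever writes it to keep it unique.

```python
import json

# A made-up, minimal notebook document with two named code cells.
NOTEBOOK_JSON = """
{
  "nbformat": 4,
  "nbformat_minor": 4,
  "metadata": {},
  "cells": [
    {"cell_type": "code", "metadata": {"name": "utility"},
     "source": "x = 1", "execution_count": null, "outputs": []},
    {"cell_type": "markdown", "metadata": {}, "source": "Some notes"},
    {"cell_type": "code", "metadata": {"name": "get_pf"},
     "source": "y = x + 1", "execution_count": null, "outputs": []}
  ]
}
"""


def cells_by_name(notebook):
    """Index cells by metadata["name"], skipping cells without a name."""
    index = {}
    for cell in notebook["cells"]:
        name = cell.get("metadata", {}).get("name")
        if name is None:
            continue
        if name in index:
            raise ValueError(f"duplicate cell name: {name!r}")
        index[name] = cell
    return index


named = cells_by_name(json.loads(NOTEBOOK_JSON))
print(sorted(named))
```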

toobaz · 2019-11-26T16:37:07Z

> Yes. In the Jupyter official notebook format, a cell can have an optional unique name in its metadata: https://nbformat.readthedocs.io/en/latest/format_description.html#cell-metadata

Cool! And is this already exposed somewhere?

jasongrout · 2019-11-26T16:47:21Z

> Cool! And is this already exposed somewhere?

It's exposed everywhere, in the sense that any library or frontend that can write to cell metadata can write this key. Jupyter notebook and JupyterLab, for example, expose an interface for writing to the cell metadata.

jasongrout · 2019-11-26T16:48:14Z

(To be clear, as with any metadata, it is optional and up to the writer to set this value. It is not set by default in JupyterLab, though it may be set in the notebook by default to some sort of UUID).

toobaz · 2019-11-26T20:29:28Z

> It's exposed everywhere, in the sense that any library or frontend that can write to cell metadata can write this key.

Yes, sorry, my question was misleading. I should have asked: is there already some UI for allowing the user to see/change this?

jasongrout · 2019-11-26T21:18:46Z

Yes (though it's just a JSON editor). In JupyterLab, it's the wrench icon in the left sidebar. In classic notebook, it's under View > Cell Toolbar > Edit Metadata.

Carreau · 2019-11-26T21:37:33Z

In case that has not been posted already, please see also https://github.com/dataflownb and https://github.com/stitchfix/nodebook

Carreau · 2019-11-26T21:38:08Z

Both of those got talks at JupyterCon in 2018 so should be somewhere on Youtube.

meeseeksmachine · 2021-10-08T13:38:36Z

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/dag-based-notebooks/11173/2

stefaneidelloth · 2021-10-13T12:01:22Z

https://observablehq.com/ uses a DAG and I would love to see a JupyterLab extension providing similar features:

https://observablehq.com/@observablehq/how-observable-runs

Edit

Moved overview of projects to jupyterlab:
https://discourse.jupyter.org/t/dag-based-notebooks/11173/4

meeseeksmachine · 2021-10-21T12:21:30Z

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/how-to-get-output-model-for-a-given-cell-in-a-jupyterlab-extension/11342/1

meeseeksmachine · 2021-10-22T08:21:58Z

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/dag-based-notebooks/11173/4

jondo · 2024-10-14T12:38:06Z

Also see https://marimo.io/ .

krassowski · 2024-10-14T13:01:48Z

It's surprising that no one mentioned https://github.com/ipyflow/ipyflow.

Carreau added this to the no action milestone Mar 7, 2016

pedrovgp mentioned this issue Nov 27, 2019

Multiple storylines with dependency graph dataflownb/dfkernel#64

Open

jd41 mentioned this issue Jan 26, 2021

Discussion on implementing directed acyclic graph-like structure in notebooks jupyterlab/frontends-team-compass#118

Open
