-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug] Make figure models subclassable #606
Conversation
…e-figures-subclassable
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
…kinsey/vizro into fix/make-figures-subclassable
…e-figures-subclassable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, couple of suggestions
vizro-core/changelog.d/20240801_113636_petar_pejovic_make_figures_subclassable.md
Outdated
Show resolved
Hide resolved
@@ -32,7 +31,7 @@ class Action(VizroBaseModel): | |||
|
|||
""" | |||
|
|||
function: CapturedCallable = Field(..., import_path=vizro.actions, mode="action", description="Action function.") | |||
function: CapturedCallable = Field(..., import_path="vizro.actions", mode="action", description="Action function.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just one question to double check - could this now allow people to circumvent security? I remember that part of this was also to forbid people injecting arbitrary code - now that this is a string, is this maybe somehow easier now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: @antonymilne, @huong-li-nguyen
Thank you @maxschulz-COL for the great question that prompted me to do more research on this topic!
The main purpose of the import_path
is not to forbid you to use arbitrary code, but to enable you to use predefined figure functions
(e.g. scatter, dash_ag_grid, kpi_card) when you're developing a dashboard using the yaml
configuration. So, the import_path represents a path from where the _target_
(string) figure function (scatter, dash_ag_grid, kpi_card) will be imported.
Okay, but is it possible to inject a custom figure function (e.g. my_advanced_scatter, my_custom_dash_ag_grid) into your app? Yes, it's possible to do it with the Python
Vizro configuration.
So what about the yaml
config? -> That was not possible before this PR, but now it is.
Changing the import_path to the relative path string (changes made in this PR) only enables you to in a really hacky way inject a custom figure function into your yaml dashboard. Here are the steps to do it:
- (rename vizro lib)
mv <path_to_site_packages>/vizro <path_to_site_packages>/vizro_original
- create a
vizro
folder near your app.py and add atable.py
module file inside:
example/
├── vizro/
│ └── table.py
└── app.py
- define the custom table figure function inside the table.py: ->
@capture("table")(def my_custom_table(data_frame)...)
- set the following yaml configuration
- figure:
_target_: my_custom_table
data_frame: df
type: table
That was pretty hacky (and probably possible even in the current vizro version) 😄
What is more important to acknowledge is that with "making figure models to be inheritable" we actually enable our users to in a pretty much native way inject custom figure function into the yaml
Vizro config 🎉 . Here are the steps to do it:
# app.py
class MyTable(vm.Table):
type: Literal["my_table"] = "my_table"
figure: CapturedCallable = Field(
..., import_path="my_table_file", mode="table", description="Function that returns a `Dash DataTable`."
)
vm.Page.add_type("components", MyTable)
# my_table_file.py
from vizro.models.types import capture
from vizro.tables import dash_data_table
@capture("table")
def my_custom_table_figure(data_frame):
return dash_data_table(data_frame)()
# dashboard.yaml
- figure:
_target_: my_custom_table_figure
data_frame: df_kpi
type: my_table
Okay the question is: Should we announce it?
There are many examples in our docs where it stands: "# Custom X are currently only possible via python configuration"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing research @petar-qb ! I think this is a great addition then! And I think we should consider changing it in the docs/adding it to the docs. What do you think @huong-li-nguyen @antonymilne ?
@antonymilne I am still not 100% sure about the fact that this is not security related. I understand now that we allow our custom figures to be used like that in a yaml
config, and now even user created custom figures could be used in a yaml
config (by subclassing the figure models), but there is also a conversation I remember we had with Ismail that was dedicatedly around this topic. Can you recall?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question @maxschulz-COL and great information indeed @petar-qb.
@maxschulz-COL I do remember the conversation with Ismail, and you're absolutely right to question the security of this 👍 I spent a while thinking about this while writing the solution and decided that nothing has changed here at all.
- previously we imported e.g.
vizro.tables
and then users could specify a target function in that namespace; now exactly the same thing happens, it's just that the place that import happens has changed (it's inCapturedCallable
rather than the model that usesCapturedCallable
). Either way we are doing exactly the samegetattr(imported_module, target_function)
so security here is the same - so long as you have access to Python, it's always possible to "circumvent" security of this to inject an arbitrary function from yaml. e.g. before this change it would already be possible to make your own model (just not by subclassing) and specify an arbitrary import path in there. Or actually you could even just do
vizro.tables.my_custom_function = custom_function
without needing a custom model at all (we never advertised this because it feels hacky). So nothing has changed here also. The significance of the security is that someone who has access only to yaml should not be able to inject an arbitrary Python function, and that is still the case now as it was before. (if you have access to Python then you can easily do damage through millions of ways anyway, so doing it through custom graphs is pretty much irrelevant)
So the only difference this change really makes is that now that subclassing is possible, there's a much more obvious way to inject a custom function than there was before. From a security perspective this is fine, because it still requires access to Python. The interesting thing here is that, as @petar-qb pointed out (but I had completely forgotten about so thank you very much for saying it!!), from a usability perspective there's now a much cleaner route for a user to inject a custom function than before 💯
So should we announce this new route? Not sure. Let me just look back on my notes to see if there's any other routes we should prefer instead. e.g. imagine the capture
decorator could register the function as usable from YAML configuration like this:
@capture("graph") # or maybe you need to explicitly say register=True to register it as available from YAML
def f():
...
# in capture function somewhere
setattr(vizro.tables, captured_function)
# or maybe we make a new namespace for it setattr(my_custom_stuff, captured_function)
# vizro.tables.captured_function exists and hence can be used from
# YAML without any subclassing or any new models needed
The advantage of the subclassing approach is that it's already possible (and will always remain possible, even if we enable other routes in future) and requires no effort from us.
So what I think we should do is a make a ticket for this to consider possible solutions and whether we want to advertise this one or if we think actually there are better approaches that we would prefer. While it would be really nice to have complete parity between YAML and Python, I don't think it's super urgent tbh. But if we like we can already advertise this new subclassing approach as one way of doing it (since it will always remain possible) and then in the future we might develop an even better solution. The YAML code should stay the same whatever the solution so it's just a question of writing the subclassing part up in docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@antonymilne really thanks for your opinion. 🥇
Here is the ticket that includes the considering of the final solution and the announcement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, read through it as well in detail, great summary indeed! 💪
…e-figures-subclassable
Co-authored-by: Li Nguyen <[email protected]>
Co-authored-by: Li Nguyen <[email protected]>
Co-authored-by: Li Nguyen <[email protected]>
…res_subclassable.md Co-authored-by: Maximilian Schulz <[email protected]>
…kinsey/vizro into fix/make-figures-subclassable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for finishing this off! I can't approve my own PR but take this as an approval 😁 Subject to one comment I had about tests.
@@ -32,7 +31,7 @@ class Action(VizroBaseModel): | |||
|
|||
""" | |||
|
|||
function: CapturedCallable = Field(..., import_path=vizro.actions, mode="action", description="Action function.") | |||
function: CapturedCallable = Field(..., import_path="vizro.actions", mode="action", description="Action function.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question @maxschulz-COL and great information indeed @petar-qb.
@maxschulz-COL I do remember the conversation with Ismail, and you're absolutely right to question the security of this 👍 I spent a while thinking about this while writing the solution and decided that nothing has changed here at all.
- previously we imported e.g.
vizro.tables
and then users could specify a target function in that namespace; now exactly the same thing happens, it's just that the place that import happens has changed (it's inCapturedCallable
rather than the model that usesCapturedCallable
). Either way we are doing exactly the samegetattr(imported_module, target_function)
so security here is the same - so long as you have access to Python, it's always possible to "circumvent" security of this to inject an arbitrary function from yaml. e.g. before this change it would already be possible to make your own model (just not by subclassing) and specify an arbitrary import path in there. Or actually you could even just do
vizro.tables.my_custom_function = custom_function
without needing a custom model at all (we never advertised this because it feels hacky). So nothing has changed here also. The significance of the security is that someone who has access only to yaml should not be able to inject an arbitrary Python function, and that is still the case now as it was before. (if you have access to Python then you can easily do damage through millions of ways anyway, so doing it through custom graphs is pretty much irrelevant)
So the only difference this change really makes is that now that subclassing is possible, there's a much more obvious way to inject a custom function than there was before. From a security perspective this is fine, because it still requires access to Python. The interesting thing here is that, as @petar-qb pointed out (but I had completely forgotten about so thank you very much for saying it!!), from a usability perspective there's now a much cleaner route for a user to inject a custom function than before 💯
So should we announce this new route? Not sure. Let me just look back on my notes to see if there's any other routes we should prefer instead. e.g. imagine the capture
decorator could register the function as usable from YAML configuration like this:
@capture("graph") # or maybe you need to explicitly say register=True to register it as available from YAML
def f():
...
# in capture function somewhere
setattr(vizro.tables, captured_function)
# or maybe we make a new namespace for it setattr(my_custom_stuff, captured_function)
# vizro.tables.captured_function exists and hence can be used from
# YAML without any subclassing or any new models needed
The advantage of the subclassing approach is that it's already possible (and will always remain possible, even if we enable other routes in future) and requires no effort from us.
So what I think we should do is a make a ticket for this to consider possible solutions and whether we want to advertise this one or if we think actually there are better approaches that we would prefer. While it would be really nice to have complete parity between YAML and Python, I don't think it's super urgent tbh. But if we like we can already advertise this new subclassing approach as one way of doing it (since it will always remain possible) and then in the future we might develop an even better solution. The YAML code should stay the same whatever the solution so it's just a question of writing the subclassing part up in docs.
Description
Solving -> https://github.com/McK-Internal/vizro-internal/issues/1166
Screenshot
Notice
I acknowledge and agree that, by checking this box and clicking "Submit Pull Request":