-
I want to find outliers within each group: import ibis
from ibis import _
t = ibis.examples.penguins.fetch()
is_outlier = _.body_mass_g > _.body_mass_g.quantile(0.95).over(group_by=["sex", "species"])
t[is_outlier] but this causes I can get this working in multiple steps with t2 = t.group_by(["sex", "species"]).mutate(
is_outlier=_.body_mass_g > _.body_mass_g.quantile(0.95)
)
t2[_.is_outlier] but this just feels clunky, I don't want to have to create a second relation with the temp column floating around.
|
Beta Was this translation helpful? Give feedback.
Replies: 2 comments 2 replies
-
FWIW, dbplyr allows window functions in library(dplyr)
con <- DBI::dbConnect(duckdb::duckdb())
t <- copy_to(con, palmerpenguins::penguins)
t |>
filter(body_mass_g > mean(body_mass_g, na.rm = TRUE), .by = c(sex, species)) |>
show_query()
## <SQL>
## SELECT
## species,
## island,
## bill_length_mm,
## bill_depth_mm,
## flipper_length_mm,
## body_mass_g,
## sex,
## "year"
## FROM (
## SELECT *, AVG(body_mass_g) OVER (PARTITION BY sex, species) AS q02
## FROM "palmerpenguins::penguins"
## ) q01
## WHERE (body_mass_g > q02)
|
Beta Was this translation helpful? Give feedback.
-
I think we'll have a solution for this soon™️ sqlglot supports transforming From there, sqlglot will compile either to |
Beta Was this translation helpful? Give feedback.
I think we'll have a solution for this soon™️
sqlglot supports transforming
QUALIFY
intoWHERE
(this is only important for some backends), but we need a way in ibis to represent a predicate that should be compiled into aQUALIFY
clause. We may be able to do this with an internal-onlyops.Value
, or by partitioning predicates into a newqualify
field onops.Selection
.From there, sqlglot will compile either to
QUALIFY
if the backend supports it, or the equivalent projection +WHERE
similar to what dplyr is doing.