How to explode and select structs in cells #1964
Hi all, amazing project!
test.json
{
"Records": [
{"a": 10, "b": 2},
{"a": 11, "b": 3},
{"a": 12, "b": 4, "c": 3},
{"a": 13, "b": "aa", "c": 3}
]
}
test2.json
import daft
from daft import col, lit
from daft.datatype import DataType
df = daft.read_json(
"test2.json",
schema_hints={
"Records": DataType.list(DataType.struct({
"a": DataType.string(),
"b": DataType.string(),
"c": DataType.string(),
"d": DataType.string(),
}))
}
)
print(df.show())
"""
╭──────────────────────────────────────────────────╮
│ Records │
│ --- │
│ List[Struct[a: Utf8, b: Utf8, c: Utf8, d: Utf8]] │
╞══════════════════════════════════════════════════╡
│ [{a: 10, │
│ b: 2, │
│ c: None, │
│ d: No… │
╰──────────────────────────────────────────────────╯
"""
print(df.explode(col("Records")).show())
"""
╭────────────────────────────────────────────╮
│ Records │
│ --- │
│ Struct[a: Utf8, b: Utf8, c: Utf8, d: Utf8] │
╞════════════════════════════════════════════╡
│ {a: 10, │
│ b: 2, │
│ c: None, │
│ d: Non… │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ {a: 11, │
"""
print(df.explode(col("Records")).select(col("Records.*")).show())
The last command fails with an error.
The expected behavior was to get the struct fields as separate columns.
Regards, c.
Replies: 1 comment 1 reply
Hi @CrashLaker,
Thanks for trying out daft! The issue here is that we currently don't support selector expressions with regex, such as col("Records.*"), but we will be adding that soon! In the meantime, we do provide a struct accessor.
Another idea is to add a flatten() operation for struct columns (like a horizontal explode) that would give you the output you're looking for.
Workaround example!
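To make the target shape concrete, here is a minimal plain-Python sketch of the result being asked for: one row per list element (the explode) and one column per struct field (the "horizontal explode"). It deliberately does not use Daft's API. The helper name `explode_and_flatten` is hypothetical, and casting every value to a string is an assumption that mirrors the all-string schema hints in the question.

```python
import json

# Sample document from the question, inlined so the sketch is self-contained.
RAW = """
{
  "Records": [
    {"a": 10, "b": 2},
    {"a": 11, "b": 3},
    {"a": 12, "b": 4, "c": 3},
    {"a": 13, "b": "aa", "c": 3}
  ]
}
"""

# Field names taken from the schema hints in the question.
FIELDS = ["a", "b", "c", "d"]


def explode_and_flatten(doc, list_key, fields):
    """Hypothetical helper, not part of Daft.

    Emits one row per element of doc[list_key] (explode) and one
    column per struct field (flatten).
    """
    rows = []
    for record in doc[list_key]:
        # Missing fields become None; present values are cast to str
        # (assumption: this mirrors the all-string schema hints).
        rows.append(
            {f: (str(record[f]) if f in record else None) for f in fields}
        )
    return rows


rows = explode_and_flatten(json.loads(RAW), "Records", FIELDS)
for row in rows:
    print(row)
```

Each printed row has the keys a, b, c and d, which is the column layout the question expected from selecting `Records.*` after the explode.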