Polars version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.
Issue description
When read_ndjson reads a file in which a column contains only null, that column is dropped from the resulting DataFrame instead of being kept as a column with a single null entry.
Reproducible example
from pathlib import Path

import pandas as pd
import polars as pl

payload_w_null = """{"x":1,"text":null}"""
payload_wo_null = """{"x":1,"text":"a"}"""
file_w_null = "jsonwithnull.jsonl"
file_wo_null = "jsonwithoutnull.jsonl"

Path(file_w_null).write_text(payload_w_null)
Path(file_wo_null).write_text(payload_wo_null)
print(
    pl.read_ndjson(file_w_null),
    pl.read_ndjson(file_wo_null),
    pd.read_json(file_w_null, lines=True),
    sep="\n",
)
# shape: (1, 1)
# ┌─────┐
# │ x   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 1   │
# └─────┘
# shape: (1, 2)
# ┌─────┬──────┐
# │ x   ┆ text │
# │ --- ┆ ---  │
# │ i64 ┆ str  │
# ╞═════╪══════╡
# │ 1   ┆ a    │
# └─────┴──────┘
#    x  text
# 0  1   NaN
Expected behavior
I would expect the text column to be kept with a single null entry, matching the pandas output above, rather than being dropped.
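To make the expected result concrete, here is a minimal standard-library sketch of an NDJSON reader that keeps a key even when all of its values are null. It is not how Polars implements read_ndjson, and ndjson_to_columns is a hypothetical helper introduced only for illustration.

```python
import json


def ndjson_to_columns(text: str) -> dict[str, list]:
    """Parse NDJSON text into a column -> values mapping (hypothetical helper).

    Columns are discovered from the keys of every record, so a key whose
    values are all null is still kept.
    """
    records = [json.loads(line) for line in text.splitlines() if line.strip()]

    # Collect keys in first-seen order, regardless of whether the value is null.
    keys: list[str] = []
    for record in records:
        for key in record:
            if key not in keys:
                keys.append(key)

    # Explicit nulls and missing keys both become None.
    return {key: [record.get(key) for record in records] for key in keys}


columns = ndjson_to_columns('{"x":1,"text":null}')
print(columns)  # {'x': [1], 'text': [None]}
```

The point of the sketch is only that the text key is present in the input, so the reader has enough information to produce a two-column result with a null in text.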
Installed versions