Commit

new post / 2024-05-05-convert-json-to-parquet-and-send-to-azure-blob-…
…storage.md
copdips committed May 5, 2024
1 parent 559ecec commit 0d3bda0
Showing 1 changed file with 39 additions and 0 deletions.
@@ -0,0 +1,39 @@
---
authors:
- copdips
categories:
- python
- file
- spark
comments: true
date:
created: 2024-05-05
---

# Convert JSON to Parquet and send to Azure Blob Storage

```python title="with pyarrow only without pandas"
# pip install adlfs pyarrow
# https://arrow.apache.org/docs/python/parquet.html#reading-from-cloud-storage

from os import environ

import pyarrow.json as pj
import pyarrow.parquet as pq
from adlfs import AzureBlobFileSystem


json_file = "aaa.json"
blob_connection_string = environ["AZURE_BLOB_CONNECTION_STRING"]
blob_container_name = "bbb"

# pyarrow.json.read_json expects line-delimited JSON (one object per line)
table = pj.read_json(json_file)

abfs = AzureBlobFileSystem(connection_string=blob_connection_string)

pq.write_table(
    table,
    f"{blob_container_name}/another_folder/output.parquet",
    filesystem=abfs,
)
```
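
pyarrow's JSON reader (`pyarrow.json.read_json`) only accepts line-delimited JSON, so a file holding a single top-level array (`[{...}, {...}]`) has to be rewritten first. The sketch below is one possible way to do that with the standard library only. The helper name `json_array_to_json_lines` and the file names are hypothetical, and it assumes the whole file fits in memory.

```python
import json
from pathlib import Path


def json_array_to_json_lines(src: str, dst: str) -> int:
    """Rewrite a top-level JSON array file as newline-delimited JSON.

    Hypothetical helper, not part of pyarrow. Returns the number of records written.
    """
    # Load the whole array in memory (assumption: the file is small enough)
    records = json.loads(Path(src).read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")

    with open(dst, "w", encoding="utf-8") as out:
        for record in records:
            # json.dumps without indent keeps each record on a single line
            out.write(json.dumps(record, ensure_ascii=False))
            out.write("\n")
    return len(records)
```

The resulting file can then be used as the JSON input of the snippet above, for example `json_array_to_json_lines("aaa.json", "aaa.jsonl")` followed by reading `aaa.jsonl` instead of `aaa.json`.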
