How do I process the telemetry JSON messages in Azure Data Lake? #57

Open
deepakkumpala opened this issue Jan 21, 2019 · 1 comment

deepakkumpala commented Jan 21, 2019

I have hundreds of devices sending messages to IoT Hub, and I am trying to use Data Lake to process all of these messages.

All the articles I can find online show uploading CSV files for processing. Is converting the JSON messages to CSV a must before they can be processed by the Data Lake engine? Can't I process all the incoming JSON telemetry directly in Azure Data Lake?

pbakhil commented Oct 3, 2019

Converting JSON to CSV can give you advantages during processing. From an ADLA (Azure Data Lake Analytics) perspective, independent rows give you a better chance of parallelizing the job. Formats like XML and JSON are not friendly for big data processing. The size of the data being processed in subsequent steps also shrinks if you convert JSON to CSV. In our experience, keeping the data in JSON and processing it directly was less efficient, though this also depends on your JSON data structure.
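
As a rough illustration of the JSON-to-CSV step described above, here is a minimal Python sketch that flattens newline-delimited telemetry messages into independent CSV rows. It is not taken from this repository: the field names (`deviceId`, `ts`, `body`) and the helper names `flatten` and `json_lines_to_csv` are hypothetical, and it assumes one JSON object per line with nested objects but no arrays that need expanding.

```python
import csv
import io
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def json_lines_to_csv(lines):
    """Convert an iterable of JSON strings (one message each) to CSV text."""
    rows = [flatten(json.loads(line)) for line in lines if line.strip()]
    # Collect the union of keys in first-seen order so that messages
    # with missing fields still fit the same set of columns.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical telemetry messages; real payloads will differ.
sample = [
    '{"deviceId": "dev-01", "ts": "2019-01-21T10:00:00Z", "body": {"temp": 21.5, "humidity": 40}}',
    '{"deviceId": "dev-02", "ts": "2019-01-21T10:00:05Z", "body": {"temp": 19.0}}',
]
csv_text = json_lines_to_csv(sample)
print(csv_text)
```

Each output row stands on its own, which is the property that helps a job parallelize. Arrays inside a message are written as-is into a single cell here, so a payload with repeated readings would need an extra step to expand them into separate rows.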
