The AMPCamp big data mini course uses a ~20 GB dataset that is stored on S3.
This dataset has to be downloaded to the master's local disk and then put into the cluster's HDFS.
Downloading it from S3 to the local disk of a virtual instance on SL takes ~20 min.
Downloading the same dataset from SL Object Storage takes ~6 min, but there is a problem: SL Object Storage does not allow making an object public (as S3 does).
It is possible, though, to enable CDN on objects stored in SL so that they can be downloaded via an HTTP URL. However, in that case the dataset should be archived and stored as a single bulk file instead of the multiple separate files it originally consisted of.
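A rough sketch of how the archive-based flow could look, in case it helps the discussion. It is not tested against SL Object Storage or its CDN: the function names, the archive layout, and any URL or path passed in are hypothetical, and the only external tool assumed is the standard `hadoop fs -put` command on the master.

```python
import subprocess
import tarfile
import urllib.request
from pathlib import Path


def pack_dataset(src_dir: str, archive_path: str) -> None:
    """Bundle the many dataset files into one gzip-compressed tar archive."""
    src = Path(src_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the dataset root.
        tar.add(src, arcname=src.name)


def fetch_and_unpack(url: str, dest_dir: str) -> None:
    """Stream the archive from a URL and unpack it without a temporary copy."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as resp:
        # "r|gz" reads the tar as a forward-only stream, so the archive
        # does not have to be saved to disk or held in memory first.
        with tarfile.open(fileobj=resp, mode="r|gz") as tar:
            if hasattr(tarfile, "data_filter"):
                # Newer Python versions: refuse unsafe members (absolute paths, links out of dest).
                tar.extractall(dest_dir, filter="data")
            else:
                tar.extractall(dest_dir)


def put_to_hdfs(local_dir: str, hdfs_dir: str) -> None:
    """Copy the unpacked files into HDFS with the hadoop command-line client."""
    subprocess.run(["hadoop", "fs", "-put", local_dir, hdfs_dir], check=True)
```

The idea would be to run `pack_dataset` once when uploading the dataset to SL Object Storage, and then on the master call `fetch_and_unpack` with the CDN URL followed by `put_to_hdfs`. Streaming the extraction avoids needing disk space for both the archive and the unpacked files.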