azcopy sync --delete-destination=true takes too much time on Deleting extra object #2844

Open
ppolushkin opened this issue Oct 27, 2024 · 0 comments

Dear all,

I'm running the azcopy sync command to back up data from an Azure Data Lake to an Azure storage account.

I'm using a command like this:
azcopy sync https://mydatalake.blob.core.windows.net/my-container/ https://mystorageaccount.blob.core.windows.net/my-container/ --recursive --log-level=NONE --delete-destination=true
My Azure Data Lake contains tens of millions of small and medium-sized files. The sync operation and the copying of newly created files take ~10 minutes, while deleting extra objects at the destination takes many hours.

Moreover, even though I specified --log-level=NONE, I see messages like the following for each removed file:
6142703 Files Scanned at Source, 6844507 Files Scanned at Destination, 2-sec Throughput (Mb/s): 0 INFO: Deleting extra object: DELTA/path/to/my/file.parquet
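
A possible stopgap is to filter these messages out of the console output. This is only a sketch: it assumes the messages are written to stdout, one per line, which is not confirmed here. If the progress counters share a line with the message, they are dropped too. The helper name filter_delete_messages is hypothetical.

```shell
# Hypothetical helper: drop the per-object deletion messages from the console output.
# Assumption: each message arrives on stdout as its own line.
filter_delete_messages() {
  grep -v 'INFO: Deleting extra object'
}

# Usage (source and destination URLs as in the command above):
# azcopy sync <source> <destination> --recursive --log-level=NONE --delete-destination=true | filter_delete_messages
```

This only hides the output; it does not change how long the deletions take.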

Questions:

  1. Is it possible to delete files at the destination in batches?
  2. How can I turn off the 'Deleting extra object' logging?

Details:

ubuntu:22.04

azcopy version 10.26.0

Environment variables:

AZCOPY_AUTO_LOGIN_TYPE: "SPN"
AZCOPY_TENANT_ID: "my-azure-tenant"
AZCOPY_SPA_APPLICATION_ID: "my-azure-client-id"
AZCOPY_SPA_CLIENT_SECRET: "secret"
AZCOPY_CONCURRENCY_VALUE: "3000"
AZCOPY_CONCURRENT_SCAN: "300"
AZCOPY_BUFFER_GB: "4"
AZCOPY_LOG_LOCATION: "/tmp"
AZCOPY_JOB_PLAN_LOCATION: "/tmp"
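
For anyone reproducing this outside my setup, the settings above could be expressed as shell exports, roughly as sketched below. This assumes a plain shell session; the values are the placeholders from this report, not real credentials.

```shell
# Sketch: the environment variables listed above as shell exports.
# All values are placeholders copied from this report.
export AZCOPY_AUTO_LOGIN_TYPE="SPN"
export AZCOPY_TENANT_ID="my-azure-tenant"
export AZCOPY_SPA_APPLICATION_ID="my-azure-client-id"
export AZCOPY_SPA_CLIENT_SECRET="secret"
export AZCOPY_CONCURRENCY_VALUE="3000"
export AZCOPY_CONCURRENT_SCAN="300"
export AZCOPY_BUFFER_GB="4"
export AZCOPY_LOG_LOCATION="/tmp"
export AZCOPY_JOB_PLAN_LOCATION="/tmp"
```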

Kind regards,

ppolushkin changed the title from "azcopy sync with --delete-destination=true option takes too much time on INFO: Deleting extra object" to "azcopy sync --delete-destination=true takes too much time on Deleting extra object" on Oct 27, 2024