Describe the bug
One VM is consistently leaked after a transfer is cancelled with Ctrl-C.
To Reproduce
Run the transfer:
skyplane cp -r gs://skyplane-big-test-bucket/OPT-cloudflare/ s3://test-us-east-1-7711e4ae/
During dispatch, press Ctrl-C to exit the transfer.
Transfer client log
Logging to: /tmp/skyplane/transfer_logs/20230623_145734-bd9ae325/client.log
Using Skyplane version 0.3.2
Will transfer objects from gcp:us-central1-a to aws:us-east-1
14:57:36 [WARN] Quota limit file not found for aws:us-east-1. Try running `skyplane init --reinit-aws` to load the quota information
VMs to provision: 1x aws:us-east-1, 1x gcp:us-central1-a
Estimated egress cost: $0.12/GB
gs://skyplane-big-test-bucket/OPT-cloudflare/reshard-model_part-0.pt => s3://test-us-east-1-7711e4ae/reshard-model_part-0.pt
(15.34GB)
gs://skyplane-big-test-bucket/OPT-cloudflare/reshard-model_part-1.pt => s3://test-us-east-1-7711e4ae/reshard-model_part-1.pt
(15.34GB)
gs://skyplane-big-test-bucket/OPT-cloudflare/reshard-model_part-2.pt => s3://test-us-east-1-7711e4ae/reshard-model_part-2.pt
(15.34GB)
gs://skyplane-big-test-bucket/OPT-cloudflare/reshard-model_part-3.pt => s3://test-us-east-1-7711e4ae/reshard-model_part-3.pt
(15.34GB)
gs://skyplane-big-test-bucket/OPT-cloudflare/reshard-model_part-4.pt => s3://test-us-east-1-7711e4ae/reshard-model_part-4.pt
(15.34GB)
...
Transfer starting
14:57:41 [WARN] Quota limit file not found for aws:us-east-1. Try running `skyplane init --reinit-aws` to load the quota information
✓ Provisioning VMs (2/2) in 37.14s
⠼ Authorizing gateways with firewalls ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/2 0:00:01
14:58:41 [WARN] :us-east-1 Error adding IPs to security group, since it already exits: An error occurred (InvalidPermission.Duplicate) when calling the AuthorizeSecurityGroupIngress operation: the specified rule "peer: 0.0.0.0/0, ALL, ALLOW" already exists
✓ Starting gateway container on VMs (2/2) in 28.52s
⠹ Transfer progressaws:us-east-1 ━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.6/122.7 GiB 482.5 MB/s 0:04:15^C
Transfer cancelled by user. Copying gateway logs and exiting.
⠇ Transfer progressaws:us-east-1 ━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.1/122.7 GiB 473.2 MB/s 0:04:14
15:00:00 [ERROR] Error running <lambda>, GCPServer(region_tag=gcp:us-central1-a, instance_name=skyplane-gcp-de24eada): 'NoneType' object has no attribute 'open_session'
15:00:00 [ERROR] Error running <lambda>, AWSServer(region_tag=aws:us-east-1, instance_id=i-0861627e6ae3b80f1): 'NoneType' object has no attribute 'open_session'
Exception in thread Thread-35:
Traceback (most recent call last):
File "/Users/sarahwooders/repos/skyplane/skyplane/api/tracker.py", line 181, in monitor_single_dst_helper
self.monitor_transfer(dst_region)
File "/Users/sarahwooders/repos/skyplane/skyplane/utils/imports.py", line 33, in wrapped
return fn(*modules_imported, *args, **kwargs)
File "/Users/sarahwooders/repos/skyplane/skyplane/api/tracker.py", line 278, in monitor_transfer
do_parallel(lambda i: i.run_command("echo 1"), self.dataplane.bound_nodes.values(), n=8)
File "/Users/sarahwooders/repos/skyplane/skyplane/utils/fn.py", line 57, in do_parallel
args, result = future.result()
File "/usr/local/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/_base.py",
line 451, in result
return self.__get_result()
File "/usr/local/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/_base.py",
line 403, in __get_result
raise self._exception
File "/usr/local/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/thread.py",
line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/Users/sarahwooders/repos/skyplane/skyplane/utils/fn.py", line 43, in wrapped_fn
return args, func(args)
File "/Users/sarahwooders/repos/skyplane/skyplane/api/tracker.py", line 278, in <lambda>
do_parallel(lambda i: i.run_command("echo 1"), self.dataplane.bound_nodes.values(), n=8)
File "/Users/sarahwooders/repos/skyplane/skyplane/compute/server.py", line 241, in run_command
_, stdout, stderr = client.exec_command(command)
File "/Users/sarahwooders/repos/skyplane/env/lib/python3.10/site-packages/paramiko/client.py", line 560, in exec_command
chan = self._transport.open_session(timeout=timeout)
AttributeError: 'NoneType' object has no attribute 'open_session'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/threading.py", line 1016, in
_bootstrap_inner
self.run()
File "/Users/sarahwooders/repos/skyplane/skyplane/api/tracker.py", line 216, in run
raise e
File "/Users/sarahwooders/repos/skyplane/skyplane/api/tracker.py", line 214, in run
results.append(future.result())
File "/usr/local/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/_base.py",
line 451, in result
return self.__get_result()
File "/usr/local/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/_base.py",
line 403, in __get_result
raise self._exception
File "/usr/local/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/thread.py",
line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/Users/sarahwooders/repos/skyplane/skyplane/api/tracker.py", line 194, in monitor_single_dst_helper
UsageClient.log_exception(
File "/Users/sarahwooders/repos/skyplane/skyplane/api/usage.py", line 147, in log_exception
stats = client.make_error(
File "/Users/sarahwooders/repos/skyplane/skyplane/api/usage.py", line 304, in make_error
dest_regions = [tag.split(":")[1] for tag in dest_region_tags]
File "/Users/sarahwooders/repos/skyplane/skyplane/api/usage.py", line 304, in <listcomp>
dest_regions = [tag.split(":")[1] for tag in dest_region_tags]
IndexError: list index out of range
⠇ Transfer progressaws:us-east-1 ━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.1/122.7 GiB 473.2 MB/s 0:04:14%
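Note on the second traceback: the IndexError comes from the expression tag.split(":")[1] in usage.py, which fails for any tag that contains no colon. The log does not show what dest_region_tags actually held at that point, so the snippet below only reproduces the failure mode in isolation and sketches one possible guard. region_from_tag is a hypothetical helper for illustration, not part of Skyplane, and the "unknown" fallback is an assumption.

```python
def region_from_tag(tag: str) -> str:
    """Return the region part of a 'provider:region' tag.

    Hypothetical helper (not Skyplane code): falls back to 'unknown'
    instead of raising when the tag has no ':' or an empty region.
    """
    _provider, sep, region = tag.partition(":")
    return region if sep and region else "unknown"


# The expression from the traceback raises for a tag without a colon.
try:
    [tag.split(":")[1] for tag in ["aws"]]
except IndexError as exc:
    print("IndexError:", exc)

# The guarded version handles well-formed and malformed tags alike.
print([region_from_tag(t) for t in ["aws:us-east-1", "gcp:us-central1-a", "aws", ""]])
```

A guard like this would only stop the error-reporting path from crashing on top of the original AttributeError; it would not by itself explain or fix the leaked VM.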
Environment info (please complete the following information):
SKY-270