Kurtosis zkevm-stateless-executor-001 stopped #359

Open
ArrjunPradeep opened this issue Nov 7, 2024 · 15 comments
Labels
bug Something isn't working

Comments

@ArrjunPradeep

System information

Ubuntu 22.04

Commit id

d

Description & steps to reproduce

I get the following error when running the command "kurtosis run --enclave cdk ." (deploying to Sepolia):

image
image

Desired behavior

Network running

What is the severity of this bug?

Critical; I am blocked and Kurtosis CDK is unusable for me because of this bug.

@ArrjunPradeep added the bug label Nov 7, 2024
@praetoriansentry
Collaborator

I don't think I've seen this before. It looks like the machine this is running on is very slow. As a result, the executor is taking so long to start that Kurtosis is assuming the process isn't starting properly.

What type of machine is this running on? I'll paste my logs below so you can compare, but the process in your screenshot looks about 10x slower.

image

@ArrjunPradeep
Author

Windows WSL2 -> Ubuntu 22.04

@praetoriansentry
Collaborator

One option would be to add a wait attribute to the PortSpecs here:

"hash-db": PortSpec(args["zkevm_hash_db_port"], application_protocol="grpc"),
"executor": PortSpec(args["zkevm_executor_port"], application_protocol="grpc"),

Setting wait = None should make the error go away, but it could cause problems for other components that expect the executor to be running.

Setting wait = "10m" or some longer duration should give your system more time to load. The risk is that the machine is simply very slow, or that something else is wrong and startup will take a very long time regardless. I would experiment with both and see if you can get anywhere.
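
To judge what timeout is realistic, it can help to measure how long the executor port actually takes to open. The following is a minimal sketch in plain Python, not Kurtosis code. It assumes the readiness check behind wait amounts to retrying a TCP connection until a deadline; the function name wait_for_port and its parameters are hypothetical.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float, interval_s: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Assumption: this only approximates a port readiness check; it is not
    how Kurtosis itself is implemented.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            # Each attempt is capped by the time left before the deadline.
            with socket.create_connection((host, port), timeout=min(interval_s, remaining)):
                return True
        except OSError:
            # Connection refused or timed out: pause briefly, then retry.
            time.sleep(min(interval_s, max(0.0, deadline - time.monotonic())))
```

Calling it with the host and the mapped executor port of your enclave and a timeout of 600 seconds would roughly mirror wait = "10m". If it still returns False, a longer wait alone is unlikely to help.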

@ArrjunPradeep
Author

No, it doesn't work as expected. I ran into the same error described earlier in this thread. According to the official docs, if you notice some services, such as the zkevm-stateless-executor or zkevm-prover, consistently having the status of STOPPED, you should try increasing the Docker memory allocation. How much memory should it be?

@praetoriansentry
Collaborator

That's strange. If you changed the wait time, you shouldn't get the same error. It would make sense if the startup never finished, but it's strange to get exactly the same error as above.

I would guess you probably want at least 4 GB. I'm running 3 networks right now, and the biggest container appears to be the zkProver process, which is using roughly 2.5 GB.

image

It's possible the max memory goes higher than that when it's initializing.

@leovct
Member

leovct commented Nov 8, 2024

I recommend setting it to at least 10 to 12 GB. It may use less when running idle, but it requires a lot of memory while initialising. The different zkevm provers (zkevm-prover, zkevm-stateless-executor) use around 2.5 GB and the agglayer prover around 1 GB. Once you add all the other components, it quickly adds up.
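
As a rough worked example of how this adds up, the sketch below sums the figures quoted in this thread. It assumes each zkevm prover uses about 2.5 GB (the wording could also mean 2.5 GB combined), and the names PROVER_USAGE_GB, prover_total_gb, and headroom_gb are hypothetical.

```python
# Approximate steady-state figures quoted in this thread, in GB.
# Assumption: 2.5 GB applies to each zkevm prover separately.
PROVER_USAGE_GB = {
    "zkevm-prover": 2.5,
    "zkevm-stateless-executor": 2.5,
    "agglayer-prover": 1.0,
}


def prover_total_gb() -> float:
    """Memory used by the provers alone, excluding every other component."""
    return sum(PROVER_USAGE_GB.values())


def headroom_gb(docker_limit_gb: float) -> float:
    """Memory left for the other components and for initialisation peaks."""
    return docker_limit_gb - prover_total_gb()
```

Under these assumptions the provers alone take about 6 GB, so a 4 GB limit is already 2 GB short, while 12 GB leaves about 6 GB for everything else and for the higher usage during startup.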

@leovct
Member

leovct commented Nov 14, 2024

Hello @ArrjunPradeep, have you managed to fix the issue?

@ArrjunPradeep
Author

Hi @leovct, I have set it to around 12 GB. Now it's showing a different error:

Screenshot 2024-11-14 170837
Screenshot 2024-11-14 170854
Screenshot 2024-11-14 170906
Screenshot 2024-11-14 170919

@leovct
Member

leovct commented Nov 14, 2024

> Hi @leovct, I have set it to around 12 GB. Now it's showing a different error:

It looks like your Docker daemon stopped. You might have to restart it. Then you'll need to clean the enclaves with kurtosis clean --all before redeploying the enclave with kurtosis run --enclave cdk .

@ArrjunPradeep
Author

I tried cleaning the enclaves and starting again, but I still get:

image

@leovct
Member

leovct commented Nov 14, 2024

Can you pull the latest changes?

@ArrjunPradeep
Author

ArrjunPradeep commented Nov 15, 2024

Wait time is 20m:

ports = {
    "hash-db": PortSpec(args["zkevm_hash_db_port"], application_protocol="grpc", wait='20m'),
    "executor": PortSpec(args["zkevm_executor_port"], application_protocol="grpc", wait='20m'),
}

Screenshot 2024-11-15 102944
Screenshot 2024-11-15 102956
Screenshot 2024-11-15 103004

@leovct
Member

leovct commented Nov 15, 2024

As @praetoriansentry mentioned, it looks like your machine is very slow. Even after 20 minutes, the prover has still not started. I would advise allocating more memory to Docker, as suggested above.
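
Before changing anything, it may be worth confirming how much memory the Docker daemon can actually see. This is a minimal sketch that reads the MemTotal field from docker info and compares it with the 10 GB lower bound suggested above. The function names and the GiB-based threshold are assumptions for illustration.

```python
import subprocess

GIB = 1024 ** 3


def docker_mem_total_bytes() -> int:
    """Ask the Docker daemon for the total memory it reports, in bytes."""
    result = subprocess.run(
        ["docker", "info", "--format", "{{.MemTotal}}"],
        capture_output=True,
        text=True,
        check=True,  # raise if the daemon is stopped or docker is missing
    )
    return int(result.stdout.strip())


def meets_recommendation(mem_total_bytes: int, recommended_gib: float = 10.0) -> bool:
    """True if the reported memory reaches the recommended lower bound."""
    return mem_total_bytes / GIB >= recommended_gib
```

Running meets_recommendation(docker_mem_total_bytes()) on the affected machine shows whether the new allocation took effect. If the call raises instead, the daemon is probably not running.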

@0xPolygon deleted a comment from Mody6595 Nov 18, 2024
@ArrjunPradeep
Author

ArrjunPradeep commented Nov 20, 2024

@leovct I am now running on macOS, and I am still getting an error:

Screenshot 2024-11-20 at 10 54 50 PM image

@leovct
Member

leovct commented Nov 21, 2024

I would suggest cleaning the enclave and trying again.
