Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deploy command does not succeed or fail when called an a module that is already deployed #1035

Closed
matt2e opened this issue Mar 7, 2024 · 4 comments
Assignees

Comments

@matt2e
Copy link
Collaborator

matt2e commented Mar 7, 2024

Steps to repro:

  • Run ftl with a deployment
  • Run ftl deploy <existing deployment>
  • Expected the command to come back with a failure message

Output:

info:echo: Building module
info:echo: Deploying module
<Hangs forever>

ftl logs:

debug:controller0: Created deployment echo-93a97481df
debug:controller0: Replace deployment for: echo-93a97481df
debug:controller0: Deployment already exists: echo-93a97481df
@matt2e matt2e added the triage Issue needs triaging label Mar 7, 2024
@alecthomas alecthomas mentioned this issue Mar 6, 2024
@wesbillman
Copy link
Collaborator

There might be a race condition here with creating runners and deploying modules. I was able to see this log today:

ftl serve --recreate --log-level=debug
debug: Unary RPC failed: unavailable: dial tcp [::1]:8892: connect: connection refused: /xyz.block.ftl.v1.ControllerService/Ping
info: Starting FTL with 1 controller(s) and 0 runner(s)
debug: Reusing existing docker container "ftl-db-1" on port "54320" for postgres db
debug: Waiting for ftl-db-1 to be healthy
debug: Postgres DSN: postgres://postgres:secret@localhost:54320/ftl?sslmode=disable
debug:migrate: Applying: 001_init.sql
debug:controller0: Starting FTL controller
debug:controller0: Building console...
debug:controller0:
debug:controller0: up to date, audited 860 packages in 647ms
debug:controller0:
debug:controller0: 189 packages are looking for funding
debug:controller0:   run `npm fund` for details
debug:controller0:
debug:controller0: 1 moderate severity vulnerability
debug:controller0:
debug:controller0: To address all issues, run:
debug:controller0:   npm audit fix
debug:controller0:
debug:controller0: Run `npm audit` for details.
info:controller0: Web console available at: http://localhost:8892
debug:controller0: Starting DB listener
debug:controller0: Listening on http://localhost:8892
debug:controller0:
debug:controller0: > [email protected] dev
debug:controller0: > vite
debug:controller0:
debug:controller0: Port 5173 is in use, trying another one...
debug:controller0: Port 5174 is in use, trying another one...
debug:controller0: Port 5175 is in use, trying another one...
debug:controller0: Port 5176 is in use, trying another one...
debug:controller0: Port 5177 is in use, trying another one...
debug:controller0: Port 5178 is in use, trying another one...
debug:controller0:
debug:controller0:   VITE v4.5.2  ready in 232 ms
debug:controller0:
debug:controller0:   ➜  Local:   http://localhost:5179/
debug:controller0:   ➜  Network: use --host to expose
debug:controller0: Adding 1 replicas
debug:controller0: Seeded 0 deployments
debug:controller0: ….0-SNAPSHOT.jar| $ mvn -B -N install
debug:controller0: ….0-SNAPSHOT.jar|   mvn -B -pl :ftl-runtime install
debug:controller0: ….0-SNAPSHOT.jar| [INFO] Scanning for projects...
debug:controller0: ….0-SNAPSHOT.jar| [INFO]
debug:controller0: ….0-SNAPSHOT.jar| [INFO] ---------------------------< xyz.block:ftl >----------------------------
debug:controller0: ….0-SNAPSHOT.jar| [INFO] Building FTL 1.0-SNAPSHOT
debug:controller0: ….0-SNAPSHOT.jar| [INFO]   from pom.xml
debug:controller0: ….0-SNAPSHOT.jar| [INFO] --------------------------------[ pom ]---------------------------------
debug:controller0: ….0-SNAPSHOT.jar| [INFO]
debug:controller0: ….0-SNAPSHOT.jar| [INFO] --- install:3.1.1:install (default-install) @ ftl ---
debug:controller0: ….0-SNAPSHOT.jar| [INFO] Installing /Users/wesbillman/dev/ftl/pom.xml to /Users/wesbillman/.m2/repository/xyz/block/ftl/1.0-SNAPSHOT/ftl-1.0-SNAPSHOT.pom
debug:controller0: ….0-SNAPSHOT.jar| [INFO] ------------------------------------------------------------------------
debug:controller0: ….0-SNAPSHOT.jar| [INFO] BUILD SUCCESS
debug:controller0: ….0-SNAPSHOT.jar| [INFO] ------------------------------------------------------------------------
debug:controller0: ….0-SNAPSHOT.jar| [INFO] Total time:  0.109 s
debug:controller0: ….0-SNAPSHOT.jar| [INFO] Finished at: 2024-03-07T09:36:36-07:00
debug:controller0: ….0-SNAPSHOT.jar| [INFO] ------------------------------------------------------------------------
debug:controller0: Created new artefact d0b37a460be0654d47a21c2cda3e119550ddb87708275cfdd0d424deee362fe2
debug:controller0: Created deployment time-ff57b9a6a0
debug:controller0: Replace deployment for: time-ff57b9a6a0
debug:controller0: Reconciling time-ff57b9a6a0
debug:controller0: Need 1 more runners for time-ff57b9a6a0
error:controller0: Failed to increase deployment replicas: failed to claim runners for time-ff57b9a6a0: no idle runners found matching labels {"languages":["go"]}: not found
debug:controller0: Reconciling time-ff57b9a6a0
debug:controller0: Need 1 more runners for time-ff57b9a6a0
error:controller0: Failed to increase deployment replicas: failed to claim runners for time-ff57b9a6a0: no idle runners found matching labels {"languages":["go"]}: not found
debug:controller0: Reconciling time-ff57b9a6a0
debug:controller0: Need 1 more runners for time-ff57b9a6a0
error:controller0: Failed to increase deployment replicas: failed to claim runners for time-ff57b9a6a0: no idle runners found matching labels {"languages":["go"]}: not found
debug:controller0: Reconciling time-ff57b9a6a0
debug:controller0: Need 1 more runners for time-ff57b9a6a0
error:controller0: Failed to increase deployment replicas: failed to claim runners for time-ff57b9a6a0: no idle runners found matching labels {"languages":["go"]}: not found

After a few seconds of the above, I see this:

debug:controller0: Reconciling time-ff57b9a6a0
debug:controller0: Need 1 more runners for time-ff57b9a6a0
error:controller0: Failed to increase deployment replicas: failed to claim runners for time-ff57b9a6a0: no idle runners found matching labels {"languages":["go"]}: not found
debug:controller0: ….0-SNAPSHOT.jar| [INFO] Scanning for projects...
debug:controller0: ….0-SNAPSHOT.jar| [INFO]
debug:controller0: ….0-SNAPSHOT.jar| [INFO] -----------------------< xyz.block:ftl-runtime >------------------------
debug:controller0: ….0-SNAPSHOT.jar| [INFO] Building xyz.block:ftl-runtime 1.0-SNAPSHOT
debug:controller0: ….0-SNAPSHOT.jar| [INFO]   from pom.xml
debug:controller0: ….0-SNAPSHOT.jar| [INFO] --------------------------------[ jar ]---------------------------------
debug:controller0: ….0-SNAPSHOT.jar| [INFO]
debug:controller0: ….0-SNAPSHOT.jar| [INFO] --- dependency:3.6.1:copy (default) @ ftl-runtime ---

... maven build nonsense

debug:controller0: ….0-SNAPSHOT.jar| [INFO] --- jar:3.3.0:jar (default-jar) @ ftl-runtime ---
debug:controller0: ….0-SNAPSHOT.jar| [INFO] Building jar: /Users/wesbillman/dev/ftl/kotlin-runtime/ftl-runtime/target/ftl-runtime-1.0-SNAPSHOT.jar
debug:controller0: ….0-SNAPSHOT.jar| [INFO]
debug:controller0: ….0-SNAPSHOT.jar| [INFO] --- install:3.1.1:install (default-install) @ ftl-runtime ---
debug:controller0: ….0-SNAPSHOT.jar| [INFO] Installing /Users/wesbillman/dev/ftl/kotlin-runtime/ftl-runtime/pom.xml to /Users/wesbillman/.m2/repository/xyz/block/ftl-runtime/1.0-SNAPSHOT/ftl-runtime-1.0-SNAPSHOT.pom
debug:controller0: ….0-SNAPSHOT.jar| [INFO] Installing /Users/wesbillman/dev/ftl/kotlin-runtime/ftl-runtime/target/ftl-runtime-1.0-SNAPSHOT.jar to /Users/wesbillman/.m2/repository/xyz/block/ftl-runtime/1.0-SNAPSHOT/ftl-runtime-1.0-SNAPSHOT.jar
debug:controller0: ….0-SNAPSHOT.jar| [INFO] ------------------------------------------------------------------------
debug:controller0: ….0-SNAPSHOT.jar| [INFO] BUILD SUCCESS
debug:controller0: ….0-SNAPSHOT.jar| [INFO] ------------------------------------------------------------------------
debug:controller0: ….0-SNAPSHOT.jar| [INFO] Total time:  18.090 s
debug:controller0: ….0-SNAPSHOT.jar| [INFO] Finished at: 2024-03-07T09:36:56-07:00
debug:controller0: ….0-SNAPSHOT.jar| [INFO] ------------------------------------------------------------------------
debug:controller0: ….0-SNAPSHOT.jar| …ftl-runtime.jar| $ install -m 0600 kotlin-runtime/ftl-runtime/target/ftl-runtime-1.0-SNAPSHOT.jar build/template/ftl/jars/ftl-runtime.jar
debug:controller0: Starting runner: R01HRCWCS99A3ZAC63PKTV2DFJN
debug:runner0: Starting FTL Runner
debug:runner0: Deployment directory: /Users/wesbillman/Library/Caches/ftl-runner/runner0/deployments

The flow seems like the server starts and the controller is "online". But there are no runners yet, and they still need a maven build (from what I can tell in the log). So when we run ftl deploy ... it will build the module and then run a maven build, which takes quite a bit of time. I think this makes it look like the ftl deploy command has hung.

If I wait for the runner to build and be available before running ftl deploy it seems to work as expected...
Screenshot 2024-03-07 at 9 44 39 AM

@wesbillman
Copy link
Collaborator

This also only seems to be an issue when using ftl in our repo since it's rebuilding with bit. Release builds are ok still afaict.

@wesbillman wesbillman removed the triage Issue needs triaging label Mar 7, 2024
@wesbillman
Copy link
Collaborator

@matt2e I think this one is fixed maybe? I'm not able to reproduce. Might have been a side effect of some of the other FTL changes at the time?

@matt2e
Copy link
Collaborator Author

matt2e commented Mar 14, 2024

I think so too @wesbillman, haven't been able to reproduce on my side either

@matt2e matt2e closed this as completed Mar 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants