ERDDAP container does not restart after hypervisor power outage #15

Open
7yl4r opened this issue Jul 7, 2021 · 0 comments

Labels
bug Something isn't working

7yl4r commented Jul 7, 2021

I found the ERDDAP container unexpectedly down after a power outage on the hypervisor (dune).

[root@dune erddap-config]# docker container ls -a
CONTAINER ID   IMAGE                  COMMAND                  CREATED       STATUS                    PORTS                                                 NAMES
55f9d4f95018   axiom/docker-erddap    "/entrypoint.sh cata…"   3 weeks ago   Exited (255) 4 days ago   0.0.0.0:8080->8080/tcp, :::8080->8080/tcp             erddap
eb42e6974ea1   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Up 4 days (healthy)       0.0.0.0:8888->8080/tcp, :::8888->8080/tcp             mbon-dashboard-server_airflow-webserver_1
9c17ae6debac   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Up About an hour          8080/tcp                                              mbon-dashboard-server_airflow-worker_1
714e01bb151f   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Up 4 days (healthy)       0.0.0.0:5555->5555/tcp, :::5555->5555/tcp, 8080/tcp   mbon-dashboard-server_flower_1
463faf47c645   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Up 4 days                 8080/tcp                                              mbon-dashboard-server_airflow-scheduler_1
f9f450e639b4   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Exited (0) 4 weeks ago                                                          mbon-dashboard-server_airflow-init_1
1d1eb66f64b4   postgres:13            "docker-entrypoint.s…"   4 weeks ago   Up 4 days (healthy)       5432/tcp                                              mbon-dashboard-server_postgres_1
ae8d44f3dd8b   redis:latest           "docker-entrypoint.s…"   4 weeks ago   Up 4 days (healthy)       0.0.0.0:6379->6379/tcp, :::6379->6379/tcp             mbon-dashboard-server_redis_1

Nothing interesting in the output of docker logs erddap.
Trying to start it back up, I see this error:

[root@dune erddap-config]# docker-compose up -d --build
Starting erddap ... 

ERROR: for erddap  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

ERROR: for erddap  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).
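
The error text itself names two knobs: --verbose and COMPOSE_HTTP_TIMEOUT. A minimal sketch of a retry that uses both is below. The function name, the COMPOSE_CMD variable, and the 300-second default are assumptions for illustration only, and a longer timeout does not help if the daemon is genuinely stuck rather than slow.

```shell
# Sketch: retry the start with a longer API timeout and debug output.
# COMPOSE_CMD, compose_up_patient, and the 300 s default are illustrative
# assumptions, not something taken from this issue.
COMPOSE_CMD="${COMPOSE_CMD:-docker-compose}"

compose_up_patient() {
    timeout_s="${1:-300}"
    # COMPOSE_HTTP_TIMEOUT and --verbose are the two options named in the
    # error message; the timeout only applies to this one invocation.
    COMPOSE_HTTP_TIMEOUT="$timeout_s" $COMPOSE_CMD --verbose up -d --build
}

# Usage (run from the directory holding docker-compose.yml):
#   compose_up_patient        # 300 s timeout
#   compose_up_patient 120    # 120 s timeout
```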

LAN & WAN seem fine:

[root@dune erddap-config]# ping google.com
PING google.com (172.217.2.142) 56(84) bytes of data.
64 bytes from yyz08s14-in-f142.1e100.net (172.217.2.142): icmp_seq=1 ttl=118 time=8.63 ms
64 bytes from yyz08s14-in-f142.1e100.net (172.217.2.142): icmp_seq=2 ttl=118 time=8.70 ms
^C
--- google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 3ms
rtt min/avg/max/mdev = 8.634/8.666/8.698/0.032 ms
[root@dune erddap-config]# ping yin
PING yinmaster (192.168.1.203) 56(84) bytes of data.
64 bytes from yinmaster (192.168.1.203): icmp_seq=1 ttl=64 time=0.324 ms
64 bytes from yinmaster (192.168.1.203): icmp_seq=2 ttl=64 time=0.193 ms
^C
--- yinmaster ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 15ms
rtt min/avg/max/mdev = 0.193/0.258/0.324/0.067 ms

I tried docker-compose up again and got the same error, so I rebooted.

[root@dune ~]# docker container ls -a
CONTAINER ID   IMAGE                  COMMAND                  CREATED       STATUS                    PORTS                                                 NAMES
55f9d4f95018   axiom/docker-erddap    "/entrypoint.sh cata…"   3 weeks ago   Exited (255) 5 days ago   0.0.0.0:8080->8080/tcp, :::8080->8080/tcp             erddap
eb42e6974ea1   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Up 58 minutes (healthy)   0.0.0.0:8888->8080/tcp, :::8888->8080/tcp             mbon-dashboard-server_airflow-webserver_1
9c17ae6debac   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Up 58 minutes             8080/tcp                                              mbon-dashboard-server_airflow-worker_1
714e01bb151f   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Up 58 minutes (healthy)   0.0.0.0:5555->5555/tcp, :::5555->5555/tcp, 8080/tcp   mbon-dashboard-server_flower_1
463faf47c645   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Up 58 minutes             8080/tcp                                              mbon-dashboard-server_airflow-scheduler_1
f9f450e639b4   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Exited (0) 4 weeks ago                                                          mbon-dashboard-server_airflow-init_1
1d1eb66f64b4   postgres:13            "docker-entrypoint.s…"   4 weeks ago   Up 58 minutes (healthy)   5432/tcp                                              mbon-dashboard-server_postgres_1
ae8d44f3dd8b   redis:latest           "docker-entrypoint.s…"   4 weeks ago   Up 58 minutes (healthy)   0.0.0.0:6379->6379/tcp, :::6379->6379/tcp             mbon-dashboard-server_redis_1

[root@dune erddap-config]# docker-compose up -d --build
Starting erddap ... 
Starting erddap ... error

ERROR: for erddap  Cannot start service erddap: driver failed programming external connectivity on endpoint erddap (b5bdc77641d2ba4a71a75443f323944e1f9951697e00e47b5268fad6a6990cb6): Bind for 0.0.0.0:8080 failed: port is already allocated

ERROR: for erddap  Cannot start service erddap: driver failed programming external connectivity on endpoint erddap (b5bdc77641d2ba4a71a75443f323944e1f9951697e00e47b5268fad6a6990cb6): Bind for 0.0.0.0:8080 failed: port is already allocated
ERROR: Encountered errors while bringing up the project.

Well, that is different. Now that I am looking at it, there might indeed be some port conflicts with the airflow containers. And now that I am looking at those: we're not using them, and there isn't a docker-compose.yml for them. Let's clean this up:

[root@dune ~]# docker container stop mbon-dashboard-server_airflow-webserver_1 mbon-dashboard-server_airflow-worker_1 mbon-dashboard-server_flower_1 mbon-dashboard-server_airflow-scheduler_1 mbon-dashboard-server_postgres_1 mbon-dashboard-server_redis_1

[root@dune erddap-config]# docker container ls -a
CONTAINER ID   IMAGE                  COMMAND                  CREATED       STATUS                      PORTS     NAMES
eb42e6974ea1   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Exited (0) 15 seconds ago             mbon-dashboard-server_airflow-webserver_1
9c17ae6debac   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Exited (0) 15 seconds ago             mbon-dashboard-server_airflow-worker_1
714e01bb151f   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Exited (0) 16 seconds ago             mbon-dashboard-server_flower_1
463faf47c645   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Exited (1) 15 seconds ago             mbon-dashboard-server_airflow-scheduler_1
f9f450e639b4   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago   Exited (0) 4 weeks ago                mbon-dashboard-server_airflow-init_1
1d1eb66f64b4   postgres:13            "docker-entrypoint.s…"   4 weeks ago   Exited (0) 16 seconds ago             mbon-dashboard-server_postgres_1
ae8d44f3dd8b   redis:latest           "docker-entrypoint.s…"   4 weeks ago   Exited (0) 17 seconds ago             mbon-dashboard-server_redis_1

Try again:

[root@dune erddap-config]# docker-compose up -d --build
Creating network "erddap-config_default" with the default driver
Creating erddap ... 
Creating erddap ... error

ERROR: for erddap  Cannot start service erddap: driver failed programming external connectivity on endpoint erddap (ae39dce980c8d428c9a5c3a3a989e5e1c8cf47338466fab847143bbd0cd33b82): Bind for 0.0.0.0:8080 failed: port is already allocated

ERROR: for erddap  Cannot start service erddap: driver failed programming external connectivity on endpoint erddap (ae39dce980c8d428c9a5c3a3a989e5e1c8cf47338466fab847143bbd0cd33b82): Bind for 0.0.0.0:8080 failed: port is already allocated
ERROR: Encountered errors while bringing up the project.

hmm

[root@dune erddap-config]# lsof -i -P -n | grep 8080
docker-pr  2551    root    4u  IPv4   3301      0t0  TCP *:8080 (LISTEN)
docker-pr  2559    root    4u  IPv6  28007      0t0  TCP *:8080 (LISTEN)
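
lsof only shows that a docker-proxy process holds the port, not which container it belongs to. A rough sketch for mapping a host port back to a container by parsing the Ports column of docker ps is below. The function name is hypothetical, and it matches on the host side of each mapping (the part before "->"), so a mapping like 8888->8080 is not reported for port 8080.

```shell
# Sketch: list containers whose *host* side of a port mapping is the given port.
# containers_on_host_port is a hypothetical helper; it only parses the
# text that `docker ps` prints and does not handle port ranges.
containers_on_host_port() {
    port="$1"
    # -a also shows containers in the "Created" state, which can list ports too.
    docker ps -a --format '{{.Names}}\t{{.Status}}\t{{.Ports}}' \
        | grep -- ":${port}->"
}

# Usage:
#   containers_on_host_port 8080
```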

[root@dune erddap-config]# systemctl stop docker
[root@dune erddap-config]# lsof -i -P -n | grep 8080
[root@dune erddap-config]# systemctl start docker
[root@dune erddap-config]# docker-compose up -d --build
Starting erddap ... 

ERROR: for erddap  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

ERROR: for erddap  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).

uhhhh...

[root@dune erddap-config]# lsof -i -P -n | grep 8080
docker-pr 52624    root    4u  IPv4 245188      0t0  TCP *:8080 (LISTEN)
docker-pr 52631    root    4u  IPv6 245192      0t0  TCP *:8080 (LISTEN)
[root@dune erddap-config]# docker container ls -a
CONTAINER ID   IMAGE                  COMMAND                  CREATED         STATUS                   PORTS                                                 NAMES
2089de3635d0   axiom/docker-erddap    "/entrypoint.sh cata…"   3 minutes ago   Created                  0.0.0.0:8080->8080/tcp, :::8080->8080/tcp             erddap
eb42e6974ea1   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago     Up 2 minutes (healthy)   0.0.0.0:8888->8080/tcp, :::8888->8080/tcp             mbon-dashboard-server_airflow-webserver_1
9c17ae6debac   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago     Up 2 minutes             8080/tcp                                              mbon-dashboard-server_airflow-worker_1
714e01bb151f   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago     Up 2 minutes (healthy)   0.0.0.0:5555->5555/tcp, :::5555->5555/tcp, 8080/tcp   mbon-dashboard-server_flower_1
463faf47c645   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago     Up 2 minutes             8080/tcp                                              mbon-dashboard-server_airflow-scheduler_1
f9f450e639b4   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   4 weeks ago     Exited (0) 4 weeks ago                                                         mbon-dashboard-server_airflow-init_1
1d1eb66f64b4   postgres:13            "docker-entrypoint.s…"   4 weeks ago     Up 2 minutes (healthy)   5432/tcp                                              mbon-dashboard-server_postgres_1
ae8d44f3dd8b   redis:latest           "docker-entrypoint.s…"   4 weeks ago     Up 2 minutes (healthy)   0.0.0.0:6379->6379/tcp, :::6379->6379/tcp             mbon-dashboard-server_redis_1

WHAT?!?

docker-compose.yml only has the ERDDAP container in it.
Why did those airflow containers turn back on?
Let's try again but be less nice to them:

[root@dune erddap-config]# docker container stop mbon-dashboard-server_airflow-webserver_1 mbon-dashboard-server_airflow-worker_1 mbon-dashboard-server_flower_1 mbon-dashboard-server_airflow-scheduler_1 mbon-dashboard-server_postgres_1 mbon-dashboard-server_redis_1

[root@dune erddap-config]# docker container prune
WARNING! This will remove all stopped containers.
Are you sure you want to continue? [y/N] y
Deleted Containers:
2089de3635d017a251205da3ad3c5f7c4e27e44b27714ace73d913b8e3d235d4
eb42e6974ea131887214dddd8e53472720be510c42a86220ab83cebd146cac18
9c17ae6debacad82a405d6139bbc90b6bdf7c7631b11fa29672310210d89ce04
714e01bb151fe2e9f0d26a8fcbed2690a3d43ba0201c0291b07f6f602eaedaea
463faf47c645748b6cc0a8eb3dc2f32ae9ace22eb37274cdb9941ad8a858ce93
f9f450e639b41670e20042ff37f0f5d39d1c5034991252791f98fc144ac79f6a
1d1eb66f64b431be214ecbda37fec865eec9650876e6983bac40b2d329c4d431
ae8d44f3dd8b8d10bfc98eb44a0844ff7c7c6091bec8836ddd9b7cc9c3020f6d

Total reclaimed space: 154.3MB
[root@dune erddap-config]# docker container ls -a
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
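
On why the airflow containers came back after systemctl start docker: one possible explanation, not confirmed anywhere in this issue, is their restart policy. A container with the policy "always" is started again when the Docker daemon restarts, even if it was stopped by hand, while a container with no policy stays down. That would also fit ERDDAP staying down after the power outage. The sketch below shows how this could be checked and changed without pruning; the function names are hypothetical.

```shell
# Sketch: inspect and change restart policies (function names are hypothetical).

# Print "<name> <policy>" for every container, running or stopped.
show_restart_policies() {
    docker ps -aq | while read -r id; do
        docker inspect --format '{{.Name}} {{.HostConfig.RestartPolicy.Name}}' "$id"
    done
}

# Keep the named containers from coming back when the daemon restarts.
disable_restart() {
    docker update --restart=no "$@"
}

# Ask Docker to bring the named containers back after a reboot or daemon
# restart, unless they were stopped on purpose.
enable_restart() {
    docker update --restart=unless-stopped "$@"
}

# Usage:
#   show_restart_policies
#   disable_restart mbon-dashboard-server_airflow-webserver_1
#   enable_restart erddap
```

A policy set with docker update is lost when docker-compose recreates the container, so for ERDDAP the longer-lived place for it would be a restart: key on the service in docker-compose.yml.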

[root@dune erddap-config]# docker container ls -a
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
[root@dune erddap-config]# lsof -i -P -n | grep 8080
docker-pr 56599    root    4u  IPv4 297088      0t0  TCP *:8080 (LISTEN)
docker-pr 56606    root    4u  IPv6 297997      0t0  TCP *:8080 (LISTEN)
[root@dune erddap-config]# systemctl stop docker
Warning: Stopping docker.service, but it can still be activated by:
  docker.socket
[root@dune erddap-config]# lsof -i -P -n | grep 8080
[root@dune erddap-config]# systemctl start docker
[root@dune erddap-config]# lsof -i -P -n | grep 8080
[root@dune erddap-config]# docker container ls -a
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
[root@dune erddap-config]# docker-compose up -d --build
Creating erddap ... error

ERROR: for erddap  Cannot start service erddap: error while creating mount source path '/srv/imars-objects/modis_aqua_fk': mkdir /srv/imars-objects/modis_aqua_fk: permission denied

ERROR: for erddap  Cannot start service erddap: error while creating mount source path '/srv/imars-objects/modis_aqua_fk': mkdir /srv/imars-objects/modis_aqua_fk: permission denied
ERROR: Encountered errors while bringing up the project.

Okay, so that error is because thing2 is down (https://github.com/USF-IMARS/server-status/issues/166).
I brought thing2 back up, and ERDDAP then started without issue.

[root@dune erddap-config]# docker container ls -a
CONTAINER ID   IMAGE                 COMMAND                  CREATED          STATUS         PORTS                                       NAMES
5bdcd91cfd9a   axiom/docker-erddap   "/entrypoint.sh cata…"   55 minutes ago   Up 2 minutes   0.0.0.0:8080->8080/tcp, :::8080->8080/tcp   erddap
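
The mkdir in the earlier error suggests Docker tried to create a bind-mount source directory that was missing while thing2 was down. A small pre-flight check could turn that into a clearer message before docker-compose runs. This is only a sketch: the function name is hypothetical, and the path in the usage comment is simply the one from the error message, not a full list of ERDDAP's mounts.

```shell
# Sketch: refuse to start until every bind-mount source directory exists.
# require_dirs is a hypothetical helper, not part of docker or docker-compose.
require_dirs() {
    for d in "$@"; do
        if [ ! -d "$d" ]; then
            echo "missing mount source: $d" >&2
            return 1
        fi
    done
}

# Usage:
#   require_dirs /srv/imars-objects/modis_aqua_fk && docker-compose up -d --build
```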

7yl4r added the bug label Jul 7, 2021