Error when running make 5gc install #2

Open
hussainahmad1995 opened this issue Aug 24, 2023 · 6 comments

Comments

@hussainahmad1995

I am following the quickstart guide on the Aether website, but when I run the following command:
make aether-5gc-install

I get the following failure from one of the Ansible tasks in /home/atlas-support/projects/aether-onramp/deps/5gc/roles/core/tasks/install.yml:

- name: deploy aether 5gc
  block:
    - name: deploy aether 5gc
      kubernetes.core.helm:
        update_repo_cache: true
        name: sd-core
        release_namespace: omec
        create_namespace: true
        chart_ref: "{{ core.helm.chart_ref }}"
        chart_version: "{{ core.helm.chart_version }}"
        values_files:
          - /tmp/sdcore-5g-values.yaml
        wait: true
        wait_timeout: "1m30s"
        force: true
      when: inventory_hostname in groups['master_nodes'] 


TASK [core : deploy aether 5gc] ****************************************************************************************************************
fatal: [node1]: FAILED! => {"changed": false, "command": "/usr/local/bin/helm --version=0.12.6 upgrade -i --reset-values --wait --timeout 1m30s --create-namespace --values=/tmp/sdcore-5g-values.yaml sd-core aether/sd-core", "msg": "Failure when executing Helm command. Exited 1.\nstdout: Release \"sd-core\" does not exist. Installing it now.\n\nstderr: coalesce.go:175: warning: skipped value for kafka.config: Not a table.\nError: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline\n", "stderr": "coalesce.go:175: warning: skipped value for kafka.config: Not a table.\nError: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline\n", "stderr_lines": ["coalesce.go:175: warning: skipped value for kafka.config: Not a table.", "Error: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline"], "stdout": "Release \"sd-core\" does not exist. Installing it now.\n", "stdout_lines": ["Release \"sd-core\" does not exist. Installing it now."]}

The pod status shows that one of the deployed pods (upf-0) keeps crashing. Any suggestions on where things are going wrong?

kubectl get pods -n omec

NAME                         READY   STATUS             RESTARTS        AGE
amf-5887bbf6c5-f9nbb         1/1     Running            0               13m
ausf-6dbb7655c7-mlcgb        1/1     Running            0               11m
kafka-0                      1/1     Running            1 (12m ago)     13m
metricfunc-b9f8c667b-9wfgq   1/1     Running            0               13m
mongodb-0                    1/1     Running            0               13m
mongodb-1                    1/1     Running            0               12m
mongodb-arbiter-0            1/1     Running            0               13m
nrf-54bf88c78c-tmn8f         1/1     Running            0               13m
nssf-5b85b8978d-dxcrv        1/1     Running            0               13m
pcf-758d7cfb48-tlflp         1/1     Running            0               13m
sd-core-zookeeper-0          1/1     Running            0               13m
simapp-6cccd6f787-ng7f4      1/1     Running            0               13m
smf-7f89c6d849-njxvk         1/1     Running            0               13m
udm-768b9987b4-ppckx         1/1     Running            0               13m
udr-8566897d45-6d82h         1/1     Running            0               13m
upf-0                        3/5     CrashLoopBackOff   13 (15s ago)    11m
webui-5894ffd49d-4bcgg       1/1     Running            0               13m
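
A quick way to narrow this down is to list only the pods that are not fully ready. The Helm task above runs with wait: true and wait_timeout: "1m30s", so the install most likely fails because upf-0 never becomes Ready within that window. The sketch below is a hypothetical helper (unhealthy_pods is not part of OnRamp) that filters the kubectl get pods table read from standard input:

```shell
# Print NAME, READY and STATUS for every pod that is not fully ready.
# Expects the default `kubectl get pods` table on stdin.
unhealthy_pods() {
  awk '$1 != "NAME" && NF >= 3 {
         # READY is "ready/total"; split it to compare both halves.
         split($2, ready, "/")
         # Completed pods (finished jobs) are expected to show 0/N.
         if ($3 == "Completed") next
         if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
       }'
}

# Usage (needs access to the cluster):
#   kubectl get pods -n omec | unhealthy_pods
```

For the listing above this would report only upf-0, which is the pod to inspect next with kubectl describe pod and kubectl logs.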

hussainahmad1995 changed the title from "Error when running make 5gc" to "Error when running make 5gc install" on Aug 24, 2023

@mbilal92
Contributor

It appears that one of your UPF pods is experiencing recurrent crashes.

Could you please provide the logs for the following containers within the UPF pod: "bessd," "routectl," "web," "pfcp-agent," and "arping"?

You can use the command kubectl logs upf-0 'container-name' -p -n omec to retrieve these logs.

@hussainahmad1995
Author

hussainahmad1995 commented Sep 19, 2023

Here are the logs for the "bessd", "routectl", "web", "pfcp-agent", and "arping" containers:

$ kubectl logs -n omec upf-0 bessd
+ bessd -m 0 -f -grpc-url=0.0.0.0:10514

$ kubectl logs -n omec -p upf-0 routectl
/opt/bess/bessctl/conf/route_control.py:311: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if item.prefix_len is 0:
Connecting to BESS daemon...
Error connecting to BESS daemon. Retrying in 2sec...
Error connecting to BESS daemon. Retrying in 2sec...
Error connecting to BESS daemon. Retrying in 2sec...
Error connecting to BESS daemon. Retrying in 2sec...
Error connecting to BESS daemon. Retrying in 2sec...
Traceback (most recent call last):
  File "/opt/bess/bessctl/conf/route_control.py", line 517, in <module>
    main()
  File "/opt/bess/bessctl/conf/route_control.py", line 483, in main
    connect_bessd()
  File "/opt/bess/bessctl/conf/route_control.py", line 438, in connect_bessd
    raise Exception('BESS connection failure.')
Exception: BESS connection failure.

$ kubectl logs -n omec -p upf-0 web
Error from server (BadRequest): previous terminated container "web" in pod "upf-0" not found

$ kubectl logs -n omec -p upf-0 pfcp-agent
Error from server (BadRequest): previous terminated container "pfcp-agent" in pod "upf-0" not found

$ kubectl logs -n omec -p upf-0 arping
Error from server (BadRequest): previous terminated container "arping" in pod "upf-0" not found
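
The "previous terminated container ... not found" errors mean that web, pfcp-agent, and arping have never restarted, so there is no previous instance for -p to show; their current logs are available without -p. A minimal sketch, assuming the five container names requested earlier in the thread and a hypothetical helper name (upf_logs), that tries the previous instance first and falls back to the current one:

```shell
# Dump logs for every container in the UPF pod.
# Arguments (both optional): namespace, pod name.
upf_logs() {
  ns="${1:-omec}"
  pod="${2:-upf-0}"
  for c in bessd routectl web pfcp-agent arping; do
    echo "===== ${c} ====="
    # -p selects the previously terminated instance of the container.
    # If the container never restarted that call fails, so fall back
    # to the logs of the instance that is running now.
    kubectl logs "$pod" -c "$c" -p -n "$ns" 2>/dev/null ||
      kubectl logs "$pod" -c "$c" -n "$ns"
  done
}

# Usage (needs access to the cluster):
#   upf_logs omec upf-0 > upf-0-logs.txt
```

Since routectl only reports that it cannot reach the BESS daemon, the bessd section of that output is probably the most informative one.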

@llpeterson
Contributor

Digging into the UPF logs might give us a hint, but I have to believe that this is a configuration error of some kind. Since you originally modified the ran_subnet field in vars/main.yml -- and I had you do a brute-force uninstall of k8s -- I wonder if there was a residual file left behind. It's a bother, but could you try the following:
(1) Uninstall the whole system
$ make 5gc-uninstall
$ make k8s-uninstall
(2) Look in /etc/systemd/network
-- I believe it should be empty after the uninstall
-- Let me know if it's not
(3) And if it is empty, reboot the server.
(4) Clone a fresh version of OnRamp
-- I'm not sure, but there's a chance of small fixes since you last cloned it
-- Edit two instances of data_iface, plus amf.ip like we did yesterday
(5) Reinstall the system
$ make k8s-install
$ make 5gc-install
(6) See if it works this time.
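
Step (2) can be checked with a small helper instead of by eye. This is a sketch only: dir_is_empty is a hypothetical name, and the make targets in the comments are the ones quoted in the steps above.

```shell
# Return 0 if the directory exists and is empty,
# 1 if it contains entries, 2 if it does not exist.
dir_is_empty() {
  dir="$1"
  [ -d "$dir" ] || return 2
  # ls -A lists hidden files too, but not "." and "..".
  [ -z "$(ls -A "$dir")" ]
}

# Example use between the uninstall and reinstall steps:
#   make 5gc-uninstall
#   make k8s-uninstall
#   if dir_is_empty /etc/systemd/network; then
#     echo "clean, safe to reboot"
#   else
#     ls -la /etc/systemd/network   # report what was left behind
#   fi
```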

@CrABonzz

CrABonzz commented Jan 16, 2024

Hello,
I get the same error. How was it fixed?
The crashed pods for me are upf and mongodb.
Thanks

@Bhuvaneshnetcon

Bhuvaneshnetcon commented Feb 9, 2024

Hi @llpeterson,
I built aether-onramp successfully and ran make aether-5gc-install, but my UE (a Quectel RM-520N-GL, 3GPP Release 16) does not get internet access. Note that the UPF is able to ping 8.8.8.8. My host Linux network settings are as follows:

  • IPv4 forwarding is enabled in sysctl
  • sudo iptables -t nat -A POSTROUTING -o dn-interface -j MASQUERADE
  • the firewall is disabled

Could you please help me get internet access on the UE?
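
The host settings listed above can be verified rather than assumed. A sketch, assuming the data-network interface name is passed as an argument (dn-interface is a placeholder, as in the rule above) and using a hypothetical helper name (check_ue_nat). It only checks IP forwarding and the MASQUERADE rule; it does not diagnose the UPF or the UE itself.

```shell
# Check that IPv4 forwarding is on and that a MASQUERADE rule exists
# for the given outgoing interface. Run from a root shell, because
# iptables needs root privileges.
check_ue_nat() {
  iface="$1"
  if [ "$(sysctl -n net.ipv4.ip_forward)" != "1" ]; then
    echo "IPv4 forwarding is disabled"
    return 1
  fi
  # iptables -C tests whether a matching rule is present
  # without modifying the table.
  if ! iptables -t nat -C POSTROUTING -o "$iface" -j MASQUERADE 2>/dev/null; then
    echo "no MASQUERADE rule for $iface"
    return 1
  fi
  echo "forwarding and NAT look configured for $iface"
}

# Usage:
#   check_ue_nat dn-interface
```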

@jaswanthvt

Hello,

I am facing the same error. How can I fix it?
The crashed pods are shown below:
kryptowire@aether:~$ kubectl get pods -n omec
NAME                          READY   STATUS             RESTARTS          AGE
amf-5887bbf6c5-psb7m          1/1     Running            0                 11h
ausf-6dbb7655c7-z7jnj         1/1     Running            0                 11h
kafka-0                       1/1     Running            1 (11h ago)       11h
metricfunc-55b47f58d5-tm6s8   1/1     Running            0                 11h
mongodb-0                     0/1     CrashLoopBackOff   137 (3m26s ago)   11h
mongodb-arbiter-0             0/1     CrashLoopBackOff   137 (3m40s ago)   11h
nrf-54bf88c78c-fr28b          1/1     Running            0                 11h
nssf-5b85b8978d-szs75         1/1     Running            0                 11h
pcf-758d7cfb48-jplz9          1/1     Running            0                 11h
sd-core-zookeeper-0           1/1     Running            0                 11h
simapp-6cccd6f787-2j5th       1/1     Running            0                 11h
smf-776ccbb869-jhv79          1/1     Running            0                 11h
udm-768b9987b4-p2x4z          1/1     Running            0                 11h
udr-8566897d45-pznns          1/1     Running            0                 11h
upf-0                         3/5     CrashLoopBackOff   270 (13s ago)     11h
webui-5894ffd49d-shvbv        1/1     Running            0                 11h

Thanks
