Node module #15

Closed
wants to merge 18 commits into from

zmrocze
Contributor

@zmrocze zmrocze commented Jan 26, 2024

WIP

The goal was to provide a cardano-node module with a first, small amount of configurability and to test it for block production in a local network.

related to issues #12 and #14

fixes #13

Status:

There are two modules, block-producer and relay-node, which differ in their access to secret keys and in topology.

There's a NixOS test that runs the two modules. The test is buggy: it passes only because it doesn't test the right thing. The nodes start but cannot connect to each other in the test, judging by the node traces:

block_producer # [   11.253099] cardano-node-start[637]: [blockpro:cardano.node.ConnectionManager:Info:64] [2024-02-13 10:33:58.37 UTC] TrConnectError (Just 127.0.0.1:3001) 192.168.1.1:3001 Network.Socket.connect: <socket: 30>: does not exist (Connection refused)
[...]
relay_node # [   17.990256] cardano-node-start[655]: [relaynod:cardano.node.ConnectionManager:Info:77] [2024-02-13 10:34:06.05 UTC] TrConnectError (Just 127.0.0.1:3001) 192.168.1.0:3001 Network.Socket.connect: <socket: 35>: invalid argument (Invalid argument)

The IPs above are assigned with virtualisation.vlans = [1];. Running nmap in the interactive test (run-test block-producer -i) shows both hosts up, as expected, but the nodes still don't connect:

>>> relay_node.execute("nmap -sP 192.168.1.*")
(0, 'Starting Nmap 7.94 ( https://nmap.org ) at 2024-02-12 18:18 UTC
Nmap scan report for block_producer (192.168.1.1)
Host is up (0.00078s latency).
MAC Address: 52:54:00:12:01:01 (QEMU virtual NIC)
Nmap scan report for relay_node (192.168.1.2)
Host is up.
Nmap done: 256 IP addresses (2 hosts up) scanned in 1.89 seconds')
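
One detail in the traces above may be worth checking: the relay dials 192.168.1.0:3001, while nmap finds the two hosts at 192.168.1.1 and 192.168.1.2. Assuming the VLAN is a /24 (the ifconfig output in this thread shows netmask 255.255.255.0), 192.168.1.0 is the network address rather than a host, which could explain the "Invalid argument" error. Both traces also list 127.0.0.1:3001 as what appears to be the local address, so the node's bind address might be worth a look too; that is a guess, not something this thread confirms. A minimal Python sketch of the address check (the helper name `classify` is hypothetical):

```python
import ipaddress

# Assumption: the test VLAN is 192.168.1.0/24 (netmask 255.255.255.0).
VLAN = ipaddress.ip_network("192.168.1.0/24")


def classify(addr: str) -> str:
    """Say whether addr is a usable host address on VLAN (hypothetical helper)."""
    ip = ipaddress.ip_address(addr)
    if ip not in VLAN:
        return "outside the VLAN"
    if ip == VLAN.network_address:
        return "network address, not a host"
    if ip == VLAN.broadcast_address:
        return "broadcast address, not a host"
    return "usable host address"


# Addresses that appear in the traces and the nmap scan.
for addr in ("192.168.1.0", "192.168.1.1", "192.168.1.2"):
    print(addr, "->", classify(addr))
```

If the topology file for the relay points at 192.168.1.0, changing it to the block producer's actual address would be the first thing to try.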

TODO:

  • fix the connection issue
  • test for something like block production or node-to-node connection, not just ping
  • parametrize with respect to the shelley-genesis.yaml file. EDIT: done by setting extraNodeConfig.ShelleyGenesisFile
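
For the "not by ping" item, a TCP-level probe is one possible stand-in until a real block-production check exists. The sketch below is plain Python with a hypothetical helper name (`tcp_port_open`); it only checks that something accepts connections on a port, which is weaker than checking that two nodes have actually connected to each other.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Local demonstration against a listener on an ephemeral loopback port.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    demo_port = srv.getsockname()[1]
    print("listener reachable:", tcp_port_open("127.0.0.1", demo_port))
```

Inside the test script, the NixOS test driver's `wait_for_open_port` may cover the same ground for a port on the machine itself; probing the peer's VLAN address on port 3001 from the other machine would be closer to what the nodes do.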

@zmrocze zmrocze marked this pull request as ready for review January 28, 2024 11:27
@zmrocze
Contributor Author

zmrocze commented Jan 28, 2024

I've made the block-producer module.

At this point it hardly differs from what a relay node would be.

So, do we make 2 modules or 1 module with 2 modes?

I'd do 2 modules. Once we define different firewall settings for block-producer and relay, the modules become quite different. To me it's better to define them as 2 modules, so as not to create a false expectation of similarity.

The counterargument is that the 2 modules cannot run on the same machine (they clash on services.cardano-node), so that creates a false expectation as well (though the error log one would get explains it).

@zmrocze
Contributor Author

zmrocze commented Jan 28, 2024

I'm getting a format check failure here at e4e1796, even though nix fmt makes no format changes locally. Strange.

@zmrocze
Contributor Author

zmrocze commented Jan 28, 2024

I would prefer to write the relay-node module later, after adding the firewall, to see what changes first.

@zmrocze
Contributor Author

zmrocze commented Jan 28, 2024

Included issue #13.

Problem: the node doesn't start because a key in /nix/store has mode r-xr-xr-x.
…heck

The node checks key file permissions, so we can't use /nix/store keys directly in tests.
I thought: okay, let's use agenix as a hack, which also makes progress on a future task.
That didn't work, because the hardcoded SSH keys provided to a test in `environment.etc` are added at a later NixOS stage than agenix runs.
Can't get DNS resolution to work: how do you refer to nodes in a NixOS test?
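
The permission problem can be pictured with a small Python sketch. It assumes the node rejects key files that grant any access to group or others; the thread does not state the exact rule, so treat that as an assumption. The helper names and the file name `kes.skey` are hypothetical, and the copy-and-chmod approach is only a test-time workaround, since anything in /nix/store remains world-readable.

```python
import os
import shutil
import stat
import tempfile


def group_or_other_access(path: str) -> bool:
    """True if the file mode grants any permission to group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


def install_private_copy(src: str, dst_dir: str) -> str:
    """Copy src into dst_dir and restrict the copy to owner read-only (0400)."""
    dst = os.path.join(dst_dir, os.path.basename(src))
    shutil.copyfile(src, dst)  # copies contents only, not the mode bits
    os.chmod(dst, 0o400)
    return dst


# Demo with a stand-in file that mimics a store path's r-xr-xr-x mode.
with tempfile.TemporaryDirectory() as tmp:
    store_like = os.path.join(tmp, "kes.skey")  # hypothetical file name
    with open(store_like, "w") as f:
        f.write("not a real key\n")
    os.chmod(store_like, 0o555)

    runtime_dir = os.path.join(tmp, "run")
    os.mkdir(runtime_dir)
    private = install_private_copy(store_like, runtime_dir)

    print("store-like file too permissive:", group_or_other_access(store_like))
    print("private copy too permissive:", group_or_other_access(private))
```

In a NixOS test the equivalent step would have to run before the node service starts, for example as a pre-start copy into a runtime directory owned by the service user.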
@zmrocze
Contributor Author

zmrocze commented Feb 13, 2024

Debugging the node connection.

Tried:

>>> relay_node.execute("python -m http.server")
<< doesn't return, seems to be running >>

Don't know how to interpret this.

Running nmap shows that the ports are closed:
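
A likely reading of the hang: `python -m http.server` serves until it is killed, so a call that waits for the command to finish never returns. The sketch below shows the general pattern in plain Python: run the blocking server in the background, then probe it from the foreground. The handler class name is hypothetical, and the test-driver command in the comment is an untested assumption based on the NixOS manual's advice for detaching commands.

```python
import http.client
import http.server
import threading


class OkHandler(http.server.BaseHTTPRequestHandler):
    """Tiny handler that answers every GET with 200 (hypothetical name)."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet


# Ephemeral loopback port, so the demo does not clash with anything else.
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), OkHandler)
port = server.server_address[1]

# serve_forever() blocks, like `python -m http.server` in the foreground,
# so it runs in a daemon thread while the main thread carries on.
# Assumed test-driver equivalent (untested):
#   relay_node.execute("python -m http.server >&2 &")
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn.request("GET", "/")
status = conn.getresponse().status
conn.close()
print("HTTP status:", status)

server.shutdown()
server.server_close()
```

With the server detached, the other machine could then try to fetch from the relay's VLAN address, which would show whether plain TCP between the two VMs works independently of cardano-node.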

> relay_node.execute("nmap -sS -p3001 192.168.1.0/30")
Starting Nmap 7.94 ( https://nmap.org ) at 2024-02-13 11:20 UTC
Nmap scan report for block_producer (192.168.1.1)
Host is up (0.00079s latency).

PORT     STATE  SERVICE
3001/tcp closed nessus
MAC Address: 52:54:00:12:01:01 (QEMU virtual NIC)

Nmap scan report for relay_node (192.168.1.2)
Host is up (0.000054s latency).

PORT     STATE  SERVICE
3001/tcp closed nessus

Nmap done: 4 IP addresses (2 hosts up) scanned in 1.41 seconds

ifconfig:

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.2  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::5054:ff:fe12:102  prefixlen 64  scopeid 0x20<link>
        ether 52:54:00:12:01:02  txqueuelen 1000  (Ethernet)
        RX packets 7  bytes 565 (565.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 20  bytes 1578 (1.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 
[lo...]

@brainrake
Contributor

superseded by #21

@brainrake brainrake closed this Apr 27, 2024
@brainrake brainrake deleted the karol/node-module branch April 27, 2024 14:13
Successfully merging this pull request may close these issues.

Automatically update inputs using the flake-update Hercules CI effect