Commit
1 parent
05124e6
commit 093cc05
Showing 84 changed files with 8,132 additions and 6 deletions.
118 changes: 118 additions & 0 deletions
website/versioned_docs/version-v1.1.0/components/frontend.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,118 @@ | ||
# Frontend | ||
|
||
* [cmd](https://github.com/Nordix/Meridio/tree/master/cmd/frontend) | ||
* [Dockerfile](https://github.com/Nordix/Meridio/tree/master/build/frontend) | ||
|
||
## Description | ||
|
||
The frontend makes it possible to attract external traffic to Meridio via a secondary network. | ||
|
||
The external interface to be used for external connectivity must be provided to the frontend. | ||
One way to achieve this is to rely on NSM, which can install a VLAN-capable interface into the frontend POD through an NSC container. In that case the master interface residing in the host network namespace, the VLAN ID, and the IP network from which NSM allocates the address of the external interface must be configured for consumption by the Remote VLAN NSE. | ||
Alternatively, the external interface can be provided using Multus, in which case no NSC or Remote VLAN NSE is required and IP address allocation can be handled by a suitable IPAM CNI plugin (e.g. [whereabouts](https://github.com/k8snetworkplumbingwg/whereabouts)) configured in the Network Attachment Definition. | ||
|
||
|
||
When started, the frontend installs source routing rules for each configured VIP address, then configures and spins off a [BIRD](https://bird.network.cz/) routing program instance that provides external connectivity. BIRD is restricted to the external interface. The frontend uses [birdc](https://bird.network.cz/?get_doc&v=20&f=bird-4.html) both to monitor BIRD and to change its configuration. | ||
|
||
BGP with optional BFD supervision and a Static+BFD setup are currently supported. Since neither has an inherent neighbor discovery mechanism, the external gateway IP addresses must be configured. | ||
With BGP, a route for each VIP address is announced to the external peer with the frontend IP as next-hop, thus attracting external traffic to the frontend. In the other direction, the external BGP peer is expected to announce at least one route that the VIP source routing can use to steer egress traffic. The external BGP peer can decide to announce a default route or a set of network routes. | ||
|
||
Both ingress and egress traffic traverse a frontend POD (not necessarily the same). | ||
|
||
Currently the frontend is collocated with the load balancer, hence both reside in the same POD. A load balancer relies on the collocated frontend to forward egress traffic, and the frontend relies on the load balancer to handle ingress traffic. The frontend signals information about external connectivity to its local load balancer, and in turn learns from the collocated load balancer whether it is capable of forwarding incoming traffic towards application targets. With BGP, the latter controls when VIP addresses are advertised, in order to avoid attracting external traffic while ingress forwarding is not yet available. This also implies that VIP addresses are not advertised without application targets. | ||
|
||
To avoid leaking egress VIP traffic into the primary network, the frontend installs lower-priority source routing rules at startup that match and blackhole such traffic when there is no external connectivity. | ||
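
The rule layout described above can be sketched as follows. This is only an illustration under stated assumptions: the rules are rendered as iproute2-style `ip rule` commands, the priorities (100 and 101) are made up, and the helper `vipRules` is hypothetical. How the frontend actually programs its rules is not shown in this document. The idea is that the first rule sends VIP-sourced traffic to the table BIRD fills with routes; when that table holds no matching route, the kernel falls through to the next, lower-priority rule, which blackholes the packet.

```go
package main

import (
	"fmt"
	"net/netip"
)

// vipRules returns two illustrative iproute2 commands for one VIP:
// a lookup in the BIRD-managed table, followed by a blackhole rule
// with a numerically higher (that is, lower) priority.
// The priority values are assumptions made for this sketch.
func vipRules(vip string, tableID, prio int) ([]string, error) {
	p, err := netip.ParsePrefix(vip)
	if err != nil {
		return nil, fmt.Errorf("invalid VIP %q: %w", vip, err)
	}
	ip := "ip"
	if !p.Addr().Is4() {
		ip = "ip -6"
	}
	return []string{
		fmt.Sprintf("%s rule add from %s priority %d table %d", ip, p, prio, tableID),
		fmt.Sprintf("%s rule add from %s priority %d blackhole", ip, p, prio+1),
	}, nil
}

func main() {
	// Example VIPs; using table 4096 for IPv4 and 4097 for IPv6 is an assumption.
	vips := map[string]int{"20.0.0.1/32": 4096, "2000::1/128": 4097}
	for vip, table := range vips {
		rules, err := vipRules(vip, table, 100)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		for _, r := range rules {
			fmt.Println(r)
		}
	}
}
```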
|
||
#### External gateway router | ||
|
||
The external peer a frontend is intended to connect with must be configured separately as it is outside the scope of Meridio. | ||
|
||
Some generic pointers for setting up the external router side (focusing on BGP): | ||
The external peer must be part of the same (secondary) network and subnet as the external interface of the connected frontend. The NSM _exclude prefixes_ functionality can be used to prevent the IPAM in the Remote VLAN NSE from assigning IPs that have been allocated to external peers. (The IPAM assigns IPs from the start of the range, so in development environments it might be sufficient to pick IPs from the end of the range for external peers.) | ||
To avoid having to configure every IP the frontends might use to connect to an external BGP router, it is worth considering passive BGP peering on the router side. | ||
By default the Meridio side uses BGP AS 8103 and assumes AS 4248829953 on the gateway router side, while the default BGP port for both sides is 10179. | ||
|
||
## Configuration | ||
|
||
https://github.com/Nordix/Meridio/blob/master/cmd/front-end/internal/env/config.go | ||
|
||
Environment variable | Type | Description | Default | ||
--- | --- | --- | --- | ||
NFE_VRRPS | []string | VRRP IP addresses to be used as next-hops for static default routes | | ||
NFE_EXTERNAL_INTERFACE | string | External interface to start BIRD on | ext-vlan | ||
NFE_BIRD_CONFIG_PATH | string | Path to place bird config files | /etc/bird | ||
NFE_LOCAL_AS | string | Local BGP AS number | 8103 | ||
NFE_REMOTE_AS | string | Remote BGP AS number | 4248829953 | ||
NFE_BGP_LOCAL_PORT | string | Local BGP server port | 10179 | ||
NFE_BGP_REMOTE_PORT | string | Remote BGP server port | 10179 | ||
NFE_BGP_HOLD_TIME | string | Seconds to wait for a Keepalive message from peer before considering the connection stale | 3 | ||
NFE_TABLE_ID | int | Start ID of the two consecutive OS Kernel routing tables BIRD syncs the routes with | 4096 | ||
NFE_ECMP | bool | Enable ECMP towards next-hops of available gateways | false | ||
NFE_DROP_IF_NO_PEER | bool | Install default blackhole route with high metric into routing table TableID | true | ||
NFE_LOG_BIRD | bool | Add important bird log snippets to our log | false | ||
NFE_NAMESPACE | string | Namespace the pod is running on | default | ||
NFE_NSP_SERVICE | string | IP (or domain) and port of the NSP Service | nsp-service-trench-a:7778 | ||
NFE_TRENCH_NAME | string | Name of the Trench the frontend is associated with | default | ||
NFE_ATTRACTOR_NAME | string | Name of the Attractor the frontend is associated with | default | ||
NFE_LOG_LEVEL | string | Log level | DEBUG | ||
NFE_NSP_ENTRY_TIMEOUT | time.Duration | Timeout of entries registered in NSP | 30s | ||
NFE_GRPC_KEEPALIVE_TIME | time.Duration | gRPC keepalive timeout | 30s | ||
NFE_GRPC_MAX_BACKOFF | time.Duration | Upper bound on gRPC connection backoff delay | 5s | ||
NFE_DELAY_CONNECTIVITY | time.Duration | Delay between routing suite checks with connectivity | 1s | ||
NFE_DELAY_NO_CONNECTIVITY | time.Duration | Delay between routing suite checks without connectivity | 3s | ||
NFE_MAX_SESSION_ERRORS | int | Max session errors when checking routing suite until denounce | 5 | ||
NFE_METRICS_ENABLED | bool | Enable the metrics collection | false | ||
NFE_METRICS_PORT | int | Specify the port used to expose the metrics | 2224 | ||
NFE_LB_SOCKET | url.URL | LB socket to connect to | unix:///var/lib/meridio/lb.sock | ||
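
As a rough illustration of how a few of these variables and their defaults could be consumed, here is a minimal standard-library sketch. It is not the code in `config.go`: the `frontendConfig` struct, the `loadConfig` helper and the choice of fields are assumptions made for this example. Only the variable names and default values come from the table.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// frontendConfig holds a small, hypothetical subset of the table above.
type frontendConfig struct {
	ExternalInterface string
	LocalAS           string
	RemoteAS          string
	TableID           int
	ECMP              bool
	DelayConnectivity time.Duration
}

// lookupFunc matches the signature of os.LookupEnv so it can be swapped in tests.
type lookupFunc func(key string) (string, bool)

// loadConfig reads each variable, falling back to the documented default.
func loadConfig(lookup lookupFunc) (frontendConfig, error) {
	get := func(key, def string) string {
		if v, ok := lookup(key); ok {
			return v
		}
		return def
	}
	cfg := frontendConfig{
		ExternalInterface: get("NFE_EXTERNAL_INTERFACE", "ext-vlan"),
		LocalAS:           get("NFE_LOCAL_AS", "8103"),
		RemoteAS:          get("NFE_REMOTE_AS", "4248829953"),
	}
	var err error
	if cfg.TableID, err = strconv.Atoi(get("NFE_TABLE_ID", "4096")); err != nil {
		return cfg, fmt.Errorf("NFE_TABLE_ID: %w", err)
	}
	if cfg.ECMP, err = strconv.ParseBool(get("NFE_ECMP", "false")); err != nil {
		return cfg, fmt.Errorf("NFE_ECMP: %w", err)
	}
	if cfg.DelayConnectivity, err = time.ParseDuration(get("NFE_DELAY_CONNECTIVITY", "1s")); err != nil {
		return cfg, fmt.Errorf("NFE_DELAY_CONNECTIVITY: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid configuration:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function as a parameter keeps the parsing logic testable without touching the process environment.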
|
||
## Command Line | ||
|
||
Command | Action | Default | ||
--- | --- | --- | ||
--help | Display the help text | | ||
--version | Display the version | | ||
|
||
## Communication | ||
|
||
Here are all components the frontend is communicating with: | ||
|
||
Component | Secured | Method | Description | ||
--- | --- | --- | --- | ||
Spire | TBD | Unix Socket | Obtain and validate SVIDs | ||
NSP Service | yes (mTLS) | TCP | Watch configuration. Register/Unregister target (Advertise its readiness to the NSP target registry) | ||
Gateways | / | / | Routing protocol | ||
Kubernetes API | TBD | TCP | Watch the secrets for BGP authentication | ||
LB | yes (mTLS) | Unix socket | Watch internal connectivity status of collocated stateless-lb | ||
|
||
An overview of the communications between all components is available [here](resources.md). | ||
|
||
## Health check | ||
|
||
The health check is provided by the [GRPC Health Checking Protocol](https://github.com/grpc/grpc/blob/master/doc/health-checking.md). The status returned can be `UNKNOWN`, `SERVING`, `NOT_SERVING` or `SERVICE_UNKNOWN`. | ||
|
||
Service | Description | ||
--- | --- | ||
Readiness | A unique service to be used by readiness probe to return status, can aggregate other lesser services | ||
|
||
Service | Probe | Description | ||
--- | --- | --- | ||
NSPCli | Readiness | Monitor status of the connection to the NSP service | ||
Egress | Readiness | Monitor the gateways connectivity | ||
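
The aggregation of sub-services into a single probe service can be sketched as below. The four status names follow the gRPC Health Checking Protocol; the aggregation rule (Readiness is `SERVING` only when every sub-service is `SERVING`) and the `aggregate` helper are assumptions made for illustration, not necessarily what the frontend implements.

```go
package main

import "fmt"

// servingStatus mirrors the four statuses of the gRPC Health Checking Protocol.
type servingStatus int

const (
	statusUnknown servingStatus = iota
	statusServing
	statusNotServing
	statusServiceUnknown
)

func (s servingStatus) String() string {
	names := [...]string{"UNKNOWN", "SERVING", "NOT_SERVING", "SERVICE_UNKNOWN"}
	if s < 0 || int(s) >= len(names) {
		return "INVALID"
	}
	return names[s]
}

// aggregate reports SERVING only if every sub-service is SERVING.
// This all-or-nothing rule is an assumption of the sketch.
func aggregate(subs map[string]servingStatus) servingStatus {
	for _, s := range subs {
		if s != statusServing {
			return statusNotServing
		}
	}
	return statusServing
}

func main() {
	subs := map[string]servingStatus{
		"NSPCli": statusServing,
		"Egress": statusNotServing, // no gateway connectivity yet
	}
	fmt.Println("Readiness:", aggregate(subs))

	subs["Egress"] = statusServing
	fmt.Println("Readiness:", aggregate(subs))
}
```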
|
||
## Privileges | ||
|
||
To work properly, here are the privileges required by the frontend: | ||
|
||
Name | Description | ||
--- | --- | ||
Sysctl: net.ipv4.conf.all.forwarding=1 | Enable IP forwarding | ||
Sysctl: net.ipv6.conf.all.forwarding=1 | Enable IP forwarding | ||
Sysctl: net.ipv4.fib_multipath_hash_policy=1 | To use Layer 4 hash policy for ECMP on IPv4 | ||
Sysctl: net.ipv6.fib_multipath_hash_policy=1 | To use Layer 4 hash policy for ECMP on IPv6 | ||
Sysctl: net.ipv4.conf.all.rp_filter=0 | Allow packets to have a source IPv4 address which does not correspond to any routing destination address. | ||
Sysctl: net.ipv4.conf.default.rp_filter=0 | Allow packets to have a source IPv4 address which does not correspond to any routing destination address. | ||
Sysctl: net.ipv4.ip_local_port_range='49152 65535' | The source port of BFD Control packets must be in the IANA approved range 49152-65535 | ||
NET_ADMIN | The frontend creates IP rules to handle outbound traffic from VIP sources. BIRD interacts with kernel routing tables. | ||
NET_BIND_SERVICE | Allows BIRD to bind to privileged ports depending on the config (for example to BGP port 179). | ||
NET_RAW | Allows BIRD to use the SO_BINDTODEVICE socket option. | ||
Kubernetes API | fes-role - secrets - watch |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
# IPAM | ||
|
||
* [cmd](https://github.com/Nordix/Meridio/tree/master/cmd/ipam) | ||
* [Dockerfile](https://github.com/Nordix/Meridio/tree/master/build/ipam) | ||
|
||
## Description | ||
|
||
To avoid IP collisions in the system and ensure proper IP distribution, this service offers IPAM functionality that can be consumed through a Kubernetes ClusterIP service (over the Kubernetes primary network). The IPAM Service is a [GRPC](https://grpc.io/) server listening on port 7777. | ||
|
||
The specifications of the IPAM Service are written in a proto file accessible [here](https://github.com/Nordix/Meridio/blob/master/api/ipam/v1/ipam.proto). | ||
|
||
### IP/Prefix distribution granularity | ||
|
||
The Meridio IPAM distributes IP/Prefixes (always within the trench subnet defined in the configuration by `IPAM_PREFIX_IPV4` and `IPAM_PREFIX_IPV6`) at a few different levels. | ||
|
||
The first one is at the conduit level. Represented in blue (Conduit-A) and in red (Conduit-B) in the picture below, they are allocated automatically by the IPAM by watching the conduit list via the NSP service. The conduit subnet prefix lengths are defined in the configuration by `IPAM_CONDUIT_PREFIX_LENGTH_IPV4` and `IPAM_CONDUIT_PREFIX_LENGTH_IPV6`. | ||
|
||
The second one is at the node level. Represented in black in the picture below (1 per node per conduit), they are allocated when the `Allocate` API function is called (note: there is currently no way to release them unless the conduit is removed). The node subnet prefix lengths are defined in the configuration by `IPAM_NODE_PREFIX_LENGTH_IPV4` and `IPAM_NODE_PREFIX_LENGTH_IPV6`. | ||
|
||
The third (last one) is at the pod level. Each pod will get assigned a unique IP address with `IPAM_NODE_PREFIX_LENGTH_IPV4` or `IPAM_NODE_PREFIX_LENGTH_IPV6` as prefix length. | ||
|
||
![ipam](../resources/IPAM.svg) | ||
|
||
Picture representing a cluster with 2 nodes (worker-A and worker-B), 2 conduits (Conduit-A and Conduit-B), 4 targets and the corresponding subnets. | ||
* Target-1 is running on worker-A and connected to Conduit-A | ||
* Target-2 is running on worker-A and connected to Conduit-B | ||
* Target-3 is running on worker-B and connected to Conduit-A and Conduit-B | ||
* Target-4 is running on worker-B and connected to Conduit-B | ||
|
||
### Data persistence | ||
|
||
Running as a StatefulSet with a single replica, the IPAM handles restarts and pod deletions by saving its data in a local SQLite database stored in a persistent volume requested via `volumeClaimTemplates`. | ||
|
||
## Configuration | ||
|
||
https://github.com/Nordix/Meridio/blob/master/cmd/ipam/config.go | ||
|
||
Environment variable | Type | Description | Default | ||
--- | --- | --- | --- | ||
IPAM_PORT | int | Port the pod is running the service | 7777 | ||
IPAM_DATA_SOURCE | string | Path and file name of the sqlite database | /run/ipam/data/registry.db | ||
IPAM_TRENCH_NAME | string | Trench the pod is running on | | ||
IPAM_NSP_SERVICE | string | IP (or domain) and port of the NSP Service | | ||
IPAM_PREFIX_IPV4 | string | IPv4 prefix from which the proxy prefixes will be allocated | 169.255.0.0/16 | ||
IPAM_CONDUIT_PREFIX_LENGTH_IPV4 | int | Conduit prefix length which will be allocated | 20 | ||
IPAM_NODE_PREFIX_LENGTH_IPV4 | int | node prefix length which will be allocated | 24 | ||
IPAM_PREFIX_IPV6 | string | IPv6 prefix from which the proxy prefixes will be allocated | fd00::/48 | ||
IPAM_CONDUIT_PREFIX_LENGTH_IPV6 | int | Conduit prefix length which will be allocated | 56 | ||
IPAM_NODE_PREFIX_LENGTH_IPV6 | int | node prefix length which will be allocated | 64 | ||
IPAM_IP_FAMILY | string | IP family (ipv4, ipv6, dualstack) | dualstack | ||
IPAM_LOG_LEVEL | string | Log level (TRACE, DEBUG, INFO, WARNING, ERROR, FATAL, PANIC) | DEBUG | ||
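
Because conduit subnets are carved out of the trench prefix and node subnets out of conduit subnets, the three lengths have to nest. The sketch below checks that relationship for a given configuration. The `checkHierarchy` helper and the exact rule it enforces are assumptions inferred from the hierarchy described in this document, not validation logic taken from the IPAM source.

```go
package main

import (
	"fmt"
	"net/netip"
)

// checkHierarchy verifies trench length <= conduit length <= node length <= address size.
func checkHierarchy(trench string, conduitBits, nodeBits int) error {
	p, err := netip.ParsePrefix(trench)
	if err != nil {
		return fmt.Errorf("trench prefix: %w", err)
	}
	max := p.Addr().BitLen()
	if p.Bits() > conduitBits || conduitBits > nodeBits || nodeBits > max {
		return fmt.Errorf("want %d <= conduit (%d) <= node (%d) <= %d",
			p.Bits(), conduitBits, nodeBits, max)
	}
	return nil
}

func main() {
	// Defaults from the table above.
	fmt.Println("ipv4:", checkHierarchy("169.255.0.0/16", 20, 24))
	fmt.Println("ipv6:", checkHierarchy("fd00::/48", 56, 64))
	// A conduit length shorter than the trench prefix cannot be carved out of it.
	fmt.Println("bad: ", checkHierarchy("169.255.0.0/16", 12, 24))
}
```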
|
||
## Command Line | ||
|
||
Command | Action | Default | ||
--- | --- | --- | ||
--help | Display the help text | | ||
--version | Display the version | | ||
|
||
## Communication | ||
|
||
Here are all components the IPAM is communicating with: | ||
|
||
Component | Secured | Method | Description | ||
--- | --- | --- | --- | ||
Spire | TBD | Unix Socket | Obtain and validate SVIDs | ||
NSP Service | yes (mTLS) | TCP | Watch configuration | ||
|
||
An overview of the communications between all components is available [here](resources.md). | ||
|
||
## Health check | ||
|
||
The health check is provided by the [GRPC Health Checking Protocol](https://github.com/grpc/grpc/blob/master/doc/health-checking.md). The status returned can be `UNKNOWN`, `SERVING`, `NOT_SERVING` or `SERVICE_UNKNOWN`. | ||
|
||
Service | Description | ||
--- | --- | ||
Liveness | A unique service to be used by liveness probe to return status, can aggregate other lesser services | ||
Readiness | A unique service to be used by readiness probe to return status, can aggregate other lesser services | ||
|
||
Service | Probe | Description | ||
--- | --- | --- | ||
NSPCli | Readiness | Monitor status of the connection to the NSP service | ||
IPAM | Liveness | Monitor status of the server | ||
|
||
## Privileges | ||
|
||
No privileges required. |