3 Commits

Author SHA1 Message Date
Donavan Fritz e9d3eef2cc netpol: accept established+related at top of every pod chain
Build flock Image / build (push) Has been cancelled
K8s NetworkPolicy applies to the start of new connections; reply
packets for established flows (and related ICMP errors) must not be
matched against the explicit allow set. The pod ingress chain previously
had only explicit dport allows plus a final drop, so a reply to a
pod-initiated outbound connection was dropped whenever its dport (the
pod's ephemeral source port) wasn't in the allow set.

Hit in production 2026-04-26: garage's `garage-admin-restrict`
NetworkPolicy allowed only dports 3900/80/3901/3903. Garage uses
kubernetes_discovery to find peers: outbound requests to kube-apiserver
succeeded, but the replies (addressed to ephemeral source ports) were
dropped, causing "Layout not ready" cluster-wide.

Fix: emit `ct state established,related accept` as the first rule in
every pod_<hash>_(ingress|egress) chain. Regression test added.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 22:22:39 -05:00
Donavan Fritz 5d9b6bfeec netpol: anchor base-chain jump on veth only, not pod IP
Build flock Image / build (push) Has been cancelled
The previous base-chain jump matched iifname/oifname AND saddr/daddr ==
pod eth0 IP. Anycast traffic has the anycast IP as daddr, not the pod's
eth0 unicast address, so anycast packets skipped the policy chain
entirely and fell through to the forward chain's policy=accept.

The veth uniquely belongs to one pod. Anything traversing it is to or
from that pod by definition (anycast, unicast, future overlay routes).
Match on iifname/oifname alone; let the pod-side chain's accept lines +
trailing drop be the policy.
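
A rough sketch of the veth-only jump, under stated assumptions: the
helper name `baseJumps`, the veth name, and the hash are hypothetical,
and the direction mapping (oifname for pod ingress, iifname for pod
egress) assumes the rules match on the host-side end of the veth.

```go
package main

import "fmt"

// baseJumps returns the two base-chain rules for one pod. It matches
// only on the interface name, with no saddr/daddr condition, so any
// destination IP routed over the veth (unicast or anycast) is policed.
func baseJumps(veth, hash string) []string {
	return []string{
		// Leaving the host through the veth: traffic towards the pod.
		fmt.Sprintf("oifname %q jump pod_%s_ingress", veth, hash),
		// Entering the host from the veth: traffic from the pod.
		fmt.Sprintf("iifname %q jump pod_%s_egress", veth, hash),
	}
}

func main() {
	for _, rule := range baseJumps("veth1a2b3c", "ab12cd") {
		fmt.Println(rule)
	}
}
```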

Validated end-to-end on host001: anycast nginx pod with default-deny
ingress NetPol now correctly drops traffic from any peer; adding an
allow-from-podSelector rule unblocks only the matched peer.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 09:32:08 -05:00
Donavan Fritz 39ede9130b netpol: NetworkPolicy v1 enforcement via nftables
Build flock Image / build (push) Has been cancelled
New pkg/agent/netpol implementing standard networking.k8s.io/v1
NetworkPolicy. Pipeline:

  pods + policies + namespaces  →  Translate  →  Render  →  Apply

Supports ingress + egress, all three peer types (podSelector,
namespaceSelector, ipBlock with except), numeric ports + port ranges,
and default-deny semantics derived from PolicyTypes (when PolicyTypes
is unset, Ingress always applies and Egress is inferred from a
non-empty Spec.Egress).
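
The PolicyTypes defaulting can be sketched as below. This is an
illustration that follows the upstream NetworkPolicy defaulting rule,
not the package's code; `effectiveTypes` and the `direction` type are
hypothetical stand-ins that avoid importing the Kubernetes API types.

```go
package main

import "fmt"

type direction string

const (
	ingress direction = "Ingress"
	egress  direction = "Egress"
)

// effectiveTypes reports which directions a policy isolates.
// With no explicit policyTypes, Ingress always applies and Egress
// applies only when the policy carries at least one egress rule.
func effectiveTypes(policyTypes []direction, egressRules int) (in, out bool) {
	if len(policyTypes) == 0 {
		return true, egressRules > 0
	}
	for _, t := range policyTypes {
		switch t {
		case ingress:
			in = true
		case egress:
			out = true
		}
	}
	return in, out
}

func main() {
	in, out := effectiveTypes(nil, 0)
	fmt.Println(in, out) // unset types, no egress rules
	in, out = effectiveTypes([]direction{egress}, 0)
	fmt.Println(in, out) // explicit Egress with zero rules: default-deny egress
}
```

An explicit Egress type with no egress rules is what produces a
default-deny egress policy; the inferred path can never produce one.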

Apply path shells out to `nft -f -`: the whole script is loaded as a
single atomic transaction, so the kernel rolls everything back if any
part fails. Applies are idempotent: a script identical to the
last-applied one is skipped. Reconcile triggers: informer events, a 30s
self-heal tick, and every CNI ADD/DEL.

Verified against the three live cluster NetPols (calico-apiserver,
remote-proxies/lodge-home-assistant, storage/garage-admin-restrict).
Fuzz target stitches Translate + Render with random selector and peer
inputs; 21 unit tests cover the policy semantics.

Named ports are skipped with a warning; support is deferred until
kubelet exposes them in a form that doesn't require shadowing pod state.
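
One way the named-port skip could look, as a sketch only. `portRef` is
a simplified, hypothetical stand-in for the API's int-or-string port
value, and `numericOnly` and the warning text are illustrative.

```go
package main

import "fmt"

// portRef holds either a numeric port or a port name, never both.
type portRef struct {
	Number int    // set for numeric ports
	Name   string // set for named ports
}

// numericOnly keeps numeric ports and reports each named port
// through warn instead of trying to resolve it.
func numericOnly(ports []portRef, warn func(msg string)) []int {
	var out []int
	for _, p := range ports {
		if p.Name != "" {
			warn(fmt.Sprintf("skipping named port %q: not supported", p.Name))
			continue
		}
		out = append(out, p.Number)
	}
	return out
}

func main() {
	ports := []portRef{{Number: 3900}, {Name: "admin"}, {Number: 80}}
	kept := numericOnly(ports, func(m string) { fmt.Println("WARN:", m) })
	fmt.Println(kept)
}
```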

Dockerfile: + nftables.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 09:25:58 -05:00