netpol: accept established+related at top of every pod chain

K8s NetworkPolicy applies to the start of new connections; reply
packets for established flows (and related traffic such as ICMP
errors) must not be matched against the explicit allow set. The pod
ingress chain previously had only explicit dport allows plus a final
drop, so a reply to a pod-initiated outbound connection was dropped
whenever its dport (the pod's ephemeral source port) was not in the
allow set.

Hit in production 2026-04-26: garage's `garage-admin-restrict` NP
allowed dports 3900/80/3901/3903 only. Garage uses kubernetes_discovery
to find peers: outbound requests to kube-apiserver succeeded, but the
replies, addressed to ephemeral source ports, were dropped, leaving
"Layout not ready" cluster-wide.
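
The failure mode can be illustrated with a small model of the chain's
evaluation order. This is only a sketch: the `packet` type, the
`verdict` function, and the ephemeral port 45678 are hypothetical and
do not come from the repository; only the allowed dports are taken
from the incident above.

```go
package main

import "fmt"

// packet is a hypothetical, simplified view of what a pod chain sees.
type packet struct {
	dport       int
	established bool // conntrack already tracks this flow
}

// verdict models the chain: an optional stateful accept first, then the
// explicit dport allows, then the final drop.
func verdict(p packet, allowed []int, stateful bool) string {
	if stateful && p.established {
		return "accept"
	}
	for _, d := range allowed {
		if p.dport == d {
			return "accept"
		}
	}
	return "drop"
}

func main() {
	allowed := []int{3900, 80, 3901, 3903}
	// Reply from kube-apiserver to an assumed ephemeral source port.
	reply := packet{dport: 45678, established: true}
	fmt.Println("without stateful accept:", verdict(reply, allowed, false))
	fmt.Println("with stateful accept:   ", verdict(reply, allowed, true))
}
```

New connections are unaffected in this model: a packet that is not
part of a tracked flow still has to match an explicit dport allow.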

Fix: emit `ct state established,related accept` as the first rule in
every pod_<hash>_(ingress|egress) chain. Regression test added.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Donavan Fritz
2026-04-25 22:22:39 -05:00
parent 8dd109866e
commit e9d3eef2cc
2 changed files with 13 additions and 0 deletions
@@ -161,6 +161,12 @@ func chainName(podKey string, dir Direction) string {
// the chain's drop policy IS the default-deny.
func writeChain(sb *strings.Builder, c chain) {
	fmt.Fprintf(sb, "\tchain %s {\n", c.name)
	// Stateful accept for return traffic. NetworkPolicy applies to the
	// start of a new connection; reply packets for pod-initiated flows
	// (egress) and follow-up packets of an established ingress flow must
	// pass regardless of the explicit allow set, otherwise the chain's
	// final drop kills ephemeral-port replies (e.g. pod → kube-apiserver).
	sb.WriteString("\t\tct state established,related accept\n")
	for _, r := range c.rules {
		writeAllowRule(sb, r)
	}