2 Commits

Author SHA1 Message Date
Donavan Fritz 9b777ca7d1 bird: per-peer import filter rejects connected subnet
Build flock Image / build (push) Successful in 2m17s
Without a filter, crt001's `network 2602:817:3000:A25::/64` gets
re-advertised to every peer on that subnet. bird installs the BGP /64
with metric 32, beating the kernel-connected route at 256, and all
inter-host VLAN-25 traffic hairpins through the gateway — losing PMTU
9000 and ~30x throughput. Broke Plex 2026-05-04: NFS to nas002 capped
at 7 MB/s, jumbo blackholed.

Add LocalSubnetV6/V4 (CIDR) to NodeBGP. Agent populates by masking the
peer's address to /64 (v6) or /24 (v4) — same fritzlab convention
already in localAddrSameSubnet. Render emits `import where net !=
<subnet>;` per BGP channel when set, falls back to `import all;`
otherwise so existing tests stay green.

Defence in depth: with the matching outbound route-map on crt001
(ROUTE_MAP_CLUSTER_OUT_V{4,6}) the agent now refuses the leak on its
own if the router filter ever drifts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 21:03:59 -05:00
Donavan Fritz eb1f5e0d8d M2: netlink, IPAM/handler wiring, BIRD sidecar, CNI installer
Build flock Image / build (push) Has been cancelled
Code (Linux build, with no-op stubs for macOS dev):
- pkg/agent/netns_linux.go: ensureVeth → host-side configure (addrgenmode
  none, fe80::1/64, proxy_arp, forwarding) → move peer to pod ns →
  configure pod side (addr, default route via fe80::1, v4 169.254.1.1
  on-link gateway) → host /128 + /32 routes. Idempotent.
- pkg/agent/hostiface.go: deterministic host iface name flock<8hex> from
  FNV-1a-32(containerID).
- pkg/agent/annotations.go: parse flock.fritzlab.net/{ipv6,ipv4,cidr6,
  cidr4,ip-algo,anycast} with design-doc defaults; ParseCNIArgs for the
  K8S_POD_* keys kubelet sets.
- pkg/agent/podinfo.go: shared informer scoped to spec.nodeName==NODE,
  WaitForPod helper for ADD-vs-informer-sync race.
- pkg/agent/handlers.go: PodHandler does
    cache lookup → annotations → IPAM → store(pending) → SetupFunc →
    store(committed) → Result. Idempotent on retry. Del symmetric.
- pkg/routing/bird/config.go: text/template render with stable ordering;
  golden tests for host001 + anycast injection + sort stability.
- pkg/agent/bird.go: writes /etc/flock/bird/bird.conf, debounces 500ms,
  execs `birdc -s /run/flock/bird.ctl configure`. Installs blackhole
  kernel routes for the node summary CIDRs so BIRD's protocol kernel
  imports them.
- pkg/agent/runtime_linux.go: at startup, waits up to 60s for the per-
  node NodeConfig, reconciles committed allocations into IPAM.used,
  garbage-collects pending entries, builds PodHandler, swaps RPC
  handlers in.
- cmd/flock-installer: init-container binary that copies /opt/cni/bin/
  flock and writes 01-flock.conflist (lex-first so kubelet picks it
  over Calico's 10-calico.conflist on flock-labeled nodes).

Deploy:
- Dockerfile: alpine + iproute2 + bird2; multi-binary image.
- deploy/daemonset.yaml: install-cni init container; bird sidecar
  sharing /etc/flock/bird + /run/flock with the agent; ConfigMap-seeded
  bootstrap bird.conf so the sidecar boots before the agent renders.
  Privileged on flock-agent + install-cni; bird sidecar uses
  NET_ADMIN/RAW only.
- RBAC: pods + networkpolicies get/list/watch (the latter is reserved
  for M8 — harmless to grant now).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 22:33:48 -05:00