flock

Author	SHA1	Message	Date
Donavan Fritz	a17d33e182	agent: addresses annotation replaces IPAM allocation. When flock.fritzlab.net/addresses provides a v6 or v4 address, the IP becomes the pod's primary IP for that family: bound to eth0, default route off it, on-link host route via setHostRoute, and a per-pod /128 or /32 in BGP. IPAM no longer allocates a private IP alongside it. The pod ends up with exactly the operator-supplied addresses on eth0 (plus any extras beyond the first-of-family, which keep the pre-existing layered behavior). This is the fix the original addresses-annotation work missed: bug #1 allocated a private IP next to the public one (so VPN-routed clients could land on the private path on Plex). Promoting addresses-supplied IPs into the IPAM-style routing slot keeps the public IP as the only primary IP visible from outside. Three pieces: (1) annotations.go: reject pods whose addresses/anycast IP family is disabled (ipv6/ipv4 annotation or NodeConfig default). Both annotation types rely on the family being enabled for return-path routing. (2) handlers.go: peel first v6 + first v4 from Addresses into res.IP6/IP4; suppress IPAM for those families; skip IPAM call entirely if both families are addresses-supplied. (3) anycast_linux.go: extend renderBird to advertise any IPAM IP that's outside the node's BGP aggregate as a per-pod /32 or /128. This is what makes 142.202.202.166 reachable when host004's pod CIDR is 172.25.214.0/24, since the addresses-promoted IP isn't covered by the aggregate. Tests: 7 new annotation tests covering the conflict cases (ipv4=false + addresses-v4, NodeConfig default + addresses-v4, etc.) plus 5 unit tests for the splitAddressesPrimary helper. README updated with the addresses-replaces-IPAM behavior, the addresses-vs-anycast comparison, the conflict rule, and a Plex-style example. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 09:46:48 -05:00
Donavan Fritz	c61b12204c	anycast: drop pods from nexthop set on DeletionTimestamp. Previously the AnycastReconciler kept a pod in the nexthop set as long as its PodReady condition was True. During a rolling restart that produces a window after kubelet has accepted SIGTERM (DeletionTimestamp set, pod still Ready until probes observe shutdown) where BGP still advertises a path through the dying pod's veth, so in-flight requests get RST'd when the container actually exits. Fix: introduce podAnycastEligible(pod) = !DeletionTimestamp && Ready, swap it in at the AnycastReconciler's isReady callback, and fire the ready-change callback when DeletionTimestamp transitions (the informer UpdateFunc previously only fired on Ready transitions). Result: as soon as the apiserver marks a pod for deletion, the reconciler withdraws the local nexthop and BIRD reannounces the route without it. Sibling replicas absorb traffic before the pod's terminationGracePeriod elapses. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-04-25 22:24:50 -05:00
Donavan Fritz	a7dc7bf1f4	anycast: kernel multipath route + L4 hash for multi-pod-per-node. Move pure resolver logic out of anycast_linux.go into anycast.go so it's unit-testable on any host. Reshape anycastTarget from a single {hostIface, via} into a sorted list of nexthops; multiple Ready pods on the same node binding the same anycast IP now contribute one nexthop each. installAnycastRoute uses RTA_MULTIPATH (via netlink.Route.MultiPath) when the target has more than one nexthop. Single-nexthop targets keep the simple via-route shape so 1-pod-per-node keeps rendering identically to today's production form in `ip route show`. flock-agent writes net.ipv{4,6}.fib_multipath_hash_policy = 1 at startup so the kernel hashes flows on (saddr, daddr, sport, dport, proto) rather than just IPs. Best-effort: the agent runs privileged in production, so the write works there; it falls back to L3 hash on environments where the write fails (only matters for the multi-pod-per-node case anyway). resolveAnycastTargets sorts nexthops by canonical(via) for stable comparison so a quiet reconcile pass doesn't churn the kernel route. 8 new unit tests cover: 1-pod, 2-pods-same-anycast (multi-nexthop), NotReady drop, no-Ready omits the IP, pending skipped, mixed v6+v4, family mismatch warns, determinism. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-04-25 09:57:32 -05:00
Donavan Fritz	2082df37e5	anycast: revert to lo + add via=pod-eth0 next-hop on host route. Reverts the eth0-placement hack from `e1e9544`. The design doc's lo placement is correct. Real fix: the host's anycast /128 (or /32) route now uses the pod's own eth0 unicast IP (same family) as the route's `via` next-hop. The kernel then does NDP/ARP for that eth0 IP, which IS configured on the pod's eth0, so the pod responds normally with no proxy_ndp / proxy_arp trickery on the anycast IP itself. Routes: `ip -6 route add <anycast>/128 via <pod-eth0-v6> dev flock<8hex>` and `ip -4 route add <anycast>/32 via <pod-eth0-v4> dev flock<8hex>`. Validation: an anycast IP whose family the pod doesn't have a unicast for is skipped with a warn (a v4 anycast on an IPv6-only pod cannot be ARP-resolved this way; require dual-stack). Bonus cleanup: ESRCH from RouteDel is treated as success (idempotent). Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-04-25 08:02:51 -05:00
Donavan Fritz	89a3502446	M6: anycast (pod lo + Ready-gated /128 or /32 + BIRD export). CNI ADD now adds anycast IPs to the pod's lo interface (NOT eth0; design doc rationale: avoid NDP/ARP DAD conflicts when N replicas share an IP). Allocation persists the anycast list. AnycastReconciler: (1) desired = { ip → flock<8hex> } from committed allocations × pod.Status.PodReady=True; (2) diff against advertised, install/remove host /128 (v6) or /32 (v4); (3) re-render bird.conf with the active set. Triggers: 2s tick, AfterCommit (per ADD/DEL), Pod informer Ready transitions (PodCache.OnReadyChange callback). The bird template already supported Anycast6/Anycast4 via the export filter; this turn finally drives those slices from runtime. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-04-25 07:36:47 -05:00

5 Commits