7 Commits

Author SHA1 Message Date
Donavan Fritz a17d33e182 agent: addresses annotation replaces IPAM allocation
Build flock Image / build (push) Successful in 5m27s
When flock.fritzlab.net/addresses provides a v6 or v4, the IP becomes
the pod's primary IP for that family — bound to eth0, default route off
it, on-link host route via setHostRoute, and a per-pod /128 or /32 in
BGP. IPAM no longer allocates a private IP alongside it. The pod ends up
with exactly the operator-supplied addresses on eth0 (plus any extras
beyond the first-of-family, which keep the pre-existing layered
behavior).

This is the fix the original addresses-annotation work missed: bug #1
allocated a private IP next to the public one (so VPN-routed clients
could land on the private path on Plex). Promoting addresses-supplied
IPs into the IPAM-style routing slot keeps the public IP as the only
primary IP visible from outside.

Three pieces:
- annotations.go: reject pods whose addresses/anycast IP family is
  disabled (ipv6/ipv4 annotation or NodeConfig default). Both annotation
  types rely on the family being enabled for return-path routing.
- handlers.go: peel first v6 + first v4 from Addresses into res.IP6/IP4;
  suppress IPAM for those families; skip IPAM call entirely if both
  families are addresses-supplied.
- anycast_linux.go: extend renderBird to advertise any IPAM IP that's
  outside the node's BGP aggregate as a per-pod /32 or /128. This is
  what makes 142.202.202.166 reachable when host004's pod CIDR is
  172.25.214.0/24 — the addresses-promoted IP isn't covered by the
  aggregate.

Tests: 7 new annotation tests covering the conflict cases (ipv4=false +
addresses-v4, NodeConfig default + addresses-v4, etc.) plus 5 unit tests
for the splitAddressesPrimary helper.

README updated with the addresses-replaces-IPAM behavior, the
addresses-vs-anycast comparison, the conflict rule, and a Plex-style
example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 09:46:48 -05:00
Donavan Fritz 2daa2a21f3 agent: add flock.fritzlab.net/addresses annotation (eth0 static IPs)
Build flock Image / build (push) Successful in 3m23s
Like anycast, addresses IPs are advertised via BGP (/128+/32) and get
host routes via the AnycastReconciler. The sole difference: they are
assigned to pod eth0 instead of lo, so workloads that inspect their
primary interface (e.g. Plex remote-access detection) see the public IP
directly.

- annotations.go: annAddresses const, Addresses []net.IP in ParsedAnnotations
- state.go: Addresses []string persisted in allocations.json
- anycast.go: resolveAnycastTargets processes Anycast+Addresses together
- netns_linux.go: configurePodSide assigns Addresses to eth0
- netns_stub.go: mirror Addresses field for non-Linux builds
- handlers.go: thread Addresses through ADD path

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 17:50:49 -05:00
Donavan Fritz 65b2fb5b17 ip-algo: rename pod field to app; image from pod spec
Build flock Image / build (push) Has been cancelled
The `pod` field hashed pod.Name, which differs per replica because of
the ReplicaSet pod-template-hash + 5-char random suffix. With
namespace,pod,image, all replicas of the same Deployment got distinct
hextets even though they were the same workload.

Replace `pod` with `app` — a stable workload identifier derived from
the controller chain:

  - Deployment → ReplicaSet → Pod: strip the pod-template-hash suffix
    from the RS name (`traefik-789df685f` → `traefik`).
  - StatefulSet/DaemonSet/Job → Pod: use controller name as-is.
  - Bare pod: pod name.

Image now comes from pod.Spec.Containers[0].Image (the spec'd
reference). 64-hex-char values are treated as sha256 digests and
parsed as before; everything else (image:tag, short SHA) is FNV-1a-64'd
as a string. This makes `traefik:v3.5` deterministic across replicas
without needing the runtime-resolved digest.

Net effect: namespace,app,image yields identical hextets across all
replicas of the same Deployment except the trailing random N nibble.

embed.Values.Pod → App; AllocRequest.Pod kept for log context only,
new App and Image fields drive the embed call. handlers.go computes
both via deriveAppName + podImageRef helpers.

Tests: 7 new TestDeriveAppName_* cases (Deploy/STS/DS/bare/RS-without-
hash/non-controller-owner) + TestPodImageRef. Existing fuzz seeds
updated for the new keyword.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 11:42:06 -05:00
Donavan Fritz c860e9351b ip-algo: pod annotation > NodeConfig annotation > random
Build flock Image / build (push) Has been cancelled
Add flock.fritzlab.net/ip-algo as a node-wide default via NodeConfig
metadata.annotations. Pod-level annotation still wins. Empty, missing,
or invalid input at either level falls through to the next; invalid
values warn-log via the agent's slog. Both unset → fully random IID
(unchanged baseline).

ParseAnnotations no longer touches ip-algo; ResolveIPAlgo handles the
full precedence chain, called from PodHandler.Add with the cached
NodeConfig's annotations and the agent logger.

Tests: 9 new TestResolveIPAlgo_* cases covering pod-wins, all
fall-through paths, both-absent, nil node map, whitespace, and
duplicate-as-invalid. Fuzz target rebuilt without ip-algo input space
(now exercised by ResolveIPAlgo unit tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 11:09:09 -05:00
Donavan Fritz a6202a36bd defaults: built-in baseline is dual-stack (IPv6 + IPv4), not IPv6-only
Build flock Image / build (push) Has been cancelled
BuiltinFamilyDefaults() now returns {WantV6: true, WantV4: true}. Pods
that want a single family explicitly opt out via the
flock.fritzlab.net/ipv4 (or ipv6) annotation, or the operator narrows
the default at the node level via NodeConfig.Spec.Defaults.

Annotation precedence is unchanged: pod annotation > NodeConfig defaults
> built-in baseline. Tests updated to reflect the new baseline; the
"opt out of v4" path now has explicit coverage.

Docs updated:
  - NodeConfig.Spec.Defaults Go doc + CRD descriptions reflect the new
    baseline and its overrides
  - README opening framing softened from "IPv6-first" to "dual-stack,
    IPv6-friendly"; example pods + spec.defaults table flipped to
    treat dual-stack as the default and v6/v4-only as overrides
  - README NetworkPolicy line in the comparison table flipped to
    "yes (nftables)" since v1 enforcement shipped
  - Limitations note about IPv4-only destinations rewritten — every
    pod has v4 by default now, so the question is whether your IPv4
    pool is routable beyond your network

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 10:07:48 -05:00
Donavan Fritz 71e584cf96 NodeConfig defaults + code-quality pass + fuzz tests + README
NodeConfig.Spec.Defaults adds per-node IPv6/IPv4 family defaults that pod
annotations can override; built-in baseline (v6=true, v4=false) still
applies when the field is omitted.

bird.Render now validates every operator-supplied value (peer addresses,
CIDRs, anycast IPs, source addresses) before templating — fuzz found a
peer address containing `}` produced unbalanced braces in bird.conf.
Failing input preserved as a regression seed.

Fuzz targets added for ParseAnnotations, ParseCNIArgs, HostIfaceName,
canonical, IPAM allocate sequences, embed.Embed, and bird.Render.
Hardened canonical/ipToU32 against nil and non-IPv4 inputs.

README rewritten for outside readers — quickstart, NodeConfig + annotation
reference with worked examples, anycast use cases, comparison vs Calico
and Cilium, requirements, limitations.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 09:25:45 -05:00
Donavan Fritz eb1f5e0d8d M2: netlink, IPAM/handler wiring, BIRD sidecar, CNI installer
Build flock Image / build (push) Has been cancelled
Code (Linux build, with no-op stubs for macOS dev):
- pkg/agent/netns_linux.go: ensureVeth → host-side configure (addrgenmode
  none, fe80::1/64, proxy_arp, forwarding) → move peer to pod ns →
  configure pod side (addr, default route via fe80::1, v4 169.254.1.1
  on-link gateway) → host /128 + /32 routes. Idempotent.
- pkg/agent/hostiface.go: deterministic host iface name flock<8hex> from
  FNV-1a-32(containerID).
- pkg/agent/annotations.go: parse flock.fritzlab.net/{ipv6,ipv4,cidr6,
  cidr4,ip-algo,anycast} with design-doc defaults; ParseCNIArgs for the
  K8S_POD_* keys kubelet sets.
- pkg/agent/podinfo.go: shared informer scoped to spec.nodeName==NODE,
  WaitForPod helper for ADD-vs-informer-sync race.
- pkg/agent/handlers.go: PodHandler does
    cache lookup → annotations → IPAM → store(pending) → SetupFunc →
    store(committed) → Result. Idempotent on retry. Del symmetric.
- pkg/routing/bird/config.go: text/template render with stable ordering;
  golden tests for host001 + anycast injection + sort stability.
- pkg/agent/bird.go: writes /etc/flock/bird/bird.conf, debounces 500ms,
  execs `birdc -s /run/flock/bird.ctl configure`. Installs blackhole
  kernel routes for the node summary CIDRs so BIRD's protocol kernel
  imports them.
- pkg/agent/runtime_linux.go: at startup, waits up to 60s for the per-
  node NodeConfig, reconciles committed allocations into IPAM.used,
  garbage-collects pending entries, builds PodHandler, swaps RPC
  handlers in.
- cmd/flock-installer: init-container binary that copies /opt/cni/bin/
  flock and writes 01-flock.conflist (lex-first so kubelet picks it
  over Calico's 10-calico.conflist on flock-labeled nodes).

Deploy:
- Dockerfile: alpine + iproute2 + bird2; multi-binary image.
- deploy/daemonset.yaml: install-cni init container; bird sidecar
  sharing /etc/flock/bird + /run/flock with the agent; ConfigMap-seeded
  bootstrap bird.conf so the sidecar boots before the agent renders.
  Privileged on flock-agent + install-cni; bird sidecar uses
  NET_ADMIN/RAW only.
- RBAC: pods + networkpolicies get/list/watch (the latter is reserved
  for M8 — harmless to grant now).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 22:33:48 -05:00