When Calico shuts down on a flock-labeled node, calico-node sets
NetworkUnavailable=True with reason CalicoIsDown. Nothing resets the
condition, so the node lifecycle controller (kube-controller-manager)
applies the node.kubernetes.io/network-unavailable:NoSchedule taint and
new pods can't land.
flock-agent now patches Status.Conditions every 60s with
NetworkUnavailable=False (reason=FlockReady). RBAC: nodes/status patch.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
BIRD2's kernel protocol does not learn routes that other processes
installed in the kernel by default; the import filter on the channel only
applies to routes BIRD has already learned.
Added `learn;` so the kernel-installed blackholes (from the agent's
SummaryRoutes) are picked up.
Also added explicit `protocol static static6/static4` with one
`route <cidr> blackhole;` per NodeConfig CIDR. This is belt-and-
suspenders: even if `learn` doesn't capture the kernel blackhole, BIRD
has the route directly and exports it via the BGP filter.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Calico fenced off via Tigera Installation CR (apps@2121892). flock-agent
now renders bird.conf with the per-node BGP peers; bird sidecar reloads
on changes (debounced 500ms). A re-render tick every 15s picks up
NodeConfig updates.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Calico's calico-node still runs on every node (Tigera-Operator-managed
via ArgoCD with selfHeal). Two BIRD instances with the same ASN can't
both peer with crt001 from the same source address. Use a manual static
route on crt001 for the flock /64 for the first cutover; switch to live
BGP after Calico is fenced off from flock-labeled nodes.
The bird sidecar stays running with the bootstrap config (kernel +
device only, no BGP), so flipping live BGP on later is a single-line
change in runtime_linux.go.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Locks the wire format between /opt/cni/bin/flock and flock-agent. ADD
returns a CNI Result, DEL returns success/error, CHECK returns
success/error. Connection-per-RPC, newline-delimited JSON.
- pkg/cni/rpc.go: shared Op + Request + Response + framed encode/decode.
- pkg/cni/rpc_client.go: net.Dial + EncodeRequest + DecodeResponse;
rpcSocket overridable for tests.
- pkg/cni/plugin.go: real implementations of CmdAdd/Del/Check that call
through, mapping agent errors to types.Error.
- pkg/agent/rpc.go: rpcServer with swappable AddHandler/DelHandler/
CheckHandler (defaults: not-implemented for ADD; idempotent-no-op for
DEL/CHECK so kubelet teardown of a never-ADDed pod doesn't fail).
- pkg/agent/server.go: replaces the M1 accept-and-close placeholder
with rpcServer.serve(ctx, listener); listener closes on ctx cancel.
Tests cover: Request/Response JSON roundtrip, end-to-end client →
unix-socket → fake server, agent error → CNI types.Error mapping.
ADD remains "not implemented" until the netlink + IPAM wire-up: the agent
returns an error, and kubelet would fail pod sandbox creation if a node
were configured to use this CNI. host001's CNI plane is still 100%
Calico, so this changes nothing observable on the cluster.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Core building block for M2 CNI ADD. Pure logic (no netlink), mutex-
serialized, seedable from committed state via MarkInUse. Hooks into
pkg/embed for ip-algo IID derivation.
- resolveEffective() implements the design-doc cidr6/cidr4 annotation
rules: equal→node, supernet→node, subnet→ann, disjoint→error.
First-match-wins across multiple annotation CIDRs.
- allocV6() picks a random IID within the effective CIDR; with ip-algo,
defers to embed.Embed. Retries up to 16 times on collision
(regenerating the IID or the N nibble).
- allocV4() linear scan skipping .0 (network), .1 (gateway), .<last>
(broadcast). Smallest supported block: /30 with 1 usable address.
- Deterministic fakeRand in tests covers: intersection matrix, random
IID, embed path, collision→retry, v4 skip-gateway, v4 exhaustion,
dual-stack, release-then-reallocate, family mismatch rejection.
No agent Run-loop integration yet — NewIPAM(nc.Spec.CIDR6, nc.Spec.CIDR4)
will be called from Server.Run once netlink + RPC are in place.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Agent now watches nodeconfigs.flock.fritzlab.net via a client-go dynamic
informer, filters events to its own node name, and caches the typed
NodeConfig in memory (NodeConfigCache, atomic pointer). M2's IPAM will
read from that cache.
- pkg/agent/nodeconfig.go: informer + JSON-round-trip decode (avoids
hand-written DeepCopy + scheme registration for a use this small).
- pkg/agent/server.go: starts the informer goroutine; Run terminates if
the informer returns.
- pkg/api/v1alpha1: switch placeholder TypeMeta/ObjectMeta to metav1.
- deploy/rbac: get/list/watch on nodeconfigs.
- cmd/flock-agent: --kubeconfig flag for out-of-cluster runs (tests).
Satisfies M1 verified-by: "kubectl apply NodeConfig; agent logs read it".
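A sketch of the JSON-round-trip decode and the atomic-pointer cache
described above, using only the standard library. The NodeConfig struct
is a trimmed stand-in, its JSON tags are assumptions, and decodeNodeConfig
is a hypothetical helper; the map input plays the role of the unstructured
object a dynamic informer hands to its event handlers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync/atomic"
)

// NodeConfig is a trimmed stand-in for the v1alpha1 type (field tags assumed).
type NodeConfig struct {
	Metadata struct {
		Name string `json:"name"`
	} `json:"metadata"`
	Spec struct {
		CIDR6 string `json:"cidr6"`
		CIDR4 string `json:"cidr4"`
	} `json:"spec"`
}

// NodeConfigCache holds the latest NodeConfig; readers never block writers.
type NodeConfigCache struct {
	p atomic.Pointer[NodeConfig]
}

func (c *NodeConfigCache) Set(nc *NodeConfig) { c.p.Store(nc) }

// Get returns nil until the first NodeConfig has been observed.
func (c *NodeConfigCache) Get() *NodeConfig { return c.p.Load() }

// decodeNodeConfig converts unstructured content to the typed struct by
// marshalling to JSON and unmarshalling again.
func decodeNodeConfig(obj map[string]any) (*NodeConfig, error) {
	raw, err := json.Marshal(obj)
	if err != nil {
		return nil, err
	}
	var nc NodeConfig
	if err := json.Unmarshal(raw, &nc); err != nil {
		return nil, err
	}
	return &nc, nil
}

func main() {
	obj := map[string]any{
		"metadata": map[string]any{"name": "host001"},
		"spec":     map[string]any{"cidr6": "2001:db8:1::/64", "cidr4": "10.0.0.0/24"},
	}
	nc, err := decodeNodeConfig(obj)
	if err != nil {
		panic(err)
	}
	var cache NodeConfigCache
	cache.Set(nc)
	fmt.Println(cache.Get().Metadata.Name, cache.Get().Spec.CIDR6)
}
```

The trade-off is an extra encode/decode per event, which is negligible for
a single small object per node.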
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
The runner runs jobs via act + DinD; with `docker run -v "$PWD:/src"` from
inside the job container, the docker daemon resolves $PWD on its own
filesystem, not the job container's, so the mount appears empty and
`go test ./...` fails with "directory prefix . does not contain main
module".
Run tests in the same container that builds: same workspace, no mount.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>