# flock

A small, opinionated Kubernetes CNI built around three ideas:

1. **Dual-stack, IPv6-friendly.** Every pod gets a globally routable IPv6
   address by default. IPv4 is also enabled by default; either family can
   be turned off per-node or per-pod when you really mean to.
2. **No tunnels, no NAT.** Pod addresses are the real addresses on the wire.
   Each node speaks BGP to its upstream router and advertises its own
   per-node prefix. The pod network is just the LAN, plus host routes.
3. **Anycast as a primitive.** A pod can request an anycast address via
   an annotation; flock binds it on the pod's loopback and advertises a
   `/128` (or `/32`) over BGP, but only while the pod is `Ready`. Multiple
   replicas advertise the same address from different nodes for ECMP load
   balancing without a separate Service or external LB.

flock is built for clusters where every node already speaks BGP to one
or more upstream routers. It deliberately leaves out features you'd
expect from a general-purpose CNI — overlays, IPsec/Wireguard, IPAM
coordination across nodes, kube-proxy integration — so the moving parts
that remain are easy to reason about.

> **Status:** alpha. CRD shape and annotation keys may still change.

## Table of contents

- [How it works](#how-it-works)
- [Requirements](#requirements)
- [Quickstart](#quickstart)
- [NodeConfig CRD](#nodeconfig-crd)
- [Pod annotations](#pod-annotations)
- [Use cases](#use-cases)
- [Comparison vs Calico / Cilium](#comparison-vs-calico--cilium)
- [Limitations and non-goals](#limitations-and-non-goals)
- [Building and testing](#building-and-testing)
- [License](#license)

## How it works

Each node runs a single `flock-agent` DaemonSet pod with three containers:

- a privileged init container (`flock-installer`) that drops the CNI
  plugin binary into `/opt/cni/bin/flock` and writes
  `/etc/cni/net.d/01-flock.conflist`,
- the agent itself, which owns IPAM, programs veth pairs, and tracks
  pod readiness, and
- a [BIRD2](https://bird.network.cz/) sidecar that the agent re-renders
  and reloads when the per-node config or the active anycast set changes.

Each node has a `NodeConfig` CR (cluster-scoped, name = node name) that
declares its IPv6 and IPv4 prefixes, its local BGP ASN, and its upstream
peers. The agent reads the CR via a dynamic informer.

When the container runtime invokes the CNI plugin with `ADD` (on kubelet's
behalf), the plugin opens a unix-socket RPC to the agent. The agent
allocates an address from the per-node CIDRs, creates a veth pair,
configures the pod side, persists the allocation to
`/var/lib/flock/allocations.json`, and returns the result. There is no
cluster-wide controller and no IPAM coordination across nodes: each node
owns a non-overlapping CIDR and allocates locally.

For anycast, the agent installs `<anycast-ip> via <pod-eth0-ip> dev <veth>`
host routes on the node and adds the anycast IP to BIRD's BGP export
filter. When a pod loses readiness, the agent withdraws the route from
both the kernel and BGP within one reconcile cycle (sub-second).

### Packet path

`pod.eth0` (a veth) ↔ host-side veth (with `addrgenmode none`,
`fe80::1/64`, proxy-ARP for the v4 default-via) ↔ host kernel ↔ uplink
NIC ↔ upstream router. No conntrack, no SNAT, no encapsulation.

For IPv6 the host side of every veth carries the deterministic link-local
gateway `fe80::1`, so every pod can use a fixed default route. For IPv4
the host side answers ARP for `169.254.1.1`, providing the same fixed
default route in v4.

## Requirements

- Linux nodes. flock has not been tested on, and does not target,
  Windows nodes.
- Kubernetes ≥ 1.27.
- An upstream router (or pair) that accepts a BGP session from each
  node. flock has been tested with Cisco IOS-XE, Arista EOS, and FRR
  acting as the upstream; anything that speaks standard eBGP should work.
- A globally routable (or at least datacentre-routable) IPv6 prefix
  delegated to the cluster and sliced into one /64 per node. IPv4 is
  optional but supported.
- Each node must have a unique local ASN. Private ASNs (`64512–65534`,
  `4200000000–4294967294`) are typical.
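
A pre-flight check for the ASN requirement could look like the sketch below. It is not part of flock; `isPrivateASN` and `duplicateASNs` are hypothetical helpers, and the ranges are the private-use ranges listed above (reserved by RFC 6996).

```go
package main

import (
	"fmt"
	"sort"
)

// isPrivateASN reports whether asn lies in a private-use range:
// 64512-65534 (16-bit) or 4200000000-4294967294 (32-bit).
func isPrivateASN(asn uint32) bool {
	return (asn >= 64512 && asn <= 65534) ||
		(asn >= 4200000000 && asn <= 4294967294)
}

// duplicateASNs returns, in ascending order, every ASN used by more than one node.
func duplicateASNs(nodeASN map[string]uint32) []uint32 {
	count := make(map[uint32]int)
	for _, asn := range nodeASN {
		count[asn]++
	}
	var dups []uint32
	for asn, n := range count {
		if n > 1 {
			dups = append(dups, asn)
		}
	}
	sort.Slice(dups, func(i, j int) bool { return dups[i] < dups[j] })
	return dups
}

func main() {
	nodes := map[string]uint32{"node-a": 65101, "node-b": 65102, "node-c": 65101}
	fmt.Println("private:", isPrivateASN(65101))
	fmt.Println("duplicates:", duplicateASNs(nodes))
}
```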

## Quickstart

```sh
# 1. Install CRD + RBAC + DaemonSet (single bundled manifest):
kubectl apply -f deploy/install.yaml

# 2. Label the node(s) you want flock to manage:
kubectl label node <node-name> flock.fritzlab.net/agent=

# 3. Apply a NodeConfig CR for that node (see "NodeConfig CRD" below):
kubectl apply -f my-nodeconfig.yaml

# 4. Verify the agent is up:
kubectl -n kube-system get pod -l app=flock-agent -o wide
kubectl -n kube-system exec -it ds/flock-agent -c bird -- \
    birdc -s /run/flock/bird.ctl show protocols
```

The DaemonSet is gated by the `flock.fritzlab.net/agent` node label, so
unlabelled nodes continue to use whatever CNI was installed before. This
lets you migrate node-by-node — start with one node, prove it works, then
proceed.

## NodeConfig CRD

A `NodeConfig` is the only operator-supplied input. One per node, name
matches the node name. Example:

```yaml
apiVersion: flock.fritzlab.net/v1alpha1
kind: NodeConfig
metadata:
  name: node-a
spec:
  cidr6:
    - 2001:db8:f001::/64       # Pods on this node get addresses from here.
  cidr4:
    - 192.0.2.0/24             # IPv4 pool; used unless IPv4 is turned off.
  defaults:
    ipv6: true                 # Optional. Built-in baseline if omitted.
    ipv4: true                 # Optional. Built-in baseline if omitted.
  bgp:
    asn: 65101                 # This node's local ASN.
    peers:
      - address: 2001:db8::1   # Upstream router (IPv6 session).
        asn: 65000
      - address: 192.0.2.1     # Same router, IPv4 session.
        asn: 65000
```

### `spec.defaults`

`spec.defaults` controls which address families a pod *gets by default*
on this node — i.e. when the pod has no explicit `flock.fritzlab.net/ipv6`
or `flock.fritzlab.net/ipv4` annotation. Pod annotations always override.
If you omit `spec.defaults` (or any individual field inside it) flock
falls back to its built-in baseline of **dual-stack (IPv6 on, IPv4 on)**.

| Goal                              | `spec.defaults`                        |
|-----------------------------------|----------------------------------------|
| Dual-stack (the default)          | omit, or `{ ipv6: true,  ipv4: true }` |
| IPv6-only node                    | `{ ipv6: true,  ipv4: false }`         |
| IPv4-only (legacy node)           | `{ ipv6: false, ipv4: true }`          |

A NodeConfig that resolves to "neither family" is rejected at allocation
time, so misconfiguring both to false will surface as an error on the
first `CNI ADD`.
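
The precedence rules above (pod annotation, then `spec.defaults`, then the built-in dual-stack baseline) can be sketched in a few lines of Go. The `defaults` struct and both function names are assumptions made for this example; only the behaviour is taken from the text.

```go
package main

import (
	"errors"
	"fmt"
)

// defaults mirrors spec.defaults; a nil field means "omitted".
type defaults struct {
	IPv6 *bool
	IPv4 *bool
}

// resolve picks the first value that is set: the pod annotation, then the
// node default, then the built-in baseline (enabled).
func resolve(podOverride, nodeDefault *bool) bool {
	if podOverride != nil {
		return *podOverride
	}
	if nodeDefault != nil {
		return *nodeDefault
	}
	return true
}

// resolveFamilies returns the families a pod should get, or an error when
// both end up disabled.
func resolveFamilies(d defaults, podV6, podV4 *bool) (wantV6, wantV4 bool, err error) {
	wantV6 = resolve(podV6, d.IPv6)
	wantV4 = resolve(podV4, d.IPv4)
	if !wantV6 && !wantV4 {
		return false, false, errors.New("neither IPv6 nor IPv4 is enabled for this pod")
	}
	return wantV6, wantV4, nil
}

func main() {
	off := false
	fmt.Println(resolveFamilies(defaults{}, nil, nil))                       // spec.defaults omitted
	fmt.Println(resolveFamilies(defaults{IPv4: &off}, nil, nil))             // IPv6-only node
	fmt.Println(resolveFamilies(defaults{IPv6: &off, IPv4: &off}, nil, nil)) // rejected
}
```

Using pointers rather than plain booleans is what lets an omitted field be told apart from an explicit `false`.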

### `spec.bgp`

Each `peer` becomes one BGP session. The agent picks a node-local source
address on the same subnet as the peer; if there isn't one, BIRD uses
its default. Multi-homing (multiple peers per family — or per upstream
router pair) is allowed.

## Pod annotations

All annotations live under `flock.fritzlab.net/`. Every annotation is
optional; leave them off to inherit the per-node defaults.

| Annotation                          | Type   | Purpose                                                                                       |
|-------------------------------------|--------|-----------------------------------------------------------------------------------------------|
| `flock.fritzlab.net/ipv6`           | bool   | Override `spec.defaults.ipv6` for this pod (`true`/`false`).                                  |
| `flock.fritzlab.net/ipv4`           | bool   | Override `spec.defaults.ipv4` for this pod (`true`/`false`).                                  |
| `flock.fritzlab.net/cidr6`          | CIDRs  | Restrict IPv6 allocation to a sub-range of the node's `cidr6`. Comma-separated.               |
| `flock.fritzlab.net/cidr4`          | CIDRs  | Restrict IPv4 allocation to a sub-range of the node's `cidr4`. Comma-separated.               |
| `flock.fritzlab.net/ip-algo`        | list   | Embed identity into the IPv6 IID. Subset of `namespace,pod,image`, in order, comma-separated. |
| `flock.fritzlab.net/anycast`        | IPs    | Bind these IPs on the pod's `lo`; advertise via BGP while pod is `Ready`. Mixed v6+v4 ok.     |
| `flock.fritzlab.net/addresses`      | IPs    | Bind these IPs on the pod's `eth0`. The first v6 and first v4 **replace** the IPAM allocation for that family, so the supplied address becomes the pod's primary IP. Mixed v6+v4 ok. Single-replica only in practice. |

Bool values must be the literal strings `"true"` or `"false"`
(case-insensitive, surrounding whitespace tolerated). Other values —
`1`, `0`, `yes`, `no` — are rejected so a typo can't silently flip
behaviour.
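
A strict parser with those rules is short enough to show in full. This is a minimal sketch, not flock's code; `parseBoolAnnotation` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// parseBoolAnnotation accepts only "true" or "false", ignoring case and
// surrounding whitespace. Everything else is an error.
func parseBoolAnnotation(key, raw string) (bool, error) {
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "true":
		return true, nil
	case "false":
		return false, nil
	default:
		return false, fmt.Errorf("%s: invalid value %q, want \"true\" or \"false\"", key, raw)
	}
}

func main() {
	for _, raw := range []string{"true", " False ", "1", "yes", ""} {
		v, err := parseBoolAnnotation("flock.fritzlab.net/ipv4", raw)
		fmt.Printf("%q -> %v, err=%v\n", raw, v, err)
	}
}
```

Go's `strconv.ParseBool` would also accept `1`, `0`, `t`, and `f`, which is why a sketch of this rule needs its own switch rather than the standard helper.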

### `addresses` vs `anycast`

Both annotations bind operator-supplied IPs onto a pod and have flock
advertise `/128` (or `/32`) per-pod over BGP. The differences are
where the IP lands and what it's for:

|                            | `anycast`                                          | `addresses`                                                       |
|----------------------------|----------------------------------------------------|-------------------------------------------------------------------|
| Bound on                   | pod `lo`                                           | pod `eth0`                                                        |
| Multi-replica?             | yes — every Ready replica advertises the same IP and the upstream router ECMPs across them | no — the same IP on multiple replicas is operator error           |
| Replaces IPAM?             | no — pod still has an IPAM-allocated unicast IP   | **yes** — the first v6 + first v4 in the list become the pod's primary IPs in place of an IPAM allocation |
| Workload visibility        | only the IPAM IP is on the primary interface     | the public IP is `eth0`'s primary address — workloads that read their own NIC see it (e.g. Plex's remote-access detection) |

Use `anycast` for shared services with many replicas (DNS, ingress).
Use `addresses` when one specific pod needs a known public IP that the
workload itself must see on its primary interface.

### Conflict detection

`addresses` and `anycast` reject pods that supply an IP whose family is
disabled. If the resolved `WantV4` is false (via the pod's `ipv4`
annotation or the NodeConfig default) and any addresses- or
anycast-supplied IP is IPv4, the CNI ADD fails with an explicit error.
Same for v6. Both annotation types put IPs on a pod interface and rely
on the family being enabled for return-path routing — silently accepting
the IP would leave a non-functional pod.
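
The family check can be sketched as a parse-and-validate pass over the annotation value. `checkFamilies` is a hypothetical name, and the error wording is illustrative rather than flock's actual message.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// checkFamilies parses a comma-separated IP list (the value of an `anycast`
// or `addresses` annotation) and rejects any IP whose family is disabled.
func checkFamilies(annotation, value string, wantV6, wantV4 bool) ([]netip.Addr, error) {
	var out []netip.Addr
	for _, field := range strings.Split(value, ",") {
		ip, err := netip.ParseAddr(strings.TrimSpace(field))
		if err != nil {
			return nil, fmt.Errorf("%s: %w", annotation, err)
		}
		ip = ip.Unmap() // Treat ::ffff:a.b.c.d as plain IPv4.
		if ip.Is4() && !wantV4 {
			return nil, fmt.Errorf("%s: %s is IPv4 but IPv4 is disabled for this pod", annotation, ip)
		}
		if ip.Is6() && !wantV6 {
			return nil, fmt.Errorf("%s: %s is IPv6 but IPv6 is disabled for this pod", annotation, ip)
		}
		out = append(out, ip)
	}
	return out, nil
}

func main() {
	value := "2001:db8:a::53, 192.0.2.53"
	fmt.Println(checkFamilies("anycast", value, true, true))
	fmt.Println(checkFamilies("anycast", value, true, false)) // IPv4 disabled: rejected
}
```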

### Outside-aggregate advertisement

When an `addresses` IP replaces IPAM (becomes the pod's primary IP) the
IP is typically **outside** the node's BGP aggregate (e.g. a public
`/32` on a node whose pod CIDR is private). flock notices this during
BGP rendering and advertises the IP individually as a per-pod `/32` or
`/128` so the upstream router has a route to it.

### Example pods

Default dual-stack — no annotations needed:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: minimal
```

IPv6 only — opt out of the default v4 allocation:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: v6-only
  annotations:
    flock.fritzlab.net/ipv4: "false"
```

Operator-friendly addressing — `fnv(namespace) | fnv(pod) | random`
packed into the host bits, so a pod's identity is recognisable from
its IP in `kubectl get pods -o wide`:

```yaml
metadata:
  annotations:
    flock.fritzlab.net/ip-algo: "namespace,pod"
```

Anycast service — three replicas, each advertising the same v6+v4
anycast pair from the node it lands on. The upstream router does ECMP
across the active set:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dns
spec:
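  replicas: 3
  selector:
    matchLabels:
      app: dns                 # Illustrative label; a Deployment requires a selector.
  template:
    metadata:
      labels:
        app: dns
      annotations: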
        flock.fritzlab.net/anycast: "2001:db8:a::53, 192.0.2.53"
    spec:
      containers:
        - name: coredns
          image: coredns/coredns
          readinessProbe:
            httpGet: { path: /ready, port: 8181 }
            periodSeconds: 1
            failureThreshold: 1
```

Workload with a known public IP — single-replica pod whose application
inspects its own primary interface (Plex's remote-access flow). The
addresses become the pod's primary IPs in place of any IPAM allocation;
the pod's `eth0` ends up with exactly the supplied addresses, and BGP
advertises them as a `/128` and `/32`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: plex
spec:
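  replicas: 1
  selector:
    matchLabels:
      app: plex                # Illustrative label; a Deployment requires a selector.
  template:
    metadata:
      labels:
        app: plex
      annotations: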
        flock.fritzlab.net/addresses: "2001:db8:c606::166, 192.0.2.166"
    spec:
      containers:
        - name: plex
          image: plexinc/pms-docker
```

## Use cases

**Highly-available DNS.** Run N CoreDNS replicas, each annotated with
the same `anycast` IP. Point client `/etc/resolv.conf` at the anycast
address. Each replica advertises a `/128` from its own node; the
upstream router does ECMP. Lose a pod, traffic fails over within a
probe cycle.

**Replacing a kube-proxy `ClusterIP`.** Headless Service plus an anycast
IP gives you a single stable address with load-balancing across pods,
without the DNAT-pinning that makes long-lived TCP keepalive connections
stick to one backend forever. ECMP routes each new flow independently.

**Per-pod public IPv6.** Because every pod has a globally routable IPv6
address and the cluster does no NAT, a pod's `eth0` IP is reachable from
the rest of the internet (subject to your firewall). Useful for things
like outgoing SMTP, where you want a stable from-address per pod, or for
peer-to-peer protocols that don't tolerate NAT.

**Fast pod identification in `kubectl`.** With
`flock.fritzlab.net/ip-algo: namespace,pod` the IPv6 host bits encode
the pod's namespace+name, so you can recognise a pod from its IP without
a lookup. Reverse-DNS via a wildcard zone makes those IPs human-readable
too.

**Static-IP migration.** Annotation-driven address allocation means you
can ask for a specific sub-CIDR (`cidr6: 2001:db8:f001::ab00/120`) for
services that previously needed pinned IPs (mail server, ingress
controller). When the static-IP requirement goes away, drop the
annotation and the pod gets a normal allocation.

## Comparison vs Calico / Cilium

|                          | flock                       | Calico                       | Cilium                       |
|--------------------------|-----------------------------|------------------------------|------------------------------|
| Default address family   | dual (IPv6+IPv4)            | IPv4                         | IPv4                         |
| BGP                      | yes (BIRD)                  | yes                          | optional                     |
| Overlay (VXLAN/IPIP)     | never                       | optional                     | yes (VXLAN/Geneve) or native |
| NAT in datapath          | never                       | masquerade by default        | masquerade by default        |
| Anycast pod addressing   | first-class                 | manual                       | optional, via service mesh   |
| eBPF datapath            | no                          | optional                     | yes                          |
| NetworkPolicy            | yes (nftables)              | yes (Felix)                  | yes (eBPF)                   |
| Cluster size target      | small (< 100 nodes)         | thousands                    | thousands                    |
| Operational surface area | low (1 DaemonSet, 1 CRD)    | medium                       | high                         |
| Production-ready         | alpha                       | yes                          | yes                          |

flock is not trying to compete with Calico or Cilium. The right answer
for most clusters is one of those two — flock exists for clusters where
every node already speaks BGP, the operator wants real (no NAT) IPv6
addressing on every pod, and per-pod anycast is something they actually
want to use rather than work around.

## Limitations and non-goals

- NetworkPolicy supports `networking.k8s.io/v1` (ingress + egress, all
  three peer types, numeric ports + port ranges). Named ports and
  AdminNetworkPolicy are not yet implemented.
- No NAT, no masquerade, no SNAT-egress. Pods reach the wider internet
  using their real cluster-routable addresses; if your IPv4 pool isn't
  routable beyond your network, those pods can't reach v4-only hosts on
  the public internet without help from your border router.
- No multi-cluster, no peering across clusters.
- Linux-only datapath.
- IPAM is per-node — there's no global allocator and no IP mobility.
  When a pod moves to a different node it gets a new address.
- The agent is privileged. It mounts `/var/run/netns`, configures veth
  pairs, manages kernel routes, and holds `CAP_NET_ADMIN`. This is
  inherent to being a CNI; reducing privilege further is not a goal.
- If BIRD dies but the agent stays up, pods on that node stop being
  reachable from off-node. The DaemonSet liveness probes catch this.

## Building and testing

```sh
# Unit tests + fuzz seed corpora (fast, ~1s):
go test ./...

# Targeted fuzz pass:
go test -run NEVERMATCH -fuzz=FuzzParseAnnotations -fuzztime=30s ./pkg/agent
go test -run NEVERMATCH -fuzz=FuzzRender           -fuzztime=30s ./pkg/routing/bird
go test -run NEVERMATCH -fuzz=FuzzEmbed            -fuzztime=30s ./pkg/embed
go test -run NEVERMATCH -fuzz=FuzzIPAM_Allocate    -fuzztime=30s ./pkg/agent

# Build the container image (used by the DaemonSet):
docker build -t flock:dev .
```

The fuzz tests are also run as plain unit tests via their seed corpora,
so every `go test ./...` exercises the discovered edge cases as
regressions.

`pkg/agent` has Linux-only files (`*_linux.go`) for netlink and netns
work; the macOS/Windows build pulls in stubs from `*_stub.go` so tests
run cleanly on developer laptops.

## License

Apache 2.0 — see [LICENSE](LICENSE).
-											flock M1 scaffold: CNI plugin + agent + NodeConfig CRD
										
										
											2026-04-24 21:17:42 -05:00
+								# flock
-											NodeConfig defaults + code-quality pass + fuzz tests + README
										
										
											2026-04-25 09:25:45 -05:00
+								A small, opinionated Kubernetes CNI built around three ideas:
-											flock M1 scaffold: CNI plugin + agent + NodeConfig CRD
										
										
											2026-04-24 21:17:42 -05:00
-											defaults: built-in baseline is dual-stack (IPv6 + IPv4), not IPv6-only
										
										
											2026-04-25 10:07:48 -05:00
+. **Dual-stack, IPv6-friendly.** Every pod gets a globally routable IPv6
 								   address by default. IPv4 is also enabled by default; either family can
 								   be turned off per-node or per-pod when you really mean to.
-											NodeConfig defaults + code-quality pass + fuzz tests + README
										
										
											2026-04-25 09:25:45 -05:00
+. **No tunnels, no NAT.** Pod addresses are the real packets on the wire.
 								   Each node speaks BGP to its upstream router and advertises its own
 								   per-node prefix. The pod network is just the LAN, plus host routes.
 . **Anycast as a primitive.** A pod can request an anycast address via
 								   an annotation; flock binds it on the pod's loopback and advertises a
 								   `/128` (or `/32`) over BGP, but only while the pod is `Ready`. Multiple
 								   replicas advertise the same address from different nodes for ECMP load
 								   balancing without a separate Service or external LB.
-											flock M1 scaffold: CNI plugin + agent + NodeConfig CRD
										
										
											2026-04-24 21:17:42 -05:00
-											NodeConfig defaults + code-quality pass + fuzz tests + README
										
										
											2026-04-25 09:25:45 -05:00
+								flock is built for clusters where every node already speaks BGP to one
 								or more upstream routers. It deliberately leaves out features you'd
 								expect from a general-purpose CNI — overlays, IPsec/Wireguard, IPAM
 								coordination across nodes, kube-proxy integration — so the moving parts
 								that remain are easy to reason about.
-											flock M1 scaffold: CNI plugin + agent + NodeConfig CRD
										
										
											2026-04-24 21:17:42 -05:00
-											NodeConfig defaults + code-quality pass + fuzz tests + README
										
										
											2026-04-25 09:25:45 -05:00
+								> **Status:** alpha. CRD shape and annotation keys may still change.
-											flock M1 scaffold: CNI plugin + agent + NodeConfig CRD
										
										
											2026-04-24 21:17:42 -05:00
-											NodeConfig defaults + code-quality pass + fuzz tests + README
										
										
											2026-04-25 09:25:45 -05:00
+								## Table of contents
 								- [How it works](#how-it-works)
 								- [Requirements](#requirements)
 								- [Quickstart](#quickstart)
 								- [NodeConfig CRD](#nodeconfig-crd)
 								- [Pod annotations](#pod-annotations)
 								- [Use cases](#use-cases)
 								- [Comparison vs Calico / Cilium](#comparison-vs-calico--cilium)
 								- [Limitations and non-goals](#limitations-and-non-goals)
 								- [Building and testing](#building-and-testing)
 								- [License](#license)
 								## How it works
 								Each node runs a single `flock-agent` DaemonSet pod with three containers:
 								- a privileged init container (`flock-installer`) that drops the CNI
 								  plugin binary into `/opt/cni/bin/flock` and writes
 								  `/etc/cni/net.d/01-flock.conflist`,
 								- the agent itself, which owns IPAM, programs veth pairs, and tracks
 								  pod readiness, and
 								- a [BIRD2](https://bird.network.cz/) sidecar that the agent re-renders
 								  and reloads when the per-node config or the active anycast set changes.
 								Each node has a `NodeConfig` CR (cluster-scoped, name = node name) that
 								declares its IPv6 and IPv4 prefixes, its local BGP ASN, and its upstream
 								peers. The agent reads the CR via a dynamic informer.
 								When kubelet runs the CNI plugin on `ADD`, the plugin opens a unix-socket
 								RPC to the agent. The agent allocates an address from the per-node
 								CIDRs, creates a veth pair, configures the pod side, persists the
 								allocation to `/var/lib/flock/allocations.json`, and returns the result.
 								There is no controller loop and no IPAM coordination across nodes — each
 								node owns a non-overlapping CIDR and allocates locally.
 								For anycast, the agent installs `<anycast-ip> via <pod-eth0-ip> dev <veth>`
 								host routes on the node and adds the anycast IP to BIRD's BGP export
 								filter. When a pod loses readiness, the agent withdraws the route from
 								both the kernel and BGP within one reconcile cycle (sub-second).
 								### Packet path
 								`pod.eth0` (a veth) ↔ host-side veth (with `addrgenmode none`,
 								`fe80::1/64`, proxy-ARP for the v4 default-via) ↔ host kernel ↔ uplink
 								NIC ↔ upstream router. No conntrack, no SNAT, no encapsulation.
 								For IPv6 the host side of every veth carries the deterministic link-local
 								gateway `fe80::1`, so every pod can use a fixed default route. For IPv4
 								the host side answers ARP for `169.254.1.1`, providing the same fixed
 								default route in v4.
 								## Requirements
 								- Linux nodes. flock has not been tested on, and does not target,
 								  Windows nodes.
 								- Kubernetes ≥ 1.27.
 								- An upstream router (or pair) that accepts a BGP session from each
 								  node. flock has been tested with Cisco IOS-XE, Arista EOS, and FRR
 								  acting as the upstream; anything that speaks standard eBGP should work.
 								- Globally routable (or at least datacentre-routable) IPv6 prefix
 								  delegated to the cluster, sliced into a per-node /64. IPv4 is
 								  optional but supported.
 								- Each node must have a unique local ASN. Private ASNs (`64512–65534`,
 								  `4200000000–4294967294`) are typical.
 								## Quickstart
 								```sh
 								# 1. Install CRD + RBAC + DaemonSet (single bundled manifest):
 								kubectl apply -f deploy/install.yaml
 								# 2. Label the node(s) you want flock to manage:
 								kubectl label node <node-name> flock.fritzlab.net/agent=
 								# 3. Apply a NodeConfig CR for that node (see "NodeConfig CRD" below):
 								kubectl apply -f my-nodeconfig.yaml
 								# 4. Verify the agent is up:
 								kubectl -n kube-system get pod -l app=flock-agent -o wide
 								kubectl -n kube-system exec -it ds/flock-agent -c bird -- \
 								    birdc -s /run/flock/bird.ctl show protocols
 								```
 								The DaemonSet is gated by the `flock.fritzlab.net/agent` node label, so
 								unlabelled nodes continue to use whatever CNI was installed before. This
 								lets you migrate node-by-node — start with one node, prove it works, then
 								proceed.
 								## NodeConfig CRD
 								A `NodeConfig` is the only operator-supplied input. One per node, name
 								matches the node name. Example:
 								```yaml
 								apiVersion: flock.fritzlab.net/v1alpha1
 								kind: NodeConfig
 								metadata:
 								  name: node-a
 								spec:
 								  cidr6:
 								    - 2001:db8:f001::/64       # Pods on this node get addresses from here.
 								  cidr4:
 								    - 192.0.2.0/24             # IPv4 pool, used only when a pod opts in.
 								  defaults:
 								    ipv6: true                 # Optional. Built-in baseline if omitted.
-											defaults: built-in baseline is dual-stack (IPv6 + IPv4), not IPv6-only
										
										
											2026-04-25 10:07:48 -05:00
+								    ipv4: true                 # Optional. Built-in baseline if omitted.
-											NodeConfig defaults + code-quality pass + fuzz tests + README
										
										
											2026-04-25 09:25:45 -05:00
+								  bgp:
 								    asn: 65101                 # This node's local ASN.
 								    peers:
 								      - address: 2001:db8::1   # Upstream router (IPv6 session).
 								        asn: 65000
 								      - address: 192.0.2.1     # Same router, IPv4 session.
 								        asn: 65000
 								```
 								### `spec.defaults`
 								`spec.defaults` controls which address families a pod *gets by default*
 								on this node — i.e. when the pod has no explicit `flock.fritzlab.net/ipv6`
 								or `flock.fritzlab.net/ipv4` annotation. Pod annotations always override.
 								If you omit `spec.defaults` (or any individual field inside it) flock
-											defaults: built-in baseline is dual-stack (IPv6 + IPv4), not IPv6-only
										
										
											2026-04-25 10:07:48 -05:00
+								falls back to its built-in baseline of **dual-stack (IPv6 on, IPv4 on)**.
-											NodeConfig defaults + code-quality pass + fuzz tests + README
										
										
											2026-04-25 09:25:45 -05:00
-											defaults: built-in baseline is dual-stack (IPv6 + IPv4), not IPv6-only
										
										
											2026-04-25 10:07:48 -05:00
+								| Goal                              | `spec.defaults`                        |
 								|-----------------------------------|----------------------------------------|
 								| Dual-stack (the default)          | omit, or `{ ipv6: true,  ipv4: true }` |
 								| IPv6-only node                    | `{ ipv6: true,  ipv4: false }`         |
 								| IPv4-only (legacy node)           | `{ ipv6: false, ipv4: true }`          |
-											NodeConfig defaults + code-quality pass + fuzz tests + README
										
										
											2026-04-25 09:25:45 -05:00
 								A NodeConfig that resolves to "neither family" is rejected at allocation
 								time, so misconfiguring both to false will surface as an error on the
 								first `CNI ADD`.
 								### `spec.bgp`
 								Each `peer` becomes one BGP session. The agent picks a node-local source
 								address on the same subnet as the peer; if there isn't one, BIRD uses
 								its default. Multi-homing (multiple peers per family — or per upstream
 								router pair) is allowed.
 								## Pod annotations
 								All annotations live under `flock.fritzlab.net/`. Every annotation is
 								optional; leave them off to inherit the per-node defaults.
 								| Annotation                          | Type   | Purpose                                                                                       |
 								|-------------------------------------|--------|-----------------------------------------------------------------------------------------------|
 								| `flock.fritzlab.net/ipv6`           | bool   | Override `spec.defaults.ipv6` for this pod (`true`/`false`).                                  |
 								| `flock.fritzlab.net/ipv4`           | bool   | Override `spec.defaults.ipv4` for this pod (`true`/`false`).                                  |
 								| `flock.fritzlab.net/cidr6`          | CIDRs  | Restrict IPv6 allocation to a sub-range of the node's `cidr6`. Comma-separated.               |
 								| `flock.fritzlab.net/cidr4`          | CIDRs  | Restrict IPv4 allocation to a sub-range of the node's `cidr4`. Comma-separated.               |
 								| `flock.fritzlab.net/ip-algo`        | list   | Embed identity into the IPv6 IID. Subset of `namespace,pod,image`, in order, comma-separated. |
 								| `flock.fritzlab.net/anycast`        | IPs    | Bind these IPs on the pod's `lo`; advertise via BGP while pod is `Ready`. Mixed v6+v4 ok.     |
-											agent: addresses annotation replaces IPAM allocation
										
										
											2026-04-29 09:46:48 -05:00
+								| `flock.fritzlab.net/addresses`      | IPs    | Bind these IPs on the pod's `eth0`. The first v6 and first v4 **replace** IPAM allocation for that family — the addresses IP becomes the pod's primary IP. Mixed v6+v4 ok. Single-replica only in practice. |
-											NodeConfig defaults + code-quality pass + fuzz tests + README
										
										
											2026-04-25 09:25:45 -05:00
 								Bool values must be the literal strings `"true"` or `"false"`
 								(case-insensitive, surrounding whitespace tolerated). Other values —
 								`1`, `0`, `yes`, `no` — are rejected so a typo can't silently flip
 								behaviour.
-											agent: addresses annotation replaces IPAM allocation
										
										
											2026-04-29 09:46:48 -05:00
+								### `addresses` vs `anycast`
 								Both annotations bind operator-supplied IPs onto a pod and have flock
 								advertise `/128` (or `/32`) per-pod over BGP. The differences are
 								where the IP lands and what it's for:
 								|                            | `anycast`                                          | `addresses`                                                       |
 								|----------------------------|----------------------------------------------------|-------------------------------------------------------------------|
 								| Bound on                   | pod `lo`                                           | pod `eth0`                                                        |
 								| Multi-replica?             | yes — every Ready replica advertises the same IP and the upstream router ECMPs across them | no — the same IP on multiple replicas is operator error           |
 								| Replaces IPAM?             | no — pod still has an IPAM-allocated unicast IP   | **yes** — the first v6 + first v4 in the list become the pod's primary IPs in place of an IPAM allocation |
 								| Workload visibility        | only the IPAM IP is on the primary interface     | the public IP is `eth0`'s primary address — workloads that read their own NIC see it (e.g. Plex's remote-access detection) |
 								Use `anycast` for shared services with many replicas (DNS, ingress).
 								Use `addresses` when one specific pod needs a known public IP that the
 								workload itself must see on its primary interface.
 								### Conflict detection
 								`addresses` and `anycast` reject pods that supply an IP whose family is
 								disabled. If the resolved `WantV4` is false (via the pod's `ipv4`
 								annotation or the NodeConfig default) and any addresses- or
 								anycast-supplied IP is IPv4, the CNI ADD fails with an explicit error.
 								Same for v6. Both annotation types put IPs on a pod interface and rely
 								on the family being enabled for return-path routing — silently accepting
 								the IP would leave a non-functional pod.
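As a rough sketch of the family check described above, assuming the annotation value is a comma-separated IP list and that the resolved family switches are available as two booleans. `checkFamilies` and its parameters are hypothetical names, not flock's real code.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// checkFamilies is a hypothetical helper. It returns an error if any IP in
// the comma-separated list belongs to an address family that is disabled
// for the pod (wantV4 / wantV6 stand in for the resolved settings).
func checkFamilies(list string, wantV4, wantV6 bool) error {
	for _, field := range strings.Split(list, ",") {
		field = strings.TrimSpace(field)
		if field == "" {
			continue
		}
		addr, err := netip.ParseAddr(field)
		if err != nil {
			return fmt.Errorf("invalid IP %q: %w", field, err)
		}
		addr = addr.Unmap() // treat ::ffff:a.b.c.d as IPv4
		switch {
		case addr.Is4() && !wantV4:
			return fmt.Errorf("IPv4 address %s supplied but IPv4 is disabled", addr)
		case addr.Is6() && !wantV6:
			return fmt.Errorf("IPv6 address %s supplied but IPv6 is disabled", addr)
		}
	}
	return nil
}

func main() {
	// The v4 entry is rejected because IPv4 is switched off for this pod.
	fmt.Println(checkFamilies("2001:db8:a::53, 192.0.2.53", false, true))
	// Both families enabled: no error.
	fmt.Println(checkFamilies("2001:db8:a::53, 192.0.2.53", true, true))
}
```

Failing early, before any interface is touched, matches the intent stated above: an explicit error at CNI ADD time rather than a pod that comes up without a working return path.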
 								### Outside-aggregate advertisement
 								When an `addresses` IP replaces IPAM (becomes the pod's primary IP) the
 								IP is typically **outside** the node's BGP aggregate (e.g. a public
 								`/32` on a node whose pod CIDR is private). flock notices this during
 								BGP rendering and advertises the IP individually as a per-pod `/32` or
 								`/128` so the upstream router has a route to it.
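The containment test behind this behaviour might look roughly like the sketch below. It assumes the node's aggregates are available as CIDR strings; `hostRouteIfOutside` is a hypothetical name, and the real check happens inside flock's BGP rendering, which is not shown here.

```go
package main

import (
	"fmt"
	"net/netip"
)

// hostRouteIfOutside is a hypothetical helper. It returns the per-pod host
// route (/32 or /128) to advertise for ip, or "" when ip is already covered
// by one of the node's aggregate prefixes.
func hostRouteIfOutside(ip string, aggregates []string) (string, error) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return "", fmt.Errorf("invalid IP %q: %w", ip, err)
	}
	addr = addr.Unmap()
	for _, a := range aggregates {
		p, err := netip.ParsePrefix(a)
		if err != nil {
			return "", fmt.Errorf("invalid aggregate %q: %w", a, err)
		}
		if p.Contains(addr) {
			return "", nil // the aggregate already provides a route
		}
	}
	// BitLen is 32 for IPv4 and 128 for IPv6, giving a host route.
	return netip.PrefixFrom(addr, addr.BitLen()).String(), nil
}

func main() {
	aggregates := []string{"10.42.1.0/24", "2001:db8:1::/64"} // example values
	for _, ip := range []string{"10.42.1.7", "192.0.2.166", "2001:db8:c606::166"} {
		route, err := hostRouteIfOutside(ip, aggregates)
		fmt.Printf("%s -> route=%q err=%v\n", ip, route, err)
	}
}
```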
### Example pods

Default dual-stack, no annotations needed:
 								```yaml
 								apiVersion: v1
 								kind: Pod
 								metadata:
 								  name: minimal
 								```
IPv6 only, opting out of the default v4 allocation:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: v6-only
  annotations:
    flock.fritzlab.net/ipv4: "false"
```
 								Operator-friendly addressing — `fnv(namespace) | fnv(pod) | random`
 								packed into the host bits, so a pod's identity is recognisable from
 								its IP in `kubectl get pods -o wide`:
 								```yaml
 								metadata:
 								  annotations:
 								    flock.fritzlab.net/ip-algo: "namespace,pod"
 								```
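To make the `fnv(namespace) | fnv(pod) | random` layout concrete, here is a sketch of one possible packing. The field widths (16 bits of namespace hash, 16 bits of pod hash, 32 random bits), the restriction to a `/64` prefix, and the names `fnv16` and `embedHostBits` are all assumptions for illustration; flock's actual layout may differ.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
	"net/netip"
)

// fnv16 folds a 32-bit FNV-1a hash down to 16 bits (assumed width).
func fnv16(s string) uint16 {
	h := fnv.New32a()
	h.Write([]byte(s))
	sum := h.Sum32()
	return uint16(sum>>16) ^ uint16(sum)
}

// embedHostBits is a hypothetical helper. It fills the 64 host bits of an
// IPv6 /64 with fnv16(namespace) | fnv16(pod) | random.
func embedHostBits(prefix, namespace, pod string, random uint32) (string, error) {
	p, err := netip.ParsePrefix(prefix)
	if err != nil {
		return "", fmt.Errorf("invalid prefix %q: %w", prefix, err)
	}
	if !p.Addr().Is6() || p.Bits() != 64 {
		return "", fmt.Errorf("sketch only handles IPv6 /64 prefixes, got %s", prefix)
	}
	b := p.Masked().Addr().As16()
	binary.BigEndian.PutUint16(b[8:10], fnv16(namespace))
	binary.BigEndian.PutUint16(b[10:12], fnv16(pod))
	binary.BigEndian.PutUint32(b[12:16], random)
	return netip.AddrFrom16(b).String(), nil
}

func main() {
	// Pods in the same namespace share the fifth 16-bit group of the address.
	for _, pod := range []string{"web-0", "web-1"} {
		ip, err := embedHostBits("2001:db8:1::/64", "default", pod, 1)
		fmt.Println(pod, ip, err)
	}
}
```

With a layout like this, the namespace hash occupies one hextet and the pod hash the next, which is what makes addresses visually groupable in `kubectl get pods -o wide`; the random tail only disambiguates collisions.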
 								Anycast service — three replicas, each advertising the same v6+v4
 								anycast pair from the node it lands on. The upstream router does ECMP
 								across the active set:
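```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dns
spec:
  replicas: 3
  # apps/v1 requires a selector that matches the template labels.
  selector:
    matchLabels:
      app: dns
  template:
    metadata:
      labels:
        app: dns
      annotations:
        flock.fritzlab.net/anycast: "2001:db8:a::53, 192.0.2.53"
    spec:
      containers:
        - name: coredns
          image: coredns/coredns
          # Aggressive probe so the anycast route is withdrawn quickly
          # once the pod stops being Ready.
          readinessProbe:
            httpGet: { path: /ready, port: 8181 }
            periodSeconds: 1
            failureThreshold: 1
```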
Workload with a known public IP: a single-replica pod whose application
inspects its own primary interface (as Plex's remote-access flow does). The
addresses become the pod's primary IPs in place of any IPAM allocation;
the pod's `eth0` ends up with exactly the supplied addresses, and BGP
advertises them as a `/128` and a `/32`:
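```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: plex
spec:
  # Keep this at 1: the same `addresses` IP on several replicas is an error.
  replicas: 1
  # apps/v1 requires a selector that matches the template labels.
  selector:
    matchLabels:
      app: plex
  template:
    metadata:
      labels:
        app: plex
      annotations:
        flock.fritzlab.net/addresses: "2001:db8:c606::166, 192.0.2.166"
    spec:
      containers:
        - name: plex
          image: plexinc/pms-docker
```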
## Use cases
 								**Highly-available DNS.** Run N CoreDNS replicas, each annotated with
 								the same `anycast` IP. Point client `/etc/resolv.conf` at the anycast
 								address. Each replica advertises a `/128` from its own node; the
 								upstream router does ECMP. Lose a pod, traffic fails over within a
 								probe cycle.
 								**Replacing a kube-proxy `ClusterIP`.** Headless Service plus an anycast
 								IP gives you a single stable address with load-balancing across pods,
 								without the DNAT-pinning that makes long-lived TCP keepalive connections
 								stick to one backend forever. ECMP routes each new flow independently.
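The per-flow behaviour of ECMP can be illustrated with a toy next-hop selector. This is only a sketch of the general idea: real routers use vendor-specific hash functions and inputs, and the `flow` type, `pickNextHop`, and the node names below are all hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// flow is a simplified 5-tuple (hypothetical type for this sketch).
type flow struct {
	srcIP, dstIP     string
	srcPort, dstPort uint16
	proto            uint8
}

// pickNextHop hashes the 5-tuple and selects one of the next hops that are
// currently advertising the anycast address. Packets of one flow always map
// to the same hop while the set is unchanged; a new flow is hashed afresh.
func pickNextHop(f flow, nextHops []string) string {
	if len(nextHops) == 0 {
		return ""
	}
	h := fnv.New32a()
	fmt.Fprintf(h, "%s|%s|%d|%d|%d", f.srcIP, f.dstIP, f.srcPort, f.dstPort, f.proto)
	return nextHops[h.Sum32()%uint32(len(nextHops))]
}

func main() {
	hops := []string{"node-a", "node-b", "node-c"} // nodes with a Ready replica
	for port := uint16(40000); port < 40005; port++ {
		f := flow{"2001:db8::10", "2001:db8:a::53", port, 53, 17}
		fmt.Printf("src port %d -> %s\n", port, pickNextHop(f, hops))
	}
}
```

Note that with a plain modulo hash like this one, changing the next-hop set can remap existing flows; whether that happens in practice depends on the upstream router's ECMP implementation.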
 								**Per-pod public IPv6.** Because every pod has a globally routable IPv6
 								address and the cluster does no NAT, a pod's `eth0` IP is reachable from
 								the rest of the internet (subject to your firewall). Useful for things
like outgoing SMTP, where you want a stable source IP per pod, or for
 								peer-to-peer protocols that don't tolerate NAT.
 								**Fast pod identification in `kubectl`.** With
 								`flock.fritzlab.net/ip-algo: namespace,pod` the IPv6 host bits encode
 								the pod's namespace+name, so you can recognise a pod from its IP without
 								a lookup. Reverse-DNS via a wildcard zone makes those IPs human-readable
 								too.
 								**Static-IP migration.** Annotation-driven address allocation means you
 								can ask for a specific sub-CIDR (`cidr6: 2001:db8:f001::ab00/120`) for
 								services that previously needed pinned IPs (mail server, ingress
 								controller). When the static-IP requirement goes away, drop the
 								annotation and the pod gets a normal allocation.
 								## Comparison vs Calico / Cilium
|                          | flock                       | Calico                       | Cilium                       |
|--------------------------|-----------------------------|------------------------------|------------------------------|
| Default address family   | dual (IPv6+IPv4)            | IPv4                         | dual                         |
| BGP                      | yes (BIRD)                  | yes                          | optional                     |
| Overlay (VXLAN/IPIP)     | never                       | optional                     | yes (VXLAN/Geneve) or native |
| NAT in datapath          | never                       | masquerade by default        | masquerade by default        |
| Anycast pod addressing   | first-class                 | manual                       | optional, via service mesh   |
| eBPF datapath            | no                          | optional                     | yes                          |
| NetworkPolicy            | yes (nftables)              | yes (Felix)                  | yes (eBPF)                   |
| Cluster size target      | small (< 100 nodes)         | thousands                    | thousands                    |
| Operational surface area | low (1 DaemonSet, 1 CRD)    | medium                       | high                         |
| Production-ready         | alpha                       | yes                          | yes                          |
flock is not trying to compete with Calico or Cilium. The right answer
for most clusters is one of those two. flock exists for clusters where
every node already speaks BGP, the operator wants real (no NAT) IPv6
addressing on every pod, and per-pod anycast is something they actually
want to use rather than work around.
 								## Limitations and non-goals
- NetworkPolicy supports `networking.k8s.io/v1` (ingress + egress, all
  three peer types, numeric ports + port ranges). Named ports and
  AdminNetworkPolicy are not yet implemented.
 								- No NAT, no masquerade, no SNAT-egress. Pods reach the wider internet
 								  using their real cluster-routable addresses; if your IPv4 pool isn't
 								  routable beyond your network, those pods can't reach v4-only hosts on
 								  the public internet without help from your border router.
- No multi-cluster, no peering across clusters.
 								- Linux-only datapath.
 								- IPAM is per-node — there's no global allocator and no IP mobility.
 								  When a pod moves to a different node it gets a new address.
 								- The agent is privileged. It mounts `/var/run/netns`, configures veth
 								  pairs, manages kernel routes, and holds `CAP_NET_ADMIN`. This is
 								  inherent to being a CNI; reducing privilege further is not a goal.
 								- If BIRD dies but the agent stays up, pods on that node stop being
 								  reachable from off-node. The DaemonSet liveness probes catch this.
 								## Building and testing
 								```sh
 								# Unit tests + fuzz seed corpora (fast, ~1s):
 								go test ./...
 								# Targeted fuzz pass:
 								go test -run NEVERMATCH -fuzz=FuzzParseAnnotations -fuzztime=30s ./pkg/agent
 								go test -run NEVERMATCH -fuzz=FuzzRender           -fuzztime=30s ./pkg/routing/bird
 								go test -run NEVERMATCH -fuzz=FuzzEmbed            -fuzztime=30s ./pkg/embed
 								go test -run NEVERMATCH -fuzz=FuzzIPAM_Allocate    -fuzztime=30s ./pkg/agent
 								# Build the container image (used by the DaemonSet):
 								docker build -t flock:dev .
 								```
 								The fuzz tests are also run as plain unit tests via their seed corpora,
 								so every `go test ./...` exercises the discovered edge cases as
 								regressions.
 								`pkg/agent` has Linux-only files (`*_linux.go`) for netlink and netns
 								work; the macOS/Windows build pulls in stubs from `*_stub.go` so tests
 								run cleanly on developer laptops.
 								## License
Apache 2.0. See [LICENSE](LICENSE).