Network Architecture
This document covers Kubernetes application-level networking. For physical network topology and VLAN configuration, see Network Topology.
Container Networking (CNI)
Cilium CNI
The cluster uses Cilium as the primary Container Network Interface (CNI):
- Pod CIDR: 10.69.0.0/16 (native routing mode)
- Service CIDR: 10.96.0.0/16
- Mode: Non-exclusive (paired with Multus for multi-network support)
- Kube-Proxy Replacement: Enabled (eBPF-based service load balancing)
- Load Balancing Algorithm: Maglev with DSR (Direct Server Return)
- Network Policy: Endpoint routes enabled
- BPF Masquerading: Enabled for outbound traffic
Key Features:
- High-performance eBPF data plane
- Native Kubernetes network policy support
- L2 announcements for external load balancer IPs
- Advanced observability and monitoring
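As a rough illustration, the settings listed above could map onto Cilium Helm values along the following lines. This is a sketch only, not the cluster's actual values file: key names follow recent Cilium charts (older releases use different keys for some options, such as `tunnel: disabled` instead of `routingMode: native`), and using the pod CIDR as the native routing CIDR is an assumption.

```yaml
# Sketch of Cilium Helm values matching the settings described above.
# Assumption: the native routing CIDR equals the pod CIDR.
routingMode: native
ipv4NativeRoutingCIDR: 10.69.0.0/16

# The service CIDR (10.96.0.0/16) is configured on the Kubernetes API server,
# not in the Cilium chart, so it does not appear here.

kubeProxyReplacement: true   # eBPF-based service load balancing

loadBalancer:
  algorithm: maglev          # consistent hashing for backend selection
  mode: dsr                  # Direct Server Return

endpointRoutes:
  enabled: true              # per-endpoint routes

bpf:
  masquerade: true           # eBPF masquerading for outbound traffic

cni:
  exclusive: false           # leave other CNI configs in place so Multus can coexist

l2announcements:
  enabled: true              # ARP/NDP announcement of load balancer IPs
```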
Multus CNI (Multiple Networks)
Multus provides additional network interfaces to pods beyond the primary Cilium network:
- Primary Use: IoT network attachment (VLAN-based isolation)
- Network Attachment: macvlan on ens19 interface
- Mode: Bridge mode with DHCP IPAM
- Purpose: Enable pods to connect to additional networks (e.g., IoT devices, legacy systems)
Pods can request additional networks via annotations:
metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks: macvlan-conf
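The `macvlan-conf` name in the annotation refers to a NetworkAttachmentDefinition in the pod's namespace. A minimal sketch of what such a definition might look like, based on the interface, mode, and IPAM type described above, follows. The `cniVersion` value is an assumption, and the repository's actual definition may carry additional fields.

```yaml
# Hypothetical NetworkAttachmentDefinition for the macvlan network described above.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-conf
spec:
  # The CNI configuration is embedded as a JSON string.
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "ens19",
      "mode": "bridge",
      "ipam": {
        "type": "dhcp"
      }
    }
```

Note that the `dhcp` IPAM plugin relies on a DHCP daemon running on each node to obtain and renew leases on behalf of pods.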
Ingress Controllers
The cluster uses dual ingress-nginx controllers for traffic routing:
Internal Ingress
- Class: internal (default)
- Purpose: Internal services, private DNS
- Version: v4.13.3
- Load Balancer: Cilium L2 announcement
- DNS: Synced to internal DNS via k8s-gateway and External-DNS (UniFi webhook)
External Ingress
- Class: external
- Purpose: Public-facing services
- Version: v4.13.3
- Load Balancer: Cilium L2 announcement
- DNS: Synced to Cloudflare via External-DNS
- Tunnel: Cloudflared for secure access
Load Balancer IP Management
Cilium L2 Announcements
Cilium's L2 announcement feature provides load balancer IPs for services:
- How it works: Cilium announces load balancer IPs via L2 (ARP/NDP)
- Policy-based: L2AnnouncementPolicy defines which services get announced
- Benefits:
  - No external load balancer required
  - Native Kubernetes LoadBalancer service type support
  - High availability through leader election
  - Automatic failover
Configuration: See kubernetes/apps/kube-system/cilium/config/l2.yaml
This enables both ingress controllers to receive external IPs that are accessible from the broader network.
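For orientation, an L2 announcement setup generally pairs an announcement policy with an IP pool. The sketch below is illustrative and is not a copy of the referenced `l2.yaml`: the resource names, the interface regex, and the address block (a documentation range) are all placeholders, and the `apiVersion` and the `blocks` field depend on the Cilium version (older releases use `cidrs`).

```yaml
# Hypothetical policy: announce LoadBalancer IPs from Linux nodes.
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: l2-policy              # placeholder name
spec:
  loadBalancerIPs: true        # announce IPs assigned to LoadBalancer services
  interfaces:
    - ^ens[0-9]+               # placeholder regex for the announcing NICs
  nodeSelector:
    matchLabels:
      kubernetes.io/os: linux
---
# Hypothetical pool from which LoadBalancer IPs are allocated.
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: lb-pool                # placeholder name
spec:
  blocks:
    - cidr: 192.0.2.0/28       # placeholder range (TEST-NET-1), not the real pool
```

For each announced IP, one node holds a lease and answers ARP/NDP requests. If that node fails, another node acquires the lease, which is the leader-election failover described above.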
Network Policies
graph LR
  subgraph Policies
    Default[Default Deny]
    Allow[Allowed Routes]
  end
  subgraph Apps
    Media[Media Stack]
    Monitor[Monitoring]
    DB[Databases]
  end
  Allow --> Media
  Allow --> Monitor
  Default --> DB
DNS Configuration
Internal DNS (k8s-gateway)
- Purpose: DNS server for internal ingresses
- Domain: Internal cluster services
- Integration: Works with External-DNS for automatic record creation
External-DNS (Dual Instances)
Instance 1: Internal DNS
- Provider: UniFi (via webhook provider)
- Target: UDM Pro Max
- Ingress Class: internal
- Purpose: Sync private DNS records for internal services
Instance 2: External DNS
- Provider: Cloudflare
- Ingress Class: external
- Purpose: Sync public DNS records for externally accessible services
How DNS Works
- Create an Ingress with class internal or external
- External-DNS watches for new/updated ingresses
- The appropriate External-DNS instance syncs DNS records to its target provider
- Services become accessible via their configured hostnames
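The flow above can be illustrated with a minimal Ingress. The application name, namespace, hostname, and port here are hypothetical; only the `ingressClassName` values `internal` and `external` come from this document.

```yaml
# Hypothetical Ingress for an internal-only service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app            # placeholder name
  namespace: default           # placeholder namespace
spec:
  ingressClassName: internal   # use "external" for public-facing services
  rules:
    - host: app.example.internal   # placeholder hostname; becomes the DNS record
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app  # placeholder Service
                port:
                  number: 80
```

With this class, the internal External-DNS instance would be the one expected to pick up the hostname, while an Ingress with class `external` would be handled by the Cloudflare instance.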
Security
Network Policies
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
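Because the default-deny policy above also blocks egress, pods in the namespace cannot resolve DNS until an allow rule is added. A common companion policy looks roughly like the following sketch. It assumes cluster DNS runs in `kube-system` with the conventional `k8s-app: kube-dns` label, which may not match this cluster.

```yaml
# Sketch: allow DNS lookups from all pods in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress       # placeholder name
spec:
  podSelector: {}              # applies to every pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        # Both selectors in one entry: pods labelled kube-dns in kube-system.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

NetworkPolicies are additive, so this rule widens what the default-deny policy permits without replacing it.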
TLS Configuration
- Automatic certificate management via cert-manager
- Let's Encrypt integration
- Internal PKI for service mesh
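As an illustration of the Let's Encrypt integration, a cert-manager ClusterIssuer could look like the sketch below. The issuer name, email address, and Secret names are placeholders, and the choice of a DNS-01 solver backed by Cloudflare is an assumption based on Cloudflare being the public DNS provider. The cluster's real issuer may differ.

```yaml
# Hypothetical ACME issuer using Let's Encrypt with a Cloudflare DNS-01 solver.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production           # placeholder name
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com             # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-production-key   # Secret holding the ACME account key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token # placeholder Secret name
              key: api-token             # placeholder key within the Secret
```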
Service Mesh
Traffic Flow
graph LR
  subgraph Ingress
    External[External Traffic]
    Nginx[ingress-nginx]
  end
  subgraph Services
    App1[Service 1]
    App2[Service 2]
    DB[Database]
  end
  External --> Nginx
  Nginx --> App1
  Nginx --> App2
  App1 --> DB
  App2 --> DB
Best Practices
- Security
  - Implement default deny policies
  - Use TLS everywhere
  - Regular security audits
  - Network segmentation
- Performance
  - Load balancer optimization
  - Connection pooling
  - Proper resource allocation
  - Traffic monitoring
- Reliability
  - High availability configuration
  - Failover planning
  - Backup routes
  - Health checks
- Monitoring
  - Network metrics collection
  - Traffic analysis
  - Latency monitoring
  - Bandwidth usage tracking
Troubleshooting
Common network issues and resolution steps:
- Connectivity Issues
  - Check network policies
  - Verify DNS resolution
  - Inspect service endpoints
  - Review ingress configuration
- Performance Problems
  - Monitor network metrics
  - Check for bottlenecks
  - Analyze traffic patterns
  - Review resource allocation