Dapper Cluster Documentation

This documentation covers the architecture, configuration, and operation of the Dapper Kubernetes cluster, a home lab cluster with three control plane nodes, two general-purpose worker nodes, and one GPU node.

Cluster Overview

graph TD
    subgraph ControlPlane[Control Plane]
        CP1[Control Plane 1<br>4 CPU, 16GB]
        CP2[Control Plane 2<br>4 CPU, 16GB]
        CP3[Control Plane 3<br>4 CPU, 16GB]
    end

    subgraph WorkerNodes[Worker Nodes]
        W1[Worker 1<br>16 CPU, 128GB]
        W2[Worker 2<br>16 CPU, 128GB]
        GPU[GPU Node<br>16 CPU, 128GB<br>4x Tesla P100]
    end

    CP1 --- CP2
    CP2 --- CP3
    CP3 --- CP1

    ControlPlane --> WorkerNodes

Hardware Specifications

Control Plane

- 3 nodes for high availability
- 4 CPU cores per node
- 16GB RAM per node
- Dedicated to cluster control plane operations
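
Three nodes is the smallest control plane size that survives the loss of a node, assuming one etcd member runs on each control plane node (an assumption; this page does not state the etcd topology). etcd needs a majority of its members to agree, and the arithmetic can be sketched as below. The function names are illustrative only.

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with `members` voting members."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """Number of members that can fail while a majority remains."""
    return members - quorum(members)


# Assumption: one etcd member per control plane node (3 in this cluster).
for n in (1, 2, 3):
    print(f"{n} member(s): quorum={quorum(n)}, "
          f"tolerated failures={fault_tolerance(n)}")
```

Under that assumption, a two-node control plane would tolerate no failures at all, which is why the design uses three.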

Worker Nodes

- 2 general-purpose worker nodes
- 16 CPU cores per node
- 128GB RAM per node
- Handle general-purpose workloads and applications

GPU Node

- Specialized GPU worker node
- 16 CPU cores
- 128GB RAM
- 4x NVIDIA Tesla P100 GPUs
- Handles ML/AI and GPU-accelerated workloads
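
The per-node figures above add up to the cluster totals computed in the following sketch. It sums raw hardware only, not what Kubernetes reports as allocatable, and it assumes the control plane nodes take no general workloads, in line with their "dedicated" role. The `NodeGroup` type and the group names are hypothetical, introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    name: str
    count: int
    cpu_cores: int  # per node
    ram_gb: int     # per node
    gpus: int = 0   # per node


# Figures copied from the hardware specifications in this section.
GROUPS = [
    NodeGroup("control-plane", count=3, cpu_cores=4, ram_gb=16),
    NodeGroup("worker", count=2, cpu_cores=16, ram_gb=128),
    NodeGroup("gpu", count=1, cpu_cores=16, ram_gb=128, gpus=4),
]


def totals(groups):
    """Sum raw CPU, RAM, and GPU counts across the given node groups."""
    return {
        "cpu_cores": sum(g.count * g.cpu_cores for g in groups),
        "ram_gb": sum(g.count * g.ram_gb for g in groups),
        "gpus": sum(g.count * g.gpus for g in groups),
    }


cluster = totals(GROUPS)
# Assumption: only worker and GPU nodes run application workloads.
schedulable = totals([g for g in GROUPS if g.name != "control-plane"])

print(cluster)      # {'cpu_cores': 60, 'ram_gb': 432, 'gpus': 4}
print(schedulable)  # {'cpu_cores': 48, 'ram_gb': 384, 'gpus': 4}
```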

Key Features

- High-availability Kubernetes cluster
- GPU acceleration support
- Automated deployment using Flux CD
- Secure secrets management with SOPS
- NFS and OpenEBS storage integration
- Comprehensive monitoring and observability
- Media services automation
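
To illustrate the GPU acceleration feature listed above, here is a minimal sketch of how a workload might request a GPU. It assumes the GPUs are exposed through the NVIDIA device plugin under the `nvidia.com/gpu` resource name, which this page does not confirm. The pod name, image, and `gpu_pod` helper are hypothetical placeholders.

```python
import json


def gpu_pod(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Pod manifest that requests `gpus` GPUs via resource limits."""
    # The GPU node described in this document has 4 GPUs in total.
    if not 1 <= gpus <= 4:
        raise ValueError("request between 1 and 4 GPUs")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Assumption: NVIDIA device plugin resource name.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


# Hypothetical pod name and image, used only for illustration.
manifest = gpu_pod("cuda-smoke-test", "example.invalid/cuda-job:latest", gpus=1)
print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -`. In a Flux CD workflow, the equivalent manifest would more likely be committed to the Git repository.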

Infrastructure Components

graph TD
    subgraph CoreServices[Core Services]
        Flux[Flux CD]
        Storage[Storage Layer]
        Network[Network Layer]
    end

    subgraph Applications
        Media[Media Stack]
        Monitor[Monitoring]
        GPU[GPU Workloads]
    end

    CoreServices --> Applications

    Storage --> |NFS/OpenEBS| Applications
    Network --> |Ingress/DNS| Applications

Documentation Structure

Architecture: Detailed technical documentation about cluster design and components
- High-availability control plane design
- Storage architecture and configuration
- Network topology and policies
- GPU integration and management
Applications: Information about deployed applications and their configurations
- Media services stack
- Monitoring and observability
- GPU-accelerated applications
Operations: Guides for installation, maintenance, and troubleshooting
- Cluster setup procedures
- Node management
- GPU configuration
- Maintenance tasks

Getting Started

For new users, we recommend starting with:

- Architecture Overview: Understanding the cluster design
- Installation Guide: Setting up the cluster
- Application Stack: Deploying applications