🛠️ Technical Stack
Overview of the Phase-1 ML/DL, Phase-2 classical ML, and Phase-3 transformer workflows that power this dashboard across NVIDIA CUDA and Apple Silicon MLX/MPS environments.
🧠 Core Deep Learning Framework
PyTorch 2.0+
Primary runtime for Phase-1 CNN/MLP/RNN training, custom transformer experiments, Phase-3 Vision Transformer pipelines, and shared optimization and export flows.
Transformers 4.30+
Hugging Face stack for tokenization, pre-trained checkpoints, compact text transformers, and PEFT-based LoRA fine-tuning workflows.
CUDA / MPS / MLX
CUDA: Accelerates NVIDIA training and quantized fine-tuning workloads.
Apple Silicon (MPS / MLX): PyTorch's MPS backend provides native GPU acceleration on M-series processors via Metal Performance Shaders, while MLX-LM enables Apple-native quantized LoRA and QLoRA training and fused export flows.
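The backend choice above can be sketched as a small device-selection helper. This is a minimal illustration, not the dashboard's actual code: `pick_device` and `detect_device` are hypothetical names, and the CUDA, then MPS, then CPU preference order is an assumption. MLX is a separate framework and is not covered here.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a PyTorch-style device string, preferring CUDA, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe the local PyTorch install; fall back to CPU if torch is missing."""
    try:
        import torch  # third-party; optional in this sketch
    except ImportError:
        return "cpu"
    # torch.backends.mps only exists on builds that ship the Metal backend.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps_backend is not None and mps_backend.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)
```

Keeping the decision in a pure function (`pick_device`) makes the preference order easy to test without a GPU present.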
🏗️ Model Coverage & Training Workflows
Model Coverage (transformer.py + backend/ml_models)
- Phase 1 ML/DL Models - CNN, MLP, and RNN training flows with CSV, JSONL, and image dataset support
- Custom Transformer Blocks - Attention, positional encoding, feed-forward layers, and decoder-oriented experiments
- Phase 2 Classical ML - Random Forest, SVM, and Logistic Regression registries with CSV schema inspection
- Phase 3 Models - Vision Transformer, MicroLLaMA, and MiniLLaMA workflows with optional ONNX export plus GGUF conversion for the text-model path
- Shared Dataset Pipeline - Built-in upload, sample-dataset generation, preview, validation, and run history flows
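To make the custom transformer blocks listed above more concrete, here is a dependency-free sketch of two of them: positional encoding and causal scaled dot-product attention. It does not reproduce `transformer.py`. The sinusoidal encoding variant, the function names, and the list-of-lists matrix type are all assumptions chosen for readability.

```python
import math
from typing import List

Matrix = List[List[float]]


def positional_encoding(seq_len: int, d_model: int) -> Matrix:
    """Sinusoidal encoding: sine on even dimensions, cosine on odd ones."""
    pe = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax; -inf scores receive zero weight."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q: Matrix, k: Matrix, v: Matrix, causal: bool = True) -> Matrix:
    """Scaled dot-product attention with an optional decoder-style causal mask."""
    scale = math.sqrt(len(k[0]))
    out = []
    for i, q_i in enumerate(q):
        scores = []
        for j, k_j in enumerate(k):
            if causal and j > i:
                scores.append(float("-inf"))  # hide future positions
            else:
                scores.append(sum(a * b for a, b in zip(q_i, k_j)) / scale)
        weights = softmax(scores)
        out.append([
            sum(w * v_j[c] for w, v_j in zip(weights, v))
            for c in range(len(v[0]))
        ])
    return out
```

With the causal mask enabled, the first position can only attend to itself, so its output equals its own value vector.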
Training Workflows (finetune.py / finetune_mlx.py / backend/ml_models)
- Gradient Accumulation & MPS Safety - Automatic batch-size and sequence-length adjustments for constrained Apple Silicon memory
- LoRA / QLoRA Backends - PEFT plus bitsandbytes on CUDA and MLX-native quantized fine-tuning on Apple Silicon
- Resume, Validation & Scheduling - Cosine scheduling, gradient checkpointing, resume-from-checkpoint, and epoch validation samples
- Export Paths - SafeTensors, GGUF, ONNX, and artifact bundles for local deployment and comparison
- Telemetry & History - Training status, logs, history, and downloadable artifacts surfaced directly in the dashboard
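Two of the workflow items above, cosine scheduling and memory-driven batch adjustment with gradient accumulation, can be sketched in a few lines. The function names, the halving strategy, and the `max_micro_batch` cap are illustrative assumptions. The actual logic in `finetune.py` and `finetune_mlx.py` may differ.

```python
import math
from typing import Tuple


def cosine_lr(step: int, total_steps: int, lr_max: float, lr_min: float = 0.0) -> float:
    """Cosine decay from lr_max at step 0 to lr_min at total_steps."""
    progress = min(max(step, 0), total_steps) / max(total_steps, 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


def fit_batch(batch_size: int, accum_steps: int, max_micro_batch: int) -> Tuple[int, int]:
    """Halve the micro-batch until it fits the cap, doubling accumulation
    steps so the effective batch size (batch_size * accum_steps) is unchanged."""
    while batch_size > max_micro_batch and batch_size % 2 == 0:
        batch_size //= 2
        accum_steps *= 2
    return batch_size, accum_steps
```

For example, a requested batch of 32 with a cap of 8 becomes a micro-batch of 8 with 4 accumulation steps, which keeps the optimizer seeing an effective batch of 32 while lowering peak memory.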
🌐 Web Dashboard & Visualization
Flask 3.0+
Flask serves the unified Phase-1, Phase-2, and Phase-3 dashboard, training APIs, dataset upload routes, telemetry, and artifact export endpoints.
Plotly 5.20+
Interactive 3D visualizations for checkpoints, embeddings, layer structures, neural atlases, and interpretability scenes.
Marked.js
Markdown parsing for rendering documentation and workflow guides directly in the dashboard.
📊 Data Processing & System Monitoring
NumPy 1.24+
Array-centric processing for tabular, text, image, and geometry workflows across ML and deep-learning training.
PSUtil 5.9+
Real-time system and process monitoring for CPU, memory, and disk usage. psutil itself does not report GPU usage, so GPU tracking relies on other tooling.
TQDM 4.65+
Progress reporting for fine-tuning, classical ML jobs, dataset conversion, and export pipelines with ETA estimates.
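A monitoring snapshot of the kind described above might look like the sketch below. The `system_snapshot` name and the returned keys are hypothetical. psutil is treated as optional so the example still runs without it, and GPU metrics are left out because psutil has no GPU API.

```python
import shutil
from typing import Dict, Optional

try:
    import psutil  # third-party; optional in this sketch
except ImportError:
    psutil = None


def system_snapshot(path: str = "/") -> Dict[str, Optional[float]]:
    """Collect CPU, memory, and disk usage percentages for a telemetry payload."""
    usage = shutil.disk_usage(path)
    disk_percent = round(100.0 * usage.used / usage.total, 1) if usage.total else 0.0
    snapshot: Dict[str, Optional[float]] = {
        "cpu_percent": None,
        "memory_percent": None,
        "disk_percent": disk_percent,
    }
    if psutil is not None:
        # With interval=None the first call returns a meaningless 0.0;
        # later calls report usage since the previous call.
        snapshot["cpu_percent"] = psutil.cpu_percent(interval=None)
        snapshot["memory_percent"] = psutil.virtual_memory().percent
    return snapshot
```

Polling this function on a timer and returning the dictionary as JSON is one simple way to feed a live dashboard panel.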
🦙 Model Export & Deployment
SafeTensors 0.4+
Checkpoint and adapter serialization for transformer fine-tuning, resume bundles, and model conversion pipelines.
Ollama Integration
GGUF export path for PyTorch and Apple-native MLX outputs, with direct Ollama import for local inference.
GGUF / ONNX Runtime
llama.cpp quantization supports GGUF deployment for the Phase-3 text models, while ONNX and ONNX Runtime cover Phase-2 exports and the optional Phase-3 ONNX path.
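The Ollama import step mentioned above usually comes down to a Modelfile that points at the exported GGUF file. The helper below is a hedged sketch of writing one. The function names, the `model.gguf` filename, and the temperature default are assumptions, not the dashboard's export code.

```python
from pathlib import Path


def build_modelfile(gguf_path: str, temperature: float = 0.7) -> str:
    """Return Modelfile text that points Ollama at a local GGUF file."""
    return f"FROM {gguf_path}\nPARAMETER temperature {temperature}\n"


def write_modelfile(out_dir: str, gguf_name: str = "model.gguf") -> Path:
    """Write a Modelfile next to the exported GGUF and return its path."""
    target = Path(out_dir) / "Modelfile"
    target.write_text(build_modelfile(f"./{gguf_name}"), encoding="utf-8")
    return target
```

From the export directory, `ollama create my-model -f Modelfile` followed by `ollama run my-model` would then register and start the model, where `my-model` is a placeholder name.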
🔧 ML, Tracking & Inference Tooling
TensorBoard 2.12+
Loss, learning-rate, and gradient tracking for deep-learning runs and checkpoint inspection.
Weights & Biases
Experiment tracking and collaboration platform for machine learning projects.
Scikit-learn 1.2+
Phase-2 classical ML stack for Random Forest, SVM, and Logistic Regression training, metrics, and optional skl2onnx export.
🔬 Transformer Interpretability Stack
The LLM Training Dashboard progressively exposes the internal logic of large language models through twelve hierarchical layers, moving from high-level semantic geometry down to neuron-level concept discovery.
- Embedding Galaxy - Constructs the top-level semantic space, mapping tokens into a unified representational geometry.
- Brain Atlas - Defines the macro-architecture of the model, showing how major regions such as attention, MLP, and embedding blocks interconnect.
- Tensor Microarchitecture - Visualizes tensor statistics and heatmaps for fine-grained inspection of weight distributions and activation patterns.
- Head-Aware Q/K/V Decomposition - Separates attention heads to analyze norms, sparsity, and per-head behavior.
- Activation-Path Visualization - Traces query, key, and value activations through attention weights to reveal how information flows within a layer.
- Multi-Head Interaction Map - Examines horizontal structure through head-to-head similarity, clustering, redundancy, and specialization, including syntax, induction, and negation heads.
- Layer-to-Layer Activation Flow - Explores vertical structure by showing how outputs propagate across layers to form emergent circuits and conceptual hierarchies.
- MLP Neuron Concept Discovery - Discovers concept neurons and feature detectors in MLP layers that respond to interpretable patterns such as numbers, names, and emotions.
- Feature-Space Geometry - Visualizes principal components, concept directions, neuron clusters, and subspaces for specific behaviors.
- Time Drift Visualization - Tracks how embeddings, attention patterns, and neuron behaviors shift across checkpoints to reveal when representations diverge or capabilities emerge.
- Gradient Flow & Influence Maps - Reveals why a model chose its output by tracing token-level gradients, attribution signals, and causal influence pathways.
- Mechanistic Circuits & Subgraph Extraction - A multi-level framework that reveals how transformer models represent, transform, and reason through their internal mechanisms.
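As one concrete instance of the layers above, the head-to-head similarity behind the Multi-Head Interaction Map can be approximated with pairwise cosine similarity. This sketch assumes each head is summarized as a flat vector (for example, a flattened attention pattern). The function names and the 0.95 redundancy threshold are illustrative only.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def head_similarity(heads: List[List[float]]) -> List[List[float]]:
    """Pairwise similarity matrix; entry [i][j] compares head i with head j."""
    return [[cosine(a, b) for b in heads] for a in heads]


def redundant_pairs(sim: List[List[float]], threshold: float = 0.95) -> List[Tuple[int, int]]:
    """List head pairs (i < j) whose similarity meets the threshold."""
    n = len(sim)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if sim[i][j] >= threshold]
```

Pairs that clear the threshold are candidates for redundancy, while rows with uniformly low similarity hint at specialized heads.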
🎯 System Architecture
Data Pipeline: CSV / JSONL / image-folder input → preview and validation → dataset builders → training loaders
Training Loop: Phase-1 deep learning loops / Phase-2 classical fit / Phase-3 transformer fine-tuning → telemetry → checkpoints and artifacts
Model Export: SafeTensors / .pkl / ONNX / GGUF → Ollama / ONNX Runtime / local artifact downloads
Dashboard: Flask API ↔ Phase-1/2/3 controls ↔ real-time updates ↔ 3D visualizations ↔ system monitoring
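The preview-and-validation stage of the data pipeline above could be sketched with the standard library alone. The function names, the `required` column list, and the error message format are hypothetical. The sketch also assumes every JSONL line holds a JSON object.

```python
import csv
import io
import json
from typing import Dict, List


def preview_rows(text: str, fmt: str, limit: int = 5) -> List[Dict[str, object]]:
    """Parse up to `limit` rows from CSV or JSONL text for a dataset preview."""
    if fmt == "csv":
        reader = csv.DictReader(io.StringIO(text))
        return [dict(row) for _, row in zip(range(limit), reader)]
    if fmt == "jsonl":
        rows: List[Dict[str, object]] = []
        for line in text.splitlines():
            if not line.strip():
                continue  # tolerate blank lines between records
            rows.append(json.loads(line))
            if len(rows) == limit:
                break
        return rows
    raise ValueError(f"unsupported format: {fmt}")


def validate_rows(rows: List[Dict[str, object]], required: List[str]) -> List[str]:
    """Return human-readable problems; an empty list means the preview passed."""
    errors = []
    if not rows:
        errors.append("dataset is empty")
    for i, row in enumerate(rows):
        missing = [col for col in required if row.get(col) in (None, "")]
        if missing:
            errors.append(f"row {i}: missing {', '.join(missing)}")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a dashboard show every issue in an upload at once.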
🚀 Built for Production-Ready LLM Training
Architecture, Design and Development by Franz Ayestaran / Enhanced Pair Programming with Claude Code & OpenAI GPT