🛠️ Technical Stack

Overview of the Phase-1 ML/DL, Phase-2 classical ML, and Phase-3 transformer workflows that power this dashboard across NVIDIA CUDA and Apple Silicon MLX/MPS environments.

🧠 Core Deep Learning Framework

PyTorch 2.0+

Primary runtime for Phase-1 CNN/MLP/RNN training, custom transformer experiments, Phase-3 Vision Transformer pipelines, and shared optimization and export flows.

Transformers 4.30+

Hugging Face stack for tokenization, pre-trained checkpoints, compact text transformers, and PEFT-based LoRA fine-tuning workflows.
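
As a rough illustration of why LoRA fine-tuning is cheap, the NumPy sketch below applies the general low-rank update W + (alpha / r) * B @ A to a frozen weight matrix. It shows the idea only and is not the PEFT API; the shapes, rank, and alpha value are assumptions chosen for the example.

```python
import numpy as np


def lora_delta(a: np.ndarray, b: np.ndarray, alpha: float) -> np.ndarray:
    """Low-rank weight update (alpha / r) * B @ A, where r is the adapter rank."""
    rank = a.shape[0]
    return (alpha / rank) * (b @ a)


# Illustrative sizes only (assumptions, not values taken from the project).
rng = np.random.default_rng(0)
d_out, d_in, rank, alpha = 64, 64, 4, 8.0

w = rng.normal(size=(d_out, d_in))             # frozen base weight
a = rng.normal(scale=0.01, size=(rank, d_in))  # trainable, small random init
b = np.zeros((d_out, rank))                    # trainable, zero init

# With B at zero the adapted layer starts out identical to the base layer.
w_adapted = w + lora_delta(a, b, alpha)

full_params = w.size           # parameters touched by full fine-tuning
lora_params = a.size + b.size  # parameters trained by the adapter
```

With these sizes the adapter trains 512 values instead of 4096, and the update can be merged back into the base weight for a fused export.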

CUDA / MPS / MLX

CUDA accelerates NVIDIA training and quantized fine-tuning workloads.

MPS and MLX provide native GPU acceleration on Apple Silicon M-series processors: MPS through Metal Performance Shaders, and MLX-LM for Apple-native quantized LoRA and QLoRA training and fused export flows.
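
A minimal sketch of backend selection across these environments, assuming a simple CUDA, then MPS, then CPU preference order. The function names are hypothetical and the dashboard's own logic may differ; MLX is a separate framework and is not covered by this PyTorch check.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name: CUDA first, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe PyTorch for available backends; fall back to CPU without PyTorch."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps is absent in older PyTorch builds, so look it up defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps_backend is not None and mps_backend.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)
```

Calling `detect_device()` once at startup and passing the result to `model.to(...)` keeps the rest of the training code backend-agnostic.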

🏗️ Model Coverage & Training Workflows

Model Coverage (transformer.py + backend/ml_models)

  • Phase 1 ML/DL Models - CNN, MLP, and RNN training flows with CSV, JSONL, and image dataset support
  • Custom Transformer Blocks - Attention, positional encoding, feed-forward layers, and decoder-oriented experiments
  • Phase 2 Classical ML - Random Forest, SVM, and Logistic Regression registries with CSV schema inspection
  • Phase 3 Models - Vision Transformer, MicroLLaMA, and MiniLLaMA workflows with optional ONNX export plus GGUF conversion for the text-model path
  • Shared Dataset Pipeline - Built-in upload, sample-dataset generation, preview, validation, and run history flows
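
The custom transformer blocks listed above can be illustrated with a small NumPy sketch. It assumes the standard sinusoidal positional encoding and single-head scaled dot-product attention with a causal (decoder-style) mask; transformer.py may implement these differently, and the function names here are illustrative.

```python
import numpy as np


def sinusoidal_positions(seq_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal positional encoding of shape (seq_len, d_model); d_model must be even."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    positions = np.arange(seq_len)[:, None]
    dims = np.arange(0, d_model, 2)[None, :]
    angles = positions / np.power(10000.0, dims / d_model)
    encoding = np.zeros((seq_len, d_model))
    encoding[:, 0::2] = np.sin(angles)  # even columns
    encoding[:, 1::2] = np.cos(angles)  # odd columns
    return encoding


def causal_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray):
    """Single-head scaled dot-product attention where each token sees only earlier tokens."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    # Mask out future positions (strict upper triangle) before the softmax.
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

Because the first token can only attend to itself, its output row equals its own value vector, which is a quick sanity check for the mask.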

Training Workflows (finetune.py / finetune_mlx.py / backend/ml_models)

  • Gradient Accumulation & MPS Safety - Automatic batch-size and sequence-length adjustments for constrained Apple Silicon memory
  • LoRA / QLoRA Backends - PEFT plus bitsandbytes on CUDA and MLX-native quantized fine-tuning on Apple Silicon
  • Resume, Validation & Scheduling - Cosine scheduling, gradient checkpointing, resume-from-checkpoint, and epoch validation samples
  • Export Paths - SafeTensors, GGUF, ONNX, and artifact bundles for local deployment and comparison
  • Telemetry & History - Training status, logs, history, and downloadable artifacts surfaced directly in the dashboard
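
Two of the techniques above, cosine scheduling and gradient accumulation, are easy to sketch without a full training loop. The example below uses a toy linear model in NumPy to show that averaging micro-batch gradients reproduces the full-batch gradient, which is what lets a memory-constrained device use a smaller batch size. The function names and the zero default for the minimum learning rate are assumptions, not the settings in finetune.py.

```python
import math

import numpy as np


def cosine_lr(step: int, total_steps: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Cosine decay from base_lr at step 0 down to min_lr at total_steps."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def mse_grad(w: np.ndarray, x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Gradient of the mean squared error for the linear model x @ w."""
    return 2.0 * x.T @ (x @ w - y) / len(y)


def accumulated_grad(w: np.ndarray, x: np.ndarray, y: np.ndarray, micro_batch: int) -> np.ndarray:
    """Average gradients over equally sized micro-batches."""
    if len(y) % micro_batch != 0:
        raise ValueError("batch size must be a multiple of micro_batch")
    steps = len(y) // micro_batch
    grad = np.zeros_like(w)
    for i in range(steps):
        rows = slice(i * micro_batch, (i + 1) * micro_batch)
        # Dividing by the number of steps keeps the scale equal to one large batch.
        grad += mse_grad(w, x[rows], y[rows]) / steps
    return grad
```

In a PyTorch loop the same effect usually comes from scaling each micro-batch loss by the number of accumulation steps and calling the optimizer step only once per group.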

🌐 Web Dashboard & Visualization

Flask 3.0+

Flask serves the unified Phase-1, Phase-2, and Phase-3 dashboard, training APIs, dataset upload routes, telemetry, and artifact export endpoints.

Plotly 5.20+

Interactive 3D visualizations for checkpoints, embeddings, layer structures, neural atlases, and interpretability scenes.

Marked.js

Markdown parsing for rendering documentation and workflow guides directly in the dashboard.

📊 Data Processing & System Monitoring

NumPy 1.24+

Array-centric processing for tabular, text, image, and geometry workflows across ML and deep-learning training.

psutil 5.9+

Real-time system and process monitoring for CPU, memory, and disk usage, alongside GPU usage tracking (psutil itself does not expose GPU metrics, so those come from a separate source).

tqdm 4.65+

Progress reporting for fine-tuning, classical ML jobs, dataset conversion, and export pipelines with ETA estimates.

🦙 Model Export & Deployment

SafeTensors 0.4+

Checkpoint and adapter serialization for transformer fine-tuning, resume bundles, and model conversion pipelines.
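
For orientation, the SafeTensors container is simple enough to inspect with the standard library: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes. The sketch below writes and reads back that layout for 1-D float32 data. It is an illustration of the file format as commonly documented, not the safetensors library API, and the tensor name used in the usage note is made up.

```python
import json
import struct


def write_minimal_safetensors(tensors: dict) -> bytes:
    """Serialize {name: list of floats} as 1-D F32 tensors in the SafeTensors layout."""
    header = {}
    payload = b""
    for name, values in tensors.items():
        raw = struct.pack(f"<{len(values)}f", *values)
        header[name] = {
            "dtype": "F32",
            "shape": [len(values)],
            # Offsets are relative to the start of the data section.
            "data_offsets": [len(payload), len(payload) + len(raw)],
        }
        payload += raw
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


def read_header(blob: bytes) -> dict:
    """Return the JSON header without touching the tensor data."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8:8 + header_len])


def read_tensor(blob: bytes, name: str) -> list:
    """Decode one F32 tensor by slicing the data section at its recorded offsets."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    begin, end = read_header(blob)[name]["data_offsets"]
    data_start = 8 + header_len
    raw = blob[data_start + begin:data_start + end]
    return list(struct.unpack(f"<{(end - begin) // 4}f", raw))
```

For example, `read_header(write_minimal_safetensors({"adapter.weight": [1.0, 2.0, 0.5]}))` lists the tensor's dtype, shape, and byte offsets without loading any weights, which is why header inspection is cheap even for large checkpoints. Real pipelines should use the safetensors library itself.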

Ollama Integration

GGUF export path for PyTorch and Apple-native MLX outputs, with direct Ollama import for local inference.

GGUF / ONNX Runtime

llama.cpp quantization supports GGUF deployment. ONNX and ONNX Runtime cover Phase-2 exports, and the Phase-3 text models have optional ONNX and GGUF export paths.

🔧 ML, Tracking & Inference Tooling

TensorBoard 2.12+

Loss, learning-rate, and gradient tracking for deep-learning runs and checkpoint inspection.

Weights & Biases

Experiment tracking and collaboration platform for machine learning projects.

Scikit-learn 1.2+

Phase-2 classical ML stack for Random Forest, SVM, and Logistic Regression training, metrics, and optional skl2onnx export.

🔬 Transformer Interpretability Stack

The LLM Training Dashboard exposes the internal logic of large language models through twelve hierarchical layers, moving from high-level semantic geometry down to neuron-level concept discovery and circuit extraction.

  1. Embedding Galaxy - Constructs the top-level semantic space, mapping tokens into a unified representational geometry.
  2. Brain Atlas - Defines the macro-architecture of the model, showing how major regions such as attention, MLP, and embedding blocks interconnect.
  3. Tensor Microarchitecture - Visualizes tensor statistics and heatmaps for fine-grained inspection of weight distributions and activation patterns.
  4. Head-Aware Q/K/V Decomposition - Separates attention heads to analyze norms, sparsity, and per-head behavior.
  5. Activation-Path Visualization - Traces query, key, and value activations through attention weights to reveal how information flows within a layer.
  6. Multi-Head Interaction Map - Examines horizontal structure through head-to-head similarity, clustering, redundancy, and specialization, including syntax, induction, and negation heads.
  7. Layer-to-Layer Activation Flow - Explores vertical structure by showing how outputs propagate across layers to form emergent circuits and conceptual hierarchies.
  8. MLP Neuron Concept Discovery - Discovers concept neurons and feature detectors in MLP layers that respond to interpretable patterns such as numbers, names, and emotions.
  9. Feature-Space Geometry - Visualizes principal components, concept directions, neuron clusters, and subspaces for specific behaviors.
  10. Time Drift Visualization - Tracks how embeddings, attention patterns, and neuron behaviors shift across checkpoints to reveal when representations diverge or capabilities emerge.
  11. Gradient Flow & Influence Maps - Reveals why a model chose its output by tracing token-level gradients, attribution signals, and causal influence pathways.
  12. Mechanistic Circuits & Subgraph Extraction - Extracts circuits and subgraphs at multiple levels to show how transformer models represent, transform, and reason over information through their internal mechanisms.
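
Layers 4 and 6 above (per-head decomposition and head-to-head similarity) can be approximated with a few lines of NumPy. The sketch assumes a projection weight of shape (d_model, n_heads * head_dim) whose columns are grouped by head; real checkpoints may store weights transposed or fused, and the sparsity threshold is an arbitrary illustrative choice.

```python
import numpy as np


def split_heads(weight: np.ndarray, n_heads: int) -> np.ndarray:
    """Reshape (d_model, n_heads * head_dim) into (n_heads, d_model, head_dim)."""
    d_model, d_out = weight.shape
    if d_out % n_heads != 0:
        raise ValueError("output dimension must be divisible by n_heads")
    head_dim = d_out // n_heads
    return weight.reshape(d_model, n_heads, head_dim).transpose(1, 0, 2)


def head_stats(heads: np.ndarray, eps: float = 1e-3):
    """Per-head Frobenius norm and fraction of near-zero weights."""
    flat = heads.reshape(heads.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1)
    sparsity = (np.abs(flat) < eps).mean(axis=1)
    return norms, sparsity


def head_similarity(heads: np.ndarray) -> np.ndarray:
    """Cosine similarity between flattened head weights; values near 1 suggest redundancy."""
    flat = heads.reshape(heads.shape[0], -1)
    unit = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    return unit @ unit.T
```

The similarity matrix can be passed to a heatmap or clustering step to surface groups of redundant or specialized heads. Weight similarity is only a proxy; comparing attention patterns on real inputs gives a more behavioral view.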

🎯 System Architecture

Data Pipeline: CSV / JSONL / image-folder input → preview and validation → dataset builders → training loaders
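
The preview-and-validation stage for JSONL input could look roughly like the sketch below, which uses only the standard library. The required field names ("prompt" and "completion") and the preview limit are hypothetical; the dashboard's actual schema checks are not described here.

```python
import json

# Hypothetical schema: the real pipeline may require different fields.
REQUIRED_FIELDS = ("prompt", "completion")


def preview_jsonl(stream, limit: int = 5):
    """Return (preview_rows, errors) for an iterable of JSONL lines."""
    rows, errors = [], []
    for line_no, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append(f"line {line_no}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            errors.append(f"line {line_no}: expected a JSON object")
            continue
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            errors.append(f"line {line_no}: missing {', '.join(missing)}")
            continue
        if len(rows) < limit:
            rows.append(record)  # keep only the first few valid rows for display
    return rows, errors
```

The function accepts any iterable of lines, so it works equally with an open file handle or an uploaded file wrapped in `io.StringIO`. Collecting errors instead of raising lets the dashboard show every problem in one pass.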

Training Loop: Phase-1 deep learning loops / Phase-2 classical fit / Phase-3 transformer fine-tuning → telemetry → checkpoints and artifacts

Model Export: SafeTensors / .pkl / ONNX / GGUF → Ollama / ONNX Runtime / local artifact downloads

Dashboard: Flask API ↔ Phase-1/2/3 controls ↔ real-time updates ↔ 3D visualizations ↔ system monitoring

🚀 Built for Production-Ready LLM Training

Architecture, Design and Development by Franz Ayestaran / Enhanced Pair Programming with Claude Code & OpenAI GPT