🤖 LLM Training Dashboard - (AI-Augmented Software Engineering)
Manage the training workflow, 3D visualizations, Ollama integration, and model artifact cleanup. Architecture, design, and development by Franz Ayestaran / enhanced pair programming with Claude Code
Loading workflow documentation...
📊 LLM Training Pipeline Presentation
Download the complete PowerPoint presentation covering the LLM creation and Ollama deployment pipeline
Upload a .txt, .json, or .jsonl training file. JSON datasets can use records like {"instruction": "...", "output": "..."}.
File:
Size: - | - lines
Format:
📝 Edit Training Data
Tip: Each line is a training example. Make sure your data is clean and relevant to what you want the model to learn.
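A minimal sketch of producing and validating a training file in the JSONL shape described above (the example prompts are illustrative, not part of any shipped dataset):

```python
import json

# Hypothetical records in the shape the upload hint describes:
# one JSON object per line with "instruction" and "output" fields.
records = [
    {"instruction": "What is the capital of France?", "output": "Paris."},
    {"instruction": "Translate 'hello' to Spanish.", "output": "Hola."},
]

# Serialize to JSONL: one compact JSON document per line.
jsonl = "\n".join(json.dumps(rec, ensure_ascii=False) for rec in records) + "\n"

# Validate the way a loader might: every line must parse
# and carry both expected keys.
for line in jsonl.splitlines():
    parsed = json.loads(line)
    assert {"instruction", "output"} <= parsed.keys()
```

Plain `.txt` uploads skip the JSON parsing step; each raw line is treated as one training example.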
📊 Training Data Analysis
📄 File Statistics
Size:
Lines:
Words:
🔤 Tokenization
Total tokens:
Unique tokens:
Vocab coverage:
🧠 Model Utilization
Model params:
Embedding use:
Training seqs:
⏱️ Estimated Training Time
1 epoch
3 epochs
5 epochs
10 epochs
💡 Recommendations
✓ Ready to Train
Training will use:
⚙️ Training Configuration
e.g. 0.00001 for 1e-5
LoRA Configuration
QLoRA uses bitsandbytes on NVIDIA CUDA and switches to an Apple-native MLX-LM path on Apple Silicon for real quantized adapter training.
Recommended defaults: 5 epochs, batch size 4, QLoRA, native profile.
Native keeps each backend's preferred defaults. Comparable aligns sequence length, dataset packing, and adapter scope more closely so Apple and CUDA runs are easier to compare.
Keep this enabled for easiest inference and export. Disable it to save only the LoRA adapter and reduce disk usage.
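The backend selection described for QLoRA above can be sketched as a simple capability check (function and return names here are illustrative, not the dashboard's actual API):

```python
import platform

def pick_qlora_backend() -> str:
    """Sketch of the QLoRA backend choice: bitsandbytes on NVIDIA CUDA,
    the Apple-native MLX-LM path on Apple Silicon, plain LoRA otherwise."""
    try:
        import torch
        if torch.cuda.is_available():
            return "bitsandbytes"  # quantized base weights on CUDA
    except ImportError:
        pass  # PyTorch not installed; fall through to platform checks
    if platform.system() == "Darwin" and platform.machine() == "arm64":
        return "mlx-lm"  # Apple-native quantized adapter training
    return "lora-fp16"  # fall back to un-quantized LoRA
```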
🔄 Resume from Checkpoint (Optional)
Continue training from a saved checkpoint instead of starting from scratch.
Epoch: -
Step: -
Loss: -
Size: -
Created: -
ℹ️ No checkpoints available yet. Checkpoints will appear here during training.
Recommended resets the form to the dashboard baseline. Export saves the current dashboard settings as a versioned JSON config. Load applies a previously saved config back into the form and restores the referenced training file when available.
⚠️ Please upload a training data file first
📊 Training Progress
Progress 0%
Initializing...
⏱️ 0s
⏳ ETA: --
📉 Current Loss
--
📊 Epoch Avg Loss
--
🧪 Epoch Quality Check
Epoch -
Prompt
--
Model Output
--
Reference
--
Starting training...
💻 System Status
⚙️
CPU
0 cores
Usage 0%
0 MHz
🧠
Memory (RAM)
0 GB total
Used 0%
0 GB / 0 GB available
🎮
GPU
Checking...
GPU Utilization 0%
VRAM: 0 GB / 0 GB
No GPU detected or PyTorch not installed
🖥️
Device Information
Platform: -
Architecture: -
Processor: -
Python: -
📊
Utilization Details
CPU Cores
Disk I/O
📖 Read
0 MB
✏️ Write
0 MB
Network I/O
📤 Sent
0 MB
📥 Received
0 MB
Last updated: Never • Updates every 2 seconds
🤖 Chat with Your Model
💬 Generate Text
Use a model loaded from this workspace or switch to the remote Ollama server.
Select the exact local model directory or remote Ollama model tag the chat tab should use.
Exact mode returns the saved training answer verbatim for exact prompt matches. Context mode still queries the model, but includes the matched dataset entry as supporting context.
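The difference between the two modes can be sketched as a dataset lookup (names and signatures here are illustrative, not the dashboard's actual API):

```python
def chat_answer(prompt, dataset, model_generate, mode="exact"):
    """Sketch of the two response modes: `dataset` maps training prompts
    to their saved answers; `model_generate` queries the model."""
    match = dataset.get(prompt)
    if mode == "exact" and match is not None:
        return match  # verbatim saved training answer
    # Context mode: still query the model, but prepend the matched entry.
    context = f"Context: {prompt} -> {match}\n" if match else ""
    return model_generate(context + prompt)

dataset = {"What is QLoRA?": "Quantized LoRA fine-tuning."}
echo = lambda text: f"[model] {text}"
assert chat_answer("What is QLoRA?", dataset, echo) == "Quantized LoRA fine-tuning."
```

Prompts with no dataset match fall through to a plain model query in either mode.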
Higher values = longer responses (50-2000 tokens)
Fact | 0.20 • mostly factual | Fiction
Lower values reduce creative drift. Set this near 0 for deterministic factual answers, especially on NVIDIA.
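The effect of this slider can be sketched as standard temperature sampling (a generic illustration, not the dashboard's actual decoding code):

```python
import numpy as np

def sample_with_temperature(logits, temperature, rng=None):
    """Temperature scaling: near 0 the distribution collapses onto the
    highest-scoring token (deterministic), higher values flatten it and
    allow more creative drift."""
    if temperature <= 0.01:
        return int(np.argmax(logits))  # effectively greedy decoding
    rng = rng or np.random.default_rng(0)
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled -= scaled.max()  # subtract max for a numerically stable softmax
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return int(rng.choice(len(probs), p=probs))

logits = [0.1, 3.0, 0.2]
assert sample_with_temperature(logits, 0.001) == 1  # "Fact" end: argmax
```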
📝 Generated Text
Response Source: --
Model Target: --
Dataset Context: --
Dataset Path: --
Supporting Dataset Prompt: --
🦙 Import Model to Ollama
📦 Import GGUF Model
ℹ️ About: This will import your trained model into Ollama, allowing you to run it locally using the ollama run command.
⚠️ No GGUF file? If you don't see any GGUF files below, you need to convert your trained model first.
This will be the name you use with ollama run <name>
☁️ Cloud deploy: Use the second button to upload the selected GGUF and create the model on ollama.ayestaran.dev. This requires SSH access from the dashboard host to the server.
Cloud Deployment Progress 0%
Waiting to start deployment...
📝 Import Output
☁️ Ollama Cloud Server Admin
ℹ️ About: Manage models on ollama.ayestaran.dev. Listing uses the remote tags endpoint, and deletion uses SSH access from the dashboard host.
Loading cloud models...
☁️
No Cloud Models Found
The Ollama cloud server is reachable, but no models are currently installed.
Model
Size
Family
Quantization
Updated
📝 Admin Output
📊 Training Artifacts
Loading artifacts...
Total Items
0
Total Size
0 B
Categories
0
⚠️ Warning
Resetting will permanently delete all training artifacts including models, visualizations, logs, and GGUF files. This action cannot be undone unless you create a backup.
✨
No Training Artifacts Found
Your workspace is clean! Start training to see artifacts here.
📊 3D Model Visualizations
Interactive 3D visualizations of your model's training process, checkpoints, embeddings, and layer structures.
Checking for visualizations...
🎯 Checkpoints 3D 🔗
🏗️ Layers 3D 🔗
🌐 Embedding 3D 🔗
🎪 Checkpoints Centroids 3D 🔗
💡 Tip: These visualizations are interactive! You can rotate, zoom, and explore the 3D spaces by clicking and dragging. Click any title (🔗) to open in a new tab for full-screen interaction.
Regenerating Visualizations 0%
Starting...
📊
No Visualizations Available
Train your model first to generate 3D visualizations. The visualizations will appear here after training completes.
🛠️ Technical Stack
Comprehensive overview of the technologies, frameworks, and tools powering this LLM training dashboard.
🧠
Core Deep Learning Framework
PyTorch 2.0+
Primary deep learning framework powering the transformer architecture, training loops, and model optimization.
Transformers 4.30+
Hugging Face Transformers library for model architecture, tokenization, and pre-trained model integration.
CUDA Toolkit
GPU acceleration support for high-performance model training with NVIDIA graphics cards.
Apple Silicon (MPS): Native GPU acceleration for M-series ARM processors via Metal Performance Shaders, the backend PyTorch uses for GPU support on Apple Silicon.
🏗️
Custom Transformer Implementation
Core Components (transformer.py)
Multi-Head Attention - Scaled dot-product attention with multiple heads for parallel processing
Positional Encoding - Sinusoidal position embeddings for sequence ordering
Feed-Forward Networks - Position-wise fully connected layers with GELU activation
Transformer Layers - Encoder and decoder layers with residual connections and layer normalization
Model Variants - Support for both encoder-decoder and decoder-only architectures
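The attention primitive the multi-head layer runs in parallel can be sketched as follows (a NumPy illustration, not the project's actual transformer.py code):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Single-head scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V.
    The multi-head layer splits d_model across heads and runs this per head."""
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)  # stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Toy check: with seq_len=3 and d_k=4, output shape matches the value matrix.
q = k = v = np.ones((3, 4))
out = scaled_dot_product_attention(q, k, v)
assert out.shape == (3, 4)
```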
Training Utilities (train_utils.py)
Gradient Accumulation - Memory-efficient training with large batch sizes
Mixed Precision Training - FP16 automatic mixed precision for faster training
Learning Rate Scheduling - Cosine annealing and warmup strategies
Checkpoint Management - Save/load model states with SafeTensors format
Text Generation - Temperature, top-k, and top-p sampling strategies
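Of the sampling strategies listed above, top-p (nucleus) filtering can be sketched as follows (a generic illustration, not the project's actual code):

```python
import numpy as np

def top_p_filter(probs, p=0.9):
    """Nucleus (top-p) filtering: keep the smallest set of highest-probability
    tokens whose cumulative mass reaches p, zero out the rest, renormalize."""
    probs = np.asarray(probs, dtype=np.float64)
    order = np.argsort(probs)[::-1]  # tokens by descending probability
    cutoff = np.searchsorted(np.cumsum(probs[order]), p) + 1
    filtered = np.zeros_like(probs)
    filtered[order[:cutoff]] = probs[order[:cutoff]]
    return filtered / filtered.sum()
```

Temperature is applied before filtering; top-k works the same way but keeps a fixed number of tokens instead of a probability mass.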
🌐
Web Dashboard & Visualization
Flask 3.0+
Lightweight web framework serving the training dashboard API and real-time status updates.
Plotly 5.20+
Interactive 3D visualizations for model checkpoints, embeddings, and layer structures.
Marked.js
Markdown parsing for rendering documentation and workflow guides directly in the dashboard.
📊
Data Processing & System Monitoring
NumPy 1.24+
Numerical computing library for efficient array operations and data manipulation.
PSUtil 5.9+
Real-time system and process monitoring for CPU, memory, GPU, and disk usage tracking.
TQDM 4.65+
Progress bars for training loops and batch processing with ETA estimates.
🦙
Model Export & Deployment
SafeTensors 0.4+
Fast and secure model serialization format for checkpoint saving and model conversion.
Ollama Integration
Export trained models to GGUF format and import directly into Ollama for local inference.
llama.cpp
Convert PyTorch models to GGUF format for efficient CPU/GPU inference with quantization.
🔧
Optional Tools & Extensions
TensorBoard 2.12+
Training visualization and metrics tracking for loss curves, learning rates, and gradients.
Weights & Biases
Experiment tracking and collaboration platform for machine learning projects.
Scikit-learn 1.2+
Data preprocessing utilities and evaluation metrics for model performance analysis.