Manage Training Workflow, 3D Visualizations, Ollama Integration, and Model Artifact Cleanup
Architecture, Design and Development by Franz Ayestaran / Enhanced Pair Programming with Claude Code & OpenAI GPT
Download the complete PowerPoint presentation covering the LLM creation and Ollama deployment pipeline
Upload a .txt, .json, or .jsonl training file. JSON datasets can use records like {"instruction": "...", "output": "..."}.
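As a sketch, a .jsonl training file holds one JSON record per line; a minimal loader for the instruction/output record shape shown above (the function name is illustrative, not the dashboard's actual code) might look like:

```python
import json

def load_jsonl_records(path):
    """Read one JSON record per line, keeping only instruction/output pairs."""
    records = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            obj = json.loads(line)
            if "instruction" in obj and "output" in obj:
                records.append((obj["instruction"], obj["output"]))
    return records
```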
Continue training from a saved checkpoint instead of starting from scratch.
Recommended resets the form to the dashboard baseline. Export saves the current dashboard settings as a versioned JSON config. Load applies a previously saved config back into the form and restores the referenced training file when available.
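The export/load round-trip described above can be sketched as plain JSON serialization with a version stamp. Field names and the version scheme here are illustrative, not the dashboard's actual schema:

```python
import json

CONFIG_VERSION = 1  # bump when the saved field set changes (illustrative)

def export_config(settings, path):
    """Save the current dashboard settings as a versioned JSON config."""
    payload = {"version": CONFIG_VERSION, "settings": settings}
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=2)

def load_config(path):
    """Apply a previously saved config; reject unknown versions."""
    with open(path, encoding="utf-8") as fh:
        payload = json.load(fh)
    if payload.get("version") != CONFIG_VERSION:
        raise ValueError("unsupported config version")
    return payload["settings"]
```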
ℹ️ About: This will import your trained model into Ollama, allowing you to run it locally using the ollama run command.
⚠️ No GGUF file? If you don't see any GGUF files below, you need to convert your trained model first.
ollama run <name>
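As a sketch, the usual import path is a Modelfile that points at the converted GGUF; the file path, model name, and parameter value below are placeholders, not values from this dashboard:

```
# Modelfile — FROM points at your converted GGUF (placeholder path)
FROM ./model.gguf
PARAMETER temperature 0.7
```

With that file in place, `ollama create my-model -f Modelfile` registers the model locally, and `ollama run my-model` starts a session with it.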
☁️ Cloud deploy: Use the second button to upload the selected GGUF and create the model on ollama.ayestaran.dev. This requires SSH access from the dashboard host to the server.
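A minimal sketch of that two-step deploy, assuming key-based SSH is already configured; the remote user, paths, and Modelfile contents are placeholders, not the dashboard's real values:

```python
import shlex

def build_deploy_commands(gguf_path, model_name, host="ollama.ayestaran.dev"):
    """Build (but do not run) the upload and remote-create command lines."""
    remote_gguf = f"/tmp/{model_name}.gguf"
    # Step 1: copy the selected GGUF to the server.
    upload = ["scp", gguf_path, f"deploy@{host}:{remote_gguf}"]
    # Step 2: write a one-line Modelfile remotely, then register the model.
    remote_script = (
        f"printf 'FROM {remote_gguf}\\n' > /tmp/{model_name}.Modelfile && "
        f"ollama create {shlex.quote(model_name)} -f /tmp/{model_name}.Modelfile"
    )
    create = ["ssh", f"deploy@{host}", remote_script]
    return upload, create
```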
ℹ️ About: Manage models on ollama.ayestaran.dev. Listing uses the remote tags endpoint, and deletion uses SSH access from the dashboard host.
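The listing step can be sketched against Ollama's tags endpoint, which returns a JSON body with a `models` array; the parsing is split into its own function here so it can be exercised without a network, and the base URL is taken from the text above:

```python
import json
from urllib.request import urlopen

def parse_tags_payload(payload):
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in payload.get("models", [])]

def list_remote_models(base_url="https://ollama.ayestaran.dev"):
    """Fetch and parse the remote tags endpoint."""
    with urlopen(f"{base_url}/api/tags") as resp:
        return parse_tags_payload(json.load(resp))
```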
Interactive 3D visualizations of your model's training process, checkpoints, embeddings, layer structures, and internal LLM topology.
A dedicated scene for inspecting how query, key, and value paths move through a transformer block. The layout is intentionally closer to a cinematic architecture diagram than the existing topology plot, with Q/K/V weight slabs, vector stages, layer norm, attention scoring, and animated token flow.
Separate Q, K, and V stages, token score routing, residual handoff, and downstream attention aggregation in one navigable 3D scene.
Orbit, pan, and zoom the scene. Click a subsystem to pin metadata and use the scene controls to jump straight to Q, K, V, or the attention matrix.
The page uses a hand-built scene rather than Plotly so the composition can read more like a technical explainer, matching the reference style more closely.
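The stages the scene animates map onto ordinary scaled dot-product attention; a minimal single-head NumPy version of the Q/K/V path (random weights, purely illustrative) is:

```python
import numpy as np

def attention(x, wq, wk, wv):
    """Single-head scaled dot-product attention over a token sequence x."""
    q, k, v = x @ wq, x @ wk, x @ wv          # the Q/K/V weight slabs
    scores = q @ k.T / np.sqrt(k.shape[-1])   # attention scoring stage
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over tokens
    return weights @ v                        # aggregated token flow

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                   # 4 tokens, model dim 8
wq, wk, wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = attention(x, wq, wk, wv)                # one (4, 8) output per token
```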
A dedicated brain-shaped scene for travelling through your trained model. The page samples real learned tensor rows from each transformer layer, places them in an outlined brain shell, and lets you orbit between embeddings, attention projections, feed-forward blocks, and the output head.
The center spine is the residual stream. Branches represent Q, K, V, and feed-forward structures. Smaller glowing nodes are strong sampled rows taken from the real checkpoint weights.
Use the built-in waypoints to jump from input cortex to mid-layer reasoning and then into the output crown. Click any node to pin its layer, tensor path, and sampled row metrics.
This atlas is sampled rather than exhaustive. Rendering every trained connection in a browser is not feasible at LLM scale, so the scene keeps only the strongest learned pathways to stay explorable.
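The "strongest sampled rows" selection can be sketched as a top-k by row norm over each weight matrix; NumPy stands in here for the real checkpoint tensors, and the function name is illustrative:

```python
import numpy as np

def sample_strong_rows(weight, k=3):
    """Return indices and norms of the k largest-norm rows of a weight matrix."""
    norms = np.linalg.norm(weight, axis=1)  # one magnitude per learned row
    top = np.argsort(norms)[::-1][:k]       # strongest rows first
    return top, norms[top]
```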
Comprehensive overview of the technologies, frameworks, and tools powering this LLM training dashboard.
Primary deep learning framework powering the transformer architecture, training loops, and model optimization.
Hugging Face Transformers library for model architecture, tokenization, and pre-trained model integration.
GPU acceleration support for high-performance model training with NVIDIA graphics cards.
Apple Silicon (MPS): Native GPU acceleration for M-series ARM processors via Metal Performance Shaders, which PyTorch exposes as the mps device backend.
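Device selection across these backends usually follows a simple preference order. A dependency-free sketch of that logic (in practice the flags would come from torch.cuda.is_available() and torch.backends.mps.is_available()):

```python
def pick_device(cuda_available, mps_available):
    """Prefer NVIDIA CUDA, then Apple MPS, falling back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```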
Lightweight web framework serving the training dashboard API and real-time status updates.
Interactive 3D visualizations for model checkpoints, embeddings, and layer structures.
Markdown parsing for rendering documentation and workflow guides directly in the dashboard.
Numerical computing library for efficient array operations and data manipulation.
Real-time system and process monitoring for CPU, memory, GPU, and disk usage tracking.
Progress bars for training loops and batch processing with ETA estimates.
Fast and secure model serialization format for checkpoint saving and model conversion.
Export trained models to GGUF format and import directly into Ollama for local inference.
Convert PyTorch models to GGUF format for efficient CPU/GPU inference with quantization.
Training visualization and metrics tracking for loss curves, learning rates, and gradients.
Experiment tracking and collaboration platform for machine learning projects.
Data preprocessing utilities and evaluation metrics for model performance analysis.
Data Pipeline: Text/JSONL Input → Tokenization → Dataset Creation → DataLoader
Training Loop: Forward Pass → Loss Calculation → Backpropagation → Optimizer Step → Checkpoint Save
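The training loop above, reduced to its smallest runnable form — a NumPy linear model stands in for the transformer, purely to show the forward → loss → backpropagation → optimizer step → checkpoint ordering:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 3))
y = x @ np.array([1.0, -2.0, 0.5])        # target weights to recover
w = np.zeros(3)
lr, checkpoints = 0.1, []

for step in range(200):
    pred = x @ w                          # forward pass
    loss = ((pred - y) ** 2).mean()       # loss calculation
    grad = 2 * x.T @ (pred - y) / len(y)  # backpropagation (by hand)
    w -= lr * grad                        # optimizer step (plain SGD)
    if step % 50 == 0:
        checkpoints.append((step, w.copy()))  # checkpoint save
```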
Model Export: PyTorch Model → SafeTensors → GGUF Conversion → Ollama Import
Dashboard: Flask API ↔ Real-time Updates ↔ 3D Visualizations ↔ System Monitoring
🚀 Built for Production-Ready LLM Training