Feat(doc): Reorganize documentation, fix broken syntax, update notes #2348

Merged · 23 commits · Feb 25, 2025
53 changes: 39 additions & 14 deletions _quarto.yml
@@ -25,30 +25,55 @@ website:
contents:
- text: Home
href: index.qmd
- section: "How-To Guides"

- section: "Getting Started"
contents:
# TODO Edit folder structure after we have more docs.
- docs/getting-started.qmd
- docs/installation.qmd
- docs/debugging.qmd
- docs/cli.qmd
- docs/inference.qmd
- docs/multipack.qmd
- docs/fsdp_qlora.qmd
- docs/input_output.qmd
- docs/rlhf.qmd
- docs/nccl.qmd
- docs/mac.qmd

- section: "Dataset Formats"
contents: docs/dataset-formats/*

- section: "Deployments"
contents:
- docs/multi-gpu.qmd
- docs/multi-node.qmd
- docs/unsloth.qmd
- docs/amd_hpc.qmd
- docs/ray-integration.qmd
- section: "Dataset Formats"
contents: docs/dataset-formats/*
- docs/amd_hpc.qmd
- docs/mac.qmd

  - section: "How-To Guides"
contents:
- docs/multimodal.qmd
- docs/rlhf.qmd
- docs/reward_modelling.qmd
- docs/lr_groups.qmd
- docs/lora_optims.qmd

- section: "Core Concepts"
contents:
- docs/batch_vs_grad.qmd
- docs/dataset_preprocessing.qmd
- docs/multipack.qmd

- section: "Advanced Features"
contents:
- docs/fsdp_qlora.qmd
- docs/unsloth.qmd
- docs/torchao.qmd
- docs/custom_integrations.qmd

- section: "Troubleshooting"
contents:
- docs/faq.qmd
- docs/debugging.qmd
- docs/nccl.qmd

- section: "Reference"
contents:
- docs/config.qmd
- docs/faq.qmd

format:
html:
2 changes: 1 addition & 1 deletion docs/amd_hpc.qmd
@@ -1,5 +1,5 @@
---
title: Training with AMD GPUs on HPC Systems
title: AMD GPUs on HPC Systems
description: A comprehensive guide for using Axolotl on distributed systems with AMD GPUs
---

97 changes: 49 additions & 48 deletions docs/cli.qmd
@@ -1,28 +1,19 @@
# Axolotl CLI Documentation
---
title: "CLI Reference"
format:
html:
toc: true
toc-expand: 1
toc-depth: 2
execute:
enabled: false
---

The Axolotl CLI provides a streamlined interface for training and fine-tuning large language models. This guide covers
the CLI commands, their usage, and common examples.

### Table of Contents

- Basic Commands
- Command Reference
- fetch
- preprocess
- train
- inference
- merge-lora
- merge-sharded-fsdp-weights
- evaluate
- lm-eval
- Legacy CLI Usage
- Remote Compute with Modal Cloud
- Cloud Configuration
- Running on Modal Cloud
- Cloud Configuration Options


### Basic Commands
# Basic Commands

All Axolotl commands follow this general structure:

@@ -32,9 +23,9 @@ axolotl <command> [config.yml] [options]

The config file can be local or a URL to a raw YAML file.

### Command Reference
# Command Reference

#### fetch
## fetch

Downloads example configurations and deepspeed configs to your local machine.

@@ -49,7 +40,7 @@ axolotl fetch deepspeed_configs
axolotl fetch examples --dest path/to/folder
```

#### preprocess
## preprocess

Preprocesses and tokenizes your dataset before training. This is recommended for large datasets.

@@ -74,7 +65,7 @@ dataset_prepared_path: Local folder for saving preprocessed data
push_dataset_to_hub: HuggingFace repo to push preprocessed data (optional)
```
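
As a concrete sketch, these options filled in might look like the following (the folder and repo names are placeholders, not taken from this PR):

```yaml
# Illustrative preprocessing settings; paths and repo name are placeholders
dataset_prepared_path: ./prepared_data       # local folder for the tokenized dataset
push_dataset_to_hub: your-org/your-dataset   # optional; omit to keep data local
```

With these keys set, `axolotl preprocess config.yml` writes the tokenized dataset to `dataset_prepared_path`, and later `train` runs can reuse it instead of re-tokenizing.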

#### train
## train

Trains or fine-tunes a model using the configuration specified in your YAML file.

@@ -95,7 +86,7 @@ axolotl train config.yml --no-accelerate
axolotl train config.yml --resume-from-checkpoint path/to/checkpoint
```

#### inference
## inference

Runs inference using your trained model in either CLI or Gradio interface mode.

@@ -115,7 +106,7 @@ cat prompt.txt | axolotl inference config.yml \
--base-model="./completed-model"
```

#### merge-lora
## merge-lora

Merges trained LoRA adapters into the base model.

@@ -137,7 +128,7 @@ gpu_memory_limit: Limit GPU memory usage
lora_on_cpu: Load LoRA weights on CPU
```
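
A filled-in sketch of these options (the values are illustrative, not prescriptive):

```yaml
# Illustrative merge-lora settings; adjust paths and limits to your setup
lora_model_dir: ./outputs/lora-out   # adapter weights to merge
gpu_memory_limit: 20GiB              # cap GPU memory used during the merge
lora_on_cpu: true                    # load adapter weights on CPU if GPU memory is tight
```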

#### merge-sharded-fsdp-weights
## merge-sharded-fsdp-weights

Merges sharded FSDP model checkpoints into a single combined checkpoint.

@@ -146,7 +137,7 @@ Merges sharded FSDP model checkpoints into a single combined checkpoint.
axolotl merge-sharded-fsdp-weights config.yml
```

#### evaluate
## evaluate

Evaluates a model's performance using metrics specified in the config.

@@ -155,7 +146,7 @@ Evaluates a model's performance using metrics specified in the config.
axolotl evaluate config.yml
```

#### lm-eval
## lm-eval

Runs LM Evaluation Harness on your model.

@@ -170,12 +161,12 @@ axolotl lm-eval config.yml --tasks arc_challenge,hellaswag
Configuration options:

```yaml
lm_eval_tasks: List of tasks to evaluate
lm_eval_batch_size: Batch size for evaluation
output_dir: Directory to save evaluation results
lm_eval_tasks: # List of tasks to evaluate
lm_eval_batch_size: # Batch size for evaluation
output_dir: # Directory to save evaluation results
```
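
For example, filled in with the tasks shown earlier (a sketch; batch size and output directory are illustrative):

```yaml
# Illustrative lm-eval settings; task names follow lm-evaluation-harness conventions
lm_eval_tasks:
  - arc_challenge
  - hellaswag
lm_eval_batch_size: 8
output_dir: ./eval_results
```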

### Legacy CLI Usage
# Legacy CLI Usage

While the new Click-based CLI is preferred, Axolotl still supports the legacy module-based CLI:

@@ -195,12 +186,18 @@ accelerate launch -m axolotl.cli.inference config.yml \
--lora_model_dir="./outputs/lora-out" --gradio
```

### Remote Compute with Modal Cloud
::: {.callout-important}
When overriding CLI parameters in the legacy CLI, use the same underscore notation as in the YAML file (e.g., `--lora_model_dir`).

**Note:** This differs from the new Click-based CLI, which uses dash notation (e.g., `--lora-model-dir`). Keep this in mind if you're referencing newer documentation or switching between CLI versions.
:::

# Remote Compute with Modal Cloud

Axolotl supports running training and inference workloads on Modal cloud infrastructure. This is configured using a
cloud YAML file alongside your regular Axolotl config.

#### Cloud Configuration
## Cloud Configuration

Create a cloud config YAML with your Modal settings:

@@ -215,13 +212,17 @@ branch: main # Git branch to use (optional)
volumes: # Persistent storage volumes
- name: axolotl-cache
mount: /workspace/cache
- name: axolotl-data
mount: /workspace/data
- name: axolotl-artifacts
mount: /workspace/artifacts

env: # Environment variables
- WANDB_API_KEY
- HF_TOKEN
```

#### Running on Modal Cloud
## Running on Modal Cloud

Commands that support the --cloud flag:

@@ -239,18 +240,18 @@ axolotl train config.yml --cloud cloud_config.yml --no-accelerate
axolotl lm-eval config.yml --cloud cloud_config.yml
```

#### Cloud Configuration Options
## Cloud Configuration Options

```yaml
provider: compute provider, currently only `modal` is supported
gpu: GPU type to use
gpu_count: Number of GPUs (default: 1)
memory: RAM in GB (default: 128)
timeout: Maximum runtime in seconds
timeout_preprocess: Preprocessing timeout
branch: Git branch to use
docker_tag: Custom Docker image tag
volumes: List of persistent storage volumes
env: Environment variables to pass
secrets: Secrets to inject
provider: # compute provider; currently only `modal` is supported
gpu: # GPU type to use
gpu_count: # Number of GPUs (default: 1)
memory: # RAM in GB (default: 128)
timeout: # Maximum runtime in seconds
timeout_preprocess: # Preprocessing timeout
branch: # Git branch to use
docker_tag: # Custom Docker image tag
volumes: # List of persistent storage volumes
env: # Environment variables to pass
secrets: # Secrets to inject
```
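Putting these options together, a filled-in cloud config might look like this (the GPU type, memory, and timeout values are placeholders):

```yaml
# Hypothetical cloud config; gpu type, memory, and timeout are placeholders
provider: modal
gpu: a100            # GPU type to request
gpu_count: 2
memory: 128          # RAM in GB
timeout: 86400       # maximum runtime in seconds
branch: main
volumes:
  - name: axolotl-cache
    mount: /workspace/cache
env:
  - WANDB_API_KEY
  - HF_TOKEN
```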
32 changes: 32 additions & 0 deletions docs/custom_integrations.qmd
@@ -0,0 +1,32 @@
---
title: Custom Integrations
---


Axolotl adds custom features through `integrations`. They are located within the `src/axolotl/integrations` directory.

To enable them, please check the respective documentation linked below.

## Cut Cross Entropy

Please see [here](https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/cut_cross_entropy)

## Grokfast

Please see [here](https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/grokfast)

## Knowledge Distillation (KD)

Please see [here](https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/kd)

## Liger Kernels

Please see [here](https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/liger)

## Language Model Evaluation Harness (LM Eval)

Please see [here](https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/lm_eval)

## Spectrum

Please see [here](https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/spectrum)
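
As a hedged sketch, integrations are generally enabled through a `plugins` list in your config, plus any feature-specific flags documented by the integration itself. The plugin path and flag names below follow the Liger integration's README; verify the exact names against the linked docs:

```yaml
# Sketch: enabling the Liger integration; flag names per its README
plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_rms_norm: true
liger_glu_activation: true
```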