Cloud (Modal)

TorchUMM integrates with Modal for running inference, evaluation, and training on cloud GPUs. Each model runs in an isolated container image with the correct Python, PyTorch, and dependency versions, so there are no local environment conflicts.

Setup

# Install Modal
pip install modal

# Login (one-time)
modal setup

Create the required secrets with the Modal CLI (they can also be managed on the Modal Dashboard):

modal secret create huggingface-secret HF_TOKEN=hf_xxx
modal secret create wandb-secret WANDB_API_KEY=xxx          # for training
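
Modal exposes the key-value pairs of an attached secret as environment variables inside the container. As a minimal sketch, a preflight check like the one below could confirm that the keys created above are visible before a long job starts. The helper `missing_env_vars` is hypothetical and not part of TorchUMM.

```python
import os

# Key names taken from the `modal secret create` commands above.
REQUIRED_ENV_VARS = ("HF_TOKEN", "WANDB_API_KEY")


def missing_env_vars(required=REQUIRED_ENV_VARS, env=None):
    """Return the names in `required` that are unset or empty in `env`.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required secrets are visible.")
```

For evaluation-only runs, passing `required=("HF_TOKEN",)` would skip the training-only key.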

No external APIs needed for scoring

UEval and WISE scoring both use local Qwen models as the evaluator, so no Gemini or OpenAI API keys are required. Download the evaluator weights once before running scoring steps:

modal run modal/download.py --model evaluator

Key Commands

Download Model Weights

# Download a specific model (one-time per model)
modal run modal/download.py --model bagel

# Download a dataset
modal run modal/download.py --dataset ueval

# Verify what is cached
modal run modal/download.py --ls

Run Evaluation

# Run an evaluation benchmark
modal run modal/run.py --model bagel --eval-config modal_dpg_bench_bagel

# Specify GPU type
modal run modal/run.py --model omnigen2 --gpu H100

# Multi-GPU
modal run modal/run.py --model wise --eval-config modal_score_wise_bagel --gpu A100-80GB:2
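
The `--gpu` values above take the form `TYPE` or `TYPE:COUNT`, which matches the string format Modal accepts for GPU requests. The source does not say whether `run.py` parses this value or forwards it unchanged, so the following is only a sketch of how such a string breaks down. `parse_gpu_spec` is a hypothetical name.

```python
def parse_gpu_spec(spec):
    """Split a GPU spec such as "A100-80GB:2" into (gpu_type, count).

    Hypothetical helper for illustration. A spec without ":COUNT"
    is treated as a request for a single GPU.
    """
    gpu_type, sep, count = spec.partition(":")
    if not gpu_type:
        raise ValueError(f"missing GPU type in {spec!r}")
    if not sep:
        return gpu_type, 1
    if not count.isdigit() or int(count) < 1:
        raise ValueError(f"invalid GPU count in {spec!r}")
    return gpu_type, int(count)


if __name__ == "__main__":
    for spec in ("H100", "A100-80GB:2"):
        print(spec, "->", parse_gpu_spec(spec))
```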

Run Inference

modal run modal/run.py --model bagel --script inferencer.py

Run Training

modal run modal/train.py --config bagel_sft

Sync Code

After modifying code locally, sync it to the cloud volume:

modal run modal/download.py --sync

Minimal sync

For small changes, consider syncing individual files rather than the entire codebase. Use `modal run modal/download.py --sync` only when many files have changed.
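
One way to sync a single file is the Modal CLI's `modal volume put` command, targeting the `umm-codebase` volume. The sketch below only builds the command line. It assumes the volume root mirrors the repository root, which the source does not confirm, and `volume_put_command` is a hypothetical helper.

```python
import shlex
from pathlib import PurePosixPath

CODEBASE_VOLUME = "umm-codebase"  # volume name from the Volumes table


def volume_put_command(relative_path, volume=CODEBASE_VOLUME):
    """Build the argv for uploading one repo file to the codebase volume.

    Assumption: the file's path inside the volume equals its path
    relative to the repository root.
    """
    rel = PurePosixPath(relative_path)
    if rel.is_absolute() or ".." in rel.parts:
        raise ValueError(f"expected a path inside the repo, got {relative_path!r}")
    # --force overwrites the copy that already exists in the volume.
    return ["modal", "volume", "put", "--force", volume, str(rel), "/" + str(rel)]


if __name__ == "__main__":
    # Print the command instead of running it so it can be reviewed first.
    print(shlex.join(volume_put_command("modal/run.py")))
```

To execute the command rather than print it, the returned list can be passed to `subprocess.run(..., check=True)`.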

Architecture

The Modal infrastructure lives in the `modal/` directory:

| File | Purpose |
| --- | --- |
| `config.py` | Constants: volume names, HF model/dataset IDs, paths |
| `volumes.py` | Persistent volume definitions (models, datasets, checkpoints, outputs, codebase) |
| `images.py` | Container image definitions per backbone model |
| `download.py` | Download model weights and datasets to volumes |
| `run.py` | Unified entry point for inference and evaluation |
| `train.py` | Post-training entry point |

Volumes

| Volume | Name | Container Mount Path |
| --- | --- | --- |
| Codebase | `umm-codebase` | `/workspace` |
| Model weights | `umm-model-cache` | `/model_cache` |
| Post-train weights | `umm-post-train-model-cache` | `/post_train_model_cache` |
| Datasets | `umm-datasets-cache` | `/datasets` |
| Checkpoints | `umm-checkpoints` | `/checkpoints` |
| Outputs | `umm-outputs` | `/outputs` |

Path convention

Eval config files use container mount paths (e.g., `/outputs/ueval/bagel`), not volume names or local paths.
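
The convention can be checked mechanically. The sketch below copies the mount table from this page into a dictionary and reports which volume backs a given path. `MOUNTS` and `volume_for_path` are hypothetical names, not taken from `config.py` or `volumes.py`.

```python
from pathlib import PurePosixPath

# Container mount path -> volume name, copied from the Volumes table.
MOUNTS = {
    "/workspace": "umm-codebase",
    "/model_cache": "umm-model-cache",
    "/post_train_model_cache": "umm-post-train-model-cache",
    "/datasets": "umm-datasets-cache",
    "/checkpoints": "umm-checkpoints",
    "/outputs": "umm-outputs",
}


def volume_for_path(path):
    """Return the volume backing `path`, or None if it is outside every mount."""
    p = PurePosixPath(path)
    for mount, volume in MOUNTS.items():
        m = PurePosixPath(mount)
        # Match the mount itself or anything nested beneath it.
        if p == m or m in p.parents:
            return volume
    return None


if __name__ == "__main__":
    for candidate in ("/outputs/ueval/bagel", "umm-outputs/ueval/bagel"):
        print(candidate, "->", volume_for_path(candidate))
```

A path that starts with a volume name, such as `umm-outputs/ueval/bagel`, maps to `None` here, which is the mistake the convention is meant to prevent.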