[Product images: BoxGPT RTX PRO 6000 96GB, front angle, top I/O panel, rear I/O ports, internal components]

RTX PRO 6000 96GB

Designed for advanced local AI workloads, this system features the RTX PRO 6000 with 96GB of VRAM. It handles some of the largest models, complex image and video generation pipelines, and demanding enterprise workloads. It comes with Open WebUI, Ollama, and ComfyUI pre-installed. Everything runs locally on your hardware, so your data never leaves your machine.

Ideal for AI developers and enterprise workloads
Pre-installed AI runtimes with latest models
1-year limited warranty and tech support
One-time purchase, no subscriptions or cloud fees

30-day returns · Free shipping

Performance

LLM Inference Speed

Tokens generated per second for a single user session. The average person reads about 4-5 words per second (roughly 6-8 tokens), so anything above 30 tok/s is several times faster than reading speed and feels instantaneous. Speed decreases when multiple concurrent users share the GPU. Tested on a selection of the most popular models.

Default Ollama settings, no CPU offloading
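To check generation speed on your own unit, here is a minimal sketch that assumes Ollama is listening on its default local port (11434). It relies on the `eval_count` and `eval_duration` fields that Ollama returns from `/api/generate`; the helper names `tokens_per_second` and `measure` are illustrative and not part of any shipped tooling.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(response: dict) -> float:
    """Generation speed from one Ollama response.

    eval_count is the number of generated tokens; eval_duration is the
    time spent generating them, in nanoseconds.
    """
    return response["eval_count"] / (response["eval_duration"] / 1e9)


def measure(model: str, prompt: str) -> float:
    """Run one non-streaming generation and return its tok/s."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return tokens_per_second(json.load(reply))
```

Calling `measure("<model tag>", "Explain VRAM in one paragraph.")` with a model you have already pulled returns a single-session figure. Expect it to vary with prompt length, model settings, and thermal conditions.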


Capacity

Max Context Window

Maximum tokens that fit entirely in VRAM without CPU offloading. A 128K context holds roughly 200 pages of text. Larger contexts let the AI remember more of your conversation history and analyze longer documents. Context size is limited by available VRAM after loading the model weights.
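As a rough illustration of why context size depends on the VRAM left after the weights are loaded, the sketch below uses the standard key/value cache estimate for transformer models: each token stores one key and one value vector per layer and per KV head. The model shape, weight size, and overhead figures in the example are hypothetical and are not the configurations used in the benchmarks. The sketch also treats 1 GB as 1024^3 bytes.

```python
def kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Bytes of KV cache needed per token of context.

    The factor of 2 covers keys and values; bytes_per_value=2 assumes
    a 16-bit cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


def max_context_tokens(vram_gb, weights_gb, overhead_gb, per_token_bytes):
    """Tokens that fit in the VRAM left after weights and runtime overhead."""
    free_bytes = (vram_gb - weights_gb - overhead_gb) * 1024**3
    return max(0, int(free_bytes // per_token_bytes))


# Hypothetical model: 32 layers, 8 KV heads, head dimension 128, 16-bit cache.
per_token = kv_cache_bytes_per_token(32, 8, 128)   # 131072 bytes (128 KiB)

# Hypothetical budget: 96 GB VRAM, 16 GB of weights, 2 GB runtime overhead.
print(max_context_tokens(96, 16, 2, per_token))    # 638976
```

This is only an upper bound on what fits in memory. The usable context is also capped by the maximum context length the model itself supports.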

Creative

Image/Video Generation Speed

Seconds to generate one 512x512 image or 5 seconds of video. Lower is better for rapid iteration. Higher resolutions take longer. Tested on a selection of the most popular models. Generation speed scales with GPU compute power and VRAM bandwidth.

Default ComfyUI templates, 512x512, 81 frames for video models, same prompt for all models

Benchmark results are based on internal testing and may vary depending on workload, model parameters, system configuration, and thermal conditions.