Product images: BoxGPT RTX PRO 5000 48GB (front angle, top I/O panel, rear I/O ports, internal components)

RTX PRO 5000 48GB

Powerful AI server featuring the RTX PRO 5000 with 48GB VRAM. Handle medium-to-large models, complex image/video generation pipelines, and demanding enterprise workloads with ease. Comes with OpenWebUI, Ollama, and ComfyUI pre-installed. Everything runs locally on your hardware, so your data never leaves your machine.

Ideal for AI developers and enterprise workloads
Pre-installed AI runtimes with latest models
1-year limited warranty and tech support
One-time purchase, no subscriptions or cloud fees

30-day returns · Free shipping

Performance

LLM Inference Speed

Tokens generated per second for a single user session. The average human reads about 4-5 words per second (roughly 6-8 tokens), so anything above 30 tok/s is several times faster than reading speed and feels instantaneous. Speed decreases when multiple concurrent users share the GPU. Tested on a selection of the most popular models.
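To check the tokens-per-second figure on your own unit, the sketch below derives it from the `eval_count` and `eval_duration` fields that Ollama returns from its `/api/generate` endpoint (the duration is reported in nanoseconds). It assumes Ollama is listening on its default local port 11434. The helper names and the reading-speed comparison are illustrative only and are not the vendor's benchmark method.

```python
import json
import urllib.request

# Reading-speed range quoted in the text above (tokens per second).
READING_TOKENS_PER_SEC = (6, 8)


def tokens_per_second(response: dict) -> float:
    """Generation speed from an Ollama response.

    eval_count is the number of generated tokens and eval_duration is
    the generation time in nanoseconds.
    """
    return response["eval_count"] / (response["eval_duration"] / 1e9)


def reading_speed_multiple(tok_per_sec: float) -> tuple[float, float]:
    """How many times faster than reading speed, as a (low, high) range."""
    slow_reader, fast_reader = READING_TOKENS_PER_SEC
    return tok_per_sec / fast_reader, tok_per_sec / slow_reader


def measure(model: str, prompt: str, host: str = "http://localhost:11434") -> float:
    """Run one non-streaming generation and return its tok/s.

    The host is Ollama's default address; adjust it if your setup differs.
    """
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return tokens_per_second(json.load(reply))
```

Calling `measure("<model-name>", "Explain what VRAM is.")` with a model you have already pulled returns a single-session figure. Expect it to vary with prompt length, model, and thermal conditions.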

Default Ollama settings, no CPU offloading

Capacity

Max Context Window

Maximum number of tokens that fit entirely in VRAM without CPU offloading. A 128K context holds roughly 200 pages of text. Larger contexts let the AI remember more of your conversation history and analyze longer documents. Context size is limited by the VRAM that remains after the model weights are loaded.
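The relationship between VRAM, model weights, and context length can be sketched with the usual key-value cache estimate for transformer models: each token stores one key and one value vector per layer. Every number in the example (layer count, KV heads, head size, weight size, runtime overhead) is a hypothetical placeholder, not a measurement of this system, and real runtimes add allocations this estimate ignores.

```python
GIB = 1024 ** 3


def kv_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                       bytes_per_value: int = 2) -> int:
    """Bytes of KV cache per token: a key and a value vector per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


def max_context_tokens(vram_bytes: int, weight_bytes: int,
                       per_token_bytes: int, overhead_bytes: int = 0) -> int:
    """Tokens that fit in the VRAM left after weights and runtime overhead."""
    free_bytes = vram_bytes - weight_bytes - overhead_bytes
    return max(free_bytes, 0) // per_token_bytes


# Hypothetical model: 32 layers, 8 KV heads of size 128, 16-bit cache.
per_token = kv_bytes_per_token(n_layers=32, n_kv_heads=8, head_dim=128)

# Hypothetical budget: 48 GiB card, 5 GiB of weights, 2 GiB of overhead.
estimate = max_context_tokens(48 * GIB, 5 * GIB, per_token, overhead_bytes=2 * GIB)
print(f"{per_token} bytes per token, about {estimate} tokens of context")
```

A model's own maximum context length still caps the usable window, so treat the result as an upper bound on what the memory allows rather than a guaranteed setting.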

Creative

Image/Video Generation Speed

Seconds to generate one 512x512 image or 5 seconds of video (81 frames). Lower is better for rapid iteration; higher resolutions take longer. Generation speed scales with GPU compute power and VRAM bandwidth. Tested on a selection of the most popular models.

Default ComfyUI templates, 512x512, 81 frames for video models, same prompt for all models
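To put a generation time in context, the sketch below converts it into images per minute and into a real-time factor for video. The frame rate is not stated above; it is inferred here from the 81 frames and 5 seconds given for the video tests. The timings passed in at the end are hypothetical inputs, not benchmark results.

```python
def implied_fps(frames: int, clip_seconds: float) -> float:
    """Frame rate implied by a clip's frame count and duration."""
    return frames / clip_seconds


def images_per_minute(seconds_per_image: float) -> float:
    """Throughput for back-to-back image generations."""
    return 60.0 / seconds_per_image


def realtime_factor(generation_seconds: float, clip_seconds: float) -> float:
    """Seconds of compute needed per second of finished video."""
    return generation_seconds / clip_seconds


# 81 frames over a 5-second clip, as in the test settings above.
print(implied_fps(81, 5))

# Hypothetical timings: 4 s per image, 100 s for a 5-second clip.
print(images_per_minute(4.0))
print(realtime_factor(100.0, 5.0))
```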

Benchmark results are based on internal testing and may vary depending on workload, model parameters, system configuration, and thermal conditions.