[Product images: BoxGPT Dual RTX 5060 Ti 32GB. Front view, rear I/O ports, internal components, side view]

Dual RTX 5060 Ti 16GB

A powerful AI server featuring dual RTX 5060 Ti GPUs with 32GB combined VRAM. Run larger language models, handle multi-user workloads, and accelerate image generation with twice the processing power. Comes with OpenWebUI, Ollama, and ComfyUI pre-installed and ready to use. Everything runs locally on your hardware - your data never leaves your machine.

Ideal for personal use or small teams
Pre-installed AI runtimes with latest models
1-year limited warranty and tech support
One-time purchase, no subscriptions or cloud fees

30-day returns · Free shipping

Performance

LLM Inference Speed

Tokens generated per second for a single user session. The average human reads about 4-5 words per second (~6-8 tokens), so anything above 30 tok/s feels instantaneous. Speed decreases when multiple concurrent users share the GPU. Tested on a selection of the most popular models.
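To make the comparison above concrete, here is a minimal sketch of the arithmetic. It converts a reading speed into tokens per second and computes generation throughput from the `eval_count` and `eval_duration` (nanoseconds) fields that Ollama reports in its generate response. The 1.5 tokens-per-word ratio is an assumption (it is roughly what the 4-5 words to 6-8 tokens figure implies), and the sample response values are made up for illustration, not benchmark results.

```python
# Assumption: about 1.5 tokens per English word; real ratios vary by tokenizer.
TOKENS_PER_WORD = 1.5


def reading_speed_tokens(words_per_second, tokens_per_word=TOKENS_PER_WORD):
    """Convert a reading speed in words/s to an approximate tokens/s."""
    return words_per_second * tokens_per_word


def generation_speed(response):
    """Tokens per second from an Ollama-style response dict.

    eval_count is the number of generated tokens and eval_duration is
    the generation time in nanoseconds.
    """
    seconds = response["eval_duration"] / 1e9
    return response["eval_count"] / seconds


# Hypothetical response: 300 tokens generated in 10 seconds.
sample = {"eval_count": 300, "eval_duration": 10_000_000_000}

tok_per_s = generation_speed(sample)
fast_reader = reading_speed_tokens(5)

print(f"generation: {tok_per_s:.1f} tok/s, fast reader: {fast_reader:.1f} tok/s")
print(f"headroom over reading speed: {tok_per_s / fast_reader:.1f}x")
```

At these illustrative numbers, generation runs several times faster than a fast reader can follow, which is why higher speeds stop feeling different in single-user chat.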

Default Ollama settings, no CPU offloading

Capacity

Max Context Window

Maximum tokens that fit entirely in VRAM without CPU offloading. A 128K context holds roughly 200 pages of text. Larger contexts let the AI remember more of your conversation history and analyze longer documents. Context size is limited by available VRAM after loading the model weights.
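As a rough sketch of why VRAM caps the context window: each token in the context keeps a key and a value vector per layer in the KV cache, so cache size grows linearly with context length. The model shape below (32 layers, 8 KV heads, head dimension 128, 16-bit cache) and the 5 GiB weight size are hypothetical values chosen for illustration. The estimate also treats the 32 GB as a single pool and ignores runtime overhead and activation buffers, so real limits will be lower.

```python
GiB = 1024 ** 3


def kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_value):
    """KV cache bytes needed per context token (keys plus values)."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


def max_context_tokens(vram_bytes, weight_bytes, per_token_bytes):
    """Tokens that fit in the VRAM left over after loading the weights."""
    free_bytes = vram_bytes - weight_bytes
    return max(free_bytes, 0) // per_token_bytes


# Hypothetical model shape, not a measurement of any specific model.
per_token = kv_bytes_per_token(n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_value=2)

cache_128k = per_token * 128 * 1024  # bytes of KV cache for a 128K context
limit = max_context_tokens(32 * GiB, 5 * GiB, per_token)

print(f"{per_token // 1024} KiB per token")
print(f"128K context needs {cache_128k / GiB:.0f} GiB of KV cache")
print(f"upper bound on context: {limit} tokens")
```

Larger models leave less room for the cache and usually need more bytes per token, which is why maximum context shrinks as model size grows.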

Creative

Image/Video Generation Speed

Seconds to generate one 512x512 image or one 5-second video clip. Lower is better for rapid iteration. Higher resolutions take longer. Tested on a selection of the most popular models. Generation speed scales with GPU compute power and VRAM bandwidth.

Default ComfyUI templates, 512x512, 81 frames for video models, same prompt for all models
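The link between "81 frames" and "5 seconds of video" is a frame-rate conversion. The sketch below assumes 16 frames per second, which is what those two figures imply; actual video models may use a different frame rate. The 120-second generation time is a made-up example, not a benchmark result.

```python
def clip_seconds(frames, fps):
    """Playback length of a clip in seconds."""
    return frames / fps


def compute_per_video_second(generation_seconds, frames, fps):
    """Seconds of GPU time spent per second of finished video."""
    return generation_seconds / clip_seconds(frames, fps)


FPS = 16  # assumption: implied by 81 frames being roughly 5 seconds

duration = clip_seconds(81, FPS)
ratio = compute_per_video_second(120.0, 81, FPS)  # hypothetical 120 s run

print(f"81 frames at {FPS} fps = {duration:.2f} s of video")
print(f"{ratio:.1f} s of compute per second of video")
```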

Benchmark results are based on internal testing and may vary depending on workload, model parameters, system configuration, and thermal conditions.