Question 1

Can I run large language models on a CometVPS server?

Accepted Answer

Yes. Quantized 7B–14B models (Llama 3.1, Mistral, Qwen 2.5) run comfortably on Supernova Ryzen plans for personal and small-team use. Quantized 30B–70B models run on our AstroMetal Ryzen 7950X3D dedicated server with 128GB RAM. CometVPS currently focuses on CPU inference, which suits these workloads. For real-time 70B+ inference at high concurrency, you'll want a dedicated GPU host. We're evaluating GPU offerings and may add them in the future.
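To sanity-check whether a model fits in a plan's RAM, a rough back-of-the-envelope estimate helps. The sketch below is illustrative only: the 4.5 bits per weight (a typical figure for 4-bit quantization once scaling data is included) and the 2 GB allowance for context cache and runtime are assumptions, not measured CometVPS figures.

```python
def estimate_ram_gb(params_billion: float,
                    bits_per_weight: float = 4.5,   # assumption: ~4-bit quantization
                    overhead_gb: float = 2.0) -> float:  # assumption: cache + runtime
    """Rough RAM estimate (decimal GB) for running a quantized model on CPU."""
    # 1e9 parameters * (bits / 8) bytes each = that many GB of weights
    weights_gb = params_billion * bits_per_weight / 8
    return round(weights_gb + overhead_gb, 1)


def fits(params_billion: float, ram_gb: float) -> bool:
    """True if the estimated footprint fits in the given amount of RAM."""
    return estimate_ram_gb(params_billion) <= ram_gb
```

Under these assumptions, `fits(70, 128)` returns `True`, which is consistent with running a quantized 70B model on a 128GB server. Actual usage varies with the quantization format and context length.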

Question 2

Do I need a GPU for self-hosted AI?

Accepted Answer

Not for everything. Tools like LiteLLM, n8n, Open WebUI, OpenClaw, and Flowise don't run models themselves. They orchestrate calls to remote APIs (OpenAI, Anthropic, Groq, etc.) and only need CPU and RAM. If you want to run models locally, modern AMD Ryzen CPUs running quantized GGUF models via Ollama handle 7B–14B models well and run 30B–70B models acceptably at lower throughput.
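As an illustration of local CPU inference, here is a minimal sketch that sends a prompt to Ollama's HTTP API using only the Python standard library. It assumes Ollama is listening on its default port 11434 on the same server and that a model tagged `llama3.1:8b` has already been pulled; swap in whichever model you actually use.

```python
import json
import urllib.request

# Assumption: Ollama runs on this machine with its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3.1:8b") -> urllib.request.Request:
    """Build a non-streaming generate request for Ollama."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3.1:8b") -> str:
    """Send the prompt and return the generated text."""
    # CPU inference can be slow, so allow a generous timeout.
    with urllib.request.urlopen(build_request(prompt, model), timeout=300) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Summarize what a VPS is in one sentence.")` on the server returns the model's reply as a string. No request leaves the machine.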

Question 3

Is my data private when self-hosting AI?

Accepted Answer

When you run local models with Ollama, your prompts and completions never leave your server. When you proxy to remote providers like OpenAI through LiteLLM, those providers see your prompts, but the orchestration logic, conversation history, embeddings, and uploaded documents stay on your VPS. You fully control retention for everything stored on your server, and with local models nothing is sent to a third-party AI vendor at all. For proxied requests, the provider's own data policies still apply to the prompts it receives.

Question 4

How much does it cost compared to hosted AI services?

Accepted Answer

It depends on usage. For high-volume agent and inference workloads, self-hosting wins quickly: a $58/month Supernova VPS can handle workloads that would cost hundreds of dollars per month on per-token services. For low-volume use, the main benefits are privacy, control, and the absence of usage caps rather than raw cost savings.
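The break-even point can be worked out with simple arithmetic. In the sketch below, the $58 monthly figure comes from the answer above, while the per-token price is a hypothetical placeholder; substitute the rate your provider actually charges. It also ignores how many tokens a CPU server can physically generate in a month, so treat the result as a lower bound on the volume needed.

```python
def breakeven_tokens_per_month(server_cost_usd: float,
                               price_per_million_tokens_usd: float) -> float:
    """Monthly token volume at which a flat-rate server matches per-token billing."""
    return server_cost_usd / price_per_million_tokens_usd * 1_000_000


def hosted_cost_usd(tokens: float, price_per_million_tokens_usd: float) -> float:
    """What a per-token service would charge for the given volume."""
    return tokens / 1_000_000 * price_per_million_tokens_usd


# Hypothetical rate of $2.00 per million tokens (an assumption, not a quoted price).
tokens_needed = breakeven_tokens_per_month(58, 2.0)
```

Above that monthly volume, the flat-rate server is cheaper under these assumptions; below it, the per-token service costs less in raw dollars.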

Question 5

What's the easiest way to get started?

Accepted Answer

If you want a personal AI chat: deploy Open WebUI + Ollama on a Supernova Flare. If you want to automate workflows: deploy n8n on a Core VPS or Supernova plan. If you want to control AI spend across your team: deploy LiteLLM on a Core VPS. All four tools have step-by-step Docker setup guides linked above.

Question 6

Can I run multiple AI tools on one server?

Accepted Answer

Yes. Many of these tools are designed to be combined. A common stack is Ollama + Open WebUI + LiteLLM + n8n on a single Supernova or AstroMetal server, giving you a private AI chat, model gateway, and automation platform in one deployment.
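Once a combined stack like this is deployed, a quick check that each service is listening can save debugging time. The following sketch probes one TCP port per tool. The port numbers are assumptions based on each tool's commonly used defaults (11434 for Ollama, 4000 for the LiteLLM proxy, 5678 for n8n, and 3000 as a typical host mapping for Open WebUI); adjust them to match your own Docker port mappings.

```python
import socket

# Assumption: commonly used default ports; change to match your deployment.
SERVICES = {
    "Ollama": 11434,
    "Open WebUI": 3000,
    "LiteLLM": 4000,
    "n8n": 5678,
}


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_stack(host: str = "127.0.0.1", services=SERVICES, probe=is_listening) -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: probe(host, port) for name, port in services.items()}
```

Running `check_stack()` on the server returns a dictionary such as `{"Ollama": True, ...}`, with `False` for any service whose port is closed. A successful connection only shows that something is listening on the port, not that the service is healthy.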

AI Hosting Built for Self-Hosters

Why Self-Host Your AI Stack?

Data Sovereignty

No Per-Token Markup

Mix and Match Models

What You Can Run

OpenClaw

n8n

Ollama

Open WebUI

LiteLLM

Flowise

Pick Your AI Workload

Lightweight Agents & Proxies

Heavy Automation & Multi-Agent

Local CPU Inference (7B–70B)

A note on CPU vs GPU inference

Built for AI Workloads

NVMe SSD Storage

10Gbps Network

99.9% Uptime SLA

DDoS Protection

Frequently Asked Questions

Ready to Build Your AI Stack?