OpenAI-compatible · Drop-in replacement · No waitlist

Every model.
One endpoint.

Nexus AI gives you access to GPT-4o, Claude 3.5, Gemini 2.0, Llama 3.3 and 150+ models through a single OpenAI-compatible API. Point your SDK here and you're done.

2.4K requests / sec · 150+ models · 99.9% uptime SLA
API requests served: 124.8M (↑ 18% this week)
Median latency (p50): < 42ms (↓ 8ms since last deploy)
Models available: 153 (+12 added this month)
Average price per 1K tokens: $0.0002 (up to 10× cheaper than going direct to providers)
Quick Start
Start building in 30 seconds

Copy a demo API key, point any OpenAI-compatible SDK at https://api.nexus.ai/v1, and run a test request. Demo keys are seeded for evaluation only and rotate automatically.

Your API Keys · Active
  • Production API Key: sk-proj-J1j0Gr9G...Cx3uyw
  • GitHub Actions CI: sk-proj-9C5cSvoA...FuKj3X
  • Backend Service: sk-kNSBuhffYxACN...1gbT5e
  • Data Pipeline: sk-proj-sPaDvtpl...xdw9hf
  • Internal Tooling: sk-4A7rGe2tdqvhX...mmp8ro
  • Analytics Service: sk-proj-hXuJmQHP...QesnzB
# pip install openai
import openai

client = openai.OpenAI(
    api_key="sk-nexus-...",
    base_url="https://api.nexus.ai/v1",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)
print(response.choices[0].message.content)

// npm install openai
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-nexus-...',
  baseURL: 'https://api.nexus.ai/v1',
});

const res = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});
console.log(res.choices[0].message.content);

curl https://api.nexus.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-nexus-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
Drop-in compatible with the official OpenAI SDK — just change base_url and your existing code works.
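Because the endpoint accepts the OpenAI request format, the same call can also be assembled without an SDK. The following is a minimal sketch using only the Python standard library. It assumes the key is kept in an environment variable named NEXUS_API_KEY (a hypothetical name chosen for this example, not one the page defines), and the helper names build_chat_request and send_chat_request are illustrative only.

```python
import json
import os
import urllib.request

# Base URL taken from the Quick Start snippets.
BASE_URL = "https://api.nexus.ai/v1"


def build_chat_request(model, prompt, api_key=None):
    """Build, but do not send, a chat completions request."""
    # NEXUS_API_KEY is a hypothetical variable name used for this sketch.
    key = api_key or os.environ.get("NEXUS_API_KEY", "sk-nexus-...")
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request, timeout=30):
    """Send the request and return the first choice's message text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

Calling send_chat_request(build_chat_request("gpt-4o", "Hello!")) would perform the same request as the snippets above. Reading the key from the environment keeps it out of source control.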
Model Catalog
Every frontier model
View all 153 models
gpt-4o
OpenAI
Flagship multimodal model. Text, vision, and audio. Fastest GPT-4 class model.
128K ctx · Vision · Audio · Functions
Available
claude-3-5-sonnet
Anthropic
Best-in-class coding and analysis. Strong instruction following and computer use.
200K ctx · Vision · Computer Use
Available
gemini-2.0-flash
Google DeepMind
Gemini's fastest and most efficient model. Excellent at multimodal reasoning tasks.
1M ctx · Vision · Grounding
Available
llama-3.3-70b
Meta AI
Best open-source model at 70B. Competitive with GPT-4o for most tasks.
128K ctx · Open Source · Functions
Available
mistral-large-2
Mistral AI
Top-tier European AI. Strong multilingual performance and code generation.
128K ctx · Multilingual · Functions
Available
gpt-4o-mini
OpenAI
Small, fast, cheap. Ideal for classification, extraction, and high-volume tasks.
128K ctx · Low cost · Vision
Available
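Since every model sits behind the same endpoint, choosing one comes down to picking a model name. The sketch below shows one way to filter the six catalog cards above by context window and capability tags. The Model class and pick_models helper are hypothetical names for illustration, and the context sizes treat "128K" as 128,000 tokens and "1M" as 1,000,000 tokens, which is an assumption about how the page rounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context_tokens: int  # assumed rounding: 128K -> 128_000, 1M -> 1_000_000
    tags: frozenset


# Entries transcribed from the catalog cards; not fetched from any API.
CATALOG = [
    Model("gpt-4o", 128_000, frozenset({"Vision", "Audio", "Functions"})),
    Model("claude-3-5-sonnet", 200_000, frozenset({"Vision", "Computer Use"})),
    Model("gemini-2.0-flash", 1_000_000, frozenset({"Vision", "Grounding"})),
    Model("llama-3.3-70b", 128_000, frozenset({"Open Source", "Functions"})),
    Model("mistral-large-2", 128_000, frozenset({"Multilingual", "Functions"})),
    Model("gpt-4o-mini", 128_000, frozenset({"Low cost", "Vision"})),
]


def pick_models(min_context=0, required_tags=()):
    """Return names of catalog models meeting the context and tag requirements."""
    needed = set(required_tags)
    return [
        m.name
        for m in CATALOG
        if m.context_tokens >= min_context and needed <= m.tags
    ]
```

For example, pick_models(min_context=200_000, required_tags=["Vision"]) narrows the list to the two long-context vision models, and the returned name can be passed as the model parameter of a chat completions request.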
Pricing
Simple, transparent pricing

Start for free. Scale as you grow. No seat fees, no hidden charges — you only pay for what you use.

Free
$0 / mo
Everything you need to prototype and experiment. No credit card required.
Get started free
  • 100K tokens/day included
  • Access to 40+ models
  • OpenAI-compatible API
  • Community support
  • Priority routing
  • Fine-tuning API
Enterprise
Custom
For teams with volume, compliance, or infrastructure requirements. We'll work with you.
Contact sales →
  • Unlimited tokens — volume pricing
  • Dedicated inference capacity
  • VPC deployment + private endpoints
  • SOC 2 Type II · HIPAA · GDPR
  • SSO / SAML + audit logs
  • Dedicated 24/7 support engineer
Per-token pricing
Prices in USD per 1M tokens · input / output · after included quota

Model               Provider    Input    Output   Context  Tag
gpt-4o              OpenAI      $2.50    $10.00   128K     Flagship
gpt-4o-mini         OpenAI      $0.15    $0.60    128K     Low cost
claude-3-5-sonnet   Anthropic   $3.00    $15.00   200K     Best coding
gemini-2.0-flash    Google      $0.10    $0.40    1M       Fastest
llama-3.3-70b       Meta AI     $0.59    $0.79    128K     Open source
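As a rough illustration of how the per-token prices translate into a bill, the sketch below multiplies token counts by the listed per-1M rates. The estimate_cost helper is a hypothetical name, and the calculation assumes all tokens are billable, so it ignores any included quota.

```python
# USD per 1M tokens as (input, output), copied from the pricing table.
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "claude-3-5-sonnet": (3.00, 15.00),
    "gemini-2.0-flash": (0.10, 0.40),
    "llama-3.3-70b": (0.59, 0.79),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one request, ignoring any included quota."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # 2,000 input tokens and 500 output tokens on gpt-4o.
    print(f"${estimate_cost('gpt-4o', 2_000, 500):.4f}")
```

With these rates, a gpt-4o request of 2,000 input tokens and 500 output tokens comes to about $0.01: $0.005 for input plus $0.005 for output.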
Compute Infrastructure Demo
Demo compute nodes — explore SSH, console, and metrics
3 nodes healthy
api-prod-01 RUNNING
Ubuntu 24.04 · t3.xlarge · us-east-1a
10.24.7.18 · 4 vCPU · 16 GB RAM
ml-inference-02 RUNNING
Ubuntu 24.04 · g4dn.xlarge · us-west-2b
10.24.7.31 · 4 vCPU · 1× T4 GPU
db-primary HIGH CPU
Ubuntu 24.04 · r6i.2xlarge · us-east-1b
10.24.0.1 · 8 vCPU · 64 GB RAM
Temporary session credentials are auto-provisioned per request and valid for 4 hours. Alternatively, connect with your org key pair on port 22 as ubuntu or root. Each API key is authorized for SSH.
ssh ubuntu@10.24.7.18
Serial Console — api-prod-01
Metrics — api-prod-01