OpenAI-compatible · Drop-in replacement · No waitlist

Every model.
One endpoint.

Nexus AI gives you access to GPT-4o, Claude 3.5, Gemini 2.0, Llama 3.3 and 150+ models through a single OpenAI-compatible API. Point your SDK here and you're done.

Get your API key (free) · View documentation

2.4K

Requests / sec

150+

Models

99.9%

Uptime SLA

124.8M

API requests served

↑ 18% this week

< 42ms

Median latency (p50)

↓ 8ms since last deploy

153

Models available

+12 added this month

$0.0002

Per 1K tokens (avg)

Up to 10× cheaper than direct provider pricing

Quick Start

Start building in 30 seconds

Copy a demo API key, point any OpenAI-compatible SDK at https://api.nexus.ai/v1, and run a test request. Demo keys are seeded for evaluation only and rotate automatically.

Your API Keys Active

Production API Key

sk-proj-J1j0Gr9G...Cx3uyw

Created · Last used

GitHub Actions CI

sk-proj-9C5cSvoA...FuKj3X

Created · Last used

Backend Service

sk-kNSBuhffYxACN...1gbT5e

Created · Last used

Data Pipeline

sk-proj-sPaDvtpl...xdw9hf

Created · Last used

Internal Tooling

sk-4A7rGe2tdqvhX...mmp8ro

Created · Last used

Analytics Service

sk-proj-hXuJmQHP...QesnzB

Created · Last used

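# pip install openai
import openai

# Point the official SDK at the Nexus endpoint instead of the default one.
client = openai.OpenAI(
    api_key="sk-nexus-...",
    base_url="https://api.nexus.ai/v1",
)

# Send a single-turn chat request.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

# Print the assistant's reply from the first choice.
print(response.choices[0].message.content)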
                

// npm install openai
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey:  'sk-nexus-...',
  baseURL: 'https://api.nexus.ai/v1',
});

const res = await client.chat.completions.create({
  model:    'gpt-4o',
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});

console.log(res.choices[0].message.content);
                

curl https://api.nexus.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-nexus-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
                

Drop-in compatible with the official OpenAI SDK: set base_url to https://api.nexus.ai/v1, swap in your Nexus API key, and your existing code works.
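If the SDK is not installed, the same request as the curl example can be sketched with Python's standard library alone. This is a minimal sketch, not an official client: the helper names (`build_chat_request`, `extract_reply`, `chat`) are hypothetical, and the response shape is assumed to match the `choices[0].message.content` path used in the SDK examples above.

```python
import json
import urllib.request

# Base URL taken from the Quick Start examples above.
BASE_URL = "https://api.nexus.ai/v1"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(body: dict) -> str:
    """Return the assistant text from a chat-completions style response body.

    Assumes the OpenAI-style shape: choices[0].message.content.
    """
    return body["choices"][0]["message"]["content"]


def chat(api_key: str, model: str, prompt: str, timeout: float = 30.0) -> str:
    """Send the request and return the reply text (hypothetical helper)."""
    request = build_chat_request(api_key, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_reply(json.load(response))
```

Splitting request construction from sending keeps the payload and headers easy to inspect before any network call is made; `chat("sk-nexus-...", "gpt-4o", "Hello!")` would then perform the actual request.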

Model Catalog

Every frontier model

View all 153 models

gpt-4o

OpenAI

Flagship multimodal model. Text, vision, and audio. Fastest GPT-4 class model.

128K ctx Vision Audio Functions

Available

claude-3-5-sonnet

Anthropic

Best-in-class coding and analysis. Strong instruction following and computer use.

200K ctx Vision Computer Use

Available

gemini-2.0-flash

Google DeepMind

Gemini's fastest and most efficient model. Excellent at multimodal reasoning tasks.

1M ctx Vision Grounding

Available

llama-3.3-70b

Meta AI

Best open-source model at 70B. Competitive with GPT-4o for most tasks.

128K ctx Open Source Functions

Available

mistral-large-2

Mistral AI

Top-tier European AI. Strong multilingual performance and code generation.

128K ctx Multilingual Functions

Available

gpt-4o-mini

OpenAI

Small, fast, cheap. Ideal for classification, extraction, and high-volume tasks.

128K ctx Low cost Vision

Available

Pricing

Simple, transparent pricing

Start for free. Scale as you grow. No seat fees, no hidden charges — you only pay for what you use.

Free

$ 0 / mo

Everything you need to prototype and experiment. No credit card required.

Get started free

100K tokens/day included
Access to 40+ models
OpenAI-compatible API
Community support
Not included: priority routing, fine-tuning API

MOST POPULAR

Pro

$ 29 / mo

For developers shipping production apps. Low latency, high limits, full model access.

Start Pro trial — 14 days free

10M tokens/month included
Access to all 153 models
Priority routing — lowest latency
Fine-tuning & embeddings API
99.9% uptime SLA
Email + Slack support

Enterprise

Custom

For teams with volume, compliance, or infrastructure requirements. We'll work with you.

Contact sales →

Unlimited tokens — volume pricing
Dedicated inference capacity
VPC deployment + private endpoints
SOC 2 Type II · HIPAA · GDPR
SSO / SAML + audit logs
Dedicated 24/7 support engineer

Per-token pricing

Prices per 1M tokens · input / output · after included quota

Model	Provider	Input	Output	Context	Tags
gpt-4o	OpenAI	$2.50	$10.00	128K	Flagship
gpt-4o-mini	OpenAI	$0.15	$0.60	128K	Low cost
claude-3-5-sonnet	Anthropic	$3.00	$15.00	200K	Best coding
gemini-2.0-flash	Google	$0.10	$0.40	1M	Fastest
llama-3.3-70b	Meta AI	$0.59	$0.79	128K	Open source
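To make the per-token rates concrete, here is a rough cost estimator built from the table above. It is a sketch under stated assumptions: the function name `estimate_cost` is hypothetical, the prices are copied from the table as listed, and the included plan quota is ignored, so the result covers only usage billed beyond it.

```python
from decimal import Decimal

# (input, output) prices in USD per 1M tokens, copied from the pricing table.
PRICES_PER_MILLION = {
    "gpt-4o": (Decimal("2.50"), Decimal("10.00")),
    "gpt-4o-mini": (Decimal("0.15"), Decimal("0.60")),
    "claude-3-5-sonnet": (Decimal("3.00"), Decimal("15.00")),
    "gemini-2.0-flash": (Decimal("0.10"), Decimal("0.40")),
    "llama-3.3-70b": (Decimal("0.59"), Decimal("0.79")),
}

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload, ignoring any included quota."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_MILLION
```

For example, 500K input tokens and 100K output tokens on gemini-2.0-flash come to 0.10 × 0.5 + 0.40 × 0.1 = $0.09. `Decimal` is used instead of floats so that small per-token amounts do not pick up rounding noise.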

Compute Infrastructure Demo

Demo compute nodes — explore SSH, console, and metrics

3 nodes healthy

api-prod-01 RUNNING

Ubuntu 24.04 · t3.xlarge · us-east-1a

10.24.7.18 · 4 vCPU · 16 GB RAM

ml-inference-02 RUNNING

Ubuntu 24.04 · g4dn.xlarge · us-west-2b

10.24.7.31 · 4 vCPU · 1× T4 GPU

db-primary HIGH CPU

Ubuntu 24.04 · r6i.2xlarge · us-east-1b

10.24.0.1 · 8 vCPU · 64 GB RAM

Temporary session credentials are auto-provisioned per request and remain valid for 4 hours. Alternatively, connect with your org key pair on port 22 as ubuntu or root. Each API key is authorized for SSH.
