Quickstart
If you already use the OpenAI SDK, start here: it is the lowest-friction migration path.
You can think of QYUAN AI as a unified API gateway. Right now there are two recommended integration patterns:
- OpenAI-compatible: best for most SDKs and apps, using /v1/chat/completions or /v1/responses.
- Claude Messages: if your existing code already follows the Anthropic format, you can call /v1/messages directly.

Connection details:
- OpenAI-compatible base URL: https://token.qyuanai.com/v1
- Claude Messages endpoint: https://token.qyuanai.com/v1/messages
- Authentication: use Authorization: Bearer YOUR_API_KEY for OpenAI-compatible endpoints. Use x-api-key: YOUR_API_KEY for Claude Messages.
- Model list: call GET /v1/models to fetch the currently available model list.
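The two authentication styles above can be captured in a small helper. This is a minimal sketch rather than an official client: the function name `auth_headers` and the `style` values are illustrative, and the `anthropic-version` value is simply the one used in the Claude Messages curl example in this document.

```python
def auth_headers(api_key: str, style: str = "openai") -> dict:
    """Build request headers for one of the two integration styles.

    Hypothetical helper: "openai" and "claude" are labels chosen for this
    sketch, not values defined by the platform.
    """
    headers = {"Content-Type": "application/json"}
    if style == "openai":
        # OpenAI-compatible endpoints expect a Bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    elif style == "claude":
        # Claude Messages expects x-api-key instead of Bearer.
        headers["x-api-key"] = api_key
        headers["anthropic-version"] = "2023-06-01"
    else:
        raise ValueError(f"unknown style: {style!r}")
    return headers
```

Keeping header construction in one place makes it harder to send a Bearer token to /v1/messages by accident.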
Interface overview
Models on this platform no longer share a single request shape. The most common integration mistake is assuming that they do: most text models can use the OpenAI-compatible interface, but image generation and Claude native requests require different payload formats.
| Model type | Recommended endpoint | Core payload shape |
|---|---|---|
| GPT / Codex / Claude / Gemini text models | POST /v1/chat/completions | model + messages + max_tokens |
| Gemini models through this gateway | POST /v1/chat/completions | Still use OpenAI-style messages. Do not switch to Google-native contents. |
| Pure image generation | POST /v1/images/generations | model + prompt + size + n |
| OpenAI Responses-style apps | POST /v1/responses | model + input + max_output_tokens |
| Claude native format | POST /v1/messages | model + max_tokens + messages, sent with x-api-key |
| Listing currently available models | GET /v1/models | No request body. Recommended before every new integration |
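The payload shapes in the table can be written out side by side. The sketch below is only an illustration: `minimal_payload` is a hypothetical helper, the default limit of 512 and the image size are taken from the examples in this document, and a real integration would add whatever optional fields it needs.

```python
def minimal_payload(endpoint: str, model: str, text: str, limit: int = 512) -> dict:
    """Return the smallest request body for each endpoint in the table above."""
    user_message = [{"role": "user", "content": text}]
    if endpoint == "/v1/chat/completions":
        return {"model": model, "messages": user_message, "max_tokens": limit}
    if endpoint == "/v1/responses":
        return {"model": model, "input": text, "max_output_tokens": limit}
    if endpoint == "/v1/messages":
        return {"model": model, "max_tokens": limit, "messages": user_message}
    if endpoint == "/v1/images/generations":
        # Image requests take a prompt, not messages.
        return {"model": model, "prompt": text, "size": "1024x1024", "n": 1}
    raise ValueError(f"unsupported endpoint: {endpoint}")
```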
Model categories
As of 2026-05-15, the models returned by the API can be grouped as follows:
| Category | Current models | Recommended integration |
|---|---|---|
| OpenAI / Codex text models | gpt-5.2, gpt-5.3-codex, gpt-5.3-codex-spark, gpt-5.4, gpt-5.4-mini, gpt-5.5, gpt-oss-120b-medium | /v1/chat/completions or /v1/responses |
| Claude models | claude-sonnet-4-5, claude-sonnet-4-6, claude-opus-4-6, claude-opus-4-6-thinking, claude-opus-4-7 | /v1/chat/completions or /v1/messages |
| Gemini text / reasoning models | gemini-2.5-flash, gemini-2.5-flash-lite, gemini-2.5-pro, gemini-3-flash, gemini-3-flash-preview, gemini-3-pro-low, gemini-3-pro-high, gemini-3-pro-preview, gemini-3.1-flash-lite, gemini-3.1-flash-lite-preview, gemini-3.1-pro-low, gemini-3.1-pro-high, gemini-3.1-pro-preview | /v1/chat/completions |
| Image models | gpt-image-2 | /v1/images/generations |
| Gemini image-capable models | gemini-3.1-flash-image | For now, test and integrate it through /v1/chat/completions |
The safest pattern is still to request GET /v1/models first and read model names directly from the response.
curl https://token.qyuanai.com/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
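The same check can be done from Python using only the standard library. This sketch assumes the response follows the OpenAI-style list shape, `{"data": [{"id": "..."}]}`, which this document does not show explicitly; `parse_model_ids` and `fetch_model_ids` are illustrative names.

```python
import json
import urllib.request

BASE_URL = "https://token.qyuanai.com/v1"


def parse_model_ids(body: str) -> list:
    """Extract sorted model ids from a model-list response body.

    Assumption: the body looks like {"data": [{"id": "..."}, ...]}.
    """
    payload = json.loads(body)
    return sorted(item["id"] for item in payload.get("data", []))


def fetch_model_ids(api_key: str, base_url: str = BASE_URL) -> list:
    """Call GET /v1/models and return the available model ids."""
    request = urllib.request.Request(
        f"{base_url}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_model_ids(response.read().decode("utf-8"))
```

Calling `fetch_model_ids("YOUR_API_KEY")` at startup lets an application read model names from the platform instead of hard-coding them.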
OpenAI Chat Completions
This is the most universal integration path right now. GPT, Claude, and Gemini models can all be tested through this format first.
Request body: model + messages, plus optional max_tokens and stream.
curl example
curl https://token.qyuanai.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4-mini",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Introduce QYUAN AI in one sentence."}
    ],
    "max_tokens": 512
  }'
Python example
from openai import OpenAI

# Point the OpenAI SDK at the QYUAN AI gateway.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://token.qyuanai.com/v1",
)

resp = client.chat.completions.create(
    model="gpt-5.4-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Introduce QYUAN AI in one sentence."},
    ],
    max_tokens=512,
)

# Print the assistant's reply text.
print(resp.choices[0].message.content)
JavaScript example
import OpenAI from "openai";

// Point the OpenAI SDK at the QYUAN AI gateway.
const client = new OpenAI({
  apiKey: process.env.QYUAN_API_KEY,
  baseURL: "https://token.qyuanai.com/v1",
});

const resp = await client.chat.completions.create({
  model: "gpt-5.4-mini",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Introduce QYUAN AI in one sentence." },
  ],
  max_tokens: 512,
});

// Print the assistant's reply text.
console.log(resp.choices[0].message.content);
Gemini integration
The Gemini model family is already normalized into the OpenAI-compatible format on this platform, so do not copy Google-native contents, parts, or generateContent request shapes directly.
The simplest path is to keep using /v1/chat/completions.
Endpoint: POST https://token.qyuanai.com/v1/chat/completions
Supported models: gemini-2.5-*, gemini-3-*, gemini-3.1-*
Request body: model and messages, with optional max_tokens and stream
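If existing code already builds Google-native `contents`, one option is to convert it to OpenAI-style `messages` before sending. The sketch below is an assumption-laden illustration, not platform code: `contents_to_messages` is a hypothetical helper, it handles text parts only (non-text parts are silently dropped), and it maps the Google role `model` to the OpenAI role `assistant`.

```python
def contents_to_messages(contents: list, system_instruction: str = "") -> list:
    """Convert Google-native `contents` into OpenAI-style `messages`.

    Only text parts are kept; images and other part types are ignored.
    """
    role_map = {"user": "user", "model": "assistant"}
    messages = []
    if system_instruction:
        messages.append({"role": "system", "content": system_instruction})
    for item in contents:
        # Join every text part of this turn into a single string.
        text = "".join(part.get("text", "") for part in item.get("parts", []))
        role = role_map.get(item.get("role", "user"), "user")
        messages.append({"role": role, "content": text})
    return messages
```

The resulting list can be passed as `messages` to /v1/chat/completions together with a Gemini model name.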
Gemini text model curl example
curl https://token.qyuanai.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [
      {"role": "user", "content": "Reply with ok only."}
    ],
    "max_tokens": 64
  }'
Gemini image-capable model test example
On this platform, gemini-3.1-flash-image is currently best tested through the same chat/completions format first.
curl https://token.qyuanai.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3.1-flash-image",
    "messages": [
      {"role": "user", "content": "Reply with ok only."}
    ],
    "max_tokens": 64
  }'
Image generation
If you want direct image generation, do not call /v1/chat/completions. Use the dedicated /v1/images/generations endpoint instead.
Request body: model + prompt + size + n.
The recommended pure image model right now is gpt-image-2.
curl example
curl https://token.qyuanai.com/v1/images/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "A minimalist product poster featuring a blue cube, white background, studio lighting",
    "n": 1,
    "size": "1024x1024"
  }'
Python example
from openai import OpenAI

# Point the OpenAI SDK at the QYUAN AI gateway.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://token.qyuanai.com/v1",
)

result = client.images.generate(
    model="gpt-image-2",
    prompt="A minimalist product poster featuring a blue cube, white background, studio lighting",
    size="1024x1024",
    n=1,
)

# Print the first 80 characters of the base64-encoded image.
print(result.data[0].b64_json[:80])
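The Python example above only prints a prefix of the base64 string. To get a usable image file, the string has to be decoded first. A minimal sketch, assuming the response carries the image in `b64_json` as in the example; `save_b64_image` and the output file name are illustrative.

```python
import base64
from pathlib import Path


def save_b64_image(b64_json: str, path: str) -> int:
    """Decode a base64 image string, write it to `path`, return the byte count."""
    data = base64.b64decode(b64_json)
    Path(path).write_bytes(data)
    return len(data)
```

With the `result` object from the example above, usage would look like `save_b64_image(result.data[0].b64_json, "poster.png")`.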
Streaming
Text models currently support SSE streaming responses. When testing in a terminal, add -N.
curl -N https://token.qyuanai.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4-mini",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Output three lines: hello, qyuan, ai"}
    ]
  }'
The response is a standard data: {...} event stream and ends with [DONE].
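Consuming that event stream by hand comes down to reading `data:` lines until `[DONE]`. The sketch below assumes each event carries an OpenAI-style chunk with the text under `choices[].delta.content`; this document does not show the chunk body, so treat that shape as an assumption. `iter_stream_text` is an illustrative name.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines.

    Assumption: each event is `data: {json}` with OpenAI-style
    choices[].delta.content, and the stream ends with `data: [DONE]`.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                yield piece
```

The generator accepts any iterable of decoded lines, so it can be fed from an HTTP response or from a list of strings when testing.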
Responses API
If your SDK or app has already moved to OpenAI's newer unified interface, you can call /v1/responses directly.
Request body: model + input, plus optional max_output_tokens.
curl https://token.qyuanai.com/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4-mini",
    "input": "Reply with ok only.",
    "max_output_tokens": 32
  }'
If your existing code already uses chat.completions, there is no need to migrate to responses just because it is newer. Both are currently supported. Choose based on your existing code structure.
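One practical difference between the two interfaces is where the reply text lives. The sketch below assumes the gateway returns the OpenAI Responses shape, where text sits in `output[].content[]` parts of type `output_text`; the response body is not shown in this document, so verify against a real response. `response_text` is an illustrative name.

```python
def response_text(response: dict) -> str:
    """Collect the text from a Responses-style JSON body.

    Assumption: the body has an `output` list whose `message` items hold
    `content` parts of type `output_text`.
    """
    pieces = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip reasoning or tool items
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                pieces.append(part.get("text", ""))
    return "".join(pieces)
```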
Claude Messages
If your current Claude integration already follows the Anthropic format, you can keep using it here.
Request body: model + max_tokens + messages. The auth header must be x-api-key, not Authorization: Bearer.
curl https://token.qyuanai.com/v1/messages \
  -H "x-api-key: YOUR_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
We currently recommend claude-sonnet-4-5 or claude-sonnet-4-6 for most Claude usage.
If you prefer to keep one shared integration style, you can also call Claude models through OpenAI-compatible /v1/chat/completions.
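When moving between the two styles, the main structural difference is the system prompt: the Anthropic Messages format takes it as a top-level `system` field rather than as a message with the `system` role. The sketch below converts an OpenAI-style message list accordingly. It is an illustration only: `to_claude_payload` is a hypothetical helper, and whether this gateway forwards the `system` field unchanged is an assumption to verify.

```python
def to_claude_payload(model: str, messages: list, max_tokens: int = 1024) -> dict:
    """Turn OpenAI-style messages into a Claude Messages request body."""
    # Collect system prompts; the Anthropic format has no "system" role.
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    payload = {
        "model": model,
        "max_tokens": max_tokens,  # required in the Claude native format
        "messages": [m for m in messages if m["role"] != "system"],
    }
    if system_parts:
        payload["system"] = "\n\n".join(system_parts)
    return payload
```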
Referral rewards
To avoid ambiguity, the current referral reward policy is defined as follows:
Referral rewards are accumulated inside the console. In-site usage and withdrawal follow the rules below.
To withdraw, scan the QR code to add the WeCom support contact, Hanson. Once you reach the withdrawal threshold, support can help process the withdrawal.
Not recommended right now
To keep this documentation aligned with the platform's actual production-ready capabilities, the following are intentionally not covered here:
- /v1/embeddings: there is currently no active embedding channel.
- Audio, files, Assistants, fine-tuning, and other interfaces are not currently documented as primary external capabilities.
FAQ
1. What should I use as the Base URL?
For OpenAI-compatible SDKs, use https://token.qyuanai.com/v1.
2. Why do I get a “model not found” error?
This usually means the requested name is not in this platform's model list. Always trust the result of GET /v1/models, and do not reuse model names from other platforms.
3. Do Claude models have to use /v1/messages?
No. You can also use /v1/chat/completions as long as the model name is a Claude model.
4. Why can’t I copy Google-native Gemini examples directly?
Because Gemini is normalized into an OpenAI-compatible interface on this platform. The recommended path is /v1/chat/completions with messages.
5. Which endpoint should I use for image generation?
Use /v1/images/generations for pure image generation. The currently recommended model is gpt-image-2.
6. How can I verify that my API key works?
The simplest method is to call GET /v1/models. If it returns a model list, authentication is working.
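For question 2, a "model not found" error can often be caught before sending a request by comparing the requested name with the list returned by GET /v1/models. The sketch below is one possible approach: `check_model` is a hypothetical helper, and the suggestions come from Python's standard `difflib` fuzzy matching, not from the platform.

```python
import difflib


def check_model(name: str, available: list) -> tuple:
    """Return (is_available, suggestions) for a requested model name.

    `available` is the list of ids read from GET /v1/models.
    """
    if name in available:
        return True, []
    # Offer up to three similar names to help spot typos.
    suggestions = difflib.get_close_matches(name, available, n=3, cutoff=0.6)
    return False, suggestions
```

Running this check at startup turns a runtime API error into an early, readable message that names the closest valid models.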