Chat Completions
The gateway exposes:

    POST /v1/chat/completions

All requests require:

    Authorization: Bearer gw_<token-id>_<random-secret>

Request Body

    {
      "messages": [
        { "role": "user", "content": "Write a short project summary." }
      ],
      "stream": false,
      "temperature": 0.5,
      "max_tokens": 256,
      "system": "Answer clearly and concisely."
    }

Validation rules:
- messages is required and must contain at least one message.
- Message role must be user or assistant.
- Message content must be a non-empty string.
- Extra top-level fields are rejected.
- stream defaults to false.
- temperature can be omitted, null, or a number from 0.0 to 2.0.
- max_tokens can be omitted, null, or an integer greater than 0.
- system can be omitted, null, or a non-empty string.
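
The rules above can be mirrored in a client-side check before a request is sent. The following is a minimal sketch, not the gateway's own validator: the function name validate_request and the error strings are hypothetical. It also assumes that stream, when present, must be a boolean, and that "non-empty" means any string other than "", which the rules do not spell out.

```python
from typing import Any

# Top-level fields named in the request body section; anything else is rejected.
ALLOWED_FIELDS = {"messages", "stream", "temperature", "max_tokens", "system"}
ALLOWED_ROLES = ("user", "assistant")


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_request(body: dict[str, Any]) -> list[str]:
    """Return a list of rule violations; an empty list means the body passes."""
    errors: list[str] = []

    extra = sorted(set(body) - ALLOWED_FIELDS)
    if extra:
        errors.append(f"unexpected top-level fields: {extra}")

    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        errors.append("messages is required and must contain at least one message")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict):
                errors.append(f"messages[{i}] must be an object")
                continue
            if msg.get("role") not in ALLOWED_ROLES:
                errors.append(f"messages[{i}].role must be user or assistant")
            content = msg.get("content")
            if not isinstance(content, str) or not content:
                errors.append(f"messages[{i}].content must be a non-empty string")

    # Assumption: stream must be a boolean when present (it defaults to false).
    if not isinstance(body.get("stream", False), bool):
        errors.append("stream must be a boolean")

    temperature = body.get("temperature")
    if temperature is not None and not (
        _is_number(temperature) and 0.0 <= temperature <= 2.0
    ):
        errors.append("temperature must be null or a number from 0.0 to 2.0")

    max_tokens = body.get("max_tokens")
    if max_tokens is not None and not (
        isinstance(max_tokens, int)
        and not isinstance(max_tokens, bool)
        and max_tokens > 0
    ):
        errors.append("max_tokens must be null or an integer greater than 0")

    system = body.get("system")
    if system is not None and not (isinstance(system, str) and system):
        errors.append("system must be null or a non-empty string")

    return errors
```

Returning every violation at once, rather than raising on the first, lets a caller report all problems in a single pass. The gateway remains the authority on what it accepts.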
Routing Metadata
Pass routing metadata in the metadata header as a JSON object with string keys and string values:

    metadata: {"task-type":"summarization"}

The gateway uses this object to match routing rules.
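
As an illustration of building that header value in code, the sketch below serializes a dictionary and rejects non-string keys or values up front. The helper name build_metadata_header is hypothetical, and the compact separators are a stylistic choice rather than a documented requirement.

```python
import json


def build_metadata_header(metadata: dict[str, str]) -> str:
    """Serialize routing metadata for the metadata request header."""
    for key, value in metadata.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("routing metadata keys and values must be strings")
    # Compact separators keep the value on one line with no extra spaces;
    # json.dumps escapes non-ASCII characters by default, which suits a header.
    return json.dumps(metadata, separators=(",", ":"))


headers = {
    "Content-Type": "application/json",
    "metadata": build_metadata_header({"task-type": "summarization"}),
}
```

The resulting headers dictionary can be passed to whichever HTTP client is in use.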
Non-streaming Example

    curl https://gateway.example.com/v1/chat/completions \
      -H 'Authorization: Bearer gw_token-id_token-secret' \
      -H 'Content-Type: application/json' \
      -H 'metadata: {"task-type":"summarization"}' \
      -d '{
        "messages": [
          { "role": "user", "content": "Write a short project summary." }
        ],
        "stream": false,
        "temperature": 0.5,
        "max_tokens": 256
      }'

Streaming Example

    {
      "messages": [
        { "role": "user", "content": "Write a short project summary." }
      ],
      "stream": true
    }

When stream is true, Mantis returns text chunks as they are received from the model provider.
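
One way a client might consume such a response is to read the body incrementally and decode it as it arrives. The sketch below assumes the chunks are plain UTF-8 text in the response body. This page does not specify the wire format, so treat that as an assumption and adapt the code if the gateway frames chunks differently. The function name iter_text_chunks is hypothetical, and an in-memory buffer stands in for the network response.

```python
import codecs
import io
from typing import BinaryIO, Iterator


def iter_text_chunks(response: BinaryIO, chunk_size: int = 1024) -> Iterator[str]:
    """Yield decoded text as it is read from a file-like response body."""
    # An incremental decoder handles multi-byte characters split across reads.
    decoder = codecs.getincrementaldecoder("utf-8")()
    while True:
        data = response.read(chunk_size)
        if not data:
            break
        text = decoder.decode(data)
        if text:
            yield text
    # Flush the decoder; this raises if the body ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Stand-in for a network response; the tiny chunk size splits "é" across reads.
fake_body = io.BytesIO("résumé ready".encode("utf-8"))
for chunk in iter_text_chunks(fake_body, chunk_size=2):
    print(chunk, end="", flush=True)
print()
```

Any object with a read method that returns bytes can be passed in, including the response object returned by urllib.request.urlopen.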
Health
The load balancer health endpoint is public:

    GET /health

It returns:

    {"status":"ok"}
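
A monitoring script could interpret that response with a small helper like the one below. This is a sketch only: the name is_healthy is hypothetical, and the requirement of a 200 status code is an assumption, since this page documents only the response body.

```python
import json


def is_healthy(status_code: int, body: bytes) -> bool:
    """Return True only for a 200 response whose JSON body has status "ok"."""
    # Assumption: a healthy gateway answers with HTTP 200.
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers malformed JSON as well as undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"
```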