Mantis
A self-hosted LLM gateway for routing, caching, guardrails, and observability across model providers.
One API
Stable chat completions endpoint in front of multiple model targets.
AWS-native
Deployable with Terraform, ECS, ElastiCache, Bedrock, and CloudWatch.
Policy driven
Routing, retry, fallback, timeout, cooldown, and cache behavior live in config.
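As a rough illustration of what keeping policy in config can look like, the sketch below loads a parsed config mapping into a typed policy object and checks a cooldown window. The class name, field names, and defaults are hypothetical and are not taken from the Mantis config schema.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass(frozen=True)
class RoutePolicy:
    """Hypothetical policy shape; field names are illustrative only."""
    targets: Tuple[str, ...]
    max_retries: int = 2
    timeout_s: float = 30.0
    cooldown_s: float = 60.0
    cache_ttl_s: float = 0.0  # 0 means "do not cache" in this sketch


def load_policy(raw: dict) -> RoutePolicy:
    """Build a policy from a parsed config mapping, rejecting unknown keys."""
    allowed = {f.name for f in fields(RoutePolicy)}
    unknown = set(raw) - allowed
    if unknown:
        raise ValueError(f"unknown policy keys: {sorted(unknown)}")
    data = dict(raw)
    data["targets"] = tuple(data.get("targets", ()))
    if not data["targets"]:
        raise ValueError("policy needs at least one target")
    return RoutePolicy(**data)


def in_cooldown(last_failure_at: Optional[float], now: float,
                policy: RoutePolicy) -> bool:
    """True while a target that failed recently should still be skipped."""
    if last_failure_at is None:
        return False
    return now - last_failure_at < policy.cooldown_s
```

Because the behavior lives in data, changing a timeout or cooldown in this sketch means editing the mapping passed to `load_policy`, not the calling code.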
What Mantis Provides
Configurable Routing
Route requests by metadata, model aliases, weighted targets, and fallback chains.
Gateway Orchestration
Coordinate validation, cache checks, cooldowns, provider calls, retries, and terminal responses.
OpenAI-style API Surface
Send chat completion requests through a single gateway endpoint with optional routing metadata.
Python SDK
Call Mantis from application code without manually constructing every HTTP request.
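The routing and orchestration ideas above (weighted targets, then a fallback chain when a provider call fails) can be sketched in a few lines. This is a minimal illustration of the general technique, not Mantis's implementation: the function names and the `call` callback are hypothetical, and retries, cooldowns, and caching are left out.

```python
import random
from typing import Callable, List, Sequence, Tuple


def order_targets(weighted: Sequence[Tuple[str, float]],
                  fallbacks: Sequence[str],
                  rng: random.Random) -> List[str]:
    """Pick one primary target by weight, then append the fallback chain."""
    names = [name for name, _ in weighted]
    weights = [weight for _, weight in weighted]
    primary = rng.choices(names, weights=weights, k=1)[0]
    chain = [primary]
    for name in fallbacks:
        if name not in chain:  # do not try the same target twice
            chain.append(name)
    return chain


def call_with_fallback(chain: Sequence[str],
                       call: Callable[[str], str]) -> Tuple[str, str]:
    """Try each target in order; return (target, response) on first success."""
    failed = []
    for name in chain:
        try:
            return name, call(name)
        except Exception:
            failed.append(name)
    # Terminal response: every target in the chain failed.
    raise RuntimeError(f"all targets failed: {failed}")
```

A caller would pass its own provider function as `call`; passing a seeded `random.Random` keeps the weighted choice reproducible in tests.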
Next Steps
Start with the quick start to run or deploy the gateway, then read the case study for the project background and design decisions.