Multi-tenant budget-based proxy server for OpenAI API with cost monitoring, usage tracking, and budget limiting per period. Built with Rust (Axum) backend and Next.js 15 frontend.
- Rust - High-performance, safe systems programming language
- Axum - Ergonomic web framework for Rust
- SQLx - Async SQL toolkit with compile-time query checking
- SQLite - Lightweight, serverless database
- JWT - Token-based authentication
- Next.js 15 - React framework with App Router
- TypeScript - Type-safe JavaScript
- shadcn/ui - Re-usable components built with Radix UI
- Tailwind CSS - Utility-first CSS framework
- Recharts - Composable charting library
- 💰 Budget-Based Limiting - Set budget limits per minute/hour/day/week/month/year
- 🏢 Multi-Tenant - Create multiple limiters for different clients/projects
- 🔒 Transparent Proxy - Works with any OpenAI SDK, just change baseURL
- 📊 Usage Tracking - Track requests, tokens, and cost per user
- 📈 Modern Dashboard - Beautiful React dashboard with shadcn/ui components
- 💸 Cost Monitoring - Automatic cost calculation for all OpenAI models
- 🎯 Per-Limiter Stats - Detailed statistics and budget usage per limiter
- 🔧 Easy Management - Create, edit, delete, and toggle limiters via UI
- 🗄️ SQLite Storage - Persistent storage with automatic migrations
- 🌐 Azure OpenAI Support - Works with both OpenAI and Azure OpenAI
- ⚡ High Performance - Built with Rust for maximum speed and safety
- 🔐 Secure Auth - JWT-based authentication system
- Rust 1.75 or higher
- Node.js 20 or higher
- Docker (optional, for containerized deployment)
cd backend
# Copy environment file
cp .env.example .env
# Edit .env with your configuration
# Important: Change JWT_SECRET and ADMIN_PASSWORD in production!
# Run the backend
cargo run
# Or with auto-reload (requires cargo-watch)
cargo watch -x run

Backend will be running at http://localhost:8000
cd frontend
# Install dependencies
npm install
# Copy environment file
cp .env.example .env.local
# Run the development server
npm run dev

Frontend will be running at http://localhost:3000
# Copy environment file
cp .env.example .env
# Edit .env with your configuration
# Important: Change JWT_SECRET and ADMIN_PASSWORD in production!
# Build and run with Docker Compose
docker-compose up -d
# View logs
docker-compose logs -f
# Stop containers
docker-compose down

Services:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- Health Check: http://localhost:8000/health
Access http://localhost:3000/login
Default credentials (change in production!):
- Username: admin
- Password: changeme
In the dashboard:
- Click "Create Limiter"
- Name: e.g., "Customer A - $20/day"
- Budget Limit: e.g., $20.00
- Period: day (or minute, hour, week, month, year)
- Base URL: https://api.openai.com (or Azure OpenAI URL)
You'll get a unique proxy endpoint like:
http://localhost:8000/v1/abc123def456
Just change the baseURL to your limiter's proxy endpoint:
from openai import OpenAI

client = OpenAI(
    api_key="your-openai-api-key",
    base_url="http://localhost:8000/v1/abc123def456"  # Your limiter endpoint
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your-openai-api-key',
  baseURL: 'http://localhost:8000/v1/abc123def456' // Your limiter endpoint
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }]
});

- View real-time statistics in the dashboard
- Check budget usage percentage
- See requests, tokens, and cost breakdown
- Monitor per-user statistics
When budget limit is exceeded, the proxy returns HTTP 429:
{
  "error": {
    "message": "Budget limit exceeded: $20.50/$20.00 per day. Current usage: 102.5%",
    "type": "429"
  }
}

- Create Limiters - Add new limiters with custom budgets and periods
- Responsive Grid - View all limiters in a responsive card layout
- Limiter Cards - Each card shows:
  - Active/Inactive status with toggle
  - Budget limit and period type
  - Proxy endpoint with copy button
  - Real-time statistics
- Quick Actions - Toggle, view stats, or delete limiters
- Stats Dialog - View detailed usage and budget information
- Overview Cards:
  - Total Requests
  - Total Tokens
  - Total Cost
  - Active Users
- Usage Table:
  - Detailed breakdown by limiter
  - Sortable columns
  - Cost and token metrics
All protected endpoints require a JWT token in the Authorization header:
Authorization: Bearer <your-jwt-token>
Get token by logging in:
POST /api/auth/login
Content-Type: application/json
{
  "username": "admin",
  "password": "changeme"
}

Response:
{
  "token": "eyJhbGc...",
  "username": "admin"
}

POST /api/limiters
Authorization: Bearer <token>
Content-Type: application/json
{
  "name": "Customer A - $20/day",
  "budget_limit": 20.00,
  "period_type": "day",
  "base_url": "https://api.openai.com"
}

GET /api/limiters
Authorization: Bearer <token>

GET /api/limiters/{limiter_id}/stats
Authorization: Bearer <token>

Response:
{
  "limiter_id": "uuid-here",
  "limiter_name": "Customer A",
  "current_spend": 5.25,
  "budget_limit": 20.00,
  "remaining_budget": 14.75,
  "usage_percentage": 26.25,
  "total_requests": 150,
  "period_start": "2025-10-10T00:00:00Z",
  "period_end": "2025-10-11T00:00:00Z"
}

PUT /api/limiters/{limiter_id}
Authorization: Bearer <token>
Content-Type: application/json
{
  "name": "Customer A - $30/day",
  "budget_limit": 30.00,
  "is_active": true
}

DELETE /api/limiters/{limiter_id}
Authorization: Bearer <token>

GET /api/stats
Authorization: Bearer <token>

Budget limits can be set for different periods:
| Period | Description | Resets |
|---|---|---|
| minute | Per minute | Every minute |
| hour | Per hour | Every hour |
| day | Per day | Every day at midnight UTC |
| week | Per week | Every Monday at midnight UTC |
| month | Per month | First day of month |
| year | Per year | January 1st |
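
The reset rules in the table can be expressed as a function that maps a timestamp to the start of its budget period. The following is a minimal Python sketch, not the backend's Rust implementation. The name `period_start` is hypothetical, and it assumes that month and year periods also begin at midnight UTC, which the table does not state explicitly.

```python
from datetime import datetime, timedelta, timezone


def period_start(now: datetime, period_type: str) -> datetime:
    """Return the UTC start of the budget period that contains `now`."""
    now = now.astimezone(timezone.utc)
    if period_type == "minute":
        return now.replace(second=0, microsecond=0)
    if period_type == "hour":
        return now.replace(minute=0, second=0, microsecond=0)

    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if period_type == "day":
        return midnight
    if period_type == "week":
        # weekday() is 0 for Monday, so this steps back to Monday 00:00 UTC.
        return midnight - timedelta(days=midnight.weekday())
    if period_type == "month":
        # Assumption: months start at midnight UTC on the first day.
        return midnight.replace(day=1)
    if period_type == "year":
        # Assumption: years start at midnight UTC on January 1st.
        return midnight.replace(month=1, day=1)
    raise ValueError(f"unknown period type: {period_type}")
```

Usage recorded before `period_start(now, period_type)` would belong to an earlier period and would not count toward the current budget.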
Automatic cost calculation for OpenAI models:
| Model | Prompt (per 1K tokens) | Completion (per 1K tokens) |
|---|---|---|
| gpt-4o | $0.005 | $0.015 |
| gpt-4o-mini | $0.00015 | $0.0006 |
| gpt-4-turbo | $0.01 | $0.03 |
| gpt-4 | $0.03 | $0.06 |
| gpt-3.5-turbo | $0.0005 | $0.0015 |
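
To make the arithmetic concrete, here is a small Python sketch of how a per-request cost and a budget usage percentage follow from the table. The function names are hypothetical and the prices are copied from the table above. The actual calculation lives in the backend's pricing.rs and may differ in detail.

```python
# USD per 1K tokens, copied from the pricing table: (prompt, completion).
PRICES_PER_1K = {
    "gpt-4o": (0.005, 0.015),
    "gpt-4o-mini": (0.00015, 0.0006),
    "gpt-4-turbo": (0.01, 0.03),
    "gpt-4": (0.03, 0.06),
    "gpt-3.5-turbo": (0.0005, 0.0015),
}


def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Cost in USD of one request, priced per 1K tokens."""
    prompt_price, completion_price = PRICES_PER_1K[model]
    return (prompt_tokens / 1000) * prompt_price + (
        completion_tokens / 1000
    ) * completion_price


def usage_percentage(current_spend: float, budget_limit: float) -> float:
    """Share of the budget already spent in the current period, in percent."""
    return current_spend / budget_limit * 100


# 1,000 prompt tokens and 500 completion tokens on gpt-4o:
# 1.0 * 0.005 + 0.5 * 0.015 = 0.0125 USD
print(round(request_cost("gpt-4o", 1000, 500), 6))
```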
┌──────────────┐
│    Client    │
│ (OpenAI SDK) │
└──────┬───────┘
       │
       ▼
┌─────────────────────────────────────┐
│         Rust Backend (Axum)         │
│                                     │
│  ┌───────────────────────────────┐  │
│  │  /v1/{limiter_code}/*         │  │
│  │  - Validate limiter           │  │
│  │  - Check budget limit         │  │
│  │  - Forward to OpenAI API      │  │
│  │  - Track usage & cost         │  │
│  └───────────────────────────────┘  │
│                                     │
│  ┌───────────────────────────────┐  │
│  │  Management API               │  │
│  │  - JWT Authentication         │  │
│  │  - Limiter CRUD               │  │
│  │  - Statistics endpoints       │  │
│  └───────────────────────────────┘  │
└─────────────┬───────────────────────┘
              │
              ▼
       ┌──────────────┐
       │  SQLite DB   │
       │  - Limiters  │
       │  - Usage     │
       └──────────────┘

┌─────────────────────────────────────┐
│          Next.js Frontend           │
│                                     │
│  ┌───────────────────────────────┐  │
│  │  /login                       │  │
│  │  - JWT authentication         │  │
│  └───────────────────────────────┘  │
│                                     │
│  ┌───────────────────────────────┐  │
│  │  / (Dashboard)                │  │
│  │  - Limiter management         │  │
│  │  - Real-time stats            │  │
│  └───────────────────────────────┘  │
│                                     │
│  ┌───────────────────────────────┐  │
│  │  /statistics                  │  │
│  │  - Usage overview             │  │
│  │  - Cost breakdown             │  │
│  └───────────────────────────────┘  │
└─────────────────────────────────────┘
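
The backend's proxy path in the diagram (validate limiter, check budget, forward, track usage) could be sketched roughly as below. This is an illustrative Python model, not the Rust code in handlers/proxy.rs. The `Limiter` class, `check_request`, and `record_usage` are hypothetical names. Two details are assumptions: that an unknown or inactive limiter yields a 404, and that a request is blocked once spend reaches the limit (`>=`). Only the 429 body format is taken from the README.

```python
from dataclasses import dataclass


@dataclass
class Limiter:
    code: str
    budget_limit: float
    period_type: str
    is_active: bool = True
    current_spend: float = 0.0  # spend accumulated in the current period


def check_request(limiters: dict, code: str):
    """Return (status, body) if the request is blocked, or None to forward it."""
    limiter = limiters.get(code)
    if limiter is None or not limiter.is_active:
        # Assumption: the README does not say what the proxy returns here.
        return 404, {"error": {"message": "Limiter not found or inactive", "type": "404"}}
    if limiter.current_spend >= limiter.budget_limit:
        pct = limiter.current_spend / limiter.budget_limit * 100
        message = (
            f"Budget limit exceeded: ${limiter.current_spend:.2f}/"
            f"${limiter.budget_limit:.2f} per {limiter.period_type}. "
            f"Current usage: {pct:.1f}%"
        )
        return 429, {"error": {"message": message, "type": "429"}}
    return None  # within budget: forward to the upstream API


def record_usage(limiter: Limiter, cost: float) -> None:
    """After the upstream response, add the request's cost to the period spend."""
    limiter.current_spend += cost
```

Because the cost of a request is only known after the upstream response, spend can end a period slightly above the limit, which is consistent with the $20.50/$20.00 figure in the sample 429 message.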
gpt-usage-limiter/
├── backend/                        # Rust backend
│   ├── src/
│   │   ├── main.rs                 # Application entry point
│   │   ├── config.rs               # Configuration management
│   │   ├── db.rs                   # Database layer
│   │   ├── error.rs                # Error handling
│   │   ├── models.rs               # Data models
│   │   ├── handlers/               # HTTP handlers
│   │   │   ├── auth.rs             # Authentication
│   │   │   ├── limiters.rs         # Limiter management
│   │   │   ├── proxy.rs            # OpenAI proxy
│   │   │   └── stats.rs            # Statistics
│   │   ├── services/               # Business logic
│   │   │   ├── limiter.rs          # Limiter service
│   │   │   ├── proxy.rs            # Proxy service
│   │   │   ├── pricing.rs          # Pricing calculations
│   │   │   └── stats.rs            # Statistics service
│   │   └── middleware/             # Middleware
│   │       └── auth.rs             # JWT middleware
│   ├── migrations/                 # Database migrations
│   ├── Cargo.toml                  # Rust dependencies
│   ├── Dockerfile                  # Backend Docker image
│   └── .env.example                # Environment variables template
│
├── frontend/                       # Next.js frontend
│   ├── app/
│   │   ├── (dashboard)/            # Dashboard layout group
│   │   │   ├── layout.tsx          # Dashboard layout
│   │   │   ├── page.tsx            # Limiters page
│   │   │   └── statistics/
│   │   │       └── page.tsx        # Statistics page
│   │   ├── login/
│   │   │   └── page.tsx            # Login page
│   │   ├── layout.tsx              # Root layout
│   │   └── globals.css             # Global styles
│   ├── components/
│   │   └── ui/                     # shadcn/ui components
│   ├── lib/
│   │   ├── api.ts                  # API client
│   │   └── utils.ts                # Utility functions
│   ├── package.json                # Node dependencies
│   ├── Dockerfile                  # Frontend Docker image
│   └── .env.example                # Environment variables template
│
├── docker-compose.yml              # Docker Compose configuration
├── .env.example                    # Root environment template
└── README.md                       # This file
Create separate limiters for each customer with different budget tiers:
- Basic: $10/month
- Pro: $50/month
- Enterprise: $500/month
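
As a rough illustration, the tiers above could be turned into request bodies for `POST /api/limiters`. The field names come from the API example in this README. The `TIERS` mapping, the `limiter_payload` helper, and the naming scheme are hypothetical.

```python
import json

# Monthly budgets in USD for the tiers listed above.
TIERS = {"Basic": 10.00, "Pro": 50.00, "Enterprise": 500.00}


def limiter_payload(customer: str, tier: str,
                    base_url: str = "https://api.openai.com") -> dict:
    """Build the JSON body for creating one customer's limiter."""
    budget = TIERS[tier]
    return {
        "name": f"{customer} - {tier} (${budget:.0f}/month)",
        "budget_limit": budget,
        "period_type": "month",
        "base_url": base_url,
    }


print(json.dumps(limiter_payload("Customer A", "Pro"), indent=2))
```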
Separate limiters for different departments in your organization:
- Marketing: $100/week
- Engineering: $500/week
- Support: $200/week
Create limiters for different projects with independent budgets:
- Project Alpha: $1000/month
- Project Beta: $500/month
- POC Projects: $50/month
Create limiters with small budgets for free tier users:
- Free User: $1/day
- Prevent abuse while offering free tier
- Change JWT_SECRET to a strong random value
- Change ADMIN_PASSWORD to a secure password
- Use HTTPS with reverse proxy (nginx/traefik/Caddy)
- Set up database backups for SQLite file
- Configure proper CORS settings
- Enable rate limiting on API endpoints
- Set up monitoring and alerting
- Configure log aggregation
- Use secrets manager for sensitive values
- Set up automated backups
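
For the first checklist item, one way to produce a strong random JWT_SECRET is Python's standard `secrets` module. This is only a sketch: `generate_secret` is a hypothetical helper, and any cryptographically secure generator of sufficient length would serve equally well.

```python
import secrets


def generate_secret(num_bytes: int = 48) -> str:
    """Return a URL-safe random string built from `num_bytes` random bytes."""
    return secrets.token_urlsafe(num_bytes)


# 48 random bytes encode to 64 characters, above the 32-character
# minimum suggested by the placeholder value in .env.example.
print(f"JWT_SECRET={generate_secret()}")
```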
# Server
HOST=0.0.0.0
PORT=8000
DATABASE_URL=sqlite:usage_data.db
# Security
JWT_SECRET=your-super-secret-key-min-32-characters
JWT_EXPIRE_HOURS=24
ADMIN_USERNAME=admin
ADMIN_PASSWORD=strong-password-here
# OpenAI
OPENAI_API_BASE_URL=https://api.openai.com

NEXT_PUBLIC_API_URL=http://localhost:8000

server {
    listen 80;
    server_name your-domain.com;
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name your-domain.com;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    # Frontend
    location / {
        proxy_pass http://localhost:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }

    # Backend API
    location /api {
        proxy_pass http://localhost:8000;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # Proxy endpoints
    location /v1 {
        proxy_pass http://localhost:8000;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

cd backend
# Run with auto-reload
cargo watch -x run
# Run tests
cargo test
# Check code
cargo clippy
# Format code
cargo fmt
# Build release
cargo build --release

cd frontend
# Run development server
npm run dev
# Build for production
npm run build
# Start production server
npm start
# Lint code
npm run lint

- Check that limiter is active (not deactivated)
- Verify period type is correct
- Check if period has rolled over (e.g., new day started)
- Review SQLite database for usage records
- Verify limiter code in URL is correct
- Check if limiter was deleted
- Ensure limiter is active
- Check backend logs for errors
- Verify JWT_SECRET is set correctly
- Check token expiration (default 24 hours)
- Clear browser localStorage and login again
- Check browser console for CORS errors
- Verify model name matches OpenAI's naming
- Check pricing configuration in pricing.rs
- Review usage records in database
- Check logs: docker-compose logs backend
- Verify environment variables are set
- Ensure ports 3000 and 8000 are not in use
- Check disk space for SQLite database
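
For the token expiration check in the authentication items above, a JWT's payload can be inspected without the secret. The sketch below decodes the payload only and does not verify the signature, so it is a debugging aid, not an authentication check. `jwt_expiry` is a hypothetical helper, and it assumes the backend issues tokens with the standard `exp` claim (seconds since the Unix epoch).

```python
import base64
import json
from datetime import datetime, timezone


def jwt_expiry(token: str):
    """Return the token's `exp` claim as a UTC datetime, or None if absent.

    The signature is NOT verified; this only reads the payload segment.
    """
    payload_b64 = token.split(".")[1]
    # JWT segments drop base64 padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    exp = payload.get("exp")
    if exp is None:
        return None
    return datetime.fromtimestamp(exp, tz=timezone.utc)
```

Comparing the result with `datetime.now(timezone.utc)` shows whether a token copied from browser localStorage has already expired.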
- Backend: Built with Rust for maximum performance and memory safety
- Database: SQLite with optimized indexes for fast queries
- Frontend: Next.js 15 with Server Components for optimal loading
- API: Async Rust with Axum for high concurrency
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Write tests if applicable
- Submit a pull request
MIT License
If you encounter issues or have questions:
- Open a GitHub issue
- Review server logs for errors
- Check the troubleshooting section
High-performance OpenAI API budget management built with Rust and Next.js