AI Broker

A lightweight LLM request broker with an OpenAI-compatible API that intelligently routes requests to multiple LLM providers with cost-aware strategies.

Features

  • OpenAI-Compatible API - Drop-in replacement for OpenAI clients
  • Multi-Provider Support - OpenAI, OpenRouter, Groq, SambaNova
  • Smart Routing - Automatic model selection based on cost or quality
  • Cost Tracking - Per-request cost calculation and tracking
  • Streaming Support - Real-time streaming responses via SSE
  • MCP Broker - Aggregate tools from multiple MCP (Model Context Protocol) servers
  • Rate Limiting - Per-IP rate limiting with configurable limits
  • Audio APIs - Text-to-speech and speech-to-text support
  • Embeddings - Vector embedding generation

Project Structure

aibroker/
├── crates/
│   ├── llmbroker/          # Main server library and binary
│   ├── llmbroker-cli/      # Command-line interface
│   ├── mcp-common/         # Shared MCP utilities
│   ├── mcp-ping/           # MCP ping test server
│   ├── mcp-serpapi/        # SerpAPI search MCP server
│   ├── mcp-serper/         # Serper search MCP server
│   ├── mcp-exa/            # Exa search MCP server
│   ├── mcp-scraperapi/     # ScraperAPI MCP server
│   └── mcp-scrapfly/       # Scrapfly MCP server
├── modelsconfig.yml        # Model definitions and pricing
└── mcp_servers.json        # MCP server configuration

Quick Start

Prerequisites

  • Rust 1.70 or later
  • At least one LLM provider API key

Installation

# Clone the repository
git clone https://github.com/your-org/aibroker
cd aibroker

# Copy example configuration
cp .env.example .env
# Edit .env with your API keys

# Build and run
cargo run --release

The server will start on http://127.0.0.1:8080 by default.

Configuration

Configure the broker using environment variables or a .env file:

# Server settings
HOST=127.0.0.1
PORT=8080

# LLM Provider API Keys
OPENROUTER_API_KEY=sk-or-v1-...
GROQ_API_KEY=gsk_...
SAMBANOVA_API_KEY=...

# Routing strategy: "cheapest" or "best"
ROUTING_STRATEGY=cheapest

# MCP Configuration (optional)
MCP_CONFIG_PATH=mcp_servers.json

# MCP Search Tool API Keys (optional)
SERPER_API_KEY=...
SERPAPI_API_KEY=...
EXA_API_KEY=...
SCRAPERAPI_API_KEY=...
SCRAPFLY_API_KEY=...

API Reference

Chat Completions

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Text-to-Speech

curl http://localhost:8080/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello, world!",
    "voice": "alloy"
  }' \
  --output speech.mp3

Speech-to-Text

curl http://localhost:8080/v1/audio/transcriptions \
  -F "file=@audio.mp3" \
  -F "model=whisper-1"

Embeddings

curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Hello, world!"
  }'

List Models

curl http://localhost:8080/v1/models

MCP Tools

# List all available tools
curl http://localhost:8080/mcp/tools

# Call a specific tool
curl http://localhost:8080/mcp/tools/search \
  -H "Content-Type: application/json" \
  -d '{"query": "rust programming"}'

Client Examples

Python

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-needed"  # Key is configured on the server
)

response = client.chat.completions.create(
    model="gpt4o",
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print(response.choices[0].message.content)

Streaming

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-needed"
)

stream = client.chat.completions.create(
    model="gpt4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
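Each streamed chunk carries only a delta fragment, so the full reply is the concatenation of the non-empty `delta.content` values. A toy sketch with hard-coded chunk payloads (no server involved; the dict shapes mirror the OpenAI chunk format):

```python
# Simulated chat-completion chunk payloads, as they would arrive over SSE.
chunks = [
    {"choices": [{"delta": {"content": "Once"}}]},
    {"choices": [{"delta": {"content": " upon"}}]},
    {"choices": [{"delta": {"content": " a time"}}]},
    {"choices": [{"delta": {}}]},  # the final chunk typically has an empty delta
]

# Concatenate the non-empty content deltas into the full reply.
reply = "".join(c["choices"][0]["delta"].get("content", "") for c in chunks)
print(reply)  # → Once upon a time
```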

CLI Usage

# Interactive chat
cargo run --bin llmbroker-cli -- chat --model gpt4o

# List available models
cargo run --bin llmbroker-cli -- models

# List MCP tools
cargo run --bin llmbroker-cli -- tools

# Check server health
cargo run --bin llmbroker-cli -- health

Model Configuration

Models are configured in modelsconfig.yml:

models:
  gpt4o:
    display_name: "GPT-4o"
    tier: premium
    capabilities:
      - tool_calling
      - vision
    context_window: 128000
    backends:
      - provider: openai
        model_id: gpt-4o
        priority: 1
        input_cost: 2.5   # per million tokens
        output_cost: 10.0
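Since rates are quoted per million tokens, the per-request cost is a simple product. A minimal sketch of the arithmetic (the function name and signature are illustrative, not the broker's actual API):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_cost: float, output_cost: float) -> float:
    """Cost in USD, with input_cost/output_cost quoted per million tokens."""
    return (input_tokens * input_cost + output_tokens * output_cost) / 1_000_000

# Example: 1,200 prompt tokens and 350 completion tokens at the
# gpt4o rates above ($2.50 in / $10.00 out per million tokens).
cost = request_cost(1200, 350, input_cost=2.5, output_cost=10.0)
print(f"${cost:.6f}")  # → $0.006500
```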

Auto Model Selection

Use special model names for automatic selection:

Model Name     Description
auto           Use the configured routing strategy
autocheapest   Select the cheapest available model
autobest       Select the best premium model
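Conceptually, the cheapest strategy ranks candidates by token cost while the best strategy prefers premium-tier models. A hypothetical sketch of that resolution step (the catalog values and cost weighting are made up for illustration; the broker's real logic may differ):

```python
# Hypothetical model catalog: per-million-token rates and tiers.
CATALOG = {
    "gpt4o":       {"tier": "premium",  "input_cost": 2.5,  "output_cost": 10.0},
    "llama-70b":   {"tier": "standard", "input_cost": 0.59, "output_cost": 0.79},
    "llama-8b":    {"tier": "standard", "input_cost": 0.05, "output_cost": 0.08},
}

def resolve(model: str) -> str:
    """Map the auto* aliases onto a concrete model name."""
    if model == "autocheapest":
        # Rank by input + output rate; a real broker might weight these differently.
        return min(CATALOG, key=lambda m: CATALOG[m]["input_cost"] + CATALOG[m]["output_cost"])
    if model == "autobest":
        premium = [m for m, spec in CATALOG.items() if spec["tier"] == "premium"]
        return premium[0] if premium else model
    return model  # concrete model names pass through unchanged

print(resolve("autocheapest"))  # → llama-8b
```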

MCP Integration

The broker can aggregate tools from multiple MCP (Model Context Protocol) servers. Configure servers in mcp_servers.json:

{
  "mcpServers": [
    {
      "name": "search",
      "command": "cargo",
      "args": ["run", "--bin", "mcp-serper"],
      "transport": "stdio"
    },
    {
      "name": "scraper",
      "url": "http://localhost:3001/sse",
      "transport": "sse"
    }
  ]
}
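A quick way to sanity-check a config before pointing MCP_CONFIG_PATH at it is to parse it and list the declared servers. This sketch assumes only the mcpServers layout shown above:

```python
import json

# The example configuration from above, inlined for a self-contained check.
config_text = """
{
  "mcpServers": [
    {"name": "search", "command": "cargo",
     "args": ["run", "--bin", "mcp-serper"], "transport": "stdio"},
    {"name": "scraper", "url": "http://localhost:3001/sse", "transport": "sse"}
  ]
}
"""

config = json.loads(config_text)
for server in config["mcpServers"]:
    # stdio servers are spawned as subprocesses; sse servers are remote endpoints.
    target = server.get("command") or server.get("url")
    print(f"{server['name']}: {server['transport']} -> {target}")
```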

MCP Endpoints

Endpoint                Description
GET  /mcp/tools         List all aggregated tools
POST /mcp/tools/:name   Call a specific tool
GET  /mcp/sse           SSE endpoint for MCP clients

Included MCP Servers

  • mcp-serper - Web search via Serper API
  • mcp-serpapi - Web search via SerpAPI
  • mcp-exa - Semantic search via Exa
  • mcp-scraperapi - Web scraping via ScraperAPI
  • mcp-scrapfly - Web scraping via Scrapfly
  • mcp-ping - Simple ping server for testing

Architecture

┌─────────────────────────────────────────────────────┐
│                    API Layer                         │
│  (OpenAI-compatible endpoints: chat, tts, stt, etc) │
└─────────────────────────────────────────────────────┘
                          │
┌─────────────────────────────────────────────────────┐
│                  Service Layer                       │
│  (Routing logic, model selection, cost calculation) │
└─────────────────────────────────────────────────────┘
                          │
┌─────────────────────────────────────────────────────┐
│                 Registry Layer                       │
│  (Model catalog, backend resolution, pricing)       │
└─────────────────────────────────────────────────────┘
                          │
┌─────────────────────────────────────────────────────┐
│                 Provider Layer                       │
│  (OpenAI, Groq, SambaNova, OpenRouter adapters)     │
└─────────────────────────────────────────────────────┘
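The flow through these layers can be sketched as a chain of plain functions: the service layer looks a model up in the registry, resolves its backend, and dispatches to the matching provider adapter. All names and values here are illustrative, not the crate's actual types:

```python
# Registry layer: model metadata and pricing (illustrative values).
REGISTRY = {
    "gpt4o": {"provider": "openai", "model_id": "gpt-4o",
              "input_cost": 2.5, "output_cost": 10.0},
}

# Provider layer: one adapter per upstream API (stubbed here, no network).
def openai_adapter(model_id: str, messages: list) -> dict:
    return {"model": model_id,
            "content": f"[stubbed reply to {messages[-1]['content']!r}]"}

ADAPTERS = {"openai": openai_adapter}

# Service layer: resolve the model, dispatch, and attach pricing info.
def handle_chat(model: str, messages: list) -> dict:
    spec = REGISTRY[model]                 # registry lookup
    adapter = ADAPTERS[spec["provider"]]   # backend resolution
    response = adapter(spec["model_id"], messages)
    response["pricing"] = (spec["input_cost"], spec["output_cost"])
    return response

result = handle_chat("gpt4o", [{"role": "user", "content": "Hello!"}])
print(result["model"])  # → gpt-4o
```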

Development

Building

# Debug build
cargo build

# Release build
cargo build --release

# Build specific crate
cargo build -p llmbroker
cargo build -p llmbroker-cli

Running Tests

# Run all tests
cargo test

# Run tests for specific crate
cargo test -p llmbroker

Running Individual MCP Servers

# Run the Serper search server
SERPER_API_KEY=your-key cargo run --bin mcp-serper

# Run the ping test server
cargo run --bin mcp-ping

Documentation

Comprehensive documentation is available in the docs/ directory:

Document           Description
Architecture       System architecture and design principles
Technical Specs    Requirements and specifications
Component Design   Detailed component documentation
API Reference      Complete API documentation
MCP Integration    MCP tool integration guide
Data Flow          Request/response data flows
Deployment Guide   Production deployment guide

License

MIT License