🎛️ LLM Router Admin Panel
💾 Save Configuration
📤 Export
📥 Import
💬 Back to Chat
🤖 Model Controls
📝 Chat Templates
⚙️ System Instructions
🎚️ Parameters
📊 Monitoring
Model Configuration
Active Model: Auto-Select (Router)
🔄 Refresh
Model Information
Select a model to view details
Generation Parameters
Temperature: 0.7 (controls randomness; 0 = deterministic, 2 = very random)
Max Tokens: 500 (maximum response length in tokens)
Top P: 0.9 (nucleus sampling threshold)
Top K: 40 (limits vocabulary to the top K tokens)
Repetition Penalty: 1.1 (penalizes repeated tokens)
Context Size: 2048 (model context window size)
Routing Strategy
Balanced
Quality First
Speed Priority
Cost Optimized
Random
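Each routing strategy is essentially a different ranking over the same pool of models. The sketch below shows one way a router could pick a model per strategy; `routeModel`, the `ModelInfo` fields, and the balanced-score weights are all invented for illustration.

```typescript
type RoutingStrategy =
  | "balanced"
  | "quality-first"
  | "speed-priority"
  | "cost-optimized"
  | "random";

// Illustrative model metadata; a real router would track its own metrics.
interface ModelInfo {
  name: string;
  quality: number;          // higher is better
  latencyMs: number;        // lower is better
  costPer1kTokens: number;  // lower is better
}

// Pick a model according to the selected strategy.
function routeModel(models: ModelInfo[], strategy: RoutingStrategy): ModelInfo {
  switch (strategy) {
    case "quality-first":
      return models.reduce((a, b) => (b.quality > a.quality ? b : a));
    case "speed-priority":
      return models.reduce((a, b) => (b.latencyMs < a.latencyMs ? b : a));
    case "cost-optimized":
      return models.reduce((a, b) => (b.costPer1kTokens < a.costPer1kTokens ? b : a));
    case "random":
      return models[Math.floor(Math.random() * models.length)];
    case "balanced":
    default: {
      // Trade quality off against latency and cost; the weights here are arbitrary.
      const score = (m: ModelInfo) =>
        m.quality - 0.01 * m.latencyMs - 10 * m.costPer1kTokens;
      return models.reduce((a, b) => (score(b) > score(a) ? b : a));
    }
  }
}
```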
Chat Templates
Active Template:
Default
Alpaca
ChatML
Llama
Vicuna
Custom
Template Format
System Prefix:
User Prefix:
Assistant Prefix:
Message Separator:
Stop Sequences (comma-separated):
Preview:
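The prefix, separator, and stop-sequence fields above are enough to assemble the full prompt a model sees. Below is a sketch of a ChatML-style template applied that way, assuming a hypothetical `ChatTemplate` shape and `buildPrompt` helper; the template strings the panel actually stores may differ.

```typescript
// Hypothetical template definition matching the fields above.
interface ChatTemplate {
  systemPrefix: string;
  userPrefix: string;
  assistantPrefix: string;
  separator: string;
  stopSequences: string[];
}

// Example values for a ChatML-style template (illustrative only).
const chatml: ChatTemplate = {
  systemPrefix: "<|im_start|>system\n",
  userPrefix: "<|im_start|>user\n",
  assistantPrefix: "<|im_start|>assistant\n",
  separator: "<|im_end|>\n",
  stopSequences: ["<|im_end|>"],
};

// Build the prompt the model receives, ending with an open assistant turn.
function buildPrompt(t: ChatTemplate, system: string, user: string): string {
  return (
    t.systemPrefix + system + t.separator +
    t.userPrefix + user + t.separator +
    t.assistantPrefix
  );
}
```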
Saved Templates
Save Current Template
System Instructions
System Prompt:
This prompt is prepended to every conversation
Preset Instructions
🤝 Helpful Assistant
🎨 Creative Writer
💻 Technical Expert
📚 Teacher
📊 Data Analyst
🌍 Translator
Behavior Settings
Enable System Prompt
Maintain Conversation Context
Stream Responses
Show Token Count
Enable Response Cache
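As the help text above notes, the system prompt is prepended to every conversation, and these toggles gate that and related request behavior. A minimal sketch follows, assuming a simple message-array request format; the `BehaviorSettings` names mirror the checkboxes but are otherwise invented.

```typescript
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical behavior settings matching the checkboxes above.
interface BehaviorSettings {
  enableSystemPrompt: boolean;
  maintainContext: boolean;
  streamResponses: boolean;
  showTokenCount: boolean;
  enableResponseCache: boolean;
}

// Prepend the system prompt (when enabled) and drop prior turns
// when conversation context is not maintained.
function buildMessages(
  settings: BehaviorSettings,
  systemPrompt: string,
  history: Message[],
  userInput: string,
): Message[] {
  const prior = settings.maintainContext ? history : [];
  const messages: Message[] = [...prior, { role: "user", content: userInput }];
  if (settings.enableSystemPrompt && systemPrompt) {
    messages.unshift({ role: "system", content: systemPrompt });
  }
  return messages;
}
```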
Advanced Parameters
Sampling Parameters
Min Length:
Presence Penalty:
Frequency Penalty:
Seed (for reproducibility):
Performance Settings
Batch Size:
Threads:
GPU Layers:
Timeout (ms):
Memory Management
Cache Size (MB):
Cache TTL (seconds):
Max Concurrent:
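Taken together, the sampling, performance, and memory fields form the router's advanced configuration. One possible serialization is sketched below; every field name is an assumption about how these settings might be stored.

```typescript
// Hypothetical serialization of the Advanced Parameters tab.
interface AdvancedConfig {
  sampling: {
    minLength: number;        // minimum tokens to generate
    presencePenalty: number;  // penalize tokens that have already appeared
    frequencyPenalty: number; // penalize tokens in proportion to their count
    seed?: number;            // fixed seed for reproducible sampling
  };
  performance: {
    batchSize: number;
    threads: number;
    gpuLayers: number;        // layers offloaded to the GPU
    timeoutMs: number;
  };
  memory: {
    cacheSizeMb: number;
    cacheTtlSeconds: number;
    maxConcurrent: number;    // assumed to mean concurrent requests
  };
}
```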
System Monitoring
Model Status: Loading...
Memory Usage: 0 MB
Requests Processed: 0
Average Latency: 0 ms
Performance History
System Logs
Clear Logs
Download Logs
Auto-scroll
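The status cards and log view lend themselves to periodic polling. The sketch below assumes a hypothetical `/api/stats` endpoint returning the four metrics shown above; the endpoint, response fields, and refresh interval are illustrative only.

```typescript
// Hypothetical response shape for the monitoring cards above.
interface RouterStats {
  modelStatus: string;       // e.g. "loaded", "loading"
  memoryUsageMb: number;
  requestsProcessed: number;
  averageLatencyMs: number;
}

// Poll a stats endpoint and hand each snapshot to the UI.
async function pollStats(
  onUpdate: (s: RouterStats) => void,
  intervalMs = 5000,
): Promise<void> {
  for (;;) {
    try {
      const res = await fetch("/api/stats"); // assumed endpoint
      if (res.ok) onUpdate((await res.json()) as RouterStats);
    } catch {
      // Network errors are ignored here; a real panel would surface them in the logs.
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```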