# Local LLM Performance Testing Framework
A comprehensive framework for comparing the performance and output quality of different locally hosted LLMs on coding tasks.
## Features
- **Performance Metrics**: Response time, tokens per second, and CPU/memory/GPU usage
- **Quality Assessment**: Automated evaluation of generated code quality
- **Multi-Model Testing**: Compare multiple LLMs in a single run
- **Parallel Execution**: Run tests concurrently to reduce total runtime
- **Comprehensive Reporting**: Detailed per-test results and cross-model comparisons
## Installation
```bash
# Install required dependencies
pip install psutil
```
## Usage
### Quick Start
1. Create a test suite and add models:
```python
from llm_test_framework import TestSuite, MockLLM

# Create test suite
suite = TestSuite()

# Add models to test
suite.add_model(MockLLM("gpt4all", {"model_path": "/path/to/model"}))
suite.add_model(MockLLM("llama.cpp", {"model_path": "/path/to/llama"}))

# Define test tasks
tasks = [
    {
        "name": "Simple function",
        "prompt": "Write a Python function to add two numbers",
        "expected_output": "return a + b",
    }
]

# Run tests
suite.run_all_tests(tasks)
suite.save_results()
```
### Running the Example
```bash
python llm_test_framework.py
```
This runs the bundled sample tests against mock models and saves the results to a JSON file.
## Components
### 1. TestSuite
Main class managing the testing process, including:
- Model registration
- Test execution
- Result collection and saving
### 2. LLMInterface
Abstract base class for LLM implementations with methods:
- `generate()`: Generate text from prompt
- `get_model_info()`: Get model information
### 3. TestResult
Data class storing individual test results with:
- Performance metrics
- Quality scores
- Raw outputs
- Success status
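As a hedged sketch (the exact field names in `llm_test_framework` may differ), a result record of this shape can be modeled as a dataclass:

```python
import time
from dataclasses import dataclass, field, asdict

@dataclass
class TestResult:
    """Illustrative per-test record; field names are assumptions, not the
    framework's actual schema."""
    model_name: str
    task_name: str
    response_time_s: float      # wall-clock time of the generation call
    tokens_per_second: float    # throughput estimate
    cpu_percent: float          # resource usage sampled during the run
    memory_mb: float
    quality_score: float        # automated quality estimate, 0.0-1.0
    raw_output: str             # unmodified model output
    success: bool               # generation completed without error
    timestamp: float = field(default_factory=time.time)

result = TestResult(
    model_name="llama.cpp",
    task_name="Simple function",
    response_time_s=1.42,
    tokens_per_second=38.5,
    cpu_percent=61.0,
    memory_mb=2048.0,
    quality_score=0.9,
    raw_output="def add(a, b):\n    return a + b",
    success=True,
)
print(asdict(result)["model_name"])  # prints "llama.cpp"
```

Using a dataclass keeps results trivially serializable: `asdict()` produces the dictionary that `save_results()` can dump straight to JSON.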
## Extending the Framework
To add support for a new LLM:
1. Create a new class inheriting from `LLMInterface`
2. Implement `generate()` and `get_model_info()` methods
3. Add your model to the test suite
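A minimal sketch of steps 1 and 2, assuming `LLMInterface` exposes abstract `generate()` and `get_model_info()` methods as listed above. A stand-in base class is included so the example is self-contained, and the `EchoLLM` adapter is purely illustrative (a real adapter would call a local inference server instead):

```python
from abc import ABC, abstractmethod

class LLMInterface(ABC):
    """Stand-in for the framework's abstract base class."""
    def __init__(self, name: str, config: dict):
        self.name = name
        self.config = config

    @abstractmethod
    def generate(self, prompt: str) -> str:
        """Generate text from a prompt."""

    @abstractmethod
    def get_model_info(self) -> dict:
        """Return model metadata."""

class EchoLLM(LLMInterface):
    """Trivial adapter that echoes its prompt. In a real adapter,
    generate() would issue an HTTP request to a local backend such as
    a llama.cpp server and return the completion text."""
    def generate(self, prompt: str) -> str:
        return f"# response from {self.name}\n{prompt}"

    def get_model_info(self) -> dict:
        return {"name": self.name, "backend": "echo", **self.config}

model = EchoLLM("echo-test", {"model_path": "/dev/null"})
print(model.get_model_info()["name"])  # prints "echo-test"
```

With the two abstract methods implemented, the instance can be registered via `suite.add_model(model)` like any built-in model.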
## Test Types
The framework supports multiple types of coding tasks:
- Basic function implementations
- Code refactoring examples
- Algorithm challenges
- System design questions
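Each of these task types can be expressed as a plain dictionary in the same shape as the Quick Start example; the specific prompts and expected outputs below are illustrative:

```python
# One task per category; all use the same three keys, so the suite can
# treat every category uniformly.
tasks = [
    {"name": "Basic function",
     "prompt": "Write a Python function that reverses a string",
     "expected_output": "[::-1]"},
    {"name": "Refactoring",
     "prompt": "Refactor this for-loop into a list comprehension",
     "expected_output": "["},
    {"name": "Algorithm challenge",
     "prompt": "Implement binary search over a sorted list",
     "expected_output": "mid"},
    {"name": "System design",
     "prompt": "Sketch a token-bucket rate limiter for an HTTP API",
     "expected_output": "token"},
]

assert all(set(t) == {"name", "prompt", "expected_output"} for t in tasks)
```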
## Output
Results are saved in JSON format with:
- Performance metrics (response time, TPS)
- Resource usage (CPU, memory, GPU)
- Quality assessment scores
- Raw model outputs
- Success/failure indicators
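A short sketch of consuming the saved JSON. The exact schema produced by `save_results()` may differ; the field names below simply mirror the list above:

```python
import json

# Illustrative results file in the rough shape described above.
sample = [
    {"model": "gpt4all", "task": "Simple function",
     "response_time_s": 2.1, "tokens_per_second": 22.0,
     "quality_score": 0.8, "success": True},
    {"model": "llama.cpp", "task": "Simple function",
     "response_time_s": 1.4, "tokens_per_second": 38.5,
     "quality_score": 0.9, "success": True},
]
with open("results.json", "w") as f:
    json.dump(sample, f, indent=2)

# Load the results back and rank models by throughput.
with open("results.json") as f:
    results = json.load(f)
fastest = max(results, key=lambda r: r["tokens_per_second"])
print(fastest["model"])  # prints "llama.cpp"
```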
## Future Enhancements
- Integration with local LLM APIs (text-generation-webui, llama.cpp, etc.)
- Advanced code quality evaluation using static analysis
- Web-based dashboard for results visualization
- Database storage for historical comparisons
- Automated statistical analysis of results