Commit 967337c

Run models in parallel during benchmark (#53)

* Run models in parallel during benchmark
* Updating eval docs
* Formatting
* Unused context
* Handling race conditions

1 parent 58bfd43

File tree: 3 files changed, +286 -51 lines changed

docs/evals.md: 4 additions & 0 deletions
```diff
@@ -67,6 +67,8 @@ This installs:
 | `--azure-api-version` | | Azure OpenAI API version (default: 2025-01-01-preview) |
 | `--models` | | Models for benchmark mode (benchmark only) |
 | `--latency-iterations` | | Latency test samples (default: 25) (benchmark only) |
+| `--max-parallel-models` | | Maximum number of models to benchmark concurrently (default: max(1, min(model_count, cpu_count))) (benchmark only) |
+| `--benchmark-chunk-size` | | Optional number of samples per chunk when benchmarking to limit long-running runs (benchmark only) |
 
 ## Configuration
 
```

```diff
@@ -205,6 +207,8 @@ guardrails-evals \
 - **Automatic stage detection**: Evaluates all stages found in configuration
 - **Batch processing**: Configurable parallel processing
 - **Benchmark mode**: Model performance comparison with ROC AUC, precision at recall thresholds
+- **Parallel benchmarking**: Run multiple models concurrently (defaults to CPU count)
+- **Benchmark chunking**: Process large datasets in chunks for better progress tracking
 - **Latency testing**: End-to-end guardrail performance measurement
 - **Visualization**: Automatic chart and graph generation
 - **Multi-provider support**: OpenAI, Azure OpenAI, Ollama, vLLM, and other OpenAI-compatible APIs
```
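Running multiple models concurrently under a cap could look like the sketch below, assuming an asyncio-based runner; `benchmark_models` and `run_benchmark` are hypothetical names, not the project's actual API:

```python
import asyncio

async def benchmark_models(models, run_benchmark, max_parallel: int):
    # A semaphore caps how many models are benchmarked at once,
    # in the spirit of the --max-parallel-models limit.
    sem = asyncio.Semaphore(max_parallel)

    async def run_one(model):
        async with sem:
            return await run_benchmark(model)

    # gather preserves input order, so results[i] corresponds to models[i].
    return await asyncio.gather(*(run_one(m) for m in models))
```

Keeping results in input order sidesteps one class of race condition the commit message mentions: workers finish in arbitrary order, but each result is still attributed to the right model.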

0 commit comments

Comments
(0)