# Troubleshooting Guide
Solutions for the most common checkllm issues.
## API Key Issues

### `AuthenticationError: No API key found`

**Cause:** The judge backend cannot locate the required API key.

**Fix:** Export the key for the backend you are using:

```bash
export OPENAI_API_KEY="sk-..."        # OpenAI
export ANTHROPIC_API_KEY="sk-ant-..." # Anthropic
export GEMINI_API_KEY="AIza..."       # Gemini
```
Verify the key is loaded before running tests:
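A quick check that confirms the variable is visible to the test process without echoing the secret itself:

```bash
# Prints "set" or "MISSING" without revealing the key's value
python3 -c 'import os; print("OPENAI_API_KEY:", "set" if os.getenv("OPENAI_API_KEY") else "MISSING")'
```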
In GitHub Actions, expose the secret explicitly:
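A minimal workflow excerpt as a sketch; the secret name `OPENAI_API_KEY` is whatever you created under the repository's Settings → Secrets, and the install extra mirrors the ones listed under Import Errors below:

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install "checkllm[openai]" && pytest
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```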
## Rate Limiting (429 Errors)

### `RateLimitError: Too many requests`

**Cause:** Too many concurrent requests to the judge API.
**Fix 1 — Reduce concurrency:**
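For illustration, assuming the plugin reads a concurrency limit from `[tool.checkllm]`; the key name below is a guess, so check your version's configuration reference:

```toml
[tool.checkllm]
max_concurrency = 2  # hypothetical key: cap parallel judge calls
```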
**Fix 2 — Enable caching to skip duplicate calls:**
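The `cache_enabled` key also appears in the debug profile at the end of this guide; enabling it project-wide reuses judge verdicts for identical inputs:

```toml
[tool.checkllm]
cache_enabled = true  # reuse judge verdicts for identical prompts
```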
**Fix 3 — Use a cheaper model for development:**
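Mirroring the `judge_model` key used in the debug profile at the end of this guide:

```toml
[tool.checkllm]
judge_model = "gpt-4o-mini"  # cheaper and faster for local iteration
```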
## Judge Timeout / Hanging Tests

### `TimeoutError` or tests that never finish

**Cause:** The judge API is unreachable or responding very slowly.
**Fix — Add explicit timeouts:**
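Assuming a timeout key under `[tool.checkllm]` (the name below is a guess); if your version exposes no such option, the generic `pytest-timeout` plugin can cap any test's runtime instead:

```toml
[tool.checkllm]
judge_timeout = 30  # hypothetical key: seconds to wait for a judge response
```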
**Debug — Test connectivity:**
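A quick reachability check against the OpenAI endpoint; an unauthenticated request should return HTTP 401, which still proves the network path works:

```bash
# 401 = reachable but unauthenticated; 000 or a timeout = network problem
curl -s -o /dev/null -w "%{http_code}\n" --max-time 10 https://api.openai.com/v1/models
```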
## Flaky Tests / Score Variance

### A test passes inconsistently across runs

**Cause:** LLM judges are non-deterministic by default.
**Fix 1 — Set temperature to 0:**
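Assuming the judge's sampling temperature is configurable under `[tool.checkllm]`; the key name is a guess, so verify it against your version's docs:

```toml
[tool.checkllm]
judge_temperature = 0.0  # hypothetical key: minimize sampling randomness
```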
**Fix 2 — Average over multiple runs:**
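checkllm may expose its own repeat option; if not, the pattern is easy to hand-roll. The `score_fn` below is a stand-in for whatever call produces your metric score:

```python
import statistics

def averaged_score(score_fn, n_runs=3):
    """Call a noisy judge-scoring function several times and return the mean."""
    return statistics.mean(score_fn() for _ in range(n_runs))

# Stand-in for a real judge call that returns a slightly different score each run
scores = iter([0.78, 0.84, 0.81])
avg = averaged_score(lambda: next(scores))
print(round(avg, 2))
```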
**Fix 3 — Loosen the threshold slightly to account for variance:**

```python
@pytest.mark.llm_check(metric="hallucination", threshold=0.75)  # was 0.80
def test_no_hallucination(response):
    ...
```
## Coverage Threshold Not Met

### `FAILED - Required test coverage of 75% not reached`

This is a pytest-cov failure, not a checkllm metric failure.
**Fix — Find uncovered lines:**
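`--cov-report=term-missing` is a standard pytest-cov option that lists uncovered line numbers per file; replace `your_package` with your actual package name:

```bash
pytest --cov=your_package --cov-report=term-missing
```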
Add tests for any lines listed in the `Miss` column of the report.
## Import Errors

### `ModuleNotFoundError: No module named 'openai'`

**Cause:** The optional dependency for that judge backend is not installed.

**Fix:** Install the extra that matches your judge:

```bash
pip install "checkllm[openai]"    # OpenAI judge
pip install "checkllm[anthropic]" # Anthropic judge
pip install "checkllm[all]"       # Install everything
```
## Async Issues

### `RuntimeError: This event loop is already running`

**Fix:** Ensure `asyncio_mode = "auto"` is set in `pyproject.toml`:
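This is the standard pytest-asyncio setting:

```toml
[tool.pytest.ini_options]
asyncio_mode = "auto"
```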
An async test then looks like this (with auto mode the explicit marker is optional, but still accepted):

```python
import pytest

@pytest.mark.asyncio
async def test_async_call():
    result = await my_async_llm_call()
    assert result is not None
```
## Snapshot File Corruption

### `JSONDecodeError` when running `checkllm diff`

**Cause:** A snapshot file was partially written by an interrupted run.

**Fix — Delete and re-baseline:**
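The snapshot path below is an assumption (adjust it to wherever your project stores snapshots); re-running the suite then writes a fresh baseline:

```bash
# Path is an example: use your project's actual snapshot directory
rm .checkllm/snapshots/broken_test.json
pytest  # re-run to regenerate the baseline snapshot
```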
## Debug Mode
Enable full verbose output for any issue:
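pytest's built-in flags get most of the way there: `-vv -s` shows full output and captured stdout, and `--log-cli-level` streams log records live:

```bash
pytest -vv -s --log-cli-level=DEBUG
```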
Or configure a debug profile:
```toml
[tool.checkllm.profiles.debug]
log_level = "DEBUG"
judge_model = "gpt-4o-mini"
cache_enabled = false
```
Run with:
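How a profile is selected depends on your checkllm version; the environment-variable selector below is a guess, so consult the CLI help for the real mechanism:

```bash
# CHECKLLM_PROFILE is hypothetical; check your version's docs for profile selection
CHECKLLM_PROFILE=debug pytest
```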
## Getting Help
- GitHub Issues: https://github.com/javierdejesusda/checkllm/issues
- Discussions: https://github.com/javierdejesusda/checkllm/discussions
- Security issues: see SECURITY.md