patronus-mcp-server
Patronus MCP Server is a Model Context Protocol (MCP) server for the Patronus SDK. It gives clients a standardized interface for optimizing and evaluating LLM systems, with configurable evaluators for single and batch evaluations and support for running experiments on datasets.
How do I initialize the Patronus MCP Server?
You can initialize the server by providing an API key and project settings either through a command line argument or an environment variable.
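As a rough illustration of that lookup, the sketch below resolves an API key from a command line argument and falls back to an environment variable. The flag name `--api-key`, the variable name `PATRONUS_API_KEY`, the precedence order, and the helper `resolve_api_key` are assumptions made for this example; check the project's own documentation for the names it actually uses.

```python
import argparse


def resolve_api_key(argv, environ):
    """Return the key from the command line if present, else from the environment."""
    parser = argparse.ArgumentParser(description="Hypothetical launcher sketch")
    # Assumed flag name, chosen for illustration only.
    parser.add_argument("--api-key", default=None)
    # parse_known_args ignores any other arguments the real server may accept.
    args, _unknown = parser.parse_known_args(argv)

    # Assumed precedence: an explicit argument wins over the environment variable.
    key = args.api_key or environ.get("PATRONUS_API_KEY")
    if not key:
        raise SystemExit("No API key: pass --api-key or set PATRONUS_API_KEY")
    return key
```

In a launcher script this would typically be called as `resolve_api_key(sys.argv[1:], os.environ)`; passing both in as parameters keeps the function easy to test.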
Can I run batch evaluations?
Yes, the server supports running batch evaluations with multiple evaluators, allowing for comprehensive analysis.
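Conceptually, a batch evaluation applies several evaluators to the same input and output pair and collects one result per evaluator. The sketch below shows that pattern in plain Python. It does not call the Patronus SDK or the server; `run_batch`, `is_non_empty`, and `is_concise` are hypothetical names introduced here.

```python
from typing import Callable, Dict

# An evaluator here is any function that takes (task_input, task_output)
# and returns True for pass or False for fail. This shape is an assumption.
Evaluator = Callable[[str, str], bool]


def run_batch(evaluators: Dict[str, Evaluator], task_input: str, task_output: str) -> Dict[str, bool]:
    """Apply every evaluator to the same pair and collect results by evaluator name."""
    return {name: fn(task_input, task_output) for name, fn in evaluators.items()}


def is_non_empty(task_input: str, task_output: str) -> bool:
    # Fails when the output is empty or whitespace only.
    return bool(task_output.strip())


def is_concise(task_input: str, task_output: str) -> bool:
    # Passes when the output has at most 20 whitespace-separated words.
    return len(task_output.split()) <= 20


results = run_batch(
    {"non_empty": is_non_empty, "concise": is_concise},
    task_input="What is 2 + 2?",
    task_output="4",
)
print(results)
```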
What types of experiments can I run?
You can run experiments with datasets, using both remote and custom evaluators to assess model performance.
How do I test the server interactively?
You can use the test script tests/test_live.py to interactively test different evaluation endpoints with or without an API key.
What is the purpose of the custom evaluator function?
Custom evaluator functions let you define evaluation logic specific to your use case, for checks that the remote evaluators do not cover.
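To make the idea concrete, here is a minimal sketch of what such a function might look like: an exact-match check against a gold answer. The `EvalResult` shape and the `exact_match` signature are assumptions for illustration, not the Patronus SDK's own types or registration mechanism.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    # Hypothetical result shape, not a type from the Patronus SDK.
    passed: bool
    score: float
    explanation: str


def exact_match(task_output: str, gold_answer: str) -> EvalResult:
    """Pass when the output equals the gold answer, ignoring case and outer whitespace."""
    matched = task_output.strip().lower() == gold_answer.strip().lower()
    return EvalResult(
        passed=matched,
        score=1.0 if matched else 0.0,
        explanation="exact match" if matched else "output differs from gold answer",
    )


result = exact_match("  Paris ", "paris")
print(result.passed, result.score)
```

Returning a small structured result rather than a bare boolean leaves room for a score and a human-readable explanation, which is useful when comparing runs.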