unsloth-mcp-server
Unsloth MCP Server is a Model Context Protocol (MCP) server for the Unsloth library, which makes fine-tuning large language models faster and less memory-hungry. It supports multiple model families and exposes a simple API for common model operations.
Unsloth MCP Server
An MCP server for Unsloth, a library designed to make fine-tuning large language models significantly more efficient. Unsloth offers faster training, lower memory usage, and longer context lengths while maintaining model quality. Key features include support for several model families, a simple API, and multiple export formats. The server provides tools for checking the installation, listing supported models, loading models, fine-tuning, generating text, and exporting models.
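As a rough illustration of how a client might invoke one of these tools, the sketch below builds a JSON-RPC 2.0 `tools/call` request, which is the message shape MCP uses for tool invocation. The tool name `load_model` and its arguments (`model_name`, `load_in_4bit`) are hypothetical placeholders, not confirmed names from this server; consult the server's own tool listing for the real ones.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    1,
    "load_model",
    {"model_name": "example/model-name", "load_in_4bit": True},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library would normally construct and send this message for you; the sketch only shows what travels over the wire.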
Features
- Fine-tune Llama, Mistral, Phi, and Gemma models efficiently
- 4-bit quantization to reduce memory use during training
- Extended context length support
- Simple API for model operations
- Export to GGUF, Hugging Face, and more
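To give a feel for why 4-bit quantization matters, here is a back-of-the-envelope sketch of the memory needed for model weights alone at different precisions. The helper `weight_memory_gb` is a hypothetical name introduced for this example, and the estimate deliberately ignores activations, optimizer state, and the small overhead of quantization metadata, so real usage will be higher.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


params_8b = 8e9  # an 8-billion-parameter model

fp16_gb = weight_memory_gb(params_8b, 16)
int4_gb = weight_memory_gb(params_8b, 4)

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"4-bit weights:  {int4_gb:.1f} GB")
```

Under these assumptions the weights of an 8B model shrink from about 16 GB to about 4 GB, which is what makes larger models fit on a single GPU.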
Troubleshooting
Common Issues
- CUDA Out of Memory: Reduce the batch size or use a smaller model.
- Import Errors: Check that the installed dependency versions match those listed under Version Compatibility.
- Model Not Found: Use a model name from the list of supported models.
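For the out-of-memory case, one common approach (an assumption on our part, not something this server is documented to do) is to halve the per-device batch size and raise gradient accumulation so the effective batch size stays roughly the same. The function name `reduce_batch_size` and its parameters are hypothetical; map them onto whatever batch-size and accumulation settings your fine-tuning call accepts.

```python
def reduce_batch_size(batch_size, grad_accum_steps):
    """Halve the batch size and scale gradient accumulation to compensate.

    Returns a (new_batch_size, new_grad_accum_steps) pair whose product is
    at least the original effective batch size.
    """
    if batch_size <= 1:
        raise ValueError("batch size is already 1; try a smaller model instead")
    new_batch = batch_size // 2
    effective = batch_size * grad_accum_steps
    # Ceiling division keeps the effective batch from shrinking.
    new_accum = -(-effective // new_batch)
    return new_batch, new_accum


# Example: start from batch size 8 with 4 accumulation steps.
print(reduce_batch_size(8, 4))
```

Smaller batches lower peak activation memory, while the extra accumulation steps keep the optimizer seeing a similar number of samples per update at the cost of longer steps.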
Version Compatibility
- Python: 3.10-3.12
- CUDA: 11.8 or 12.1+
- PyTorch: 2.0+
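The ranges above can be checked before installing or debugging. The following is a minimal sketch, with hypothetical helper names `python_supported` and `torch_supported`; it only inspects version numbers and reports the CUDA version PyTorch was built against, if PyTorch is installed.

```python
import sys

MIN_PYTHON = (3, 10)
MAX_PYTHON = (3, 12)


def python_supported(version_info=sys.version_info):
    """Return True if the interpreter is within Python 3.10-3.12."""
    major_minor = tuple(version_info[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def torch_supported(version_string):
    """Return True for PyTorch 2.0+ given a string such as '2.3.1+cu121'."""
    major = int(version_string.split(".")[0])
    return major >= 2


print("Python OK:", python_supported())

try:
    import torch
except ImportError:
    print("PyTorch is not installed")
else:
    print("PyTorch OK:", torch_supported(torch.__version__))
    # torch.version.cuda is None for CPU-only builds.
    print("CUDA build:", torch.version.cuda)
```

Running this first can narrow down whether an import error comes from the interpreter, PyTorch, or a CPU-only build.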
Performance Benchmarks
Model | VRAM | Unsloth Speed | VRAM Reduction | Context Length |
---|---|---|---|---|
Llama 3.3 (70B) | 80GB | 2x faster | >75% | 13x longer |
Llama 3.1 (8B) | 80GB | 2x faster | >70% | 12x longer |
Mistral v0.3 (7B) | 80GB | 2.2x faster | 75% | - |
Requirements
- Python 3.10-3.12
- NVIDIA GPU with CUDA support
- Node.js and npm
License
Apache-2.0