unsloth-mcp-server

Unsloth MCP Server is an MCP server for the Unsloth library, which makes fine-tuning large language models faster and less memory-intensive. It supports multiple model families and provides a simple API for common model operations.

Unsloth MCP Server

An MCP server for Unsloth, a library designed to make large language model fine-tuning significantly more efficient. Unsloth offers improved speed, reduced memory usage, and extended context lengths while maintaining model quality. Key features include support for various models, a simple API, and compatibility with multiple export formats. The server exposes tools for checking the installation, listing supported models, loading models, fine-tuning, generating text, and exporting models.

Features

  • Fine-tune Llama, Mistral, Phi, and Gemma models efficiently
  • 4-bit quantization to reduce memory usage during training
  • Extended context length support
  • Simple API for model operations
  • Export to GGUF, Hugging Face, and more
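
To illustrate why 4-bit quantization matters, the following sketch estimates the memory needed for model weights alone at different precisions. It is back-of-the-envelope arithmetic, not a measurement of Unsloth: the function name `weight_memory_gb` is hypothetical, and the estimate ignores quantization constants, activations, optimizer state, and adapter weights, all of which add to real VRAM usage.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width and no
    other tensors are counted.
    """
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


# Compare 16-bit and 4-bit storage for an 8-billion-parameter model.
for bits in (16, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(8e9, bits):.1f} GB")
```

Under these assumptions, dropping from 16-bit to 4-bit storage cuts the weight footprint to a quarter, which is what makes larger models fit on a single GPU.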

Troubleshooting

Common Issues

  1. CUDA Out of Memory: Reduce the batch size or try a smaller model.
  2. Import Errors: Ensure the installed dependency versions are compatible (see Version Compatibility).
  3. Model Not Found: Use a model name from the list of supported models.
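
For the CUDA Out of Memory case, one common approach is to shrink the per-device batch size while raising gradient accumulation so the effective batch size stays the same. The helper below is a minimal sketch of that bookkeeping. The function `shrink_batch` is hypothetical, and whether this server exposes batch size and gradient accumulation as fine-tuning parameters is an assumption to verify against its tool descriptions.

```python
def shrink_batch(batch_size: int, grad_accum_steps: int) -> tuple[int, int]:
    """Return a smaller per-device batch size with the same effective batch.

    Effective batch size = batch_size * grad_accum_steps.
    Even sizes are halved; odd sizes above 1 drop straight to 1 so the
    product is preserved exactly. A batch size of 1 cannot shrink further.
    """
    if batch_size <= 1:
        return batch_size, grad_accum_steps
    if batch_size % 2 == 0:
        return batch_size // 2, grad_accum_steps * 2
    return 1, grad_accum_steps * batch_size


# Hypothetical starting point: batch size 8 with no accumulation.
print(shrink_batch(8, 1))
```

Smaller per-device batches lower peak activation memory at the cost of more forward and backward passes per optimizer step. If a batch size of 1 still runs out of memory, a smaller model is the remaining option listed above.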

Version Compatibility

  • Python: 3.10-3.12
  • CUDA: 11.8 or 12.1+
  • PyTorch: 2.0+
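
The version ranges above can be checked with a short script before starting the server. This is a sketch using only the Python standard library. The helper names are hypothetical, and it assumes PyTorch is installed under the distribution name `torch`. It does not check the CUDA version.

```python
import sys
from importlib import metadata
from typing import Optional


def python_supported(version: tuple) -> bool:
    """True if (major, minor) falls in the documented 3.10-3.12 range."""
    return (3, 10) <= tuple(version[:2]) <= (3, 12)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def torch_supported(version: str) -> bool:
    """True if the major version is 2 or higher, e.g. '2.3.1+cu121'."""
    return int(version.split(".")[0]) >= 2


print("Python supported:", python_supported(sys.version_info[:2]))
torch_version = installed_version("torch")
if torch_version is None:
    print("PyTorch: not installed")
else:
    print("PyTorch supported:", torch_supported(torch_version))
```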

Performance Benchmarks

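| Model             | VRAM | Unsloth Speed | VRAM Reduction | Context Length |
|-------------------|------|---------------|----------------|----------------|
| Llama 3.3 (70B)   | 80GB | 2x faster     | >75%           | 13x longer     |
| Llama 3.1 (8B)    | 80GB | 2x faster     | >70%           | 12x longer     |
| Mistral v0.3 (7B) | 80GB | 2.2x faster   | 75% less       | -              |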

Requirements

  • Python 3.10-3.12
  • NVIDIA GPU with CUDA support
  • Node.js and npm
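
A quick way to confirm that the command-line requirements are on the `PATH` is sketched below. The function name `find_tools` is hypothetical, and treating the presence of `nvidia-smi` as a sign of a working NVIDIA driver is an assumption: it shows the tool is installed, not that a CUDA-capable GPU is usable.

```python
import shutil


def find_tools(names=("node", "npm", "nvidia-smi")) -> dict:
    """Map each executable name to whether it can be found on the PATH."""
    return {name: shutil.which(name) is not None for name in names}


for tool, found in find_tools().items():
    print(f"{tool}: {'found' if found else 'missing'}")
```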

License

Apache-2.0