code-reasoning

The Code Reasoning MCP Server optimizes Claude, an AI model, for complex programming tasks through structured thinking and prompt evaluation. It provides features like thought branching, thought revision, and safety controls, with educational documentation for easy setup and usage.

Code Reasoning MCP Server

A Model Context Protocol (MCP) server that enhances Claude's ability to solve complex programming tasks through structured, step-by-step thinking.

Quick Installation

  1. Configure Claude Desktop by editing:

    • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
    • Windows: %APPDATA%\Claude\claude_desktop_config.json
    • Linux: ~/.config/Claude/claude_desktop_config.json
    {
      "mcpServers": {
        "code-reasoning": {
          "command": "npx",
          "args": ["-y", "@mettamatt/code-reasoning"]
        }
      }
    }
    
  2. Configure VS Code:

{
  "mcp": {
    "servers": {
      "code-reasoning": {
        "command": "npx",
        "args": ["-y", "@mettamatt/code-reasoning"]
      }
    }
  }
}

Usage

  1. To trigger this MCP server, append the following to your chat message:

    Use sequential thinking to reason about this.
    
  2. Use ready-to-go prompts that trigger Code-Reasoning:

Code Reasoning Prompts

  • Click the "+" icon in the Claude Desktop chat window, or in Claude Code type /help to see the specific commands.
  • Select "Add from Code Reasoning" from the available tools
  • Choose a prompt template and fill in the required information
  • Submit the form to add the prompt to your chat message and hit return

See the prompts documentation in the docs directory for details on using the prompt templates.

Command Line Options

  • --debug: Enable detailed logging
  • --help or -h: Show help information
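As a rough illustration of how the --debug flag could be combined with the configuration from Quick Installation, the sketch below builds the server entry with the flag appended to the npx arguments. The helper name codeReasoningEntry and the ServerEntry type are hypothetical and exist only for this example; the package name and flag come from this README.

```typescript
// Shape of one entry under "mcpServers" (or "mcp.servers" in VS Code).
interface ServerEntry {
  command: string;
  args: string[];
}

// Hypothetical helper: returns the launch entry, optionally with --debug.
function codeReasoningEntry(debug: boolean): ServerEntry {
  const args = ["-y", "@mettamatt/code-reasoning"];
  // npx forwards arguments that follow the package name to the package itself.
  if (debug) {
    args.push("--debug");
  }
  return { command: "npx", args };
}

// Print a Claude Desktop style config with detailed logging enabled.
const debugConfig = { mcpServers: { "code-reasoning": codeReasoningEntry(true) } };
console.log(JSON.stringify(debugConfig, null, 2));
```

The printed JSON can be pasted into the config file in place of the default entry while troubleshooting, then reverted by dropping "--debug" from the args array.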

Key Features

  • Programming Focus: Optimized for coding tasks and problem-solving
  • Structured Thinking: Break down complex problems into manageable steps
  • Thought Branching: Explore multiple solution paths in parallel
  • Thought Revision: Refine earlier reasoning as understanding improves
  • Safety Limits: Automatically stops after 20 thought steps to prevent loops
  • Ready-to-Use Prompts: Pre-defined templates for common development tasks

Documentation

Detailed documentation is available in the docs directory:

  • Examples of sequential thinking with the MCP server
  • All configuration options for the MCP server
  • Using and customizing prompts with the MCP server
  • Testing information

Project Structure

├── index.ts                  # Entry point
├── src/                      # Implementation source files
└── test/                     # Testing framework

Prompt Evaluation

The Code Reasoning MCP Server includes a prompt evaluation system that assesses Claude's ability to follow the code reasoning prompts. This system allows:

  • Testing different prompt variations against scenario problems
  • Verifying parameter format adherence
  • Scoring solution quality

To use the prompt evaluation system, run:

npm run eval

Prompt Comparison and Development

Significant effort went into developing the optimal prompt for the Code Reasoning server. The current implementation uses the HYBRID_DESIGN prompt, which emerged as the winner from our evaluation process.

We compared four different prompt designs:

Prompt Design          Description
SEQUENTIAL             The original sequential thinking prompt design
DEFAULT                The baseline prompt previously used in the server
CODE_REASONING_0_30    An experimental variant focusing on code-specific reasoning
HYBRID_DESIGN          A refined design incorporating the best elements of other approaches

Our evaluation across seven diverse programming scenarios showed that HYBRID_DESIGN outperformed the other prompts on average, although it did not win every individual scenario:

Scenario                      HYBRID_DESIGN  CODE_REASONING_0_30  DEFAULT  SEQUENTIAL
Algorithm Selection           87%            82%                  88%      82%
Bug Identification            87%            91%                  88%      92%
Multi-Stage Implementation    83%            67%                  79%      82%
System Design Analysis        82%            87%                  78%      82%
Code Debugging Task           92%            87%                  92%      92%
Compiler Optimization         83%            78%                  67%      73%
Cache Strategy                86%            88%                  82%      87%
Average                       86%            83%                  82%      84%

The HYBRID_DESIGN prompt achieved the highest average solution quality (86%), though by a narrow margin, and the most consistent performance across all scenarios, with no score below 80%. It also produced the most thoughts. The src/server.ts file has been updated to use this prompt design.
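The averages and the "no score below 80%" observation can be rechecked from the per-scenario scores with a few lines of TypeScript. The numbers are copied from the results table; the scores object and the summarize function are names introduced only for this check.

```typescript
// Per-scenario scores (percent), in the order the scenarios are listed in the table.
const scores: Record<string, number[]> = {
  HYBRID_DESIGN: [87, 87, 83, 82, 92, 83, 86],
  CODE_REASONING_0_30: [82, 91, 67, 87, 87, 78, 88],
  DEFAULT: [88, 88, 79, 78, 92, 67, 82],
  SEQUENTIAL: [82, 92, 82, 82, 92, 73, 87],
};

// Rounded mean and lowest score for one prompt design.
function summarize(values: number[]): { average: number; lowest: number } {
  const sum = values.reduce((acc, v) => acc + v, 0);
  return {
    average: Math.round(sum / values.length),
    lowest: Math.min(...values),
  };
}

for (const prompt of Object.keys(scores)) {
  const { average, lowest } = summarize(scores[prompt]);
  console.log(`${prompt}: average ${average}%, lowest ${lowest}%`);
}
// HYBRID_DESIGN: average 86%, lowest 82%
// CODE_REASONING_0_30: average 83%, lowest 67%
// DEFAULT: average 82%, lowest 67%
// SEQUENTIAL: average 84%, lowest 73%
```

HYBRID_DESIGN is the only design whose lowest score stays above 80%, which is what the consistency claim rests on.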

Personally, I think the biggest improvement was adding this to the end of the prompt: '✍️ End each thought by asking: "What am I missing or need to reconsider?"'

See the documentation in the docs directory for more details on the prompt evaluation system.

License

This project is licensed under the MIT License. See the LICENSE file for details.