Sourcebot
The Sourcebot MCP server allows LLM agents to fetch code context from various repositories hosted on platforms like GitHub, GitLab, and Bitbucket, enhancing the capabilities of LLMs in code-related tasks.
Sourcebot MCP - Fetch code context from GitHub, GitLab, Bitbucket, and more
The Sourcebot MCP server gives your LLM agents the ability to fetch code context across thousands of repos hosted on GitHub, GitLab, Bitbucket, and more. Ask your LLM a question, and the Sourcebot MCP server will fetch relevant context from its index and inject it into your chat session. Some use cases this unlocks include:
- Enriching responses to user requests:
  - "What repositories are using internal library X?"
  - "Provide usage examples of the CodeMirror component"
  - "Where is the `useCodeMirrorTheme` hook defined?"
  - "Find all usages of `deprecatedApi` across all repos"
- Improving reasoning ability for existing horizontal agents like AI code review, docs generation, etc.
  - "Find the definitions for all functions in this diff"
  - "Document what systems depend on this class"
- Building custom LLM horizontal agents like compliance auditing agents, migration agents, etc.
  - "Find all instances of hardcoded credentials"
  - "Identify repositories that depend on this deprecated API"
Getting Started
- Install Node.js >= v18.0.0.
- (Optional) Spin up a Sourcebot instance by following this guide. The host URL of your instance (e.g., `http://localhost:3000`) is passed to the MCP server via the `SOURCEBOT_HOST` environment variable. This allows you to control which repos Sourcebot MCP fetches context from (including private repos).

  If a host is not provided, the server falls back to the demo instance hosted at https://demo.sourcebot.dev. You can see the list of repositories indexed here. Add additional repositories by opening a PR.
- Install `@sourcebot/mcp` into your MCP client:

  Cursor

  Go to: Settings -> Cursor Settings -> MCP -> Add new global MCP server

  Paste the following into your `~/.cursor/mcp.json` file. This will install Sourcebot globally within Cursor:

      {
        "mcpServers": {
          "sourcebot": {
            "command": "npx",
            "args": ["-y", "@sourcebot/mcp@latest"],
            // Optional - if not specified, https://demo.sourcebot.dev is used
            "env": {
              "SOURCEBOT_HOST": "http://localhost:3000"
            }
          }
        }
      }
  Windsurf

  Go to: Windsurf Settings -> Cascade -> Add Server -> Add Custom Server

  Paste the following into your `mcp_config.json` file:

      {
        "mcpServers": {
          "sourcebot": {
            "command": "npx",
            "args": ["-y", "@sourcebot/mcp@latest"],
            // Optional - if not specified, https://demo.sourcebot.dev is used
            "env": {
              "SOURCEBOT_HOST": "http://localhost:3000"
            }
          }
        }
      }
  VS Code

  Add the following to your `settings.json`:

      {
        "mcp": {
          "servers": {
            "sourcebot": {
              "type": "stdio",
              "command": "npx",
              "args": ["-y", "@sourcebot/mcp@latest"],
              // Optional - if not specified, https://demo.sourcebot.dev is used
              "env": {
                "SOURCEBOT_HOST": "http://localhost:3000"
              }
            }
          }
        }
      }
  Claude Code

  Run the following command:

      # SOURCEBOT_HOST env var is optional - if not specified,
      # https://demo.sourcebot.dev is used.
      claude mcp add sourcebot -e SOURCEBOT_HOST=http://localhost:3000 -- npx -y @sourcebot/mcp@latest
  Claude Desktop

  Add the following to your `claude_desktop_config.json`:

      {
        "mcpServers": {
          "sourcebot": {
            "command": "npx",
            "args": ["-y", "@sourcebot/mcp@latest"],
            // Optional - if not specified, https://demo.sourcebot.dev is used
            "env": {
              "SOURCEBOT_HOST": "http://localhost:3000"
            }
          }
        }
      }
  Alternatively, you can install via Smithery. For example:

      npx -y @smithery/cli install @sourcebot-dev/sourcebot --client claude
- Tell your LLM to `use sourcebot` when prompting.

For a more detailed guide, check out the docs.
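The Cursor, Windsurf, and Claude Desktop configurations above share one shape: an `npx` command plus an optional `SOURCEBOT_HOST` environment variable. As a rough sketch of that shape (the helper names `effective_host`, `sourcebot_entry`, and `merge_into_config` are made up for illustration and are not part of Sourcebot), the entry could be generated and merged into an existing `mcpServers` config like this:

```python
import json

DEMO_HOST = "https://demo.sourcebot.dev"  # fallback named in the steps above


def effective_host(host=None):
    """Model of the documented fallback: no host means the demo instance."""
    return host or DEMO_HOST


def sourcebot_entry(host=None):
    """Build the server entry used in the configs above (hypothetical helper).

    When host is None the env block is omitted entirely, leaving the
    fallback to the server itself.
    """
    entry = {"command": "npx", "args": ["-y", "@sourcebot/mcp@latest"]}
    if host:
        entry["env"] = {"SOURCEBOT_HOST": host}
    return entry


def merge_into_config(config, host=None):
    """Return a copy of an mcpServers-style config with a sourcebot entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["sourcebot"] = sourcebot_entry(host)
    merged["mcpServers"] = servers
    return merged


# Example: an existing config that already lists another (made-up) server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = merge_into_config(existing, host="http://localhost:3000")
print(json.dumps(updated, indent=2))
```

The merge copies the dictionaries it modifies, so the original config is left unchanged and other servers are preserved. The VS Code variant nests the entry under `mcp.servers` and adds `"type": "stdio"`, so it would need a slightly different wrapper.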
Available Tools
search_code
Fetches code that matches the provided regex pattern in `query`.
Parameters
Name | Required | Description |
---|---|---|
query | yes | Regex pattern to search for. Escape special characters and spaces with a single backslash (e.g., `console\.log`, `console\ log`). |
filterByRepoIds | no | Restrict search to specific repository IDs (from 'list_repos'). Leave empty to search all. |
filterByLanguages | no | Restrict search to specific languages (GitHub linguist format, e.g., Python, JavaScript). |
caseSensitive | no | Case sensitive search (default: false). |
includeCodeSnippets | no | Include code snippets in results (default: false). |
maxTokens | no | Max tokens to return (default: env.DEFAULT_MINIMUM_TOKENS). |
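To make the parameter table concrete, here is a minimal sketch of assembling `search_code` arguments. The helper names are hypothetical, the two filter parameters are assumed to take lists, and Python's `re.escape` is used only as a convenient stand-in for the escaping rule described for `query`; its rules may not match the search backend's regex dialect in every case.

```python
import re


def literal_to_query(text):
    """Escape a literal string so it can be used as a regex query."""
    # re.escape backslash-escapes regex metacharacters and spaces,
    # e.g. console.log becomes console\.log
    return re.escape(text)


def search_code_args(query, repo_ids=None, languages=None,
                     case_sensitive=False, include_snippets=False,
                     max_tokens=None):
    """Assemble an arguments dict using the parameter names from the table."""
    args = {
        "query": query,
        "caseSensitive": case_sensitive,
        "includeCodeSnippets": include_snippets,
    }
    # Optional filters are left out entirely when unset, which the table
    # describes as searching everything.
    if repo_ids:
        args["filterByRepoIds"] = list(repo_ids)  # assumed to be a list
    if languages:
        args["filterByLanguages"] = list(languages)  # assumed to be a list
    if max_tokens is not None:
        args["maxTokens"] = max_tokens
    return args


args = search_code_args(
    literal_to_query("console log"),
    languages=["TypeScript"],
    include_snippets=True,
)
print(args)
```

In practice the LLM client fills in these arguments itself; the sketch is mainly useful for seeing which fields are required and how a literal string turns into a regex query.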
list_repos
Lists all repositories indexed by Sourcebot.
get_file_source
Fetches the source code for a given file.
Parameters
Name | Required | Description |
---|---|---|
fileName | yes | The file to fetch the source code for. |
repoId | yes | The Sourcebot repository ID. |
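For readers curious what a call to this tool looks like on the wire, the following sketch wraps the two required parameters in an MCP `tools/call` JSON-RPC 2.0 request. MCP clients normally build this message for you; the helper names, file path, and repository ID below are placeholders, and the real ID format is whatever `list_repos` returns.

```python
import json
from itertools import count

_request_ids = count(1)  # simple incrementing JSON-RPC request ids


def tool_call(name, arguments):
    """Wrap a tool invocation in an MCP tools/call JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def get_file_source_call(file_name, repo_id):
    """Build a get_file_source request; both parameters are required."""
    if not file_name or not repo_id:
        raise ValueError("fileName and repoId are both required")
    return tool_call("get_file_source", {"fileName": file_name, "repoId": repo_id})


# Placeholder values for illustration only.
request = get_file_source_call("src/index.ts", "example-repo-id")
print(json.dumps(request, indent=2))
```

A typical flow would be to call `list_repos` first, pick a repository ID from its result, and pass that ID here.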
Supported Code Hosts
Sourcebot supports GitHub, GitLab, Bitbucket, and other code hosts.
Don't see your code host? Open a GitHub discussion.
Future Work
Semantic Search
Currently, Sourcebot only supports regex-based code search (powered by zoekt under the hood). This works well when the agent is searching for something precise and well-represented in the source code (e.g., a specific function name, an error string, etc.). It works less well for fuzzy searches where the objective is to find some loosely defined category or concept in the code (e.g., find code that verifies JWT tokens). The LLM can approximate this by crafting regex searches that attempt to capture a concept (e.g., it might try a query like `"jwt|token|(verify|validate).*(jwt|token)"`), but this often yields sub-optimal search results that aren't related. Tools like Cursor solve this with embedding models that capture the semantic meaning of code, allowing LLMs to search using natural language. We would like to extend Sourcebot to support semantic search and expose this capability over MCP as a tool (e.g., a `semantic_search_code` tool). GitHub Discussion
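The limitation can be seen in a small experiment. The sketch below runs the example query against a few made-up lines using Python's `re` module (not zoekt, whose regex dialect may differ), with case-insensitive matching to mirror the `caseSensitive` default of false.

```python
import re

# The example query from the paragraph above.
pattern = re.compile(r"jwt|token|(verify|validate).*(jwt|token)", re.IGNORECASE)

# Made-up lines standing in for an indexed codebase.
lines = [
    "claims = verify_jwt(raw, public_key)",    # relevant to JWT verification
    "csrf_token = generate_csrf_token()",      # unrelated to JWTs
    "next_token = lexer.advance()",            # unrelated to JWTs
    "def check_signature(bearer, key): ...",   # relevant, but uses other words
]

matches = [line for line in lines if pattern.search(line)]
misses = [line for line in lines if not pattern.search(line)]

for line in matches:
    print("match:", line)
for line in misses:
    print("miss: ", line)
```

The query matches the two unrelated `token` lines and misses the signature check, which is the kind of false positive and false negative that a semantic search tool would aim to reduce.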
Code Navigation
Another idea is to allow LLMs to traverse abstract syntax trees (ASTs) of a codebase to enable reliable code navigation. This could be packaged as tools like `goto_definition`, `find_all_references`, etc., which could be useful for LLMs to get additional code context. GitHub Discussion
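As a toy illustration of the idea (not how Sourcebot would necessarily implement it), the sketch below uses Python's standard `ast` module on a single in-memory source string. The function names mirror the proposed tool names but are hypothetical, and a real implementation would need to handle scoping, imports, and multiple languages.

```python
import ast

# A small made-up module to navigate.
SOURCE = '''
def helper(x):
    return x + 1

def main():
    a = helper(1)
    return helper(a)
'''


def goto_definition(tree, name):
    """Return the line number where function `name` is defined, or None."""
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            return node.lineno
    return None


def find_all_references(tree, name):
    """Return sorted line numbers where the identifier `name` is read."""
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Name)
        and node.id == name
        and isinstance(node.ctx, ast.Load)
    )


tree = ast.parse(SOURCE)
print("definition line:", goto_definition(tree, "helper"))
print("reference lines:", find_all_references(tree, "helper"))
```

Unlike a regex search for `helper`, the AST walk distinguishes the definition from the call sites and ignores occurrences inside strings or comments.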
Got an idea?
Open up a GitHub discussion!