textin-mcp


The TextIn MCP Server performs OCR and text extraction on PDFs, images, and Microsoft Office documents. It offers text recognition, document conversion to Markdown, and automatic extraction of key information.

TextIn MCP Server

TextIn MCP Server is a tool for extracting text and performing OCR on documents, including document text recognition, ID recognition, and invoice recognition. It also supports converting documents into Markdown format.

Tools

  • recognition_text

    • Text recognition from images, Word documents, and PDF files.
    • Inputs:
      • path (string, required): file path or a URL (HTTP/HTTPS) pointing to a document
    • Return: Text of the document.
    • Supported input formats:
      • PDF
      • Image (Jpeg, Jpg, Png, Bmp)
  • doc_to_markdown

    • Convert images, PDFs, and Word documents to Markdown.
    • Inputs:
      • path (string, required): file path or a URL (HTTP/HTTPS) pointing to a document
    • Return: Markdown of the document.
    • Supports conversion for:
      • PDF
      • Microsoft Office Documents (Word, Excel)
      • Image (Jpeg, Jpg, Png, Bmp)
  • general_information_extration

    • Automatically identify and extract information from documents, or identify and extract user-specified information.
    • Inputs:
      • path (string, required): file path or a URL (HTTP/HTTPS) pointing to a document
      • key (string[], optional): names of the non-tabular fields to extract, given as an array of strings.
      • table_header (string[], optional): names of the table headers to extract, given as an array of strings.
    • Return: The extracted key information as JSON.
    • Supported input formats:
      • PDF
      • Microsoft Office Documents (Word, Excel)
      • Image (Jpeg, Jpg, Png, Bmp)
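
As a rough illustration of how these tools might be invoked, the sketch below builds MCP `tools/call` requests (JSON-RPC 2.0) from the tool names and input fields listed above. The helper `build_tool_call`, the request ids, and the sample paths and field names are hypothetical; how a request is actually sent depends on your MCP client.

```python
import json
from typing import Any, Dict, List, Optional


def build_tool_call(request_id: int, tool: str, path: str,
                    key: Optional[List[str]] = None,
                    table_header: Optional[List[str]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request for one of the TextIn tools.

    Hypothetical helper for illustration; it is not part of the server.
    """
    # `path` is the one input every tool requires.
    arguments: Dict[str, Any] = {"path": path}
    # `key` and `table_header` are optional and only listed for
    # general_information_extration, so add them only when provided.
    if key is not None:
        arguments["key"] = key
    if table_header is not None:
        arguments["table_header"] = table_header
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Sample paths and field names are made up for illustration.
ocr_request = build_tool_call(1, "recognition_text", "/tmp/scan.png")
extract_request = build_tool_call(
    2,
    "general_information_extration",  # name copied verbatim from the tool list
    "https://example.com/invoice.pdf",
    key=["Invoice number", "Total amount"],
    table_header=["Item", "Quantity", "Price"],
)

print(json.dumps(extract_request, indent=2))
```

Omitting both `key` and `table_header` corresponds to the automatic extraction mode described for `general_information_extration`.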

URL inputs must be directly accessible: the server cannot fetch protected resources, such as those that require authentication.
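
One way to catch unusable inputs early is to classify `path` on the client side before calling a tool. The sketch below is an assumption about how such a check could look, not something the server provides: `classify_path` is a hypothetical helper that accepts HTTP/HTTPS URLs and existing local files. It cannot tell whether a URL is actually reachable without authentication.

```python
import os
from urllib.parse import urlparse


def classify_path(path: str) -> str:
    """Return "url" or "file" for a usable input; raise ValueError otherwise.

    Hypothetical client-side check; the server does its own validation.
    """
    parsed = urlparse(path)
    if parsed.scheme in ("http", "https"):
        if not parsed.netloc:
            raise ValueError(f"URL has no host: {path!r}")
        if parsed.username or parsed.password:
            # Embedded credentials suggest a protected resource.
            raise ValueError("URLs with embedded credentials are not supported")
        return "url"
    if "://" in path:
        # Any other scheme (ftp://, s3://, ...) is outside HTTP/HTTPS.
        raise ValueError(f"Unsupported URL scheme: {path!r}")
    if os.path.isfile(path):
        return "file"
    raise ValueError(f"Not an existing file or HTTP/HTTPS URL: {path!r}")
```

A caller would run `classify_path(path)` first and only pass `path` to a tool when no exception is raised.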

Setup

APP_ID and APP_SECRET

Register for a TextIn account, then obtain your TextIn APP_ID and APP_SECRET by following TextIn's instructions.

NPX

{
  "mcpServers": {
    "textin-ocr": {
      "command": "npx",
      "args": [
        "-y",
        "@intsig/server-textin"
      ],
      "env": {
        "APP_ID": "<YOUR_APP_ID>",
        "APP_SECRET": "<YOUR_APP_SECRET>",
        "MCP_SERVER_REQUEST_TIMEOUT": "600000"
      },
      "timeout": 600
    }
  }
}
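
To avoid pasting credentials by hand, the same configuration can be generated from a mapping such as `os.environ`. This is a minimal sketch under stated assumptions: `build_textin_config` is a hypothetical helper, and it assumes your MCP client reads a JSON file with the `mcpServers` layout shown above. The timeout values are copied unchanged from that example.

```python
import json
from typing import Any, Dict, Mapping


def build_textin_config(env: Mapping[str, str]) -> Dict[str, Any]:
    """Build the `mcpServers` entry for TextIn from a mapping of credentials.

    Hypothetical helper; pass `os.environ` to read real environment variables.
    """
    missing = [name for name in ("APP_ID", "APP_SECRET") if not env.get(name)]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))
    return {
        "mcpServers": {
            "textin-ocr": {
                "command": "npx",
                "args": ["-y", "@intsig/server-textin"],
                "env": {
                    "APP_ID": env["APP_ID"],
                    "APP_SECRET": env["APP_SECRET"],
                    # Values below mirror the example configuration above.
                    "MCP_SERVER_REQUEST_TIMEOUT": "600000",
                },
                "timeout": 600,
            }
        }
    }


# Placeholder credentials; replace with real values or pass os.environ.
sample_config = build_textin_config(
    {"APP_ID": "<YOUR_APP_ID>", "APP_SECRET": "<YOUR_APP_SECRET>"}
)
print(json.dumps(sample_config, indent=2))
```

Failing fast on missing credentials keeps an incomplete configuration from being written to the client's config file.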

License

This MCP server is licensed under the MIT License. This means you are free to use, modify, and distribute the software, subject to the terms and conditions of the MIT License. For more details, please see the LICENSE file in the project repository.