⚠️ UNDER CONSTRUCTION ⚠️
This project is currently under active development. While the core functionality is operational, please be aware that:
- No guarantee is provided that the software is bug-free
- Breaking changes may occur without notice
- Some features may be incomplete or experimental
- Use at your own risk in production environments
Feedback and contributions are welcome as we work to improve the server!
A Model Context Protocol (MCP) server providing access to the EMBL-EBI Protein database. This server enables LLMs and other MCP clients to search, retrieve, and analyze protein data from UniProt and related databases.
Requirements: Python 3.10+ (due to MCP library requirements)
# Create virtual environment with Python 3.10+
python3.11 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install the package
pip install -e .
# Start the MCP server
embl_ebi_protein_mcp
The server runs on stdio and is ready to accept MCP protocol requests.
- get_protein_summary - One-stop comprehensive protein lookup (BEST STARTING POINT)
- find_protein_accession - Easy protein ID lookup for common names
- search_proteins - Search UniProt protein database by text query
- get_protein_by_accession - Get detailed protein information by UniProt accession
- get_protein_interactions - Get known protein-protein interactions
- get_protein_isoforms - Get alternative protein isoforms (splice variants)
- search_features - Search for specific protein sequence features
- get_features_by_accession - Get all sequence features for a protein
- get_features_by_type - Get features by specific type (domains, sites, etc.)
- search_variations - Search natural variants in proteins
- get_variations_by_accession - Get variations by UniProt accession
- get_taxonomy_by_id - Get taxonomic information by NCBI taxonomy ID
- get_taxonomy_lineage - Get complete taxonomic lineage
- get_taxonomy_children - Get taxonomy children by ID
- search_proteomes - Search proteomes
- get_proteome_by_upid - Get proteome by UniProt Proteome ID
- search_coordinates - Search genomic coordinates
- search_uniparc - Search UniParc entries
Total: 30 tools available
// Find human p53 protein - comprehensive overview
{
  "method": "tools/call",
  "params": {
    "name": "get_protein_summary",
    "arguments": {
      "gene_name": "p53",
      "organism": "human"
    }
  }
}
// Find insulin in mouse
{
  "method": "tools/call",
  "params": {
    "name": "find_protein_accession",
    "arguments": {
      "gene_name": "insulin",
      "organism": "mouse"
    }
  }
}
import asyncio
from embl_ebi_protein_mcp.bridge import EMBLEBIBridge

async def example():
    async with EMBLEBIBridge() as bridge:
        # Find protein accession
        result = await bridge.find_protein_accession("p53", "human")
        print(f"Found: {result[0]['accession']}")
        # Get comprehensive summary
        summary = await bridge.get_protein_summary("insulin", "human")
        print(f"Insulin accession: {summary['accession']}")

asyncio.run(example())
The server includes convenient organism name mapping:
| Common Name | Taxonomy ID | Example Usage |
|---|---|---|
| human | 9606 | "organism": "human" |
| mouse | 10090 | "organism": "mouse" |
| rat | 10116 | "organism": "rat" |
| yeast | 559292 | "organism": "yeast" |
| e.coli | 83333 | "organism": "e.coli" |
| drosophila | 7227 | "organism": "drosophila" |
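As a rough illustration of how such a mapping can work, the sketch below resolves a common name to its taxonomy ID using the values from the table. The names `ORGANISM_TAXIDS` and `resolve_taxonomy_id` are hypothetical and are not taken from the package; the server's actual lookup may behave differently (for example, it may accept more aliases).

```python
# Values copied from the table above. The dictionary and function names
# are hypothetical; they are not part of embl_ebi_protein_mcp.
ORGANISM_TAXIDS = {
    "human": 9606,
    "mouse": 10090,
    "rat": 10116,
    "yeast": 559292,
    "e.coli": 83333,
    "drosophila": 7227,
}


def resolve_taxonomy_id(organism: str) -> int:
    """Return the NCBI taxonomy ID for a common name or a numeric string."""
    key = organism.strip().lower()
    if key.isdigit():
        # Assumption: a numeric string is already a taxonomy ID.
        return int(key)
    try:
        return ORGANISM_TAXIDS[key]
    except KeyError:
        raise ValueError(f"Unknown organism name: {organism!r}") from None


print(resolve_taxonomy_id("Human"))
print(resolve_taxonomy_id("10090"))
```

Normalizing case and whitespace before the lookup keeps inputs such as "Human" or " mouse " from failing needlessly.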
Add to your Claude Desktop configuration:
{
  "mcpServers": {
    "embl-ebi-proteins": {
      "command": "embl_ebi_protein_mcp",
      "env": {}
    }
  }
}
The server follows the standard MCP protocol and should work with any MCP-compatible client.
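MCP messages are JSON-RPC 2.0, so a client that talks to the server directly wraps each tool call in an envelope with `jsonrpc` and `id` fields. The sketch below builds such an envelope for the tool names shown earlier; `build_tool_call` is a hypothetical helper, and a real client must also complete the MCP `initialize` handshake before sending tool calls.

```python
import json
from itertools import count

# Monotonically increasing request IDs, as JSON-RPC expects a unique id
# for each in-flight request.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> dict:
    """Hypothetical helper: wrap a tool call in a JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = build_tool_call(
    "get_protein_summary", {"gene_name": "p53", "organism": "human"}
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library handles this framing, so the helper is mainly useful for debugging or for understanding what goes over stdio.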
The project uses pytest for comprehensive testing with unit tests, integration tests, and code coverage.
# Run all tests
pytest
# Run with coverage
pytest --cov=embl_ebi_protein_mcp
# Run only unit tests (fast)
pytest -m unit
# Run only integration tests
pytest -m integration
# Run comprehensive test suite with coverage
./scripts/test.sh
# Run code quality checks (black, flake8, mypy)
./scripts/lint.sh
- Unit tests (-m unit): Fast tests with mocked dependencies
- Integration tests (-m integration): Tests with mocked API calls
- Slow tests (-m slow): Comprehensive workflow tests
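To give a sense of what a fast test with mocked dependencies can look like, here is a minimal sketch that replaces the bridge with `unittest.mock.AsyncMock`, so no network access is needed. The helper `first_accession` is hypothetical, and the result shape (a list of dicts with an `accession` key) is assumed from the Python usage example; the project's real tests may be organized differently.

```python
import asyncio
from unittest.mock import AsyncMock


async def first_accession(bridge, gene_name, organism):
    """Hypothetical helper: return the first matching accession, or None."""
    results = await bridge.find_protein_accession(gene_name, organism)
    return results[0]["accession"] if results else None


# In the repository this function would likely carry @pytest.mark.unit so
# that `pytest -m unit` selects it (assumption about marker usage).
def test_first_accession_uses_mocked_bridge():
    bridge = AsyncMock()
    bridge.find_protein_accession.return_value = [{"accession": "P04637"}]

    assert asyncio.run(first_accession(bridge, "p53", "human")) == "P04637"
    bridge.find_protein_accession.assert_awaited_once_with("p53", "human")


def test_first_accession_handles_empty_result():
    bridge = AsyncMock()
    bridge.find_protein_accession.return_value = []

    assert asyncio.run(first_accession(bridge, "nope", "human")) is None
```

Because the bridge is mocked, both tests run in milliseconds and stay deterministic regardless of API availability.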
# Run with debug logging
DEBUG=1 embl_ebi_protein_mcp
This server accesses the following EMBL-EBI REST API endpoints:
- Proteins: https://www.ebi.ac.uk/proteins/api/proteins
- Features: https://www.ebi.ac.uk/proteins/api/features
- Variations: https://www.ebi.ac.uk/proteins/api/variation
- Taxonomy: https://www.ebi.ac.uk/proteins/api/taxonomy
- Proteomes: https://www.ebi.ac.uk/proteins/api/proteomes
- Coordinates: https://www.ebi.ac.uk/proteins/api/coordinates
- UniParc: https://www.ebi.ac.uk/proteins/api/uniparc
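The endpoints above share one base URL, so request URLs can be assembled from an endpoint name, an optional path segment, and query parameters. The following sketch only builds URLs and performs no network calls; `build_url` is a hypothetical helper, and the query parameter names used in the example (`gene`, `taxid`) are assumptions that should be checked against the EMBL-EBI Proteins API documentation.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://www.ebi.ac.uk/proteins/api"
# Endpoint names taken from the list above.
ENDPOINTS = {
    "proteins", "features", "variation", "taxonomy",
    "proteomes", "coordinates", "uniparc",
}


def build_url(endpoint: str, path: str = "", **params) -> str:
    """Hypothetical helper: build a request URL for one of the endpoints."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"Unknown endpoint: {endpoint!r}")
    url = f"{BASE_URL}/{endpoint}"
    if path:
        # Escape the path segment so unusual characters cannot alter the URL.
        url += "/" + quote(path, safe="")
    if params:
        url += "?" + urlencode(params)
    return url


# Lookup by accession (path segment) and a search by query parameters.
print(build_url("proteins", "P04637"))
print(build_url("proteins", gene="TP53", taxid=9606))
```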
EMBL-EBI-Protein-mcp/
├── embl_ebi_protein_mcp/
│ ├── __init__.py
│ ├── bridge.py # API bridge to EMBL-EBI
│ ├── mcp_server.py # MCP server implementation
│ ├── cli.py # Command-line interface
│ └── main.py # Main entry point
├── test_api.py # API functionality tests
├── test_enhancements.py # Enhanced features tests
├── ENHANCEMENTS.md # Enhancement documentation
└── pyproject.toml # Package configuration
# Install with development dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run code quality checks
./scripts/lint.sh
# Format code
black embl_ebi_protein_mcp/ tests/
# Type checking
mypy embl_ebi_protein_mcp/
- Added get_protein_summary for comprehensive protein lookup
- Added find_protein_accession for easy ID lookup
- Enhanced all tool descriptions with usage guidance and examples
- Added organism name mapping for better usability
- Improved error handling and fallback strategies
- Added clear workflow guidance for LLMs
See ENHANCEMENTS.md for detailed information.
- Some proteins may not have complete data in all endpoints
- Large result sets may be truncated or may time out
- The API requires specific UniProt accession formats for some endpoints
- Rate limiting may apply for high-volume usage
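Because some endpoints expect a well-formed UniProt accession, it can help to validate input before sending a request. The sketch below uses the accession pattern published in the UniProt help pages; `is_valid_accession` is a hypothetical helper, not part of this package, and it deliberately rejects isoform suffixes such as `-2`.

```python
import re

# Accession format as documented by UniProt (6 or 10 characters).
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_valid_accession(value: str) -> bool:
    """Hypothetical helper: check whether a string looks like an accession."""
    # Assumption: lowercase input is acceptable and is normalized here.
    return UNIPROT_ACCESSION.fullmatch(value.strip().upper()) is not None


for candidate in ("P04637", "A0A024R161", "p53", "P04637-2"):
    print(candidate, is_valid_accession(candidate))
```

A gene name such as "p53" fails this check, which is a signal to resolve it with find_protein_accession first.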
MIT License - see LICENSE file for details.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Issues: Use the GitHub issue tracker
- Questions: Open a discussion on GitHub
- API Documentation: EMBL-EBI Proteins API
Note: This is an unofficial implementation and is not affiliated with EMBL-EBI. All data is accessed through their public REST API.