A streamlined Model Context Protocol (MCP) server for author disambiguation and academic research, built on the OpenAlex.org API. It is designed for AI agents and returns compact, structured responses.
- Advanced Author Disambiguation: Handles complex career transitions and name variations
- Institution Resolution: Current and past affiliations with transition tracking
- Academic Work Retrieval: Journal articles, letters, and research papers
- Citation Analysis: H-index, citation counts, and impact metrics
- ORCID Integration: Highest accuracy matching with ORCID identifiers
- Streamlined Data: Focused on essential information for disambiguation
- Fast Processing: Optimized data structures for rapid analysis
- Smart Filtering: Enhanced filtering options for targeted queries
- Clean Output: Structured responses optimized for AI reasoning
- Multiple Candidates: Ranked results for automated decision-making
- Structured Responses: Clean, parseable output optimized for LLMs
- Error Handling: Graceful degradation with informative messages
- Enhanced Filtering: Journal-only, citation thresholds, and temporal filters
- MCP Best Practices: Built with FastMCP following official guidelines
- Tool Annotations: Proper MCP tool annotations for optimal client integration
- Resource Management: Efficient HTTP client management and cleanup
- Rate Limiting: Respectful API usage with proper delays
- Python 3.10 or higher
- MCP-compatible client (e.g., Claude Desktop)
- Email address (passed to OpenAlex as a courtesy contact via OPENALEX_MAILTO)
For detailed installation instructions, see INSTALL.md.
- Clone the repository:
  git clone https://github.com/drAbreu/alex-mcp.git
  cd alex-mcp
- Create a virtual environment:
  python3 -m venv venv
  source venv/bin/activate # On Windows: venv\Scripts\activate
- Install the package:
  pip install -e .
- Configure environment:
  export OPENALEX_MAILTO=your-email@domain.com
- Run the server:
  ./run_alex_mcp.sh
  # Or, if installed as a CLI tool:
  alex-mcp
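Before starting the server, it can help to confirm that the two prerequisites listed earlier (Python 3.10 or higher and an email address in OPENALEX_MAILTO) are actually in place. The following is a minimal sketch of such a check; the `preflight` helper is hypothetical and not part of alex-mcp, and the email test is only a loose shape check.

```python
import os
import re
import sys


def preflight(env, version_info=sys.version_info):
    """Return a list of problems; an empty list means the basics are in place.

    Hypothetical helper for illustration, not part of alex-mcp.
    """
    problems = []
    if tuple(version_info[:2]) < (3, 10):
        problems.append("Python 3.10 or higher is required")
    mailto = env.get("OPENALEX_MAILTO", "")
    # Loose shape check only; it does not prove the address exists.
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", mailto):
        problems.append("OPENALEX_MAILTO is missing or not an email address")
    return problems


if __name__ == "__main__":
    for problem in preflight(os.environ):
        print(f"WARNING: {problem}")
```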
Add to your Claude Desktop configuration file:
{
  "mcpServers": {
    "alex-mcp": {
      "command": "/path/to/alex-mcp/run_alex_mcp.sh",
      "env": {
        "OPENALEX_MAILTO": "your-email@domain.com"
      }
    }
  }
}
Replace /path/to/alex-mcp with the actual path to the repository on your system.
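If you prefer to generate the configuration entry rather than edit it by hand, a short script can fill in the path and email for you. This is only a sketch: `claude_desktop_entry` is a hypothetical helper, and it prints the JSON instead of writing to the Claude Desktop configuration file, whose location depends on your system.

```python
import json
from pathlib import PurePosixPath


def claude_desktop_entry(repo_dir, mailto):
    """Build the mcpServers entry shown above for a given checkout path.

    Hypothetical helper for illustration; POSIX-style paths are assumed.
    """
    launcher = PurePosixPath(repo_dir) / "run_alex_mcp.sh"
    return {
        "mcpServers": {
            "alex-mcp": {
                "command": str(launcher),
                "env": {"OPENALEX_MAILTO": mailto},
            }
        }
    }


entry = claude_desktop_entry("/path/to/alex-mcp", "your-email@domain.com")
print(json.dumps(entry, indent=2))
```

Merge the printed object into any existing `mcpServers` section rather than overwriting the file.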
You can load this MCP server in your OpenAI agent workflow using the agents.mcp.MCPServerStdio interface:
import asyncio

from agents.mcp import MCPServerStdio


async def main():
    # The async context manager connects on entry and cleans up on exit,
    # so no explicit connect() call is needed.
    async with MCPServerStdio(
        name="OpenAlex MCP For Author disambiguation and works",
        cache_tools_list=True,
        params={
            "command": "uvx",
            "args": [
                "--from", "git+https://github.com/drAbreu/alex-mcp.git@4.1.0",
                "alex-mcp"
            ],
            "env": {
                "OPENALEX_MAILTO": "your-email@domain.com"
            }
        },
        client_session_timeout_seconds=10
    ) as alex_mcp:
        tools = await alex_mcp.list_tools()
        print(f"Available tools: {[tool.name for tool in tools]}")


asyncio.run(main())
This MCP server is specifically optimized for academic research workflows:
# Optimized for academic research workflows
from alex_agent import run_author_research
# Enhanced functionality with streamlined data
result = await run_author_research(
"Find J. Abreu at EMBO with recent publications"
)
# Clean, structured output for AI processing
print(f"Success: {result['workflow_metadata']['success']}")
print(f"Quality: {result['research_result']['metadata']['result_analysis']['quality_score']}/100")
# Standard launch
uvx --from git+https://github.com/drAbreu/alex-mcp.git@4.1.0 alex-mcp
# With environment variables
OPENALEX_MAILTO=your-email@domain.com uvx --from git+https://github.com/drAbreu/alex-mcp.git@4.1.0 alex-mcp
Search for authors with streamlined output for AI agents.
Parameters:
- name (required): Author name to search
- institution (optional): Institution name filter
- topic (optional): Research topic filter
- country_code (optional): Country code filter (e.g., "US", "DE")
- limit (optional): Maximum results (1-25, default: 20)
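The constraints in the parameter list above (a required name, a limit between 1 and 25 with a default of 20) can be checked on the client side before the tool is called. The sketch below assumes a hypothetical `build_search_args` helper; it is not the server's own validation, and upper-casing the country code is an added convenience rather than documented behavior.

```python
from typing import Optional


def build_search_args(
    name: str,
    institution: Optional[str] = None,
    topic: Optional[str] = None,
    country_code: Optional[str] = None,
    limit: int = 20,
) -> dict:
    """Validate and assemble arguments for the author search tool.

    Hypothetical client-side helper; the parameter names and the 1-25
    range come from the list above.
    """
    if not name or not name.strip():
        raise ValueError("name is required")
    if not 1 <= limit <= 25:
        raise ValueError("limit must be between 1 and 25")
    args = {"name": name.strip(), "limit": limit}
    optional = {
        "institution": institution,
        "topic": topic,
        "country_code": country_code,
    }
    for key, value in optional.items():
        if value:
            # Assumption: country codes are normalized to upper case.
            args[key] = value.upper() if key == "country_code" else value
    return args


print(build_search_args("J. Abreu", institution="EMBO", country_code="de"))
```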
Streamlined Output:
{
  "query": "J. Abreu",
  "total_count": 3,
  "results": [
    {
      "id": "https://openalex.org/A123456789",
      "display_name": "Jorge Abreu-Vicente",
      "orcid": "https://orcid.org/0000-0000-0000-0000",
      "display_name_alternatives": ["J. Abreu-Vicente", "Jorge Abreu Vicente"],
      "affiliations": [
        {
          "institution": {
            "display_name": "European Molecular Biology Organization",
            "country_code": "DE"
          },
          "years": [2023, 2024, 2025]
        }
      ],
      "cited_by_count": 316,
      "works_count": 25,
      "summary_stats": {
        "h_index": 9,
        "i10_index": 5
      },
      "x_concepts": [
        {
          "display_name": "Astrophysics",
          "score": 0.8
        },
        {
          "display_name": "Machine Learning",
          "score": 0.6
        }
      ]
    }
  ]
}
Features: Clean structure optimized for AI reasoning and disambiguation
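Because the tool returns several candidates, a client usually needs some rule for choosing among them. The sketch below shows one possible rule over the output shape above: an exact ORCID match dominates, an affiliation whose name contains a hint comes next, and citation count breaks ties. The `rank_candidates` helper and its weights are illustrative assumptions, not the server's ranking logic.

```python
def rank_candidates(results, institution_hint=None, orcid=None):
    """Order author candidates from most to least likely match.

    Hypothetical heuristic for illustration; the weights are arbitrary.
    """
    def score(author):
        points = 0
        if orcid and author.get("orcid") == orcid:
            points += 100  # an ORCID match outweighs everything else
        if institution_hint:
            hint = institution_hint.lower()
            for aff in author.get("affiliations", []):
                inst = (aff.get("institution") or {}).get("display_name", "")
                if hint in inst.lower():
                    points += 10
                    break
        # Citation count only breaks ties between equal scores.
        return (points, author.get("cited_by_count", 0))

    return sorted(results, key=score, reverse=True)


candidates = [
    {"display_name": "Candidate A", "cited_by_count": 500, "affiliations": []},
    {
        "display_name": "Candidate B",
        "cited_by_count": 316,
        "affiliations": [
            {"institution": {"display_name": "European Molecular Biology Organization"}}
        ],
    },
]
best = rank_candidates(candidates, institution_hint="Molecular Biology")[0]
print(best["display_name"])
```

Note that the hint is matched as a substring of the institution's display name, so an abbreviation such as "EMBO" would not match the full name in this sketch.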
Retrieve works for a given author with enhanced filtering capabilities.
Parameters:
- author_id (required): OpenAlex author ID
- limit (optional): Maximum results (1-50, default: 20)
- order_by (optional): "date" or "citations" (default: "date")
- publication_year (optional): Filter by specific year
- type (optional): Work type filter (e.g., "journal-article")
- authorships_institutions_id (optional): Filter by institution
- is_retracted (optional): Filter retracted works
- open_access_is_oa (optional): Filter by open access status
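To make the parameters above more concrete, the sketch below shows how they could translate into a request against the OpenAlex works endpoint. It assumes each parameter maps onto the OpenAlex filter key of the corresponding name (`author.id`, `publication_year`, `type`, `authorships.institutions.id`, `is_retracted`, `open_access.is_oa`); the server's actual request-building code is not shown in this document, and `build_works_url` and `short_id` are hypothetical helpers.

```python
from urllib.parse import urlencode

OPENALEX_WORKS_URL = "https://api.openalex.org/works"
SORT_KEYS = {"date": "publication_date:desc", "citations": "cited_by_count:desc"}


def short_id(openalex_id):
    """Accept either a full OpenAlex URL or a bare ID such as A123456789."""
    return openalex_id.rstrip("/").rsplit("/", 1)[-1]


def build_works_url(author_id, limit=20, order_by="date", publication_year=None,
                    type=None, authorships_institutions_id=None,
                    is_retracted=None, open_access_is_oa=None, mailto=None):
    """Assemble a works query URL (illustrative mapping, not the server's code)."""
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    if order_by not in SORT_KEYS:
        raise ValueError('order_by must be "date" or "citations"')
    filters = [f"author.id:{short_id(author_id)}"]
    if publication_year is not None:
        filters.append(f"publication_year:{publication_year}")
    if type:
        filters.append(f"type:{type}")
    if authorships_institutions_id:
        filters.append(
            f"authorships.institutions.id:{short_id(authorships_institutions_id)}"
        )
    if is_retracted is not None:
        filters.append(f"is_retracted:{str(is_retracted).lower()}")
    if open_access_is_oa is not None:
        filters.append(f"open_access.is_oa:{str(open_access_is_oa).lower()}")
    query = {"filter": ",".join(filters), "sort": SORT_KEYS[order_by],
             "per-page": limit}
    if mailto:
        query["mailto"] = mailto
    # Keep ':' and ',' readable, since the filter syntax relies on them.
    return f"{OPENALEX_WORKS_URL}?{urlencode(query, safe=':,')}"


print(build_works_url("https://openalex.org/A123456789",
                      type="journal-article", order_by="citations", limit=15))
```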
Enhanced Output:
{
  "author_id": "https://openalex.org/A123456789",
  "total_count": 25,
  "results": [
    {
      "id": "https://openalex.org/W123456789",
      "title": "A platform for the biomedical application of large language models",
      "doi": "10.1038/s41587-024-02534-3",
      "publication_year": 2025,
      "type": "journal-article",
      "cited_by_count": 42,
      "authorships": [
        {
          "author": {
            "display_name": "Jorge Abreu-Vicente"
          },
          "institutions": [
            {
              "display_name": "European Molecular Biology Organization"
            }
          ]
        }
      ],
      "locations": [
        {
          "source": {
            "display_name": "Nature Biotechnology",
            "type": "journal"
          }
        }
      ],
      "open_access": {
        "is_oa": true
      },
      "primary_topic": {
        "display_name": "Biomedical Engineering"
      }
    }
  ]
}
Features: Comprehensive work data with flexible filtering for targeted queries
This MCP server provides focused, structured data specifically designed for AI agent consumption:
- Identity Resolution: Names, ORCID, alternatives for disambiguation
- Affiliation Tracking: Current and historical institutional connections
- Impact Metrics: Citation counts, h-index, and scholarly impact
- Research Context: Fields, concepts, and domain expertise
- Career Analysis: Temporal affiliation changes and transitions
- Publication Metadata: Title, DOI, venue, and publication details
- Impact Assessment: Citation counts and scholarly influence
- Access Information: Open access status and availability
- Authorship Details: Complete author lists and institutional affiliations
- Research Classification: Topics, concepts, and domain categorization
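The impact metrics mentioned above follow standard definitions: the h-index is the largest h such that h works have at least h citations each, and the i10-index is the number of works with at least 10 citations. The author records already carry these values in `summary_stats`; the sketch below only illustrates the definitions, which can be useful when recomputing them over a filtered subset of works. Both function names are illustrative.

```python
def h_index(citations):
    """Largest h such that h works have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of works with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# Citation counts as they would appear in each work's cited_by_count field.
counts = [10, 8, 5, 4, 3]
print(h_index(counts), i10_index(counts))
```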
# Target high-impact journal articles
works = await retrieve_author_works(
author_id="https://openalex.org/A123456789",
type="journal-article", # Focus on journal publications
open_access_is_oa=True, # Open access only
order_by="citations", # Most cited first
limit=15
)
# Career transition analysis
authors = await search_authors(
name="J. Abreu",
institution="EMBO", # Current institution
topic="Machine Learning", # Research focus
limit=10
)
from alex_mcp.server import search_authors_core

# Comprehensive author search
results = search_authors_core(
    name="J Abreu Vicente",
    institution="EMBO",
    topic="Machine Learning",
    limit=20
)
print(f"Found {results.total_count} candidates")
for author in results.results:
    print(f"- {author.display_name}")
    if author.affiliations:
        current_inst = author.affiliations[0].institution.display_name
        print(f" Institution: {current_inst}")
    print(f" Metrics: {author.cited_by_count} citations, h-index {author.summary_stats.h_index}")
    if author.x_concepts:
        fields = [c.display_name for c in author.x_concepts[:3]]
        print(f" Research: {', '.join(fields)}")
from alex_mcp.server import retrieve_author_works_core

# Comprehensive work retrieval
works = retrieve_author_works_core(
    author_id="https://openalex.org/A5058921480",
    type="journal-article", # Academic focus
    order_by="citations", # Impact-based ordering
    limit=20
)
print(f"Found {works.total_count} publications")
for work in works.results:
    print(f"- {work.title}")
    if work.locations:
        journal = work.locations[0].source.display_name
        print(f" Published in: {journal} ({work.publication_year})")
    print(f" Impact: {work.cited_by_count} citations")
    if work.open_access and work.open_access.is_oa:
        print(" ✓ Open Access")
# Analyze career transitions
def analyze_career_path(author_result):
    affiliations = author_result.affiliations
    if len(affiliations) > 1:
        print("Career path:")
        for aff in sorted(affiliations, key=lambda x: min(x.years)):
            years = f"{min(aff.years)}-{max(aff.years)}"
            print(f" {years}: {aff.institution.display_name}")
    # Research evolution
    if author_result.x_concepts:
        print("Research areas:")
        for concept in author_result.x_concepts[:5]:
            print(f" {concept.display_name} (score: {concept.score:.2f})")

# Usage
results = search_authors_core("Jorge Abreu Vicente")
if results.results:
    analyze_career_path(results.results[0])
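The example above summarizes each affiliation as a single span from its first to its last year, which hides gaps (for instance, an author who left an institution and later returned). A small helper can instead collapse the `years` list into contiguous ranges. This is a sketch with a hypothetical `year_ranges` function, not part of the package.

```python
def year_ranges(years):
    """Collapse a list of years into contiguous ranges.

    Hypothetical helper for illustration; duplicates and ordering in the
    input are tolerated.
    """
    ordered = sorted(set(years))
    if not ordered:
        return ""
    ranges = []
    start = prev = ordered[0]
    for year in ordered[1:]:
        if year == prev + 1:
            prev = year  # still inside the current run
            continue
        ranges.append((start, prev))
        start = prev = year
    ranges.append((start, prev))
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


print(year_ranges([2015, 2016, 2019, 2021, 2022]))
```

An empty `years` list yields an empty string, which also sidesteps the error that `min()` would raise on it.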
# Required
export OPENALEX_MAILTO=your-email@domain.com
# Optional settings
export OPENALEX_MAX_AUTHORS=100 # Maximum authors per query
export OPENALEX_USER_AGENT=research-agent-v1.0
export ALEX_MCP_VERSION=4.1.0
# Rate limiting (respectful usage)
export OPENALEX_RATE_PER_SEC=10
export OPENALEX_RATE_PER_DAY=100000
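For readers who want to apply the same per-second limit in their own client code, the sketch below reads OPENALEX_RATE_PER_SEC and spaces calls at least 1/rate seconds apart. How the server itself consumes these variables is not shown here, so treat this as an assumption-laden illustration: `MinIntervalLimiter` and `limiter_from_env` are hypothetical names, the fallback of 10 requests per second simply mirrors the example value above, and the daily cap is not handled.

```python
import os
import time


class MinIntervalLimiter:
    """Space calls at least 1/rate_per_sec seconds apart (illustrative only)."""

    def __init__(self, rate_per_sec, clock=time.monotonic, sleep=time.sleep):
        if rate_per_sec <= 0:
            raise ValueError("rate_per_sec must be positive")
        self.min_interval = 1.0 / rate_per_sec
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None

    def wait(self):
        """Block until the next request is allowed, then reserve a slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval


def limiter_from_env(env=os.environ):
    # Assumption: fall back to 10 requests per second when the variable is unset.
    rate = float(env.get("OPENALEX_RATE_PER_SEC", "10"))
    return MinIntervalLimiter(rate)


limiter = limiter_from_env()
for _ in range(3):
    limiter.wait()  # call this before each API request
```

The clock and sleep functions are injectable so the limiter can be tested without real delays.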
# For comprehensive research applications
config = {
"max_authors_per_query": 25, # Detailed author analysis
"max_works_per_author": 50, # Complete publication history
"enable_all_filters": True, # Full filtering capabilities
"detailed_affiliations": True, # Complete institutional data
"research_concepts": True # Detailed concept analysis
}
alex-mcp/
├── src/alex_mcp/
│   ├── server.py           # Main MCP server
│   ├── data_objects.py     # Data models and structures
│   └── utils.py            # Utility functions
├── examples/
│   ├── basic_usage.py      # Simple examples
│   ├── advanced_queries.py # Complex query examples
│   └── integration_demo.py # AI agent integration
├── tests/
│   ├── test_server.py      # Server functionality tests
│   └── test_integration.py # Integration tests
└── docs/
    └── api_reference.md    # Detailed API documentation
# Install test dependencies
pip install -e ".[test]"
# Run functionality tests
pytest tests/test_server.py -v
# Test with real queries
python examples/basic_usage.py
# Test AI agent integration
python examples/integration_demo.py
# Test author disambiguation
python examples/basic_usage.py --query "J. Abreu" --institution "EMBO"
# Test work retrieval
python examples/advanced_queries.py --author-id "A123456789" --type "journal-article"
# Test integration patterns
python examples/integration_demo.py --workflow "career-analysis"
The server can also back an AI-powered research agent:
# Enhanced academic research agent
from alex_agent import AcademicResearchAgent
agent = AcademicResearchAgent(
mcp_servers=[alex_mcp], # Streamlined data processing
model="gpt-4.1-2025-04-14"
)
# Complex research queries with structured data
result = await agent.research_author(
"Find J. Abreu at EMBO with machine learning publications"
)
# Rich, structured output for AI reasoning
print(f"Quality Score: {result.quality_score}/100")
print(f"Author disambiguation: {result.confidence}")
print(f"Research fields: {result.research_domains}")
# Collaborative research analysis
async def research_collaboration_network(seed_author):
    # Find primary author (take the top-ranked candidate)
    authors = await alex_mcp.search_authors(seed_author)
    if not authors['results']:
        return None
    primary = authors['results'][0]
    # Get their works
    works = await alex_mcp.retrieve_author_works(
        primary['id'],
        type="journal-article"
    )
    # Collect co-authors, excluding the primary author
    collaborators = set()
    for work in works['results']:
        for authorship in work.get('authorships', []):
            collaborators.add(authorship['author']['display_name'])
    collaborators.discard(primary['display_name'])
    return {
        'primary_author': primary,
        'publication_count': len(works['results']),
        'collaborator_network': list(collaborators),
        'research_impact': sum(w['cited_by_count'] for w in works['results'])
    }
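The function above collects collaborators as a flat set. Counting how many works each co-author shares with the primary author gives a rough measure of collaboration strength. The sketch below works on plain dictionaries in the documented works output shape; `top_collaborators` is a hypothetical helper, and matching people by display name is a simplification (author IDs would be more reliable if they are available).

```python
from collections import Counter


def top_collaborators(works, primary_name, top_n=5):
    """Return (name, shared_work_count) pairs, most frequent first.

    Hypothetical helper for illustration; authors are matched by display name.
    """
    counts = Counter()
    for work in works:
        names = {
            a["author"]["display_name"]
            for a in work.get("authorships", [])
            if a.get("author", {}).get("display_name")
        }
        names.discard(primary_name)  # the primary author is not a collaborator
        counts.update(names)  # each co-author counts once per work
    return counts.most_common(top_n)


sample_works = [
    {"authorships": [{"author": {"display_name": "Primary Author"}},
                     {"author": {"display_name": "Coauthor One"}}]},
    {"authorships": [{"author": {"display_name": "Primary Author"}},
                     {"author": {"display_name": "Coauthor One"}},
                     {"author": {"display_name": "Coauthor Two"}}]},
]
print(top_collaborators(sample_works, "Primary Author"))
```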
We welcome contributions to improve functionality and add new features:
- Fork the repository
- Create a feature branch: git checkout -b feature/enhanced-filtering
- Add tests: Ensure your changes maintain data quality and structure
- Submit a pull request: Include examples and documentation
- Enhanced filtering capabilities
- Additional data enrichment
- Performance optimizations
- Integration examples
- Documentation improvements
This project is licensed under the MIT License. See LICENSE for details.