Community MCP server for the Ollama local model runner. Agents can list available models, pull new models from the library, generate text completions, create embeddings, and manage the local model cache. Runs entirely on local hardware with no API costs. Ideal for privacy-sensitive use cases, offline operation, and cost-effective inference with open-source models.
npx -y ollama-mcp
{
  "mcpServers": {
    "ollama": {
      "command": "npx",
      "args": [
        "-y",
        "ollama-mcp"
      ],
      "env": {
        "OLLAMA_HOST": "http://localhost:11434"
      }
    }
  }
}

Generate text with Ollama
{
  "type": "object",
  "required": [
    "model",
    "prompt"
  ],
  "properties": {
    "model": {
      "type": "string"
    },
    "prompt": {
      "type": "string"
    }
  }
}

// Input
{
  "model": "llama3",
  "prompt": "Explain MCP"
}

// Output
{
  "response": "MCP stands for..."
}

List Ollama models
{
  "type": "object",
  "required": [],
  "properties": {}
}

// Input
{}

// Output
{
  "models": [
    {
      "name": "llama3:70b",
      "size": 40000000000
    }
  ]
}
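The `size` field in the listing above is a raw byte count. A small client-side helper (an illustrative sketch, not part of the server) makes such listings human-readable; the name `format_model_size` is hypothetical:

```python
def format_model_size(num_bytes: int) -> str:
    """Render a byte count as decimal gigabytes (e.g. 40000000000 -> "40.0 GB")."""
    return f"{num_bytes / 1e9:.1f} GB"

# The 40 GB figure above corresponds to a 70B-parameter quantized model.
print(format_model_size(40000000000))
```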
Ollama provides 2 tools: generate and list_models.
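The generate tool's input schema above requires two string fields. A minimal validation sketch mirroring that schema (illustrative only; the server performs its own validation, and `validate` is a hypothetical helper):

```python
# The generate tool's input schema, as published on this page.
GENERATE_SCHEMA = {
    "type": "object",
    "required": ["model", "prompt"],
    "properties": {"model": {"type": "string"}, "prompt": {"type": "string"}},
}

def validate(args: dict, schema: dict) -> list:
    """Return a list of error messages; an empty list means the input is valid."""
    errors = [f"missing required field: {k}" for k in schema["required"] if k not in args]
    for key, spec in schema["properties"].items():
        if key in args and spec["type"] == "string" and not isinstance(args[key], str):
            errors.append(f"{key} must be a string")
    return errors
```

For example, `validate({"model": "llama3"}, GENERATE_SCHEMA)` reports the missing `prompt` field before the request ever reaches the server.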
Yes, Ollama is completely free to use. It runs models on your own hardware, so there are no API costs and no usage limits.
You can install the Ollama MCP server using the following command: npx -y ollama-mcp. After installation, add the provided config snippet to your Claude Desktop or Cursor configuration.
Ollama is listed under the AI & Machine Learning category in the AgentForge MCP registry.
Ollama has a current uptime of 99.95% with an average response time of 800ms.
To connect Ollama, click the "Connect Agent" button on this page to get the configuration snippet. Add it to your MCP client (Claude Desktop, Cursor, or any MCP-compatible tool). Your AI agent will then have access to all of Ollama's tools via the Model Context Protocol.
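Under the hood, MCP clients invoke tools with JSON-RPC 2.0 "tools/call" requests. A sketch of the message an agent would send to the generate tool, using the input example from this page (transport and handshake details omitted):

```python
import json

# JSON-RPC 2.0 request an MCP client sends to invoke a tool,
# per the Model Context Protocol's tools/call method.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "generate",
        "arguments": {"model": "llama3", "prompt": "Explain MCP"},
    },
}
print(json.dumps(request))
```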