Development
📦 Dependencies
All Python dependencies are listed in requirements.txt. The new productivity tools use only standard library features:
| Tool | Python Packages | External Tools |
|---|---|---|
| TodoWrite | Standard library only | None |
| Edit Tool | Standard library only | None |
| Glob Search | Standard library (glob) | None |
| Grep Search | Standard library (subprocess) | ripgrep (optional) |
| Error Handler | Standard library (functools) | None |
| Git Safety | Standard library (subprocess) | git |
| Plan Mode | Standard library (json, os) | None |
| Background Tasks | Standard library (threading) | None |
External Tools:
- ripgrep: Optional but recommended for the `grep_search` tool. Install via system package manager (see Installation section). If it is not installed, the assistant automatically falls back to slower alternatives.
- git: Used by the Git Safety tool.
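The fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual `grep_search` implementation: the function names `grep_search` and `python_search` and the output format are assumptions, and the pure-Python path uses Python's `re` syntax, which differs slightly from ripgrep's.

```python
import re
import shutil
import subprocess
from pathlib import Path
from typing import List


def python_search(pattern: str, root: str = ".") -> List[str]:
    """Slow fallback: scan every readable text file under root with re."""
    regex = re.compile(pattern)
    hits: List[str] = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append(f"{path}:{number}:{line}")
    return hits


def grep_search(pattern: str, root: str = ".") -> List[str]:
    """Use ripgrep when it is on PATH, otherwise fall back to python_search."""
    rg = shutil.which("rg")  # None when ripgrep is not installed
    if rg is not None:
        proc = subprocess.run(
            [rg, "--line-number", "--with-filename", "--no-heading",
             "--color", "never", pattern, root],
            capture_output=True, text=True, check=False,
        )
        # ripgrep exits with 1 when nothing matched; that is not an error
        if proc.returncode in (0, 1):
            return proc.stdout.splitlines()
    return python_search(pattern, root)
```

Checking for the binary with `shutil.which` keeps ripgrep a soft dependency: the tool still works on a machine without it, only more slowly.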
Core Python Packages:
- `langgraph`: Agent orchestration framework
- `langchain`, `langchain-core`: LLM abstraction layer
- `langchain-ollama`: Ollama integration
- `langchain-aws`: AWS Bedrock integration
- `langchain-openai`: OpenAI integration (also used for Bedrock Mantle OpenAI/Responses protocols)
- `langchain-anthropic`: Anthropic integration (Bedrock Mantle `anthropic` protocol)
- `aws-bedrock-token-generator`: Bearer-token auth for Bedrock Mantle
- `mcp`, `mcp[cli]`: Model Context Protocol
- `ollama`: Local LLM support
- `boto3`: AWS Bedrock/SageMaker
- `tiktoken`: Token counting
- `chromadb`, `faiss-cpu`: Vector stores for RAG
- `PyPDF2`, `python-docx`: Document readers
- `Pygments`: Code syntax highlighting
- `prompt_toolkit`: Interactive CLI
- `brave-search-python-client`: Web search
- `crawl4ai`: Web crawling
🛠️ Development
Testing
The test suite uses pytest and is split into two tiers under tests/:
- `tests/unit/`: fast, deterministic tests for pure logic (BM25, reasoning helpers, response parsing, subtask parsing, the tool error handler, git-safety command classification, file editing/search, bash timeout handling, and episodic-memory heuristics). No LLM, Ollama, or network required, so they run in seconds and don't need a `config.yaml`.
- `tests/integration/`: end-to-end tests that drive the real agent against a live Ollama server and the MCP subprocess (routing, tool calls, bash timeout, no silent empty turns). Marked with `@pytest.mark.integration` and auto-skipped unless a runtime `utils/config.yaml` exists and the configured Ollama host is reachable.
# Install test dependencies
pip install -r requirements-dev.txt
# Run everything (integration auto-skips if Ollama/config aren't available)
python -m pytest
# Unit tier only (fast — good for CI and pre-commit)
python -m pytest tests/unit
# Integration tier only (requires Ollama running + a real config.yaml)
python -m pytest -m integration
# Run a single file
python -m pytest tests/unit/test_bm25.py
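The auto-skip condition for the integration tier (a config file exists and the Ollama host answers) could be checked along these lines. This is a sketch under assumptions: the helper names `host_reachable` and `integration_available` are hypothetical, and the project's own `conftest.py` may decide differently.

```python
import os
import socket
from urllib.parse import urlparse


def host_reachable(url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    port = parsed.port or 11434  # Ollama's default port
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def integration_available(config_path: str, ollama_url: str) -> bool:
    """Integration tests only make sense with a config file and a live server."""
    # The file check runs first, so no network call is made without a config
    return os.path.isfile(config_path) and host_reachable(ollama_url)
```

In a `conftest.py`, a `pytest_collection_modifyitems` hook could call `integration_available(...)` once and add `pytest.mark.skip(reason=...)` to every item carrying the `integration` marker when it returns `False`.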
When adding new code, keep import-time side effects independent of config.yaml so the module stays unit-testable.
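One way to follow this advice is to defer reading the config until it is first needed. The sketch below is illustrative only: `load_config_text`, `count_terms`, and the default path are hypothetical names, and it returns raw text rather than parsed YAML so that it stays within the standard library.

```python
import functools
import os
from pathlib import Path

# Hypothetical default location; the MNEMOAI_CONFIG variable overrides it
DEFAULT_CONFIG = Path("config.yaml")


@functools.lru_cache(maxsize=1)
def load_config_text() -> str:
    """Read the config file on first call, not when the module is imported."""
    path = Path(os.environ.get("MNEMOAI_CONFIG", str(DEFAULT_CONFIG)))
    return path.read_text(encoding="utf-8")


def count_terms(text: str) -> dict:
    """Pure helper: importable and testable without any config file."""
    counts: dict = {}
    for term in text.lower().split():
        counts[term] = counts.get(term, 0) + 1
    return counts
```

Because nothing touches the filesystem at import time, a unit test can import the module and exercise `count_terms` on a machine that has no `config.yaml` at all.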
CI. The .github/workflows/tests.yml workflow runs the unit tier (plus a ruff import-sort check) on every push and pull request across Python 3.11–3.13. The integration tier is not run in CI (it needs a live Ollama server) — run it locally before a release, per the checklist below.
Stability & Versioning
Mnemo AI follows Semantic Versioning. The public surface that versioning protects is:
- Config keys in `config.yaml` (the `MODEL_ID` / `VISION_MODEL_ID` / `RAG.EMBED_MODEL_ID` fields, the `ENABLE_*` / `REQUIRE_*` toggles, and the documented section keys).
- Prompt keys in `prompts.yaml` (`SYSTEM_PROMPT`, `ROUTING_PROMPT`, `ORCHESTRATOR_PROMPT`, `AGGREGATOR_PROMPT`, `SUMMARY_SYSTEM_PROMPT`, `SUMMARY_TASK_PROMPT`). As of 0.8.16 these live in `prompts.yaml`, not `config.yaml` (keys left in `config.yaml` are ignored with a migration warning).
- The `mcp.json` schema for external MCP servers (`mcpServers` with `command` / `args` / `env` / `disabled`).
- CLI commands (`/config`, `/model`, `/params`, `/mcp`, `/memory`, `/plan`, `/compact`, `/clear`, `/save`, `/load`) and the `mnemoai` console command + `--no-verbose` flag.
- The distribution/import name (`pip install mnemoai-assistant` → `import mnemoai`).
Pre-1.0.0, minor releases may add features and occasionally adjust these. From 1.0.0 onward, a breaking change to any of the above bumps the major version; new backward-compatible features bump the minor; fixes bump the patch. Internal modules (anything under client/, server/, models/, utils/ not listed above) are not part of the public contract and may change between any releases. All changes are recorded in CHANGELOG.md.
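The bump rules can be expressed as a small function. This is a sketch, not project tooling: `next_version` and the change labels are hypothetical, and treating a pre-1.0.0 breaking change as a minor bump is one reading of "minor releases may ... occasionally adjust these".

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split 'X.Y.Z' into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def next_version(current: str, change: str) -> str:
    """Return the next version for a change of type 'breaking', 'feature' or 'fix'."""
    major, minor, patch = parse_version(current)
    if change == "breaking":
        if major == 0:
            # Assumption: before 1.0.0, breaking adjustments ship in a minor release
            return f"0.{minor + 1}.0"
        return f"{major + 1}.0.0"
    if change == "feature":
        return f"{major}.{minor + 1}.0"
    if change == "fix":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")
```

For example, `next_version("1.4.2", "breaking")` gives `"2.0.0"`, while `next_version("1.4.2", "fix")` gives `"1.4.3"`.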
Release Checklist
Before tagging a release:
- Unit tests + lint pass (also enforced by CI): `python -m pytest -m "not integration"` and `ruff check --select I .`.
- Integration smoke test with a live model (`PYTHONPATH=src python -m pytest -m integration`, or manually drive the app and verify). Prefer a capable model here: small local models (e.g. a 4B) are intermittently unreliable at tool-calling, so the tool-backed checks can flake; the suite passes deterministically on a strong model (e.g. Bedrock Claude Sonnet). Point the run at a specific config with `MNEMOAI_CONFIG=/path/to/config.yaml` if needed:
  - a greeting / simple Q&A returns a non-empty answer;
  - a tool-backed query runs a tool (e.g. "list files here") and the `[⚙ …]` marker fires;
  - with routing on, a multi-step task is decomposed (orchestrator) and completes;
  - plan mode: `/plan` on → an edit/bash request is blocked; `/plan` off → it proceeds;
  - an external MCP tool (from `mcp.json`) is callable.
- Update `CHANGELOG.md`: move `Unreleased` items under the new version + date.
- Bump `version` in `pyproject.toml`.
- Build + validate: `uv build` then `twine check dist/*`.
- Tag `vX.Y.Z`, push, then `twine upload dist/*` (refreshes the PyPI description).
Adding New Tools
- Create tool file in `server/tools/`:
    from mcp.server.fastmcp import FastMCP

    def register_your_tool(mcp: FastMCP):
        @mcp.tool()
        async def your_tool(param: str) -> str:
            """Tool description for the LLM."""
            # Placeholder: replace with the real implementation
            result = f"your_tool received: {param}"
            return result
- Register in `tools_manager.py`:
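As a rough illustration of the registration pattern, assuming `tools_manager.py` simply calls each `register_*` function with the shared server object, the sketch below wires one tool into a stand-in server. `FakeMCP` and `register_all_tools` are hypothetical names introduced so the example runs without the `mcp` package; the real file may be organised differently.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class FakeMCP:
    """Hypothetical stand-in for FastMCP that only records registered tools."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., Awaitable[str]]] = {}

    def tool(self):
        # Mirrors the shape of the @mcp.tool() decorator used above
        def decorator(func):
            self.tools[func.__name__] = func
            return func
        return decorator


def register_your_tool(mcp) -> None:
    @mcp.tool()
    async def your_tool(param: str) -> str:
        """Tool description for the LLM."""
        return f"your_tool received: {param}"


def register_all_tools(mcp) -> None:
    """Hypothetical entry point: call every register_* function once."""
    for register in (register_your_tool,):
        register(mcp)


mcp = FakeMCP()
register_all_tools(mcp)
print(asyncio.run(mcp.tools["your_tool"]("hello")))
```

The same stand-in is handy in the unit tier: a test can register a tool against `FakeMCP` and call the coroutine directly, with no MCP subprocess involved.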
Adding New File Readers
- Create reader in `server/tools/readers/`:
    async def read_your_format(path: str) -> str:
        """Read your custom format."""
        # Placeholder: replace with format-specific parsing
        with open(path, encoding="utf-8") as f:
            content = f.read()
        return content
- Register in `fs_read.py`:
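One plausible shape for reader registration is a mapping from file extension to reader coroutine. This is a sketch under that assumption; the `READERS` table, `read_file`, and the `.yourfmt` extension are hypothetical and may not match how `fs_read.py` actually dispatches.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable, Dict

Reader = Callable[[str], Awaitable[str]]


async def read_plain_text(path: str) -> str:
    """Default reader: return the file contents unchanged."""
    return Path(path).read_text(encoding="utf-8")


async def read_your_format(path: str) -> str:
    """Read your custom format (placeholder: upper-cases the text)."""
    return Path(path).read_text(encoding="utf-8").upper()


# Hypothetical registry: file extension -> reader coroutine
READERS: Dict[str, Reader] = {
    ".txt": read_plain_text,
    ".yourfmt": read_your_format,
}


async def read_file(path: str) -> str:
    """Pick a reader by extension, falling back to plain text."""
    reader = READERS.get(Path(path).suffix.lower(), read_plain_text)
    return await reader(path)
```

With a table like this, adding a format is a one-line change to the registry plus the new reader function.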
Switching Model Providers
The application uses controller classes for centralized model management. To switch providers, just update config.yaml:
For LLM:
For Vision:
For Embeddings:
The controllers (llm_controller.py, vision_model_controller.py, embeddings_controller.py) handle all provider-specific initialization automatically.
Adding New Model Providers
- Update the appropriate controller in `models/`:
    def initialize_model(self):
        if self.model_type == "your_provider":
            # Your provider initialization
            self.model = YourProviderModel(...)
- Add configuration in `config.yaml`
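To see the two steps together, here is a self-contained sketch of a controller that selects a provider from configuration. `EchoModel`, `LLMControllerSketch`, and the `MODEL_TYPE` key are hypothetical stand-ins for illustration; only `MODEL_ID` is a key named in this document, and the real controllers in `models/` will differ.

```python
from typing import Any, Dict


class EchoModel:
    """Hypothetical stand-in for a provider client."""

    def __init__(self, model_id: str) -> None:
        self.model_id = model_id

    def invoke(self, prompt: str) -> str:
        return f"[{self.model_id}] {prompt}"


class LLMControllerSketch:
    """Illustrative controller: picks a provider based on the config dict."""

    def __init__(self, config: Dict[str, Any]) -> None:
        self.model_type = config["MODEL_TYPE"]  # hypothetical key name
        self.model_id = config["MODEL_ID"]
        self.model = None
        self.initialize_model()

    def initialize_model(self) -> None:
        if self.model_type == "your_provider":
            self.model = EchoModel(self.model_id)
        else:
            # Fail loudly instead of leaving self.model as None
            raise ValueError(f"Unsupported model type: {self.model_type!r}")


controller = LLMControllerSketch({"MODEL_TYPE": "your_provider", "MODEL_ID": "demo"})
print(controller.model.invoke("hello"))
```

Raising on an unknown provider makes a typo in the config surface at startup rather than on the first model call.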
🔧 Ollama Utilities (Optional)
The bash/ directory contains helper scripts for Ollama users on macOS and Linux.
Ollama Environment Setup (macOS)
Sets Ollama performance environment variables at boot and launches the Ollama app:
Setup:
- Edit `bash/ollama-env-mac/ollama.environment.plist` (no changes needed for defaults)
- Copy to LaunchAgents:
cp bash/ollama-env-mac/ollama.environment.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/ollama.environment.plist
VRAM Cleaner
Automatically unloads idle Ollama models from VRAM to free GPU memory. Useful when running multiple models or when GPU memory is limited.
macOS (LaunchAgent, runs every 60 seconds):
- Edit `bash/ollama-freeup-vram/com.ollama.vramcleaner.plist`:
  - Replace `<PATH_TO_FOLDER>` with the actual path to this repository
  - Replace `<PATH_TO_USER_HOME>` with your home directory
- Install:
cp bash/ollama-freeup-vram/com.ollama.vramcleaner.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.ollama.vramcleaner.plist
Linux (systemd):
- Edit `bash/ollama-freeup-vram/ollama-vram-cleaner.service`:
  - Replace `<PATH_TO_FOLDER>` with the actual path
- Install:
sudo cp bash/ollama-freeup-vram/ollama-vram-cleaner.service /etc/systemd/system/
sudo systemctl enable ollama-vram-cleaner
sudo systemctl start ollama-vram-cleaner
See bash/ollama-freeup-vram/README.md and bash/ollama-env-mac/README.md for more details.
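The cleaner scripts themselves are shell. For readers who prefer Python, the underlying idea of the VRAM cleaner can be sketched against Ollama's HTTP API: `GET /api/ps` lists loaded models, and a `POST /api/generate` with `keep_alive` set to `0` asks Ollama to unload one. The function names below are hypothetical, and unlike the bundled scripts this sketch makes no attempt to detect idleness; it unloads everything that is loaded.

```python
import json
import urllib.request
from typing import List

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address


def loaded_models(base_url: str = OLLAMA_URL) -> List[str]:
    """Names of models currently loaded, as reported by GET /api/ps."""
    with urllib.request.urlopen(f"{base_url}/api/ps", timeout=5) as response:
        payload = json.load(response)
    return [entry["name"] for entry in payload.get("models", [])]


def unload_request(model: str, base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a request asking Ollama to unload a model (keep_alive = 0)."""
    body = json.dumps({"model": model, "keep_alive": 0}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def unload_all(base_url: str = OLLAMA_URL) -> List[str]:
    """Unload every loaded model and return the names that were unloaded."""
    names = loaded_models(base_url)
    for name in names:
        with urllib.request.urlopen(unload_request(name, base_url), timeout=30):
            pass  # the response body is not needed
    return names
```

Calling `unload_all()` requires a running Ollama server; `unload_request` can be inspected without one.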
🐛 Troubleshooting
Common Issues
MCP Connection Errors
- Verify Python path in `client.py` matches your environment
- Check server path is correct
- Ensure all dependencies are installed (`pip install -r requirements.txt`)
Model Loading Issues
- Verify model name and type in `config.yaml`
- For Ollama: Ensure Ollama is running (`ollama serve`) and model is pulled (`ollama pull model-name`)
- For AWS Bedrock: Check credentials (`aws sts get-caller-identity`), region, and model access
- For OpenAI: Ensure `OPENAI_API_KEY` environment variable is set
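Some of these checks are easy to automate before starting the app. The sketch below is a hypothetical helper, not part of the project: the `preflight` name and the provider labels `"ollama"` and `"openai"` are assumptions, and it only covers the checks that need no network access.

```python
import shutil
from typing import List, Mapping


def preflight(provider: str, env: Mapping[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    if provider == "ollama" and shutil.which("ollama") is None:
        problems.append("ollama executable not found on PATH")
    if provider == "openai" and not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems
```

Typical use would be `preflight("openai", os.environ)`, printing each returned problem before launching the assistant.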
RAG / Episodic Memory Not Working
- Ensure `ENABLE_RAG: true` (or `ENABLE_EPISODIC_MEMORY: true`) in config
- Verify embedding model is configured and available (`RAG.EMBED_MODEL_ID` in config)
- For Ollama embeddings: ensure the embedding model is pulled (`ollama pull mxbai-embed-large`)
- Check logs for "fallback embeddings" warnings, which mean the real model is unreachable
- Verify documents are being indexed with `list_documents()`
Permission Errors
- Ensure write permissions for `~/.mnemoai/` (the app home: config, plans, tasks, per-profile state)
- Check file paths in configuration
Import Errors on Startup
- Some dependencies (chromadb, faiss-cpu, crawl4ai) can be tricky to install. Check platform-specific instructions.
- On Apple Silicon: `faiss-cpu` may require `pip install faiss-cpu --no-cache-dir`
Logging
Logs are output to stderr with configurable level:
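A minimal sketch of stderr logging with a selectable level, using the standard `logging` module. The logger name `"mnemoai"`, the `configure_logging` function, and its `level_name` parameter are assumptions for illustration and do not reflect the project's actual configuration key.

```python
import logging
import sys


def configure_logging(level_name: str = "INFO") -> logging.Logger:
    """Attach a stderr handler to a named logger and set its level."""
    level = getattr(logging, level_name.upper(), None)
    if not isinstance(level, int):
        raise ValueError(f"Unknown log level: {level_name!r}")
    logger = logging.getLogger("mnemoai")  # logger name is an assumption
    logger.setLevel(level)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Writing to stderr keeps log lines out of stdout, which matters when stdout carries protocol traffic or the interactive CLI output.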
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🤝 Contributing
This is a personal development project. If you'd like to use or extend it, feel free to fork the repository and adapt it to your needs!
If you use this code in your own projects, attribution to the original repository is appreciated but not required.
🙏 Acknowledgments
- Built with LangGraph and LangChain
- Uses FastMCP for Model Context Protocol
- Powered by Ollama, Amazon Bedrock, and Amazon SageMaker AI