Welcome to Jan Nano Docs

We spent all the money on GPUs - so here is a blank HTML docs page.
Overview
Jan-Nano is a compact 4-billion parameter language model specifically designed and trained for deep research tasks. This model has been optimized to work seamlessly with Model Context Protocol (MCP) servers, enabling efficient integration with various research tools and data sources.
Demo
Jan-Nano is supported by Jan, an open-source ChatGPT alternative that runs entirely on your computer. Jan provides a user-friendly interface for running local AI models with full privacy and control.
System Requirements
- Minimum Requirements:
  - 8GB RAM (with the iQ4_XS quantization; see the sizing sketch after this list)
  - 12GB VRAM (for the Q8 quantization)
  - CUDA-compatible GPU
- Recommended Setup:
  - 16GB+ RAM
  - 16GB+ VRAM
  - Latest CUDA drivers
  - RTX 30/40 series or newer
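The quantization figures above follow directly from bits per weight. A back-of-envelope sketch in Python (the bits-per-weight values and the overhead constant are approximations, not measured numbers):

```python
# Rough GGUF memory estimate for a 4B-parameter model. Bits-per-weight
# figures are approximate, and real usage adds context-dependent overhead
# (KV cache, compute buffers), modeled here as a guessed constant.
PARAMS = 4e9
BITS_PER_WEIGHT = {"Q8_0": 8.5, "iQ4_XS": 4.25}  # approximate values
OVERHEAD_GIB = 1.5  # assumed headroom for KV cache and buffers

for quant, bpw in BITS_PER_WEIGHT.items():
    weights_gib = PARAMS * bpw / 8 / 2**30
    print(f"{quant}: ~{weights_gib:.1f} GiB weights + ~{OVERHEAD_GIB} GiB overhead")
```

At roughly 4 GiB of weights, Q8 fits in 12GB of VRAM with room for context; iQ4_XS at roughly 2 GiB is what makes 8GB machines viable.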
Setup Guidelines
Quick Start
- Install Jan Beta
- Download Jan-Nano from the Hub
- Start a new chat with Jan-Nano (or script against it; see the sketch below)
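If you prefer to script against the model once it loads, Jan can expose a local OpenAI-compatible API server. A minimal sketch, assuming the default port 1337 and a jan-nano model id (both may differ; check Jan's Local API Server settings for the actual values):

```python
# Minimal sketch: query Jan-Nano through Jan's local OpenAI-compatible
# API server. The port (1337) and model id ("jan-nano") are assumptions;
# use the values shown in Jan's Local API Server settings.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1337/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="jan-nano",  # hypothetical id; copy the exact id from Jan
    messages=[{"role": "user", "content": "What is MCP, in one sentence?"}],
)
print(resp.choices[0].message.content)
```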
MCP Server (Serper) Setup
- Recommended: Serper MCP Server
- Requirements: Node.js ≥ 18 and a Serper API key (available from https://serper.dev)
Using with Jan
- Start the Serper MCP server as described above.
- In Jan, go to Settings → MCP Servers.
- Add a new MCP server and set the command to:

  env SERPER_API_KEY=your_api_key_here npx -y serper-search-scrape-mcp-server

- Save and ensure Jan can connect to the MCP server (a standalone verification sketch follows this list).
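Before wiring the server into Jan, it can help to confirm it starts and exposes its tools. A hedged sketch using the official MCP Python SDK (`pip install mcp`); the tool names printed depend on the server version:

```python
# Sketch: spawn the Serper MCP server over stdio and list its tools via
# the MCP Python SDK. Assumes Node.js >= 18 is installed and that
# SERPER_API_KEY is already exported in your shell environment.
import asyncio
import os

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

params = StdioServerParameters(
    command="npx",
    args=["-y", "serper-search-scrape-mcp-server"],
    env=dict(os.environ),  # pass SERPER_API_KEY (and PATH) through to npx
)

async def main() -> None:
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("Tools:", [tool.name for tool in tools.tools])

asyncio.run(main())
```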
Performance
Jan-Nano has been evaluated on the SimpleQA benchmark using an MCP-based methodology: the model answers SimpleQA questions while leveraging its native MCP server integration capabilities. This better reflects Jan-Nano's real-world performance as a tool-augmented research model than a closed-book evaluation would.
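To make the setup concrete, here is a minimal exact-match scoring loop of the kind such an evaluation might run, pointed at an OpenAI-compatible endpoint. This is an illustrative sketch, not the actual benchmark harness; the base URL, model id, and sample questions are all assumptions:

```python
# Hypothetical sketch of exact-match QA scoring against a local
# OpenAI-compatible endpoint serving Jan-Nano (e.g. a vLLM server).
# Not the actual benchmark harness; endpoint and model id are assumed.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

def normalize(text: str) -> str:
    return " ".join(text.lower().split())

qa_pairs = [  # stand-in items; SimpleQA itself ships thousands of questions
    ("What is the capital of Australia?", "Canberra"),
]

correct = 0
for question, answer in qa_pairs:
    resp = client.chat.completions.create(
        model="jan-nano",  # hypothetical model id
        messages=[{"role": "user", "content": question}],
    )
    prediction = resp.choices[0].message.content or ""
    correct += normalize(answer) in normalize(prediction)

print(f"Accuracy: {correct / len(qa_pairs):.1%}")
```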
FAQ
- What are the recommended GGUF quantizations?
  - Q8 GGUF is recommended for best performance.
  - iQ4_XS GGUF is for very limited VRAM setups.
  - Avoid Q4_0 and Q4_K_M, as they show significant performance degradation.
- Can I run this on a laptop with 8GB RAM?
  - Yes, but use the recommended quantization (iQ4_XS).
  - Note that performance may be limited with Q4 quantizations.
- How much did the training cost?
  - Training was done on internal A6000 clusters.
  - The estimated cost on RunPod would be under $100 using an H200.
  - Hardware used:
    - 8x A6000 for the training code
    - 4x A6000 for the vLLM server (inference)
- What frontend should I use?
  - Jan Beta (recommended): a minimalistic and polished interface.
  - Download link: https://jan.ai/docs/desktop/beta
- Getting Jinja errors in LM Studio?
  - Use the Qwen3 template from other LM Studio-compatible models.
  - Disable “thinking” and add the required system prompt.
  - A fix is coming in future GGUF releases.
- Having model loading issues in Jan?
  - Use the latest beta version: Jan-beta_0.5.18-rc6-beta.
  - Ensure proper CUDA support for your GPU.
  - Check that the VRAM requirements match your quantization choice (a quick check follows this list).
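For that last VRAM check, a quick way to confirm CUDA is visible and see how much memory the GPU reports (a sketch assuming PyTorch is installed with CUDA support):

```python
# Quick check: is a CUDA device visible, and how much VRAM does it have?
# Assumes PyTorch with CUDA support is installed.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 2**30:.1f} GiB VRAM")
else:
    print("No CUDA device visible; check your drivers and CUDA install.")
```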
Contact
- For support, questions, and community chat, join the Menlo Discord Community