Cortex Blog

Latest news, updates, and insights from the Cortex team

Introducing Cortex 2.0: Faster, Better, Stronger

We're excited to announce the release of Cortex 2.0, featuring significant performance improvements and new capabilities.

CPU

CPU Inference Optimizations

Techniques for optimizing LLM inference on CPU-only machines.

API

Building Applications with Cortex API

A step-by-step guide to building applications using our API...

RAG

Retrieval-Augmented Generation with Cortex

Implementing RAG systems for accurate and contextual responses...

GPU

GPU Recommendations for Local LLMs

Our guide to selecting the right GPU for running various models...

Tools

Function Calling with Local Models

How to implement function calling capabilities with local models...

Quick Start Code Example


# Install Cortex
pip install cortexcpp

# Initialize with your first model
cortex pull llama3

# Start serving
cortex serve --model llama3
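
Once the server is running, you can query it from Python. This is a minimal sketch that assumes the Cortex server exposes an OpenAI-compatible chat-completions endpoint; the URL, port (39281), and response schema below are assumptions, so check your local server's docs for the actual values.

```python
import json
from urllib import request

# Assumed endpoint for a local Cortex server exposing an
# OpenAI-compatible API; adjust host/port to match your setup.
CORTEX_URL = "http://localhost:39281/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> dict:
    """Assemble a minimal OpenAI-style chat request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(model: str, prompt: str) -> str:
    """POST the request to the local server and return the reply text.

    Assumes an OpenAI-style response shape:
    {"choices": [{"message": {"content": "..."}}]}
    """
    body = json.dumps(build_chat_request(model, prompt)).encode()
    req = request.Request(
        CORTEX_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

With the server started as above, `ask("llama3", "Say hello")` sends a single user message and returns the model's reply as a string.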