
Getting Started with Cortex

Installation

Cortex provides a Local Installer that bundles all required dependencies, so once you’ve downloaded it, no internet connection is required during installation.
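
As an illustration, a downloaded Linux package could be installed fully offline like this (the filename is hypothetical):

Terminal window
# Hypothetical installer filename; use the package you downloaded for your platform
sudo dpkg -i cortex-local-installer.deb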

Starting the Server

Cortex runs an API server on localhost:39281 by default. The port can be customized in .cortexrc with the apiServerPort parameter.
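
For example, to move the API server to a different port (a minimal sketch, assuming .cortexrc uses plain YAML key/value pairs):

.cortexrc
# Change the API server port (default: 39281)
apiServerPort: 4000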

Terminal window
cortex start
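
Once the server is running, a quick request can confirm it is reachable; the /healthz health-check endpoint used here is an assumption of this sketch:

Terminal window
curl http://127.0.0.1:39281/healthz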

Engine Management

Cortex supports specialized engines for different multi-modal foundation models: llama.cpp and ONNXRuntime. By default, Cortex installs llama.cpp as its main engine.

For more information, check out Engine Management.

List Available Engines

Terminal window
curl --request GET \
  --url http://127.0.0.1:39281/v1/engines

Install an Engine

Terminal window
curl http://127.0.0.1:39281/v1/engines/llama-cpp/install \
  --request POST \
  --header 'Content-Type: application/json'
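
Installation may take a moment; to confirm the engine is in place, you can query it afterwards (assuming a per-engine GET endpoint that mirrors the list endpoint above):

Terminal window
curl --request GET \
  --url http://127.0.0.1:39281/v1/engines/llama-cpp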

Model Management

Pull a Model

You can download models by their short alias:

Terminal window
cortex pull llama3.3

Or pull a specific model directly from a Hugging Face repository:

Terminal window
cortex pull bartowski/Meta-Llama-3.1-8B-Instruct-GGUF

All model files are stored in the ~/cortex/models folder.
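
You can inspect the downloaded files directly:

Terminal window
ls ~/cortex/models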

Stop Model Download

To cancel an in-progress download, pass the taskId that identifies it:

Terminal window
curl --request DELETE \
  --url http://127.0.0.1:39281/v1/models/pull \
  --header 'Content-Type: application/json' \
  --data '{"taskId": "tinyllama:tinyllama:1b-gguf-q3-km"}'

List All Models

Terminal window
curl --request GET \
  --url http://127.0.0.1:39281/v1/models

Delete a Model

Terminal window
curl --request DELETE \
  --url http://127.0.0.1:39281/v1/models/tinyllama:1b-gguf-q3-km

Running Models

Start a Model

Terminal window
# This downloads (if needed) and starts the model in one command
cortex run llama3.3
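
The same can be done over the API; this sketch assumes a /v1/models/start endpoint that takes the model id in its JSON body:

Terminal window
curl --request POST \
  --url http://127.0.0.1:39281/v1/models/start \
  --header 'Content-Type: application/json' \
  --data '{"model": "llama3.1:8b-gguf"}'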

Create Chat Completion

Terminal window
curl --request POST \
  --url http://localhost:39281/v1/chat/completions \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "llama3.1:8b-gguf",
    "messages": [
      {
        "role": "user",
        "content": "Write a Haiku about cats and AI"
      }
    ],
    "stream": false
  }'
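
Setting "stream" to true instead returns the response incrementally, token by token, in the usual OpenAI-compatible streaming format.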

System Status

Check the status of running models and system hardware (RAM, VRAM, engine, uptime).

Terminal window
cortex ps

Stop a Model

Terminal window
cortex models stop llama3.3

Stopping the Server

Terminal window
cortex stop

What’s Next?

Now that Cortex is set up, you can continue to:

  • Adjust the folder path and configuration using the .cortexrc file
  • Explore Cortex’s data folder to understand how data gets stored
  • Learn about the structure of the model.yaml file in Cortex