
Using Models

Cortex’s Models API is compatible with OpenAI’s Models endpoint; it is a fork of the OpenAI API adapted for model management. In addition, Cortex exposes lower-level model operations, such as downloading models from a model hub and loading models into an engine.

Model Operations

Model Operations let you pull, run, and stop models.

Run Model

curl --request POST \
  --url http://localhost:39281/v1/models/mistral/start \
  --header 'Content-Type: application/json' \
  --data '{
    "prompt_template": "system\n{system_message}\nuser\n{prompt}\nassistant",
    "stop": [],
    "ngl": 4096,
    "ctx_len": 4096,
    "cpu_threads": 10,
    "n_batch": 2048,
    "caching_enabled": true,
    "grp_attn_n": 1,
    "grp_attn_w": 512,
    "mlock": false,
    "flash_attn": true,
    "cache_type": "f16",
    "use_mmap": true,
    "engine": "llamacpp"
  }'
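
The same start request can be built and sent from Python. The following is a minimal sketch using only the standard library; the endpoint and parameter names mirror the curl call above, and a Cortex server listening on localhost:39281 is assumed.

```python
import json
import urllib.request

BASE_URL = "http://localhost:39281"

def build_start_request(model: str, **params) -> urllib.request.Request:
    """Build the POST request that starts a model, mirroring the curl call above."""
    url = f"{BASE_URL}/v1/models/{model}/start"
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_start_request("mistral", ctx_len=4096, ngl=4096, engine="llamacpp")
# urllib.request.urlopen(req)  # uncomment when the Cortex server is running
```

Only the parameters you pass end up in the JSON body, so you can supply as few or as many of the options shown above as you need.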

Stop Model

curl --request POST \
  --url http://localhost:39281/v1/models/mistral/stop

Pull Model

curl --request POST \
  --url http://localhost:39281/v1/models/mistral/pull
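
Pulling and starting are separate steps, so a typical first run chains them. A hypothetical sketch (standard library only; assumes the server at localhost:39281 and the endpoints shown in this section):

```python
import urllib.request

BASE_URL = "http://localhost:39281"

def model_action_url(model: str, action: str) -> str:
    """URL for a model operation endpoint: pull, start, or stop."""
    return f"{BASE_URL}/v1/models/{model}/{action}"

def run_model(model: str) -> None:
    """Download the model first, then load it into the engine."""
    for action in ("pull", "start"):
        req = urllib.request.Request(model_action_url(model, action), method="POST")
        urllib.request.urlopen(req)  # raises on non-2xx responses
```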

Model Management

Model Management lets you manage your local models, which are stored under your home directory in ~/cortex/models.

List Models

curl --request GET \
  --url http://localhost:39281/v1/models
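
Because the endpoint is OpenAI-compatible, the response can be assumed to follow OpenAI's list envelope ({"object": "list", "data": [...]}); that shape is an assumption here, not something this page documents. A sketch that extracts the model ids:

```python
import json

def model_ids(response_body: str) -> list[str]:
    """Extract model ids, assuming the OpenAI-style list envelope."""
    payload = json.loads(response_body)
    return [m["id"] for m in payload.get("data", [])]

# Example response in the assumed shape:
sample = '{"object": "list", "data": [{"id": "mistral"}, {"id": "llama3"}]}'
```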

Get Model

curl --request GET \
  --url http://localhost:39281/v1/models/mistral

Delete Model

curl --request DELETE \
  --url http://localhost:39281/v1/models/mistral

Update Model

curl --request PATCH \
  --url http://localhost:39281/v1/models/mistral \
  --header 'Content-Type: application/json' \
  --data '{}'
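
The curl above sends an empty body; in practice you PATCH only the fields you want to change. A sketch of building that request (the field name used in the example is borrowed from the start payload above as an assumption; which keys the endpoint accepts is not specified on this page):

```python
import json
import urllib.request

def build_update_request(model: str, **fields) -> urllib.request.Request:
    """PATCH only the supplied fields on a model."""
    url = f"http://localhost:39281/v1/models/{model}"
    return urllib.request.Request(
        url,
        data=json.dumps(fields).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )

req = build_update_request("mistral", ctx_len=2048)
# urllib.request.urlopen(req)  # uncomment when the Cortex server is running
```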