403 pulls · 5 months ago

Q6_K / Q5_K_M / Q4_K_S | mistral-small3.1:24b-instruct-2503

tools · 09467e913633 · 20GB · mistral3 · 24B · Q6_K
Default parameters: { "num_ctx": 4096 }


Extra quants for Mistral-Small-3.1-24B

Q6_K / Q5_K_M / Q4_K_S

These were quantized with the Ollama client, so the quants retain Vision support.
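For reference, quantizing through the Ollama client generally follows the pattern below. This is a minimal sketch, not the author's exact commands: the source path and output tag are illustrative, and `--quantize` accepts the quant types listed in this README (`q6_K`, `q5_K_M`, `q4_K_S`).

```shell
# Hypothetical Modelfile pointing at the full-precision weights
# (path is a placeholder, not from this README):
#
#   FROM /path/to/Mistral-Small-3.1-24B-Instruct-2503
#   PARAMETER num_ctx 4096

# Quantize while importing; repeat with q5_K_M / q4_K_S for the other tags.
ollama create --quantize q6_K mistral-small3.1-extra:q6_K -f Modelfile
```

Because the conversion goes through the Ollama client rather than a text-only GGUF pipeline, the vision projector is carried along with the quantized weights.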


On an RTX 4090 with 24GB of VRAM, with the Q8 KV cache enabled, and leaving 800MB–1GB of VRAM free as a buffer, the usable context per quant is:


Q6_K: 35K context

Q5_K_M: 64K context

Q4_K_S: 100K context
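The spread above makes sense as a VRAM trade-off: a smaller quant file leaves more of the 24GB for the KV cache, hence more context. A rough sketch of the cache-size arithmetic, assuming commonly reported geometry for this model family (40 layers, 8 KV heads, head dim 128 — these numbers are assumptions, not stated in this README) and 1 byte per element for a Q8 KV cache:

```python
def kv_cache_bytes(n_ctx, n_layers=40, n_kv_heads=8, head_dim=128, bytes_per_elem=1):
    """Estimate KV cache size: 2x (keys + values) per layer per KV head.

    Architecture defaults are assumptions about Mistral Small 3.1, not
    values taken from this README. bytes_per_elem=1 models a Q8 cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_ctx

# Approximate cache cost at each quant's listed context limit.
for ctx in (35_000, 64_000, 100_000):
    gib = kv_cache_bytes(ctx) / 2**30
    print(f"{ctx:>7} tokens -> ~{gib:.1f} GiB KV cache")
```

Under these assumptions the cache costs roughly 2.7 / 4.9 / 7.6 GiB at 35K / 64K / 100K tokens, which lines up with the Q6_K file (~20GB) only having room for the smallest context while Q4_K_S can afford the largest.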