
Storytelling mixture-of-experts model for consumer GPUs. Made by DavidAU (Hugging Face).

Tags: tools

Model: ecb74ebe7a60 · 13GB · llama · 24.9B · IQ4_XS
System: Write {{char}}'s next reply in this fictional roleplay with {{user}}.
Params: { "stop": [ "<|im_start|>", "<|im_end|>" ] }
Template: {{- if .Suffix }}<|fim_prefix|>{{ .Prompt }}<|fim_suffix|>{{ .Suffix }}<|fim_middle|> {{- else if .M
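
For anyone importing the GGUF into Ollama by hand rather than pulling this upload, the system prompt and stop tokens above translate into a Modelfile along these lines. This is a minimal sketch: the .gguf filename is a placeholder, and the template layer is left out (Ollama falls back to a default for the architecture).

    # Minimal Modelfile sketch; the .gguf filename is hypothetical
    FROM ./Dark-Planet-Rebel-Fury-25B.IQ4_XS.gguf
    # System prompt and stop tokens as listed in the model details above
    SYSTEM "Write {{char}}'s next reply in this fictional roleplay with {{user}}."
    PARAMETER stop "<|im_start|>"
    PARAMETER stop "<|im_end|>"

Building and running it locally would then be "ollama create <model-name> -f Modelfile" followed by "ollama run <model-name>".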

Readme

DARK PLANET REBEL FURY / I-MATRIX / 25B (4X8B) / I-QUANT

A more recent model from DavidAU’s “Dark Planet” line, and one the creator themselves favored among their similar MoE (mixture-of-experts) models. This model was uploaded for the speed MoEs provide relative to their size, and because I have had good experience with the Dark Planet series. If a model with more active parameters is preferred, a Mixtral model with 13 billion active parameters is also available. To pack as many parameters into as little VRAM as possible, weighted K-quants and I-quants will be listed.

Note that I-quants give up some token-generation speed relative to K-quants in exchange for storage efficiency. Either of the 4-bit quantizations is recommended for 16GB GPUs. These models were taken from GGUF files hosted on Hugging Face.
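
As a rough check on the 13GB figure, and on why it targets 16GB cards, assuming IQ4_XS averages about 4.25 bits per weight (its nominal rate in llama.cpp):

    24.9 \times 10^{9} \text{ weights} \times 4.25 \text{ bits/weight} \div 8 \text{ bits/byte} \approx 13.2 \text{ GB}

The KV cache and context buffers then sit on top of the weights, which accounts for most of the remaining headroom on a 16GB card.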

Original model (DavidAU):

GGUF weighted quantizations (mradermacher):

[No obligatory model picture. Ollama did not like it.]