9 Downloads Updated 2 days ago
7 models:
DARK_PLANET_REBEL_FURY-Llama3-25b:Q3_K_S
11GB · 8K context window · Text · 2 days ago
DARK_PLANET_REBEL_FURY-Llama3-25b:Q4_K_S
14GB · 8K context window · Text · 2 days ago
DARK_PLANET_REBEL_FURY-Llama3-25b:Q5_K_M
18GB · 8K context window · Text · 2 days ago
DARK_PLANET_REBEL_FURY-Llama3-25b:Q6_K
20GB · 8K context window · Text · 2 days ago
DARK_PLANET_REBEL_FURY-Llama3-25b:IQ2_XXS
6.8GB · 8K context window · Text · 2 days ago
DARK_PLANET_REBEL_FURY-Llama3-25b:IQ3_S
11GB · 8K context window · Text · 2 days ago
DARK_PLANET_REBEL_FURY-Llama3-25b:IQ4_XS
13GB · 8K context window · Text · 2 days ago
DARK PLANET REBEL FURY / I-MATRIX / 25B (4X8B) / I-QUANT
A more recent model from DavidAU’s “Dark Planet” line, and one the creator themselves favored among their similar MoE (mixture-of-experts) models. This model was uploaded for the speed MoEs provide relative to their size, and because I have had good experiences with the Dark Planet series. If a model with a higher active parameter count is preferred, a Mixtral model with 13 billion active parameters is available. To fit as many parameters into as little VRAM as possible, weighted K- and I-quants are listed.
Note that I-quants give up some token-generation speed relative to K-quants in exchange for storage efficiency. Either of the 4-bit quantizations (Q4_K_S or IQ4_XS) is recommended for 16GB GPUs. These quantizations were taken from GGUF files on Hugging Face.
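To see how the quantization level maps to VRAM, the listed file sizes can be roughly reproduced from the parameter count and an approximate bits-per-weight figure. The sketch below uses ballpark bits-per-weight values for a few llama.cpp quant types (real GGUF quants mix bit widths across tensors, and KV cache and runtime overhead are ignored, so treat these as estimates only):

```python
# Rough weight-storage estimate for a 25B-parameter model at a given
# quantization. Bits-per-weight values are approximate llama.cpp figures,
# not exact, and KV cache / runtime overhead are not included.
def approx_size_gb(n_params: float, bits_per_weight: float) -> float:
    return n_params * bits_per_weight / 8 / 1e9

for name, bpw in [("IQ2_XXS", 2.06), ("IQ4_XS", 4.25), ("Q6_K", 6.56)]:
    print(f"{name}: ~{approx_size_gb(25e9, bpw):.1f} GB")
```

For 25 billion parameters this lands near the listed sizes (roughly 6–7GB at 2-bit, 13GB at 4-bit, 20GB at 6-bit), which is why the 4-bit quants are the ones that leave headroom on a 16GB GPU.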
GGUF weighted quantizations (mradermacher):
[No obligatory model picture. Ollama did not like it.]