From https://huggingface.co/sometimesanotion/Lamarck-14B-v0.7
> [!TIP]
> With no benchmark regressions and mostly gains over the previous release, this version of Lamarck has broken the 41.0 average maximum for 14B parameter models. Thank you to everyone who provided feedback!
Lamarck 14B v0.7: a generalist merge with emphasis on multi-step reasoning, prose, and multilingual ability. The 14B parameter model class has many strong performers, and Lamarck strives to be well-rounded and solid among them.
Lamarck is produced by a custom toolchain that automates a complex sequence of LoRAs and various layer-targeting merges.
Lamarck’s performance comes from an ancestry of careful merges over selected finetuning work, upcycled and combined. Through intermediate merges, arcee-ai/Virtuoso-Small, sthenno-com/miscii-14b-1225, and VAGOsolutions/SauerkrautLM-v2-14b-DPO are emphasized in early layers for extra BBH performance; later layers add synergistic influence from deepseek-ai/DeepSeek-R1-Distill-Qwen-14B, Krystalan/DRT-o1-14B, EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2, and CultriX/Qwen2.5-14B-Wernicke.
More subjectively, its prose and translation abilities are boosted by repeated re-emphasis of Krystalan/DRT-o1-14B and underwoods/medius-erebus-magnum-14b. Other models found in sometimesanotion/Qwenvergence-14B-v3-Prose also contribute to prose quality, with surprising synergy for reasoning.
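The layer-targeted emphasis described above is the kind of recipe mergekit expresses with slices and per-layer interpolation weights. The sketch below is a minimal, hypothetical example, not Lamarck's published recipe: the SLERP method, model pairing, layer count, and gradient values are assumptions chosen only to show how early layers can lean toward one parent and later layers toward another.

```python
# Illustrative sketch only: this is NOT Lamarck's actual recipe, which the
# card describes as a longer automated pipeline of LoRAs and merges.
# Assumes mergekit and PyYAML are installed; the model pairing, layer count,
# and interpolation gradient are hypothetical choices for illustration.
import yaml

config = {
    "merge_method": "slerp",
    "base_model": "arcee-ai/Virtuoso-Small",
    "dtype": "bfloat16",
    "slices": [
        {
            "sources": [
                # Both parents span the full stack (Qwen2.5-14B has 48 layers);
                # the 't' gradient below sets per-layer-range emphasis.
                {"model": "arcee-ai/Virtuoso-Small",
                 "layer_range": [0, 48]},
                {"model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B",
                 "layer_range": [0, 48]},
            ]
        }
    ],
    "parameters": {
        # t = 0 keeps the base model, t = 1 takes the other parent;
        # a rising gradient favors the base model in early layers and the
        # second parent in later layers (hypothetical values).
        "t": [{"value": [0.0, 0.2, 0.4, 0.6, 0.8]}],
    },
}

with open("layer-targeted-merge.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)

# A config like this would then be merged with mergekit's CLI, e.g.:
#   mergekit-yaml layer-targeted-merge.yaml ./merged-model
```

In practice, the custom toolchain chains many such steps, including LoRAs and several layer-targeting merges, across the full set of ancestor models listed above.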
Kudos to @arcee-ai, @deepseek-ai, @Krystalan, @underwoods, @VAGOSolutions, @CultriX, @sthenno-com, and @rombodawg, whose models had the most influence. The Vimarckoso v3 model card documents the extended lineage.