This model was trained 2x faster with Unsloth and Hugging Face's TRL library. It is an experiment in fixing models that exhibit incorrect behaviors.