Developed by: emre
Finetuned from model: unsloth/gemma-3-27b-it-unsloth-bnb-4bit
This gemma3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
English version is given below.
Below are initial results for several models evaluated on the TARA v1 dataset. The results were computed with the specified evaluator model (gemini-2-flash) using the success_rate (%) metric. This table is not an official leaderboard, but it is intended to show the relative performance of the models across different reasoning domains.
Model | Scientific (RAG) (%) | Ethics (%) | Scenario (%) | Creative (%) | Logical (%) | Math (%) | Planning (%) | Python (%) | SQL (%) | Historical (RAG) (%) | Overall Success (%) |
---|---|---|---|---|---|---|---|---|---|---|---|
emre/gemma-3-4b-it-tr-reasoning40k | 73.64 | 62.73 | 60.91 | 48.18 | 60.00 | 38.18 | 51.82 | 35.45 | 41.82 | 75.45 | 54.82 |
unsloth/gemma-3-4b-it | 62.73 | 74.55 | 88.18 | 58.18 | 71.82 | 59.09 | 41.82 | 70.91 | 41.82 | 95.45 | 66.45 |
google/gemma-2-2b-it | 63.64 | 46.36 | 47.27 | 40.00 | 54.55 | 27.27 | 17.27 | 33.64 | 30.00 | 53.64 | 41.36 |
emre/gemma-7b-it-Turkish-Reasoning-FT-smol | 52.73 | 42.73 | 45.45 | 21.82 | 39.09 | 33.64 | 28.18 | 30.00 | 30.00 | 60.91 | 38.45 |
emre/gemma-3-12b-it-tr-reasoning40k | 92.73 | 70.91 | 86.36 | 62.73 | 71.82 | 83.64 | 60.00 | 92.73 | 55.45 | 79.09 | 75.55 |
unsloth/gemma-3-12b-it-tr | 85.45 | 93.64 | 93.64 | 68.18 | 77.27 | 62.73 | 53.64 | 86.36 | 61.82 | 95.45 | 77.82 |
emre/gemma-3-12b-ft-tr-reasoning40k | 86.36 | 68.18 | 77.27 | 54.55 | 47.27 | 50.91 | 43.64 | 59.09 | 23.64 | 85.55 | 59.55 |
emre/gemma-3-27b-it-tr-reasoning40k-4bit | 93.64 | 95.45 | 97.27 | 65.45 | 77.27 | 82.73 | 71.82 | 92.73 | 75.45 | 95.45 | 84.73 |
unsloth/gemma-3-27b-it-unsloth-bnb-4bit | 86.36 | 71.82 | 96.36 | 59.09 | 81.82 | 76.36 | 66.36 | 93.64 | 69.09 | 99.09 | 80.00 |
TURKCELL/Turkcell-LLM-7b-v1 | 50.91 | 49.09 | 31.82 | 12.73 | 43.73 | 14.55 | 15.45 | 20.00 | 0.91 | 75.45 | 31.36 |
google/gemini-1.5-flash | 100.00 | 90.91 | 100.00 | 77.27 | 100.00 | 63.64 | 71.82 | 92.73 | 85.45 | 100.00 | 88.18 |
google/gemini-2.0-flash-lite | 95.45 | 100.00 | 100.00 | 79.09 | 100.00 | 85.45 | 80.91 | 92.73 | 90.91 | 97.27 | 92.18 |
Trendyol/Trendyol-LLM-7B-chat-v4.1.0 | 84.55 | 71.82 | 68.18 | 54.55 | 70.91 | 60.00 | 46.36 | 80.00 | 46.36 | 81.82 | 66.46 |
Openai/gpt-4o-mini-2024-07-18 | 93.64 | 87.27 | 100.00 | 75.45 | 82.73 | 75.45 | 71.82 | 92.73 | 76.36 | 100.00 | 85.55 |
Openai/o3-mini-2025-01-31 | 100.00 | 93.64 | 100.00 | 92.73 | 100.00 | 100.00 | 85.45 | 88.18 | 100.00 | 100.00 | 96.00 |
neuralwork/gemma-2-9b-it-tr | 94.55 | 81.82 | 91.82 | 91.82 | 79.09 | 58.18 | 46.36 | 61.82 | 49.09 | 96.36 | 75.09 |
Openai/gpt-4.1-nano-2025-04-14 | 100.00 | 95.45 | 82.73 | 91.82 | 82.73 | 69.09 | 71.82 | 86.36 | 75.45 | 100.00 | 85.55 |
Openai/gpt-4o-2024-08-06 | 89.09 | 80.91 | 90.91 | 91.82 | 91.82 | 92.73 | 71.82 | 92.73 | 70.00 | 100.00 | 87.18 |
Openai/gpt-4.1-mini-2025-04-14 | 100.00 | 100.00 | 100.00 | 92.73 | 91.82 | 100.00 | 84.55 | 100.00 | 100.00 | 100.00 | 96.91 |
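
The last column of the table appears to be the unweighted mean of the ten per-category success rates. The source does not state this explicitly, so treat it as an assumption; it does reproduce the reported value for the emre/gemma-3-4b-it-tr-reasoning40k row. A minimal Python sketch of that check follows, where the helper name `overall_success` is hypothetical and the category names are given in English:

```python
# Per-category success rates (%) copied from the
# emre/gemma-3-4b-it-tr-reasoning40k row of the table above.
category_scores = {
    "Scientific (RAG)": 73.64,
    "Ethics": 62.73,
    "Scenario": 60.91,
    "Creative": 48.18,
    "Logical": 60.00,
    "Math": 38.18,
    "Planning": 51.82,
    "Python": 35.45,
    "SQL": 41.82,
    "Historical (RAG)": 75.45,
}


def overall_success(scores: dict[str, float]) -> float:
    """Unweighted mean of the category scores, rounded to two decimals.

    Assumption: every category carries equal weight. The source table
    does not document how the overall column was computed.
    """
    return round(sum(scores.values()) / len(scores), 2)


print(overall_success(category_scores))  # 54.82, matching the table's last column
```

Running the same calculation on other rows is a quick way to spot transcription slips in individual cells.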