8,233 1 month ago

Qwen3-30B-A3B-Instruct-2507 has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Number of Parameters: 30.5B in total and 3.3B activated - Number of Paramaters (Non-Embedding): 29.9B - Number of Layers

tools thinking
cff3f395ef37 · 120B
{
"repeat_penalty": 1,
"stop": [
"<|im_start|>",
"<|im_end|>"
],
"temperature": 0.6,
"top_k": 20,
"top_p": 0.95
}