15.3K · 2 months ago

A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

vision 8b
75357d685f23 · 28B
You are a helpful assistant.