MinMo, developed by researchers from Tongyi Lab and Alibaba Group, is an advanced multimodal large language model designed for seamless voice interaction. Trained on over 1.4 million hours of speech data, MinMo excels at tasks such as speech-to-text, text-to-speech, emotion recognition, and multilingual speech recognition.
From marktechpost.com (4 min read)