Featured Publications

arXiv January 16th, 2023
Msanii: High Fidelity Music Synthesis on a Shoestring Budget

Kinyugo Maina - Independent Researcher

kinyugomaina@gmail.com

In this paper, we present Msanii, a novel diffusion-based model for synthesizing long-context, high-fidelity music efficiently. Our model combines the expressiveness of mel spectrograms, the generative capabilities of diffusion models, and the vocoding capabilities of neural vocoders. We demonstrate the effectiveness of Msanii by synthesizing tens of seconds (190 seconds)...

Sound, Machine Learning, Audio and Speech Processing
