
/mulab-mir/ The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation



We introduce the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality audio-caption pairs, designed for the evaluation of music-and-language models. The dataset consists of 1.1k human-written natural language descriptions of 706 music recordings, all publicly accessible and released under Creative Commons licenses. To showcase the use of our dataset, we benchmark popular models on three key music-and-language tasks (music captioning, text-to-music generation and music-language retrieval). Our experiments highlight the importance of cross-dataset evaluation and offer insights into how researchers can use SDD to gain a broader understanding of model performance.








via Papers with Code: Trending (unofficial) https://ift.tt/2Wlxory

November 19, 2023 at 11:35AM
