Leveraging the LLM-enhanced MetaScore dataset, our proposed MetaScore Transformer (MST) model generates symbolic music using natural language prompts with difficulty, genre, instrument and composer controls. The symbolic music outputs allow the user to further edit and complete the composition.
Dataset Distribution
We collect 963K songs paired with musical scores and metadata from the MuseScore forum.
MetaScore-Raw (963K): The raw MuseScore files and metadata scraped from the MuseScore forum, along with the corresponding MusicXML files for future research.
MetaScore-Genre (181K): A subset of MetaScore-Raw containing files with user-annotated genres. We also discard songs by any composer with fewer than 100 compositions in MetaScore-Raw. We provide LLM-generated captions based on information extracted from the metadata in MetaScore-Genre.
MetaScore-Plus (963K): MetaScore-Raw with missing genre tags filled in by the trained genre tagger. We also provide LLM-generated captions based on information extracted from the metadata in MetaScore-Plus.
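The MetaScore-Genre filtering described above can be sketched roughly as follows. This is only an illustration, not the project's actual pipeline: the record layout and the field names `composer` and `genres` are assumptions, and `build_genre_subset` is a hypothetical helper.

```python
from collections import Counter


def build_genre_subset(records, min_compositions=100):
    """Keep records that have user-annotated genres and whose composer
    has at least `min_compositions` entries in the full raw collection.

    `records` is assumed to be a list of dicts with hypothetical keys
    "composer" (str) and "genres" (list of str).
    """
    # Count compositions per composer over the whole raw set,
    # not just over the genre-annotated records.
    counts = Counter(r["composer"] for r in records if r.get("composer"))
    return [
        r
        for r in records
        if r.get("genres") and counts[r.get("composer")] >= min_compositions
    ]
```

Note that the composer count is taken over the raw collection, which matches the description of the threshold being applied to MetaScore-Raw rather than to the genre-annotated subset.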
Due to copyright concerns, we will publicly release only the music scores and metadata from MetaScore-Plus that are in the public domain (228K) or licensed under a Creative Commons license (46K). The rest of the dataset will be provided upon request for research purposes.
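As a rough sketch of how the release split might be applied to local metadata: the `license` field and the label strings checked below are assumptions for illustration only and may not match the labels used in the actual metadata. `split_by_release` is a hypothetical helper.

```python
def split_by_release(records):
    """Partition records into (publicly_releasable, on_request).

    Assumes each record is a dict with a hypothetical "license" string,
    where public-domain items are labeled "public domain" and Creative
    Commons items have a label starting with "cc" (e.g. "cc-by").
    """
    public, on_request = [], []
    for r in records:
        label = (r.get("license") or "").strip().lower()
        if label == "public domain" or label.startswith("cc"):
            public.append(r)
        else:
            # Unknown or restrictive licenses fall back to on-request access.
            on_request.append(r)
    return public, on_request
```

Records with a missing or unrecognized license are routed to the on-request group, which is the conservative choice given the copyright concerns mentioned above.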