2-dimensional matrix, where one dimension represents time and the other represents pitch
- MIDI-like (Music Transformer)
- including note-on, note-off, velocity, pitch, etc.
- REMI (Pop Music Transformer)
- improves on the MIDI-like representation by using duration, bar, chord, and tempo tokens
- Structured MIDI
- uses time-shift tokens instead of note-on/off or duration tokens
- MuMIDI (PopMAG), CP (Compound Word Transformer), OctupleMIDI (MusicBERT), Multitrack Music Transformer
- compress the attributes of a note, including pitch, duration, and velocity, into one symbol
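To make the contrast between a piano roll and an event-based encoding concrete, the minimal sketch below encodes the same three notes both ways. The note tuple layout, the 16-step grid, and the token spellings (`NOTE_ON_60`, `TIME_SHIFT_4`, and so on) are illustrative assumptions, not the exact vocabulary of any model listed above.

```python
import numpy as np

# Hypothetical toy notes: (onset_step, duration_steps, pitch, velocity)
NOTES = [(0, 4, 60, 80), (4, 4, 64, 80), (8, 8, 67, 100)]

def to_piano_roll(notes, n_steps=16, n_pitches=128):
    """Return a (time, pitch) boolean matrix; True where a pitch is sounding."""
    roll = np.zeros((n_steps, n_pitches), dtype=bool)
    for onset, dur, pitch, _vel in notes:
        roll[onset:onset + dur, pitch] = True
    return roll

def to_midi_like(notes):
    """Return a flat list of velocity / note-on / note-off / time-shift tokens."""
    events = []
    for onset, dur, pitch, vel in notes:
        # order=1 for note-on, order=0 for note-off, so that at the same
        # time step a note-off is emitted before the next note-on
        events.append((onset, 1, pitch, [f"VELOCITY_{vel}", f"NOTE_ON_{pitch}"]))
        events.append((onset + dur, 0, pitch, [f"NOTE_OFF_{pitch}"]))
    events.sort(key=lambda e: e[:3])

    tokens, now = [], 0
    for time, _order, _pitch, toks in events:
        if time > now:
            tokens.append(f"TIME_SHIFT_{time - now}")
            now = time
        tokens.extend(toks)
    return tokens

roll = to_piano_roll(NOTES)
tokens = to_midi_like(NOTES)
print(roll.shape)
print(tokens)
```

A compound encoding in the style of the last bullet would instead emit one symbol per note that bundles pitch, duration, and velocity, which shortens the sequence at the cost of a larger or factorized vocabulary.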
- smaller values are better
- e.g., Music Transformer
- smaller values are better
- e.g., Museformer
- the ratio of empty bars
- values closer to the original data are better
- e.g., MuseGAN
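As a rough illustration, the empty-bar ratio can be computed from a single-track boolean piano roll as follows. The function name, the `(time, pitch)` layout, and the 16-steps-per-bar grid are assumptions of this sketch, not MuseGAN's actual code.

```python
import numpy as np

def empty_bar_ratio(roll, steps_per_bar):
    """Fraction of bars with no active note.

    roll: (n_steps, n_pitches) boolean piano roll of one track.
    Trailing steps that do not fill a whole bar are dropped.
    """
    n_bars = roll.shape[0] // steps_per_bar
    bars = roll[:n_bars * steps_per_bar].reshape(n_bars, steps_per_bar, -1)
    is_empty = ~bars.any(axis=(1, 2))
    return float(is_empty.mean())

# Toy example: 4 bars of 16 steps, notes only in bars 0 and 2
roll = np.zeros((64, 128), dtype=bool)
roll[0:4, 60] = True
roll[40:44, 62] = True
eb = empty_bar_ratio(roll, steps_per_bar=16)
print(eb)
```

The value is then compared with the same statistic computed on the training data rather than judged on its own.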
- the number of used pitch classes per bar
- values closer to the original data are better
- e.g., MuseGAN
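A minimal sketch of counting used pitch classes per bar, again on a boolean `(time, pitch)` piano roll. The function name and grid size are hypothetical, and this version averages over all bars including empty ones, which is a design choice rather than something the source specifies.

```python
import numpy as np

def used_pitch_classes_per_bar(roll, steps_per_bar):
    """Average number of distinct pitch classes (0 to 12) sounding in each bar."""
    n_bars = roll.shape[0] // steps_per_bar
    bars = roll[:n_bars * steps_per_bar].reshape(n_bars, steps_per_bar, -1)
    counts = []
    for bar in bars:
        pitches = np.nonzero(bar.any(axis=0))[0]          # pitches used in this bar
        counts.append(len({int(p) % 12 for p in pitches}))  # fold octaves together
    return float(np.mean(counts))

# Toy example: bar 0 uses C4, C5, E4 (2 classes); bar 1 uses D4, F4, A4 (3 classes)
roll = np.zeros((32, 128), dtype=bool)
roll[0:4, 60] = True
roll[4:8, 72] = True
roll[8:12, 64] = True
roll[16:20, 62] = True
roll[20:24, 65] = True
roll[24:28, 69] = True
upc = used_pitch_classes_per_bar(roll, steps_per_bar=16)
print(upc)
```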
- the ratio of qualified notes, i.e., notes no shorter than three time steps (a 32nd note)
- values closer to the original data are better
- e.g., MuseGAN
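The qualified-note ratio can be sketched by treating every run of consecutive active steps at one pitch as a note and checking its length. The threshold `min_steps=3` is an assumption that corresponds to a 32nd note only on a grid of 96 steps per 4/4 bar; the function name is hypothetical. Note that a boolean piano roll merges immediately repeated notes of the same pitch into one run.

```python
import numpy as np

def qualified_note_ratio(roll, min_steps=3):
    """Share of notes whose length is at least min_steps time steps."""
    # Pad one silent step at both ends so every run has a start and an end
    padded = np.pad(roll.astype(np.int8), ((1, 1), (0, 0)))
    diff = np.diff(padded, axis=0)
    lengths = []
    for pitch in range(roll.shape[1]):
        starts = np.nonzero(diff[:, pitch] == 1)[0]
        ends = np.nonzero(diff[:, pitch] == -1)[0]
        lengths.extend((ends - starts).tolist())
    if not lengths:
        return 0.0
    return sum(length >= min_steps for length in lengths) / len(lengths)

# Toy example: note lengths 4, 1, 3, 2 -> two of four notes qualify
roll = np.zeros((16, 128), dtype=bool)
roll[0:4, 60] = True
roll[5:6, 62] = True
roll[8:11, 64] = True
roll[12:14, 65] = True
qn = qualified_note_ratio(roll)
print(qn)
```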
- the ratio of notes in 8- or 16-beat patterns
- values closer to the original data are better
- e.g., MuseGAN
- the harmonicity between a pair of tracks
- smaller values are better
- e.g., MuseGAN
- convert the generated symbolic music to audio and evaluate it with a beat-tracking model
- values closer to the original data are better
- e.g., Pop Music Transformer
- a generated note counts as a match only if its onset, measure, pitch, and track exactly match those of a note in the original
- higher values are better
- e.g., MT3, Composer's Assistant
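One common way to turn this exact-match rule into a single score is a note-level F1; the sketch below assumes that is the intended metric. The function name and the `(onset, measure, pitch, track)` tuple layout are hypothetical, and the toy notes are made up.

```python
from collections import Counter

def note_f1(reference, generated):
    """F1 over notes; a match must agree on onset, measure, pitch, and track."""
    ref, gen = Counter(reference), Counter(generated)
    matched = sum((ref & gen).values())  # multiset intersection
    if matched == 0:
        return 0.0
    precision = matched / sum(gen.values())
    recall = matched / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# (onset, measure, pitch, track)
reference = [(0, 0, 60, 0), (4, 0, 64, 0), (0, 1, 67, 0), (0, 1, 36, 1)]
generated = [(0, 0, 60, 0), (4, 0, 65, 0), (0, 1, 67, 0)]
f1 = note_f1(reference, generated)
print(round(f1, 4))
```

Using `Counter` rather than `set` keeps duplicated notes from being matched more often than they occur.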
- Drums are ignored
- lower values are better
- e.g., Composer's Assistant, XLNet
- higher values are better
- choose the correct answer from four choices by calculating the average probability of generating each candidate's events
- e.g., Jazz Transformer, XLNet Piano Infilling
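The four-choice procedure can be sketched as follows, assuming the model's per-event probabilities for each candidate are already available. All names and numbers here are made up for illustration; how the probabilities are obtained depends on the model.

```python
import numpy as np

def pick_choice(event_probs_per_choice):
    """Return the index of the candidate with the highest mean event probability.

    event_probs_per_choice: one sequence per candidate, holding the model's
    probability for each event of that candidate.
    """
    scores = [float(np.mean(p)) for p in event_probs_per_choice]
    return int(np.argmax(scores)), scores

def accuracy(questions, answers):
    """Fraction of questions where the picked candidate is the correct one."""
    hits = sum(pick_choice(q)[0] == a for q, a in zip(questions, answers))
    return hits / len(questions)

# Two hypothetical questions with four candidates each
questions = [
    [[0.2, 0.1], [0.6, 0.5, 0.7], [0.3, 0.3], [0.1, 0.4]],
    [[0.5, 0.5], [0.2, 0.2], [0.4, 0.9], [0.1, 0.1]],
]
answers = [1, 0]
acc = accuracy(questions, answers)
print(acc)
```

Averaging per event, rather than multiplying probabilities, keeps candidates of different lengths comparable.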
- number of wins
- e.g., Music Transformer
- harmonious/rhythmic/musically structured/coherent/overall rating
- e.g., MuseGAN
- musicality/short-term structure/long-term structure/overall/overall score/preference
- e.g., Museformer
- structure/richness/pleasure/overall
- e.g., MELON
- coherence/richness/arrangement/overall
- e.g., MMT
- average rank, p-value
- e.g., Composer's Assistant
- distinguish pro and non-pro, p-value
- e.g., Pop Music Transformer
- the error between the similarity distribution of original data and generated music
- smaller values are better
- e.g., Museformer
- notes per second
- e.g., MMT
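Notes per second is simple to compute once onsets and durations are known. The sketch below assumes notes given in beats and a single constant tempo; both the function name and the input layout are hypothetical.

```python
def notes_per_second(notes, tempo_bpm):
    """Note density of a piece.

    notes: list of (onset_beat, duration_beats); assumes one constant tempo.
    """
    if not notes:
        return 0.0
    end_beat = max(onset + dur for onset, dur in notes)
    seconds = end_beat * 60.0 / tempo_bpm
    return len(notes) / seconds

# Toy example: eight quarter notes at 120 BPM last 4 seconds
notes = [(float(i), 1.0) for i in range(8)]
density = notes_per_second(notes, tempo_bpm=120)
print(density)
```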
- to analyze the repetition
- e.g., Museformer
- e.g., MMT