Patchwork

One of the modules I’m likely to be mucking in on next year is the brand new Auditory Computing. It’s not at all clear what I’ll be doing on it, and I haven’t even seen the syllabus yet, but one of the potential tasks might be the setting of coursework. In idle and probably misdirected preparation for that I’ve been having a bit of a play with generating music using deep learning. People have tended to use LSTMs for this, but I thought it would be fun to try Andrej Karpathy’s neat little implementation of the notorious GPT. Training data is from the Classical Music MIDI dataset.
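
For the curious, the training side is barely more than gluing minGPT’s own pieces together. The sketch below is roughly the shape of it rather than exactly what I ran: the TokenDataset class is a stand-in for my preprocessing (it assumes the music has already been flattened into a list of integer token ids), and the hyperparameters are placeholders rather than tuned values.

import torch
from torch.utils.data import Dataset

from mingpt.model import GPT
from mingpt.trainer import Trainer


class TokenDataset(Dataset):
    """Serve (input, target) windows of length block_size from one long token stream."""

    def __init__(self, tokens, vocab_size, block_size=128):
        self.data = torch.tensor(tokens, dtype=torch.long)
        self.vocab_size = vocab_size
        self.block_size = block_size

    def __len__(self):
        return len(self.data) - self.block_size

    def __getitem__(self, i):
        chunk = self.data[i : i + self.block_size + 1]
        return chunk[:-1], chunk[1:]    # predict each token from its predecessors


def train(tokens, vocab_size, model_type='gpt-nano'):
    train_dataset = TokenDataset(tokens, vocab_size)

    # model: one of minGPT's preset sizes ('gpt-nano', 'gpt-micro', 'gpt-mini', ...)
    model_config = GPT.get_default_config()
    model_config.model_type = model_type
    model_config.vocab_size = train_dataset.vocab_size
    model_config.block_size = train_dataset.block_size
    model = GPT(model_config)

    # trainer: illustrative settings, not the ones I actually used
    train_config = Trainer.get_default_config()
    train_config.learning_rate = 5e-4
    train_config.max_iters = 20000
    train_config.batch_size = 32
    trainer = Trainer(train_config, model, train_dataset)
    trainer.run()
    return model

Sampling afterwards is just a matter of handing a short seed sequence to model.generate and decoding the resulting tokens back into notes and durations.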

The results aren’t exactly going to be headlining the Last Night of the Proms, but some are quite cute, I think:

GPT-nano, trained on Mozart, Bach & Haydn
GPT-micro, trained on the whole dataset
GPT-micro, trained on the whole dataset

You can definitely hear the model regurgitating memorised fragments. But don’t we all do that?

The task is treated as a language modelling one, with a vocabulary of chords and durations. To somewhat reduce the vocab size and increase the representational overlap, I’ve pruned chords to no more than 3 notes at a time. A snippet of code for this is below — not because I expect anyone to read or reuse it; really it’s just here to test out the syntax-colouring WordPress plugin I’ve just installed.

# NB: assumed imports. `MU` here is music21 (hence the MU.chord.Chord check below),
# and `local_rng` is taken to be a NumPy random generator for the 'random' drop mode.
import music21 as MU
import numpy as np

local_rng = np.random.default_rng()

def simplify ( s, limit, mode='low', rng=local_rng ):
    """
    Drop notes from big chords so they have no more than `limit` notes.
    NB: operates in place.
    
    Drop strategies are pretty dumb. We always keep the highest and lowest notes
    (crudely assumed to be melody and bass respectively). Notes are dropped from
    the remainder according to one of three strategies:
    
        'low': notes are dropped from low to high (the default)
        'high': notes are dropped from high to low
        'random': notes are dropped randomly
    
    Latter could actually increase vocab by mapping the same input chord
    to several outputs. Modes can be abbreviated to initial letters.
    """
    if limit < 2: limit = 2
    
    # Choose which of the middle notes to drop, keyed on the first letter of `mode`.
    # Each lambda returns the subset of `d` (assumed ordered low to high) to drop,
    # given a drop count `c`.
    drop_func = {
                    'r' : lambda d, c: rng.choice(d, c, replace=False),   # random: pick c notes to drop
                    'h' : lambda d, c: d[(len(d)-c):]                     # high: drop the top c notes
                }.get(mode.lower()[0],
                      lambda d, c: d[:(c-len(d))])                        # low (default): drop the bottom c notes
    
    for element in s.flat:
        if isinstance(element, MU.chord.Chord):
            if len(element) > limit:
                drop_count = len(element) - limit
                # only the middle notes are candidates; the first and last are always kept
                drops = [ nn.pitch.nameWithOctave for nn in element ][1:-1]
                
                if len(drops) > drop_count:
                    drops = drop_func(drops, drop_count)
                
                for note in drops:
                    element.remove(note)
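
For completeness, here’s roughly how that slots into the preprocessing: parse a file with music21, prune the chords, then read off one token per note, chord or rest. This is a sketch rather than my actual pipeline, and the token format (pitch names joined with dots, duration after a slash) is purely illustrative:

def tokenise(path, limit=3):
    score = MU.converter.parse(path)
    simplify(score, limit)                 # prune big chords in place

    tokens = []
    for element in score.flat.notesAndRests:
        dur = str(element.duration.quarterLength)
        if isinstance(element, MU.chord.Chord):
            pitches = '.'.join(p.nameWithOctave for p in element.pitches)
        elif isinstance(element, MU.note.Note):
            pitches = element.pitch.nameWithOctave
        else:                              # a rest
            pitches = 'REST'
        tokens.append(pitches + '/' + dur)
    return tokens

The resulting strings then get mapped to integer ids, which is where the vocabulary of chords and durations mentioned above comes from.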

Perhaps that will get more use in future, if all this coheres and I start working more of this out in public. Perhaps not.
