- .mp3, analysing soundwaves
- .mid, analysing scores
- Processing: Mixture Density Network (MDN), Density function approximation
- Output: autoregressive music predictor
To process the sound waves in a standard MP3 file you have to deal with roughly 44,100 samples per second. Even at reduced quality, that is far too much data for a reasonable learning time.
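To put a number on that, here is a rough back-of-the-envelope sketch. The 44,100 Hz rate is the common CD-quality sample rate. The three-minute track length and the 8,000 Hz "reduced quality" rate are assumptions chosen only for illustration.

```python
# Rough sample-count arithmetic for raw audio (illustrative assumptions only).
SAMPLE_RATE_HZ = 44_100   # common CD-quality sample rate
REDUCED_RATE_HZ = 8_000   # hypothetical "reduced quality" rate
TRACK_SECONDS = 180       # hypothetical three-minute track


def samples_per_channel(duration_s, sample_rate_hz):
    """Number of amplitude values in one channel of decoded audio."""
    return int(duration_s * sample_rate_hz)


full = samples_per_channel(TRACK_SECONDS, SAMPLE_RATE_HZ)
reduced = samples_per_channel(TRACK_SECONDS, REDUCED_RATE_HZ)
print(full)     # 7938000
print(reduced)  # 1440000
```

Even the downsampled track is still more than a million steps long, which is what makes a sample-by-sample predictor so expensive to train.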
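For that I used a neural network with a Bayesian interpretation of the output units: the outputs are read as the parameters of a probability distribution rather than as a single predicted value.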
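As a minimal sketch of what such an output layer can look like, the snippet below follows the standard textbook formulation of a mixture density network with one-dimensional Gaussian components. The function names, the output layout (logits, then means, then log standard deviations) and the choice of Gaussians are assumptions for illustration, not necessarily the architecture used here.

```python
import math


def mdn_params(raw_outputs, n_components):
    """Turn raw network outputs into mixture weights, means and std devs.

    Assumed layout: [logits..., means..., log_sigmas...], each of length k.
    """
    k = n_components
    logits = raw_outputs[:k]
    means = list(raw_outputs[k:2 * k])
    log_sigmas = raw_outputs[2 * k:3 * k]
    # Softmax keeps the weights positive and summing to one.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Exponential keeps the standard deviations positive.
    sigmas = [math.exp(s) for s in log_sigmas]
    return weights, means, sigmas


def mixture_density(y, weights, means, sigmas):
    """Density of y under the Gaussian mixture."""
    return sum(
        w * math.exp(-0.5 * ((y - mu) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, mu, s in zip(weights, means, sigmas)
    )


def negative_log_likelihood(y, weights, means, sigmas):
    """Training loss for one target value y."""
    return -math.log(mixture_density(y, weights, means, sigmas))


weights, means, sigmas = mdn_params([0.0, 0.0, -1.0, 1.0, 0.0, 0.0], 2)
print(weights, means, sigmas)
```

Minimising the negative log-likelihood over the training data is what pushes the predicted mixture toward the distribution of the next value.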
The model can solve toy problems, like generating a sine wave.
The model can solve toy problems, like imitating a drum.
The model can solve toy problems, like extracting the bass line of a piece of music.
Musical Instrument Digital Interface (MIDI)
Patterns are always hierarchical. Thinking about the environment, I don't want to predict at the complexity of a sound wave. Let's go MIDI.
By reducing the dimensions of the environment, the model becomes capable of predicting a MIDI file perfectly. The Mixture Density Network learns a probability distribution over the piece of music. What you hear in the following are random samples drawn from the probability distribution of the given song. The simplest example is a coin flip: given a long enough sequence of coin flips, my mixture density network would approximate the probability distribution, and you would have a random generator for coin flips. Exactly the same happens for the pieces of music, just in higher dimensions.
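The coin-flip analogy can be sketched in a few lines. Note that this is not a mixture density network: the distribution is estimated by simple counting, as a stand-in for what the network learns, and the function names are made up for this example.

```python
import random
from collections import Counter


def estimate_distribution(observations):
    """Approximate the outcome probabilities by relative frequency."""
    counts = Counter(observations)
    total = len(observations)
    return {outcome: n / total for outcome, n in counts.items()}


def sample(distribution, rng):
    """Draw one outcome from the estimated distribution."""
    outcomes = list(distribution)
    probabilities = [distribution[o] for o in outcomes]
    return rng.choices(outcomes, weights=probabilities, k=1)[0]


rng = random.Random(0)                             # fixed seed for repeatability
flips = [rng.choice("HT") for _ in range(10_000)]  # the "training data"
dist = estimate_distribution(flips)                # close to 0.5 / 0.5
generated = [sample(dist, rng) for _ in range(10)]
print(dist)
print("".join(generated))
```

For music the idea is the same, except that each draw is conditioned on the notes that came before and the outcome space is far larger than heads or tails.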
My favorite (from 1.5 min onward). So different from the original.
Scale A major
Simple music basics for the agent to learn, with an interesting result. Maybe one has to give the agent music patterns rather than whole music pieces…
Very exciting: it is archived as the first working decision process.