THE UNIVERSITY OF NEW SOUTH WALES
SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE AND ENGINEERING
Artificial Evolution for Sound Synthesis

Jonathon Crane (2172214)
Bachelor of Engineering (Computer Engineering)
July 5, 1996
Supervisor: Andrew Taylor
Assessor: Tim Lambert
Abstract

A technique of Artificial Evolution is applied to Sound Synthesis algorithms with the goal of producing an effective, intuitive way of searching the space of possible sounds. Interactive Evolutionary Algorithms (IEA's) are optimisation techniques inspired by the process of biological evolution. In this study, an IEA is applied to a Frequency Modulation (FM) synthesis algorithm. This produces a tool which allows users to explore, sculpt and evolve sounds without any knowledge or understanding of the underlying algorithm. To determine the effectiveness of the system, seven different users compared three different exploration techniques: manually adjusting parameters of the algorithm; randomly adjusting parameters; and using the IEA to adjust the parameters. It was discovered that the random method performed best, but did not offer the degree of control required by users. The manual method provided this fine-grained control, but was slow and difficult to use - it required the users to understand the underlying FM algorithm. In contrast, the IEA method provided all the advantages of the random method, combined with the control of the manual method. It enabled users to rapidly locate good sounds and then refine them as necessary, even with no knowledge or understanding of the FM synthesis algorithm.
Contents

1 Introduction
  1.1 Motivation
  1.2 Goals
  1.3 Objectives
  1.4 Outline

2 Previous Work
  2.1 Introduction
  2.2 Sound Synthesis
    2.2.1 Describing Sound
    2.2.2 Synthesis Algorithms
    2.2.3 Additive Synthesis
    2.2.4 Subtractive Synthesis
    2.2.5 Frequency Modulation
    2.2.6 Sampling
    2.2.7 Other Methods
  2.3 Evolutionary Algorithms (EA's)
    2.3.1 Genetic Algorithms (GA's)
    2.3.2 Evolution Strategies (ES's)
    2.3.3 Evolutionary Programming (EP)
  2.4 EA's for Sound Synthesis
  2.5 Interactive Evolutionary Algorithms (IEA)
    2.5.1 Dawkins' Biomorphs
    2.5.2 Oppenheimer's Artificial Menagerie
    2.5.3 Smith's Bugs
    2.5.4 Sims' Artificial Evolution
    2.5.5 Moore's GAMusic 1.0
    2.5.6 van Goch's P-Farm
  2.6 Summary

3 System Design
  3.1 Introduction
  3.2 System Outline
  3.3 Considerations
    3.3.1 Sound Synthesis
    3.3.2 Evolutionary Algorithm
    3.3.3 User Interface
  3.4 The Design Decision
  3.5 Evaluation

4 System Development
  4.1 Introduction
  4.2 EvoS 2
    4.2.1 Aim
    4.2.2 Implementation
    4.2.3 Results
    4.2.4 Conclusion
  4.3 EvoS 3
    4.3.1 Aim
    4.3.2 Implementation
    4.3.3 Results
    4.3.4 Conclusion
  4.4 EvoS 4
    4.4.1 Aim
    4.4.2 Implementation
    4.4.3 Results
    4.4.4 Conclusion
  4.5 EvoS 5
    4.5.1 Aim
    4.5.2 Implementation
    4.5.3 Results
    4.5.4 Conclusion
  4.6 EvoS 6
    4.6.1 Aim
    4.6.2 Implementation
    4.6.3 Results
    4.6.4 Conclusion
  4.7 EvoS 7
    4.7.1 Aim
    4.7.2 Implementation
    4.7.3 Results
    4.7.4 User evaluation
    4.7.5 Conclusion
5 Results
  5.1 Aim
  5.2 Implementation
    5.2.1 Manual Tool
    5.2.2 Random Tool
    5.2.3 Evolution Tool
  5.3 Method
  5.4 Results
    5.4.1 Raw Data
    5.4.2 Analysis of Data
    5.4.3 An Alternative Analysis
    5.4.4 User Opinions
  5.5 Conclusions
6 Conclusion
  6.1 Conclusion
  6.2 Other lessons learnt
  6.3 Further Work
  6.4 A final note
A Evolving Samples
  A.1 Introduction
    A.1.1 Implementation
    A.1.2 Results
    A.1.3 Conclusion
List of Figures

1.1 This Moog synthesizer illustrates the complexity of synthesis algorithms. Each knob controls a separate parameter that can be adjusted by the user to affect the resulting sound - taken from [Mus98].
2.1 Patch required for additive synthesis - taken from [Wil88].
2.2 A basic subtractive synthesis patch - taken from [Wil88].
2.3 The most basic setup required for FM synthesis - taken from [DJ85].
2.4 A more complicated FM patch - taken from [DJ85].
2.5 An illustration of 5 point crossover - taken from [Bäc95].
2.6 Evolution of "biomorphs" with Dawkins' system - taken from [Daw86].
2.7 Example of an evolved plant form - taken from [Ope88].
2.8 Biomorphs evolved with Smith's system - taken from [Smi91].
2.9 An image generated from an evolved symbolic lisp expression - taken from [Sim91].
2.10 A screen shot of the GAMusic 1.0 user interface.
2.11 A screen shot of P-Farm's user interface.
3.1 The three main components of the proposed system.
4.1 A screen shot of the EvoS 2 user interface.
4.2 A sample evolution run for EvoS 2.
4.3 Evolution of synthesis parameters in EvoS 3.
4.4 Evolution of variances in EvoS 3.
4.5 An envelope (top) controls the amplitude of a waveform (bottom).
4.6 The envelope of figure 4.5 shown with an envelope produced by mutation.
4.7 The envelope of figure 4.5 compared with an envelope produced by randomization.
4.8 EvoS 4 evolution with a variance of 1/50.
4.9 EvoS 4 evolution with a variance of 1/10.
4.10 EvoS 4 evolution with a variance of 1/20.
4.11 EvoS 5 evolution with a variance of 1/50.
4.12 EvoS 5 evolution with a variance of 1/10.
4.13 EvoS 5 evolution with a variance of 1/20.
4.14 EvoS 7 evolution: Modulation indices (I1 and I2).
4.15 EvoS 7 evolution: Frequency ratios (N1 and N2).
4.16 EvoS 7 evolution: Amplitude Envelopes.
4.17 EvoS 7 evolution: Modulation Index Envelopes.
5.1 Screen shot of the manual tool user interface.
5.2 Screen shot of the random tool user interface.
5.3 Screen shot of the evolution tool.
5.4 Sounds Liked plotted against Time for all search tools.
5.5 Sounds Liked vs. Sounds Auditioned for all search tools.
5.6 Sounds Liked plotted against search tool.
A.1 Example of a heterodyne analysis file: the amplitude envelopes.
A.2 Example of a heterodyne analysis file: the frequency envelopes.
A.3 Example of a nicely mutated amplitude envelope.
A.4 Example of a nicely mutated frequency envelope.
A.5 Example of a badly mutated amplitude envelope.
List of Tables

2.1 Papers published by Andrew Horner et al. on the subject of applying GA's to sound synthesis.
4.1 Ranges and variances for the parameters of EvoS 2.
4.2 Variances and ranges for the variances of EvoS 3.
4.3 Effects of different variances in EvoS 4.
4.4 Effects of different variances in EvoS 5.
5.1 Raw statistics collected for the manual tool.
5.2 Raw statistics collected for the random tool.
5.3 Raw statistics collected for the evolution tool.
5.4 Mean statistics for each search tool.
Acknowledgements

I would firstly like to thank Emma Hogan, who supported me throughout this thesis - thank you for all the meals you cooked me. Next, I would like to thank my family: my mother, Colleen, who has supported me financially these last four years; my brother, Andrew, who got me out of some dodgy situations; and my aunt, Nathalie, who also cooked a lot of meals for me. In relation to this project I would like to thank all those users who volunteered to test the system: Emma Hogan (again); Brett Webb; Nick Perkins; Dougal May; Rachael James; Tom Fryer; Chris V.; Nick Mariette; and Brendan Hanna (I'm sorry about the outlier stuff). I would also like to thank Tom and Will Edwards for letting me use their scanner. Academically I would like to thank my supervisor, Andrew Taylor, especially for approving my project which was a bit out of the ordinary. And also my assessor, Tim Lambert, for dealing with my frequent anxiety attacks ;). Similarly I would like to thank those people who answered my emails with queries about this project: Jason Moore; Arno van Goch; and Waleed Kadous. Finally, I would like to thank anyone else that I may have left out - I apologise for this and hereby give you permission to run a guilt trip on me.
In memory of my father (“The journey of a thousand miles starts with the first step.”)
Chapter 1

Introduction

In theory at least, any sound can be synthesized electronically... The synthesizer is a wonderfully versatile musical instrument. It can sound like dozens, hundreds, or even thousands of instruments... Many other sounds are only possible through sound synthesis.
—Delton T. Horn, Music Synthesizers

Indeed, sound synthesis can produce any sound you desire, but the above quote does not mention the difficulty involved in obtaining such sounds. This thesis is concerned with using a process of artificial evolution1 to produce novel and interesting sound spectra. More specifically, an Interactive Evolutionary Algorithm (IEA) is used here to generate parameter sets for sound synthesis algorithms. The results will show that artificial evolution provides an efficient and intuitive way for people to search through the space of possible sounds.
1.1 Motivation

Why should anyone bother with such an exercise? There are a number of reasons. Synthesis algorithms typically involve a large number of parameters (see figure 1.1). This often makes it difficult to obtain a desired sound quickly. Users of the synthesis algorithms either have to understand exactly how each parameter will affect the final sound, or they have to spend a lot of time aimlessly "twiddling knobs"

1 An optimisation procedure modeled on biological evolution.
until a satisfactory sound is obtained. What is needed is an efficient, intuitive and enjoyable way to search through the multi-dimensional space created by the synthesis parameters.
Figure 1.1: This Moog synthesizer illustrates the complexity of synthesis algorithms. Each knob controls a separate parameter that can be adjusted by the user to affect the resulting sound - taken from [Mus98].

Primarily, this could have application as a creative tool for musicians, composers and electronic artists. Instead of twiddling knobs all day they could very quickly obtain novel and interesting sounds for use in their compositions. Indeed, the process of evolving a sound could even form the basis of a composition. Secondly, it would open the world of sound synthesis to many people who don't have the time or patience to understand the algorithms. The only expertise you need is your opinion of 'what sounds good' and 'what sounds bad'. Finally, another application may lie with researchers experimenting with new kinds of synthesis algorithms. The system would provide an easy way to quickly test the boundaries of any new algorithm they discover.
1.2 Goals

The goals of this project were:

- To investigate the use of an Interactive Evolutionary Algorithm applied to sound synthesis.
- To determine whether this provides a useful tool for people interested in exploring the space of possible sounds.
1.3 Objectives

The objectives accomplished in the process of achieving the above goals are summarised below:

- To conduct a review of the previous research done in this field. This will serve as a background for an informed investigation.
- To assess the possible avenues of implementation for the idea and decide on the most feasible.
- To execute this most feasible avenue of implementation - resulting in a system that embodies the goals of this project.
- To collect data (by conducting appropriate experiments) in order to assess whether the system is effective or not.
- To analyse the collected data in a suitable manner and so assess the feasibility of this idea for the applications suggested.
- To document the findings of this investigation in a concise and legible manner - accessible to people in related fields of research, or anyone interested in the results.
1.4 Outline

So far we have defined the goals and objectives of this thesis; the rest of the report is structured as follows:
Chapter 2 Following the standard format of an undergraduate thesis, we begin with a review of the relevant literature. This will provide a picture of what has already been achieved in this area of research and illustrate the originality of this project.

Chapter 3 We then consider the possibilities for implementation of the project. The pros and cons of each design issue are discussed and the final implementation decision is described and justified. The problem of measuring the project's success is also discussed.

Chapter 4 Next, we describe how the software for this project was developed. This is presented in the form of a series of experiments, each with its own aim and conclusion. The final experiment in this series combines all the previous discoveries into a complete system.

Chapter 5 This chapter sees the complete system fashioned into an experiment that compares three different ways of searching the same sound space. The experiment is described and the results collected are reported and analysed.

Chapter 6 Finally, this chapter summarises the findings of this investigation and suggests avenues of future research which could expand on the work presented here.

So without further ado, let's get down to business2... I hope you enjoy it!
2 A note on grammar: throughout this document there is occasional use of language traditionally regarded as informal. For example, the use of dashes '-' to add afterthoughts, and the word 'But' at the beginning of a sentence. It is the author's opinion that in some circumstances the use of informal language helps to communicate ideas, and this is the justification for its use in this document. Informal language also serves the purpose of adding the occasional 'personal touch' which can promote the reader's interest in an otherwise bland report.
Chapter 2

Previous Work

2.1 Introduction

This chapter gives a summary of the background research that was conducted for this project. First a brief introduction to sound synthesis is given, followed by a general introduction to the field of Evolutionary Algorithms. This is followed by a review of the research done in applying these techniques to sound synthesis. Next, applications of Interactive Evolutionary Algorithms are reviewed, and finally a short summary is given. This will demonstrate the validity and originality of this project in the context of the research described.
2.2 Sound Synthesis

The process of synthesizing sound has a long and detailed history that is beyond the scope of this document. Instead, this section aims to give a very brief introduction, focusing on aspects relevant to this thesis.
2.2.1 Describing Sound

Before methods of synthesizing sound are discussed, it is worth discussing some characteristics of sound itself. The basic properties of a (musical) sound are its pitch, amplitude, duration and timbre:
Pitch

Pitch is our perception of the frequency of a sound. Frequencies are usually measured in Hertz (Hz), a measure of the number of cycles per second in the sound. Most sounds have many frequencies present at the same time. If these frequencies are related in a harmonic series then the pitch we hear for the complete sound is that of the lowest frequency, or the fundamental. People are very used to the idea of controlling the pitch of a sound - take for example the piano. Each key on the keyboard generates a note of a different pitch. You can change the pitch of the sound by pressing different keys on the keyboard.

Amplitude

Amplitude (or volume) is our perception of how loud a sound is. Amplitude is commonly measured in decibels (dB). Using the example of the piano again, you control the amplitude of the sound by striking keys with different speeds. The faster (or harder) you hit a key, the louder the resulting sound.

Duration

Duration is the time a sound lasts for. It is usually measured in seconds. On the piano the duration of the sound is controlled by the amount of time a key is held for. The longer you hold a key down, the longer the duration of the sound.

Timbre

This is much harder to describe. Timbre is "the characteristic tone quality of a particular class of sounds"1. You describe the timbre of a sound when you say things like "that sounds like a trumpet" or "that sounds very metallic". The timbre of all brass instruments is quite similar - a trumpet sounds like a trombone - yet the timbres of brass and wind instruments are quite different - a trumpet does not sound at all like a flute.

Although timbre cannot be measured on any scale, two important aspects that contribute to the timbre of a sound are its spectrum and its amplitude envelope. The amplitude envelope of a sound is the way the volume of a sound varies over its duration. Sounds whose envelopes have a sharp attack segment will be heard as percussive. Those with very long attack segments will sound like they are being played backwards.

The spectrum of a sound is determined by Fourier analysis, which describes it as a sum of simple sinusoids. Changing the spectrum of a sound can radically alter its characteristics. A sound with a simple or narrow spectrum will sound "thin" and "pure" - like a sine wave. A sound with a wide spectrum will sound "rich" and "buzzy" - like a square wave.

In our example of the piano, there is no way to control the timbre of the sound2. It was not until the advent of electronic musical instruments that very radical changes in timbre could be effected. Today, a synthesizer can provide many controls that affect the timbre of a sound in different ways.

1 From page 48 of [DJ85].
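The contrast between a narrow and a wide spectrum can be made concrete with a short Fourier analysis. The sketch below is illustrative only (not from the thesis); the window size, cycle count and magnitude threshold are arbitrary choices. It runs a naive discrete Fourier transform over a cycle-aligned sine wave and a square wave of the same frequency:

```python
import math

def dft_magnitudes(signal):
    """Naive DFT: normalised magnitude of each frequency bin in the first half."""
    n = len(signal)
    mags = []
    for k in range(n // 2):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(signal))
        im = sum(-x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(signal))
        mags.append(math.hypot(re, im) / n)
    return mags

N, CYCLES = 128, 8  # 8 full periods in the window, so the fundamental lands in bin 8
sine = [math.sin(2 * math.pi * CYCLES * t / N) for t in range(N)]
square = [1.0 if (t * CYCLES / N) % 1.0 < 0.5 else -1.0 for t in range(N)]

sine_bins = [k for k, m in enumerate(dft_magnitudes(sine)) if m > 0.05]
square_bins = [k for k, m in enumerate(dft_magnitudes(square)) if m > 0.05]
print(sine_bins)    # [8] - a single component: the "thin", "pure" narrow spectrum
print(square_bins)  # [8, 24, 40, 56] - odd harmonics: the "rich", "buzzy" wide spectrum
```

The square wave's energy at odd multiples of the fundamental is exactly the "sum of simple sinusoids" picture described above.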
2.2.2 Synthesis Algorithms

Research in sound synthesis is usually conducted with musical applications in mind. As a result, a lot of work has been done in finding ways to simulate the sounds of various acoustic instruments. Along the way, many completely new and unnatural sounds are often discovered. These (usually very large) portions of the unnatural timbral space can only be accessed via sound synthesis algorithms.

Synthesis algorithms can be implemented with analog electronics or digital computers. Early analog sound synthesisers were large machines3, laden with knobs4 and strewn with patch cables, which generated all manner of weird and wonderful sounds. To create a sound, one manually "patched" the output of an oscillator, for example, to the input of a filter or any other sound processing module which was available. Any such configuration of modules was called a patch5. The advantage of synthesis in this manner was that each parameter (e.g. oscillator frequency, filter cut-off frequency, etc.) had an associated knob which could be "tweaked" by the user. This real-time tweaking of knobs, combined with the ability to patch the output of any module to the input of another, afforded an intuitive and enjoyable way to explore the space of possible sounds. The disadvantages of modular analog synthesis are its high cost and the sheer physical size of the machines. If these are concerns, you should probably implement your synthesis algorithm on a digital computer.

2 You could make very minor alterations to the timbre by doing things like opening the lid of the piano - this would cause a slight difference in tone quality.
3 Indeed, decent modern analog synthesisers are still very large.
4 Look back at figure 1.1.
5 A term still in use today and throughout this document.
The advantage of synthesising sound on computers is that any analog configuration can be simulated, but the cost and size of the machine is much smaller. However, you lose the intuitive control (and usually the real-time response) of an analog system. The next few sections describe some popular sound synthesis algorithms as explained in [Pre92], [DJ85] and [Wil88]. Although these are usually implemented on digital computers, in order to understand them it is sometimes easier to picture them in an analog implementation.
2.2.3 Additive Synthesis

This is the most direct way to form any kind of sound spectrum you desire. For each sinusoidal component (harmonic)6 of the desired spectrum, you use a separate sinusoidal oscillator to produce it. The outputs of all the oscillators are then simply added together. In most natural sounds, the volume and pitch of each sinusoidal component varies with time. To mimic this, two envelopes are used to control each oscillator. One envelope controls the amplitude and the other varies the pitch. This results in the patch illustrated in figure 2.1. Here, you can see each oscillator being controlled by a volume envelope generator (VEG) and a pitch envelope generator (PEG).

Figure 2.1: Patch required for additive synthesis - taken from [Wil88].

6 The terms 'sinusoidal component' and 'harmonic' are usually interchangeable.
Using this technique, researchers have been able to very accurately synthesize the sounds of many instruments. All they have to do is analyse a given instrument tone (via Fourier decomposition) to determine the shape of the envelopes for each harmonic. These envelopes are then used to control the volume of oscillators set at the correct frequencies. The resulting sound is almost indistinguishable from the original instrument tone. There are a few disadvantages of the additive synthesis technique however. Firstly, because a separate oscillator and envelope generator are required for each harmonic (typically there are more than ten harmonics) it is computationally expensive. If a fast response is required between specification of the parameters and synthesis of the sound, you will need either a very powerful computer or specialised hardware. Secondly, because this technique involves specification of so many parameters, it can be hard for a musician to achieve a desired sound.
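The additive idea reduces to a few lines of code. The sketch below is a hypothetical illustration, not the thesis software: the sample rate, harmonic frequencies and envelope breakpoints are made-up values, and the pitch envelopes of figure 2.1 are omitted for brevity (each harmonic here gets a volume envelope only):

```python
import math

SAMPLE_RATE = 8000  # assumed for the sketch; real systems use 44100 Hz or more

def linear_envelope(points, t):
    """Piecewise-linear envelope: points are (time, value) pairs sorted by time."""
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return 0.0

def additive(harmonics, duration):
    """Each harmonic is (frequency_hz, volume_envelope). Returns raw samples."""
    n = int(duration * SAMPLE_RATE)
    out = []
    for i in range(n):
        t = i / SAMPLE_RATE
        out.append(sum(linear_envelope(env, t) * math.sin(2 * math.pi * f * t)
                       for f, env in harmonics))
    return out

# A 220 Hz fundamental plus two harmonics, each with its own volume envelope.
tone = additive([(220.0, [(0.0, 0.0), (0.05, 1.0), (1.0, 0.0)]),
                 (440.0, [(0.0, 0.0), (0.10, 0.5), (1.0, 0.0)]),
                 (660.0, [(0.0, 0.0), (0.20, 0.3), (1.0, 0.0)])],
                duration=1.0)
print(len(tone))  # 8000 samples = one second of audio
```

Even this toy version already needs six numbers per harmonic; a realistic tone with more than ten harmonics illustrates the parameter explosion described above.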
2.2.4 Subtractive Synthesis

In contrast to additive synthesis, subtractive synthesis starts with a dense spectrum and carves away selected portions to produce the desired sound. Instead of sine wave oscillators, square and sawtooth wave oscillators are used and combined with noise generators. The dense spectrum that results is passed to various combinations of high pass, low pass or band pass filters. To create sounds that change their character (timbre) over time, envelope generators and other oscillators are used to control the cut-off frequencies of the filters. Figure 2.2 illustrates the concept of subtractive synthesis. Here, the filter is controlled by an envelope generator (EG) and a low frequency oscillator (LFO).

Figure 2.2: A basic subtractive synthesis patch - taken from [Wil88].

Subtractive synthesis is useful for imitating instruments with harmonic spectra
such as wind and string instruments. Inharmonic spectra (such as the sounds of bells and drums) can be produced by combining other oscillators or using other devices such as ring modulators. The range of sounds producible with this technique depends on the way in which you interconnect modules rather than the numbers you throw at an algorithm.
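A minimal digital analogue of "dense spectrum in, filtered spectrum out" is white noise run through a one-pole low-pass filter. This is an illustrative sketch only (the filter coefficient and the roughness measure are arbitrary choices, standing in for the analog filter modules of figure 2.2):

```python
import random

def one_pole_lowpass(signal, alpha):
    """y[n] = y[n-1] + alpha * (x[n] - y[n-1]); smaller alpha = lower cut-off."""
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out

random.seed(1)
noise = [random.uniform(-1.0, 1.0) for _ in range(10000)]  # dense, full spectrum
dark = one_pole_lowpass(noise, alpha=0.05)  # high frequencies carved away

def roughness(sig):
    """Mean sample-to-sample change: a crude proxy for high-frequency content."""
    return sum(abs(b - a) for a, b in zip(sig, sig[1:])) / (len(sig) - 1)

print(roughness(noise) > 5 * roughness(dark))  # True: filtering smooths the signal
```

Sweeping `alpha` over time with an envelope generator is exactly what gives subtractive patches their evolving character.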
2.2.5 Frequency Modulation

Frequency Modulation (FM) is a very popular synthesis method that was first definitively described by John Chowning [Cho73]. FM synthesis involves using one oscillator (the modulator) to control the frequency of another (the carrier). The simplest FM setup is depicted in figure 2.3.

Figure 2.3: The most basic setup required for FM synthesis - taken from [DJ85].

In order to understand the effect of FM, imagine that the frequency of the modulating oscillator (fm) is very low (1 to 10 Hz). The output of the carrier will be a tone that wavers up and down in pitch. The pitch will vary between (fc - d) and (fc + d), where fc is the frequency of the carrier and d is the amplitude of the modulator. Musicians call this wavering pitch effect "vibrato". If the amplitude (d) of the modulating oscillator is increased, the tone will waver up and down more wildly - like a police
siren. Thus, increasing (d) increases the depth of the vibrato. If the frequency of the modulating oscillator (fm) is now increased, the tone will waver up and down faster. As the modulating frequency goes into the audio range (above 20 Hz), the output of the carrier is no longer heard as a vibrato, but obtains a distinct timbre of its own. Many different timbres can be obtained by varying the frequency and amplitude of the modulating oscillator. Remember this is only the simplest FM patch; a more complex one is shown in figure 2.4. Here, the constant d is replaced by an envelope generator. The envelope dynamically changes the amplitude of the modulating oscillator over time. This results in a time varying spectrum at the output - a very interesting sound that changes its character over time. Notice also that the amplitude of the carrier is shaped by an envelope. This envelope dynamically controls the volume of the time-varying spectrum.
Figure 2.4: A more complicated FM patch - taken from [DJ85].

Recall that two important contributors to the timbre of a sound were its spectrum and amplitude envelope7. This simple FM algorithm has achieved control of both of these! Compare this to the additive synthesis approach, which required tens of oscillators. In the late 60's and 70's, the prospects of FM synthesis seemed so good that Yamaha purchased the rights to it8.

7 See section 2.2.1.
8 See [Wil88] page 52.
However, the disadvantage of FM is that it is hard to control. For a given set of parameters, it is difficult for a human to predict how the result will sound.
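The simple patch of figure 2.3 can be sketched as follows. This is a hypothetical illustration, not code from the thesis: in this discrete form the modulator is applied to the carrier's phase (Chowning's formulation), with a modulation index playing the role of the depth d, and all parameter values are made up:

```python
import math

SAMPLE_RATE = 8000  # assumed for the sketch

def simple_fm(fc, fm, index, duration):
    """Two-oscillator FM: sin(2*pi*fc*t + index * sin(2*pi*fm*t)).
    fc: carrier frequency (Hz); fm: modulator frequency (Hz); index: depth."""
    n = int(duration * SAMPLE_RATE)
    return [math.sin(2 * math.pi * fc * (i / SAMPLE_RATE)
                     + index * math.sin(2 * math.pi * fm * (i / SAMPLE_RATE)))
            for i in range(n)]

# fm far below the audio range: heard as vibrato, a slow wobble in pitch.
vibrato = simple_fm(fc=440.0, fm=5.0, index=2.0, duration=0.5)
# fm in the audio range: no longer a wobble, but a new timbre with sidebands.
timbre = simple_fm(fc=440.0, fm=220.0, index=2.0, duration=0.5)
print(len(vibrato), len(timbre))  # 4000 4000
```

The unpredictability discussed above shows up immediately: small changes to `fm` or `index` rearrange the sidebands in ways that are hard to anticipate by ear.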
2.2.6 Sampling

Sampling synthesis has become a lot more popular in recent years due to the falling price of digital electronics. Sampling is the process of converting an analog audio source into digital form (via an ADC9) and storing this information. The digital audio is stored as a string of bits and can be played back at any time via a DAC10. It can also be played back at different rates to simulate different pitches. For example, to synthesize a piano, you could sample just one note of a real piano. The rest of the notes can be synthesized by playing the sampled note back at different rates.

Sampling has proved very successful for synthesising all manner of instruments and usually provides a much more realistic imitation than any other synthesis method. There is a price to pay for this, however - it is very data intensive. To accurately synthesize an instrument you have to sample it a number of times at a very high rate. This can lead to huge amounts of data which has to be stored and processed. Compare this with a sophisticated FM algorithm, for which you only need to store about 20 parameters.
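Playing a sample back at a different rate can be sketched with simple linear interpolation. This is an illustrative sketch (real samplers use higher-quality interpolation, and the "sample" here is just a generated sine wave):

```python
import math

def resample(samples, rate_ratio):
    """Play `samples` back at `rate_ratio` times the original speed using linear
    interpolation; a ratio of 2.0 sounds an octave higher (and lasts half as long)."""
    out, pos = [], 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
        pos += rate_ratio
    return out

# A "sampled" note: one second of a 100 Hz sine at an assumed 8000 Hz rate.
note = [math.sin(2 * math.pi * 100 * t / 8000) for t in range(8000)]
octave_up = resample(note, 2.0)
print(len(octave_up))  # 4000: the same material at double pitch, half duration
```

The data-intensity trade-off is visible here too: the one-second "sample" is 8000 numbers, where a comparable FM patch stores only a handful of parameters.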
2.2.7 Other Methods

Other methods of sound synthesis include waveshaping, discrete summation synthesis, group synthesis and many others. Receiving particular attention in recent times is granular synthesis, which creates sound by combining together tiny sound "grains". However, we cannot possibly hope to cover all these methods - and besides, they are not really relevant to this thesis (although future projects may explore artificial evolution applied to these and other, even more bizarre synthesis methods).

And now for something completely different... we move to a discussion of Evolutionary Algorithms. The material that follows is based largely on the excellent framework presented by Bäck [Bäc95].
2.3 Evolutionary Algorithms (EA's)

Evolutionary Algorithms (EA's) are a broad class of optimisation techniques that mimic the process of biological evolution. In nature, populations of organisms adapt to their environment through a process of reproduction and selection. Unfit organisms do not survive to reproduce, which leaves the fitter organisms to reproduce and dominate the population. Occasionally, mutations in the genes of an individual organism will lead to a more successful creature. This creature's genes will come to dominate the gene pool and so raise the level of fitness of the population. Likewise, the EA works by optimising artificial genes which represent parameters of the problem you are trying to solve. The following pseudo code shows basically how the EA works (adapted from [Bäc95] page 66):

Initialise the population's genes randomly.
While (population unfit), do
    Recombine genes of individuals with others (mating).
    Mutate genes of individuals.
    Evaluate the fitness of each individual.
    Select the individuals which form the new population.
od.
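The loop can be rendered concretely in a few lines. In the sketch below, every blank is filled in by an illustrative choice - a toy fitness function that just sums the genes, discrete recombination, Gaussian mutation, and truncation selection - none of which is prescribed by the pseudo code itself:

```python
import random

POP_SIZE, GENES = 8, 4

def fitness(ind):
    # toy stand-in for a real evaluation step: higher gene sum = fitter
    return sum(ind)

def recombine(a, b):
    # discrete recombination: each gene is copied from one parent or the other
    return [random.choice(pair) for pair in zip(a, b)]

def mutate(ind, rate=0.2, step=0.1):
    # Gaussian mutation: occasionally nudge a gene by a small random amount
    return [g + random.gauss(0, step) if random.random() < rate else g
            for g in ind]

# initialise the population's genes randomly
pop = [[random.random() for _ in range(GENES)] for _ in range(POP_SIZE)]

for generation in range(50):  # stands in for "while (population unfit)"
    children = [mutate(recombine(*random.sample(pop, 2)))
                for _ in range(POP_SIZE)]
    # select the fittest individuals to form the new population
    pop = sorted(pop + children, key=fitness, reverse=True)[:POP_SIZE]

print(round(fitness(pop[0]), 2))  # best individual's fitness
```

Because selection here keeps the best of parents and children together, the best fitness can never decrease from one generation to the next.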
This description skips a lot of detail. Each of the operations in the above pseudo code needs careful consideration for effective implementation:

Initialisation involves setting the genes of each individual in the population to a random number. What probability distribution should you use?

Recombination (mating) can be sexual (involving just two individuals from the parent population) or panmictic (involving three or more individuals). Selecting the individuals that mate with each other is usually random. There are also discrete and intermediate forms of mating. Discrete recombination forms a child by copying genes from one parent or the other. Intermediate recombination allows interpolation of gene values from parents to form the child.

Mutation introduces slight variations into the population by randomly modifying genes of individuals. You have to decide what proportion of the parent population is subject to mutation and also the degree of mutation - how different should the child be from the parent?

Evaluation is a big step. First, you have to translate each individual from a genotype to a phenotype (how you do this will depend on how the genes are encoded). You then have to evaluate each phenotype using a
fitness function. The fitness function must give an indication of how close an individual is to the optimum solution. Finally, once you have found the relative fitness of each individual, you must select which individuals survive to form the next population. This can be as simple as just selecting the n most fit individuals (literally "survival of the fittest"), or more complicated. Usually, an individual is more likely to be selected if it has a high fitness score, but unfit individuals still have a chance of being selected.

As the field of EA's is relatively new, there is still much debate as to the best method for implementing all these operations (a lot of it is guesswork). There are currently three main flavours of EA, which each handle these problems in different ways. They are Genetic Algorithms, Evolution Strategies and Evolutionary Programming.

2.3.1 Genetic Algorithms (GA's)

The Genetic Algorithm (GA) seems to be the most popular form of EA and no doubt you have already heard of it. GA's were developed in America by John Holland [Hol75] and popularised by David Goldberg [Gol89] (for an entertaining and inspiring story of the discovery of the GA, see [Lev93]). The most distinctive feature of the GA is its utilisation of a binary encoding for genes. That is, the mutation and recombination operations operate on raw bitstrings - sequences of zeros and ones. This leads to very simple operations and a very versatile problem solver. Since everything on a computer is represented as a bitstring at the lowest level, you can apply a GA to a very wide range of problems. A disadvantage of this approach is that implementation is complicated, as it is often difficult to operate on single bits.

Mutation is achieved by very occasionally 'flipping a bit' (changing a zero to a one, or vice versa) of an individual. Recombination works in two steps. First, a number of crossover points are chosen along the length of the genome. Next, the child genome is formed by taking the first parent's genome up until the first crossover point, taking the second parent's genome until the next crossover point, and so on. This process is illustrated in figure 2.5. There are two possible children formed from the same set of crossover points, depending on which parent is chosen first. In GA literature, the recombination process is often just referred to as "crossover".

GA philosophy stresses that mutation is a background operator and that crossover does the real work of optimisation. This claim is backed up by experimental evidence and
Figure 2.5: An illustration of 5 point crossover - taken from [Bäc95].

also Holland's schema theorem [Hol75]. This theory of schemata basically proposes that crossover raises the fitness of the population by combining "building blocks" of high fitness together in different ways (see [Hol75] for a lot more detail). Although GA's are very versatile, they are often slow to converge on a solution. Bäck demonstrated that GA's converged slower than both Evolution Strategies and Evolutionary Programming in a number of test problems [Bäc95].
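The two-step crossover operation just described can be sketched as follows (an illustrative implementation, not taken from any of the cited systems):

```python
import random

def crossover(parent_a, parent_b, n_points):
    """n-point crossover on bitstrings: choose n crossover points,
    then copy from parent A up to the first point, from parent B up
    to the second, and so on. Returns both possible children."""
    assert len(parent_a) == len(parent_b)
    points = sorted(random.sample(range(1, len(parent_a)), n_points))
    child1, child2 = [], []
    src = 0   # which parent child1 is currently copying from
    prev = 0
    for p in points + [len(parent_a)]:
        seg_a, seg_b = parent_a[prev:p], parent_b[prev:p]
        child1 += seg_a if src == 0 else seg_b
        child2 += seg_b if src == 0 else seg_a
        src ^= 1  # swap source parents at each crossover point
        prev = p
    return child1, child2

a, b = [0] * 8, [1] * 8
c1, c2 = crossover(a, b, 3)
```

With all-zero and all-one parents the two children are exact complements of each other, which makes the segment-swapping easy to see.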
2.3.2 Evolution Strategies (ES's)

The Evolution Strategy (ES) - or, more correctly, Evolutionsstrategie - was developed in Germany independently of the GA [Bäc95]. ES's use a real encoding for the genome; each gene is represented by a real number, as opposed to a single bit in the GA. An individual is formed by a vector of real numbers. While this means that ES's are not as flexible as GA's, they are often much easier to implement. In the most basic ES, an individual a is represented by a vector of object variables x, which are the parameters being optimised:

a = (x_1, x_2, ..., x_n)

Mutation is performed by adding a small amount of noise to each parameter:

x'_i = x_i + N(0, σ_i)

where N(0, σ) denotes a normally distributed random variable with zero mean and standard deviation σ. The standard deviation σ_i may be different for each parameter x_i and must be specified in advance. The child individual is formed as a vector of all the mutated parameters:

a' = (x'_1, x'_2, ..., x'_n)
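In code, this mutation operator is only a couple of lines. A sketch, with arbitrary example values standing in for synthesis parameters:

```python
import random

def es_mutate(x, sigmas):
    """Gaussian mutation of a real-encoded individual:
    x'_i = x_i + N(0, sigma_i), one standard deviation per gene."""
    return [xi + random.gauss(0, si) for xi, si in zip(x, sigmas)]

parent = [440.0, 0.5, 3.0]    # e.g. frequency, amplitude, mod index
sigmas = [10.0, 0.05, 0.2]    # per-parameter standard deviations
child = es_mutate(parent, sigmas)
```

Note that each parameter gets its own step size, which is exactly what makes choosing the σ values in advance awkward - and motivates the self adaptation described below.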
This mutation procedure is all we need to implement the simple (1+1)-ES: an evolution strategy where 1 parent is mutated to form 1 child. (This notation, used in [Bäc95] to distinguish the different flavours of ES, gives first the size of the parent population, then the size of the child population formed from it. If the numbers are separated by a '+' (plus), the new parent population is chosen from the old parents and the children; if they are separated by a ',' (comma), the new parent population is chosen only from the children.) The best individual out of the parent and child is chosen to become the parent of the next generation, and the process is then repeated. Obviously recombination (mating) is impossible with a parent population of one.

Much more complex kinds of ES arise when populations of more than one are considered. The (μ + λ)-ES works by recombining μ parents, which are then mutated to form λ children. The best μ individuals are chosen from the parents and children to form the parent population of the next generation. The current state-of-the-art evolution strategy is the (μ, λ)-ES. Here again the parents are recombined then mutated to form children, but the new parent population is selected only from the children.

The current state-of-the-art Evolution Strategies also employ self optimisation. Individuals are not only represented by a vector of object variables x as before, but also by a vector of standard deviations σ and a vector of rotation angles α:

a = (x, σ, α)

The standard deviations and rotation angles are themselves subject to mutation as described above. After these have been mutated, they are used to modify the probability distribution for mutation of each object variable. Thus each object variable is mutated by a special normal distribution that is being optimised as well [Bäc95]:

x'_i = x_i + N_i(0, C(σ', α'))

In the above equation, C(σ', α') is a covariance matrix built from the mutated standard deviations and rotation angles. If a variable x_i needs a large standard deviation, its σ_i will eventually be mutated to a larger value. Using this method, you do not have to worry about specifying standard deviations for each variable; they will be set to optimum values automatically.

In contrast to GA's, mutation is stressed as the main operator in ES's. The contribution of recombination is not disregarded, but all the self optimisation features described above are for the benefit of mutation, not recombination. Bäck demonstrates that the ES converges much faster than both GA's and Evolutionary Programming on a variety of fitness landscapes [Bäc95].
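A (1+1)-ES needs little more than the mutation operator above. A sketch with a toy objective - minimising squared distance to a target vector, an objective fitness function used purely for illustration:

```python
import random

def one_plus_one_es(target, steps=2000, sigma=0.1):
    """(1+1)-ES: one parent is mutated to form one child, and the
    better of the two becomes the parent of the next generation."""
    parent = [0.0] * len(target)

    def cost(x):  # lower is better
        return sum((xi - ti) ** 2 for xi, ti in zip(x, target))

    for _ in range(steps):
        child = [xi + random.gauss(0, sigma) for xi in parent]
        if cost(child) <= cost(parent):  # selection from {parent, child}
            parent = child
    return parent

best = one_plus_one_es([1.0, -2.0, 0.5])
```

Because the better of parent and child always survives, the cost never increases; with a fixed σ the search stalls once the remaining error is on the order of the step size, which again is the behaviour self adaptation is designed to fix.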
2.3.3 Evolutionary Programming (EP)

Evolutionary Programming (EP) is basically the American equivalent of the ES (though it seems that EP is not as advanced as the ES). It uses a real encoding for genes and has some self optimising features. In EP, variances are stored in the genome, as opposed to ES's, which store standard deviations. Finally, EP has no recombination operator and relies on the power of mutation alone. It still performs admirably however, converging faster than GA's (though slower than ES's) on a variety of fitness landscapes [Bäc95]. We now move on to look at how EA's have been applied to the field of sound synthesis.
2.4 EA's for Sound Synthesis

Almost all of the published research on EA's applied to sound synthesis has been conducted by Andrew Horner and his colleagues. He first applied a GA to a musical application with David Goldberg in 1991 [HG91]. Given a starting and finishing pattern of notes, they used a GA to evolve patterns of notes that bridged the start pattern to the finishing pattern (for musicians: this process is called "Thematic Bridging"). The results were gratifying and the authors suggested a further application could be found in using a GA to evolve the timbre of a sound, as opposed to a sequence of notes. (Evolving timbres with an EA is exactly what this thesis is concerned with, so on reading this I was very interested to find out whether Horner had followed up his suggestion.) This is exactly what Horner went on to do in subsequent years, publishing a veritable barrage of papers on the subject. These are shown in table 2.1.

In his first paper on the subject, Horner et al. outline just why GA's are useful for sound synthesis [HBH93]:

FM synthesis is a very efficient, though not always easily controlled technique for generating interesting sounds.

The paper reviews the techniques used so far to solve the problem of choosing parameters for FM synthesis and labels them "ad hoc." Instead [HBH93]:

The task of finding FM parameters to match musical tones is typical of problems that defy traditional optimisation, yet are suited to [a] genetic algorithm solution.
Year  Title                                                     Reference
1993  GA's and their Application to FM Matching Synthesis       [HBH93]
1995  Wavetable Matching Synthesis of Instruments with GA's     [Hor95a]
1995  Envelope Matching with Genetic Algorithms                 [Hor95b]
1996  A GA Based Method for Synthesis of Low Peak Amp Signals   [HB96]
1996  Group Synthesis with Genetic Algorithms                   [CH96a]
1996  Common Tone Adaptive Tuning using Genetic Algorithms      [HA96]
1996  Discrete Summation Synthesis using GA's                   [CH96b]
1997  Hybrid Sampling-Wavetable Synthesis with GA's             [YH97]

Table 2.1: Papers published by Andrew Horner et al. on the subject of applying GA's to sound synthesis

All of the papers in table 2.1 use the same basic method to apply a GA to a sound synthesis algorithm (the synthesis algorithm itself was different for each paper). First, a sample of the instrument to be synthesised is taken, for example a trumpet tone. This original sample forms the basis of the fitness function by enabling a comparison between it and the synthesised sound. The GA then chooses parameters for the synthesis algorithm. The tone is synthesised and a fitness score is obtained from the relative spectral error between the synthesised tone and the original sample. As the generations pass, each population of individuals matches more closely with the sampled tone. In the example of the trumpet tone, as each generation passes you would hear synthesised tones that get closer and closer to the sampled trumpet sound.

The results of this technique were very successful in all of the papers in table 2.1. Most of the papers conclude that the Genetic Algorithm method of choosing synthesis parameters is much more effective than any of the "ad hoc" methods used to date.

Horner and his colleagues' work may be extensive, but note that his method does not involve human interaction. The user of the system gives only a sample of the instrument they want synthesised, and the GA goes away and does the rest. This implies that the method is limited to producing sounds that already exist. That is good news for this project: Horner's work has shown that GA's can be successful when applied to sound synthesis algorithms, but he hasn't tried an interactive system capable of producing unknown, novel and unheard-of sounds.
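Horner's fitness scores are based on relative spectral error between the sampled and synthesised tones. The sketch below shows one plausible form of such a measure; the exact formula varies between the papers, so the squared-error form here is an assumption, and the direct DFT is only practical for short frames:

```python
import cmath

def spectrum(samples):
    """Magnitude spectrum via a direct DFT (fine for short frames)."""
    n = len(samples)
    return [abs(sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2)]

def relative_spectral_error(original, candidate):
    """Error between a sampled tone and a synthesised candidate:
    0 means the spectra match exactly; lower is fitter."""
    target, trial = spectrum(original), spectrum(candidate)
    num = sum((a - b) ** 2 for a, b in zip(target, trial))
    den = sum(a ** 2 for a in target) or 1.0
    return num / den

tone = [1.0, 0.0, -1.0, 0.0] * 4   # crude stand-in for a sampled tone
print(relative_spectral_error(tone, tone))  # → 0.0
```

A GA would minimise this value: a perfect spectral match scores 0, and a candidate bearing no resemblance to the target scores near (or above) 1.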
2.5 Interactive Evolutionary Algorithms (IEA)

So far we have only considered Evolutionary Algorithms where the fitness function is objective. For example, in Horner's work, the fitness of a synthesised instrument
tone was calculated from the relative spectral error between it and the original sampled tone. What if, instead, a human steps in and tells the computer how good the synthesised tone sounds? In this case the fitness function becomes subjective. There has actually been a fair amount of investigation into EA's which use subjective, human-supplied fitness functions. This section presents a summary of this work, in chronological order, so you can follow the story as it developed.
2.5.1 Dawkins' Biomorphs

Richard Dawkins [Daw86] was the first to demonstrate the power of a subjective fitness function. He aimed to construct a system that would demonstrate the role of mutation in evolution; his system would enable a human user to "breed" pictures of trees and plants. He was astounded when, testing his system for the first time, after only 20 generations he had evolved bug-like forms that he later called "biomorphs".

Dawkins' system (described in much more detail in [Daw88]) worked with a simple recursive tree drawing algorithm that took nine parameters. These parameters determined the form of the resulting tree. The user of the system is presented with a screen of "mutant offspring" trees that are generated by randomly changing one parameter of the "parent" tree. The user examines the offspring and selects the most aesthetically pleasing (this is artificial selection, as opposed to natural selection) - the chosen tree goes on to be the parent for the next screenful of mutant progeny. Figure 2.6 demonstrates evolution using Dawkins' system. The picture shows each individual biomorph in a box. A line between boxes connects each biomorph to its parent, forming a sort of "family tree". You can see how quick the process is - a bug-like creature is evolved in just a few generations.

This mutation and selection process produced surprising results. Dawkins quickly found that the scope of his simple algorithm was not limited to drawing trees. He was able to produce all manner of plants, bugs and sea creatures. He even managed to evolve the letters of his own name! This variety of forms was not at all apparent when he first designed the algorithm. By manually plugging numbers into the algorithm, he may never have found the variety of forms he saw. It was by using the EA as a way to explore the parameter space of the algorithm that he was able to see its real potential.
Figure 2.6: Evolution of “biomorphs” with Dawkins’ system - taken from [Daw86].
2.5.2 Oppenheimer's Artificial Menagerie

Peter Oppenheimer also described a system of interactive artificial evolution [Ope88]. However, his images were more complex than Dawkins'. Oppenheimer used an algorithm with 15 parameters to evolve impressive three dimensional plant forms (see figure 2.7).
Figure 2.7: Example of an evolved plant form - taken from [Ope88].

As in Dawkins' system, mutation is the only genetic operator used and "Fitness, of course, is in the eye of the beholder" [Ope88].

2.5.3 Smith's Bugs

Joshua Smith extended Dawkins' work with "biomorphs," as well as formalising some concepts about GA's with subjective fitness functions [Smi91]. Smith implemented a system like Dawkins' but with a larger breeding population and a genetic recombination operator added. This allowed the user to select multiple biomorphs and have them mate, as opposed to just mutating them. Also, a two dimensional Fourier series was used to generate the biomorphs, as opposed to Dawkins'
recursive tree drawing algorithm. Here, the genes of an individual are Fourier coefficients. Figure 2.8 shows a sample of biomorphs evolved using Smith's system.
Figure 2.8: Biomorphs evolved with Smith's system - taken from [Smi91].

Smith defined the term Interactive Genetic Algorithm (IGA) to refer to systems where a human user acts as the fitness function for a GA. (Smith actually uses a real encoding for his genes, which implies that he is using an ES as opposed to a GA, GA's being characterised by binary encodings. However, Smith considers ES's to be a subset of GA's, as he states in [Smi94]. This is contrary to Bäck's view that ES's and GA's both fall under the umbrella of EA's [Bäc95]. Likewise, I use the term Interactive Evolutionary Algorithm (IEA) to refer to an Evolutionary Algorithm (EA) that uses a human user as the fitness function; I believe this to be a more accurate term, as IGA implies the use of a binary encoding.) Three criteria are specified for applicability of an IGA to a problem [Smi91]:

- The problem can be formulated as a search through a parameter space.
- Candidate solutions to the problem can be generated in near real time.
- The utility of candidate solutions can be compared by humans, but not (practically) by means of a precisely specified formula.
Note that the problem of generating novel and interesting sounds with synthesis algorithms (artificial evolution for sound synthesis) satisfies these criteria:

- Finding interesting sounds is a search through the algorithm's parameter space.
- Current computing power allows synthesis of sound from parameters in near real time (not to mention the real time synthesis available with specialised hardware).
- Deciding whether a sound is "novel" or "interesting" cannot be achieved with a precisely specified formula, but can easily be accomplished by a human.
2.5.4 Sims' Artificial Evolution

Karl Sims [Sim91] constructed one of the most elaborate demonstrations of these techniques to date (indeed, this paper was the main inspiration for this thesis). Sims used a powerful super-computer to generate and evolve astounding images (see figure 2.9).

Figure 2.9: An image generated from an evolved symbolic lisp expression - taken from [Sim91].
Sims evolved 3D plant structures generated from procedural models as in [Ope88], but he also added four different recombination operators. This allowed the user to mate plant structures in a variety of ways to form all manner of offspring. Sims went further and used symbolic lisp expressions as genotypes for the evolution process. In this case, evolution was not limited to the number of parameters in a procedural model, but was "open ended." The symbolic expressions were mutated and mated to generate 2D images (see figure 2.9), 3D volume textures and even animations. Although the resulting images are rather abstract, they are nevertheless beautiful. In a later paper, he reported that he had extended the system to evolve 3D shapes and even 2D dynamical systems [Sim93].
2.5.5 Moore's GAMusic 1.0

Jason Moore's program "GAMusic 1.0" [Moo94] is an interactive melody evolver and, until recently, was the closest example of work similar to this thesis. The program runs on Microsoft Windows and consists of a simple interface that enables the user to evolve melodies that play over the PC speaker. The user auditions each melody and assigns a fitness value (good, average or poor). Melodies are represented as a 128 bit binary string and a simple GA is used to recombine and mutate these. There are controls that enable adjustment of the mutation and recombination frequency.

After the user assigns a fitness value to each melody in the population, the GA mutates and recombines the binary strings that represent them. The newly created population is again auditioned by the user and the process is repeated until the user is satisfied with the melody. Figure 2.10 shows a screen shot of the GAMusic user interface. You can see the controls for mutation and recombination frequency, the population of 12 melodies (with fitness ratings) and the bit string that represents the current melody.

There are a number of problems with GAMusic. Since the melodies play over the PC speaker as simple bleeps, they are not very pleasing to listen to. Also, there is no way (except by ear) to obtain the melody data so you can use it in other applications (e.g. a composition that you are working on). However, the most frustrating aspect of GAMusic is that it takes a long time to evolve decent melodies. This is partly due to the time it takes to audition a single population. There are 12 melodies in each population, which can last up to 5 seconds each. On average, it takes about 1 minute to go through all the melodies and judge their fitness. Many populations must be auditioned to get a good melody - evolution progresses slowly and the user quickly becomes bored, especially after listening to the PC speaker bleeps for so long. In documentation for the
Figure 2.10: A screen shot of the GAMusic 1.0 user interface.

program, Moore mentions plans to develop a program to evolve more complex sounds using a PC sound-card (a program that evolves sounds being, of course, the topic of this thesis), but these plans have subsequently been canceled [Moo98].
2.5.6 van Goch's P-Farm

When I first discovered this program I was shattered. Arno van Goch's "P-Farm" [Goc96] achieved exactly what I hoped this project would - and it did it well! The problem was, I only discovered it halfway through the year, when my literature survey had been completed and I was thinking about implementation.

P-Farm is an experimental program that works with an external synthesizer. In its current implementation (version 0.3) it supports the Roland Juno 106 and Yamaha V50 synthesisers, and runs under Microsoft Windows. To use P-Farm, you must have an external synthesiser. The program works by evolving patches, then sending the patch information to the synthesiser via MIDI (Musical Instrument Digital Interface). The user can then play the synthesiser to evaluate the fitness of the patch that has just been sent. Figure 2.11 shows a screen shot of P-Farm's user interface. The external synthesizer, also required to use the system, is not shown.
Figure 2.11: A screen shot of P-Farm's user interface.

P-Farm uses a GA with a population of 32 patches. The fitness function is binary: the user can choose either to keep a patch or delete it. After you have gone through the population keeping only the fit patches, the deleted patches are replaced with new individuals formed by crossing and mutating the fit patches. The program lets you control three parameters of the GA, namely:

crossover ratio - what proportion of each parent's genes appear in the child.
mutation rate - the probability that a given gene will mutate.
transposition rate - the probability that genes will get transposed to different parts of the chromosome.

After you get used to it, you can breed quite good sounds using P-Farm. There are, however, some criticisms. Evaluating and keeping track of all 32 patches is sometimes difficult. You frequently have to go back to a patch and remind yourself of what it sounded like. Also, switching between the computer keyboard and the synthesiser keyboard can be frustrating. Although some of the program functions are accessible via the synthesiser keyboard, having to switch around can make evolution slow and painful.

The work of van Goch is very close to the subject of this thesis. So close, in fact, that I could base my work on extending or modifying the algorithms used in P-Farm. This
however is not a possibility. Although the program is still under active development, technical details (that is, source code that I could modify) are not available [Goc98].
2.6 Summary The survey of literature conducted gives a good indication that an Interactive Evolutionary Algorithm applied to sound synthesis would yield fruitful results. This is highlighted by the following facts:
- Techniques such as FM synthesis can produce a wide variety of timbres from a small number of parameters. However, choosing the right parameters is difficult.
- Evolutionary Algorithms are optimisation procedures that have proved to be useful in searching the parameter spaces of a wide variety of problems.
- Horner and his colleagues have shown that EA's can be successfully applied to sound synthesis algorithms in a non-interactive manner.
- Dawkins, Sims and Oppenheimer demonstrated the power of Interactive Evolutionary Algorithms for searching the parameter space of procedural models for computer graphics.
- The problem of evolving "interesting" and "novel" sounds fits Smith's criteria for applicability of an IEA.
- Very few people have tried to apply an IEA to sound synthesis before. Moore canceled his plans and the only other known work is that of van Goch, which is still very experimental.
So it seems that artificial evolution for sound synthesis is a good idea. Using the surveyed literature as a foundation we can now begin to design a system that will embody the goals of this thesis – this is the topic of the next chapter.
Chapter 3

System Design

3.1 Introduction

This chapter is concerned with the initial design of a system that can achieve the goals of this project. First, the problem is divided into a number of subproblems - this gives an outline of the basic system. Then, for each of these subproblems, issues related to implementation are discussed. Next, the design decisions actually taken are explained and justified. Finally, we look at ways of evaluating the system in order to determine whether the goals have been achieved.
3.2 System Outline

A system that applies an Interactive Evolutionary Algorithm to sound synthesis can be broken into three main parts, as shown in figure 3.1. Each part performs a different function:

User Interface - allows the user to audition individuals (hear the sounds produced by the Sound Synthesis section) and rate their fitness (evaluate them).

Evolutionary Algorithm - takes the fitness data from the User Interface and accordingly mates and mutates the synthesis parameters.

Sound Synthesis - generates the actual audio data from the sound synthesis parameters passed from the Evolutionary Algorithm section.
3. System Design
29
Figure 3.1: The three main components of the proposed system. (In the figure, the Evolutionary Algorithm sends synthesis parameters to the Sound Synthesis component, which delivers audio data to the User Interface; the User Interface returns the selected fit individuals to the Evolutionary Algorithm.)
3.3 Considerations For each part of the proposed system (figure 3.1), a number of issues have to be considered before you can decide on an implementation. These issues are discussed in the following sections.
3.3.1 Sound Synthesis

There are basically two ways you can implement this part: in hardware or in software.

Hardware

A hardware implementation would use an existing stand-alone synthesizer (sometimes abbreviated to 'synth' in this document). The Evolutionary Algorithm would send parameters to the synthesizer via MIDI (Musical Instrument Digital Interface - see [Boo87] for a good introduction). The user would then be able to audition the sound by playing the synthesizer as normal. This
is exactly the way that van Goch's system works (see section 2.5.6). The advantage of a hardware approach is speed. The only delay required is the time taken to transfer the parameter data (typically less than 200 bytes) from a computer to the synth. The user can then test out the synthesis algorithm in real time, playing any combination of notes desired.

The disadvantages of a hardware system are cost and inflexibility. Synthesizers are typically quite expensive and they only implement a very limited number of synthesis algorithms (most hardware synths implement only one). Since each synthesizer manufacturer has their own way of interpreting parameters, you could only hope to support one particular synth from one particular manufacturer. Users who didn't own that particular synth wouldn't be able to use the system. Even if they did own it, they would always be restricted to exploring the set parameter space that comes with that particular synth.

Software

Alternatively, a software implementation of the Sound Synthesis part would provide far greater flexibility at a lower cost, since no extra hardware would be required by the user. Anyone with a basic computer capable of sample playback (most computers these days come with sound cards, which provide this functionality) would be able to use the system. Freely available packages such as Csound [Cso98] enable the user to experiment with virtually any known synthesis algorithm. MatLab [Mat98] also allows construction of custom algorithms. MatLab is not as musically oriented as Csound, but it provides other useful features such as data visualisation and user interface tools.

Another advantage of a software Sound Synthesis part is the increased integration. Users don't have to divide their attention between a musical keyboard and a computer keyboard (this was one of the annoying features of van Goch's system - see section 2.5.6). They can audition individuals, then rate their fitness, all from the same place. However, the disadvantage of a software approach is speed. A software synthesis algorithm will always be slower than a dedicated hardware synth.

This is cause for concern: recalling one of Smith's criteria (reported in section 2.5.3), candidate solutions must be generated in near real time. Never fear - today's computing power is fast enough to satisfy this requirement as long as the algorithm is not too complex (there are now real time versions of Csound that run on Pentium processors).
3. System Design
3.3.2 Evolutionary Algorithm

Implementing the Evolutionary Algorithm part of the system also raises issues which must be considered. It is assumed that this part of the system will run on a small computer (UNIX or Windows), as this is the easiest and cheapest option. Given this, the following issues have to be considered:

Representation How will the genes of individuals be represented? The basic choice here is between a binary encoding (as in a GA) or a real encoding (as in an ES). Binary encodings are much more flexible, but they are harder to implement. A real encoding for genes leads to an easier implementation – each gene is just a real number representing a different parameter of the synthesis algorithm. It could also lead to faster convergence⁹ – that is, the user is able to find interesting sounds more quickly.

Operators What kind of genetic operators will be useful for a sound synthesis application? Mutation seems an essential operation, and is also fairly easy to implement. Mating (recombination) requires more thought and also has many alternatives¹⁰. Since hardly any work has been done in this field, it is not known which types of mating work better for an interactive sound synthesis application.

Random Numbers Whichever operators for the Evolutionary Algorithm are chosen, good random number generators will be required for their implementation. This fact should be kept in mind when making a decision on the implementation language.

Self Adaption Should any of the self-adaptive features of ES's¹¹ be used in the system?

⁹ Section 2.3 discusses the convergence rates of different EA's.
¹⁰ These were discussed in section 2.3.
¹¹ These were discussed in section 2.3.2.
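To make the representation choice concrete, here is a small sketch (in Python rather than the project's MatLab; the 16-bit width, parameter range and function names are illustrative assumptions) contrasting the two encodings for a single synthesis parameter:

```python
import random

# Hypothetical example: one synthesis parameter (a frequency in [0, 5000] Hz).
LO, HI, BITS = 0.0, 5000.0, 16

# Real encoding (ES style): the gene simply *is* the parameter value,
# and mutation is a direct numeric perturbation, clipped back into range.
def mutate_real(gene, spread=100.0):
    return min(max(random.gauss(gene, spread), LO), HI)

# Binary encoding (GA style): the parameter is mapped onto a fixed-width
# bit string, which must be decoded before the sound can be synthesized.
def encode(value):
    return format(int((value - LO) / (HI - LO) * (2**BITS - 1)), f'0{BITS}b')

def decode(bits):
    return LO + int(bits, 2) / (2**BITS - 1) * (HI - LO)

def mutate_binary(bits):
    # Flip one randomly chosen bit; note that flipping the most
    # significant bit moves the decoded value by half the range.
    i = random.randrange(len(bits))
    return bits[:i] + ('1' if bits[i] == '0' else '0') + bits[i + 1:]
```

A real gene for 440 Hz is simply `440.0`, whereas the binary gene is `encode(440.0)`, a 16-bit string that `decode` recovers only to within the quantisation step (about 0.08 Hz here) – one illustration of why the real encoding is the easier fit for continuous synthesis parameters.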
Self adaption has the potential to make convergence very fast. However, it has only been used in non-interactive ES’s with large populations – it may not work in an interactive system because population sizes are much smaller and are evolved for fewer generations.
3.3.3 User Interface

The User Interface component possibly requires the most design thought. The issue here is basically: how do you make the system easy and intuitive to use? Some User Interface considerations are:

Population size A user of the system can only hear one sound at a time¹². This is a major difference from the graphical systems of Dawkins, Sims and Smith¹³, where the user can very quickly audition a lot of images. Only being able to audition one sound at a time implies that the population size should be somewhat smaller. If it weren't, auditioning the whole population would take a long time, and the user may become bored.

Fitness rating Would it be better to have a binary rating (e.g. good or bad) or some kind of scale (e.g. 1 to 10)? This decision will also depend on the population size and the genetic operators available.

Implementation Platform Because the interface to the system is likely to be graphical, the possible implementation platforms need to be considered. Java provides very good GUI¹⁴ facilities and also has the advantage that it could be used over the World Wide Web. However, at present the audio support in Java is less than adequate – Java cannot generate or play audio files on most architectures. This

¹² A possible method to overcome this limitation involves use of the "cocktail party effect" [CE95]. This is the phenomenon whereby a person can distinguish multiple sound sources as long as they are spatially separated. For example, at a cocktail party where there are many conversations occurring at once, you are able to clearly understand the one you are focussed on.
¹³ Described in section 2.5.
¹⁴ Graphical User Interface.
could possibly be circumvented by using a CGI script to generate audio files. Most Web browsers support playback of audio files, so you could send the pre-generated audio files across the Web to the client. Alas, this idea is also impractical: audio files are typically quite large, and thus would take a long time to send over the Web. This would make the system slow and frustrating to use.

Microsoft Windows provides good GUI support once you come to grips with its ridiculous conventions. However, its sound playback utilities are very hard to use.

Tcl/Tk for Unix provides very good and easy to use GUI facilities. The powerful redirection features of Unix would make audio playback a breeze. MatLab also has easy to use GUI features, but these are limited in some respects.
3.4 The Design Decision

The final decision on the implementation of the three system parts was not reached in a single instant, but evolved¹⁵ over the course of the project. The initial experiments and system development (chapter 4) were conducted in MatLab. MatLab was very easy to use and provided a rapid development environment. Later in the project, when the basic system had been worked out, it seemed easiest to stick with MatLab. Also, the time saved in not having to re-implement the system in another language meant that more could be accomplished. As a result, the final system was implemented using the GUI features of MatLab (chapter 5). Implementing all three system parts in MatLab had a number of impacts. These are described as follows:

Sound Synthesis The sound synthesis algorithm ran in software. This cost less than using external hardware, but was slower. The slow speed of the software meant that users were limited to auditioning just one note of each sound. However, the advantage of having everything implemented in MatLab was increased integration - there were no problems switching between computer keyboards and synthesizer keyboards. A software approach also offers flexibility in the choice of synthesis algorithm; however, this project mainly focussed on FM synthesis¹⁶.

Evolutionary Algorithm As a numerical package, MatLab provides very good random number generators – this was an advantage for this part of the system. The big disadvantage, however, was the lack of data structures. This made some

¹⁵ This pun was bound to happen, sooner or later.
¹⁶ An experiment was also conducted using additive synthesis; see Appendix A.
tasks quite difficult and meant that 'Object Oriented' concepts were unavailable. Fortunately, implementing a real encoding for genes was no problem – vectors of real numbers are well supported by MatLab.

User Interface In preliminary experiments a simple keyboard interface was used, and this was no problem to implement in MatLab. In the final system, a GUI was built. MatLab's GUI features were somewhat limited and at times painful to use. However, due to MatLab's interactive nature, development was probably a lot easier and quicker than it would have been on other platforms.

Other design decisions not explained here (e.g. population size) will be illuminated as the system is developed in chapter 4.
3.5 Evaluation

It's all very well having a system that generates "novel" sounds, but how can you measure the success of such a system? A successful system is one that satisfies its goals. The first goal as stated in section 1.2 could be satisfied by implementing the system described above. What about the second goal? One way to approach this would be to collect opinions from a number of users. These users could be composers or musicians who require such sounds for their work, or merely novices who are seeking entertainment. Some questions that could be asked of the users to determine the success of the system include:
Is the system easy to use?

In your opinion, is the evolution of sounds controllable?

Is this system useful?

Would you be able to use the sounds generated in compositions?

Is using this system easier than other methods of obtaining the same kinds of sound?
Another way to test success would be to collect statistics from users. These statistics would be obtained from a number of experiments:
First the user would be asked to use a system where they manually had to adjust parameters of the sound synthesis algorithm. The number of changes they had to make until they had a satisfying sound would be recorded.
This process would be repeated on a system where the parameters of the algorithm were chosen at random. Finally, the users would use the evolutionary system. After these tests, we would have a measure of how intuitive each system was to use, and a comparison could be made as to which system is the most effective.
The actual experiment conducted to assess the success of the system forms the subject of chapter 5. Now that we have discussed the various issues relating to the implementation and assessment of artificial evolution for sound synthesis, we move on to chapter 4 which tells the story of how the actual system was developed.
Chapter 4

System Development

4.1 Introduction

This chapter documents the steps taken in developing the software that forms the basis of this project – software that applies artificial evolution to sound synthesis¹. The guiding philosophy throughout the development of this system was to imitate Dawkins' system², only applied to sound instead of graphics. Recall that Dawkins implemented an IEA based on a very simple tree-drawing algorithm with mutation as the only genetic operator. Humble as it was, the fantastic results obtained inspired a whole new generation of research. It seems justified to attempt to repeat this for sound synthesis. The idea, then, is to keep the synthesis algorithm simple and focus on mutation.

To this end, FM synthesis was chosen as the sound synthesis algorithm. FM synthesis can produce a wide variety of timbres with only a few parameters, yet it is very hard to 'control'. It seemed a perfect candidate for artificial evolution.

The software development is presented in a series of experiments. As described before, these experiments were conducted in MatLab as it was easy to use and provided a rapid development environment. The programs that performed the experiments were named EvoS, which stands for Evolution Strategy. An outline of this chapter is as follows:

EvoS 2 aims to implement a most basic form of evolution with the most basic FM algorithm.

¹ Yeah, well... it was a spur of the moment thing.
² See section 2.5.1.
EvoS 3 experiments with self-adaption, but concludes it is unsuitable for an interactive system.

EvoS 4 tackles the tricky problem of mutating envelopes.

EvoS 5 develops a method of mutating modulation indices.

EvoS 6 describes how the problem of mutating frequency ratios was overcome.

EvoS 7, finally, combines all the previously developed mutation techniques into a single system. This system is tested by a number of users.

That being said, let's start at the beginning...
4.2 EvoS 2

4.2.1 Aim

The goal of EvoS 2³ was to create a most basic interactive Evolution Strategy. To keep things very simple, a (1+1)-ES was decided on. Here, one parent sound is mutated to form a single child sound. If the fitness function⁴ deems the child sound better than the parent sound, the parent is replaced by the child, which becomes the new parent. On the other hand, if the child sound is worse than the parent sound, that child is discarded and a new one is formed by mutation of the same parent. Note that there is no recombination or mating involved; the only genetic operator is mutation. Likewise, to keep things simple in the sound synthesis section, a very basic algorithm was used: fixed-index frequency modulation (FM)⁵. This requires only three parameters to produce a sound:
fm = the modulating frequency
fc = the carrier frequency
I = the modulation index, a measure of how much the modulator deviates the carrier frequency.

Combining the FM synthesis with the Evolution Strategy, EvoS 2 aimed to produce a system that provided interactive evolution of sounds.

³ What happened to EvoS 1? It was a mutant that got out of control and had to be deleted.
⁴ Remember, in an interactive system the human user acts as the fitness function.
⁵ FM synthesis was described in detail in section 2.2.5. Also, see figure 2.3 for an illustration of the basic FM algorithm implemented in EvoS 2.
4.2.2 Implementation

Implementing the system described above in MatLab required a number of subprograms or procedures. These subprograms were drawn together in the main program (EvoS 2), which acted as a user interface. Since three parameters are required to generate a sound using FM, an individual is represented by a vector of three real numbers - these are its genes:
a = (fc, fm, I)

There are a number of procedures that operate on the genes of an individual and achieve the vital functions of the ES:

eval This procedure takes the genotype of an individual and evaluates the phenotype. Although this sounds complicated, all it involves is extracting the synthesis parameters (fm, fc and I) from the vector, generating the resulting sound, and playing this sound to the user. In a non-interactive system, eval would also assess the phenotype according to the fitness function, but here we must let the user decide how good it sounds.

mutate This procedure takes the genes of a parent individual and returns the genes of the mutant child offspring. The 'mutation' is just a random deviation applied to each gene of the parent. This is achieved by sampling a normal distribution whose mean is the parent gene. The variance of the normal distribution differs according to which gene is being mutated; for example, the variance for mutating the frequency genes fm and fc is different to the variance for mutating the index gene I. Deciding on these variances is tricky and will be discussed later. So basically, mutate works by applying the normal distribution to each element in the vector of numbers passed to it. What results is a mutated child genome.

init This procedure returns a random set of genes. It is used to initialise the first parent before any mutation can take place. To generate the random genes, a uniform probability distribution is sampled. Here we have to decide on the range of legal values for each gene; this is also a tricky issue that deserves a discussion of its own.

When eval, mutate and init are combined with a suitable user interface, an Evolution Strategy is formed. The interface randomly initialises the first parent using init and then forms the first mutant child with mutate. These two sounds are played to the user by calling eval.
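As a sketch of these procedures (in Python rather than MatLab; the ranges and step sizes follow table 4.1, and the thesis's 'variance' is treated here directly as the standard deviation of the mutation step):

```python
import math
import random

# Per-gene ranges and mutation spreads for (fc, fm, I), after table 4.1.
RANGES = [(0.0, 5000.0), (0.0, 5000.0), (0.0, 30.0)]
SPREADS = [100.0, 100.0, 2.0]

def init():
    """Return a random genome (fc, fm, I), sampled uniformly per gene."""
    return [random.uniform(lo, hi) for lo, hi in RANGES]

def mutate(parent):
    """Perturb each gene with a normal deviation, clipped back into range."""
    return [min(max(random.gauss(g, s), lo), hi)
            for g, s, (lo, hi) in zip(parent, SPREADS, RANGES)]

def eval_sound(genome, duration=0.5, rate=8000):
    """Render the phenotype with fixed-index FM:
    y(t) = sin(2*pi*fc*t + I*sin(2*pi*fm*t))."""
    fc, fm, index = genome
    return [math.sin(2 * math.pi * fc * t / rate
                     + index * math.sin(2 * math.pi * fm * t / rate))
            for t in range(int(duration * rate))]

def evolve_step(parent, prefers_child):
    """One (1+1)-ES step: the listener's verdict decides who survives."""
    child = mutate(parent)
    return child if prefers_child(parent, child) else parent
```

In the real system `prefers_child` is the human listener; here it could be any callback that auditions both genomes via `eval_sound` and returns a verdict.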
The user then decides which sound they like more: If they like the parent more, the old child is discarded and mutate is called
on the parent once again; If they like the child more, the child takes the parent's place and a new child is formed via mutate. With this system the user explores the space of possible sounds by listening to some randomly chosen directions, deciding which is the best and then moving forward in that direction.

Figure 4.1 shows a screen shot of EvoS 2, demonstrating the user interface.

Figure 4.1: A screen shot of the EvoS 2 user interface.

The user executes commands by entering a character corresponding to a command in the menu. The following commands can be seen in the menu in figure 4.1:

Parent This plays back the current parent sound to the user.

Child This plays back the current child sound to the user.

Mutate This mutates the parent to form a new child sound and then plays this new sound so the user can hear it.

Replace This replaces the parent with the current child. A mutant child is formed from the new parent and this sound is played to the user.

Genes This prints out the actual genes (fc, fm and I) of the current parent and child.

Display The user can obtain a graph of all the parents they have chosen to date. The graph shows how they have navigated through the space of possible sounds.
History The user can hear all of the parents they have chosen to date, starting with the first and ending with the most current. This enables the user to hear how the sound they started with has gradually changed to become the sound it is now.

Variances and Ranges

One of the tricky issues encountered in implementing mutate and init was deciding on valid ranges and variances for each of the genes in the genome. Remember, the genes represent parameters of a synthesis algorithm, so certain values will not make any sense – for example, if fm or fc are negative. Once a valid range for each parameter was chosen, a variance was needed. Choosing the "right" variance was critical to the effective operation of mutate. For example, suppose valid frequencies for fm and fc range from 0 to 5,000 Hz. If the variance is 1 Hz then the user will not notice the difference between a parent sound and its mutant child. They will sound almost exactly the same. As a result, all directions in the sound space will sound the same and the user won't be able to decide which way to go. On the other hand, suppose the variance is 1,000 Hz. The user will not be able to tell that a parent and its mutant child are related. They will sound vastly different, and the user will end up hopping at random around the space of possible sounds. Clearly a balance must be reached where the variance is small enough that the user can hear how the child sound relates to its parent, but large enough that there is a discernible difference.

In EvoS 2 the main focus was to get a system up and running, so deciding on effective ranges and variances was left to later versions of the system. The somewhat arbitrary ranges and variances selected are shown in table 4.1.

Parameter   Minimum   Maximum    Variance
fc          0 Hz      5,000 Hz   100 Hz
fm          0 Hz      5,000 Hz   100 Hz
I           0         30         2

Table 4.1: Ranges and variances for the parameters of EvoS 2.
4.2.3 Results

Although EvoS 2 was very simple, it did provide some gratifying results. As there were only three parameters, you could graph them in three-dimensional
space and obtain a picture of how the user navigated through the parameter values (the ‘Display’ feature). Figure 4.2 shows one such picture.
Figure 4.2: A sample evolution run for EvoS 2.

Each cross in figure 4.2 represents a parent that the user has chosen. The evolution starts where there is no cross and finishes at the asterisk⁶. The thing to notice about figure 4.2 is the clustering of points in one area. This cluster represents an area of the parameter space that the user found "interesting" and wanted to keep exploring. The user has auditioned points outside this cluster but found them unpleasant. You can see that at the start, the user has cut a straight course away from the initial point and into the cluster. This implies the initial point represented a parameter combination resulting in an unpleasant sound.

In reality, the sounds produced by EvoS 2 were very simple, ranging from mild bleeps to harsh distorted tones. The only real form of navigation the user could conduct was to make the discomforting outbursts more sublime. The cluster in figure 4.2 actually represents an area of the sound space that was less harsh than its surroundings.

Using the 'history' feature you could hear how each parent was related to its

⁶ Just an apology about the graphs: the results for the EvoS experiments were collected at different times. Later on, clearer methods of displaying the data were discovered, but it was too late for these early results. As well as being less clear, it means that the display format is inconsistent between graphs. Whenever there is a format change, it is explained in detail, but once again I apologise for this inconvenience.
predecessor, but you could also notice the difference. This suggests that the choice of variances was adequate.

Another interesting result was the effect of a small initial gene pool. In nature, a small gene pool is disastrous: because all individuals are so similar to one another, a change in the environment that kills one individual is likely to kill them all. In EvoS 2 the gene pool is very small – the entire population consists of just one individual. If this individual sounds very displeasing, it is most likely that all its mutant children will sound displeasing too. Consequently the user cannot pick an appropriate child and the entire population is doomed. This problem may be alleviated if the initial population consists of a few random parents. Hopefully, at least one of these will sound slightly pleasant and allow meaningful evolution.
4.2.4 Conclusion

EvoS 2 demonstrated that a user could navigate through a simple FM sound space using a (1+1) evolution strategy. Further work needs to address the problems of: choosing appropriate ranges and variances for parameters; and generating sounds that are actually pleasant to listen to.
4.3 EvoS 3

4.3.1 Aim

The goal of EvoS 3 was to trial an automated method of choosing variances for parameters. In EvoS 2 it was noted that deciding on effective variances for each of the parameters was a tricky matter. Why not, instead of guessing the variances by trial and error, include the variances as part of the genome? The variances would then evolve to suitable values just as the synthesis values do. This is the kind of self-adaption or "metaevolution" that is characteristic of more advanced Evolution Strategies [Bäc95].
4.3.2 Implementation

In order to evolve the variances for each of the parameters, extra genes were added to the genome from EvoS 2. These were:
σ²fm The variance used for mutation of the modulating frequency (fm).
Parameter   Minimum   Maximum   Variance
σ²fc        0 Hz      200 Hz    100 Hz
σ²fm        0 Hz      200 Hz    100 Hz
σ²I         0         4         2

Table 4.2: Variances and ranges for the variances of EvoS 3.
σ²fc The variance used for mutation of the carrier frequency (fc).

σ²I The variance used for mutation of the modulation index (I).

These additions resulted in a genome of the form:
a = (fc, fm, I, σ²fc, σ²fm, σ²I)
The eval subroutine remained unchanged from EvoS 2, as it only required the three synthesis parameters to generate the sound. The init subroutine needed a slight change in order to generate random initial values for the new parameters. Once again this raised the issue of a valid range for the new parameters. The mutate subroutine was modified so that the variances of each parameter were mutated first. These mutated variances were then used to mutate the synthesis parameters. This again raised the issue of what variance and valid range to choose for mutating the variances σ²fm, σ²fc and σ²I.

Variances and Ranges

In EvoS 3, the variances and valid ranges of σ²fm, σ²fc and σ²I were chosen with the aim of achieving similar results to EvoS 2; however, there was still a lot of guesswork involved. The values chosen are shown in table 4.2.
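The self-adaptive mutate just described might be sketched as follows (Python; the clipping bounds loosely follow tables 4.1 and 4.2, and the 'variance' genes are again used directly as mutation spreads for simplicity):

```python
import random

# Bounds for the three synthesis genes and their three variance genes
# (illustrative values, loosely following tables 4.1 and 4.2).
PARAM_RANGES = [(0.0, 5000.0), (0.0, 5000.0), (0.0, 30.0)]   # fc, fm, I
VAR_RANGES = [(0.0, 200.0), (0.0, 200.0), (0.0, 4.0)]        # their variances
META_SPREADS = [100.0, 100.0, 2.0]  # fixed spread for mutating the variances

def clip(x, lo, hi):
    return min(max(x, lo), hi)

def mutate_self_adaptive(genome):
    """EvoS 3 style mutation: genome = (fc, fm, I, var_fc, var_fm, var_I).
    The variance genes are mutated first, and the freshly mutated variances
    are then used as the spread for mutating the synthesis genes."""
    params, variances = genome[:3], genome[3:]
    new_vars = [clip(random.gauss(v, ms), lo, hi)
                for v, ms, (lo, hi) in zip(variances, META_SPREADS, VAR_RANGES)]
    new_params = [clip(random.gauss(p, nv), lo, hi)
                  for p, nv, (lo, hi) in zip(params, new_vars, PARAM_RANGES)]
    return new_params + new_vars
```

The key design point, mirrored from the text, is the ordering: mutating the variance genes before the synthesis genes lets a child carry the step size that produced it.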
4.3.3 Results

EvoS 3 produced very similar results to EvoS 2. The sounds produced were still quite unpleasant, but the user did have some control of the evolution. It seems that evolving the variances did not achieve anything. This is illustrated by figures 4.3 and 4.4, which show a typical evolution session using EvoS 3. As
Figure 4.3: Evolution of synthesis parameters in EvoS 3.
Figure 4.4: Evolution of variances in EvoS 3.
before, each cross represents a parent that the user has chosen. The evolution starts where there is no cross and ends at the asterisk. Figure 4.3 shows the evolution of the synthesis parameters, with the small clusters that were illustrated in EvoS 2. It was concluded that these clusters were evidence of controlled evolution: the user had found an area of the sound space that was pleasant and stayed there. Figure 4.4 shows the variance parameters of the genome. It is clear that there is no convergence on any area in this space; it is a wild mess.

Together, figures 4.3 and 4.4 demonstrate that while the user can navigate the sound space adequately, they wildly flop around the variance space. One reason for this might be that the fitness function doesn't stay stable for long enough for the variances to converge on any point. For instance, if a freak mutant's large variances place it in an interesting area of the sound space, the human user will choose it instead of sticking to the goal they were following before. Clearly a human fitness function is not as stable as an objective assessment of fitness. Another reason could be the small number of individuals that can be tested by the human user. In a proper Evolution Strategy with a computer-evaluated fitness function, thousands more individuals can be evaluated in a shorter amount of time. Consequently, convergence in the variances may be observed there, whereas in EvoS 3 it was absent.
4.3.4 Conclusion

EvoS 3 demonstrated that evolving the variances of synthesis parameters does not offer any advantage over fixed variances.
4.4 EvoS 4

4.4.1 Aim

Since automatically choosing variances seemed like a dead end in EvoS 3, EvoS 4 aimed to begin addressing the problem of creating more interesting sounds. Generating more musically useful sounds with FM synthesis requires the use of 'envelopes'. Envelopes allow the values of synthesis parameters to vary over the duration of a sound. For example, in EvoS 2 and EvoS 3 the modulation index parameter (I) remained constant throughout the course of the sound. A much more interesting result would occur if the value of I was varied as the sound proceeded. This can be accomplished by using an envelope to control the value of I.
The simplest kind of envelope is one that controls the amplitude (volume) of a sound. This situation is depicted in figure 4.5.
Figure 4.5: An envelope (top) controls the amplitude of a waveform (bottom).

To use envelopes in an Evolution Strategy, you have to be able to mutate them - this is not as easy as it appears. In order to isolate the problem of mutating envelopes, EvoS 4 aimed to implement an instrument that only evolved amplitude envelopes. This seems like a step backwards in the journey toward interesting sounds, as an amplitude envelope modifying a sine wave produces more boring results than the bleeps of EvoS 2. However, this intermediate step was necessary in order to advance effectively.

Another goal of EvoS 4 was to start isolating the timbre of a sound from its other musical characteristics⁷. One of the problems of EvoS 3 was that as well as timbre being subject to evolution, pitch was also being evolved. It would be desirable to separate pitch from the evolution process. A composer would then be able to work out a melody, and then use the evolution system to find an appropriate timbre for that melody.

Finally, EvoS 4 addressed another shortcoming of the previous versions. Recall that init was called to generate a random parent to start off the evolution process. If the user did not like this initial sound, they had to quit the system and start again.

⁷ Timbre is the characteristic tone quality of a particular instrument. See section 2.2.1 for more information.
This was the problem of the small initial gene pool. EvoS 4 would allow the user to randomize the genome from within the system. This would let the user quickly get to other parts of the sound space if they had become disinterested in the current area.
4.4.2 Implementation

Representing Envelopes

Representing an envelope with a genome requires the following parameters:
n The number of 'breakpoints' in the envelope.

xi The point in the sound where each breakpoint occurs.

yi The amplitude value at each breakpoint.

So an individual a has the following form:

a = (n, x1, ..., xn, y1, ..., yn)

where xi, yi ∈ [0, 1].

For example, the envelope in figure 4.5 would be represented as:

a = (3, 0.166, 0.333, 0.833, 1, 0.75, 0.625)

Notice here that only 3 breakpoints are specified; the endpoints (0, 0) and (1, 0) are added implicitly.
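To make the representation concrete, here is a sketch (Python; the function name and the linear-interpolation choice are assumptions, though breakpoint envelopes are conventionally interpolated linearly) that evaluates such a genome at any time t in [0, 1], with the implicit endpoints added:

```python
import bisect

def envelope_value(genome, t):
    """Evaluate a breakpoint-envelope genome (n, x1..xn, y1..yn) at time
    t in [0, 1]. The endpoints (0, 0) and (1, 0) are added implicitly,
    and the envelope is linearly interpolated between breakpoints."""
    n = genome[0]
    xs = [0.0] + list(genome[1:1 + n]) + [1.0]
    ys = [0.0] + list(genome[1 + n:1 + 2 * n]) + [0.0]
    # Find the segment [x_{i-1}, x_i] containing t, then interpolate.
    i = min(max(bisect.bisect_right(xs, t), 1), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    if x1 == x0:
        return y1
    return y0 + (y1 - y0) * (t - x0) / (x1 - x0)
```

Applying the envelope to a sound is then just multiplying each sample by `envelope_value(a, t)` at that sample's normalised time.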
Mutating Envelopes

Once we have a representation for envelopes, we can work out a strategy that mutates them. This would be pretty straightforward, except for a few considerations:

If we change the number of envelope points (n), we have to decide which points to remove or add. This can be done by randomly picking one (or more) of the breakpoints and deleting or copying it. Also note that n is an integer, so if it is mutated with a normal distribution, it must be rounded back to an integer.

If we change the xi's we must make sure that they remain monotonic, otherwise we might end up with an envelope that runs backwards in time! We can ensure this by sorting the xi's into increasing order after we have made any changes.
If we change the yi's we should scale them so that the largest yi has a value of 1. This ensures that the synthesized sound is played at maximum volume⁸.

With these constraints in mind, the mutation procedure can be described in the following steps:

1. Mutate n, the number of envelope points. Any change in n occurs fairly infrequently, so we set the variance quite low. Also, if we do change n, we must make sure it is still an integer.

2. For each xi, mutate with a normal distribution in the same way as in EvoS 2. Once the xi's have been mutated, sort them to make sure they are monotonically increasing.

3. For each yi, mutate as for the xi's. Once they have been mutated, scale the values so the largest yi is 1.
These steps were implemented in the mutate subroutine.

Variances and Ranges

What about the variances and ranges for the parameters in the above mutation procedure? In this experiment, to make things easier, mutation of the number of breakpoints n was disabled: envelopes always had 3 breakpoints. This means the focus was on finding a good variance for the xi's and yi's. Note that both the xi's and yi's are in the range [0, 1], so they will have the same variance. A few different values of variance were trialled, and the results of these are documented in the next section.

Randomizing Envelopes

As mentioned in the aim, EvoS 4 let the user randomize envelopes without having to quit and start again. This was simply done by providing an option that replaced the current child with one generated by the init procedure. init also had to be modified to deal with the envelope genome. It used a very similar strategy to mutate, except that uniform distributions were used instead of normal distributions.

⁸ If we wanted sounds to play at different volumes, we would have overall volume as a parameter in the genome. Here, we are only interested in the relative values of envelope breakpoints.
Isolating Timbre

A new eval subroutine was implemented that allowed any sequence of notes to be played with the evolved parameters. This allowed a user to enter a melody that would be heard each time with a new mutant timbre.
4.4.3 Results

The envelope mutation procedure seemed quite successful; figure 4.6 shows a parent envelope and its mutant child.
Figure 4.6: The envelope of figure 4.5 shown with an envelope produced by mutation.

The most interesting result observed was the difference between mutation and randomization. Randomization produced envelopes that were completely unrelated to the parent sound (see figure 4.7), whereas mutation produced slightly different ones (compare figures 4.6 and 4.7). The degree of similarity of mutated envelopes was controlled by the variance used for mutation of the xi's and yi's. Being able to compare the results of mutation and randomization provided a useful metric for tuning this variance. If the variance was too small, no difference would be heard between the parent and the child, and the user
4. System Development
50
resorted to using randomization to search the space of possible sounds. If the variance was too large, there seemed to be no difference between randomizing an individual and mutating it; both produced equally dissimilar sounds. However, if the variance was just right, mutation would produce a sound that was satisfyingly different yet clearly related to the parent sound. It was here that useful evolution really took place. The user can feel that they are trying different directions in the sound space and then choosing the one that sounds best. If this area of the space proves uninteresting, they can jump to an entirely new area by randomizing.

This idea is illustrated in the following figures, which show the use of different variances for mutation. Each of the figures plots how the xi's (top graph) and yi's (bottom graph) changed as the user evolved a sound. Each cross represents a parent that the user has chosen. A solid line joining two parents means mutation was used to make this jump, whereas a dotted line means randomization was used. The evolution run starts at the asterisk and ends at the circle. Note that the number of breakpoints (n) is always 3 in these graphs; that is, n was not subject to mutation. This restriction was necessary in order to effectively visualize the results.

Figure 4.7: The envelope of figure 4.5 compared with an envelope produced by randomization.
Figure 4.8 shows an evolution where the variance was set at 0.02 (1/50 of the parameter range). You can see the very tightly bunched clusters of points separated by dotted lines. These represent groups of mutants separated by randomizations. The mutants are so similar to each other that you can't hear (or see) the difference between them. This variance is too small.

Figure 4.8: EvoS 4 evolution with a variance of 1/50.
Likewise, figure 4.9 shows an evolution where the variance was set at 0.1 (1/10 of the parameter range). Here, the distance between mutations (solid lines) and randomizations (dotted lines) is almost the same. It is difficult to tell the difference between a mutant sound and a random one. As a result, randomization seems just as effective for exploring the space as mutation. This variance is too large.

Finally, figure 4.10 shows an evolution where the variance was set at 0.05 (1/20 of the parameter range). In this figure you can see that mutations are separated by a small yet still visible distance (as opposed to figure 4.8). Randomizations traverse a much larger distance than mutations (as opposed to figure 4.9). This allows other parts of the space to be explored that would take too long to reach via mutation alone. This variance is just right. Table 4.3 summarises these results at a glance.
Figure 4.9: EvoS 4 evolution with a variance of 1/10.

Figure 4.10: EvoS 4 evolution with a variance of 1/20.
Variance   Normalised Variance^a   Shown in      Observations
0.02       1/50                    Figure 4.8    Too small, no audible difference between parent and child.
0.1        1/10                    Figure 4.9    Too large, mutation no different to randomization.
0.05       1/20                    Figure 4.10   Just right! Useful evolution obtained.

^a Normalised variance is the variance expressed as a fraction of the parameter range.
Table 4.3: Effects of different variances in EvoS 4.

Isolating Timbre

The melody playing feature worked well. It isolated the timbre of the sound from other musical characteristics such as pitch and duration. However, it was too slow to be of much use. The more notes the melody contained, the longer MatLab took to produce the accompanying sound. It was so slow, in fact, that the only viable way of using the system was to have just one note programmed in the melody. It should also be noted that the sounds produced by EvoS 4 were quite boring. As only a volume envelope was being evolved, you really had to know what you were listening for to tell the difference between sounds.
4.4.4 Conclusion

EvoS 4 demonstrated a successful system for evolving simple envelopes. It was found that a variance of one twentieth of the parameter range facilitated useful evolution.
4.5 EvoS 5

4.5.1 Aim

EvoS 5 aimed to take the complexity of the evolved instrument a step further. Instead of applying an envelope to the volume of a tone, an envelope was applied to the modulation index (I) input of an FM instrument. When this was done, two extra parameters were included to control the range over which this envelope varied the modulation index. These are called I1 and I2.
For example, if I1 = 0 and I2 = 5 and we have an envelope as shown in figure 4.5, then the modulation index will start out at 0, rise quickly to 5 at the maximum amplitude and then decay slowly back down to 0. If I1 = 5 and I2 = 0 then the envelope will be flipped: the modulation index will start at 5, drop quickly to 0 and then rise slowly back to 5. Because the modulation index controls the spectral density of the resulting sound, quite interesting and lively sounds can be produced merely by changing the values of I1 and I2. In EvoS 4, an effective method for evolving envelopes was developed. EvoS 5 aimed to develop an effective method for evolving the parameters I1 and I2.
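To make the role of I1 and I2 concrete, here is a sketch of Chowning-style FM synthesis with an enveloped modulation index, in Python/NumPy rather than the thesis's MatLab. The mapping I(t) = I1 + (I2 - I1) * env(t) is my reading of the description above, not a formula quoted from the thesis.

```python
import numpy as np

def fm_tone(fc, fm, i1, i2, index_env, amp_env, sr=8000, dur=1.0):
    """Chowning-style FM tone: amp(t) * sin(2*pi*fc*t + I(t)*sin(2*pi*fm*t)).

    index_env and amp_env are arrays of envelope values in [0, 1]; the
    modulation index sweeps from i1 (envelope = 0) to i2 (envelope = 1),
    matching the I1/I2 behaviour described above.
    """
    n = int(sr * dur)
    t = np.arange(n) / sr
    grid = np.linspace(0.0, 1.0, n)
    # Resample both envelopes onto the output sample grid.
    idx = np.interp(grid, np.linspace(0.0, 1.0, len(index_env)), index_env)
    amp = np.interp(grid, np.linspace(0.0, 1.0, len(amp_env)), amp_env)
    index = i1 + (i2 - i1) * idx
    return amp * np.sin(2 * np.pi * fc * t + index * np.sin(2 * np.pi * fm * t))

# I1 = 0, I2 = 5: the modulation index rises to 5 and decays back to 0.
tone = fm_tone(440.0, 220.0, 0.0, 5.0,
               index_env=np.array([0.0, 1.0, 0.3, 0.0]),
               amp_env=np.array([0.0, 1.0, 0.5, 0.0]))
```

Swapping i1 and i2 in the call flips the index sweep, which is exactly the "flipped envelope" case described above.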
4.5.2 Implementation

Representation

In order to concentrate on the parameters I1 and I2, all other aspects of the sound were kept constant: the amplitude envelope, the modulation index envelope and the ratio of fm to fc remained unchanged. So the genome for an individual in EvoS 5 was very simple:

a = (I1, I2)  where  I1, I2 ∈ [0..20]
Mutation
I1 and I2 were very easy parameters to mutate. Their effective range was [0..20] and it did not matter whether their values were real or integer. Consequently, the mutation procedure (mutate) just involved adding a small amount of normally distributed noise, as usual. The only consideration was the variance of the normal distribution; the next section shows how an effective variance was obtained. Changing init to provide a randomization procedure was equally simple: it just involved sampling a uniform distribution over the range [0..20].
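These two procedures can be sketched in a few lines. This is a Python sketch, not the thesis's MatLab code; the clipping back into [0..20] is my assumption, and the thesis's "variance of 1/20 of the range" is used here directly as the standard deviation of the noise (the thesis does not spell out which convention it means).

```python
import random

I_MIN, I_MAX = 0.0, 20.0
# 1/20 of the parameter range, the value found to work best in EvoS 4 and 5.
SIGMA = (I_MAX - I_MIN) / 20.0

def mutate(parent):
    """Gaussian mutation of (I1, I2), clipped back into range (assumed)."""
    return tuple(min(I_MAX, max(I_MIN, v + random.gauss(0.0, SIGMA)))
                 for v in parent)

def init():
    """Randomization: sample each parameter uniformly over [0..20]."""
    return (random.uniform(I_MIN, I_MAX), random.uniform(I_MIN, I_MAX))
```

The user's choice between mutate and init is exactly the mutation-versus-randomization contrast discussed throughout this chapter.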
4.5.3 Results

In EvoS 5, the most interesting result was once again the contrast between the mutation and randomization procedures. This was illustrated particularly well as the space was only two dimensional. Different variances were trialled for the mutation procedure and you can really see their effect on the corresponding graphs. The graphs show how the values of I1 and I2 changed while a user was searching for interesting sounds. It was interesting to note that the results were very similar to EvoS 4: a variance of 1/20 (one twentieth) of the parameter range led to the most successful evolution. The next few paragraphs document the results of the trials conducted.

Figure 4.11 shows a typical user's course through the parameter space with a variance of 0.4 (1/50 of the parameter range).
Figure 4.11: EvoS 5 evolution with a variance of 1/50.

As with the pictures in EvoS 4, each cross represents a parent that the user has chosen. The evolution starts on the left of the figure and ends at the cross with the circle. Mutations are represented by solid lines joining crosses and randomizations by dotted lines between crosses. Notice how tightly the clusters of points are bunched. This is because the small variance doesn't allow mutants to occur very far away from the parent sound. As a result mutants always sound very similar to the parent. In fact, with a variance of 1/50 there was no discernible difference between a parent and its mutant child. Consequently, this is not very useful for evolving sounds – the user can never make any progress as all mutant children sound exactly the same. Also note how these clusters of points are widely spaced. These large jumps are caused when the user decides to randomize the parameters.
Figure 4.12 shows a user's path with a variance of 2 (1/10 of the parameter range).
Figure 4.12: EvoS 5 evolution with a variance of 1/10.

You can immediately see the contrast to figure 4.11. The clusters of points are no longer tightly bunched and it is difficult to tell a mutation from a randomization. What the user finds when listening to the sounds is that each mutant child sounds quite different from the parent – so different that the user can't tell how they are related. As a result, using randomization seems just as effective for getting good sounds as using mutation.
Figure 4.13 shows a user's path, this time with a variance of 1 (1/20 of the parameter range). Here is a balance between the extremes of figures 4.11 and 4.12. The mutant children are spaced widely enough from the parent that the user can hear the difference in the sound, yet they are close enough that the user can hear how the sound is related to the parent. Consequently, when the user hears something they like, they can follow that direction and obtain a pleasing sound. Instead of getting random clusters of points you get 'lines' of points that correspond to the user following a particular direction in the sound space. Note also in figure 4.13 that you can see the difference between mutation steps and randomization steps. It is not as marked as it was in figure 4.11, but it is still visible. Table 4.4 sums up the discussion of the previous paragraphs at a glance.

Figure 4.13: EvoS 5 evolution with a variance of 1/20.

A final point is that the sounds produced were now a bit more interesting. Instead of just a volume envelope modifying a sine wave (as was the case in EvoS 4), the sounds had characteristics of trumpets and oboes.
4.5.4 Conclusion

EvoS 5 demonstrated an effective system for evolving the parameters I1 and I2. It was found that a variance of 1/20 of the parameter range resulted in successful evolution, which was the same result obtained in EvoS 4. Could this hold true as a general rule (a "1/20 rule"?), or is it a little early to be hypothesizing?

4.6 EvoS 6

4.6.1 Aim

In EvoS 4 and EvoS 5, effective methods for evolving envelopes and modulation indices were discovered. With these we could build an FM instrument with an amplitude envelope and a modulation index envelope, which would be able to produce a fair range
Variance   Normalised Variance^a   Shown in      Observations
0.4        1/50                    Figure 4.11   Too small, no audible difference between parent and child.
2          1/10                    Figure 4.12   Too large, mutation no different to randomization.
1          1/20                    Figure 4.13   Just right! Useful evolution obtained.

^a Normalised variance is the variance expressed as a fraction of the parameter range.
Table 4.4: Effects of different variances in EvoS 5.

of sounds. However, Chowning's FM instrument [Cho73] has one more parameter that permits synthesis of a very wide range of timbres – the ratio of carrier frequency to modulating frequency:

fc / fm = N1 / N2
EvoS 6 aimed to discover an effective method of evolving this ratio. The ratio of fc to fm was actually being evolved in EvoS 2 and EvoS 3, as whenever the values of fc or fm were mutated, the ratio changed. This form of mutation is undesirable because, as stated before, we want our system to evolve timbres that are independent of pitch. The pitch of a note is usually determined by its fundamental frequency f0. In FM synthesis, f0 is given by the following equation:

f0 = fc / N1 = fm / N2,  where N1, N2 are integers with no common factors.
Clearly, if fc and fm are mutated arbitrarily as they were in EvoS2 then f0 will change with each mutation. To see how this would be undesirable, consider the following example: John is a composer who has just come up with a brilliant melody. It is very simple, consisting of two notes: middle C followed by the A above this. John worked out the melody on his piano, but he now wants to play it with a ‘space-age brassy’ kind of instrument. I suggested he use EvoS to find his perfect sound. Taking my advice, John uses EvoS and finds a sound that he really likes. However, when he uses the parameter settings to play his melody, he finds it is out of tune with the piano part! Angered
by the time he has wasted, John goes to a traditional synthesizer where he knows if he presses the middle C key, that is the note he will hear. After many days of twiddling knobs he doesn't understand, John finally settles for a sound that is not what he wanted - a very second rate 'space-age brassy' timbre. However, he is consoled by the fact that when he plays his melody it will be in tune with whatever other parts he has worked out for his composition.

Although the above example is fictional, we would like to avoid anything like this happening in real life. For this to be the case, we need to make sure the fundamental frequency f0 stays constant after mutation and even randomization, so only the timbre of the sound is changed. To do this we simply mutate the parameters N1 and N2. When it comes time to synthesize the sound, we work out fc and fm according to the pitch (f0) that the user wants to hear. That is:

fc = f0 N1 and fm = f0 N2

The problem is that mutation of N1 and N2 is not straightforward. If N1 and N2 are integers, the FM sound produced will be 'harmonic'. That is, each frequency
present in the spectrum of the sound is a whole number multiple of the fundamental frequency. The human ear is used to hearing harmonic sounds from string and wind instruments; harmonic sounds are 'nice'. If N1 and N2 are not integers then the FM sound produced is 'inharmonic'. Examples of inharmonic sounds are drums and bells – they are usually 'harsh'.
4.6.2 Implementation

An EvoS 6 individual is simply represented as:

a = (N1, N2)

In order to implement an effective procedure for mutating N1 and N2 we have to distinguish between the two domains of harmonic and inharmonic sounds. Since mutation should produce sounds that are similar to the parent sound yet different, the following guidelines should be followed:
- When the parent sound is harmonic, most mutant children should be harmonic too, since an inharmonic sound is very different from a harmonic sound.
- When the parent sound is harmonic, mutant children that are harmonic should only occasionally possess a different value of N1 or N2 from their parents. This is because within the harmonic domain, different ratios of N1 and N2 produce quite different timbres.

- When the parent sound is harmonic, there is a very small chance that a mutant child could be a closely related inharmonic sound.
- When the parent sound is harmonic, and the values of N1 and N2 have been mutated, all common factors of these numbers should be eliminated. This ensures that you never get the situation of a child timbre being the same as the parent timbre, just transposed by a certain amount. For example, if N1 = 2 and N2 = 2, the resulting sound has the same timbre as when N1 = N2 = 1, but transposed up an octave.
- When a sound is inharmonic, N1 and N2 may not be integers. In this case, mutation is straightforward: a small, normally distributed random variable is added to the parent settings. Of course, there is a very small chance that both N1 and N2 end up as integers after mutation. If this happens, the rules for harmonic mutation apply subsequently.
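These guidelines might be sketched as follows (a Python sketch, not the thesis's MatLab code). The 1-in-8 change probability, the 1/20-of-range inharmonic spread and the [1..10] range come from the results reported for EvoS 6; the way a new harmonic ratio is chosen and the clipping are my assumptions, and the small harmonic-to-inharmonic escape chance is omitted for brevity.

```python
import math
import random

N_MIN, N_MAX = 1.0, 10.0          # useful range of N1 and N2 found in EvoS 6
P_HARMONIC_CHANGE = 1.0 / 8.0     # about 1 chance in 8 of changing a harmonic ratio
INHARMONIC_SIGMA = (N_MAX - N_MIN) / 20.0

def is_harmonic(n1, n2):
    return float(n1).is_integer() and float(n2).is_integer()

def mutate_ratio(n1, n2):
    """Sketch of the harmonic/inharmonic mutation guidelines above."""
    if is_harmonic(n1, n2):
        if random.random() < P_HARMONIC_CHANGE:
            # Occasionally jump to a different harmonic ratio (assumed scheme)...
            n1 = float(random.randint(1, 10))
            n2 = float(random.randint(1, 10))
            # ...and remove common factors so the child differs in timbre,
            # not just transposition (e.g. 2:2 reduces to 1:1).
            g = math.gcd(int(n1), int(n2))
            n1, n2 = n1 / g, n2 / g
        return n1, n2
    # Inharmonic case: plain Gaussian mutation, clipped to the useful range.
    n1 = min(N_MAX, max(N_MIN, n1 + random.gauss(0.0, INHARMONIC_SIGMA)))
    n2 = min(N_MAX, max(N_MIN, n2 + random.gauss(0.0, INHARMONIC_SIGMA)))
    return n1, n2
```

The two branches make the discontinuity explicit: harmonic parents mostly breed harmonic children, while inharmonic parents drift smoothly.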
The mutate subroutine was modified to accommodate the above guidelines. The init subroutine, which is used for initialisation of individuals and the randomization feature, gave the individual an equal chance of being harmonic or inharmonic.
4.6.3 Results

After a little fine tuning, N1 and N2 were being evolved successfully. If you started with a harmonic sound, the mutant children would generally be the same, only occasionally changing to another harmonic sound. This effect was achieved by having a harmonic variance (the variance used when N1 and N2 are integers) that gave about 1 chance in 8 of changing the values of N1 and N2. Starting with an inharmonic sound, each mutant child was different yet clearly related to the parent. This was achieved by having an inharmonic variance (the variance of N1 and N2 when they are not integers) set to 1/20 of the range. The useful range of N1 and N2 was found to be [1..10].
4.6.4 Conclusion

EvoS 6 demonstrated an effective system for evolving the ratio of carrier frequency to modulating frequency. The parameters N1 and N2 are not trivial to mutate as they are discontinuous in the timbral space. It is interesting to note how the complexity of the mutation procedure blossoms in an attempt to provide a continuous timbral space.
4.7 EvoS 7

4.7.1 Aim

EvoS 7 finally put all the pieces together. It implemented an evolution strategy for the complete FM instrument described in [Cho73]. To describe a particular timbre with this instrument, you had to specify a separate envelope for amplitude and modulation index, the modulation index range and the ratio of carrier to modulation frequency. Since this instrument was finally capable of producing semi-decent sounds, it was decided to measure its success by gathering the opinions of a few users. The users first assessed Arno van Goch's P-Farm [Goc96], then used EvoS 7. After this they were asked to comment on the merits and faults of each system.
4.7.2 Implementation

Implementation of EvoS 7 consisted of tacking together all the previous versions of EvoS and making them work in harmony.

Representation

As in previous versions, an individual is represented by a vector of real numbers. The genome in EvoS 7 was the most complicated so far:

a = (nI, xI.1, ..., xI.nI, yI.1, ..., yI.nI, I1, I2, N1, N2, nV, xV.1, ..., xV.nV, yV.1, ..., yV.nV)
The genome starts with the modulation index envelope. This is specified in the same way as the amplitude envelopes were in EvoS 4: nI is the number of points in the envelope and (xI.i, yI.i) are the coordinates of the breakpoints. The modulation index envelope is followed by I1 and I2, specifying the range that this envelope operates over (as in EvoS 5). N1 and N2 specify the ratio of carrier to modulator frequency as
in EvoS 6. Finally, the amplitude envelope is specified, also following the convention adopted in EvoS 4: nV specifies the number of points in the envelope and (xV.i, yV.i) are the coordinates of the breakpoints. Of course, the eval procedure was modified to accept individuals of this form. It had to be able to interpret each of the parameters in order to synthesize the sound and play it to the user.

Mutation and Randomization

Mutation and randomization are accomplished using the techniques described in EvoS 4 to EvoS 6. The mutate subroutine takes an individual and splits it up, applying the mutation schemes of previous versions to each section. The amplitude and modulation index envelopes are mutated using the technique described in EvoS 4. The modulation index range (I1 and I2) is mutated as described in EvoS 5. N1 and N2 are mutated using the technique of EvoS 6. A similar scheme is used to achieve the randomization feature through the init subroutine.

User Interface

The user interface used for EvoS 7 is still the same simple keyboard interface that was used in EvoS 2 (refer back to figure 4.1).
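The way mutate splits an individual into sections might be sketched like this. This is a Python illustration of my own based on the genome definition above; the flat-list layout, the helper name split_genome and the example values are all assumptions, not code from the thesis.

```python
def split_genome(a):
    """Split a flat EvoS 7 genome vector into its sections.

    Assumed layout: nI, then the nI x-coordinates and nI y-coordinates
    of the modulation index envelope, then I1, I2, N1, N2, then nV and
    the amplitude envelope breakpoints.
    """
    n_i = int(a[0])
    idx_env = (a[1:1 + n_i], a[1 + n_i:1 + 2 * n_i])
    pos = 1 + 2 * n_i
    i1, i2, n1, n2 = a[pos:pos + 4]
    pos += 4
    n_v = int(a[pos])
    amp_env = (a[pos + 1:pos + 1 + n_v], a[pos + 1 + n_v:pos + 1 + 2 * n_v])
    return idx_env, (i1, i2), (n1, n2), amp_env

# A toy genome with 2-point envelopes (values illustrative only).
genome = [2, 0.1, 0.5, 1.0, 0.3, 0.0, 5.0, 3.0, 2.0, 2, 0.2, 0.6, 1.0, 0.4]
idx_env, i_range, ratio, amp_env = split_genome(genome)
```

Each section can then be handed to the EvoS 4, EvoS 5 or EvoS 6 mutation scheme as described above, and reassembled into a child genome.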
4.7.3 Results

To illustrate the effectiveness of EvoS 7, figures 4.14 to 4.17 show an example of an evolution run. The figures show how each parameter in the EvoS 7 genome changed over the course of the evolution. First look at figure 4.14, which shows how I1 and I2 change. Parents that the user has chosen are represented by a number - the generation. The evolution starts at generation 1 and ends at generation 15. A solid line joining two points means mutation was used to make this step, whereas a dotted line implies randomization. Looking at figure 4.14 you can see that mutation was used from generations 1 to 5, then a random jump occurred between 5 and 6. Again, mutation is used from 6 to 10 followed by another random jump from 10 to 11. Finally, generations 11 to 15 are traversed via mutation. Figure 4.14 is quite an exquisite example of evolution. The
Figure 4.14: EvoS 7 evolution: Modulation indices (I1 and I2).

sections of mutation exhibit a strong sense of direction, implying that the user could control the search through the sound space. This is contrasted by the large leaps of randomization which rapidly transport the user to parts of the sound space that would be unreachable via mutation alone.

Now look at figure 4.15, showing how N1 and N2 changed over the course of the evolution. In generations 1 to 5 we note that the sound is inharmonic (remember, an inharmonic sound results when either N1 or N2 is not an integer). A small path is followed (via mutation) until the random jump occurs between generations 5 and 6. This jump lands the user on a harmonic sound (harmonic sounds result when N1 and N2 are integers), where it stays until generation 11. Recall from EvoS 6 that when the sound is harmonic, you don't want it to change often because the harmonic sounds will be quite different from each other. This is why there is no movement from generation 6 to 10. The random jump to generation 11 also results in a harmonic sound, but from 11 to 12 and 12 to 13 there are some changes via mutation. Notice that the sound is still harmonic - this illustrates how the harmonic and inharmonic spaces are disjoint under mutation. Finally the evolution comes to rest from generation 13 to 15.

Finally, examine figures 4.16 and 4.17 which show the amplitude and modulation
Figure 4.15: EvoS 7 evolution: Frequency ratios (N1 and N2).

index envelopes respectively. Although it is hard to pick out much detail in these figures, you can clearly see the contrast between mutation and randomization. In both figures, the envelope shape changes gradually from generation 1 to 5 as it is mutated. At generation 6 the shape is quite different due to the randomization step. From generation 6 to 10, mutation changes the shape gradually again until the random jump at generation 11. Note again the marked difference in shapes between generations 10 and 11. Finally, in generations 11 to 15, the shape is changed gradually via mutation.

Note that each of these figures leaves out a lot of information. For each generation, only the chosen parent is shown - you don't see the many children that may have been auditioned and discarded. Imagine what figure 4.14 would look like if you did: around each parent point you would see a cluster of sounds that the user had auditioned; only one of these (the one you can see on the graph) would be joined by a line to the parent. The point being made is that it is not a fluke that the first five generations in figure 4.14 form a line going vertically down - the user has tried other directions but decided 'down' is what they like the sound of best. If you appreciate that, at the same time, around 14 other parameters are being mutated which also exhibit similar trends (refer back to figures 4.15 – 4.17), then you can see that EvoS 7 (and artificial evolution in general) is really quite an effective tool for searching this large, multi-dimensional space.
Figure 4.16: EvoS 7 evolution: Amplitude Envelopes.
Figure 4.17: EvoS 7 evolution: Modulation Index Envelopes.
4.7.4 User evaluation

To further assess the utility of EvoS 7, a number of people spent some time using both EvoS 7 and Arno van Goch's P-Farm [Goc96]. Since P-Farm uses Roland's Juno-106 synthesizer, this allowed extra comparisons. The Juno allows you to manually adjust the sliders that control its parameters, so this method could be compared with evolution using P-Farm. Also, P-Farm can generate a population of completely random patches. Once again, this random method of finding good sounds could be compared with evolution using P-Farm. The users (Brett, Emma and Nick) have had little musical training; however, they all have a strong interest in electronic music and enjoy listening to 'off-the-wall' sounds and timbres. Each user spent about an hour experimenting with the systems. The next sections summarise their experiences and opinions.

Brett

Brett found using P-Farm was easier than manually moving the sliders on the Juno. After ten generations he had found a few sounds which he was reasonably happy with. Brett thought it was very hard keeping track of the sounds that he liked. P-Farm presents a population of 32 sounds to the user, who has to audition them all to find the ones they like most. Brett felt that if there had been a quick way to note sounds that you liked (for example, a fitness rating of 1 to 10) it would have been much easier to keep track of which sounds were good. Brett thought that using EvoS 7 was a lot easier than P-Farm. In EvoS 7 you are only presented with one sound at a time, and you have to make a decision as to whether it is better or worse than the current parent sound. Brett felt that this provided fewer distractions than P-Farm, where you are constantly thinking about the 31 other sounds and whether they are better or worse than the current one. Brett noticed that EvoS 7 was slower than P-Farm when generating sounds for auditioning.
P-Farm has a real-time response – as soon as you click on an individual you can play a tune with it on the keyboard. EvoS 7, however, takes a little while to generate the melody used for auditioning; the longer this melody is, the longer the delay. Finally, Brett did not like the fact that you couldn't save sounds in EvoS 7. If you got a sound you were reasonably happy with, you were too afraid to explore anywhere else as you might lose the sound you had worked so hard for. To overcome this, Brett suggested a 'stud-farm' idea. As you go along exploring the sound space, if you find a sound you like, you can add it to your 'stud-farm'. Sounds stored in the stud-farm can be retrieved at any time, so you no longer have to worry about exploring further and losing a good sound. Also, after a while the stud-farm will contain a portfolio of your favourite sounds. You could mate these favourites together (just like in a real stud-farm) to see if anything interesting
happens.

Emma

Emma used P-Farm for 8 generations and found two sounds that she was happy with. She found evolution with P-Farm much more intuitive than just moving the sliders on the Juno without knowing what they did. A very interesting result, however, is that out of 32 completely random patches (one random P-Farm population) she found 4 sounds that she liked even more than the 2 she took 8 generations to evolve. When using EvoS 7, Emma thought that all the sounds it made were a lot harsher than the 'nice' sounds created with P-Farm. She felt that in order to use EvoS 7 effectively you have to know what kind of sound you are looking for to begin with. Emma also thought it would be good to be able to keep, and go back to, any sounds that you liked, but she also suggested an 'undo' feature. Sometimes when you are auditioning sounds very quickly, you make a mistake and accidentally destroy a child sound that you wanted to keep. It would be good to be able to backtrack a few steps and rectify your mistake. Emma thought that to make an evolutionary sound system really useful you must be able to mutate samples. For example, if she hears a trumpet sound in a song that she really likes, she should be able to sample this sound and then mutate it into something similar yet different. Emma thought she would also like to take samples of all her favourite sounds, then breed these sounds together to make new and interesting sounds.

Nick

Nick thought that using P-Farm was much easier than adjusting the Juno's sliders manually. He also found P-Farm evolution easier than randomly generating patches. Nick used P-Farm for 10 generations to evolve an "organ" type sound. Nick found using EvoS 7 much easier than P-Farm. He felt that EvoS 7 gave you more control over where the sound was going – you could see trends that were occurring. Although sometimes progress was a bit slow, he felt eventually you would get closer to the sound you desire.
On the downside, Nick thought it was frustrating that you can sometimes go ‘backwards’, away from the sound you like. When you do this, there is no going back. Also, sometimes when you are being cautious you can miss an opportunity to get closer to your desired sound. Both these problems could be fixed with an ‘undo’ feature that allows the user to backtrack a number of steps and pick up where they left off. Nick disliked the slow speed of EvoS 7 and would have preferred more notes to test the sounds on. Finally Nick suggested a feature where you specify areas of ‘like’ and ‘dislike’. If you were in a bad spot in the sound space, you could tell the system and it would steer you away from that area in future. This
would stop you going round in circles, constantly encountering the same bad area of the sound space.
4.7.5 Conclusion

A number of conclusions can be drawn from analyzing the users' comments:
- P-Farm demonstrated that evolution is more intuitive than adjusting synthesis parameters manually. This is especially the case when the user does not know how each parameter affects the overall sound.

- The user interface of EvoS is more effective than that of P-Farm. It seems presenting the user with one sound at a time provides less distraction than when they are faced with 32 sounds. The user interface of EvoS enables the user to hear how each mutant child is different from its parent.

- The sounds created by P-Farm are generally more pleasing than those of EvoS. This may be a result of the different synthesis models in use: P-Farm utilises a Roland Juno-106, which is a subtractive synthesizer, while EvoS uses FM synthesis. Also, the Juno is a well designed instrument. The ranges of all the parameters have been set very carefully; as a result, it is very hard to get a bad sound out of the Juno. This is why Emma had just as much success with randomization as she did with evolution using P-Farm. It seems that in order to get better sounds out of EvoS, it might be wise to experiment with different synthesis models.

- EvoS was much slower than P-Farm in generating sounds. Unfortunately, this fault in EvoS is not so easy to rectify. The way to improve speed would be to use a hardware synthesizer (like the Juno) that provides real-time response. However, a fair amount of driver software is required to interface any computer program with the synthesizer. Due to time constraints, this will not be possible.

- EvoS most certainly needs a 'save' feature. This would allow any parent or child sound to be saved so the user could return to it at any time. This feature might be extended to implement the 'stud-farm' that Brett suggested, where sounds that you have collected can be bred together to make new ones.

- EvoS also needs an 'undo' feature. This would enable the user to backtrack through a number of steps and rectify a mistake.
Overall, EvoS 7 performed surprisingly well when put to the test against P-Farm. If the suggested features are implemented, EvoS will become an even more effective tool for interactive evolution of sounds.
So what's next?

Development of the artificial evolution system for this project was basically complete. What remained was to implement the users' suggestions and then extensively test the system to measure its effectiveness. Emma's suggestion of evolving samples was explored, and this is documented in Appendix A. Alas, this avenue did not prove successful. Instead, the user-suggested improvements were implemented in the next version of EvoS – it was this system that was used as the basis of the experiments in the next chapter.
Chapter 5

Results

5.1 Aim

Chapter 4 described the development of EvoS - a system that could evolve sounds. The aim was now to measure the effectiveness of that system. This chapter describes exactly how the effectiveness of the EvoS system was measured and what results were achieved.

How can you measure the effectiveness of a system that relies so much on subjective opinion? This was a tricky problem which I thought about for a long time. In the case of an interactive system, at the very least you must acquire some user opinions. After a few different users have tried the system, you get an idea of just how good it is. However, this thesis is not concerned with just how 'fun' or interesting a program is to use, but more importantly, whether or not artificial evolution is a useful tool for sound synthesis. In order to determine this, a comparative test was made. Users tested the evolution system, but also tested other (more traditional) methods of searching the sound space. Through the results of these tests, we will see not only how intuitive the evolution system was, but also how it rated as a search tool beside other methods. It was decided the comparison would occur between three search methods:

Manual A manual search corresponds to specifying parameter values exactly and then listening to the resulting sound. This is akin to manually turning the dials on an analog synthesizer. To do this effectively usually requires knowledge of the synthesis algorithm in order to predict the effect of changing various parameters to achieve the sound you want.
Random In a random search, parameters are assigned completely random values and the resulting sound is listened to. This is the same as randomly spinning the dials of an analog synthesizer and hoping that where they land results in a good sound. It is a lottery - you might win the first time you try, or you may spend years trying with no luck.

Evolution This method of searching works via mutation and selection. It was described in detail in chapter 4.

To compare these three search methods, a user interface was required for each of them. The next section describes the implementation of these user interfaces. Following this, the method of the experiment will be explained and the results discussed and analysed. Finally, conclusions will be drawn from this analysis.
5.2 Implementation

All of the search methods for comparison were implemented using MatLab. For an effective comparison, each search method used the same underlying synthesis algorithm: the FM synthesis algorithm of EvoS7. Since this was already implemented in MatLab, it seemed an obvious choice. MatLab also provides basic GUI (Graphical User Interface) facilities that are easy, if at times tedious, to use. The following subsections describe the implementation of each search method or tool.

5.2.1 Manual Tool

Figure 5.1 shows a screen shot of the manual search tool user interface. Look at the left side of figure 5.1 and recall the genome of an EvoS7 individual (refer back to section 4.7.2 for a more detailed explanation of the genome):

a = (n_I, x_{I:1}, ..., x_{I:n_I}, y_{I:1}, ..., y_{I:n_I}, I_1, I_2, N_1, N_2, n_V, x_{V:1}, ..., x_{V:n_V}, y_{V:1}, ..., y_{V:n_V})

The manual tool interface simply allows the user to type in a value for any of these parameters. Considerable work was involved in keeping parameters consistent. For instance, when the user changes the number of points in the amplitude envelope n_V, the corresponding vectors x_{V:i} and y_{V:i} should shrink or grow to reflect this value.

The user can hear the current sound by pressing the Play button. This causes the current settings of the parameters to be read and then synthesized into a sound. Synthesis
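The bookkeeping described above (resizing the envelope vectors when the number of envelope points changes) can be sketched as follows. This is an illustrative Python stand-in, not the actual MatLab implementation; every name and value range here is an assumption made for the sketch, and n_I and n_V are kept implicit as vector lengths rather than stored as genes.

```python
# Illustrative Python stand-in for an EvoS7-style genome; the real system was
# written in MatLab, and all names and value ranges here are assumptions.
import random

def make_genome(n_i=3, n_v=3):
    """Build a genome with n_i index-envelope points and n_v amplitude-envelope points."""
    return {
        "xI": [random.random() for _ in range(n_i)],   # index envelope x coordinates
        "yI": [random.random() for _ in range(n_i)],   # index envelope y coordinates
        "I1": random.random(), "I2": random.random(),  # modulation index range
        "N1": random.randint(1, 8), "N2": random.randint(1, 8),  # frequency ratios
        "xV": [random.random() for _ in range(n_v)],   # amplitude envelope x coordinates
        "yV": [random.random() for _ in range(n_v)],   # amplitude envelope y coordinates
    }

def set_envelope_points(genome, env, n):
    """Keep the parameters consistent: when the user changes the number of
    points in an envelope ('I' or 'V'), the corresponding x and y vectors
    shrink or grow to reflect the new value."""
    for axis in ("x" + env, "y" + env):
        vec = genome[axis]
        if n < len(vec):
            del vec[n:]                                   # shrink: drop surplus points
        else:
            vec.extend(random.random() for _ in range(n - len(vec)))  # grow: pad with new points
```

Resizing in both directions is what keeps a typed-in value of n_V from leaving the x and y vectors at the wrong length.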
Figure 5.1: Screen shot of the manual tool user interface.

is achieved easily by using the existing eval subroutine from EvoS7. A small graph of the amplitude and modulation index envelopes is displayed beside the parameter settings. This assists the user in achieving the desired envelope.

Also assisting the user is the status bar. This is the line of text at the bottom of the screen. It is frequently updated to notify the user of the program's status. The status bar has been implemented in all of the search tools.

Figure 5.1 illustrates another feature common to all three search tools. Down the right-hand side is the Patch Store. This is an implementation of the "stud farm" idea suggested by Brett (see section 4.7.4). By typing in a name (for example, "Chowning 1" in figure 5.1) and pressing Save As:, the current parameter settings will be saved in the Patch Store. These can be recalled later by selecting the name of the patch you want (for example, "Trumpet" in figure 5.1) and pressing Load.

When you have a number of parameter settings recorded in your Patch Store you will want to save them all to disk. This can be accomplished using the Save File... button. Conversely, to load a previously created Patch Store file you can use the Open File... button. Finally, when the user is finished using the manual search tool they can hit Quit
and exit the program.

Searching the sound space using the manual tool requires the user to have knowledge of how each parameter affects the sound - or at least a feel for it. The user listens to the current sound and tries to guess which parameter needs adjusting to achieve the desired sound. The adjustment is made and the user listens again. If the resulting sound is further away from the desired sound, the change made to the parameter is reversed. Other changes are auditioned until a step closer is made. If the resulting sound is closer to the desired sound, other parameters may be 'tweaked' to bring it closer still. When the user is satisfied, the sound is saved in the Patch Store for later reference.
5.2.2 Random Tool

The random search tool is much simpler than the manual tool. A screen shot of the user interface is shown in figure 5.2.
Figure 5.2: Screen shot of the random tool user interface.

The two buttons in the center are all that is required. Pressing Randomize generates a new patch using totally arbitrary parameter settings. The user no longer has to tediously enter all the parameters as they did in the manual tool. The downside is that there is absolutely no control over what the parameters will be. This randomization
is implemented using the existing init subroutine from EvoS7. When called, init simply returns an EvoS7 genome with completely random contents. The Replay button simply plays the current randomized sound and is implemented using the eval subroutine from EvoS7.

Looking at the right hand side of figure 5.2 you can once again see the Patch Store. This performs exactly the same function as in the manual tool. If the user hears any sounds they like, they can save them in the Patch Store for later reference. Note that the file formats used by all the search tools are compatible. Thus if a user found a nice sound in the random tool but wanted to 'tweak' it a little, they could save it to a file, load it up in the manual tool, and there adjust the parameters manually to get the desired effect.

Searching using the random tool is straightforward. The user simply presses Randomize until an interesting sound is heard. The interesting sound can be saved in the Patch Store. To continue the search, the user presses Randomize again. When the user is finished searching the sound space, they can save all their interesting patches in a file (using Save File...) and exit the program by pressing Quit.
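The Patch Store shared by all three tools amounts to a name-to-settings mapping with a common on-disk format. A minimal sketch follows (a Python stand-in; the JSON format and all names are my assumptions for illustration, not the original MatLab file format):

```python
# Hypothetical sketch of the shared Patch Store. The actual tools were MatLab
# GUIs; JSON is an assumed stand-in for their common file format.
import json

class PatchStore:
    def __init__(self):
        self.patches = {}                  # maps patch name -> parameter settings

    def save_as(self, name, genome):       # the "Save As:" button
        self.patches[name] = dict(genome)

    def load(self, name):                  # the "Load" button (returns a copy)
        return dict(self.patches[name])

    def save_file(self, path):             # "Save File...": one shared format means
        with open(path, "w") as f:         # a patch saved in one tool can be opened
            json.dump(self.patches, f)     # and tweaked in any of the others

    def open_file(self, path):             # "Open File..."
        with open(path) as f:
            self.patches.update(json.load(f))
```

Because every tool reads and writes the same format, the random tool can feed raw finds to the manual or evolution tool for refinement, exactly as described above.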
5.2.3 Evolution Tool

Last, but not least, is the evolution search tool - shown in figure 5.3. The evolution tool is a GUI that replaces EvoS7's keyboard interface.

Looking at the left side of figure 5.3 you can see the Parent box. This box is divided into two halves. One half is the Play button, which plays the current parent sound (playback is once again implemented with EvoS7's eval subroutine). The other half is a Patch Store allowing the current parent sound to be saved or previous sounds to be loaded.

Below the Parent is the Child box. This is also divided into two halves. The left half allows you to select a particular child and play it; the right half is a Patch Store allowing you to save the current child sound.

As in EvoS7, children are created using Mutate and Randomize. Mutate mutates the current parent sound using EvoS7's mutate subroutine, and Randomize creates a random child using init. The Replace button replaces the current parent with the current child. So through the three buttons Replace, Mutate and Randomize you have all the functionality of the EvoS7 system. Users can search through the sound space by mutating and replacing. When they are satisfied with a sound they can save it in the Patch Store. But what about the Partner box and that Mate button?
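The Mutate-Replace cycle is essentially an interactive hill climb, with the user's ear as the selection step. A minimal non-interactive sketch is given below, where a toy fitness function stands in for the listener; the list genome, the target and all names are illustrative assumptions, not the EvoS7 code.

```python
# Sketch of the Parent/Child Mutate-Replace cycle. In EvoS7 the selection step
# is the user pressing Replace after listening; here a toy fitness function
# stands in for the user's ear so the loop can run unattended.
import random

TARGET = [0.2, 0.8, 0.5]              # a stand-in for the sound the user wants

def fitness(genome):
    """Higher is better: negative squared distance to the desired 'sound'."""
    return -sum((g - t) ** 2 for g, t in zip(genome, TARGET))

def mutate(genome, step=0.1):
    """Perturb one randomly chosen gene, loosely mimicking a mutate subroutine."""
    child = genome[:]
    i = random.randrange(len(child))
    child[i] += random.uniform(-step, step)
    return child

random.seed(1)
parent = [random.random() for _ in range(len(TARGET))]  # Randomize: a random first parent
for _ in range(500):
    child = mutate(parent)            # Mutate: create a candidate child
    if fitness(child) > fitness(parent):
        parent = child                # Replace: the preferred child becomes the parent
```

Since a child only ever replaces a preferred parent, the loop can never get worse, which is exactly the property that lets a user 'sculpt' a sound step by step.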
Figure 5.3: Screen shot of the evolution tool

Mating Sounds

Mating or recombination has previously been ignored in this study - the main focus has been on mutation. However, it would seem foolish to conduct an investigation of artificial evolution for sound synthesis without even experimenting with a recombination operator. To this end, it was decided to include one in this final incarnation of the system. Unfortunately, as time was running out, this mating operator had to be kept extremely simple. The following paragraphs describe its implementation.

One point crossover

The simplest form of recombination is one point crossover: one point along the genome is chosen at random, and a child is formed by taking the genes of the first parent up to this point, and the genes of the second parent after this point. For example, consider two parents a and b which will form a child c. The genomes of the parents look like this:

a = (a_1, a_2, ..., a_{n-1}, a_n)
b = (b_1, b_2, ..., b_{n-1}, b_n)
A random crossover point i is chosen, where i ∈ [1, n-1]. Thus the child genome will look like this:
c = (a_1, a_2, ..., a_{i-1}, a_i, b_{i+1}, b_{i+2}, ..., b_{n-1}, b_n)

But notice this assumes parent a goes first. If we decided that parent b should go first, we would get a different child:

c = (b_1, b_2, ..., b_{i-1}, b_i, a_{i+1}, a_{i+2}, ..., a_{n-1}, a_n)
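This construction is easy to check with a short sketch that enumerates every child two parents can form (Python is used for illustration; the thesis tools themselves were written in MatLab):

```python
# Enumerate every one point crossover child of two fixed-length parents:
# for each crossover point i in [1, n-1], either parent may 'go first'.
def one_point_children(a, b):
    children = []
    for i in range(1, len(a)):            # crossover point i in [1, n-1]
        children.append(a[:i] + b[i:])    # parent a goes first
        children.append(b[:i] + a[i:])    # parent b goes first
    return children

a = ["a1", "a2", "a3", "a4"]
b = ["b1", "b2", "b3", "b4"]
kids = one_point_children(a, b)           # 2 * (4 - 1) = 6 distinct children
```

For n = 4 genes the sketch yields six distinct children, in agreement with the 2(n-1) count.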
So for each crossover point i there are two possible children. Thus for two parents a and b with n genes each, there are 2(n-1) different children that can be formed with one point crossover.

Implementing one point crossover

When this scheme of one point crossover is applied to the genomes of EvoS7, some major problems occur. One of these is caused by the fact that EvoS7 genomes do not have a fixed length. For example, imagine parent a has a modulation index envelope with two points (n_I = 2) and parent b has one with ten points (n_I = 10). If the crossover point is chosen to be one (i = 1), then the child will inherit n_I = 2 from parent a, and all its following genes from parent b. This will be an invalid child - n_I will specify two envelope points, but there will be ten envelope points following it, courtesy of parent b!

OK, so this problem could be fixed up with some fancy coding and a bit of patience, but there were other problems as well (think about what happens if the crossover point is right in the middle of the x's and y's of an envelope). To avoid having to deal with all these, a much simpler scheme was devised: the genome could only be split at points where the results were well defined. This resulted in a 'meta genome' that looked like this:

a = (Index Envelope, I_1, I_2, Freq Ratios, Amplitude Envelope)

This scheme worked quite well and solved all of the previously mentioned problems. The downside of the scheme was that, in terms of mating, the genome's length was drastically reduced - there are only 4 possible crossover points. This means that for any two parents, there are only 2 × 4 = 8 possible children! Although it would have been desirable to have more combinations available, the crossover system seemed to work adequately.

A MatLab procedure mate was written that implemented the scheme. The procedure takes two parent genomes, randomly chooses a crossover point and randomly chooses which parent will 'go first'. The resulting child genome is the return value of the procedure.
Mating in the evolution tool

Now we can finally explain the mystery parts of figure 5.3. The Partner box allows you to load any sound previously saved in the Patch Store. When the Mate button is pressed, a child is formed by mating the current Parent with the current Partner (using the aforementioned mate subroutine). This child can then be subjected to more mating or mutation using the previously described features of the evolution tool.

When the user is finished searching the sound space via mutation, mating and randomization, they can save all their acquired sounds with the Save File... button. Once again, this is the same file format used in the random and manual tools - the user can change their sounds with the other programs by using the Open File... button. Finally, the user can press Quit, which will exit the program.

5.3 Method

After the three search tools were implemented, it was time to actually conduct the experiment and get some results. The experiment involved getting users to try each of the search tools. The users first tried the manual tool, then the random tool and finally the evolution tool. Before they used each tool, detailed instruction was given and the user was taken through an example of the tool's use. Three statistics were recorded for each tool:

Time The time that the user used the tool for. This was usually around 10-15 minutes.

Sounds Auditioned The number of different sounds auditioned. Each time the user made a change and listened to a new sound, it was noted.

Sounds Liked The number of sounds that the user 'liked' out of all the sounds the user auditioned. This statistic was collected at the end, after all the tools had been used. Along the way, the user was encouraged to save any sounds that were remotely interesting. After all the tools had been used, all the saved sounds for each tool were reviewed. The user specified in retrospect which sounds they really did think were interesting, useful or amusing - the number of these was noted. It was necessary to collect this statistic at the end because quite frequently, in the early stages of the trial, the user would save a lot of sounds they thought were great. However, after using the other tools and seeing what other sounds they could get out of the algorithm, they would change their minds.
The motivation for these statistics was to see how many good sounds you could find with each tool in a certain time. It might be found that with one tool users find a lot of good sounds quickly, whereas with another it takes a long time to find a few.

For this comparison to be valid, it was important that each tool be used for an equal amount of time. However, an 'equal amount of time' was not so easy to judge. For instance, the manual tool took a long time to get used to. Also, when using the manual tool, people usually think for a while, make a few deliberate changes and then listen to the result. These factors result in the user spending a lot of time with the manual tool and only listening to a few sounds. The random tool, however, is very simple and users get the idea straight away. Many sounds are auditioned in a very short time. To get a good comparison between these two tools one would like the manual tool to be used for longer, just to ensure that the user had 'the hang of it' and was not spending all the time just trying to work out how to use the interface. As a result, each tool was used until the user got 'the hang of it'. Although this doesn't seem like a very rigorous metric of time, it proved to be the only viable option.

As well as collecting the three statistics mentioned, after using the search tools users were asked a question along the lines of: "If you needed some sounds (say, for a composition you were writing) and you were only able to use this algorithm and the three search tools, what strategy would you use to find the sounds you desired?" It was hoped that by asking this question, further insight could be gained into which tool the users thought was most effective for searching the sound space. Their answer to the question, and any other comments they had about the three search methods, was duly noted. The whole process of instructing the users, collecting statistics and asking the question took about 1 hour.
This is quite a large amount of time to ask volunteers to donate (especially considering they were not getting paid!) - because of this, just seven users were surveyed in total. Of these seven users, only two were active composers of electronic music - people who have a real need for these kinds of tools. The rest of the users, although having an interest in electronic music, would not have a need for the search tools. Although this may seem like a deficiency, it would be good to see if any of the search methods allows 'normal' people to explore the sound space effectively - people who have no knowledge of how the synthesis algorithm works, but simply know what sounds nice and what doesn't.
5.4 Results

5.4.1 Raw Data

The raw data for the statistics Time, Sounds Auditioned and Sounds Liked is presented first. The results for the manual tool are shown in table 5.1; the random tool in table 5.2; and the evolution tool in table 5.3. Each table shows the data collected from each user and also the mean and standard deviation of this data.

User      Time (minutes)   Sounds Auditioned   Sounds Liked
Emma      18               20                  3
Dougal    21               40                  2
Rachael   9                25                  2
Tom       18               21                  2
Brett     14               24                  2
Chris     14               25                  3
Nick      15               20                  2
Mean      15.57 ± 3.867    25.00 ± 6.976       2.286 ± 0.488

Table 5.1: Raw statistics collected for the manual tool

User      Time (minutes)   Sounds Auditioned   Sounds Liked
Emma      10               40                  5
Dougal    4                40                  6
Rachael   5                40                  4
Tom       6                24                  5
Brett     5                41                  3
Chris     8                36                  4
Nick      5                43                  3
Mean      6.143 ± 2.116    37.71 ± 6.396       4.286 ± 1.113

Table 5.2: Raw statistics collected for the random tool
5.4.2 Analysis of Data

Table 5.4 presents a comparison of the previous statistics - it compares the averages of each statistic for each search tool. It will be useful to analyse each line of this table.
User      Time (minutes)   Sounds Auditioned   Sounds Liked
Emma      24               65                  6
Dougal    13               80                  8
Rachael   9                40                  4
Tom       16               54                  4
Brett     15               48                  5
Chris     13               53                  7
Nick      15               80                  5
Mean      15.00 ± 4.583    60.00 ± 15.57       5.571 ± 1.512

Table 5.3: Raw statistics collected for the evolution tool
Average Statistic        Manual           Random           Evolution
1. Time                  15.57 ± 3.867    6.143 ± 2.116    15.00 ± 4.583
2. Sounds Auditioned     25.00 ± 6.976    37.71 ± 6.396    60.00 ± 15.57
3. Sounds Liked          2.286 ± 0.488    4.286 ± 1.113    5.571 ± 1.512
4. Liked / Time          0.155 ± 0.049    0.762 ± 0.351    0.395 ± 0.142
5. Liked / Auditioned    0.097 ± 0.032    0.120 ± 0.048    0.095 ± 0.022
6. Auditioned / Time     1.685 ± 0.574    6.757 ± 2.511    4.185 ± 1.230

Table 5.4: Mean statistics for each search tool.
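The ratio lines of table 5.4 can be recomputed from the raw per-user data in tables 5.1 and 5.2; for example, line 4 ("Liked / Time") for the Manual and Random tools (Python is used here purely as a calculator):

```python
# Recompute line 4 of table 5.4 ("Liked / Time") from the raw per-user data
# in tables 5.1 and 5.2, as an arithmetic check on the reported values.
from statistics import mean, stdev

manual = [(18, 3), (21, 2), (9, 2), (18, 2), (14, 2), (14, 3), (15, 2)]   # (time, liked)
random_tool = [(10, 5), (4, 6), (5, 4), (6, 5), (5, 3), (8, 4), (5, 3)]

manual_ratio = [liked / time for time, liked in manual]
random_ratio = [liked / time for time, liked in random_tool]
# mean/stdev of manual_ratio come out at 0.155 / 0.049, and of random_ratio
# at 0.762 / 0.351, matching the Manual and Random entries of line 4.
```

Note that line 4 is the mean of the per-user ratios, not the ratio of the means, which is why 2.286/15.57 does not reproduce the 0.155 figure.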
In the first line we see that the average Time spent using the Manual and Evolution tools was about the same. This implies that the tools were of similar complexity, and users took about 15 minutes to get comfortable with them. The average Time spent using the Random tool, however, was much less - only about 6 minutes. This reflects the fact that the Random tool was much simpler than the others and users quickly became proficient with it.

The second line of table 5.4 shows the average number of Sounds Auditioned with each tool. Sounds Auditioned for the Manual tool is quite low because the manual tool required users to think about what they were doing before modifying parameters. The value for the Random tool is a little higher - it was easy to audition sounds with this tool, but users didn't spend much Time with it. The Evolution tool had a much higher average than both of the others, as auditioning sounds was easy (you didn't have to think as much compared to the Manual tool), but users spent a lot of Time getting used to the tool (due to its complexity).

Looking at the Sounds Liked statistic in table 5.4 we see a very positive result for the Evolution tool. On average, with the Manual tool, users found about 2.3 sounds they liked; with the Random tool, they found 4.3; however, with the Evolution tool they found 5.6 sounds that they liked. On the surface, we could conclude from this result that the Evolution tool performed the best and thus Artificial Evolution is the most effective way of searching the sound space. But doing this would neglect the facts that each tool was not used for an equal amount of time, and that an unequal number of sounds were auditioned with each tool. If we analyse some ratios of these statistics we can gain further insight into the actual meaning of the data.

The fourth line of table 5.4 shows the average of Sounds Liked divided by Time for each search tool - this gives us a measure of "Sounds Liked per Minute".
You can see that the Random tool scores highest here, with Evolution coming second and Manual last. This implies that using the Random tool you can find more sounds in a shorter amount of time. Does this mean that the Random tool is the most efficient for searching the sound space? This contradicts the conclusion reached from looking at the Sounds Liked statistic alone.

Furthermore, look at the fifth line of table 5.4, which shows the ratio of Sounds Liked to Sounds Auditioned. This could be another measure of efficiency - how many sounds do you have to audition before you get one you like? Again we see that the Random tool is the best, with the Manual and Evolution tools being about equal. This seems to imply that if a user auditioned an equal number of sounds with the Random and Evolution tools, they would like more of the sounds discovered with the Random tool. So again, the efficiency of the Evolution tool is questioned.

Finally, line six of table 5.4 shows the ratio of Sounds Auditioned to Time. This, along with the Time statistic, provides another measure of a tool's simplicity - how
many sounds can you audition per minute. The Random tool ranks best again - it is a very simple tool to use and encourages rapid auditioning of sounds. The Evolution tool comes second - it also encourages rapid auditioning, but it is not as simple as the Random tool. Last comes the Manual tool - it is quite complicated and requires users to think before auditioning sounds.
5.4.3 An Alternative Analysis

Before making any conclusions on the above analysis, let us consider another possibility. What if the number of Sounds Liked is not dependent on the search tool at all, but on some other variable, like Time? Instead of the trend being "people found fewer sounds with the Manual tool, and more sounds with the Evolution tool", it could be "the more time people spent using any system, the more sounds they found." We can see if this is the case by looking at a graph of Sounds Liked with any tool vs. the Time spent using that tool. This graph is shown in figure 5.4. Note that each point is labelled with a letter that identifies the tool used: E for Evolution; R for Random; and M for Manual.
Figure 5.4: Sounds Liked plotted against Time for all search tools.

Apart from points of the same search tool being clustered, there is no trend in figure 5.4. This implies that Sounds Liked is not dependent on Time spent.
For example, if someone used a search tool for longer than another person, you could not predict that they would find more sounds. On the other hand, look at figure 5.5, which shows Sounds Liked plotted against Sounds Auditioned. Again, each point is labelled with a letter to identify the search tool used.
Figure 5.5: Sounds Liked vs. Sounds Auditioned for all search tools.

A clear trend is shown in figure 5.5: no matter which tool is used, more Sounds Auditioned implies more Sounds Liked. Before, we were trying to argue that one particular search tool would produce more Sounds Liked than another. This graph goes against that argument, claiming that it doesn't matter which search tool you use, just how many sounds you audition.

For comparison, figure 5.6 shows Sounds Liked plotted against the search tool used. Here there is also a clear trend: the Evolution tool produces the most Sounds Liked, the Random tool produces less, and the Manual tool produces even less again.

So which trend is more dominant? Is Sounds Liked a function of Sounds Auditioned (figure 5.5), or is Sounds Liked dependent on which search tool is used (figure 5.6)? An attempt was made to answer this question by fitting polynomials to the two graphs and checking confidence intervals; however, there were too few data points to
Figure 5.6: Sounds Liked plotted against search tool.

get meaningful results. As a result, these numerical results should be taken with a grain of salt. Although conducting a test like this seemed a good idea at the start, drawing concrete conclusions from the data proved to be quite difficult. It might be more instructive to go straight to the horse's mouth and find out what the users actually thought about each of the search tools.
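As a rough post-hoc check (my own, not part of the original analysis), the two competing trends can be quantified with Pearson correlation coefficients over the 21 pooled trials from tables 5.1-5.3:

```python
# Post-hoc check: Pearson correlation of Sounds Liked against Time and against
# Sounds Auditioned, pooled over all 21 trials in tables 5.1-5.3.
from math import sqrt

# (time in minutes, sounds auditioned, sounds liked) for each user and tool
trials = [
    (18, 20, 3), (21, 40, 2), (9, 25, 2), (18, 21, 2),     # manual
    (14, 24, 2), (14, 25, 3), (15, 20, 2),
    (10, 40, 5), (4, 40, 6), (5, 40, 4), (6, 24, 5),       # random
    (5, 41, 3), (8, 36, 4), (5, 43, 3),
    (24, 65, 6), (13, 80, 8), (9, 40, 4), (16, 54, 4),     # evolution
    (15, 48, 5), (13, 53, 7), (15, 80, 5),
]

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

times = [t for t, a, l in trials]
auditioned = [a for t, a, l in trials]
liked = [l for t, a, l in trials]

r_time = pearson(times, liked)              # roughly -0.08: no real trend
r_auditioned = pearson(auditioned, liked)   # roughly 0.72: a clear positive trend
```

The correlation of Sounds Liked with Time is negligible while the correlation with Sounds Auditioned is strongly positive, which is consistent with the reading of figures 5.4 and 5.5 above, though with only 21 points neither figure should be over-interpreted.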
5.4.4 User Opinions

Recall that as well as collecting the statistics, each user was asked to answer this question: "If you needed some sounds (say, for a composition you were writing) and you were only able to use this algorithm and the three search tools, what strategy would you use to find the sounds you desired?" This section presents a short summary of each user's answer to this question and any other comments they may have made.
Emma

To find sounds, Emma would use all three search tools. She found the manual tool very confusing at first, but it then became fun as she understood it. The random tool was the easiest to use and you could get quite good sounds from it. The evolution tool was also confusing at first - it was much more complicated than the others. She liked the mating feature and got quite good results from it, although she thought the evolution tool could be more straightforward.

Overall, she thought the tool used would depend on the situation. Sometimes you might be in a rush, and so use the random tool to get sounds quickly. At other times you may want to slowly refine sounds by mutation. She thought it would be best to have a system that combined all three tools. This way you would enjoy the advantages of each.

Dougal

For finding sounds, Dougal thought the evolution tool was the best. He thought it was much easier to use than the manual tool. He noted (as did many other users) that the evolution tool contains all the features of the random tool and more - it would be stupid to use the random tool when the evolution tool is available. Dougal had a lot of fun using the evolution tool, and wanted to continue even after the survey had ended.

Rachael

Rachael would also use the evolution tool to find sounds. She thought it was quite easy to use, except that sometimes you forgot to save sounds (this was due to a poor interface design... my fault!). Rachael thought that if you understood the algorithm in depth, you would be better off using the manual tool. This would give you exact control over the sound you wanted.

Tom

Tom thought he would use the manual tool to find sounds. He liked the manual tool because you could see the envelopes that were being used. Also, when using the manual tool he knew the sounds generated were his, and not some random creation that the computer cooked up. Tom thought that the evolution tool was good for refining sounds. He thought a good strategy to find sounds would be to first create something
with the manual tool and then work on it and refine it in the evolution tool. (Tom also thought that all the programs could do with catchier names - he expected something along the lines of "SoundGen 2000" rather than "evolution tool".)

Brett

To find sounds, Brett would first use the random tool and get a number of raw sounds that had desirable characteristics. He would then take these to the evolution tool and start working on them. He really liked the Mate feature and thought it was successful at combining the traits of two different sounds. He thought that Mutate was very good for fine tuning a sound.

Brett liked the random tool because its simple interface did not distract you from listening to the sounds. This is why he would use it first, even though all its features are contained in the evolution tool. Brett thought the only use for the manual tool would be to 'tweak' sounds that you obtained via other means.

Chris

Before Chris' comments are presented, a special note should be made: unlike the other users, Chris is an active composer of electronic music. His compositions involve the use of many different synthesizers, both digital and analog. Because of this, he has an in-depth understanding of many synthesis algorithms. Chris has been composing for over five years - he is the kind of person who would really utilize the search tools being assessed here. As a result, special attention should be paid to his comments.

Chris strongly stated that the evolution tool was the most efficient way to search for sounds. Using the evolution tool, his search strategy would involve first generating some random sounds. These would be refined a bit using mutation, and any vaguely interesting sounds would be saved. Next, Chris would mate the saved sounds in different combinations and hopefully obtain his desired sounds.

Chris would have liked to see the envelopes of sounds in the evolution tool. He believed this would provide visual feedback, indicating to the user what part of the sound had changed. He also would have liked a control that varied the degree of mutation.

Chris thought it was good having a variety of ways to change the sound - in the evolution tool there were three: randomization, mutation and mating. Ideally though, Chris thought it would be good to enable manual modification of parameters if you needed it. This would create a 'super tool' with all the features of the three existing tools. The 'super tool' would provide the composer with the most flexibility.
Finally, Chris would have liked a 'loop' feature. This would constantly replay the current sound and enable you to sequence drums or other sounds on top of it. The loop feature would help you audition sounds; you could see if a sound fitted in with the rest of your composition. It would also make using the system a lot more fun; you would really be making music as the sounds mutated and evolved.

Nick

Nick is also an active composer of electronic music. Although not as experienced as Chris, he would still have a use for the search tools presented here. Special attention should also be paid to Nick's comments.

Nick had two strategies for finding sounds. The first was like Brett's: the random tool would be used to get a few interesting sounds, and then these would be refined using the evolution tool. The second strategy would involve using the manual tool to set the desired envelopes - this is easier there, as the envelopes are displayed. Then the evolution tool would again be used to refine the sounds and set the other parameters. Nick saw the evolution tool as mainly useful for refining sounds obtained through other means.

Nick got some good results using the mutation feature of the evolution tool. Using it he was able to define a small subspace of sounds that he really liked. He thought it would be really cool to be able to interpolate between sounds within this subspace. Ideally, this would be done in real time, while performing a composition.

Nick was a bit disappointed in the underlying synthesis algorithm and quickly got a feel for all the sounds it could produce. He thought it would be good to use an algorithm that could produce a wider range of timbres.
5.5 Conclusions

Taking all of the previous discussion and analysis of results into account, we can formulate a number of conclusions:

Drawing conclusions from the statistics collected proved difficult. This was due to each tool not being used for an equal amount of time, and an unequal number of sounds being auditioned. This couldn't be helped, because some of the tools took longer to get used to, and some of them encouraged more auditioning of sounds. Also, with a small sample size it was difficult to tell which variables were actually affecting the number of sounds liked.
We might tentatively say that, based on the statistics, the Random tool proved to be the most efficient for searching the sound space. The Random tool produced more interesting sounds in less time, and with fewer auditioned sounds, than both the others. The Random tool was also the easiest to use, with users getting used to it much quicker than the other tools.

The Evolution tool found more sounds than the Manual tool in less time, but on a sounds liked vs. sounds auditioned measure, they were about equal. The Evolution tool was easier to use than the Manual tool.

Remember that the Random tool is a subset of the Evolution tool - the Evolution tool provides all the functionality of the Random tool and more. This implies the good result for the Random tool mentioned above is also a good result for the Evolution tool.

The users' comments generally reinforce the above two points. The Random tool is good for finding sounds quickly, but it doesn't offer any control - you just wildly jump around the sound space. What is needed is a finer level of control that allows sounds to be 'tweaked' to get them sounding just right. Both the Manual and Evolution tools provide fine grained control or 'tweakability', but most users preferred the Evolution tool as it was easier to use.

Even with a simple implementation, the Mate feature of the Evolution tool proved quite successful. Three of the users obtained good results with this feature.

A definite bonus is Chris' strong support for Evolution as his tool of choice. Being an active composer, he may be representative of the people who would gain the most use out of these tools.

Some users expressed the utopian philosophy that "all the search tools were good." On this view, you can't say that one tool is better than the others, because they each have their unique advantages and disadvantages. Each tool is the perfect choice in certain situations.
In summary, the success of the Random tool was somewhat unexpected - this simple method seemed to perform better than the more complex Manual or Evolution tools. However, the Random tool does not offer the fine-grained control required by users, and it is here that the Evolution tool excels. The Evolution tool enables users to tweak sounds, yet is much easier to use than the Manual tool. What's more, no understanding of the underlying synthesis algorithm is needed in order to use the Evolution tool. Noting that the Random tool is a subset of it, the Evolution tool can be regarded as the most efficient method for searching the sound space - combining rapid sound location with fine-grained control. In short, artificial evolution is a useful tool for sound synthesis.
Chapter 6 Conclusion

6.1 Conclusion

This investigation has explored the application of artificial evolution to sound synthesis. It has been shown that interactive evolution is a useful tool for exploring the sound space of synthesis algorithms.

This conclusion was reached by building and testing a system that evolved parameters for a simple FM synthesis algorithm. A preliminary version of the system was assessed by three users. It was also compared with the only similar work known to the author - van Goch's P-Farm. Against P-Farm, EvoS performed well. Its main differences were a focus on mutation rather than mating, and a much smaller population size. The users preferred these features and rated EvoS highly. The only drawback of EvoS was the quality of the sounds it produced - this was a limitation of the FM synthesis algorithm.

In addition, a more objective study was conducted that compared artificial evolution to more traditional methods of parameter adjustment. Three tools were constructed in order to compare manual, random and evolutionary methods of searching the sound space of the FM synthesis algorithm. Seven users took part in the survey, two of whom were composers of electronic music.

The results of the study suggested that the random method was the most efficient way to search the sound space. However, it did not provide the fine-grained control required by users to 'tweak' sounds. The evolutionary method provided this control more easily and more efficiently than the manual method. This is emphasised again when it is realised that the random method is just a subset of the evolutionary method.

Both composers and non-composers found the evolutionary method useful. This shows that artificial evolution provides a tool not only for people who understand the underlying synthesis algorithm, but also for non-experts who would still like to experiment with sound synthesis.
6.2 Other lessons learnt Along the way, some other valuable lessons were learnt - usually the hard way:
Evolving sounds wasn't as easy as first expected! At first I thought I could just throw a Genetic Algorithm at a synthesis algorithm and fantastic things would happen. Sadly, this was not the case - the task involved a lot more careful design. Suitable ranges for each parameter had to be determined, usually by trial and error. One example of these difficulties was the problem of evolving the carrier-to-modulator frequency ratios (see section 4.6). Trying to make the parameters N1 and N2 seem 'continuous' was hard, and not simply a matter of blindly applying an Evolutionary Algorithm.

The interactive evolution technique developed did not prove successful when scaled up by a large amount. In order to surmount the shortcomings of the FM synthesis algorithm used, artificial evolution was applied to an alternative algorithm (this experiment is described in appendix A). With around 20 parameters it worked fine - with around 2000 parameters, it failed dismally!
6.3 Further Work Although much has been learnt during the course of this project, as always, there is still much that could be gained from research in the future. Some unexplored avenues include:
More experimentation could be done with self-adaption in the EvoS system. Looking back at EvoS3, where the trial of self-adaption was conducted (see section 4.3), the variances and ranges chosen for f2m, f2c and I2 seem ridiculous! It would be good to go back and see whether better results could be obtained with some sensible values, for example by applying the 1/20 rule that was developed later (see section 4.5.4).
As was suggested by a number of users, it would be good to implement a 'variable-variance mutation' feature. This would allow the user to vary the amount by which a mutant child differs from its parent. For example, as the user approached their desired sound they would decrease the variance so that mutants would lie closer to the parent sound - in this way they could 'home in' on the desired sound. The mutation procedure could be further extended to allow certain parameters to be fixed as the user desires. For example, if a user knew they didn't want the amplitude envelope mutated further, they could 'fix' it; in subsequent mutant children the amplitude envelope would remain constant while the other parameters were being optimised.

The EvoS system would benefit from an increase in speed. Currently, there is a significant delay when auditioning sounds. This delay could be reduced by using a faster implementation language, for example Csound instead of MatLab. Better still, real-time response could be obtained by using dedicated hardware synthesizers in an approach similar to van Goch's P-Farm [Goc96].

A large problem that needs to be dealt with is the limitations of the current synthesis algorithm. FM synthesis only provides a limited range of timbres. It would be good to adapt the system to an algorithm that can produce a much wider range of sounds. Furthermore, it would be good to make the system modular enough to be adapted to any synthesis algorithm the user cares to experiment with. Be warned, however: this is no easy task. A preliminary attempt at solving this problem was made (see Appendix A), but it was not very successful. Even so, using artificial evolution for additive synthesis should not be disregarded - a mating operator seems quite feasible, and the mutation operator could be made to work with a bit more experimentation.
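As a rough illustration of the variable-variance mutation and parameter-fixing features proposed above, the following Python sketch shows one way they might fit together. This is not part of the EvoS implementation; the function name, the parameter vector and the index numbering are all illustrative assumptions.

```python
import random

def mutate(params, variance, fixed=frozenset()):
    """Mutate a parameter vector with user-chosen variance.

    Indices listed in `fixed` are left untouched, so a user can
    'fix' a parameter (e.g. the amplitude) while others evolve.
    """
    return [p if i in fixed else p + random.gauss(0.0, variance)
            for i, p in enumerate(params)]

# hypothetical FM parameter vector: carrier freq, N1/N2 ratio, index, amplitude
parent = [440.0, 1.5, 3.0, 0.8]

# a large variance early on gives coarse exploration of the sound space...
child = mutate(parent, variance=1.0)

# ...then a small variance lets the user 'home in', here with the
# amplitude (index 3) fixed so it is carried over unchanged
refined = mutate(child, variance=0.01, fixed={3})
assert refined[3] == child[3]
```

The key design point is that the variance is a live control in the user's hands rather than a constant of the system, mirroring how a user narrows in on a sound.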
The payoff for creating such a system would be large: evolving samples of real sounds into unreal mutant relatives is a prospect to be relished by musicians, composers and anyone with a taste for the bizarre. I believe the work presented here follows in the footsteps of Richard Dawkins. The system combines a simple synthesis algorithm with an EA whose main operator is mutation. The next step is to follow in the footsteps of Karl Sims. This would require the evolution of symbolic lisp expressions that would be interpreted as sounds instead of images. Achieving this would solve the problem of limited-timbre synthesis algorithms – the evolution would be “open-ended.” Another large project would involve integrating the EvoS system into a larger compositional framework. This would better serve musicians and composers by combining artificial evolution with existing musical composition software.
In the short term, progress toward this goal could be made by implementing a 'looping' feature which allows drum patterns to be sequenced and played alongside evolved sounds. It would also be good to provide access to all search methods (manual and random, as well as evolution) from the same interface.
Finally, we have seen that artificial evolution for sound synthesis provides a useful tool for composers, who use the system to find sounds for their compositions, and also for non-composers, who use the system for entertainment and experimentation. It would be interesting to see whether the system is also useful for researchers who explore new sound synthesis algorithms. This would involve having some researchers trial the system and give their opinions.
And there are, without doubt, many other avenues of future research related to this work that I have not mentioned. Don’t let this stop you from pursuing them!
6.4 A final note Maybe it is not really important that one method of searching the sound space be better than another. Each method has its own appeal depending on the circumstances and the needs of the user – all three methods presented here make up a whole. Ultimately we are trying to build tools that enhance human creativity – unblock it – set it free. I look forward to the day that a machine can interpret my vision and make it a reality.
Appendix A Evolving Samples

A.1 Introduction

After the EvoS experiments had been conducted (chapter 4), it was decided to try evolving some more complicated sounds. The sounds from EvoS were interesting, yet very few were suitable for practical use - for example, in a composition. A user who evaluated the EvoS system suggested that a system which could mutate sound samples would be very useful to composers (Emma suggested this - see section 4.7.4).

But how could you mutate samples? One practical way of mutating any pre-recorded sound is via additive synthesis (refer to section 2.2.3). Recall that additive synthesis can produce any possible sound by using a separate sinewave oscillator for each harmonic in the sound. Each oscillator has a separate envelope controlling its amplitude and pitch deviation. This is a bonus, as we have already had a lot of experience in mutating envelopes - it should be a piece of cake.

However, additive synthesis is computationally expensive, with most sounds typically requiring 8 or more harmonics for accurate reproduction. As was found with EvoS, MatLab is already slow implementing a couple of oscillators and envelopes for FM synthesis. Clearly, MatLab would be unsuitable for experimenting with additive synthesis at interactive rates. All is not lost, however: the Csound program has a very quick implementation of additive synthesis and another major advantage - a heterodyne filter analysis tool. This tool can analyse any audio sample and return a data file containing the amplitude and pitch envelopes of each harmonic. Another Csound command can then be called which reads the analysis file and uses the amplitude and pitch envelopes to synthesize the sound. Using Csound, an evolution system can be built very easily by following these steps:
1. Sample the sound you wish to evolve.

2. Analyse the sample using Csound's heterodyne filter analysis tool.

3. Read and modify the analysis data file. This can be done with an external program like MatLab. The modifications you make represent the mutation of the sound.

4. Synthesize the modified analysis file using Csound, and play the synthesized sound to the user. The user can decide whether they like this sound (discard the original analysis file), or would rather try another mutation of the original (discard the mutated analysis file).

5. Repeat steps 3 and 4 until a satisfactory sound is obtained.
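The mutate-audition-select loop of steps 3 to 5 can be sketched as follows. This is a Python illustration rather than the MatLab/Csound combination actually used; the `mutate` helper and the `likes` callback (which stands in for the human listener auditioning each sound) are hypothetical placeholders.

```python
import random

def mutate(envelopes, variance):
    """Step 3: return a mutated copy of the analysis data,
    perturbing every envelope breakpoint by Gaussian noise."""
    return [[p + random.gauss(0.0, variance) for p in env]
            for env in envelopes]

def evolve(parent, variance=0.05, generations=5, likes=None):
    """Steps 3-5: repeatedly mutate, audition, and keep or discard.

    `likes` models the user's decision in step 4: if it returns True,
    the mutant replaces the parent; otherwise the mutant is discarded
    and another mutation of the same parent is tried.
    """
    for _ in range(generations):
        child = mutate(parent, variance)
        if likes is None or likes(child):
            parent = child   # keep the mutant, discard the old parent
    return parent

# toy usage: two 4-point envelopes, with a listener who accepts everything
result = evolve([[0.0, 1.0, 0.5, 0.0], [0.0, 0.8, 0.3, 0.0]],
                likes=lambda child: True)
```

In the real system the "audition" step would synthesize and play the candidate via Csound before asking the user; here that interaction is collapsed into the callback.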
A.1.1 Implementation

The first step in implementing the evolution system described above was to read and interpret the Csound analysis files. Due to the simple structure of the files, this proved quite easy to do in MatLab, which also made it easy to visualise the results. Figures A.1 and A.2 show the data of a particular analysis file.
Figure A.1: Example of a heterodyne analysis file, the amplitude envelopes
Figure A.2: Example of a heterodyne analysis file, the frequency envelopes

You can see that the amplitude and frequency envelopes are quite detailed - each has about 100-200 points. The next (and hardest) task was developing a procedure that sensibly mutated these envelopes.

At first, an almost identical strategy to that used in EvoS4 was applied (refer to section 4.4.2). Each envelope breakpoint was mutated by a small random amount, with a variance of 1/20 of the amplitude range used to control the amount of mutation. This resulted in some envelopes being mutated nicely, while others were badly distorted. The problem was that some envelopes were much more insignificant than others. Looking at figure A.1, you can see that the amplitude of the higher harmonic envelopes is much less than that of the lower harmonic envelopes. The lower harmonic envelopes were being mutated nicely (because the variance was 1/20 of their amplitude range), but the higher harmonic envelopes were badly distorted (because the variance was many times greater than 1/20 of their amplitude range).

To overcome this problem, an adaptive variance was used for mutation. Each envelope was taken on its own, the range of its amplitude values was calculated, and a variance of 1/20 of this value was computed. This variance was then used to control mutation for that envelope alone. Using this scheme, nicely mutated envelopes were obtained; see figures A.3 and A.4 for example.
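The adaptive-variance scheme can be sketched in a few lines of Python (the original was in MatLab; function names and the use of the 1/20 range as the Gaussian noise scale are illustrative assumptions).

```python
import random

def mutate_envelope(env):
    """Mutate one envelope's breakpoints with a noise scale adapted
    to that envelope's own amplitude range (the '1/20 rule'), so a
    quiet high-harmonic envelope is not swamped by noise sized for
    a loud fundamental."""
    spread = max(env) - min(env)
    scale = spread / 20.0
    return [p + random.gauss(0.0, scale) for p in env]

def mutate_analysis(envelopes):
    """Apply the per-envelope adaptive mutation to a whole analysis file."""
    return [mutate_envelope(env) for env in envelopes]

# a loud fundamental and a much quieter upper harmonic each receive
# mutations proportionate to their own range
mutant = mutate_analysis([[0.0, 900.0, 400.0, 0.0],
                          [0.0, 12.0, 5.0, 0.0]])
```

The important difference from the first attempt is simply that `spread` is recomputed per envelope rather than taken once from the loudest harmonic.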
Figure A.3: Example of a nicely mutated amplitude envelope
Figure A.4: Example of a nicely mutated frequency envelope
However, there were still problems with this scheme. Figure A.5 shows a mutated amplitude envelope that is very noisy. This noise is undesirable, and it seemed to occur when breakpoints in the envelope were clustered close together. Although this problem could be rectified with a more complex mutation scheme (for example, one that filters the noisy envelope), due to time constraints this was not possible. As most envelopes were being mutated well, it was decided to press on and experiment with evolution using this mutation scheme.

Figure A.5: Example of a badly mutated amplitude envelope

After the mutation procedure had been implemented in MatLab, the rest was easy. Evolution was performed 'manually' by first mutating an analysis file in MatLab, then using Csound to play the resulting sound. It would be a simple matter to construct a user interface that automated this task.

A.1.2 Results

For this experiment, evolution was conducted with three main samples: an organ, a bass guitar and a voice sample. The results were very disappointing. The problem was that all mutants sounded the same: for a given parent with two mutant children, it was impossible to tell the two children apart. The obvious solution was to increase the variance from 1/20 to a larger factor. However, this didn't help matters at all. No matter which mutants you chose, the sample just ended up sounding as if it were being played underwater - the result was the same for the organ, the bass guitar and the voice sample. When smaller variances were tried, not only were mutant children indistinguishable from each other, but you couldn't even tell the difference between the parent and the mutant child!

Was this due to the badly mutated envelopes described before (figure A.5)? Certainly, the underwater, bubbling sounds produced could have been a result of rapidly oscillating, noisy envelopes. But even if this were not the case, I suspect that you still would not be able to tell the mutant children apart. The problem is really the increased number of dimensions in the sound space. In EvoS7, there were about 20 to 30 parameters being mutated at a given time. Here, there are 20 envelopes (10 for amplitude and 10 for frequency), each with 100 to 200 breakpoints, all being mutated at once. This is a total of around 2000 parameters - a sound space with 2000 dimensions. The human ear just cannot detect small differences in this space, so a human user cannot possibly hope to choose a direction and follow it.

To test this theory, a small experiment was conducted with a very simple set of amplitude and frequency envelopes. Four envelopes were used (2 amplitude and 2 frequency), each with four breakpoints. This gives a total of 16 points - a sound space with 16 dimensions. These were mutated and evolved using exactly the same procedure as above. It was found that in this case you could tell the difference between mutant children. In fact, it was quite easy to control the evolution and steer it in the direction you desired. It seemed that with 16 dimensions evolution worked fine, but when this was scaled up to 2000, it fell apart. Perhaps this result can be accepted as evidence that the technique does not work when the number of dimensions is increased much above 20 or 30.
A.1.3 Conclusion

As no useful results were obtained from this experiment, the idea of mutating samples had to be scrapped - time was running out. This was a shame, because if the idea could be made to work it would be great. Possibly all it needs is a more complex mutation procedure (maybe one that involves filtering to get rid of the noisy envelopes), but I suspect not. More likely, a way must be found of reducing the dimensionality of the space to about 20 parameters. This might be achieved by swapping envelopes between harmonics instead of mutating each point in every envelope.

It is also a shame to drop this idea because mating samples would have been fairly easy to implement: to mate two samples, you could just randomly swap some envelopes of one sound for envelopes of the other.
Bibliography

[Bäc95] Thomas Bäck, Evolutionary Algorithms in Theory and Practice, Oxford: Oxford University Press, 1995.

[Boo87] Michael Boom, Music through MIDI, Redmond, Washington: Microsoft Press, 1987.

[Cho73] John M. Chowning, "The Synthesis of Complex Audio Spectra by means of Frequency Modulation", Journal of the Audio Engineering Society, Volume 21, Number 7, pp. 526-534, 1973.

[Cso98] Csound - sound synthesis software. Csound is free for educational and research purposes. The latest version is available via ftp from ftp.math.bath.uk in the /pub/dream directory. Other Csound information and resources can be found at the Csound Front Page, http://www.leeds.ac.uk:80/music/Man/c_front.html

[CE95] Kai Crispien and Tasso Ehrenberg, "Evaluation of the 'Cocktail Party Effect' for multiple stimuli within a spatial auditory display", Journal of the Audio Engineering Society, Volume 43, Number 11, pp. 932-941, November 1995.

[CH96a] Ngai-Man Cheung and Andrew Horner, "Group Synthesis with Genetic Algorithms", Journal of the Audio Engineering Society, Volume 44, Number 3, pp. 130-147, March 1996.

[CH96b] San-Kuen Chan and Andrew Horner, "Discrete Summation Synthesis of Musical Instrument Tones using Genetic Algorithms", Journal of the Audio Engineering Society, Volume 44, Number 7/8, pp. 581-592, July/August 1996.

[Daw86] Richard Dawkins, The Blind Watchmaker, Harlow: Longman Scientific & Technical, 1986.

[Daw88] Richard Dawkins, "The Evolution of Evolvability", in Artificial Life, Christopher Langton, Editor, Addison-Wesley, 1988.

[DJ85] Charles Dodge and Thomas A. Jerse, Computer Music: Synthesis, Composition and Performance, New York: Schirmer Books, 1985.

[Goc96] Arno van Goch, "P-Farm", an experimental version of this program is available from http://www.xs4all.nl/~avg/pfarm.html

[Goc98] Arno van Goch, Personal Communication, August 1998.

[Gol89] David E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Reading, Massachusetts: Addison-Wesley, 1989.

[HA96] Andrew Horner and Lydia Ayers, "Common tone adaptive tuning using genetic algorithms", Journal of the Acoustical Society of America, Volume 100, Number 1, pp. 630-640, July 1996.

[HB96] Andrew Horner and James Beauchamp, "A genetic algorithm-based method for synthesis of low peak amplitude signals", Journal of the Acoustical Society of America, Volume 99, Number 1, pp. 433-443, January 1996.

[HBH93] Andrew Horner, James Beauchamp and Lippold Haken, "Machine Tongues XVI: Genetic Algorithms and Their Application to FM Matching Synthesis", Computer Music Journal, Volume 17, Number 4, pp. 17-29, Winter 1993.

[HG91] Andrew Horner and David E. Goldberg, "Genetic Algorithms and Computer Assisted Music Composition", Proceedings of the Fourth International Conference on Genetic Algorithms, Richard K. Belew and Lashon B. Booker, editors, San Mateo, California: Morgan Kaufmann Publishers Inc., 1991.

[Hol75] John H. Holland, Adaptation in Natural and Artificial Systems, Ann Arbor: The University of Michigan Press, 1975.

[Hor95a] Andrew Horner, "Wavetable Matching Synthesis of Dynamic Instruments with Genetic Algorithms", Journal of the Audio Engineering Society, Volume 43, Number 11, pp. 916-931, November 1995.

[Hor95b] Andrew Horner, "Envelope Matching with Genetic Algorithms", Journal of New Music Research, Volume 24, Number 4, pp. 318-341, December 1995.

[Lev93] Steven Levy, Artificial Life: The Quest for a New Creation, London: Penguin, 1993.

[Mat98] MatLab computing software, The MathWorks, Inc. See http://www.mathworks.com/products/matlab/ for more information.

[Moo94] Jason H. Moore, "GAMusic 1.0", available via ftp from fly.bio.indiana.edu in the /science/ibmpc/ directory.

[Moo98] Jason H. Moore, Personal Communication, May 1998.

[Mus98] Music Machines, website at http://machines.hyperreal.org

[Ope88] Peter Oppenheimer, "The Artificial Menagerie", in Artificial Life, Christopher Langton, Editor, Addison-Wesley, 1988.

[Pre92] Jeff Pressing, Synthesizer Performance and Real-Time Techniques, Madison, Wisconsin: A-R Editions Inc., 1992.

[Sim91] Karl Sims, "Artificial Evolution for Computer Graphics", Computer Graphics, Volume 25, Number 4, pp. 319-328, July 1991.

[Sim93] Karl Sims, "Interactive evolution of equations for procedural models", The Visual Computer, Volume 9, Number 8, pp. 466-476, 1993.

[Smi91] Joshua R. Smith, "Designing Biomorphs with an Interactive Genetic Algorithm", Proceedings of the Fourth International Conference on Genetic Algorithms, Richard K. Belew and Lashon B. Booker, editors, San Mateo, California: Morgan Kaufmann Publishers Inc., 1991.

[Smi94] Joshua R. Smith, Evolving Dynamical Systems with the Genetic Algorithm, Honors Thesis, Williams College, Williamstown, Massachusetts, 1994.

[Wil88] Scott R. Wilkinson, Tuning In: Microtonality in Electronic Music, Milwaukee: Hal Leonard Books, 1988.

[YH97] Jennifer Yuen and Andrew Horner, "Hybrid Sampling-Wavetable Synthesis with Genetic Algorithms", Journal of the Audio Engineering Society, Volume 45, Number 5, pp. 316-330, May 1997.