This post is my attempt at explaining my own philosophy of sound design. It’s not in final form, and subject to amendment in the future.
The type of sound design I refer to is specific to my own practice: the creation of sounds for electronic music, especially experimental music, and especially music produced using pure synthesis as opposed to recorded or sampled sound. The ideas presented here might have broader applications, but I have no delusions that they’re in any way universal.
The theory I expound on isn’t in the form of constraints or value judgements, but rather a set of traits. Some are general, some specific, and my hope is that considering how a sound or a piece relates to these traits will take me (and possibly you) in some directions not otherwise considered. Some of the traits contain assertions like “if your sound does X, your audience will feel Y,” which in a composition may be employed directly, carefully avoided, or deconstructed. You’ll also note that the theory is concerned mostly with the final product and its impact on the listener, not so much the compositional or technical process. (Friends and collaborators are well aware that I use highly idiosyncratic and constrained processes, but those are less about creating music and more about “creating creating music.”)
No theory of sound design will replace actually working on sound design. Sound design isn’t a spectator sport, nor a cerebral exercise, and it has to be practiced regularly like a musical instrument. Reading this post alone is unlikely to make you a better sound designer, but if it’s a useful supplement to time spent in the studio, I’d consider this work of writing a success.
I will deliberately avoid talking about the topics of melody, harmony, counterpoint, consonance vs. dissonance, and tuning systems, and I’ll only talk about rhythm abstractly. There are many existing resources dedicated to these topics in a wide variety of musical cultures.
It is hard to overstate the perceptual importance of the initial transient of a sound object. A tiny click or burst of noise added to an attack, or a fade in, can make a huge difference.
Some questions to ask about a sound include: how sharp is the attack? Are there multiple discrete transients? Do parameters such as timbre, pitch, etc. change at the transient? Which direction? How fast?
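To make the first question tangible, here’s a minimal Python sketch that mixes a short, decaying noise burst into the attack of a sine tone. The sample rate, click length, and gain are arbitrary assumptions for illustration, not recommendations:

```python
import math
import random

SR = 44100  # sample rate, an assumption for this sketch

def tone_with_click(freq=220.0, dur=0.5, click_ms=5.0, click_gain=0.5):
    """Sine tone with a short noise burst mixed into the attack."""
    n = int(SR * dur)
    click_n = int(SR * click_ms / 1000.0)
    out = []
    for i in range(n):
        s = math.sin(2 * math.pi * freq * i / SR)
        if i < click_n:
            # Linearly fade the noise burst out over the first few ms.
            env = 1.0 - i / click_n
            s += click_gain * env * random.uniform(-1.0, 1.0)
        out.append(s)
    return out
```

Rendering the same tone with and without the burst (or with different `click_ms` values) is a quick way to hear how disproportionately the first few milliseconds shape the identity of the whole sound.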
Contrast refers to not just differences in musical parameters such as pitch, register, brightness, transient, modulation speed and depth, etc. but also how quickly these things change.
Lack of contrast will put the audience into a more passive state of listening. Changes that are very frequent and/or similar to each other will be perceived as a texture. In between these two extremes, contrast will catch the listener’s ear.
An obvious axis of regularity is steady tempo vs. random pulses. In modular and modular-adjacent environments (especially text-based languages) it’s tempting to think of regularity as a dichotomy between grid-based and completely random rhythms, but there is a whole world in between – slight humanization, acceleration and deceleration, steady rhythms with interruptions, etc.
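That in-between world is easy to explore parametrically. Here’s a small Python sketch (names and values are illustrative, not from any particular tool) that generates onset times anywhere between a strict grid and loose humanization:

```python
import random

def pulse_times(n=16, period=0.5, jitter=0.0, seed=1):
    """Onset times in seconds, from strict grid to humanized.

    jitter is the maximum offset as a fraction of the period:
    0.0 is a rigid grid, ~0.02-0.05 is slight humanization,
    and larger values drift toward randomness.
    """
    rng = random.Random(seed)
    return [i * period + rng.uniform(-jitter, jitter) * period
            for i in range(n)]
```

Sweeping `jitter` continuously (or modulating it over time) gives a whole family of rhythms that neither a quantized sequencer nor a pure random source would produce.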
Regularity also happens at the microsound level; silencing every other pitch period of an oscillator evokes a radically different effect from randomly silencing each period at a 50% probability.
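The two gating schemes can be sketched in a few lines of Python (the sample rate and parameters are assumptions for illustration):

```python
import math
import random

SR = 44100  # assumed sample rate

def gated_tone(freq=220.0, periods=100, mode="alternate", seed=0):
    """Sine tone where whole pitch periods are silenced.

    mode="alternate" silences every other period, producing a
    regular, pitched (roughly octave-down) effect;
    mode="random" silences each period with 50% probability,
    producing an irregular, noisier texture.
    """
    rng = random.Random(seed)
    samples_per_period = int(SR / freq)
    out = []
    for p in range(periods):
        if mode == "alternate":
            keep = p % 2 == 0
        else:
            keep = rng.random() < 0.5
        for i in range(samples_per_period):
            s = math.sin(2 * math.pi * i / samples_per_period)
            out.append(s if keep else 0.0)
    return out
```

Rendering both modes back to back makes the point audibly: the regular version reads as a new pitched timbre, while the random version smears the same energy into noise.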
While contrast refers to all forms of change, directionality is about reaching a goal. Reverse cymbals and risers leading up to a downbeat are classic examples.
Directionality can even happen in the absence of change. In music with a steady pulse, our ears are conditioned to anticipate meter and hypermeter. For many listeners, a 4/4 beat with 4-bar groupings will suggest a new section at the end of 16 bars, even if there isn’t particularly strong change.
An offshoot of directionality is axiality – motion in opposite directions, either simultaneously or sequentially. (I borrowed this high-falutin’ term from Brian Ferneyhough.) Note that sequential axiality need not be continuous or shaped like a triangle wave; there can be a sudden jump along with the reversal of direction (e.g. 4 3 2 1, then 4 5 6 7).
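The numeric example above can be written as a tiny, purely illustrative Python helper:

```python
def axial(start, low, high):
    """Sequential axiality with a jump at the reversal:
    descend from start to low, then jump back to start
    and ascend to high."""
    down = list(range(start, low - 1, -1))
    up = list(range(start, high + 1))
    return down + up
```

`axial(4, 1, 7)` yields `[4, 3, 2, 1, 4, 5, 6, 7]` — the descent, the sudden jump, and the reversal of direction, all in one gesture.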
Failing to properly command directionality is a perennial issue in my own music, especially when I use algorithmic techniques. If you’ve ever written a patch and wondered, “but how do I turn it into a piece with a beginning, middle, and end?” you may be having directionality issues. However, this isn’t advice, and I recommend speaking to a licensed clinical musicologist.
This one’s fairly obvious, but I will point out that there are more ways to evoke space than just reverb in post. There are also long decay times, frequency coloration, a wide multichannel image, amplitude modulation of individual partials, complex resonances that mimic those of physical instruments, and chorus-like detuning. Reverbs themselves have tremendous variety outside of the standard modulated Lexicon/Alesis sound, with many interesting “bad” reverbs, both acoustic and simulated.
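One of the non-reverb cues above — amplitude modulation of individual partials — can be sketched with simple additive synthesis. Everything here (sample rate, partial count, AM rates and depths) is an arbitrary assumption chosen to make the shimmer audible:

```python
import math

SR = 44100  # assumed sample rate

def shimmering_partials(freq=110.0, n_partials=6, dur=1.0):
    """Additive tone where each partial's amplitude is modulated
    at its own slow, slightly different rate, so the partials
    drift in and out of balance like a resonant body in a room."""
    n = int(SR * dur)
    out = [0.0] * n
    for k in range(1, n_partials + 1):
        am_rate = 0.3 + 0.17 * k  # slow, mutually inharmonic AM rates
        for i in range(n):
            t = i / SR
            am = 0.5 + 0.5 * math.sin(2 * math.pi * am_rate * t)
            out[i] += am * math.sin(2 * math.pi * freq * k * t) / k
    return out
```

Compared with a static additive tone, the independently breathing partials suggest air and distance without a single reflection being computed.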
The human brain is absurdly good at speech processing, and even vague and crude imitations of speech can instantly grab the ear. Vocal formants automatically stand out even when not specifically emphasized, and if they fail to sound natural or human, they can stand out even more.
Fidelity is more culturally bound than most of the traits here, but many forms of reduced audio quality are instantly recognizable. Examples of reduced fidelity include waveshaping distortion, sharp filters, noise, vinyl crackle, wow and flutter, aliasing, bitcrushing, MP3 chirps, and dynamic range compression. Not all low-fidelity signifiers are downstream effects – techniques such as chiptune arpeggios, timbres considered cheesy or cliché, etc. can recall older eras of music and technology as well.
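Bitcrushing, for one, reduces to a single line of quantization. A minimal sketch (real bitcrushers usually add sample-rate reduction and other artifacts on top of this):

```python
def bitcrush(samples, bits=4):
    """Quantize samples in [-1, 1] to the given bit depth."""
    levels = 2 ** (bits - 1)
    return [round(s * levels) / levels for s in samples]
```

At low bit depths the quantization staircase adds the harsh, correlated distortion that instantly reads as “digital” and “old” to most listeners.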
This belongs to a cluster of terms like “arrangement,” “layering,” etc. but I ended up with “verticality” because it’s more specific and concrete than those. Verticality is a measure of change that happens simultaneously across all layers as opposed to individual parts.
If a piece mostly comprises changes to individual layers, a sudden global change will attract the listener’s attention. Steve Reich’s mid-era pieces, starting with “Music for Mallet Instruments, Voices, and Organ,” often feature evolving repetition with a small number of sudden, global changes.
An influence here is Disney’s 12 Principles of Animation. The 12 Principles are a great read for any electronic musician, as many of them apply beautifully to the design of modulation signals, a topic I plan on writing about more eventually.
For synthesis, getting past the stage of merely good patches and progressing to thinking about mid- and high-level traits like these is a sort of “coming of age” for sound designers. I feel that only recently (in the past year or so) something clicked for me and I went from designing patches to composing music. The technical process of synthesis has gradually come to feel more automatic, and I can concentrate more on what I’m trying to say with the sounds. Writing this little manifesto is a fun exercise to clarify my thinking – if writing is your thing, I encourage you to try writing your own theory of electronic music. Send it to me, also.