Introduction

My earliest memory of playing video games begins in the Mushroom Kingdom, the virtual world in which the game New Super Mario Bros. Wii [2009] takes place. When I launched the game, a joyful piano tune greeted me with the Mario Theme, immediately pulling my attention into the fantasy world. From a grooving melody that inspired running to musical hits where every enemy in the game would take a moment to dance, I remember dashing recklessly through each level, propelled forward by shakers and tambourines, until I entered my first enemy castle. My steps immediately slowed to the sound of snares, an organ, and a cello melody, my whimsical sprint replaced by warning bells that indicated danger ahead. Instead of merely witnessing this new environment, I felt one with my small red character, felt that I was in danger just by entering the fortress. Shortly afterwards, I would obtain my first "star power", becoming invincible and sprinting forward at breakneck speed to the groove of an electric bass and synth pad. At the time, I didn't know why I was running, but I immediately felt the urge to move and make progress.

I entered the video game space at a time when some game scores were composed entirely of MIDI instruments (electronically reproduced samples of real instruments) while others were incorporating recorded orchestras in their soundtracks. For example, Nintendo's New Super Mario Bros. DS [2006] uses many synthesised versions of real instruments, including drum sets, organs, and orchestral instruments. In Super Mario Galaxy [2007], released one year later, most tracks are played by a combination of a recorded symphony orchestra and layered synths. As a composer, I find that these grand orchestral scores hold the greatest influence on my compositional style, their music often featuring conspicuous ostinatos, clear thematic material with character and setting associations, and melodies that unfold over an extended period of time. As I've developed as a musician, composer, and gamer, my fascination with these musical features of narrative video games has only grown. Why did exploring a snowy mountain accompanied by cello melodies and glockenspiel ostinatos make me feel cold and cosy? What urged me to sprint through some levels, yet tread slowly and deliberately through others? How could I spend hours exploring under the sun of a game's fantasy world, then look outside and realise the moon was already high in the night sky?

Of course, not all game tracks are designed to represent the same environment; the sonic and visual atmospheres often change drastically throughout a game. While lush forest themes (such as Through the Forest from Kirby's Return to Dream Land (Return to Dream Land) [2011] or Moss Grotto from Hollow Knight: Silksong (Silksong) [2025]) might feature winds and plucked instruments such as banjos or harp, dark mechanical scenes (such as Underworld from Return to Dream Land or Underworks from Silksong) utilise more prevalent reverb and low, densely voiced frequencies. When travelling from one game environment to another, I would immediately feel this shift in atmosphere despite my bedroom remaining the same temperature, the same brightness. How could I be so drawn into a story on a screen that my heart would start racing if my character was in danger? Why would listening to a game's music track bring me to tears months after having finished the game? These questions, which I ask both as a composer inspired by evoking scenes of the natural world and as a player who has been on the receiving end of such music, have led me to wonder about the composer's potential for shaping the player's experience within specific game environments. In this thesis, I will investigate what musical techniques composers of music for narrative video games might employ to transport the player into the game world, a concept known as immersion. How do these composers invite listeners to not only picture forests, but also feel as if they've stepped into an enchanted woodland without ever leaving their desk?

More broadly, what is music's role in creating atmospheres and representing space through sound? What tools are available to composers who seek not only to convey emotion through music, but also to place an image in the listener's mind, to invite the audience into a new realm of their making? While there is an interesting psychological analogue to this question around quantifying the extent to which music boosts player immersion, this thesis focuses on the role of the composer in this process. Using narrative video games as a medium with clear environments and narrative experiences, this thesis investigates how composers work to augment player immersion, this sense of being fully transported into a new world.

Video Game Music

Developing Technology

Inseparable from video games since their birth in arcades is sound design. Someone walking into an arcade is greeted by the "waka" sound of Pac-Man's travels through a digital maze, or the science fiction "whoosh" of a fired laser in Galaga as a small, pixelated spacecraft fights aliens. Players would visit arcades and experience games as part of a collective "gaming" community, playing games not as a way of experiencing a meaningful story, but simply as a mode of entertainment centred around completing game-specific tasks. For the earliest video games (from arcade systems in the 1970s through Nintendo's Nintendo Entertainment System (NES) released in 1983), sound chips were limited both in how many audio channels they could play at once and in how much physical space they could take up within game consoles [Burke 2024: 164]. At first, these chips were capable of outputting only pure electrical signals created in the same way the timbres of "synths" are made. Music written with this limited instrumentation is sometimes referred to as "8-bit music" or "chiptune". Due to this limited sound technology, composers were forced to adapt to writing for many timbres that hadn't been played before in live performances, creating an audio space saturated with the beeping of synths that has become synonymous with the sound of early video games.

As sound chips developed, the range of instrumentation available to video game composers grew in turn. Playing samples of live instruments became possible, as can be heard in the scores to many games on Nintendo's Super Nintendo Entertainment System (SNES) (1990), which was one of the earliest game consoles to feature music that explored a wider range of textures and timbres [Burke 2024: 170-74]. This allowed composers to shift away from relying solely on synthesisers towards sound worlds without instrumental limits, including a full symphony orchestra or jazz big band. Alongside sound chip development, graphics cards and video game culture were also evolving, leading to an expansion of games beyond the single-task arcade games. When playing these games, players would now find themselves immersed in realistic skies or caves accompanied by music composed to represent these new environments.

This expanded compositional opportunity can be seen in a comparison of two tracks from Nintendo's "Kirby" game series. The track Green Greens from Kirby's Dream Land [1992] on the Nintendo Game Boy by Jun Ishikawa contains only synthesised audio. Working within these early limitations in instrumentation, Ishikawa makes excellent use of the available technology, focusing on creating clear melodic, rhythmic, and harmonic lines. Listening to the track, one can easily discern the high-pitched synth which carries the melody from the staccato waveforms and lower register arpeggios that create rhythm and harmony. While some tracks in this game utilise a sound channel dedicated to the bass line, here Ishikawa writes a more complicated texture, leaving no channels free for the bass alone (of the Game Boy's four sound channels, one is used for the melody, one for the rhythm, one for embellishing counter-melodies, and one for the harmony and bass). Instead, he uses the harmony line to jump to low pitches on strong beats, creating an implied bass line. Throughout this music's two sections, it is always possible to discern each individual line, but the limitations of the technology mean that there is no significant timbral change throughout the track.

Nineteen years after the release of Kirby's Dream Land, Ishikawa worked on the score to Kirby's Return to Dream Land, where the track Green Greens returns. This time, the range of instruments is greatly expanded. The rhythm section is no longer one snare-like synth, but now consists of congas, shaker, snare, timbales, and crash cymbal. Along with this percussion expansion, Ishikawa's rewrite of Green Greens features percussion breaks where all instruments except the rhythm section drop out. An electric bass plays a funk bass line, the melody is passed between a flute, clarinet, and synth, and the harmony, played by a handpan and synths, is less prominent in the mix. Despite growing sound chip technology and the ability to incorporate the timbres of physical instruments, Ishikawa still features synths not only in the harmony, but also as an instrument carrying the melody for primary sections of the track. These synths were not a temporary timbre used in early games that was abandoned the moment technology improved, but have foundational ties to the roots of video game music, an association that many composers of music for narrative games still draw upon forty years after the release of the NES.

Storytelling and Narrative

Video game music is ultimately created to accompany a game, which itself has many other elements. In this thesis, I will focus on the music in "narrative games": games that feature cutscenes (moments in the game where the player watches a story unfold without performing any inputs for a period of time) and situate the player's experience of the story as a central component of the gameplay experience. In narrative games, music may play a different role than in other games where the objectives might be solving puzzles or battling other players. Many games are not fully situated within one category or game genre, but instead incorporate many elements as needed: a narrative game might involve puzzles, battles, and building segments. I will discuss games where the narrative aspect is an essential component of the player's experience: while the game may incorporate these other gameplay elements, the player is largely driven by an urge to progress through the story.

In many media, the audience is an external observer of a story, often growing invested in the characters and narrative, but never directly interacting with the film or literature. Narrative video games offer a unique shift in this method of storytelling, allowing the player to make decisions that have direct consequences on the story. As Daniel and Sidney Homan write, video games are a way of "sharing stories" rather than "telling" stories [2014: 169]. The listener does not just conceive of the story in their mind; the narrator places them into the story world, turning the tale into a dialogue between storyteller and audience, a unique experience for each individual.

Narrative video games take place within sets of immersive environments which themselves have evolved with developing game technology. American media scholar Henry Jenkins proposes that as opposed to telling stories linearly (like novels or films), video games engage in environmental storytelling in the same way that real world immersive environments such as amusement parks tell stories [2004]. In an amusement park, park goers step into a fantastical realm where all of their surroundings are tailored towards creating an experience. The world outside the theme park fades, replaced by a real fantasy where every environmental element is specifically designed to keep the audience immersed in the fiction. One of the earliest narrative games that aimed to create a semblance of this sensation was Zork, a text-based game released in 1977 with no visual graphics that allows the player to experience a story by reading and inputting short replies to lines of text. Since Zork, games have developed to tell stories in ways more similar to cinema, often beginning with a cutscene that grounds the narrative before the player gains any control of their character. These narrative games tell stories in fictional worlds within which the player may become immersed, suspending the player in an imagined world in much the same way as amusement parks surround park goers.

As the number of narrative games grew, the music accompanying these games evolved from early, classical-music-inspired game scores. Composers began to emulate film music, incorporating thematic ideas and timbres that bring a cinematic feel to the game [Garner 2024]. One game with such themes is Ori and the Will of the Wisps (Will of the Wisps) [2020], which incorporates its main theme into emotional cutscenes, such as the one in which a friend is brought back to life, accompanied by the track Ori, Embracing the Light. In these narrative games, the music is not a passive auditory component of the medium, but is meant to draw the player into the scene, conveying the story's emotional connotations by inviting the player to take part in a narrative experience. In the past fifty years, video games have evolved from solely a gameplay-based medium akin to the arcade to an emotional art form similar to the cinema, emotionally involving the player in the story and allowing them to be immersed in the game as a narrative beyond just the mechanical play elements [Garner 2024].

Elements of Video Game Sound

Rather than being defined by typical stylistic aspects, "video game music" is better understood by function: the term is broadly used to describe any music written for a video game, just as the term "film music" describes any music written for a film, whatever its style. Video game music is sometimes written within a pre-existing genre, such as the Cuphead [2017] soundtrack by Kristofer Maddigan, which is almost entirely jazz inspired by big band, ragtime, and barbershop quartets.[1] Just as often, though, video game music blends influences from a multitude of music styles. For instance, the track Master Kohga Battle from The Legend of Zelda: Tears of the Kingdom [2023] showcases a diverse mix of styles: instruments that may point to traditional Japanese music such as 太鼓 (taiko), 箏 (koto), and 三味線 (shamisen) are heard alongside ones reminiscent of metal styles: synths, electric guitars, electric basses, and a drum set. This music steps beyond the traditional notion of genre, often resulting in greater analytical complexity. As a result, several scholars have built methodologies that provide additional techniques that may be helpful when thinking about this music, discussed at the end of this introduction.

There are many similarities between video game music and music for film, though a primary distinction comes from the nonlinearity of video games. Music written for either medium cannot be placed within a single genre, often features orchestral instrumentation and, in the case of narrative games, supports the audience's experience of following the story. Due to these similarities, many composers (such as Christopher Larkin, Lorne Balfe, and Michael Hoenig) specialise in writing music for both film and video games. The main difference in how music is heard in these media lies in the nonlinear nature of games. Whereas a film composer has complete control over what music accompanies each visual frame, video game composers have no way of predicting exactly what scene the music will accompany as players explore the narrative in any way they wish [Collins 2008: 4]. This nonlinearity means that the time at which the player hears a specific part of the music is only partially determined by the composer: the music responds to the gameplay, not to the director's will [Summers 2016: 37]. Whereas a film composer meticulously syncs music to visual scenes, the game composer instead writes a number of music tracks that will be triggered by distinct narrative cues throughout the gameplay.

In the context of video game music, this collection of all music tracks heard throughout the game is called the "score". The number of tracks can range from just one to well over a hundred, but is typically around 30–60. While within the game these tracks might loop as long as the player is within a given region (their beginning and ending times are determined more by the player's pace than by the composer's discretion), the published tracks are often shorter, one- to ten-minute pieces that can be listened to in isolation from the game. These can be the music from a title screen or world screen, or the tunes that play in each "level" (a distinct segment of the game where the player remains in a similar environment while pursuing a particular quest). For open world games that feature a range of environments and narrative arcs, these tracks are likely to be tied to different environments, boss battles, or cutscenes. In general, every time a game undergoes a significant shift in atmosphere or setting, a new music track will be used: this could be entering a level, obtaining a new ability, or clicking pause and exiting to the title screen.

In addition to the score, the sonic space of a video game is often saturated with many nonmusical elements such as atmospheric sound or gameplay-related sound effects. These effects can be heard clearly in the game Will of the Wisps, where many sound effects are tied to the player's actions, such as an attack sound, a jump sound, and a dash sound. In addition, the game features detailed sound design tied to the unseen elements beyond the screen: distant birds chirping while in a forest, atmospheric wind rustling trees, the rushing of waterfalls. While this sound design plays an important role in the creation of a virtual world and the immersion of the player, it is not a primary focus of this thesis, which concentrates on the composed score.[2]

Ludomusicology

Ludomusicology is the subfield of musicology dedicated to studying the relationship between music and play. The field was largely established in 2008 by Karen Collins' Game Sound: An Introduction to the History, Theory, and Practice of Video Game Music and Sound Design [2008]. Since then, ludomusicologists such as Mark Grimshaw, Kiri Miller, and William Gibbons have contributed to the field by exploring topics such as the connection between game music and technology, gamer ethnography, and psychological and phenomenological approaches to sound in games [Kamp, Summers, Sweeney 2016: 1-2]. With this rise in publications, ludomusicology has shifted from a field that initially inspired scepticism to one now widely accepted by musicologists [Fernández-Cortés, Cook 2021: 15]. In 2012, Michiel Kamp, Tim Summers, and Mark Sweeney founded the Ludomusicology Research Group that holds an annual conference for the field. Since its establishment, a number of issues central to ludomusicology have emerged that address the analysis of game music, the links between game interactivity and sound, and "the relationships between game music and art music traditions" [Kamp, Summers, Sweeney 2016: 2-3]. Summers lists fifteen questions he challenges himself to answer throughout his work [2024: 17-35], some of which are important considerations for this thesis, such as authorship, the role of the player in both shaping and experiencing the game, and the ties between game imagery and sound.

One question that Summers poses is "How does your work engage with issues of authorship, and the activity / power attributed to various agents in the game production? How do you negotiate between using composer/creator testimony and the trap of the intentional fallacy?" [2024: 21]. Summers describes how it is often very difficult to point to a single composer as the sole author of a piece of music in a video game. Sometimes, a large team works on the sound design, and audio creators may be left completely uncredited, making it difficult to attribute music to any single author. In addition, how should the artistic direction of the game's director, who has creative input on the end result of any audio, be accounted for? Summers also warns about falling into the trap of the intentional fallacy: "using the author's intention as the primary means of judging and interpreting the artwork" [21]. The driving question of this thesis is how music composed for narrative video games enhances immersion for gamers. The questions Summers raises, however, indicate that thinking of a single composer as creating the score to a game is often not entirely accurate. When discussing a given track, it will be important to account for not only composer intent, but also the impact of musical features on the qualities of immersion.

In her guide to writing game music, composer Winifred Phillips discusses the team-based quality of creating a video game soundtrack. She describes the different roles needed to produce a game's audio, though the actual filling of these roles varies based on game size and budget (sometimes, several tasks are performed by one individual). A music director oversees the many separate elements that must eventually combine, and a producer communicates with the composer and the game development team to keep the art aligned with the project. In addition, an audio director oversees the combination of dialogue, music, and sound effects in the game, working with a sound designer who collaborates with the programmers to tie the game's audio into the gameplay [2014: 135-44]. Phillips' description of the process affirms that analysing video game music is more complicated than identifying the composer's intent or style; an analysis must more broadly account for the game's objective for the sound in a given scene, and the team's role in creating that overall sound. For games with smaller budgets, many of the music and audio roles listed above may fall solely to the composer, requiring proficiency with audio mixing and/or mastering [Phillips 2014]. As a video game composer collaborates with the game developer and design team, they develop skills beyond composition to ensure that the music and sound are perfectly integrated with the rest of the project.

One last question that Summers poses concerns the role of the player in video game music: "How is the authorial agency of the player accounted for in your work? What is omitted from a single perspective on the musical source?" [2024: 21]. As Summers writes, the player is in some ways performing the music that they are hearing. Controller inputs may dictate when the game music starts and stops, much as a musician controls the beginning and end of a sound with a bow on a string or with a breath. At the same time, the player is a listener to the music. How is the player's experience of the score shaped by their role as both performer and listener? By investigating the link between player immersion and game immersivity, this thesis aims to contribute to the branch of ludomusicology concerned with the connection between sound and phenomenology. Drawing on scholarship that analyses music's relationship to topics such as interactivity and attention, later chapters will discuss how composers use musical representation to target the game-theoretical concept of immersion.

Methodologies for Analysing Video Game Music

Since video game music is described less by common features and more by function, many methodologies have been proposed for thinking about it, including Summers' "Methods of Analysis" [2016: 33-53], Sean Atkinson's case study analysing music that represents the experience of flying [2019], and Kamp's Four Ways of Hearing Video Game Music [2024]. All of these methodologies highlight the fluidity of analysis required in approaching video game music.

Summers writes that while an analyst may approach this music with established tools (such as harmonic analysis or thematic development), many of these techniques can be complicated by elements unique to game scores. For instance, video game music often features many looped and layered game cues, making chord analysis needlessly complex and not very useful for understanding how the music is functioning [2016: 39]. Atkinson further highlights the importance of incorporating the game's narrative in music analysis, writing that "analyzing out of context can lead to interpretations that do not coalesce with the way the music is used in game" [2024: 57]. To illustrate this, he examines the music of flying sequences in Final Fantasy IV and The Legend of Zelda: Skyward Sword [2011] [Atkinson 2019]. In these case studies, Atkinson does not first set up the narrative, then separately discuss the music, but rather interleaves important game events with their associated musical cues followed by interpretations of those musical features as they relate to his theme of soaring.

Kamp provides a more general method for thinking about the functions of video game music, identifying four main ways of listening to music that accompanies different parts of the game: background music, aesthetic music, ludic music, and semiotic music. He describes background music as music that is meant to set the atmospheric foundation of the game. This is music that, even when paid attention to, does not give specific insight into the gameplay [2024: 65]. By contrast, aesthetic music does not inform the gameplay, but is instead meant to be noticed and to convey small pockets of beauty that enhance the playing experience [108]. Whereas aesthetic music invites pausing and taking a moment to appreciate the beauty of small moments, Kamp writes that ludic music urges the player to play. This is music that is closely tied to the gameplay, unfolding in parallel with the player's actions [110]. Last, semiotic music is music that directly communicates information needed for gameplay. Whereas the other three forms of music accompany the gameplay, semiotic cues directly aid the player who listens to them [143]. For example, in the Stubbornness section of Celeste's [2018] "Chapter 9: Farewell", there is a series of platforms that appear and disappear on the beat of a bell playing quarter notes. For the player, listening to the soundtrack is essential to being able to easily complete this level.

Summers, Atkinson, and Kamp all emphasise the importance of context when analysing video game music. These scores do not exist in isolation—to be heard on a concert stage with no other cues—but rather exist in an ecosystem flush with visual design, story, and emotion. With this in mind, in this thesis I will analyse the scores to games that I have played and loved,[3] games that have drawn me into their worlds so fully that the waking world would seem as a dream. There are many, many fantastic video game scores spanning a wide range of musical styles and artistic colours, and my aim has been to choose games that I feel are representative of this variety. As I analyse the music in these games, I will first establish the defining features of the environments or experiences that the music accompanies. Each example will then include a brief description of the narrative events accompanying the game scene interwoven with an analysis of how specific musical features representationally connect to perceived aspects of that environment.

In Chapter 1, I will first establish two concepts that are central to the thesis: immersivity and environments. What does it mean for a player to be immersed? What are the roles of the composer and game designer in building immersivity? What do I mean by a video game environment, and what components contribute to crafting such an environment? Chapter 2 describes the role of musical representation of the environment in building a game's immersivity. How can the music enhance the player's sense of place in the world, situating them within these fantasy environments? The game environments of the sky and the cave are used as case studies for analysing musical representation of the environment. Chapter 3 describes music's role in representing the player's experience. As the player progresses through the overall narrative of a game, they engage with a range of quests and phenomenological experiences. How does the music immerse the player in the journey of their fictional avatar? This chapter uses the experiences of adventuring and of conflict as case studies. The conclusion summarises the main findings of this investigation, discussing how musical representation of environment and experience enhances player immersion throughout the game narrative.