Immersion and Environment

A lone knight falls into a quiet cave, where all is still save for the distant scuttling of long-forgotten insects. Each footstep echoes in the cavernous space, blending with a far-away wind and the soft patter of water dripping from stalactites. Exploration reveals hints of an ancient civilisation: a door in the shape of a beetle's carapace, the ruins of abandoned structures and sign-posts. The knight enters the remnants of a village to the soft notes of a piano and a wind that remembers days of old, everything beyond the dim street lights shrouded in fog. A mere four buildings remain of what must once have been a prosperous settlement. Travelling deeper into the underground armed with only an old nail, the knight finds remnants of life—hostile insects that have minds only for malice. They dance to old forms of combat, memory of a warrior's movements returning to the rising tension of staccato strings and harmonic atmosphere.

A player controls this knight's every move with the manipulation of joysticks on a controller, their mind and body becoming one with the swing of the avatar's blade. Within minutes, the player finds themselves a part of this world, not only a witness to the cave, fog, and remnants of a civilisation, but a wanderer in this abandoned place. They have become immersed in this fantasy world, so connected to the knight they control and to the music and sonic atmosphere of the fictional environment that their consciousness is transported from the real world into the video game Hollow Knight [2017].

In this chapter, I will investigate what it means to be "immersed" in a narrative video game in order to build a sense of what composers work towards when scoring these games. After establishing what a video game "environment" is by beginning with an ecological basis for the term, I will explore how players experience these varied game settings. For scenes where player immersion is the goal, what is the music's role in building immersivity, and what features create the environment within which the player will ultimately become immersed? By establishing the concepts of immersion and environment, subsequent chapters will be able to meaningfully discuss how forms of musical representation target immersivity's constituent elements in specific environments and experiences.

Immersion

The concept of immersion is familiar outside academic settings. A sound system might advertise itself as having "immersive sound", or an artist might describe elements they add to a work as making their creation more "immersive". For example, Bose advertises their "Bose Immersive Audio" as "a sound that surrounds you" [Bose], and a content creator by the pseudonym "Scar" describes immersion in the game Minecraft as: "Making the space feel alive" [GoodVodsWithScar 2025: 7:05]. These colloquial uses of the term suggest that immersivity is supported by elements that either three-dimensionally encompass the player (such as audio panned to reflect the location of the sound source in the game) or hint at a living world beyond the user's direct field of view. How and why do these sonic elements lead to greater immersion? Is a three-dimensional, audience-surrounding atmosphere required for creating immersion? How is the highly individual experience of the player accounted for?

Though immersion is not very clearly defined, there is agreement that "to be immersed" is to feel as if one has entered the game world. As Janet Murray—professor of literature, media, and communication—writes, immersion is "the sensation of being surrounded by a completely other reality ... that takes over all of our attention" [1997: 99]. As a player engages with a game, immersion occurs when their perception of the real world is overshadowed by deep focus on the game and a sense of having shifted their consciousness from the real into the fantasy [Elferen 2016: 32-33]. While immersion may appear to varying extents in many types of video games, it is particularly common in narrative games within which the player fulfils a primary role in the progression of a story. Within these narrative games, many scholars agree that the music and sound design play a significant role in supporting immersion, ensuring that the player is anchored in the world and story as they engage with the ludic elements of the game [Summers 2016: 59].

In writing music that targets player immersion, composers primarily think about connecting the player's feelings to the world and narrative. This connection runs deeper than a composer setting out to simply make every player feel the same sense of "happy" or "sad", allowing each player to instead respond to the game uniquely. Describing how he scores emotional scenes in games, composer Gareth Coker explains: "I want the music to help you connect with whatever it is you are feeling when you're watching the scene ... for some people the music will feel happy and joyous, and you might feel happy tears, but for some it might feel like sadness. It depends on how you relate to the character" [Howard 2021]. The music attempts to bring the player closer to the game both cognitively and emotionally so that experiencing the narrative is an aesthetic, impactful experience rather than a passive, disconnected one. Coker gives the player space to interpret the game on their own, working only to build bonds between the player and story that will result in this more meaningful experience. To gain a better sense of this story, Coker plays through these games himself, writing that the process of playing through the game gives him "an idea of tempo, arrangement, pacing and weight" [Cornell 2021]. Reflecting on composing music for the game Hollow Knight, Christopher Larkin similarly describes how a primary compositional goal was crafting a sense of the game world's overall melancholy tone. Through the music, Larkin wanted to create "the feeling that something was here, and it is no longer" [Shamaly 2025]. This music conveys a more complete sense of the game world, communicating information that may not be gleaned from visuals alone: Larkin uses sonic atmosphere to immerse the player in a world of history beyond that seen on the screen. It is clear that composers are not attempting to manipulate or force the player into responding to the game in a fixed way. Instead, Coker and Larkin express wanting to induce "feeling" in the player that emerges as a result of being more connected with the game world. Through music, these game composers are able to create powerful ties between the human playing the game and the range of possible emotions created by the narrative, allowing each player to experience immersion uniquely.

Psychologists have attempted to quantitatively measure immersion in video games through questionnaires that gather information about qualities such as "emotional involvement, cognitive involvement, realism, sensory involvement, control, challenge, and social presence" [Dombrovskis, Ļevina, Ruža 2025]. One of the earliest studies was Witmer and Singer's "Immersive Tendencies Questionnaire", which attempted to quantify user presence in virtual environments, where presence is "the subjective experience of being in one place or environment, even when one is physically situated in another" [1998: 230]. The questions on their questionnaire were fairly direct, such as "Do you ever become so involved in a video game that it is as if you are inside the game rather than moving a joystick and watching the screen?" [1998: 234]. A later questionnaire titled the "Game Immersion Questionnaire" similarly attempted to quantitatively measure immersion in games by having participants rate statements such as "I often forget the passage of time while I am in the digital space," or "While I am in the digital space, it seems to me that everything that happens there, happens to me" on a scale from 1 to 5 [Cheng, She, Annetta 2015].

While these questions ask players very specifically about the quality of their immersion, an aesthetic quality of the immersive experience is lost by consciously reflecting on the gaming experience. Shifting from being immersed in a game to filling out a questionnaire creates a disconnect, because the player is forced to analyse that experience from the real world. Playing a game in the context of the questionnaire grounds the player in the real world, preventing a fully immersive experience from taking place: by trying to measure immersion, these questionnaires alter the immersive experience. While these attempts at quantifying immersion lend insight into how psychologists think about this phenomenon, the questionnaire as a method for quantifying immersion leaves room for improvement.

Music's ability to build immersivity is heavily influenced by the player's individual literacy in video game music. Tim Summers describes drawing upon this literacy as using "general musical signs and/or references to other media and cultural touchstones that are already well-established" [2016: 60]. Literacy acts as a shared musical language between composer and player—as a player plays narrative video games, they learn and recognise musical tropes that appear in specific scenarios. As Melanie Fritsch describes, this literacy consists of a combination of music encountered in other games, in human culture more broadly, and in the context of the game technology [2016: 96]. By drawing on these musical features to give specific connotations to distinct segments of the game, composers of narrative video game music can build player investment in the fictional world and connect the player to the game's story through affective ties. As the music keeps the player engaged with the game for an extended period of time, they begin to find themself immersed in the environments and narrative. This section describes these aspects of investment, affective influence, and sustained attention as connections between the player and game, building an understanding of the goals a composer might have when attempting to craft an immersive experience through music.

Investment

When a player sits down to play a game, they are engaging in a transition of consciousness from the "real" world into a fantasy. Murray metaphorically illustrates this transition, writing that "immersion is a metaphorical term derived from the physical experience of being submerged in water" [1997: 99]. Before immersion takes place, the player is still grounded in the real; they are aware of the sounds and smells around them, of the temperature of their hands, and the game remains just moving images on a screen. As this player begins to further engage with the game, however, their consciousness shifts from Murray's metaphorical air into the water. Their focus on the real world dims, replaced by acceptance of the setting, characters, and sounds provided by the fiction [Ermi, Mäyrä 2011: 94]. Many times, I have found myself immersed in a game world for several hours, then surprised afterwards at how famished I was in real life, something I hadn't noticed at all while playing. Once a player has passed through this transition period, they find themselves more invested in the game world than in the real world. The fantasy is no longer an object that exists in their physical space, but the real has become a background to the fiction. By gaining this investment in the fictional world, the player opens the door to immersion.

Developing investment in the game world first requires the player to suspend their disbelief in the fictional aspects of a game. Much of what these fantasy worlds contain could never exist in our world (eating a special mushroom doesn't immediately double one's height, as it does for Mario), but this cannot become a barrier to investment. When a player decides to play a game, they accept that the game might not be realistic and that this is no hindrance to the gaming experience. In his thesis on suspension of disbelief, Douglas William Brown argues that suspending disbelief allows the player to move beyond feeling simply "sympathy" for the game characters to feel "empathy" [2012: 205]. Brown explains that initially, the difference between the game and the real world forms a substantial barrier (sometimes called the "fourth wall") to immersion. He argues that by inviting a player to suspend their disbelief, games can blur this barrier, opening the path to immersion.

It would initially seem that this suspension of disbelief is a quality that the player must bring to the game, making immersion fully dependent on the player. While this has some truth (a player who attempts to point out every flaw in a game cannot possibly become immersed in it), most players enter games with the expectation that they will be suspending their disbelief. Samuel Taylor Coleridge, who coined the term "suspension of disbelief", believed that this suspension is a submission that the reader makes to the author [1817]. Coleridge suggests that interacting with media is always naturally accompanied by a suspension of disbelief. Brown points out that this suspension of disbelief may instead be seen as a "challenge from author to reader", where the player is challenged by the game to imagine more of the world than really exists based on hints created by the game [2012: 61]. Brown concludes that players will eagerly take on this challenge and suspend their disbelief, thereby inviting an immersive experience to occur. Becoming invested in a game does not happen immediately, however: while the first step is the ability to suspend one's disbelief, being able to suspend disbelief in the elements of a fantasy world does not immediately imply investment, much less immersion.

Investment is a deeper level of suspending one's disbelief. As a player begins to grow invested in a narrative game, they start to care about the narrative progression of the story and the results of segments of gameplay. Being an interactive storytelling medium, video games tell stories in a way in which the player becomes an active participant in the unfolding of the narrative. Investment in that story develops naturally as the player witnesses narrative progression as a result of their actions—the game shifts from being an object to be witnessed to becoming an experience. Becoming invested in the game's world and narrative is therefore essential for immersion to take place. Without a sense of importance in their quest, the player cannot meaningfully experience the sensation of being a part of the game world—investment tears down the barrier between real and fiction, fully inviting the player to believe in and take part in an immersive experience.

This investment in a game is ultimately a cognitive phenomenon. It relies on the player believing in the plausibility of the game and allowing themselves to be drawn into the world. The screen displaying the game becomes the player's entire world, and awareness of their real surroundings is replaced by focus on the game's environment. This dimming of the real world is accomplished by overtaking the player's senses, at once blocking out sensation of the real world (their vision becomes fixed on the game, and the sound design replaces sounds of the real world) and creating sensations tied to the experiences of the fictional world [Munday 2007: 57]. To open the door to immersion, then, is to allow the reality of a game world to take over one's perception of the world. Moving through life, we are naturally highly invested in the world around us. In order to become immersed in a narrative video game, that investment in the real world and our physical surroundings must be replaced by a similar investment in the game's fictional setting. As will be investigated in Chapter 2, music plays an important role in creating a greater sense of fictional surroundings, promoting investment through musical representation of game environments.

Affective Influence

While a player being invested in a game is necessary for immersion to take place, investment alone will not create an immersive experience. Scholars of video game music and of immersivity agree that for immersion to truly take place, the game must also elicit an affective response in the player. As Professors Lennart Nacke and Mark Grimshaw define it, affect is "a discrete, conscious, subjective feeling that contributes to, and influences, an individual's emotion" [2011: 265]. This affect can describe both emotions such as sorrow or joy and the player's state of being: one might experience the quickening of their pulse when in danger, or the sense of calm that comes from a place of safety. Affective ties are emotional bonds that the game creates between the player and the game narrative, characters, and world. By forming this connection, the game more fully links the player to the story, shifting the gameplay from watching a tale unfold to being emotionally tied to that progression. Affect turns the cognitive experience of being invested in the development of the narrative into a deeper, emotional involvement with the game beyond mental attention and mechanical input.

Writing on music's role in audience immersion, Professor of Music Isabella van Elferen proposes the "ALI model" for investigating immersivity. This model describes three impacts of music on immersion: affect, literacy, and interaction (direct involvement with the music in a game) [2016: 35-39]. Elferen argues that game composers use knowledge of the expected literacy of the player in order to consciously target affect. In other words, the music's primary goal is to elicit an emotional response through reference to a shared musical vocabulary. As she writes, "listening to music cannot but stir emotions, connotations or identifications" [2016: 35]. Through shared experience of hearing music in set contexts throughout films and video games, musical tropes draw on player literacy to induce a targeted affective response. When hearing low marcato strings at a fast tempo alongside a brass melody, for instance, players might picture an action scene, while a high solo violin and soft piano may indicate sombreness.

Psychologists and musicologists have also investigated the link between music, affect, and immersion. Studying player emotional response to playing games with and without sound, Nacke and Grimshaw argue that affect leads to higher attention in the game, which results in greater immersion in the world. They write that the direct impact of some aspects of music (timbre is cited as one example) on the emotions is still unclear [2011: 267, 276]. Similarly investigating the connection between affect and immersion through interviews with gamers, Emily Brown and Paul Cairns found that "gamers who did not feel total immersion talked of lack of empathy and the transfer of consciousness" [2004: 1299]. From these psychological studies, it is clear that an affective influence is correlated with increased immersion, and conversely that not experiencing immersion is associated with feeling unable to emotionally connect with the video game. As Nacke and Grimshaw identify, however, it is initially unclear whether immersion leads to an affective response, or if an affective response creates immersion. They propose that there is a "feedback loop" between the two, where "the game itself takes on an emotional character that reacts to the player's affect state and emotions and that elicits affect responses and emotions in turn" [2011: 277]. By creating affective ties between the player and game, composers can trigger this loop and invite the player into an immersive experience, turning the mechanical process of manipulating a controller into a meaningful time of play that emotionally draws the player into the narrative.

Composers intentionally build affective ties to connect the player more deeply with their character. Composer Winifred Phillips writes that to reach a state of deep immersion, players "must fully commiserate with the emotional turmoil that the characters suffer as they face various dilemmas and predicaments during the course of the game" [2014: 60]. This suggests that a game's influence on affect is not created passively as the player experiences the story, but is rather an active effort by the composer to specifically target players' emotional involvement throughout the game. As Phillips describes, being immersed in the game depends on this emotional connection between the player and their game character. By forming affective ties between player and narrative, the game gains a reality that it would not otherwise have, more fully transporting the player into the fictional world.

Several gamers cite music as being responsible for their experiencing powerful affective responses. One example of such music is the track Guardian Battle from The Legend of Zelda: Breath of the Wild (BotW) [2017], which plays when an enemy called a "guardian" begins to attack the player. One player describes their experience with this music: "When the music changes and the ... beeping starts I still start sweating" [u/doomtoothx 2022]. In real life, the player isn't in any danger,[1] yet their body responds as if they are, signalling a deep connection between player and game character. As music invites the player more fully into these moments, the game becomes a time when the player experiences adventure and fiction in a way that cannot be achieved in the real world. Players also describe the ending of Ori and the Will of the Wisps (Will of the Wisps) as triggering significant emotional reactions [u/jxdie04 2024]. In an ending cutscene, the main character (whom the player controlled throughout the game) must sacrifice themself to heal the game world, the scene accompanied by an orchestral variation of the main theme, Ori, Embracing the Light. Many players on Reddit report crying upon reaching this ending, and one user, u/Sharion_inuyatt, writes that "After the end of the second game I can't see or listen to any music from this game without wanting to cry" [u/jxdie04 2024]. A second user, u/Zerir, writes that "I didn't start crying until a couple nights later, but it gets to most everyone eventually..." [u/Sub_Omen 2021]. It is clear that through experiencing music while playing these games, players build powerful affective ties to the story and characters to the point that a tragic ending elicits tears even several days after having finished the game. When successfully connecting with the player's emotions during these experiences, narrative games more easily invite immersive experiences. In Chapter 3, I will discuss how musical representation of these experiences specifically targets this connection by enhancing a sense of place in narrative video game environments.

Sustained Attention

While investment and emotional influence form the basis for immersion, being briefly immersed does not mean that the player has an immersive experience. One can imagine playing a game for three minutes, then holding a conversation with a friend, briefly returning to the game, leaving to make a cup of tea, etc. Clearly, this experience routinely breaks the player's investment in the world, and affective ties to the characters cannot be meaningfully sustained. In this case, the player might be playing the game, but they are not immersed in it. Sustained attention is a third quality of immersivity, illustrating that immersion is not only a result of interaction with a game, but is also time-dependent. As a player offers their time to a narrative game, they are drawn into an immersive experience that can last for hours in which investment in the real world is replaced by investment in the game world and affective ties to the narrative and characters.

Brown and Cairns propose that this time dependence is linked with distinct stages of immersion, and that "the amount of time, effort and attention required from the gamer increases for more immersive experiences" [2004: 1299]. They identify three discrete levels of interaction with a game: engagement, engrossment, and total immersion. Using these stages to illustrate immersion's time dependence, Brown and Cairns propose that the more immersed a player becomes in the game, the more they will lose track of time. They point out the "fleeting nature of total immersion" [2004: 1300], identifying how the player's attention needs to remain on the game for an extended period of time for immersion to take place and continue. Any distraction from the game world may easily shatter this state by reminding the player that a world exists outside the narrative, thereby pulling them from the fantasy. For composers and game designers, building immersivity cannot be separated from an attempt to retain the player's attention while playing the game.

Variable in this time-dependent immersion is the length of time required for an individual player to feel immersed. Experienced players may immediately feel themselves drawn into an unfamiliar world, quickly transported into the fantasy. Players who approach games with less experience opening up their imaginations to fictional worlds might find that it takes a much longer period of time to feel immersed. It falls upon the game developer and composer to craft the world in such a way that experienced players will immediately accept and place themselves within the world, and new players will more quickly accept the fiction. This immersivity is created by a combination of all the game's elements, including the music, visuals, narrative, and gameplay aspects. While this variety of options is available for game designers seeking to build immersivity, this thesis focuses specifically on the composer's ability to invite immersive experiences through music.

Music plays a significant role in keeping the player attentive to the game. A study on the therapeutic applications of music found that listening to music before completing a task "reduced distractibility" on that task [Morton, Kershner, Siegel 1990]. In video games where the player engages with different environments and experiences, music has the ability to keep the player focused on the gameplay and story, sustaining attention on the fiction. This sustained attention in turn deepens immersion as the player remains suspended in the game world for an extended period of time. A later study on the brain's response to listening to a symphony found that activity in the parts of the brain associated with attention peaked at the transitions between movements, further connecting the experience of music and its absence to aspects of attention [Sridharan et al. 2007]. These studies illustrate the immersive potential afforded to composers as they score different game experiences. Although the means vary across games, music has the ability to increase player attention on the narrative, sustaining attention on the fantasy world and maintaining an involvement in the progression of the story. When a game's soundtrack perfectly supports the gameplay, narrative, and visuals, the player finds themselves not only watching a screen, but surrounded by a sonic space that creates a more complete sense of the fictional game world.

Beyond keeping player attention on the game, music can also affect how players perceive the passage of time. In one study investigating the role of music on time perception and immersion, Timothy Sanders and Paul Cairns found that "music reduces the experienced duration of playing a game but not the remembered duration", and that "the addition of music can make playing more or less immersive depending on whether the music is liked or not" [2010: 167]. These results highlight that when written well, music has the ability to deepen immersion by shifting the experience of the passage of time. Time seems to pass more quickly with music, indicating both a focus on the medium and a shorter perceived span over which the player's attention must be held. As Brown and Cairns identify, keeping this attention on the game over time is essential for maintaining immersion. Ultimately, the composer and sound design team can hold the player's attention through sound, replacing the sounds of the real world with music that artistically represents the fiction.

Flow State

One distinction that bears mentioning is the subtle difference between a player being "immersed" and a player entering a "flow state". While immersion involves building investment and empathy with the fictional setting, characters, and story, a flow state is entered when the player is so engrossed in the game that mechanically inputting controls becomes second nature. In this flow state, one does not necessarily need to be immersed in the game: a player can enter this state while playing an arcade game that has no story elements. Immersion, by contrast, is the sensation that one is submerged within the world of the game as an active participant, created by affective ties and elements that hint at a greater world beyond the frame of view of the player. Games with well-developed settings and difficult mechanics (such as Hollow Knight) can induce both immersion and a flow state, but games that require high-level mechanics without strong world building (such as Tetris [1984]) are more likely to induce a flow state without creating immersion.

In a brief psychological review of reported experiences of both flow and immersion, researchers argue that neither of these concepts has a universally applicable definition, and that even where some may attempt to define one as a combination of quantified cognitive and physiological responses, it is nearly impossible to define these concepts as unchanging phenomena experienced the same way by all players [Michailidis, Balaguer-Ballester, He 2018]. The authors propose that due to this ambiguity, there is no difference between flow and immersion. In practice, however, these terms are colloquially used quite differently. One player on Reddit describes flow state as "being able to take complex, precise, and skillful action more quickly and precisely than conscious, thoughtful action allows" [u/Milskidasith 2023], a concept tied to mechanical ability rather than cognition. By contrast, immersion is more tied to the setting and characters, with one player commenting that "you try your best to 'transport' yourself to the world setting" [u/[deleted] 2014]. While flow and immersion have similarities, there are clearly subtle differences in what player states they describe. Both concepts describe a high level of engagement with the game, but whereas flow state is more associated with an ease of play through perfect mechanical control, immersion represents a more meaningful experience where the player is invested in a narrative and emotionally impacted by the game's story events. At the deepest level of immersion, the separation between player and character disappears as they find themselves fully transported into a fantasy world.

Investment, affective influence, and sustained attention are needed together to create an immersive experience for the player. Though scholars have proposed many ways that game developers can intentionally build immersivity, immersion ultimately remains dependent on the player: if a player wilfully prevents themselves from becoming invested in the game, they will not be able to achieve immersion. In addition, one can imagine that for new gamers, the unfamiliarity of a controller might serve as a significant obstacle to immersion—if one is putting all of their energy into figuring out how to move and jump, what attention is left to becoming invested in the story and characters? In games that succeed at crafting immersivity, this real-world struggle can be translated to a struggle in the fictional world. In these situations, it is not the physical player struggling to manipulate the controls, but their character struggling to properly wield a sword, learning to fight. The game can frame the player's difficulty as a moment where player and character learn and grow together, forming a stronger connection between player and character in pursuit of narrative progression. Music plays an essential role in forming and maintaining this connection, strengthening affective ties between player and game while maintaining focus on progression of the narrative. To induce immersion, however, the player needs a world to be immersed within. Analysing how composers can contribute to immersivity therefore requires a sense of the game environments that form the fundamental components of the game world.

Environments

As our brave, solitary knight travels deeper into Hallownest, the virtual world in which the game Hollow Knight takes place, they discover that not only does this kingdom consist of old stones tinged blue, but parts of the world are lush with vibrant greens or overgrown with pink crystals that radiate power. One moment, the player finds themselves surrounded by blue-grey pebbles, ornate fences, and plants that have long since faded to pale whites and greys. The ghostly sighs of a past nearly forgotten are accompanied by a string section playing slow harmony, the music an echo of a distant room. The knight wanders further, discovering a new region where life still thrives. Thick foliage covers every surface, drops of water patter through distant rooms, and elegant archways poke through twisting vines. Gone are the slow strings, now replaced by a dancing harp ostinato, a viola melody, and the buzzing of distant insects. As the player enters the region called Greenpath, shown in Fig. 1.1, they discover life in this forgotten realm, an environment lush with both visual and sonic movement.

Fig. 1.1: The region "Greenpath" from Hollow Knight.

From these tranquil forest glades to active volcanoes spewing fire into the sky, video games range through an extraordinary number of different settings. A player can step straight from the depths of a jungle into rocky mountains, fully shifting their surroundings in moments. Despite these drastic changes in their character's location, the player never moves in the real world, yet game developers who aim to create immersion want the player to feel as though they have. As Henry Jenkins writes,

When game designers draw story elements from existing film or literary genres, they are most apt to tap those genres - fantasy, adventure, science fiction, horror, war - which are most invested in worldmaking and spatial storytelling. Games, in turn, may more fully realize the spatiality of these stories, giving a much more immersive and compelling representation of their narrative worlds [2004].

For narrative video games, these environments are an essential component of situating the narrative within an imagined world. Creating a setting is the foundation for telling a story, and to immerse the player within the story is to submerge them within the game environment.

In video games, players typically explore these environments in stages, beginning in a starting location then discovering more fantastical regions. This discovery is often led by the game's narrative—such as venturing to the top of a mountain peak to recover an essential relic—but can also be a result of the character's curiosity in exploring their surroundings regardless of ongoing story moments. When the player enters these environments, the game creates a sense of place through a combination of features that both situate the player in the fictional world and respond to the player's actions. For instance, if a player rushes past a bush in a game, the bush will often rustle or drop leaves, rather than remain static. It is the collection of these different symbolic and physical elements that creates the player's sense of place in these environments [Nelson, Ahn, Corley 2020: 237], producing a living virtual space within which the player can interact and play. To then "be someplace" within a video game is to have one's character be physically located in a specific environment, forming a sense of being surrounded by that environment's visual, sonic, and narrative features.

The term "environment" comes from an ecological perception of the world, thinking on all the elements that surround us in nature. Most fundamentally, animals take in information about the world around them in terms of what that environment can afford them: what elements of their local world might help them, what might be waiting to cause harm, and what can be safely ignored [Kamp 2024: 772]. From this ecological approach, video game environments are defined by the collection of features that make a region unique, including visual and sound design, shifts in gameplay, and environment-specific puzzles or obstacles: Is the player surrounded by trees that they must climb? By large boulders they need to bypass? Are water and swimming central mechanics of the player's environment?

Nesting

Psychologist James Gibson breaks down environments into a set of elements on different perceptive scales by defining nesting. "For example, canyons are nested within mountains; trees are nested within canyons; leaves are nested within trees; and cells are nested within leaves" [1986: 5]. Nesting allows for a description of a game's environment at different scales of detail. For example, while playing through a two-dimensional game such as Super Mario Bros. Wonder [2023]—shown in Fig. 1.2—the player may briefly notice the rolling hills and rocky formations in the background, but not heed them further as they play through the game, since they have no impact on the gameplay.

Fig. 1.2: Level design of the level "Welcome to the Flower Kingdom!" from Super Mario Bros. Wonder.

Removing that backdrop, however, would significantly impact the player's perception of the world, as can be seen in Fig. 1.3, where the game suddenly feels far less complete, and it is much more difficult to convince oneself that the story takes place in a wider universe. Though the player largely ignores the broadest nesting level of the environment, it is essential to their perception of the world.

Fig. 1.3: The level "Welcome to the Flower Kingdom!" from Super Mario Bros. Wonder with the background removed.

While some nesting levels are essential to perception of the environment, others merely enhance the player's sense of place. In a game such as BotW where there is less emphasis on finishing a level and the player has the opportunity to freely explore the world as they pursue the main quest, the player can explore different characters' diaries or journals. One such diary entry reads "I wonder if I'm coming down with something. I'll ask Grandmother for some medicine tomorrow" [2017]. While this diary has no impact on the gameplay or any decision the player might make, it speaks to the non-playable characters in the game having experiences and lives beyond those witnessed throughout the narrative, enhancing the player's sense of a broader world. The positive impact of these smallest nesting details on the creation of a more complete sense of environment has been affirmed by several players on Reddit, who report that these small details lead to a more complete sense of a world [u/Mrs_IrrSoft 2023]. In this case, the player's attention is brought to a narrower level of nesting, demonstrating that even details that bear no impact on gameplay can contribute to the player's sense of the broader game world.

Focus and Periphery

Many of Gibson's nesting elements can be separated into what I define to be either focal or peripheral components of the environment. Focal elements are those aspects of one's surroundings that may have a directly positive or negative influence on the gameplay: the player must be aware of the game's terrain, any dangerous spikes or enemies, or of a power they might collect. The peripheral elements are all other aspects of an environment—the trees, bushes, and stones—that are not immediately important for the player, but that build one's sense of surroundings. As a visual example, Fig. 1.3 isolates the focal elements of the first level in Super Mario Bros. Wonder, while Fig. 1.2 shows both the focal and peripheral elements. In defining these components, I intend to simplify Gibson's ecological discussion by describing each element as either directly impactful to the gameplay or existing in the background. This allows for a more general description of what a set of elements accomplishes within video games, rather than breaking down environments into a dozen nested levels.

Both the peripheral and focal auditory spaces consist of a combination of music and sound design. Music most often plays a role in the periphery, creating an environment without being a prime focal point for the player. Michiel Kamp defines such music as "ambient music" [2024: 773], which is music whose purpose is to create an atmosphere without being foregrounded in the player's attention. An example of this ambient music will be described at the end of this chapter with a discussion of Kwolok's Hollow from Will of the Wisps, where the player's main goal is to return life to a slowly dying forest. Music in this periphery is not always ambient, however. Kamp identifies that game scores also include "action music" [773] that may indicate a combat cue or dangerous moment for the player. For example, at a different point in Will of the Wisps, the player is Escaping a Foul Presence, and a musical cue involving string pizzicato, low brass, and a flute prefaces danger marked by tremolo strings and low brass shouts which urge the player to run and escape. While the music does not bear a direct impact on gameplay in such a scene, it clearly signals peril to the player, sonically reflecting gameplay elements. Music is often peripheral, though it can also act as a focal component of the environment: Kamp's description of semiotic music would be best described as a focal component, due to its direct impact on gameplay [143].

Sound design often occupies a more focal sonic presence. Many sound effects are generated by the player (such as those sounds tied to jumping, attacking, using a special ability, etc.), and are mixed at a stereo location that follows the player's position on the screen so that to the player, these sounds appear as though they originate from the character. These sound cues are a sonic confirmation of the player's actions. When they press a button on their controller, they receive instant feedback from the game. While players do not need to hear these effects to progress in the game, I label these sounds as focal components due to their direct ties to essential gameplay elements: often, hearing these sounds assists the player. Some sound effects also exist in the peripheral space, such as echoes of water drops heard distantly within a cave or the rustling of leaves in a forest, creating atmosphere without being foregrounded in the player's attention.

In this peripheral context, what is music's role? Is music meant to represent the atmosphere displayed on the screen in the same way Debussy's La Mer is meant to depict a sense of water? Should hearing video game music in isolation be able to create an atmosphere? Or is music instead meant to merely support the visuals, ultimately relying on the game's visual component to be fully appreciated? Composer and filmmaker Michel Chion describes the relationship between sound and visuals as an "audiovisual illusion", in which "a sound enriches a given image so as to create the definite impression" [1994: 5]. Chion argues that sound is needed to bring the visuals to life and that, without sound, many visual effects on the screen lose a sense of materiality. This reliance on sound is slightly overstated (even from silently observing the still Fig. 1.1 at the start of this section, the reader can infer a great deal about the environment), but the role of audio that Chion is identifying should not be discounted. As he points out, "Most falls, blows, and explosions on the screen ... only take on consistency and materiality through sound" [5]. As many visuals in real life are accompanied by sound, there is a sense of something missing when witnessing the same visual without auditory information. Even in the peripheral setting, music fills this gap, conveying information about the quality of the world beyond what is displayed by the visuals. In doing so, music enhances a sense of surroundings, at once representing the environment, supporting the visuals, and adding information to the environment that would not otherwise be communicated.

Elferen proposes that to accurately convey this information, music must appeal to audio-visual literacy: even in isolation from an environment, music evokes a sense of visuals by drawing on tropes that have appeared in that audio-visual environmental context [2016: 36-37]. For instance, in the desert village Gerudo Town from BotW, the instrumentation—a sitar, a number of wooden flutes, a snare, a triangle, and varied hand percussion—is likely to conjure the image of a desert-like environment. This image draws from stereotypes used extensively in Hollywood, which have been reinforced across countless media depicting deserts in both film and video games [Park 2024: 496]. As Kamp points out, though, the music can also provide supplementary information not gleaned from the visuals alone: "A score to a desert environment can either suggest that the desert is African or Middle Eastern through the use of augmented second intervals ... something that graphics or sound effects do not necessarily do" [2024: 776-77]. Despite often being non-diegetic, audio bolsters the player's three-dimensional perception of the fictional world by connecting this environment to ones they have encountered before in other media, expanding their sense of place beyond the player's range of view. Instead of each video game region presenting a completely new scene for the player to learn and accept, game developers utilise shared literacy to call back to traits of environments found elsewhere. When players encounter a desert, many have pre-conceived notions of what the climate might be, what plants may reside there, and what musical timbres are likely to be present. By drawing upon this literacy, music as a peripheral element specifically communicates information beyond the screen.

Environment and Immersivity

For immersion to occur, the player needs something to be immersed within—the environment [Munday 2007: 56]. Created from a combination of investment in the world and narrative, and of affective ties between the player and the experience of their character within the game world, a game's immersivity is dependent on the extent to which a game invites the player into its world. While many parts of a narrative video game—such as visual design and plot—contribute to this belief in the fantasy world, music plays a particularly important role in the development of the player's greater sense of the fictional environment. As Summers writes, music acts "as a medium to accentuate or develop our understanding of the game beyond the other communicative layers of the text" [2016: 72]. How music communicates information about the game environment to the player changes as the player adventures through different regions. The use of a sitar in the score might easily immerse a player in a desert due to the tropes of film scores, but could feel out of place in an open sky because players lack the literacy that comes from encountering such a timbre in this environment. Investigating how composers may target player immersion in a video game therefore requires an analysis of the environment that the player will be immersed within. Once a sense of the environment has been established, it then makes sense to analyse how immersion takes shape within that environment.

These video game environments are physically defined by their combination of focal and peripheral elements, which work together to contribute to the player's sense of place in a wider world. Music most often plays a role in the periphery as composers use musical representation to surround the player with these environments, rather than just two-dimensionally presenting them with a thematic setting. As Kamp argues [2024], an analysis of the musical elements that boost immersion within certain environments must not be separated from the player's role within those particular game scenes. As will be seen in Chapter 3, players do not simply exist passively within these environments, but rather experience the game's narrative progression. Immersion within an environment relies on the player building affective ties to that narrative, ties influenced by what the player is doing in the game and where they are situated. For example, an open field may inspire frolicking, while pits of bubbling lava could induce trepidation and urge the player to step carefully. From the numerous ways in which scholars such as Kamp and Elferen have proposed that video game music can be analysed in different contexts, it is clear that players are meant to experience video games differently in different situations, and that in these contexts the audiovisual elements of the game contribute to crafting a particular affective response [Kamp 2024].

An Example – Kwolok's Hollow

To illustrate how composers might write to create an environment that invites immersion, I end this chapter with a specific example from Will of the Wisps, scored by Gareth Coker. As the player pursues their quest of returning light to the forest within which the game takes place, they pass through a swampy, vibrant grotto named "Kwolok's Hollow". This is one of the earliest environments encountered in the game, in which the player is meant to slowly discover more of the game world while fighting slightly more challenging enemies. A still image of this environment is shown in Fig. 1.4. As the player explores this region, they are accompanied by the similarly titled music Kwolok's Hollow, which uses a wide array of percussion instruments such as rattles and bells along with an orchestra primarily featuring winds and low strings.

Fig. 1.4: The game environment "Kwolok's Hollow" from Ori and the Will of the Wisps.

A number of focal and peripheral elements contribute to the player's sense of this environment. Visually, the colour palette of this region immediately establishes a dark, damp atmosphere. The screen is saturated with deep shadows, green life, grey stones, and blue fog filling the horizon line, indicating that the player is in some deep underground region where there is enough water for plants to grow, but no sunlight. The focal elements of this environment consist of the terrain the player (the bright figure at the centre of Fig. 1.4) stands on, spikes throughout the terrain (visible in the top right), and enemies that the player may encounter (not pictured). These elements have a direct impact on the player's gameplay, informing what terrain the player might traverse or avoid. Sonically, the focal elements of this environment primarily consist of sound effects generated by the player moving, jumping, attacking, etc. (many of which are accompanied by vibrations of the controller). When the player provides input through the controller, there is an immediate visual, audio, and tactile response that connects the player to their game avatar. Some visual peripheral elements include the stones and brambles in the background, the bright orange flowers, and the distant pillar seen only as a vague shadow. These elements are not foregrounded in the player's attention, yet witnessing them in the periphery plays an important role in grounding the player within this cavernous environment.

Sonically, the music is largely situated within this peripheral space—no harm will come to the player directly by virtue of ignoring the music. In writing for this environment with the track Kwolok's Hollow, Coker uses many percussion rattles and bells played with high reverb that, despite the music being non-diegetic, feel as if they could be the scuttling of distant insects echoing through the game's caverns. This reverb mimics the echoes that one would hear when in a cave in real life, creating the impression that the sounds the player is hearing have sources outside the game's field of view. By representing physical aspects of this space using music, Coker more fully situates the player within the environment, forming a perception of the foundational aspects of Kwolok's Hollow.

Connecting to more experiential aspects of this region, Coker writes many overlapping ostinatos and repeated lines, some of which pulse on quarter notes, some eighth notes, and some with more complex rhythms. One ostinato, played on chimes, is shown in Ex. 1.1, displaying a repeating descending line that grounds the music in B minor.

Ex. 1.1: Chimes ostinato for "Kwolok's Hollow" from Ori and the Will of the Wisps.

Throughout this track, Coker often writes extended melodic lines where the harmony remains on the tonic, as seen in Ex. 1.2. This harmonic stillness urges careful movement, contrasting with the forward motion created by the repetition of the ostinato. In this region of exploration, the constancy of the overlapping ostinatos urges the player to adventure and explore, while the harmony and melody played in a low-register instrument such as the bass clarinet keep the player grounded in this earthy environment. As the music expands the fictional world beyond a two-dimensional screen into a realm that surrounds the player, the player's perception of the reality of this environment grows in turn, bolstering a sense of being submerged into the game world.

Ex. 1.2: Theme for "Kwolok's Hollow" from Ori and the Will of the Wisps.

By often giving the melody to a bass clarinet and pizzicato bass, Coker places the player's sonic attention in a low register to reflect the depth of this environment. One melodic feature which contributes to this sense of grounding is the frequent return to the B1 note, which has a gravelly timbre when played on the bass clarinet. Writing the melody for a low-frequency instrument in harmonic minor, Coker promotes a feeling of unease that reflects the depths of the grotto, the shadows that hide potential dangers, and the trepidation that accompanies exploring an unfamiliar region. The narrow frequency range foregrounded in many parts of this track adds to this unease, forming a density much like that created by the enclosing stone walls of the cave. The use of pitch range, harmonic rhythm, and ostinatos targets an affective response in the player, at once encouraging exploration while warning the player to be wary of the dark and the unknowns that may lurk in the shadows.

Player immersion in a narrative video game is the sense of being transported into the game world. This phenomenon is created by a combination of player investment in the world and affective ties with their character and the story, sustained over an extended period of time. Investment first requires the player to suspend their disbelief, and is deepened by elements of the game that lead the player to care deeply about their character and the narrative. As they develop this investment, affective ties are strengthened by music that emotionally links the player to the game. For composers, music has the ability to more fully welcome the player into the game world, suspending their attention within the fiction to create immersive experiences.

This immersion ultimately requires an environment to take place within. These environments consist of a combination of focal and peripheral elements—groups of features that create distinctions between different segments of the game world. Composers can use musical representation to build the player's sense of being surrounded by this fantastical world, creating a location in which immersion can take place. In the following chapter, I will investigate how composers can use musical representation to create a stronger sense of environment, using two environments—the sky and the cave—as case studies for thinking on what musical techniques contribute to greater connection between player and game.