How to turn heads: Positional audio in immersive learning
16th November 2016
The opportunities of 360 video and virtual reality (VR) are in designing highly immersive experiences for people to explore.
The ability to transport ourselves in this way extends the possibilities for engaging learners. But if the sound doesn’t align with the visuals, our brains will not be properly convinced. 3D positional audio greatly improves the delivery of a truly immersive experience. Combined with the visuals, it is the sound that creates that ultimate feeling of presence.
Our ears pick up audio in three dimensions, and our brains process multiple cues to identify where a sound is and what it might mean. One of the most basic cues is how close something is and where it’s positioned, a judgement we make far more effectively with our ears than our eyes. The auditory cortex is where all this information is processed. Other parts of the brain then identify a sound by comparing it to our memories of what we have heard before, and then we decide our response.
Sounds can make us move: they can cause us concern, so we move away, or create a little intrigue, so we look toward them. Sound can help direct a narrative and guide the learner. Many people are still new to trying 360 video or VR material, so they may not realise they can look around. Early sound cues are a great way to get that movement happening.
In one of our recent 360 video projects, we applied a mono audio approach and compensated for the absence of full surround sound by adding bolder visual cues into the footage in post-production. But full 3D positional audio is capable of connecting sounds to objects as they move through a setting. So, even if something has moved out of sight and might be behind the learner, they can still be very aware of its presence and be compelled to turn around to see it. Conversely, if the sound is of poor quality and the learner is having to strain to hear, or the sound doesn’t fit correctly with the visuals, learners will quickly become fatigued and disengaged from the reality you are trying to create. Great sound is the key to that ultimate suspension of disbelief that other methods can’t match.
Binaural recording is one way of creating this realistic 3D sound. In production, it typically uses two small microphones placed in the ears of a model head. Because that head has density, the microphones capture sound as it would be received by human ears: the recording preserves the difference between the sound that the right ear hears and the sound that the left ear hears. It can also account for how sound waves are affected by a head and body at the centre of the scene. This level of sound capture, when aligned correctly with the visuals, mentally guides us to where we need to be looking.
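One of the left/right differences a binaural recording preserves is timing: a sound from one side reaches the nearer ear slightly earlier. As a rough illustration only, the sketch below estimates that delay with Woodworth's spherical-head approximation. The head radius and speed of sound are assumed typical values, not figures from our projects, and the function name is ours.

```python
import math

# Assumed typical values (not measured): average adult head radius
# and the speed of sound in air at about 20 °C.
HEAD_RADIUS_M = 0.0875
SPEED_OF_SOUND_M_S = 343.0


def interaural_time_difference(azimuth_deg: float) -> float:
    """Approximate delay in seconds between the two ears.

    azimuth_deg is the source direction measured from straight ahead,
    in the range -180..180. Uses Woodworth's spherical-head formula:
    ITD = (r / c) * (theta + sin(theta)), valid for theta in 0..90 degrees.
    """
    theta = math.radians(abs(azimuth_deg))
    # A source behind the listener produces the same delay as its
    # mirror image in front, so fold rear angles back into 0..90 degrees.
    if theta > math.pi / 2:
        theta = math.pi - theta
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))


if __name__ == "__main__":
    for angle in (0, 30, 60, 90):
        delay_ms = interaural_time_difference(angle) * 1000
        print(f"{angle:>3} degrees: {delay_ms:.3f} ms")
```

Even at its largest, for a sound directly to one side, the delay under these assumptions is well under a millisecond, yet it is one of the cues the brain uses to place a sound.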
To get an idea of the way this type of recording works, listen to this sample through some closed-back headphones.
Manipulating the audio and adding it to a virtual world requires head tracking and 3D software to ensure the environment changes with the person in the experience.
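To show why head tracking matters, here is a minimal sketch of the core calculation: as the head turns, the direction of a fixed sound source changes relative to the listener. The function names and the angle convention (degrees, positive to the listener's left) are our own assumptions, and the simple constant-power stereo pan is only a crude stand-in for the head-related filtering that real 3D audio software applies.

```python
import math


def relative_azimuth(source_azimuth_deg: float, head_yaw_deg: float) -> float:
    """Direction of the source relative to where the head is facing.

    Both angles are in degrees in the same world frame, positive to the
    left. The result is wrapped into the range -180 (inclusive) to 180.
    """
    return (source_azimuth_deg - head_yaw_deg + 180.0) % 360.0 - 180.0


def stereo_gains(relative_azimuth_deg: float) -> tuple[float, float]:
    """Constant-power (left, right) gains for a source direction.

    This is a simplification: it cannot tell front from back, which is
    exactly what full 3D positional audio adds.
    """
    # pan runs from -1 (fully left) to +1 (fully right).
    pan = -math.sin(math.radians(relative_azimuth_deg))
    angle = (pan + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


if __name__ == "__main__":
    source = 0.0  # a sound fixed straight ahead in the scene
    for yaw in (0.0, 45.0, 90.0):
        rel = relative_azimuth(source, yaw)
        left, right = stereo_gains(rel)
        print(f"head yaw {yaw:5.1f}: source at {rel:6.1f}, L={left:.2f} R={right:.2f}")
```

As the learner turns their head to the left, the source that started straight ahead shifts toward the right ear, so the sound stays anchored to the scene and not to the headset.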
Beyond binaural, ambisonics is a surround sound technique that can accommodate sound sources above and below the listener as well as in the left/right horizontal plane. For now, all these spatially orientated audio options are best experienced with compatible headphones. It is the higher-priced headsets from Oculus, Sony, and HTC that have the processing power to deliver them, but Google Cardboard is also developing the ability to support better sound. In the meantime, these sound experiences don't convert easily into an open-space environment using speakers. But people are working on that too, so watch this space…
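For the curious, this is roughly how ambisonics represents height as well as left/right. The sketch below encodes a single mono sample into the four channels of first-order ambisonics, assuming the AmbiX convention (channel order W, Y, Z, X with SN3D normalisation, azimuth positive to the left, elevation positive upward). The function name is ours, and real tools handle whole audio streams, not one sample at a time.

```python
import math


def encode_first_order(sample: float, azimuth_deg: float, elevation_deg: float):
    """Encode one mono sample into first-order ambisonics.

    Assumes the AmbiX convention: returns (W, Y, Z, X), SN3D normalised.
    azimuth_deg is positive to the left, elevation_deg positive upward.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                              # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)  # left/right
    z = sample * math.sin(el)                 # up/down
    x = sample * math.cos(az) * math.cos(el)  # front/back
    return w, y, z, x


if __name__ == "__main__":
    for label, az, el in (("ahead", 0, 0), ("left", 90, 0), ("above", 0, 90)):
        w, y, z, x = encode_first_order(1.0, az, el)
        print(f"{label:>5}: W={w:.2f} Y={y:.2f} Z={z:.2f} X={x:.2f}")
```

The Z channel is what a purely horizontal surround format lacks: it carries how far above or below the listener the sound sits.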
Kate Nicholls, Head of Learning Innovation, shared Sponge's experience of creating 360 interactive video at DevLearn 2016. Watch the recording of the session below: