Spatial audio is a fantastic way to increase immersion in 360 video. By enveloping the viewer in binaurally rendered sound that responds to their head movements, it expands the creative opportunities for immersive storytelling still further. But creating a spatial mix can be technically challenging, especially since detailed information is scarce and there is no one-size-fits-all workflow solution. Luckily, there are some tools available that make life a bit easier. One of these is the Spatial Workstation by Facebook (previously developed by Two Big Ears). This set of plug-ins works in most major DAWs, such as Pro Tools HD, Nuendo and Reaper, and lets you position and rotate sounds around the listener’s head, while the encoder muxes your final audio mix with your 360 video. With the Spatial Workstation’s latest release, PC users can now also take advantage of these plug-ins and export first-order ambisonics for most major VR platforms, including Facebook, YouTube and Samsung Gear VR.

One of the most challenging aspects of working with spatial audio is making your mix work consistently across multiple platforms: each platform not only has its own delivery specifications, but also decodes ambisonic audio slightly differently, which often results in inconsistent-sounding mixes. The engineer needs to take additional steps to test and fine-tune the mix for each platform, which can be time-consuming. The results, however, are worth it, and certainly add an extra dimension to any 360 production by giving the viewer the added sensation of being able not just to ‘look’ around but also to ‘listen’ around them.
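One concrete source of these inconsistencies is that platforms expect different ambisonic channel orderings and normalisations. For example, converting a first-order FuMa mix (W, X, Y, Z channel order, W attenuated by 3 dB) to the AmbiX convention YouTube expects (ACN channel order, SN3D normalisation) can be sketched as below. This is an illustrative snippet, not part of the Spatial Workstation itself:

```python
import numpy as np

def fuma_to_ambix(b_format):
    """Convert first-order FuMa (channels W, X, Y, Z; W carries a
    -3 dB factor) to AmbiX (ACN channel order W, Y, Z, X with SN3D
    normalisation). `b_format` has shape (4, num_samples)."""
    w, x, y, z = b_format
    # Restore W to full scale and reorder the directional channels
    return np.stack([w * np.sqrt(2.0), y, z, x])
```

Second and higher orders involve more channels and per-degree scaling factors, so in practice this conversion is usually left to a dedicated plug-in, but the first-order case shows why a mix exported in the wrong convention sounds wrong on playback.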


Recording ambisonics on set requires an ambisonic microphone such as the Zoom H2n or Sennheiser Ambeo VR Mic. A tetrahedral soundfield microphone captures 4 channels of raw capsule audio (A-format), which are later converted to B-format; the four B-format channels collectively capture the sound field on a sphere around the microphone along four components: W – omnidirectional, X – front-back, Y – left-right, Z – up-down. Using a soundfield microphone alone, however, will most likely not capture important sound sources such as dialogue with sufficient clarity. It’s therefore strongly advised to also record each actor individually using lavalier microphones. Using a shotgun microphone on a boom pole is in most cases unfortunately not an option on 360 shoots, as the boom operator would always be in shot. During post-production, the additional dialogue tracks, sound effects and ambiences can then be added as mono sources and mixed spatially to embellish, or at times even replace, what was recorded on set.
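As a rough illustration of the signal processing involved, the sketch below converts the four capsule signals of a tetrahedral soundfield microphone from A-format to first-order B-format, and pans a mono source (such as a lavalier dialogue track) to a chosen direction using the standard first-order FuMa encoding gains. The capsule naming and ordering are assumptions, and a real microphone’s conversion also includes manufacturer-specific filtering that is omitted here:

```python
import numpy as np

def a_to_b_format(lfu, rfd, lbd, rbu):
    """Tetrahedral A-format -> first-order B-format (W, X, Y, Z).
    Capsule labels (left-front-up, right-front-down, left-back-down,
    right-back-up) are an assumed layout; check your mic's docs."""
    w = 0.5 * (lfu + rfd + lbd + rbu)   # omnidirectional pressure
    x = 0.5 * (lfu + rfd - lbd - rbu)   # front-back
    y = 0.5 * (lfu - rfd + lbd - rbu)   # left-right
    z = 0.5 * (lfu - rfd - lbd + rbu)   # up-down
    return np.stack([w, x, y, z])

def encode_mono(signal, azimuth_deg, elevation_deg):
    """Pan a mono source (e.g. a dialogue track) into first-order
    B-format at a given direction, using FuMa encoding gains."""
    az, el = np.radians(azimuth_deg), np.radians(elevation_deg)
    w = signal / np.sqrt(2.0)            # FuMa W carries a -3 dB factor
    x = signal * np.cos(az) * np.cos(el)
    y = signal * np.sin(az) * np.cos(el)
    z = signal * np.sin(el)
    return np.stack([w, x, y, z])
```

Encoded mono sources can simply be summed with the converted soundfield recording, which is essentially what the panner plug-ins do (along with distance attenuation and other processing) when you position a track in the mix.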

When it comes to planning and budgeting for audio post, remember that all the usual rules of film audio production still apply, with the additional steps of exporting and testing mixes across all intended delivery platforms. Also bear in mind that the audio engineer is often the last link in the chain and responsible for rendering the final video with the finished audio mix, because during the muxing stage spatial metadata needs to be injected for platforms such as YouTube in order for the server to recognise and process the spatial audio track. Only then can the content be uploaded and fully tested.
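A minimal sketch of that final stage for YouTube, assuming a finished four-channel ambisonic WAV, using ffmpeg and Google’s Spatial Media Metadata Injector (filenames are placeholders, and the exact flags may differ between tool versions, so check each tool’s documentation):

```shell
# Mux the four-channel ambisonic mix into the 360 video,
# copying the video stream without re-encoding it
ffmpeg -i video_360.mp4 -i ambisonic_mix.wav -c:v copy -c:a aac -ac 4 muxed.mp4

# Inject the spherical video and spatial audio metadata
# so YouTube recognises the ambisonic track on upload
python spatialmedia -i --spatial-audio muxed.mp4 final_youtube.mp4
```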

This is a guest post by Marc – SpeedVR’s virtual reality & 360 video audio manager