How spatial audio works in DeoVR
Spatial Audio is used to create sounds that come from specific points in the virtual world.
Spatial audio (also referred to as 3D audio or 360 audio) is an audio experience where the sound changes with the movement of the viewer’s head. 3D audio effects manipulate the audio waves produced by stereo speakers, surround-sound speakers, speaker-arrays, or headphones.
This involves placing virtual sound sources anywhere in a simulated three-dimensional space, including behind, above or below the listener. Content creators produce this kind of "spatially oriented" audio when creating soundtracks, accounting for direction, distance, and environmental factors. Spatial audio uses an ambisonic audio format. We plan to expand spatial audio support in the DeoVR player with the upcoming ExoPlayer integration in January 2022.
Ambisonics is a full-sphere surround sound format: in addition to the horizontal plane, it covers sound sources above and below the listener.
Ambisonics offers a matrix coding system for all stages of sound processing: A-format for recording microphone signals, and B-format for studio sound processing. The Ambisonics system is fully compatible with mono, stereo, and surround systems, and can also provide a "periphonic" system (from the Greek for "sound around") by adding height information.
The main signal formats are as follows:
- A-format consists of the signals from four (or more) cardioid microphones arranged on a tetrahedron as follows (see Fig. 1): left front (LF), right front (RF), left back (LB) and right back (RB). These signals can also be generated by appropriately panning mono signals from distributed microphones.
- B-format uses signals that are obtained by sum-difference transformation of A-format signals:
W = 0.5(LF + LB + RF + RB)
This signal is obtained by adding all the signals in phase. B-format signals can also be generated directly using the combination of microphones shown in Fig. 4a: three microphones with a "figure-of-eight" directional characteristic, oriented along three perpendicular axes to produce the X, Y and Z signals, plus one omnidirectional microphone (extended MS stereophony) that produces the W signal (see Fig. 2).
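The sum-difference transformation can be sketched in a few lines of Python. The W formula is the one given above. The X, Y and Z formulas are an assumption: they follow the classic tetrahedral layout in which LF and RB point up while RF and LB point down, which the text does not state. The function name `a_to_b_format` is hypothetical, and a real converter plugin also applies capsule-matching filters that are omitted here.

```python
def a_to_b_format(lf, rf, lb, rb):
    """Convert one A-format sample frame to first-order B-format (W, X, Y, Z).

    Assumption (not stated in the article): LF and RB capsules point up,
    RF and LB capsules point down, as in the classic tetrahedral layout.
    """
    w = 0.5 * (lf + lb + rf + rb)  # all capsules in phase (omnidirectional)
    x = 0.5 * (lf + rf - lb - rb)  # front minus back
    y = 0.5 * (lf + lb - rf - rb)  # left minus right
    z = 0.5 * (lf + rb - rf - lb)  # up minus down (assumed capsule heights)
    return w, x, y, z


# A sound reaching all four capsules equally ends up only in W.
print(a_to_b_format(1.0, 1.0, 1.0, 1.0))
```

Applied sample by sample to the four capsule signals, this yields the four first-order B-format channels.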
Ambisonics is a widely used format for capturing, transmitting and decoding spatial audio, using 1, 4, 9, 16 or 25 channels depending on the Ambisonic order.
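These channel counts follow the rule that a full-sphere Ambisonic signal of order N needs (N + 1)² channels. A minimal sketch, with the hypothetical helper name `ambisonic_channel_count`:

```python
def ambisonic_channel_count(order: int) -> int:
    """Return the number of channels for full-sphere Ambisonics of a given order."""
    if order < 0:
        raise ValueError("Ambisonic order must be non-negative")
    return (order + 1) ** 2


# Orders 0 through 4 give the channel counts listed above.
print([ambisonic_channel_count(n) for n in range(5)])  # [1, 4, 9, 16, 25]
```

This is why a 1st-order render uses 4 channels and a 2nd-order render uses 9.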
Let's use an 8-channel Voyage Audio Spatial Mic as an example of how to make Spatial Audio in TBE (Two Big Ears) format for the DeoVR player.
The TBE format is an 8-channel format used by Facebook's Spatial Workstation and is part of the "spatial audio" 8+2 channels available for Facebook's 360 videos. TBE is essentially 2nd-order Ambisonics with one of the 2nd-order channels (R) removed to reduce the channel count to 8.
1) Open the recorded audio file in Reaper. This file comes straight from the microphone, so it is in Ambisonic A-format.
2) To convert it to B-format, open the microphone manufacturer's free plugin. To do this, press FX > Voyage Audio > Spatial Mic Converter > Add.
3) Adjust the settings as required.
4) The conversion is done. You can now process the master track with effects such as equalization, compression and limiting.
5) Render the audio: choose Stems in the menu, set the channel count to 9 manually (4 channels for 1st order) and press Render.
6) Next, bind the audio track to the video. To do this, open the FB360 Spatial Workstation and select:
- Output Format: FB360 Matroshka (Experimental)
- Spatial Audio: B-format (2nd order ambiX)
- Spatial audio file: select the 9-channel audio file
- Head-Locked Stereo: leave empty, or choose a stereo audio file if needed
- Video file: select the video file
- Press Encode
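Before encoding, it can help to confirm that the file rendered in step 5 has the channel count expected for its Ambisonic order. The sketch below uses Python's standard `wave` module. The helper names and the file name in the usage note are hypothetical. The `wave` module reads only PCM WAV files, so it may reject other renders (for example 32-bit float, or extensible headers on older Python versions).

```python
import wave


def wav_channel_count(path):
    """Return the number of channels stored in a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnchannels()


def matches_order(path, order):
    """Check that the file has (order + 1)^2 channels, e.g. 9 for 2nd order."""
    return wav_channel_count(path) == (order + 1) ** 2
```

For example, `matches_order("render.wav", 2)` should return True for a 9-channel 2nd-order render, and `matches_order("render.wav", 1)` for a 4-channel 1st-order one.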
You can see and hear how this works in DeoVR here: The Conscious Outlaws - I'd Rather Be Homeless Than Home with You (acoustic version)
Find full spatial audio DeoVR specs at https://deovr.com/app/doc#spatial