Public defence in Acoustics and Audio Signal Processing, M.Sc. Christoph Hold
Public defence from the Aalto University School of Electrical Engineering, Department of Information and Communications Engineering
The title of the thesis: A Parametric Spatial Audio Compression Codec for Higher-Order Ambisonics
Thesis defender: Christoph Hold
Opponent: Dr. Jan Skoglund, Google, USA
Custos: Prof. Ville Pulkki, Aalto University School of Electrical Engineering, Department of Information and Communications Engineering
This thesis explores the coding of spatial audio content, which is relevant to various technology and entertainment sectors, including music production, virtual reality, teleconferencing, and other media applications that demand high-quality spatial audio experiences. The research specifically focuses on Higher-Order Ambisonics (HOA), a format that represents audio scenes in the spherical harmonic domain (SHD).
The primary goal of this research was to develop a spatial audio codec capable of overcoming the challenges associated with high channel counts and data requirements, thereby making high-quality spatial audio content more accessible. As the demand for high channel-count spatial audio increases within the technology and entertainment industries, the need for more effective coding techniques becomes crucial.
The study introduced a modified spherical harmonic transform strategy for HOA signals, enabling directional analysis, modification, and reconstruction. A key result was the development of an audio compression strategy that achieves perfect reconstruction of low-order SHD components while using parameterized resynthesis for higher-order components. Additionally, the research developed SHD post-processing techniques to enhance the output quality of the codec based on input parameterization.
The main achievement of this study was the creation of an advanced spatial audio codec for HOA, which significantly improves upon traditional multichannel audio codecs. The developed codec maintains high perceptual quality while reducing the required transport data to a small percentage of the original input audio data. The research also proposes the inclusion of input parameterization side-information, allowing for efficient coding of high channel counts without sacrificing quality.
In conclusion, this research advances the state of the art in spatial audio coding for the HOA format. The quality and efficiency of the proposed audio codec have the potential to support broader adoption and integration of HOA in diverse media applications.
Keywords: Spatial audio, Ambisonics, Audio coding
Thesis available for public display 10 days prior to the defence at: https://aaltodoc.aalto.fi/doc_public/eonly/riiputus/
Contact:
[email protected]
Doctoral theses in the School of Electrical Engineering: https://aaltodoc.aalto.fi/handle/123456789/53