Processing for the future of broadcast audio
- With Kevin Hilton
Audio processors are now expected to cope with more channels while ensuring broadcast sound signals keep to loudness guidelines. Contributing Editor Kevin Hilton looks at how current technology is keeping pace and what broadcasters now expect from their devices...
The basic principle of audio processing - to improve or maintain sound quality while controlling the volume level on the main broadcast output - has remained constant from the earliest days of television and radio to the present day. Advances in technology have boosted the sophistication of the procedure, notably the introduction of DSPs (digital signal processors) in both professional and consumer equipment, which has brought more flexibility and control.
DSP and other new techniques have improved not only the quality of the sound but also the ease of operation in broadcast centres and playout facilities. They are also enabling broadcasters and facility staff to deal more efficiently with the distribution technologies and regulatory requirements introduced in TV and radio over the past decade and more.
Key amongst these have been audio over IP (AoIP) and loudness control, each bringing additional issues for broadcasters to consider, particularly as the move towards streaming has further complicated programme distribution and changed viewing habits considerably. "As AoIP becomes increasingly common, the number of audio channels available in a production environment has greatly increased," comments Kyle Wilken, Vice President of Firmware at Cobalt Digital.
Wilken adds that this gives "more creative flexibility", which can be an advantage as broadcast organisations now have to offer more options for viewers with specific needs. "For example, there is a requirement to maintain compliance with the 21st Century Communications and Video Accessibility Act," he explains. "Local stations may need to overlay audio descriptions of locally inserted information, such as text crawls. We continue to support various audio codecs, for both encoding and decoding, and ensure that we're able to carry sufficient numbers of channels."
Orban is best known for its OPTIMOD range of radio audio processors but has also produced units for television. These include the now discontinued OPTIMOD 8685 surround sound loudness controller. Peter Lee, Senior Vice President of Global Sales and European Operations for Orban, comments that a new TV processor will be introduced during 2024 but it will be a "purely stereo and SDI" unit, although it might also feature AES67 AoIP interoperability.
While AES67 features as part of the new OPTIMOD 5950 radio processor, along with SMPTE ST 2110, AES3, composite and digital MPX (the FM stereo multiplex signal) and analogue inputs and outputs, Lee observes that AoIP and surround sound are still not fully adopted in TV. "Many stations are SDI, not even HD-SDI," he says. "In the Philippines and Indonesia, where we supply a lot of TV audio processors, it's all SD-SDI, more or less analogue. Even in the Netherlands, it took a long time to go HD-SDI, with upscaling from analogue. Every country is different but the Netflixes, Amazons and Apple TVs have forced people to get higher quality."
As it is for other audio processor developers, loudness control is now a major consideration for Orban, not only in TV but also FM and DAB radio, as well as for streaming. "We're all bound to maximum levels these days," Lee says. "It all has to be controlled. In radio and streaming it's actually better controlled than in television, where you get programme leaders [promos and trailers] that are so loud. What surprises me is that if you've got four stations on one network, each one has a different level."
Larry Schindel, Product Manager at the Telos Alliance, observes that loudness processing standards such as ATSC A/85 and EBU R128 have been applied to over-the-air (OTA) broadcasts for more than a decade. "Overall, broadcasters have things well in hand for their terrestrial signals," he comments. "What has changed in recent years is the growth in streaming platforms. Normalising streams to the same levels as OTA and OTT content is fine if viewers watch everything on the same device - typically a TV set - but -23 or -24 LUFS [Loudness Units relative to Full Scale] isn't loud enough on a mobile device, especially in a noisy environment."
Schindel notes that "most of the major streaming platforms" have now settled on -16 LUFS, which puts them closer to podcasts (-18 LUFS) than OTA broadcasts (-23 or -24 LUFS, depending on region). Streaming has also brought new challenges in terms of where audio processing is carried out along the distribution chain. "For pure terrestrial OTA signals, those that go from a transmitter over the air to an antenna in the viewer's home, processing is still one of the final stages in the transmission path," he explains. "Those signals are generally very well controlled in terms of loudness. With MVPD [multi-channel video programming distributor] services, such as cable and satellite, or OTT delivery methods, things get more complicated, primarily due to local ad insertion, where the station's original commercial content is replaced by provider-specific inventory. Since that occurs downstream from - and independently of - the OTA signal, it introduces the potential for loudness shifts if the inserted content isn't properly managed. The same is true for local stations that stream their content, as online ads are often geo- and user-targeted and aren't always as carefully produced or managed."
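Because LUFS is a dB-style scale, moving a programme between those platform targets is, at its simplest, a fixed gain offset. The following minimal sketch in Python uses the figures quoted above; in practice the measured integrated loudness would come from an ITU-R BS.1770 meter rather than being assumed:

```python
# Minimal sketch: retargeting integrated loudness between platforms.
# Target figures are those quoted in the article; the measured value
# is an assumed example, not a real measurement.

TARGETS_LUFS = {
    "ota_ebu_r128": -23.0,   # EBU R128 (Europe)
    "ota_atsc_a85": -24.0,   # ATSC A/85 (US)
    "podcast": -18.0,
    "streaming": -16.0,
}

def gain_to_target(measured_lufs: float, target_lufs: float) -> float:
    """Static gain in dB needed to move a programme's integrated
    loudness onto the target; on a dB-style scale this is a
    simple subtraction."""
    return target_lufs - measured_lufs

# Example: an OTA master at -24 LUFS repurposed for a -16 LUFS stream.
offset_db = gain_to_target(-24.0, TARGETS_LUFS["streaming"])
linear = 10 ** (offset_db / 20)          # dB to linear amplitude
print(f"Apply {offset_db:+.1f} dB (x{linear:.2f} amplitude)")   # +8.0 dB
```

A static gain alone can push true peaks over a platform's ceiling, which is why real loudness processors follow the gain stage with a true-peak limiter.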
Orban's main point of processing is at the head-end, with the signal going into the encoders and then on to the output distribution paths. "All the processing is done in the studio," says Peter Lee. "We make sure there are no overshoots and that it all sounds good at very low bit rates. The bit rates are so low today but people want to broadcast stereo at a high quality. So, if you're running AAC+ at 44kb/s through an OPTIMOD, you can still get a fairly good hi-fi quality. For television we have the OPTIMOD 1101s or the 6300, both of which are used by many TV channels."
The ability to process audio signals at different locations on a network has long been applied at the programme-making stage of broadcasting. Many audio mixing consoles now place the control surface in the studio or control room while the processing racks sit in a separate apparatus area, which could be some way away. Lawo is among the console companies that have pioneered this approach, starting with the mc² range of desks and now expanded further through the A__UHD Core.
"With this separation in place - and provided both the console and the processing core can communicate with each other over IP - a signal ingested in location A can be mixed in location B, which actually means the mixing commands generated there are executed by the DSP core in location C," explains Christian Struck, Senior Product Manager for Audio Production at Lawo.
"Open-standards IP furthermore allows a given signal to be transmitted to a variety of destinations for individual processing. Thanks to multicast streams, this no longer involves analogue - or digital - signal splitters. Depending on the expected deliverable, it may be necessary to tailor a mix to specific specifications - such as slightly differing LUFS settings for the various loudness formats. But do not expect as many mixes as there are delivery formats: this would run counter to the desire to produce more with less."
Lawo's audio processing has been software-based - running on FPGA (field programmable gate array) hardware - since 2018. Struck comments that, as well as this, "the cloud is probably at the back of everybody's mind", although there is still some uneasiness about latency due to the ingress and egress stages involved. Kyle Wilken at Cobalt Digital acknowledges that "a lot of audio processing is now done in software" but also highlights latency as a consideration: "This is especially the case for live production, which means that dedicated on-premises hardware is often still employed."
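The latency concern is easy to quantify with a back-of-envelope budget. The figures in this sketch are assumptions for illustration only; real numbers depend on the codec, the network path and the processing block size:

```python
# Rough round-trip latency budget for off-premises audio processing.
# All figures below are assumed values, not measurements.
SAMPLE_RATE = 48_000        # Hz, typical broadcast rate
BLOCK = 256                 # samples per processing block

block_ms = BLOCK / SAMPLE_RATE * 1_000   # ~5.3 ms buffering per block
rtt_ms = 20.0                            # assumed studio-to-cloud round trip
coding_ms = 2 * 5.0                      # assumed ingress + egress codec delay

total_ms = block_ms + rtt_ms + coding_ms
print(f"Round-trip budget: {total_ms:.1f} ms")   # ~35.3 ms in this sketch
```

Even with optimistic assumptions, the total sits an order of magnitude above what dedicated on-premises DSP achieves, which is why live production still favours local hardware.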
Larry Schindel at Telos observes that software is now increasingly either replacing hardware systems or being used in conjunction with them. "Processing for recorded content is easily handled by software, such as our Minnetonka AudioTools Server platform," he says. "It can be run on a local server or in the cloud. Cloud-based hosting is an attractive choice for users who would rather not buy and maintain their own hardware. Processing in the cloud is also scalable, meaning it's easy to spool up extra resources when needed.
"Managing loudness with software also means hardware-based transmission-path hardware like our Linear Acoustic AERO-series processors don't have to work nearly as hard, allowing broadcasters to back off on the overall processing. Real-time processing isn't going away, though. It remains necessary for live local content and as an overall safety net before transmission and distribution."
Today's audio processor may no longer be a reassuring physical presence in the control room or mixing suite, but it is more essential than ever in ensuring good sound quality and consistent output levels across the wide spectrum of distribution chains that make up modern broadcasting.