Inclusion and Accessibility

  • With Adrian Pennington

Inclusion and accessibility takes centre stage

Over the past few years, the film and television industry has finally begun to focus wholeheartedly on inclusion and accessibility. As a result, demand for high-quality captions and subtitles, and viewer expectations of them, continue to increase across all platforms and content.

Broadcasters, producers, content creators, and streamers are looking for expertise, fast turnarounds, superior customer service, and a partner who can help them keep current with regulations. To meet this rising need, Take 1 recently became part of the Verbit family of companies and has partnered with VITAC, North America’s largest captioning provider, to bring greater access (captions and subtitles as well as transcription, audio description, and dubbing) to the media and entertainment sector. 

“The combination enables us to offer a variety of new, enhanced, and expanded services and products, including live captioning; to keep clients up to date with the latest technologies and innovations; and to work as a true one-stop shop for all access needs,” says Louise Tapia, Take 1 CEO. 

“Getting all your accessibility requirements from one provider means working with one point of contact, one billing department, and most importantly, receiving consistent quality in a single workflow from deliverable to deliverable and from project to project. Using one vendor can also save money, as many offer volume discounts for larger projects or when multiple services are ordered.” 

Stanza is the captioning and subtitling software application from Telestream. It was created to address the challenge of the high initial cost of obtaining a broadcast-quality captioning tool, explains Ryan Iorns, Captioning Product Manager. 

“Stanza provides a low-cost entry point for organizations requiring high-end captioning capabilities by offering a subscription-based business model.” 

To help address the challenges of remote working, the product’s client-server deployment model allows captioning editors to work from any location via a simple browser-based editing console, regardless of where media files are stored. 

“Stanza uses the Telestream GLIM engine to play back original high-res media instantly without any need to waste time transferring huge files across networks (on-prem or remote) or to spend time and energy creating proxies.” 

It includes optional access to the AI-powered Timed Text Speech auto-transcription service which supports over 100 languages. Stanza also integrates with the Vantage Timed Text Flip text transcoder and processor to provide automation for captioning workflows. 

Stanza is built on the Telestream Media Framework, the same technology that underpins several Telestream products and services such as the Vantage Media Processing Platform. The Media Framework includes format and container support developed from over twenty years of experience, having been tested in some of the most challenging broadcast use cases around the world. 

Stanza uses the advanced IMSC 1.1 profile of TTML as its native format, and supports complex Unicode scripts, bidirectional text, vertical text layouts, ruby annotations, and other features needed for subtitling in all languages. 
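As a rough illustration of what a TTML-family subtitle document looks like, the sketch below builds and parses a single cue. It is a minimal, invented example, not a full IMSC 1.1 conformance document.

```python
import xml.etree.ElementTree as ET

# Minimal TTML-style subtitle document (illustrative only; real IMSC 1.1
# documents carry styling, regions, and profile metadata on top of this).
doc = """<?xml version="1.0" encoding="UTF-8"?>
<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="ja">
  <body>
    <div>
      <p begin="00:00:01.000" end="00:00:03.500">こんにちは</p>
    </div>
  </body>
</tt>"""

root = ET.fromstring(doc)
ns = "{http://www.w3.org/ns/ttml}"
cues = root.findall(f".//{ns}p")
print(len(cues), cues[0].get("begin"))
```

Because TTML is XML with a defined namespace, any standard XML tooling can extract the cues, which is part of why it works well as a native interchange format.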

And Stanza supports all modern export formats, including captions embedded into media files, subtitle overlays and burn-ins, SCC and MCC caption files, and TTML, WebVTT, SRT, and EBU-STL subtitle files. 
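To make the differences between two of those export formats concrete, here is a minimal sketch serialising the same invented cue as SRT and as WebVTT. The visible differences are WebVTT's `WEBVTT` header and its use of a period rather than a comma before the milliseconds.

```python
def fmt(ms: int, sep: str) -> str:
    """Format milliseconds as HH:MM:SS<sep>mmm (SRT uses ',', WebVTT uses '.')."""
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1_000)
    return f"{h:02}:{m:02}:{s:02}{sep}{ms:03}"

start, end, text = 1_000, 3_500, "Captions travel with the content."

# SRT cue: numeric index, comma-separated milliseconds.
srt = f"1\n{fmt(start, ',')} --> {fmt(end, ',')}\n{text}\n"
# WebVTT cue: file-level header, period-separated milliseconds.
vtt = f"WEBVTT\n\n{fmt(start, '.')} --> {fmt(end, '.')}\n{text}\n"

print(srt)
print(vtt)
```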

Most broadcasters looking to utilize captioning and subtitling have three main goals, outlines Bill McLaughlin, Chief Product Officer, Ai-Media. 

Firstly, they want to caption more content than before as they expand their offerings into over-the-top streaming. Secondly, they want to power this through APIs and the cloud, without increasing on-premises infrastructure and human workflows. And thirdly, they want to leverage new technologies to reduce per-hour caption production budgets. 

Ai-Media's end-to-end captioning and subtitling solutions allow broadcasters to tick all these boxes. Its iCap Alta IP video encoder provides a resilient workflow for captions and subtitles across both compressed and uncompressed IP video, and it's a fully virtualised, API-powered pure software system. iCap Alta integrates with Ai-Media’s Lexi automatic live subtitling solution, which offers high accuracy and reliability at a compelling price point.  

“Broadcasters can use Lexi as a 100% automated captioning solution through a SaaS subscription,” McLaughlin says.

“Or with Ai-Media’s Smart Lexi, they can leverage a hybrid automated captioning solution with added quality enhancement, management and review from our experienced broadcast services team. Lexi and Smart Lexi are also available across pre-recorded or VOD content with a simple API workflow.  

“When you add these solutions together, broadcasters finally have a complete end-to-end solution that offers full coverage, high quality captions and subtitles at low cost. And not only that, one that supports a modern cloud-based approach that allows broadcasters to fully leverage automated workflows. 

“Trusted by the world’s leading networks, Ai-Media is the perfect partner for broadcasters looking to caption and subtitle their content. Since acquiring EEG Enterprises in 2021, we have supercharged our service processes through automation, cloud and IP video to deliver ever-increasing captioning accuracy and cost-efficiency.

“Ai-Media is today a one-stop shop of captioning solutions and the only vendor that offers all the software, hardware and human services broadcasters need, in one place.” 

A growing, aging population means more deaf and hard-of-hearing people, who consume media across all kinds of platforms beyond just the regulated over-the-air model. People expect to have their content captioned, regardless of where and how they consume it. 

To aid content creators and distributors in their efforts to make content as broadly accessible as possible, ENCO is building out a powerfully scalable Cloud version of enCaption, its Automated Speech Recognition (ASR) product, while continuing to improve its on-premise and new hybrid-Cloud offerings. The Cloud version introduces a microservices-based, containerised processing and caption management environment designed to scale and flex to a myriad of Cloud-based captioning workflows. 

“What’s more, a new and highly robust API allows for third-party integration of automatic captioning into various automation architectures, MAMs, and more,” says Bill Bennett, Media Solutions & Accounts Manager, ENCO Systems. “As always, custom word libraries can be added to support uniquely spelled names or terms, through both manual and automated ingest methods.” 
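As a hedged illustration of the kind of third-party integration such an API enables, an automation system or MAM might submit a captioning job like the one sketched below. The endpoint, field names, and URLs here are hypothetical, invented for illustration, and are not ENCO's actual API.

```python
import json

# Hypothetical job payload for an ASR captioning service.
job = {
    "media_url": "https://example.com/assets/interview_0142.mp4",
    "language": "en-US",
    "output_formats": ["scc", "srt", "vtt"],
    # Custom word list so uniquely spelled names are transcribed correctly.
    "custom_vocabulary": ["enCaption", "Audimus", "Lexi"],
    # Where the automation system wants to be notified when captions are ready.
    "callback_url": "https://mam.example.com/hooks/captions-ready",
}

body = json.dumps(job).encode("utf-8")
# An automation system would POST `body` to the provider's job endpoint
# (e.g. with urllib.request or an SDK) and receive the finished caption
# files, or a download link, at callback_url once processing completes.
print(len(body))
```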

Automatic transcription works much like automatic captioning with ASR and brings the benefit of searchable text files. In on-the-spot live interviews, commentators and producers can instantly call up the transcript and skim for the key words needed to dive deeper into a story, or find those unique sound bites hidden deep within a recorded interview. 
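A minimal sketch of that kind of transcript skimming, with invented segment data: given time-coded segments, find every mention of a keyword so a producer can jump straight to the relevant moment.

```python
# Time-coded transcript segments as (seconds_from_start, text) pairs;
# the content here is invented for illustration.
transcript = [
    (12.4, "Welcome back to the studio."),
    (95.0, "The penalty decision changed the whole match."),
    (240.8, "A penalty like that is always controversial."),
]

def find_mentions(segments, keyword):
    """Return every (timestamp, text) segment containing the keyword."""
    kw = keyword.lower()
    return [(t, text) for t, text in segments if kw in text.lower()]

for t, text in find_mentions(transcript, "penalty"):
    # Print a jump-to point as MM:SS alongside the matching line.
    print(f"{int(t // 60):02}:{int(t % 60):02}  {text}")
```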

Automatic translation is also becoming increasingly crucial in an ever-shrinking world, so much so that ENCO recently acquired TranslateTV, a company specializing in fast and accurate on-premise English-to-Spanish translation. With many more languages available via the Cloud, ENCO’s enTranslate product can concurrently generate dozens of different language versions of what’s said, on the fly and in real time, for an incredibly diverse worldwide audience. 

VoiceInteraction has been continually developing its core speech processing technology while also expanding its coverage to new production and distribution workflows. In addition to an advanced new decoding strategy that increases accuracy on unprepared speech, speech translation is now produced for live sources, enabling multiple subtitle languages per source stream. 

“Our underlying proprietary speaker identification and language identification modules were overhauled to produce new classifications with lower latency and higher accuracy,” says Head of Marketing, Marina Manteiga. 

Audimus.Media has traditionally been paired with an external closed caption encoder/Teletext inserter device to add the automatic subtitles to the SDI video signal.

For markets with restricted budgets, CTA-708 captions can now be encoded directly into SDI signals while still offering a caption monitoring output, depending on the card used. And given the ongoing transition to IP-based production workflows, Audimus.Media can now operate as an ST 2110-40 captioning stream generator, with native SMPTE 2110 support. 

“To cope with market specificities, VoiceInteraction has been expanding the native formats that can be produced and sent as a contribution to MPEG-TS multiplexers or muxed by Audimus.Media into an MPEG-TS stream: DVB-Teletext, DVB-Subtitling, ARIB B24, ST 2038:202, and ETSI EN 301 775. New transport protocols have also been added: SRT and RIST can be used as input or output. 

“One of the longstanding challenges for our customers has been the distribution of live captions in their VOD platforms, with seamless audio and video synchronisation. Our latest product update combines encoded video recording with an embedded editor, exporting the subtitles into any NLE with automatic clip markings.” 

Digital Nirvana recently announced upgrades to its Trance self-service SaaS application to improve ease of use and keep pace with the latest trends in the media production environment. 

Trance 4.0 can either be integrated into a media company’s workflow or used as a stand-alone platform to generate and review transcripts with the aid of ASR and export them in various formats for a variety of use cases. 

Russell Vijayan, Director of Product at the company, explains that professional captioners can now easily convert the transcripts to time-synced closed captions using a combination of parameters as well as NLP technology to adhere to grammatical requirements. 

Users can further apply a combination of MT, lexicon algorithms, and different presets to localise the captions into different languages, and view them in a dual-pane tab for review. Enterprises will particularly benefit from new features including elaborate account management and real-time account monitoring. 

Vijayan says the enhancements to Trance are based on customer requests and give users greater capabilities and a better user experience. 

For instance, the new stand-alone Transcription app can be used to upload media assets and quickly access highly accurate, time-coded, speaker-segmented, automatic transcripts in the transcription window. Users can now get their content quickly transcribed, reviewed, and exported as a simple SRT for display as captions or time-coded VTT, JSON or other formats to ingest into various Web platforms or MAM systems. 

Automatically synced timecodes can be readjusted using the spectrogram and manual inputs, and proper nouns and grammatical elements can be adjusted automatically as required. 
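A simple sketch of one such adjustment: shifting every timestamp in an SRT timing line by a fixed offset, the kind of resync needed after trimming a recording. This illustrates the general technique rather than Trance's own implementation.

```python
import re

# Matches SRT timestamps of the form HH:MM:SS,mmm.
TS = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")

def shift(line: str, offset_ms: int) -> str:
    """Shift every SRT timestamp in `line` by offset_ms (clamped at zero)."""
    def bump(m):
        ms = (int(m[1]) * 3600 + int(m[2]) * 60 + int(m[3])) * 1000 + int(m[4])
        ms = max(0, ms + offset_ms)
        h, rem = divmod(ms, 3_600_000)
        mnt, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1_000)
        return f"{h:02}:{mnt:02}:{s:02},{ms:03}"
    return TS.sub(bump, line)

print(shift("00:00:01,000 --> 00:00:03,500", 750))
# → 00:00:01,750 --> 00:00:04,250
```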

Because different languages have different display parameters, users can now define a separate set of caption-splitting parameters for languages other than the source language. Users can also import an existing caption file to generate localised text. 

The new version also comes with automatic checks against a list of parameters, letting users identify any non-compliance with publishing platforms’ guidelines.