The field of media technology is changing rapidly. Technical advancements like UHD, VR and IP-based streaming are challenging not only established AV technologies but also the technical side of subtitles and captions. How can existing requirements still be met and how can new technical opportunities be used for innovative services? Without doubt, the subtitling tech community is at a turning point.
The SubTech 1 symposium is deeply grounded in the belief that conversion and harmonization are still possible.
Meet engineers and technical experts from standards organizations, studios, broadcasters, subtitling vendors and streaming providers. See what has been established, what is coming up and how to manage the transition.
Opening Remarks: Michael Hagemeyer (IRT, Managing Director)
Host: Andreas Tai
Frans de Jong
European Broadcasting Union (EBU)
How has subtitling become so popular in Europe? What are the current challenges, especially for broadcasters?
Despite the development of newer and more capable subtitling standards, US closed captioning remains a hot topic for subtitling.
This presentation will explore the various technical and legal reasons behind the continuing importance of US closed captioning standards, and what new standards will need to do in order to eventually replace the current standards.
Korean Broadcasting System (KBS)
In South Korea, terrestrial UHD broadcasting using the ATSC 3.0 standard officially began at the end of May 2017. Since the HD and UHD programs are the same except for their video quality, the UHD programs reuse captions from the HD programs, which are produced using live stenography. We implemented the ATSC 3.0 closed captioning system for both MMT (MPEG Media Transport) and ROUTE (Real-time Object Delivery over Unidirectional Transport), the transport protocols in the ATSC 3.0 standard, and verified it by testing on commercial ATSC 3.0 UHD TV receivers.
Peter Cherriman, Frans de Jong, Pierre-Anthony Lemieux, Jason Livingston, Stefan Pöschel, Martin Schmalohr, Peter tho Pesch
Sandflow Consulting (supported by MovieLabs)
The IMSC family of W3C Recommendations is an application of the Timed Text Markup Language (TTML) for worldwide subtitles and captions on the Internet and the Web. It is the result of an international collaboration, and brings together popular profiles of TTML, including EBU-TT-D and SMPTE-TT. The initial version of IMSC (IMSC 1) was published in early 2016, and has since been widely adopted across the entertainment ecosystem, including by the SMPTE Interoperable Master Format, MPEG CMAF, ATSC A/343, and DVB A174. A complete test suite and multiple open source implementations are available.
This session presents an overview of IMSC, and discusses its roadmap, including the ongoing development of IMSC 1.1, which adds support for HDR, advanced Japanese language features and stereoscopic 3D.
20th Century Fox
Creating an archive of subtitles for a library of content is a difficult task, especially when working with so many different subtitle vendors. This presentation will discuss how Fox creates consistent IMSC 1.0 subtitle files regardless of which vendor the subtitle originates from. The IMSC 1.0 specification and the style constraints Fox has chosen have allowed them to repurpose the same subtitle files for archival workflows that require stylized subtitles and for downstream platform workflows where the subtitle rendering is variable. This presentation will also discuss why these subtitles are appropriate for use in SDR and HDR dynamic ranges.
Subtitles on all ARD channels and catch-up videos can be customized by the user, offering viewers with hearing impairments increased enjoyment of television content. The presentation provides an overview of HbbTV subtitling standards and introduces a real-world implementation for subtitles on interactive TVs based on HbbTV technology. We will show our implementation and its functionalities, describe the technical details, and explain how other TV stations can benefit from this technology by using our development for their own personalized subtitles.
Subtitles are often treated as an afterthought rather than as first-class citizens. In this talk Marc Brelot from GPAC Licensing will explain how the muxing of TTML in GPAC MP4Box was implemented. This is a unique opportunity to look back and design better tools for distributing subtitles.
An overview of subtitle format implementation in the VLC media player, with a special focus on new subtitle formats.
Red Bee Media
ASR has been used in caption production since 2001 to produce live subtitles through speaker-dependent respeaking. For the majority of this time, fully automated speaker-independent ASR has been unable to deliver sufficient quality for production use. In recent years we have finally reached a stage where speaker-independent ASR is able to bring productivity benefits for offline caption production. It is increasingly being looked at in the live space as well. How can an organisation identify the best ASR for a given task? How do you incorporate these processes into current production methods and entirely new solutions?
This presentation will illustrate the challenges in integrating these new technologies and the path of future development. It will also cover the use of related AI processes like alignment, NLP and machine translation.
Since the introduction of closed captions on analogue television systems around 40 years ago, the technology available to consumers has been completely transformed, yet the technology we use in the broadcast chain to deliver subtitles has changed surprisingly little.
In this presentation we will look at the audience benefits of adopting new standards and consider the impact of doing so throughout the broadcast chain, using the BBC as a case study.
BBC Research & Development (BBC R&D)
The authors present work undertaken to verify the transform for use with HLG high dynamic range video given in Appendix P of the W3C Timed Text Markup Language version 2 working draft specification, which is likely to be used for subtitling in traditional and IP-based video services. They also present test results investigating whether the perceived brightness of subtitles varies when the TTML conversions are used and the subtitles are displayed over different HDR video content.
Peter tho Pesch
Institut für Rundfunktechnik (IRT)
When subtitles are provided in a 360° environment, they are typically fixed at the bottom of the screen and centered horizontally. This is fine for common 2D TV, but may not be the best strategy for 360° media.
The presentation will show alternative approaches for the presentation of subtitles in 360° and VR media, including first results from the EU-funded project "Immersive Accessibility". The main focus will be on the technical aspects of this (new) environment. New features also throw up new challenges for implementations. What are the technical gaps and implementation options we have today? The presentation will address these questions and give an outlook on what may come.
Jerome Blanc is EVP Compression Products at Anevia. With a strong technical background and deep knowledge of video compression, he co-developed Allegro DVT’s first H.264 video transcoder, which later became Keepixo, now part of the Anevia group. Jerome holds an Engineering Degree and a PhD in computer vision and image processing.
Dr. Marc Brelot is a senior engineer at GPAC Licensing with 20+ years of experience. Marc has a PhD in Multimedia. He co-founded several companies in the field of video production and HbbTV.
BBC Research & Development
Dr. Peter Cherriman is a Senior R&D Engineer at BBC Research and Development’s London laboratory in the UK. His main areas of work include access services, PSI/SI signalling and digital television testing. He has been an active member of DVB for many years and currently chairs the subtitle technical working group TM-SUB.
Frans de Jong holds a Master's degree in Information Theory from Delft Technical University. He has worked in the media industry all his life, both in hands-on roles (video editor, broadcast engineer) and in development roles (system architect, technical consultant). Since 2003 Frans has worked at the EBU’s technical unit, focusing on production technology topics such as (U)HDTV, System Integration, Quality Control, Loudness, and Access Services.
Jean-Baptiste Kempf is President of the VideoLAN non-profit organization and one of the lead developers of the open source VLC media player.
He is a 34-year-old French engineer and has been part of the VideoLAN community since 2005. Since then, he has worked on or led most VideoLAN-related projects, including VLC for desktop, the relicensing of libVLC, the ports to mobile operating systems, and various multimedia libraries like libdvdcss or libbluray.
He also created the legal entity of VideoLAN, a French non-profit organization, in 2008.
Jean-Baptiste has also been working in various video-related startups, and founded VideoLabs, a company focusing on open source multimedia technologies.
Yunhyoung Kim received his Bachelor of Science in Electrical and Computer Engineering from Sungkyunkwan University in 2004, and his Master of Science in Computer Science from the Korea Advanced Institute of Science and Technology (KAIST) in 2007.
He has been a research engineer at the Technical Research Institute of the Korean Broadcasting System (KBS) since 2007. He participated in standardizing several DTV standards such as Open Hybrid TV (OHTV; the hybrid broadcast broadband TV standard in Korea) and Assistive Services for the Vision and Hearing Impaired. His research interests are efficient metadata management and content delivery via both the terrestrial broadcasting network and the Internet. He has also been a Ph.D. student in Information and Industrial Engineering at Yonsei University since 2014, with an interest in the economic analysis of the content business.
20th Century Fox
Dave Kneeland, Director of Technical Development and Support for 20th Century Fox, has 14 years of experience in digital media post-production. In his current role, he writes and distributes the file-based specifications for the post-theatrical workflows.
He also works with all of the vendors who create files on behalf of Fox, and ensures that they meet the high standards of the studio regardless of what file type they are creating.
Dr. Pierre-Anthony Lemieux is a Partner at Sandflow Consulting, where he works with Hollywood and Silicon Valley clients on worldwide standards and software R&D. His expertise covers the entertainment technology ecosystem, from content authoring to consumer devices, including audio, video, timed text, file formats, and content protection. In recent engagements he has represented clients at industry forums, authored specifications, and implemented software libraries for media processing. He currently chairs SMPTE TC 35PM, co-chairs the HTJ2K activity within JPEG (ISO SC29 WG1), and serves as editor of the W3C IMSC family of Recommendations for worldwide subtitles and captions.
Pierre-Anthony has a Ph.D. in Physics from UCLA and a B.Sc. from McGill University. He is a SMPTE Fellow.
Jason Livingston has been developing closed captioning and subtitling software and standards since 2008. He participated in the SMPTE 2052 standards committee to develop a solution for US closed captioning requirements, and helped with the FCC's legal efforts to mandate caption quality standards for broadcast and accessibility standards for Internet video. His team won a 2015 Technology and Engineering Emmy Award for their work on Standardization and Pioneering Development of Non-live Broadband Captioning, and today their software lies at the heart of many TV broadcasters’ closed captioning solutions for both broadcast and broadband.
Hewson Maxwell leads a small R&D team for Red Bee Media focused on the integration of Automatic Speech Recognition, Deep Learning, and other emerging technologies into Access Services production. He has worked in Broadcast Technology for over 15 years in a variety of Engineering and Technical Management roles.
BBC Design & Engineering
Nigel is a product manager in the BBC, looking after the engineering strategy for making audiovisual content accessible, including subtitles or closed captions, audio description and signing. He has been involved in this area for over five years, working on large and small procurements, delivery of internal facing technology changes, and contributing the BBC's expertise and knowledge of the UK audience's accessibility requirements to open standards groups.
He currently co-chairs the W3C's Timed Text Working Group and the EBU's Subtitles in XML group, and has contributed to several other international standards groups. Before moving into the field of access services, Nigel worked as an enterprise architect in the BBC's corporate technology centre, and led development of an EPG data authoring system for the BBC's Research and Development department.
Dr Mario Montagud is a Senior Researcher at i2CAT Foundation (Barcelona, Spain) and a part-time Professor at the University of Valencia (Spain). He received a PhD degree in Telecommunications (Cum Laude Distinction) at the Polytechnic University of Valencia (UPV, Spain) in 2015. His topics of interest include Computer Networks, Interactive and Immersive Media, Synchronization and QoE (Quality of Experience). Mario is (co-)author of over 60 scientific and teaching publications, and has contributed to standardization within the IETF (Internet Engineering Task Force). He is a member of the Technical Committee of several international conferences, co-organizer of the international MediaSync Workshop series, and a member of the Editorial Board of international journals. He is also lead editor of “MediaSync: Handbook on Multimedia Synchronization” (Springer, 2018) and Communication Ambassador of ACM SIGCHI. He is currently involved in the following EU H2020 projects: ImmersiaTV, VR-Together and ImAc.
Johannes Schmid co-founded MIT-xperts GmbH in 2001 after receiving his master's degree in computer science from TU München. After working for IRT, MIT-xperts now concentrates on products for broadcasters (interactive TV playout, EPG playout, conformance recording and analysis), while still maintaining a close relationship with IRT.
Institut für Rundfunktechnik (IRT)
Peter tho Pesch graduated from Emden UAS in 2008 with a Dipl.-Ing. degree in Media Technology.
During his studies, he focused on signal processing and broadcasting and gathered professional experience during several practical placements between 2005 and 2008.
In April 2011, he joined IRT and has since participated in various EU projects including HBB4ALL, where IRT is leading the subtitling pilot activities. As part of IRT's Accessibility Team he is now responsible for IRT's activities in the field of accessibility services in immersive media, with a focus on subtitling. Since October 2017 Peter tho Pesch has collaborated with various European partners in the EU project “Immersive Accessibility”, where he develops new approaches for accessibility services in immersive media.
Remo Vogel is a research engineer at RBB Innovationsprojekte, a small research & development unit within the ARD group. For RBB and ARD he acted as technical coordinator for various online services with a focus on innovative projects. Remo has expertise in online video production and delivery, and was responsible for RBB's media library. Remo is a supporter of technological and media convergence, especially towards online services on television, a concept he actively pursues in connected TV & radio projects. He is currently participating in European Union research projects and internal RBB/ARD projects where his main area of focus is the adoption of cutting-edge technologies into the world of broadcast.
Institut für Rundfunktechnik (IRT)
Andreas Tai is an engineer and project manager at the Institut für Rundfunktechnik in Munich. He works on subtitle technology and access services. He also chairs the EBU Timed Text Group and is responsible editor of the EBU-TT Part 1 and EBU-TT-D specifications. Andreas has a diploma in political science and a Master’s Degree in Applied Informatics.