The other day, a peer in the streaming community reached out via our Unified Certified Slack channel (our forum for exchanging advice with our software course grads). He had a question about live transcribed subtitles.
He’d managed to get his subtitles ingested alongside the video, but he couldn’t get his subtitle fragments to line up exactly with the segments coming out of the video encoder. And that kind of alignment is important. Complicating matters, there were SCTE markers (often called Timed Metadata) involved. These enable monetization models such as AVOD and FAST.
The streaming dev knew that Unified Packager can split and merge subtitle cues across segment boundaries, when packaging the original WebVTT or TTML into the desired presentation format.
[Side note: we don’t recreate subtitle tracks from scratch, since all styling information and editorial changes should be made before packaging using the relevant encoder or subtitle tooling. Just some best practices there.]
So the person’s question was:
Can Unified Streaming’s software align or “sync” the segment boundaries of the subtitle and video track when packaging?
And our answer was:
Yes it can!
So we got into it with him. Because, let’s be honest, when we collaborate with customers, partners, and our Certified community members, we’d much rather their story be a success story than not.
We dove into the intricacies of the issue: aligning segment boundaries between video and subtitle tracks. The discussion revolved around the testing of a workflow. We focused particularly on ensuring that the segment boundaries of the text track align with the target output segment length.
The testing phase
The dialogue pushed on. We inquired: had his workflow been thoroughly tested with Unified Origin (live)? And did any issues arise concerning the alignment of text track segment boundaries with the defined target output segment length?
(Typically, when we’re trading info with a client to get to the heart of the problem, we volley questions at our collaborators to whittle down the causes.)
The Unified Certified grad said he had tested the workflow, providing an example that showcased a video track with 1.92-second segments and an output fragment length set to 7.68 seconds. He also shared a subtitle track, ingested as 1‑second segments.
A brief discussion followed, about the alignment of splice points and segment boundaries. [Just for kicks: a splice point is a specific time in a stream that corresponds to an IDR (Instantaneous Decoder Refresh) frame that is signaled as a sync-sample. The splice point offers the opportunity to switch the livestream to a different clip, seamlessly. Splice points can be used to cue (ad) replacement and insertion opportunities.]
A rare moment of confusion
At one point, our friend expressed uncertainty about the correctness of splicing, contemplating a potential issue. After further testing with a splice point on a non-exact second timestamp, though, he concluded that the splicing was indeed correct.
Segment length considerations
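Then he asked about the possibility of outputting subtitle segments at 1.92 seconds or 0.96 seconds. We emphasized the importance of matching the incoming subtitle segment length to the audio/video segment length to ensure synchronized timelines. He proposed using 0.96-second subtitle segments. This worked well: 1.92 and 7.68 seconds are both whole multiples of 0.96 seconds (2 and 8 times, respectively), whereas 1.92 is not a whole multiple of the original 1-second segments. And Unified Streaming products can accurately split and merge input segments into target segment lengths that are multiples of the input length.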
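For the arithmetic-minded, here is a minimal Python sketch of that "whole multiple" check, using the segment lengths from this conversation. The `merge_factor` helper is our own hypothetical name for illustration, not part of any Unified Streaming API. It uses exact fractions so that values like 0.96 don't pick up floating-point noise.

```python
from fractions import Fraction


def merge_factor(input_len: str, target_len: str):
    """Return how many input segments fit in one target segment,
    or None when the target is not a whole multiple of the input.

    Lengths are passed as decimal strings so they convert exactly.
    (Hypothetical helper, for illustration only.)
    """
    ratio = Fraction(target_len) / Fraction(input_len)
    return ratio.numerator if ratio.denominator == 1 else None


# 0.96-second subtitle segments against the video and output lengths.
print(merge_factor("0.96", "1.92"))  # 2
print(merge_factor("0.96", "7.68"))  # 8

# The original 1-second subtitle segments do not divide 1.92 evenly.
print(merge_factor("1", "1.92"))  # None
```

A result of `None` is the hint that the subtitle segment boundaries will drift relative to the video segment boundaries.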
Those ol’ timestamp alignment challenges
The conversation also touched on a subtler challenge: two tracks can use compatible segment lengths and still have boundaries that fall on different timestamps. He highlighted the desire for both the audio/video encoder and the subtitle ‘generator’ to create segments that start at multiples of the segment length since the Unix epoch. That gives both tracks a common timeline relative to the Unix epoch.
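To make the epoch idea concrete, here is a small Python sketch under our own assumptions: times are seconds since the Unix epoch, and the helper names (`epoch_aligned_start`, `is_boundary`) and the example timestamp are hypothetical, chosen only for illustration. It shows that when both tracks snap to multiples of their segment length since the epoch, every 1.92-second video boundary is also a 0.96-second subtitle boundary.

```python
from fractions import Fraction


def epoch_aligned_start(t, segment_len):
    """Start of the segment containing time t (seconds since the Unix epoch),
    assuming segments start at whole multiples of segment_len since the epoch."""
    t, seg = Fraction(t), Fraction(segment_len)
    return (t // seg) * seg


def is_boundary(t, segment_len):
    """True if t falls exactly on an epoch-aligned boundary for segment_len."""
    return Fraction(t) % Fraction(segment_len) == 0


now = "1700000000.5"  # arbitrary example wall-clock time
video_start = epoch_aligned_start(now, "1.92")

# A 1.92 s video boundary is always a 0.96 s subtitle boundary...
print(is_boundary(video_start, "0.96"))  # True
# ...but this one is not a boundary for 1-second subtitle segments.
print(is_boundary(video_start, "1"))  # False
```

The point of anchoring to the epoch is that the encoder and the subtitle generator don't need to talk to each other: each one can compute the same grid on its own.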
Potential issues, solutions
We told our peer about a potential issue with splicing, where rounding could produce zero-duration WebVTT cues if the splice point falls exactly at the end of a cue. (This issue, however, was addressed in release 1.12.12.) We also notified him about a previous issue related to the packaging of content containing both SCTE 35 split opportunities and subtitles, which we recommended he factor into his testing.
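For anyone building their own subtitle tooling, here is a minimal Python sketch of one way to avoid that kind of zero-duration cue. To be clear, this is not how Unified Packager does it (the details of the 1.12.12 fix aren't covered here). It is just an illustration under our own assumptions: a millisecond timescale, and hypothetical `Cue`, `to_ticks`, and `split_at_splice` names. The idea is to convert times to integer ticks once, then split only when the splice point lies strictly inside the cue.

```python
from typing import NamedTuple

TIMESCALE = 1000  # ticks per second (milliseconds); an assumption for this sketch


class Cue(NamedTuple):
    start: int  # in ticks
    end: int    # in ticks
    text: str


def to_ticks(seconds: float) -> int:
    """Convert seconds to integer ticks once, so later comparisons are exact."""
    return round(seconds * TIMESCALE)


def split_at_splice(cue: Cue, splice: int) -> list:
    """Split a cue at a splice point, never emitting a zero-duration piece."""
    if splice <= cue.start or splice >= cue.end:
        # Splice is outside the cue or exactly on one of its edges: no split.
        return [cue]
    return [Cue(cue.start, splice, cue.text), Cue(splice, cue.end, cue.text)]


cue = Cue(to_ticks(5.76), to_ticks(7.68), "Hello")

# Splice exactly at the end of the cue: the cue comes back untouched.
print(split_at_splice(cue, to_ticks(7.68)))

# Splice in the middle: two pieces, both with a positive duration.
print(split_at_splice(cue, to_ticks(6.72)))
```

Comparing integers rather than floating-point seconds is what keeps a splice "exactly at the end" from turning into a sliver of a cue.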
He acknowledged having observed glitches on Apple devices and expressed relief that the current checks seemed satisfactory.
Summing the sync up
The discussion concluded with both parties acknowledging (and dare we say, appreciating?) the complexities of ensuring seamless synchronization between video and subtitle tracks.
Of course, at Unified we are committed to monitoring and addressing potential issues to guarantee a smoother playback experience. And, yes, we’re committed to helping people with the ins and outs of complex streaming processes.
Aligning segment boundaries and timestamps is a nuanced thing. Since it requires collaboration, we talk things through until a solution’s found.
Things come up. Always. What we aim for, then, is what we deem the best-case scenario: a reliable and stable workflow, and everyone — from content deliverers to end users — satisfied and happy.
Tell us what you’re syncing
Have a question about sync issues? Looking for streaming products and solutions that work? Contact sales@unified-streaming.com.
More of a DIY-er? Respect! Book a personal demo here with one of our experts. Free trials are available here.
Or wait — you wanna get certified using our portfolio? That can happen for sure. Here.