Wednesday, October 01, 2014

Experiences in using python and ffmpeg to transcode broadcast video

This is a rough guide to implementing a transcoding pipeline from broadcast video to multi-bitrate mobile-friendly video, using ffmpeg. No code (the code's not mine to give away).

I've successfully implemented a transcoding pipeline for producing multi-bitrate fragmented mp4 files from broadcast DVB input. More concretely, I'm taking in broadcast TS captures with either mpeg2 video and mp2 audio (SD, varying aspect ratio) or h264 video and aac audio, both potentially with dvbsub. From that I'm producing ISMV output with multiple bitrate h264 video at fixed 16:9 aspect ratio and multiple bitrate aac audio, with burned subtitles. The ISMV outputs are post-processed with tools/ismindex and with the hls muxer.

There are number of limitations in ffmpeg that I've had to work around:

  • I haven't gotten sub2video or async working without reasonably monotonous DTS. Broadcast TS streams can easily contain backward jumps in time (e.g., a cheapo source that plays mp4 files and starts each file at 0). The fix is to cut the TS into pieces at timestamp jump locations and using '-itsoffset' to rewrite the timestamps and then concatenate. I'm using the segment and concat muxers for that.
  • Sub2video doesn't work with negative timestamps, so I use '-itsoffset' to get positive timestamps
  • For HD streams, I need to scale up the sub2video results from SD to HD. Sub2video doesn't handle the HD subtitle geometries. I'm not enough of an expert to know whether that's the issue, or whether that's just the way it's supposed to work with SD subs (typical) with HD video.
  • For columnboxing, I use the scale, pad and setdar video filters. These work fine, but their parameters are only evaluated at start, so I need to cut the video into pieces with a single aspect ratio first and concatenate later.
  • Audio sync (using aresample) gets confused if the input contains errors, so I need to first re-encode audio (with '-copyts') and only after that synchronize.
  • The TS PIDs are not kept over the segment muxer, so I given them on the command line with '-streamid'.

I've automated the above steps with a bunch of python, using ffprobe to parse the video stream. The result manages to correctly transcode about 98% of the TS inputs that our commercial transcoding and packaging pipe fails to handle (it would probably do even better on the rest).

Hope this helps somebody trying to accomplish the same thing.

No comments: