Thursday, January 08, 2015

The goal of programming is not the system, but the process of keeping the system alive

This took me a long time to realize...

Over the years I've (co-)created a bunch of programs, systems and projects. Small programs, like a perl script to create synthetic training data for tesseract; and large programs like the JaikuEngine Mobile Client. Small systems, like the linux server at my mother's office (running since 1998 as a firewall, file and directory server, SSH access point and previously an email server) and medium sized systems like the 50 machines doing video transcoding and streaming at my latest contracting project. Open-source projects, like the Jaiku port to AppEngine.

I have agonized over some of them. Over the unused code, left to rot. Over the systems that are running but might have bugs. Over the programs that no longer work because their dependencies have changed too much.

However, I only really understood this autumn that the reason I'm stressed about possible bugs in the code or system configuration I've left running somewhere is that I had thought the goal of programming was to create a program or a system. I should have understood earlier that the goal of programming is to create the process through which the system is kept alive (until it no longer produces enough value to justify the maintenance). No program is perfect; if nothing else, the world around it changes.

There are many ways to keep a system alive. You can sell (or buy) off-the-shelf software with support. You can sell custom software with support. You can sell custom software to an organization that is capable of taking over the maintenance. To have an open-source project, you must be able to create a community that wants to maintain the code as the world changes; including either providing leadership or being able to attract it.

Should I program again, I at least now know what I should aim for.

BTW, I don't think I'm the only one who hadn't realized this - this is the reason for all 'legacy' systems.

Sunday, December 28, 2014

Post-its on the board are a symptom of the team knowing what they are supposed to do, not the cause of it

TL;DR: if people know what they are supposed to do from the text that fits on a post-it, things are going well. If they don't, the post-its aren't helping; you need more prose than that.

Recently I had the (mis?)fortune of working on a medium-sized (4 man-years) project with a very inexperienced team. I was supposed to be the coding architect supporting them but ended up as the project manager and tech lead.

When I started on the project (4 months in, calendar time), we had a typical Agile/scrum whiteboard with a bunch of post-its.

There were two problems with that board: 1) even if the post-its moved forward, the project didn't and 2) the developers didn't know what the post-its meant.

To be a bit more concrete, we were supposed to build the back-end of a networked personal video recorder (npvr) for a company that already had one, but one that didn't support variable bandwidth (i.e., mobile). This required knowledge of the existing back-end, systems programming, devops, video transcoding, streaming video protocols and video player functionality - none of which the team had before starting the project. To move the project from the realm of the unhappy to the absurd, none of the developers were even familiar with most of the chosen technology stack (Linux, python, RHEL, rpms, zabbix, REST, mpeg4).

So what I ended up doing was abandoning the post-its and moving the project management to a conventional ticket system (JIRA) with significant amounts of prose for each ticket - several paragraphs at least, with clear Definitions-of-Done.
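As an illustration, a ticket looked more like the following than like a post-it (this one is made up for the blog, not an actual project ticket - the endpoint and details are invented):

```
Title: Expose recording playback URLs over REST

As a client developer, I need GET /recordings/<id>/playback to return
the playback URLs for a finished recording, so the player can start
streaming without knowing our internal host layout.

Background: the URLs are composed from the streaming frontend hostnames;
see the architecture wiki page for the naming scheme. Unfinished or
deleted recordings must not leak URLs.

Definition of Done:
- endpoint returns the URLs for a finished recording
- endpoint returns 404 for unknown and unfinished recordings
- response format reviewed with the client developers and documented
- unit tests for the URL composition; packaged as an rpm and deployed
  to the test environment
```

The point is not the format but that a developer who has never seen the domain can pick this up and know what "done" means.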

If that sounds stupid to you, you have developers who actually know what the customer/client/user wants without excessive prose. Congrats. If they don't, what would you have done?

Tuesday, October 14, 2014

AngularJS 1.3 is 40% faster than 1.2 on an iPhone. Or is it 20% faster?

AngularJS 1.3 was released today!

I ran my simple list generation test on both AngularJS 1.2 and 1.3 using the Profiler.

A simple answer is: AngularJS 1.3 takes 40% less time in javascript to manipulate the DOM than 1.2.

A different but equally simple answer is: AngularJS 1.3 takes 20% less time to get the contents of the page to your user than 1.2.

Blurb: the Profiler now runs on Windows as well as OS-X!

Friday, October 10, 2014

Writing asynchronous nginx plugins with python and tornado for 1000 qps

TL;DR: by returning X-Accel-Redirect responses and using multiple tornado processes, you can use python as an async plugin language for nginx.

I've been using an architecture at work for building video streaming servers that I've not seen anybody else describe. I'm combining the ease of writing, reading and testing python with the performance of nginx. I've been pretty happy with the maintainability of the result. It's been taken from separate cache and origin to cache+origin and from Smooth Streaming to HLS; and I can hit 1000 qps and 8 Gbps of video with it (though with pretty beefy hardware).

See

http://linuxfiddle.net/f/5f96743c-3d62-40f7-89ef-f64460e904c1

This setup wouldn't serve 1000 qps as-is; missing are:

  • You'd want to use more tornado front-end processes so that python isn't the bottleneck
  • If there is any cache of any kind (e.g., vfs) in the backend, you want to use content-aware backend selection
  • You'd cache the mysql results rather than getting them out of the db every time
  • In the front-end, you'd want to merge all simultaneous requests for the same URL into one
  • Nginx should cache the service responses
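The core of the pattern is small. Here is a minimal sketch, assuming tornado is installed; the URL layout and the /protected/ internal location are hypothetical, not my production setup:

```python
import tornado.web


def accel_redirect_path(requested):
    # Map the public path to an internal-only nginx location.
    # nginx serves the actual bytes; python never touches them.
    return "/protected/" + requested.lstrip("/")


class VideoHandler(tornado.web.RequestHandler):
    def get(self, path):
        # Auth checks, db lookups etc. happen here (asynchronously if
        # needed). Instead of streaming the file through python, hand
        # the request back to nginx with an internal redirect:
        self.set_header("X-Accel-Redirect", accel_redirect_path(path))


def make_app():
    # Run several of these processes (roughly one per core) and list
    # their ports in an nginx upstream block.
    return tornado.web.Application([(r"/video/(.*)", VideoHandler)])
```

On the nginx side this pairs with an `internal` location (something like `location /protected/ { internal; alias /srv/video/; }`) plus an upstream pointing at the tornado ports, so nginx does all the heavy byte-pushing.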

Wednesday, October 01, 2014

Experiences in using python and ffmpeg to transcode broadcast video

This is a rough guide to implementing a transcoding pipeline from broadcast video to multi-bitrate mobile-friendly video, using ffmpeg. No code (the code's not mine to give away).

I've successfully implemented a transcoding pipeline for producing multi-bitrate fragmented mp4 files from broadcast DVB input. More concretely, I'm taking in broadcast TS captures with either mpeg2 video and mp2 audio (SD, varying aspect ratio) or h264 video and aac audio, both potentially with dvbsub. From that I'm producing ISMV output with multiple bitrate h264 video at fixed 16:9 aspect ratio and multiple bitrate aac audio, with burned subtitles. The ISMV outputs are post-processed with tools/ismindex and with the hls muxer.

There are a number of limitations in ffmpeg that I've had to work around:

  • I haven't gotten sub2video or async working without reasonably monotonic DTS. Broadcast TS streams can easily contain backward jumps in time (e.g., a cheapo source that plays mp4 files and starts each file at 0). The fix is to cut the TS into pieces at the timestamp jump locations, use '-itsoffset' to rewrite the timestamps, and then concatenate. I'm using the segment and concat muxers for that.
  • Sub2video doesn't work with negative timestamps, so I use '-itsoffset' to get positive timestamps
  • For HD streams, I need to scale up the sub2video results from SD to HD. Sub2video doesn't handle the HD subtitle geometries. I'm not enough of an expert to know whether that's the issue, or whether that's just the way it's supposed to work with SD subs (typical) with HD video.
  • For columnboxing, I use the scale, pad and setdar video filters. These work fine, but their parameters are only evaluated at start, so I need to cut the video into pieces with a single aspect ratio first and concatenate later.
  • Audio sync (using aresample) gets confused if the input contains errors, so I need to first re-encode audio (with '-copyts') and only after that synchronize.
  • The TS PIDs are not kept across the segment muxer, so I give them on the command line with '-streamid'.

I've automated the above steps with a bunch of python, using ffprobe to parse the video streams. The result manages to correctly transcode about 98% of the TS inputs that our commercial transcoding and packaging pipe fails to handle (and it would probably do even better on the inputs the commercial pipe does handle).
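The glue code is straightforward. A sketch of the probing and filter-building parts (the function names and exact filter parameters are illustrative, not my production code):

```python
import json
import subprocess


def probe_streams(path):
    # Ask ffprobe for per-stream metadata as JSON: codec, geometry,
    # aspect ratio, PIDs - everything the pipeline decisions hang on.
    out = subprocess.check_output([
        "ffprobe", "-v", "quiet", "-print_format", "json",
        "-show_streams", path])
    return json.loads(out)["streams"]


def columnbox_filter(width, height):
    # The scale/pad/setdar chain that columnboxes the input into a
    # fixed 16:9 frame. Since these parameters are only evaluated at
    # start, each single-aspect-ratio piece gets its own chain before
    # the pieces are concatenated.
    return ("scale={w}:{h}:force_original_aspect_ratio=decrease,"
            "pad={w}:{h}:(ow-iw)/2:(oh-ih)/2,setdar=16/9"
            ).format(w=width, h=height)
```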

Hope this helps somebody trying to accomplish the same thing.

Monday, September 22, 2014

Flow - do you speak it?

There's a lot of talk about flow and interruptions in our industry. I think everybody agrees that you don't get much programming done if you don't get into the flow.

But do you?

I've been working as a coder/architect/project manager for a few months at a largish corporation. It's been quite eye-opening.

You arrive at 9:00, have the daily stand-up at 10:00, lunch at 11:45 and a meeting at 14:00. In between there's a couple of people asking you for help. And you get absolutely bugger-all done.

Today I pair-programmed from 14:00 to 17:15. No e-mail, no Skype, no meetings. That was very much an outlier. I was pretty exhausted afterwards too - I guess if I'm not tired after the day I haven't been working that hard.

Every now and then I retire into the test lab or a quiet room or work from home. 5--6 hours of actual coding beats a week in the office.

But if you are living the intended 9:00-17:00 life with those few major interruptions per day - you don't produce squat.

Saturday, September 20, 2014

People don't run the same code on different platforms, except when it's too hard not to

TL;DR - Google uses GWT and j2objc to implement Operational Transforms on all platforms - so I shouldn't expect to be able to implement them for myself.

Most of us write code that only runs on one part of our whole product. I write python code for the server and javascript for the client. Facebook famously gave up on HTML5 for the iPhone. Cross-platform UI toolkits like wxWidgets, Qt or GTK are by definition doomed to the lowest common denominator (you could, in theory, reimplement the best features of each, but that's not been happening). Several attempts have been made to run Microsoft technologies on Unix (IE for UNIX, Wine and Mono), which have had mainly marginal importance - though the jury is out on Mono.

I and some friends spent pretty much all of last year trying to fit a data-heavy app into mobile HTML5. That was one of the many mistakes we made.

As a young sysadmin I settled on perl, since it ran on DOS and Linux (and later Windows). That went pretty well, but you wouldn't really build client applications with it.

But there is a big payoff waiting for those who manage radical reuse.

The biggest case is HTML and javascript, especially on the desktop. Mobile HTML works for reading a newspaper, but not for building a rich client. I've recently seen people struggle with HTML and javascript on set-top-boxes, which supposedly run Webkit but in practice tie you to a very proprietary widget environment.

I find that some of the most interesting reuse is when there is a significant technological component that's being reused.

Both iOS's and Android's success builds on reusing the operating system and lower-level libraries (OS-X/FreeBSD and Linux, respectively). Symbian tried to ignore the need to re-use lower-level code, and failed partially due to that.

Oracle's success builds on letting people run its databases and clients on all sorts of OS's.

And now we have Google and Google Docs (Drive). Docs implements the world's best collaborative editing through Operational Transforms (OT). OT are a technological innovation similar to the relational model: a reasonably simple theory, which requires super-human engineering effort to make into a working product. That effort is large enough that you don't want to do it multiple times on multiple platforms. In both Oracle's and Docs's case you also want the results on different platforms to be exactly the same - for OT in real time.
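To make "reasonably simple theory" concrete, here's a toy transform for concurrent single-character inserts - a sketch of the core idea only, light-years from a production OT engine:

```python
def apply_op(doc, op):
    # An op is ('ins', position, character).
    _, pos, ch = op
    return doc[:pos] + ch + doc[pos:]


def transform(op_a, op_b):
    # Rewrite op_a so it still means the same thing after op_b has
    # already been applied. Ties at the same position are broken
    # deterministically so both sites agree on the final order.
    _, pos_a, ch_a = op_a
    _, pos_b, ch_b = op_b
    if pos_b < pos_a or (pos_b == pos_a and ch_b < ch_a):
        return ('ins', pos_a + 1, ch_a)
    return op_a
```

Two sites that each apply their own op first and the transformed remote op second converge on the same document. Preserving that convergence once you add deletes, rich text, undo and connection recovery is where the super-human engineering effort goes.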

Google is writing the OT code in Java for their servers and Android, and using GWT and j2objc to convert the code to the web and iPhone. And that's atypical even for Google. GWT's not really been a success for Google or others in other areas, especially not for UI.

So Operating Systems, Databases and Operational Transforms are too hard to reimplement. What else?

(This post brought to you by me trying to come up with a reasonable solution to syncing data between mobile clients without wasting all my time on that instead of building actual applications, like Shozu did.)