Bullet Time SFX Using Nothing but Web Tech

Example output from bulletTime.js

bulletTime.js is a web application that attempts to replicate 'Bullet Time', the special effect famously introduced in the movie The Matrix.

The proliferation of advanced web technologies, mainly getUserMedia/WebRTC and Web Sockets, has created a unique opportunity to develop a web application that can potentially replicate the famous effect.

During Netflix's Hack Day, I had the opportunity to spend a few hours developing an early prototype of the concept. The main idea is to use an array of laptops and take advantage of their webcams, native displays and connectivity features.

# Infrastructure:

The infrastructure is composed of:

- A master machine that runs the dashboard and triggers the shots.
- An array of worker machines, one per camera angle, each loading a worker page that accesses the machine's webcam.
- A Web Socket server that relays messages between the master and the workers.
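To make the message flow concrete, here is a minimal sketch of what the relay and index allocation could look like, assuming a Node.js server with the `ws` package; the code and names are illustrative, not the actual bulletTime.js implementation:

```javascript
// server.js -- illustrative relay, not the actual bulletTime.js code.
// Requires Node.js and the 'ws' package (npm install ws).
const WebSocket = require('ws');

const wss = new WebSocket.Server({ port: 8080 });
let nextIndex = 0; // incremented every time a worker page connects

wss.on('connection', (socket) => {
  // Assign the next free index to the new worker and tell it.
  socket.send(JSON.stringify({ type: 'index', index: nextIndex++ }));

  // Relay every incoming message (status updates, the master's
  // trigger, captured frames) to all connected clients.
  socket.on('message', (message) => {
    wss.clients.forEach((client) => {
      if (client.readyState === WebSocket.OPEN) {
        client.send(message.toString());
      }
    });
  });
});
```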

# First Challenge: Setup

The physical setup of the machines is the trickiest part; the correct angle (pitch and yaw) for each device is crucial, otherwise the effect is broken and you end up with something like this.

The master machine needs to be aware of the position (index) of each worker in order to interlace the photos in the correct order. Each worker also needs to grant access to its webcam via the browser's permission prompt.

I attempted to solve some of these challenges by dynamically allocating an index to each worker/camera every time a worker page is loaded. The page displays its index in a huge font that almost literally takes up the whole screen (i.e. 100vh). Furthermore, each worker that gets added to the system instantly shows up on the master dashboard, which also conveniently displays the status of each worker, stating whether access to the webcam has been granted or not.
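A worker page along these lines could look roughly like the sketch below (illustrative only, using the modern promise-based getUserMedia API; the element ids, server address and message shapes are made up):

```javascript
// worker.js -- illustrative sketch of a worker page, not the actual code.
const socket = new WebSocket('ws://master.local:8080'); // hypothetical address

socket.onmessage = async (event) => {
  const msg = JSON.parse(event.data);
  if (msg.type !== 'index') return;

  // Display the assigned index full-screen so the operator can
  // physically arrange the machines in the right order.
  const label = document.getElementById('index');
  label.style.fontSize = '100vh';
  label.textContent = msg.index;

  // Ask for webcam access and report the result to the dashboard.
  try {
    const stream = await navigator.mediaDevices.getUserMedia({ video: true });
    document.getElementById('preview').srcObject = stream;
    socket.send(JSON.stringify({ type: 'status', index: msg.index, camera: 'granted' }));
  } catch (err) {
    socket.send(JSON.stringify({ type: 'status', index: msg.index, camera: 'denied' }));
  }
};
```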

# Setup example:

In this video all the workers run on the same machine; when actually doing this, each worker obviously needs to be on a different device.

# Second Challenge: Synchronisation

Triggering all the photos at the exact same moment on all devices is paramount for the effect to work (here is a somewhat comedic example of what happens when there is a small delay between shots). Achieving synchronicity between all the machines turned out to be very difficult, and clock synchronisation is in fact known to be a hard problem in distributed systems.

I tried a few different techniques to overcome this issue:

# Sound:

I attempted to use sound as a trigger, in the hope of removing network latency/congestion from the problem. I ended up generating a DTMF tone on the trigger device and running this JavaScript implementation of the Goertzel algorithm on each worker device. The idea was to process the incoming audio from the microphone, waiting for the specific DTMF tone. Unfortunately, this proved to be the most unreliable method: different machines processed the incoming sound at vastly different speeds (I'm guessing due to CPU/microphone differences).
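For the curious, the core of the detection idea looks roughly like this (a minimal sketch, not the implementation linked above; the tone pair and threshold are illustrative):

```javascript
// Illustrative single-frequency Goertzel detector.
function goertzelPower(samples, sampleRate, targetFreq) {
  // Standard Goertzel recurrence: measures the energy at targetFreq.
  const k = Math.round((samples.length * targetFreq) / sampleRate);
  const omega = (2 * Math.PI * k) / samples.length;
  const coeff = 2 * Math.cos(omega);
  let sPrev = 0;
  let sPrev2 = 0;
  for (const sample of samples) {
    const s = sample + coeff * sPrev - sPrev2;
    sPrev2 = sPrev;
    sPrev = s;
  }
  // Squared magnitude of the target frequency bin.
  return sPrev2 * sPrev2 + sPrev * sPrev - coeff * sPrev * sPrev2;
}

// A DTMF digit is a pair of tones; '1', for example, mixes 697 Hz
// and 1209 Hz. Trigger when both bins exceed an energy threshold.
function isDtmfOne(samples, sampleRate, threshold) {
  return goertzelPower(samples, sampleRate, 697) > threshold &&
         goertzelPower(samples, sampleRate, 1209) > threshold;
}
```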

# NTP (Network Time Protocol)

Another technique I attempted was relying on NTP: giving the machines a specific time in the near future (a couple of seconds ahead) at which they needed to take the shot. Unfortunately, in practice NTP does not provide millisecond-level precision (at least over WiFi).
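The scheduling itself is trivial; it's the clock agreement it silently depends on that breaks down. A sketch of the idea (the message shape and the `takePhoto` helper are hypothetical):

```javascript
// Master side: schedule the shot ~2 seconds in the future.
const shootAt = Date.now() + 2000;
socket.send(JSON.stringify({ type: 'schedule', shootAt }));

// Worker side: wait until the agreed wall-clock instant, then capture.
// This is only as accurate as the agreement between the clocks.
socket.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  if (msg.type === 'schedule') {
    const delay = msg.shootAt - Date.now(); // assumes clocks are in sync
    setTimeout(takePhoto, Math.max(0, delay)); // takePhoto is hypothetical
  }
};
```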

# Web Sockets

This is the technique I ended up using. It was somewhat flaky: since the devices were connected via WiFi, network traffic vastly affected how precise the synchronisation was. It did, however, produce acceptable results, even if it was not super reliable.
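On the worker side, handling the trigger boils down to grabbing a frame from the live video the moment the message arrives. A rough sketch (element ids and message shapes are again made up):

```javascript
// Illustrative trigger handling on the worker: the master broadcasts a
// single 'shoot' message and each worker captures a frame on arrival.
// Any network jitter translates directly into timing error.
const video = document.getElementById('preview'); // <video> showing the webcam
const canvas = document.createElement('canvas');

socket.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  if (msg.type === 'shoot') {
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
    canvas.getContext('2d').drawImage(video, 0, 0);
    // Ship the frame back to the master, tagged with this worker's
    // index so the photos can be interlaced in the correct order.
    socket.send(JSON.stringify({
      type: 'frame',
      index: myIndex, // the index assigned when the page loaded
      data: canvas.toDataURL('image/jpeg'),
    }));
  }
};
```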

# Potential Alternatives:

As explained, Web Sockets was the least crappy solution, but there are a few more approaches that I would like to try.

# Third Challenge: Webcams

While a quick trip to Netflix's IT department provided me with 7 Macs on top of mine (4 Retina Pros, 3 Airs, 1 Pro 13... how cool is Netflix's IT department?), it turned out that the quality, colours/hues and distortion of the webcams were all different (even within the same models).

This is most likely because the laptops were 'loaners' and had been heavily used. Unfortunately, I didn't notice this while shooting; as a result, I ended up manually removing the bad frames from the GIF after the fact. Here are a few examples of bad frames and how they affect the final result: reference frame, bad quality/blurry frame, bad colour/hue, and the final result (including the bad frames).

# Future

There are a few things that I'm planning on improving in the near future. One of the most important is to change what each worker screen displays: the idea is to show a semi-transparent feed of what the direct sibling worker (n-1) device is seeing on top of the worker's own camera feed, as sketched below. This would go a long way towards making the setup easier and the final result less jumpy.
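A rough sketch of what that overlay could look like (everything here is illustrative; how the sibling's stream is transported, e.g. over a WebRTC peer connection between adjacent workers, is still an open design question):

```javascript
// Stack the sibling worker's (n-1) feed on top of the local camera
// preview at 50% opacity, so the operator can align the two cameras
// by eye until the ghosted image matches the live one.
const siblingVideo = document.createElement('video');
siblingVideo.autoplay = true;
siblingVideo.style.cssText =
  'position:absolute; top:0; left:0; width:100%; opacity:0.5; pointer-events:none;';
document.body.appendChild(siblingVideo);

// siblingVideo.srcObject = streamFromSibling; // hypothetical WebRTC stream
```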

Using phones would yield substantially better results for a few reasons: they have much better cameras than laptops, it's easier to find several units of the exact same model/version and, finally, they drastically reduce the distance between the lenses (thus making for a smoother effect).

Mobile Android already supports the required web technologies; unfortunately, Safari does not, leaving (at least for now) iPhones out of the fun.
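A worker page can cheaply check support up front (a minimal sketch using standard feature detection):

```javascript
// Reject browsers that cannot access the webcam (e.g. mobile Safari
// at the time of writing) before they try to join as workers.
if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
  console.log('This browser can act as a worker.');
} else {
  console.log('Sorry, this browser cannot access the webcam.');
}
```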

Isn't the Open Web awesome? This whole thing was designed, coded and field-tested during a hack day. I'm pretty sure that I can get significantly better results with a few tweaks. I will report back whenever I get the time/opportunity to try again (if you try it, please let me know).

Finally, here are a few photos of the actual setup and a "best of" video. The source code for bulletTime.js is available on GitHub here; pull requests are obviously welcome.