Archive for March 24th, 2010

MocoCompositor: First Real Test

Wednesday, March 24th, 2010

So I finally got hold of a green screen cloth from Krishna (Thanks Krishna!) and draped it over our office door. I set up the camera in front of it, found a cool sci-fi desktop wallpaper to use as my background plate, used the live camera viewfinder as the foreground plate, and ran the MocoCompositor with the greenscreen/chroma-key GLSL shader. The output from the compositor streams to a new viewfinder called compositor, so any browser should be able to view the final composite too.

The computers were set up so that the Dell Mini9 netbook was running the server, the background plate stream and the MocoBot (the live camera feed) programs. Meanwhile my desktop (which is a bit more heavy duty, with an nVidia GPU) was running the MocoCompositor, which connected to the server on the Mini9 to pull the camera viewfinder feed as well as the background plate feed (oh yeah, I didn’t mention that I also wrote a quick static image stream publisher that will later become a video/frame player in the UI). At each pyglet on_draw event the MocoCompositor blits the current images to the appropriate plates/textures and composites them using the shader. The resulting texture is then pulled from the GPU framebuffer, compressed to JPEG (via pyglet via PIL) and streamed out to the server via a publisher.

It works surprisingly well and fast given the amount of network traffic it produces. And these are the results (I print-screen captured these while viewing the compositor viewfinder in a browser on yet another computer :). Note that the background plate is using a sci-fi wallpaper of the Earth I found on Google images… (I claim no copyright to it, but do claim fair-use for educational purposes):

Here’s me smiling cheesily (Note: my left shoulder isn’t abnormally low, I was reaching for the printscreen button on the keyboard below 🙂 )

And here’s me looking into the Universe contemplating existence or looking for Dr. Who.

Obviously the lighting was crappy and uneven, hence the green pixels still present from the foreground plate. I need to come up with a way of passing the shader parameters to the server and to the UI so the user can fine-tune the settings, rather than hardcoding them into the shader like I do now. Playing with green screening is getting to be more fun than I’d expected… need to get back to work.

Edit: I wonder if the JPEG compression/decompression could be done as a shader… Looks like NVIDIA’s site has an example of the DCT algorithm as a Cg shader (we use GLSL)… but I’m not sure of what the rest of the JPEG compression algorithm/format would need. But I think a future extension to this software could be a JPEG compression shader to be able to get rid of PIL doing the jpeg compression and speed things up even more… just a thought, but outside of the scope of this project obviously.

MocoCompositor: GPU Accelerated Compositing

Wednesday, March 24th, 2010

I’m a little giddy and excited about this new feature. Last semester I took a class at the main campus titled “Technical Animation” where we learned about all sorts of Computer Graphics and Animation techniques/algorithms used in the game and movie industries. It was a pretty cool class that focused on projects. My final project (teamed up with Federico Perazzi and Grace Lin, both from ETC) was to create a target-driven smoke simulation accelerated on the GPU. I knew absolutely nothing about smoke simulations, or GPUs for that matter. Long story short, we taught ourselves how GPU-accelerated computation worked and how to write shaders in GLSL… and eventually wrote a regular smoke simulation in GLSL (we ran out of time for the target-driven part). Turns out it doesn’t matter much, as far as speed is concerned, if you write the software in Python, since you’re passing all the heavy computation over to the GPU on the video card anyway. So we ended up using pyglet (a Python windowing and multimedia library with OpenGL bindings) and a tiny shader class to string together several custom shaders to do our smoke simulation… it worked in real-time pretty well.

Skip to the present: we at Mocotila talked a lot about compositing image plates together, because that was one of the biggest uses for motion-controlled cameras: compositing a live-action shot or model with a matte painting or 3D model. But we also realized that compositing is usually done in post-production and takes a lot of time to do, render and review. Then if any shots are screwed up in framing, etc., you’d either have to reshoot or try to fudge the effects until the result was acceptable.

But here we are with an awesome camera with its viewfinder being streamed to anything capable of reading http-streamed images… and not just the expensive camera either, our bioloid has a little webcam that is also being streamed in the same way and when we get Maya/Blender integration we might have live 3D renderings being streamed as well. Wouldn’t it be grand if the cameraman/director could see a live end-result composite preview so he could direct actors or reframe things appropriately? And what if we could just kinda composite several of these streams into a new viewfinder? This is where the MocoCompositor comes in… it runs on any computer on the network with a good video card capable of shaders. It pulls in (subscribes to) images from multiple image streams coming from the server and it publishes a new composited image stream to the server that anyone else can subscribe to.

The actual compositing is accomplished using the GPU via GLSL (the OpenGL Shading Language). This is where my Technical Animation story comes in… I went back and looked through my GPU smoke simulator and implemented it again taking out all the simulation stuff and adding layers of images to be processed like Photoshop Layers. Your view is from the top of the layer stack looking down (the very bottom being the background image plate). The algorithm runs from the bottom plate up applying an associated shader to each plate and saving the result into an output plate. The output plate is what is packaged as the jpeg image and published out to the server again.
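To make the bottom-up pass concrete, here is a minimal CPU-side sketch of the idea in plain Python. It is not the MocoCompositor code: the real work happens in GLSL on the GPU, and the names `composite`, `average` and `replace` are made up for illustration. Each "shader" is just a function that takes the pixel from the output so far and the pixel from the current plate.

```python
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int]          # (r, g, b), 0-255
Image = List[Pixel]                   # flat list of pixels, same length per plate
# Hypothetical stand-in for a GLSL shader: (pixel below, pixel of this plate) -> result
Shader = Callable[[Pixel, Pixel], Pixel]


def composite(background: Image, layers: List[Tuple[Image, Shader]]) -> Image:
    """Walk the stack bottom-up, folding each plate into the output plate."""
    output = list(background)         # the bottom plate seeds the output
    for plate, shader in layers:      # layers are ordered bottom to top
        output = [shader(below, above) for below, above in zip(output, plate)]
    return output


def average(below: Pixel, above: Pixel) -> Pixel:
    """Toy 'shader': 50/50 blend of the two plates."""
    return tuple((b + a) // 2 for b, a in zip(below, above))


def replace(below: Pixel, above: Pixel) -> Pixel:
    """Toy 'shader': the upper plate simply covers what is below."""
    return above
```

Calling `composite(background, [(plate_a, average), (plate_b, replace)])` would blend `plate_a` into the background and then cover the result with `plate_b`; the returned list plays the role of the output plate that gets packaged as a jpeg.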

So far we’ve implemented a green-screen shader that replaces all the green in the foreground plate with the pixels from the background plate… this means live green-screen replacement compositing. We experimented with background subtraction (taking a reference shot of the background and subtracting it from the live shot so we wouldn’t need a green screen), but it just wasn’t reliable or clean. We’re hoping to add some more filters to this system, all implemented as GLSL shaders… especially some gradient blur filters (if you know what I’m getting at :).
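For anyone curious what the per-pixel decision in a green-screen filter can look like, here is a rough CPU reference in Python. This is only a sketch of one common approach (key a pixel when green clearly dominates red and blue); it is not the actual GLSL shader, and the function names and the `dominance` and `min_green` thresholds are assumptions picked for illustration.

```python
def is_green(pixel, dominance=1.3, min_green=80):
    """Return True when green is bright enough and clearly dominates red and blue.

    dominance and min_green are illustrative tuning knobs, not values
    taken from the real shader.
    """
    r, g, b = pixel
    return g >= min_green and g > dominance * max(r, b)


def chroma_key(foreground, background, dominance=1.3, min_green=80):
    """Replace keyed (green) foreground pixels with the matching background pixel."""
    return [
        bg if is_green(fg, dominance, min_green) else fg
        for fg, bg in zip(foreground, background)
    ]
```

With these defaults a well-lit green pixel such as `(20, 200, 30)` is replaced, while a dim, shadowed one such as `(10, 60, 10)` falls below `min_green` and stays in the composite. Badly lit cloth can leave stray green pixels in exactly this way, and it is the reason thresholds like these are worth exposing as tunable parameters.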

Under the Hood

The main thread is a pyglet app running its event loop. Word of advice: NEVER mix OpenGL calls (or anything that touches hardware directly without locks and state preservation) across threads… bad things happen (one of these days I’ll get around to putting locks on the camera class too). Anyway, in this main thread we have the on_draw event from pyglet, where we run through each image plate and execute the appropriate shader on the input image (and the working/output image). After we’ve gone through all the plates we package the output image texture back into a JPEG and send it out to the server via an HTTP stream publisher.

Now since pyglet controls our main thread and we want to be able to pull images from at least 2 streams concurrently from the server, we need to do it in threads. So for each image stream coming in (subscribed to) we have a thread which grabs the image and updates a mutex-protected data structure (our image plates) with the raw data. The next time through the main pyglet loop it’ll reload the image from the raw data… we can’t do it out in the thread because lord knows what pyglet is doing behind the scenes (or if the libraries it uses are thread-safe) to load these images.
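The hand-off between the subscriber threads and the main loop can be sketched with nothing but the standard library. This is an illustration under assumptions, not the real classes: `ImagePlate`, `store`, `take_if_new` and `subscriber` are hypothetical names, and the `frames` iterable stands in for the HTTP image stream.

```python
import threading


class ImagePlate:
    """Holds the latest raw frame for one stream, guarded by a mutex."""

    def __init__(self):
        self._lock = threading.Lock()
        self._raw = None
        self._dirty = False

    def store(self, raw_bytes):
        """Called from a subscriber thread: only stash bytes, no decoding, no OpenGL."""
        with self._lock:
            self._raw = raw_bytes
            self._dirty = True

    def take_if_new(self):
        """Called from the main (pyglet) thread: return new bytes once, else None."""
        with self._lock:
            if not self._dirty:
                return None
            self._dirty = False
            return self._raw


def subscriber(plate, frames):
    """Thread body: push every frame received from the stream into the plate."""
    for frame in frames:
        plate.store(frame)
```

Each stream gets its own `threading.Thread(target=subscriber, args=(plate, frames))`. Inside on_draw the main thread would call `plate.take_if_new()` and, only when it gets bytes back, decode them and upload the texture, so all image loading and OpenGL work stays on one thread.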

Speaking of the libraries used by pyglet… we were hitting a nasty SegFault in the MocoCompositor after a few seconds of it working perfectly. Out of nowhere it would SegFault (and never after the same amount of time). After digging through pyglet and stepping through the execution with a Python debugger, I tracked the problem down to the codec being used for JPEG decompression. Pyglet uses 3rd-party libraries to decompress JPEG images (not sure if that has to do with patents or whatnot), and on the Ubuntu system I’d been using it was defaulting to gdkpixbuf… I assume it’s the fastest implementation available on the system (it probably uses the C libjpeg or something). But I noticed in the pyglet documentation that you can specify which decoder to use, and I noticed that the Python Imaging Library (PIL) was installed… so I forced pyglet to use that decoder instead (it seems a teensy bit slower) and it has worked without any SegFaults [so far]… huzzah!
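For reference, forcing a decoder looks roughly like the sketch below. `pyglet.image.load` accepts a `decoder` argument and pyglet ships a `PILImageDecoder` in `pyglet.image.codecs.pil`; the wrapper function `load_jpeg_with_pil` and the placeholder filename are my own naming for illustration, and the imports sit inside the function only so the snippet can be read without pyglet and PIL installed.

```python
import io


def load_jpeg_with_pil(raw_bytes):
    """Decode JPEG bytes with pyglet's PIL-backed decoder instead of the default.

    Requires pyglet and PIL (or Pillow) to be installed. Call it from the
    main thread, next to the rest of the pyglet/OpenGL work.
    """
    import pyglet
    from pyglet.image.codecs.pil import PILImageDecoder

    # The filename is only a format hint; the data comes from the file object.
    return pyglet.image.load(
        "frame.jpg",
        file=io.BytesIO(raw_bytes),
        decoder=PILImageDecoder(),
    )
```

Passing the decoder explicitly on every load keeps the choice visible in the code rather than depending on whichever codec pyglet finds first on the system.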

Also note that textures used in OpenGL really have to be made in powers of 2… otherwise crazy things start happening (things start striping and staggering diagonally). This was an ongoing struggle for quite a while, until on a whim I remembered that old rule and tried it as a power of 2 and it worked. So now I have the code looking at the input image size and rounding it up to the next power of 2 to allocate the texture. This leaves a big black unused area but it works… when we’re done calculating the output image, we simply blit the original size of the image from the big texture to send to the output stream as the jpeg image.
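The rounding itself is a one-liner. Here is a small sketch of it; the function names are mine, not the ones in the MocoCompositor source.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (n must be a positive integer)."""
    if n < 1:
        raise ValueError("size must be a positive integer")
    return 1 << (n - 1).bit_length()


def texture_size(width, height):
    """Texture dimensions to allocate for an image of the given size."""
    return next_power_of_two(width), next_power_of_two(height)
```

For a 640x480 camera frame, `texture_size(640, 480)` gives `(1024, 512)`: the image sits in one corner of the texture, and only the original 640x480 region is read back for the output jpeg.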