Microsoft’s Xbox Kinect is a revolutionary tool in computer gaming. It was recently entered into Guinness World Records as the fastest-selling consumer electronics device, impressively surpassing both the Apple iPod and iPad. This week Professor Chris Bishop, Distinguished Scientist at Microsoft Research in Cambridge, came to the Judge Business School in Cambridge, in association with the Centre for Science and Policy, to discuss how the Kinect went from laboratory research to the market shelf.
An active demonstration saw Professor Bishop diving around fixing virtual leaks in a submerged fishbowl, which reminded his audience how the Kinect brings users into games via real-time full-body motion tracking. The story begins in the Microsoft labs in Cambridge, where research was underway on machine learning. The idea behind machine learning is that a program can be trained to learn from examples rather than being given explicit rules. This concept was illustrated by a demonstration of the ‘movie recommender’, a simple program that predicts which films you will like or dislike based on the selections you make yourself and on the likes and dislikes of thousands of other people. A similar system is used on consumer websites to recommend other products that may interest you.
The research being undertaken at Microsoft involved image recognition, in which each pixel of an image is assigned to one of a finite number of predefined categories. In this case, researchers started by differentiating between cows and sheep, and the number of categories was then gradually increased. From an initial database of labelled examples, the machine learns how to assign the pixels in an image, and it subsequently adjusts in response to user feedback on its results.
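To make the movie recommender idea concrete, here is a minimal sketch in Python. It is not the program shown in the lecture: the names, the tiny ratings table and the choice of a similarity-weighted vote are all assumptions made for illustration. Each person's likes (+1) and dislikes (-1) are compared with yours, and people whose tastes match yours count for more when predicting a film you have not yet rated.

```python
from math import sqrt

# Hypothetical data: +1 = liked, -1 = disliked; a missing film = not rated.
ratings = {
    "ann":  {"Alien": 1, "Brazil": 1, "Casablanca": -1},
    "bob":  {"Alien": 1, "Brazil": 1, "Casablanca": -1, "Dune": 1},
    "cara": {"Alien": -1, "Brazil": -1, "Casablanca": 1, "Dune": -1},
}

def similarity(a, b):
    """Cosine similarity over the films that both people have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[f] * b[f] for f in shared)
    norm_a = sqrt(sum(a[f] ** 2 for f in shared))
    norm_b = sqrt(sum(b[f] ** 2 for f in shared))
    return dot / (norm_a * norm_b)

def predict(user, film, ratings):
    """Similarity-weighted vote of everyone else who has rated the film.

    A positive score suggests the user will like it, a negative one that they will not.
    """
    score = 0.0
    for other, theirs in ratings.items():
        if other != user and film in theirs:
            score += similarity(ratings[user], theirs) * theirs[film]
    return score

print(predict("ann", "Dune", ratings))  # positive: ann agrees with bob and disagrees with cara
```

Ann has never rated "Dune", but her tastes match Bob's (who liked it) and are the opposite of Cara's (who did not), so the prediction comes out positive. A real system would learn from thousands of people and would likely use a more sophisticated model.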
Meanwhile, over the Atlantic, the computer game industry was evolving. Having previously concentrated on PC gaming, Microsoft entered the games console market with the original Xbox in 2001 and followed it with the Xbox 360 in 2005. Only one year later the Nintendo Wii was released which, although graphically weaker than its competitors, offered a unique motion-sensitive user interface. In 2010 Sony released the PlayStation Move which, like the Wii, combines a handheld controller with camera-based tracking.
The difference with the Kinect is that it offers a completely controller-free user interface. It achieves this through an infrared camera and structured light: a pattern of infrared dots is projected from the device, reflects off the body and is picked up by a detector to form a depth map of the scene. The key to the success of Kinect is combining this depth map with machine learning, using the same technique employed by the lab researchers for image recognition. Just as image recognition requires training, the machine needs to be trained to recognize different parts of the body. This is done by separating the body into approximately 30 different areas and classifying each pixel of an image of the body into one of these areas. Given approximately 1,000,000 initial body positions with the correct assignment of pixels to areas of the body, the machine can learn how to assign these pixels and couple them with the depth map produced by the infrared camera. By doing so, it produces a real-time motion image of the body.
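The per-pixel classification step can be illustrated with a toy Python sketch. The lecture, as reported here, does not say which classifier the Kinect uses, so a nearest-centroid classifier is swapped in purely for brevity. The training pixels, the three body-part labels and the use of (row, column, depth) as features are all hypothetical; the real system works with around 30 areas and roughly a million labelled body positions.

```python
# Hypothetical labelled pixels: (row, col, depth_in_metres) -> body-part label.
training = [
    ((0, 5, 2.0), "head"), ((1, 5, 2.0), "head"),
    ((4, 5, 2.1), "torso"), ((5, 5, 2.1), "torso"),
    ((4, 1, 1.8), "left_hand"), ((5, 1, 1.8), "left_hand"),
]

def train(examples):
    """Average the feature vectors of each label to get one centroid per body part."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def classify(pixel, centroids):
    """Assign the pixel to the body part with the nearest centroid (squared Euclidean distance)."""
    def dist(centroid):
        return sum((p - q) ** 2 for p, q in zip(pixel, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))

centroids = train(training)
print(classify((0, 6, 2.0), centroids))  # head
print(classify((5, 2, 1.8), centroids))  # left_hand
```

Running `classify` over every pixel of a depth map would give a body-part label per pixel, which is the kind of output the article describes. Mixing pixel coordinates and metres in one distance is crude; it is done here only to keep the example short.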
The technology behind the Kinect is hugely impressive, and its applications are not limited to the computer gaming industry: in the future it could be adapted for surveillance, or for use by surgeons as a touch-free interface for viewing medical images during an operation, where sterility is paramount. For Microsoft, the Xbox Kinect represents a unique success story from laboratory research to the commercial market. Professor Bishop suggests that this is due to the freedom Microsoft researchers are given and the ethos and community that this fosters.
Written by Richard Thomson
CSaP have also written a report on this lecture.