Tim: Johnny, what are you holding in your hand there?
Johnny: What I have here today, which we are demoing at Google I/O 2014, is one of our 7” tablet development kits. This is for our Project Tango, which is our effort inside of Google ATAP to give mobile devices a human-scale understanding of space and motion. That roughly means that we, as humans, have this amazing ability to understand the shape of the environment and also our own position within this conference building. But that’s something we take for granted, something that has evolved over millions of years in our human perception system. Yet our mobile devices today have no similar sense of spatial reasoning. Our goal is to work with the robotics and computer vision communities to harvest decades of research and compress that into a very special mobile device. So what’s in these tablet development kits is a camera that we’ve developed specifically for motion tracking. This is a 170-degree field of view camera optimized for computer vision and 3D tracking. We also have in this particular device a depth sensor from a company called Mantis. This gives us information about the geometry of the floors, the walls and the rooms. We also have another prototype with a time-of-flight depth sensing solution from a company called PMD Tech.
Tim: Could you explain that a little bit?
Johnny: There are two principles of operation that we have in our devices here today for doing depth sensing. One is called structured light, which is the one provided by Mantis. It uses an infrared pattern projector that emits an image onto the wall, and then we observe that from a camera that is offset. Using that offset, the computation is very similar to a stereo pair computation. We are also working with a company called PMD Tech, which does time-of-flight depth sensing solutions. Time-of-flight works by emitting pulses of light at a particular frequency and looking at the return signal from the environment. When there is an object at 1 m or 2 m, it will actually cause a different shift of the phase, and by detecting those phase shifts, they can detect depth as well. There are pros and cons to each of the two technologies. We are currently evaluating the performance and working with those partners to improve the quality of those sensors.
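The two depth-sensing principles described above can be sketched with their textbook formulas. This is only an illustration, not the computation used by Mantis or PMD Tech: the function names, the sample focal length, baseline and modulation frequency are all assumptions, and the time-of-flight part assumes a continuous-wave sensor with a single modulation frequency.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Structured light / stereo triangulation: z = f * b / d.

    disparity_px is how far the observed pattern is shifted in the offset
    camera relative to where the projector emitted it.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


def depth_from_phase(phase_shift_rad, modulation_hz):
    """Time-of-flight: the light travels out and back (2 * d), so
    phase = 2 * pi * f * (2 * d / c), which gives d = c * phase / (4 * pi * f).
    """
    return SPEED_OF_LIGHT_M_S * phase_shift_rad / (4.0 * math.pi * modulation_hz)


def unambiguous_range_m(modulation_hz):
    """Beyond this distance the phase wraps past 2 * pi and depths alias."""
    return SPEED_OF_LIGHT_M_S / (2.0 * modulation_hz)


if __name__ == "__main__":
    # Hypothetical sensor values, chosen only to make the arithmetic visible.
    print(depth_from_disparity(focal_length_px=580.0, baseline_m=0.075, disparity_px=43.5))
    print(depth_from_phase(phase_shift_rad=math.pi / 2, modulation_hz=30e6))
    print(unambiguous_range_m(30e6))
```

An object at 2 m produces twice the phase shift of an object at 1 m, which is the effect Johnny describes. The wrap-around limit is one example of the trade-offs between the two technologies.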
Let me give you a quick tour of some of the software that actually takes the data from the sensors to do some of the tracking. The first thing I am going to show you is a diagnostics app that basically shows you some of the raw data coming in. On the left side, you see the image from the fisheye camera, and you also see the gyro and accelerometer data at the bottom. You also see these green points on the image, which come from the hardware-accelerated computer vision that allows us to track the optical flow of the camera image coming in. Now we combine that optical flow data with the gyro and accelerometer, and that gives us this motion estimate on the right side. Like many devices that have motion sensors, if I tilt the tablet left and right, it actually represents this movement. But what’s different about our Project Tango device is that I can also track my motion. So if I physically move left and physically move right, or if I move the tablet in a big circle, (oops! Sorry) it is actually tracking the full 6 degrees of freedom motion of the device. We are also tracking Z. Oops! I hit the menu on the screen. So we are also tracking Z. So if I raise the tablet or roll the tablet, we get full 3D tracking.
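To illustrate why combining camera motion with inertial data helps, here is a toy one-dimensional complementary filter. It is not the Project Tango tracking algorithm, which the interview does not describe. The function names, the blend factor `alpha` and the simulated bias are all assumptions. The idea is that integrating the accelerometer alone drifts, while a velocity derived from vision (for example from optical flow) keeps pulling the estimate back.

```python
def fuse_velocity(accel_samples, vision_velocities, dt, alpha=0.98, v0=0.0):
    """Blend an inertial prediction with a vision-derived velocity.

    Each step predicts velocity by integrating acceleration, then mixes in
    the vision measurement with weight (1 - alpha).
    """
    v = v0
    history = []
    for accel, v_vision in zip(accel_samples, vision_velocities):
        predicted = v + accel * dt                      # inertial prediction
        v = alpha * predicted + (1.0 - alpha) * v_vision  # vision correction
        history.append(v)
    return history


def integrate_only(accel_samples, dt, v0=0.0):
    """Accelerometer-only dead reckoning, for comparison."""
    v = v0
    for accel in accel_samples:
        v += accel * dt
    return v


if __name__ == "__main__":
    steps, dt = 1000, 0.01           # 10 seconds at 100 Hz (illustrative)
    true_velocity = 0.5              # device moves at a constant 0.5 m/s
    biased_accel = [0.1] * steps     # true acceleration is 0, sensor bias is 0.1 m/s^2
    vision = [true_velocity] * steps
    fused = fuse_velocity(biased_accel, vision, dt, v0=true_velocity)
    print("fused estimate:", fused[-1])
    print("accelerometer only:", integrate_only(biased_accel, dt, v0=true_velocity))
```

In this simulation the accelerometer-only estimate drifts far from the true 0.5 m/s, while the fused estimate stays close to it. A real visual-inertial tracker estimates all six degrees of freedom and models sensor bias explicitly.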
So let me show you a quick example of what you can do if you just simply take that data and insert it into a game. So this is a tech demo that we built inside of the Unity game engine. It is a very simple game where I just have to put the right colored block on the right colored switch. But if I pick up this blue block I actually have to walk forward to hit the switch. Here we go, here we go. Let me try this one more time. So this is a quick prototype we built inside of Unity. If I pick up this yellow block and put it on the switch, it drops down more blocks. In order to get to this blue switch, I have to actually walk forward, so drop it on that switch, and to pick up this green block, I actually have to walk all the way over that side. So what this does is it basically introduces motion into mobile gaming. If you walk 1 m in the physical world, you actually walk 1 m in the virtual environment. So you can start imagining, using the space of your house or your office space as part of a meaningful component of the game.
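The one-to-one mapping between physical and virtual movement can be sketched as follows. This is not the code of the Unity demo. The `Pose` type, the y-up axis convention, the switch position and the trigger radius are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    """Tracked device position in meters (y is assumed to point up)."""
    x: float
    y: float
    z: float


def to_virtual(pose, origin=(0.0, 0.0, 0.0), scale=1.0):
    """Map a physical position to a virtual one.

    With scale=1.0, walking 1 m in the room moves the player 1 m in the game.
    """
    return (origin[0] + scale * pose.x,
            origin[1] + scale * pose.y,
            origin[2] + scale * pose.z)


def is_on_switch(player_xyz, switch_xyz, radius_m=0.3):
    """True when the player stands within radius_m of the switch on the floor plane."""
    dx = player_xyz[0] - switch_xyz[0]
    dz = player_xyz[2] - switch_xyz[2]
    return math.hypot(dx, dz) <= radius_m


if __name__ == "__main__":
    blue_switch = (0.0, 0.0, 1.0)      # one meter in front of the start position
    start = to_virtual(Pose(0.0, 1.2, 0.0))
    after_walking = to_virtual(Pose(0.0, 1.2, 1.0))
    print(is_on_switch(start, blue_switch))          # not there yet
    print(is_on_switch(after_walking, blue_switch))  # reached by physically walking
```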
This is another demonstration, again a Unity tech demo, where we’ve created sort of a fantasy environment and you can use the device as a sort of camera control, just pointing in different directions to look at the space. You also see that there is this little wizard on the ground. He is only about 6” tall, so if I actually want to interact with him, all I have to do is squat down to his level. So he is right in front of me and I immediately see the world from his perspective. I can look at the stones and the mushrooms and the small plants, and I can interact with the character directly. But if I want to go back up to my character control mode, I just point on the ground and tell him to go to that part of the room, which will be over there. If we had the ability to move around more, there are actually more structures and parts of this environment where I can explore the rest of the virtual space just by physically moving. Another demo app that we got just recently from one of our university research partners combines both that tracking data and the 3D mapping data into a single app. So this is a very early prototype of actually using both sensors together to build a full 3D capture of the environment.
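Combining tracking data with depth data, as the research partner's app does, comes down to moving each depth point from the device's frame into a shared world frame using the tracked pose. The sketch below shows that step with a simple occupied-voxel set. It is an assumption-laden illustration, not the partner's code: the function names, the 5 cm voxel size and the rotation-matrix pose representation are all hypothetical.

```python
import math


def transform_point(rotation, translation, point):
    """Apply world = R * point + t, with R as a 3x3 nested list."""
    return tuple(
        sum(rotation[row][col] * point[col] for col in range(3)) + translation[row]
        for row in range(3)
    )


def voxel_key(point, voxel_size_m):
    """Quantize a world-frame point to the integer index of its voxel."""
    return tuple(math.floor(value / voxel_size_m) for value in point)


def integrate_frame(occupied, rotation, translation, depth_points, voxel_size_m=0.05):
    """Add one depth frame, captured at the given device pose, to the map."""
    for point in depth_points:
        world = transform_point(rotation, translation, point)
        occupied.add(voxel_key(world, voxel_size_m))
    return occupied


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    world_map = set()
    # Frame 1: device at the origin sees a wall point 2 m ahead.
    integrate_frame(world_map, identity, (0.0, 0.0, 0.0), [(0.0, 0.0, 2.0)])
    # Frame 2: device has moved 1 m forward, so the same point is now 1 m ahead.
    integrate_frame(world_map, identity, (0.0, 0.0, 1.0), [(0.0, 0.0, 1.0)])
    print(len(world_map))  # both observations fall in the same voxel
```

Because the pose accounts for the device's movement, the two observations of the same wall point land in the same place in the map. Without tracking, each depth frame would be a disconnected snapshot.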
This essentially is just some early code that we’ve gotten from research partners. As the hardware and software evolve, our goal is to basically make these standard parts of our platform. Then game developers and application developers can start building applications that actually understand this 3D tracking and 3D motion.
So we are making dev kits available later this year. Google I/O attendees can go to the laptop and sign up, and we will make them available for purchase. The general public can go to our regular website and also sign up for a later release.
Tim: I understand that next year there is a possibility of a more consumer-oriented version.
Johnny: That’s right. What we’ve been doing inside of ATAP is pushing the hardware vendors and the software development to mature it to the point where it becomes consumer-ready. We’d like to work very collaboratively with the Android OEM ecosystem, and we’ve already signed an early engagement agreement with LG to potentially create a consumer device in 2015.
Tim: As for applications, you’ve mentioned gaming, certainly architecture, and I would assume some kind of inspections. What are some other applications that you’ve brainstormed inside the project? How do you imagine people are going to use this?
Johnny: We have a number of partners here from various different genres. We have Limbic Labs, who created a version of their game Zombie Gunship that uses Project Tango tracking, which lets you fight zombies on your floor. We also have other partners such as Trimble, which is very interested in construction and industrial inspection. They have a demo where you can measure the corners of your room as well as see a little bit of x-ray vision into the Moscone Center using a previous CAD model that they have. These are just quick prototypes and glimpses of what might be possible later once we have more ubiquitous adoption of this hardware and software platform.
And then once you have the ability to track your position in other retail spaces, Aisle411 has done some very, very early explorations into being able to navigate customers to their destinations. These are just the developers that we’ve been able to engage with so far for Google I/O, but I am really excited to see what other developers can do with this platform.
Tim: What about when wearables like Glass actually have depth sensors as well?
Johnny: Yeah. So I am excited about the future of this as a wearable. Right now we are working on the tablet platform. Obviously there is incredible future potential if it gets back into phones and into wearable devices. We are eager to work with the hardware vendors and sensor vendors to make all that possible. But there are no engagements to announce at this time for product lines in that direction.
Tim: One thing I know is that within ATAP there is sort of a limited time that you spend with a particular project. What will you do after Tango?
Johnny: We are eager to work with the hardware ecosystem to get these devices into consumer products. The future of Project Tango is yet to be written. So we are just excited about what’s going to happen.
Tim: Let me ask one more question. With all the data you are gathering, you’ve got all three dimensions and a time element as well. Is there a readable data format so that someone can take this data and actually manipulate it outside of Project Tango? Is that part of the plan?
Johnny: There are several file formats for 3D that exist for 3D modeling programs as well as for the construction and CAD industry. There are also file formats that the robotics research community has settled upon. Currently, those formats are very much still in flux simply because these applications are still being written. We would most certainly want to be able to support the kinds that enable the research community, because they will allow us to ingest the research code much faster into the platform.
Tim: And Google has a good history of sometimes taking a standard and saying, here is our new, more open version.
Johnny: Yes. Yeah, we want to make sure that there is a file format that is available to a large percentage of the community, whether that means adopting a preexisting one or working in collaboration with major stakeholders to establish one. But those are all exciting things in the future.