Ionut Damian1, Mohammad Obaid2, Felix Kistler1, Elisabeth André1
1 Human Centered Multimedia, Augsburg University, Germany
2 Human Interface Technology Lab New Zealand, Christchurch, New Zealand
An Augmented Reality (AR) environment makes use of computer technology to augment the real world with virtual objects. The main goal of this technology is to enhance the user’s perception of his or her surroundings. AR has been successfully used in many fields, such as medicine, engineering, social science and gaming.
There are two critical components to each AR system: tracking and content display. In most AR systems the tracking is computer vision based. This means that the user’s position and orientation are computed with the help of visual aids. This approach “is accurate, but only works over short distances” and usually requires distinct markers placed throughout the real world [1, 3]. Exceptions include GPS or compass-based tracking, tracking that uses 3D structures instead of 2D images [5, 6], and inertial sensor-based tracking. In terms of content display, AR systems rely on various types of display-camera combinations to blend the real and the virtual world.
We present an approach to AR in which the user’s whole body is immersed into the AR environment. This is achieved with the help of the Xsens MVN inertial motion capturing (Mocap) system. One major advantage of our approach is the greatly increased interaction possibilities between the user and the virtual world. Additionally, because the approach does not use vision-based tracking, neither markers nor cameras are required. The tracking and the synchronization between the real and the virtual world are handled by the Mocap system itself, which gives the user more mobility. To visualize the AR environment to the user we use the “see-through” Vuzix STAR 1200 Head Mounted Display (HMD). Such a system can be used in training, coaching or rehabilitation applications, where the Mocap data allows the system to precisely evaluate and analyze the user’s movements.
The Xsens MVN system is a wireless full body suit consisting of 17 inertial sensors which track the user’s skeleton in real time. This information is sent to our AR system which manipulates the virtual objects and renders them on the HMD. Figure 1 illustrates the setup.
The tracking within the AR setup is handled by the Mocap system. It is composed of two parts: (1) real time tracking of the position of the user’s body in the real environment and (2) real time tracking of the user’s head position and orientation. The position of the user in the environment is determined by the Xsens MVN Studio software using complex bio-mechanical simulations, whereas the head orientation is obtained from an inertial sensor attached to the user’s head. The head position is computed relative to the user’s position. This data enables the system to accurately determine where the user is situated in the real world and where she or he is looking. This is done in the absence of any cameras or markers in the real world. As a result, the user’s movement is not restricted, and the synchronization between the real world and the virtual world can happen regardless of the user’s position or orientation as long as she or he is within the system’s range of 150 m (outdoors) or 50 m (indoors).
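The combination of body position, relative head position and head orientation described above can be sketched as follows. This is a minimal illustration, not the Xsens MVN API: the function names `quat_rotate` and `head_pose`, the unit quaternion layout `(w, x, y, z)`, the convention that the head offset is given in world axes, and the choice of -Z as the forward axis are all assumptions made for the example.

```python
def quat_rotate(q, v):
    """Rotate vector v by the unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    vx, vy, vz = v
    # t = 2 * cross(q_vec, v)
    tx = 2.0 * (y * vz - z * vy)
    ty = 2.0 * (z * vx - x * vz)
    tz = 2.0 * (x * vy - y * vx)
    # v' = v + w * t + cross(q_vec, t)
    return (
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    )


def head_pose(body_position, head_offset, head_orientation):
    """Return (world head position, world viewing direction).

    body_position    -- user's position in the room (assumed world frame, metres)
    head_offset      -- head position relative to the user's position (assumed world axes)
    head_orientation -- unit quaternion from the head-mounted inertial sensor (assumed layout)
    """
    position = tuple(b + o for b, o in zip(body_position, head_offset))
    # Assumed convention: the user looks along the local -Z axis.
    forward = quat_rotate(head_orientation, (0.0, 0.0, -1.0))
    return position, forward
```

With an identity orientation `(1, 0, 0, 0)` the viewing direction stays at `(0, 0, -1)`; a real integration would replace the assumed conventions with those of the Mocap data stream.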
In addition, the system receives data about the rest of the user’s body which can be used to further enhance the interaction with the virtual world. For example, the position of the hands can be used for gesture recognition or for precise object manipulation. The data can also be used for an in-depth analysis of the user’s movement, such as computing various social cues (e.g. energy, fluidity, spatial extent) to give insight into the user’s mental state.
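As a rough illustration of how such cues might be derived from hand trajectories, the sketch below computes two simple measures. The definitions are assumptions chosen for the example and are not necessarily the ones used by the system: `spatial_extent` is taken as the mean distance between the two hands, and `energy` as the mean squared speed of one hand. Both function names and the fixed frame interval `dt` are hypothetical.

```python
import math


def spatial_extent(left_hand, right_hand):
    """Mean distance between the hands over all frames (assumed definition).

    left_hand, right_hand -- equally long lists of (x, y, z) positions in metres
    """
    distances = [math.dist(l, r) for l, r in zip(left_hand, right_hand)]
    return sum(distances) / len(distances)


def energy(positions, dt):
    """Mean squared speed of one joint (assumed definition).

    positions -- list of (x, y, z) positions in metres, one per frame
    dt        -- time between consecutive frames in seconds
    """
    squared_speeds = [
        (math.dist(a, b) / dt) ** 2
        for a, b in zip(positions, positions[1:])
    ]
    return sum(squared_speeds) / len(squared_speeds)
```

Both functions expect at least one frame pair and would need smoothing and normalization before being compared across users.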
The virtual scene is managed by the AAA framework. For the synchronization of the virtual scene with the real environment we use a representation of the user in the virtual world, a virtual agent. The position of this agent is updated every frame to match the position of the user in the room. The virtual camera’s position and orientation are continuously synchronized with the user’s head. We then render the scene stereoscopically using two different camera positions, one for each eye, to simulate binocular vision.
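The two per-eye camera positions can be derived from the synchronized head pose by shifting the head position along the head’s right axis. The following is a minimal sketch under stated assumptions: the function name `stereo_eye_positions` is hypothetical, the head’s right axis is assumed to be available in world coordinates, and the default interpupillary distance of 0.064 m is a commonly quoted average rather than a value from the described setup.

```python
import math


def stereo_eye_positions(head_position, right_axis, ipd=0.064):
    """Return (left_eye, right_eye) camera positions in world coordinates.

    head_position -- (x, y, z) of the tracked head, in metres
    right_axis    -- head's local right direction in world coordinates (any non-zero length)
    ipd           -- interpupillary distance in metres (assumed default)
    """
    length = math.sqrt(sum(c * c for c in right_axis))
    if length == 0.0:
        raise ValueError("right_axis must be non-zero")
    # Normalize so the eye separation is exactly ipd.
    unit = tuple(c / length for c in right_axis)
    half = ipd / 2.0
    left_eye = tuple(p - half * u for p, u in zip(head_position, unit))
    right_eye = tuple(p + half * u for p, u in zip(head_position, unit))
    return left_eye, right_eye
```

Calling this once per frame with the current head pose yields the two viewpoints from which the scene is rendered; both cameras share the head’s orientation.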
Using this approach we are able to place any 3D object in the virtual environment. Its rendered position and orientation are continuously updated to match the user’s perspective, which generates the AR effect without requiring an anchor point, such as a marker, in the real environment.
The AAA framework is designed to realize simulations of social situations using virtual agents. In the proposed setup, we use AAA to control virtual agents in the AR environment so that they behave and interact realistically with the user and their surroundings.
The main advantages of our approach over common AR setups are (1) greatly improved interaction possibilities between the user and the AR environment due to the full immersion of the user’s body in the virtual scene and (2) increased mobility for the user through the elimination of markers and cameras.
Initial tests conducted with a small user group, which interacted with a virtual agent within the proposed AR environment, suggest accurate tracking. This resulted in a high degree of spatial presence, giving the users the feeling that they were part of the same environment as the virtual agent.
Part of this work has been published in the Proceedings of the 4th Augmented Human International Conference, Pages 233-234, ACM New York, NY, USA ©2013, ISBN: 978-1-4503-1904-1 doi 10.1145/2459236.2459277