Research Practice Blog

Augmented Reality and Matchmoving

Gergo Forgony

/21056164/

Research Practice Module

Digital Animation Course

Contents

· Introduction to virtual reality

· What is augmented reality and how is it related to virtual reality?

· What are the challenges and difficulties in these topics?

o Tracking

o Displaying

o Rendering

· How does the tracking process work in filming?

o Planning

o Data gathering

o Asset building

o Matchmoving

· How does the camera tracking software work?

· Conclusion

· Reference list

Augmented Reality and Matchmoving

Introduction to Virtual Reality

Virtual reality (VR) has become very popular outside the research community within the last two decades. According to Bimber and Raskar (2005), science fiction movies influenced the research community: “Most of us associate these terms with the technological possibility to dive into a completely synthetic, computer-generated world – sometimes referred to as a virtual environment”.
In a virtual environment our senses, such as vision, hearing, touch and smell, are controlled by the computer, while our actions influence the stimuli it produces. There are still no proper devices that can reproduce smell or taste. We already have video systems that surround the audience with realistic 3D graphics, and advanced audio systems as well. There are also haptic devices, which apply vibrations, motions and forces to the user. According to Ivan Sutherland (1965), a system that combined all of these sensory displays would be the ultimate virtual environment, much like the Holodeck in Star Trek: a virtual room with 3D holographic images, sounds and smells, all generated by a computer. Although some parts of the Holodeck have been realized today, most of it is still science fiction.

So what is augmented reality (AR) and how is it related to virtual reality?

In contrast to traditional VR, in AR the real environment is not completely suppressed; instead it plays a dominant role. Rather than putting a person into a virtual environment, AR attempts to integrate computer-generated images into the real environment. This leads to a fundamental problem: it is more difficult to control a real environment than a completely virtual one. A good example of AR is the scene in Star Wars where R2-D2 projects Princess Leia as a moving 3D image into the real world (figure 1).

Figure 1

What are the challenges and difficulties in these topics?

The most important task in augmented reality is to keep a correct, consistent and smooth registration between the real environment and the three-dimensional graphical objects. According to Bimber and Raskar (2005), this registration problem is the most fundamental challenge in AR research today. To experience augmented reality, the user has to use or wear some kind of device. In 1968 Ivan Sutherland created the first functioning head-mounted display (HMD), which also marked the birth of augmented reality. The device, worn on the user's head, used half-silvered mirrors as optical combiners that allowed the user to see both the computer-generated images reflected from cathode ray tubes and the real objects in the environment. Because the user moves, the system has to determine the head’s position, orientation and motion so that the 3D computer-generated images can be placed correctly in the real environment. The whole process has to be done in real time.

1. Tracking

There are two methods of tracking: marker-based tracking and markerless tracking. Marker-based tracking comes in two types. The first, inside-out tracking, uses sensors attached to the moving object; the sensors can determine their position relative to fixed emitters in the environment. The second, outside-in tracking, uses fixed sensors in the real environment that track emitters on the moving object. According to Bimber and Raskar (2005), the most promising tracking solution for augmented reality applications is the markerless system, because it does not require any kind of artificial markers.

2. Displaying

The other building block of augmented reality is the display system. The head-mounted display mentioned earlier is the dominant method in AR applications (figure 2); however, it has issues and limitations, for example a limited field of view, fixed focus, limited image resolution and unstable registration of objects relative to the eyes. The physical equipment matters too: the user can be distracted by the weight of the display device or by its wires. The second major technique is the handheld display, such as a PDA or mobile phone (figure 3). The main advantage of handheld AR is portability; however, the device has to be held in front of the user all the time. The third type of device is the spatial display, where the user does not carry or wear any device. The graphical information is displayed by digital projectors onto objects (figure 4). The user and the system are separated from each other. The main advantage of spatial augmented reality is the number of people who can use the system at the same time: since the users are not required to wear displays on their heads, the system allows collaboration between them.

Figure 2

Figure 3

Figure 4


3. Rendering

The third basic element of augmented reality is real-time rendering. Fast and realistic rendering methods play an important role. As Ivan Sutherland (1965) said:

"The ultimate display would, of course, be a room within which the computer can control the existence of matter. A chair displayed in such a room would be good enough to sit in. Handcuffs displayed in such a room would be confining, and a bullet displayed in such a room would be fatal."

“An ultimate goal could be to integrate graphical objects into the real environment in such a way that the observer can no longer distinguish between real and virtual.”

The colours, lighting and image quality have to match the real objects on the live set so that the composite is smooth and flawless, creating the illusion that all the elements are part of the same scene. With correct compositing, the computer-generated image and the live-action scene merge and the result looks believable and realistic. If the colours or levels of the layers do not match, the composite will look unreal, and such mistakes are very easy to notice. Rendering options such as shadow casting, global illumination, ambient occlusion and ray tracing slow the process down dramatically, so they can only be used when interactive frame rates are not required. Interactive augmented reality applications need fast image rendering.
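To make the idea of matching levels concrete, here is a minimal Python sketch. It assumes each layer is just a short list of brightness values between 0 and 1, and it matches the mean and spread of the computer-generated layer to the live plate before a simple "over" composite with unpremultiplied alpha. The function names match_levels and over, and the sample values, are our own illustration and do not come from any particular compositing package.

```python
from statistics import mean, pstdev


def match_levels(cg, plate):
    """Rescale CG values so their mean and spread match those of the plate."""
    cg_mean, cg_std = mean(cg), pstdev(cg)
    plate_mean, plate_std = mean(plate), pstdev(plate)
    # Avoid dividing by zero when the CG layer is perfectly flat.
    gain = plate_std / cg_std if cg_std > 0 else 1.0
    # Clamp the result to the valid 0..1 range.
    return [min(1.0, max(0.0, (v - cg_mean) * gain + plate_mean)) for v in cg]


def over(fg, alpha, bg):
    """'Over' composite: foreground where alpha is 1, background where it is 0."""
    return [f * a + b * (1.0 - a) for f, a, b in zip(fg, alpha, bg)]


# Hypothetical sample values, invented for this example.
plate = [0.20, 0.25, 0.30, 0.35]   # live-action brightness samples
cg = [0.60, 0.70, 0.80, 0.90]      # CG samples, clearly too bright
alpha = [0.0, 1.0, 1.0, 0.5]       # coverage of the CG layer per sample

graded = match_levels(cg, plate)
comp = over(graded, alpha, plate)
print(graded)
print(comp)
```

In this toy case the graded layer ends up with the same average brightness as the plate, which is roughly what a compositor adjusts by eye with levels or colour-correction tools.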

If we turn to non-real-time augmented reality, where interactive frame rates are not required, we can look at films, mostly science fiction films, where computer-generated images are composited into the live-action set. But how does augmented reality relate to films? Let us take a step back and look at augmented reality again. According to Stephen Cawood (2007), the goal of augmented reality is to create the illusion that virtual objects are part of the real world. By that definition we can call those films or scenes augmented reality, because the techniques used, such as tracking, are the same.

How does the tracking process work in filming?

We have talked about some techniques and methods in real-time augmented reality; now let us look at how this works in post-production in the film industry. With head-mounted displays, the system needs to know the position of the equipment in order to create a virtual image with the same perspective, relative to the real objects, as the eye sees. In films we have to do the same with the camera. When we record a scene and want to put a computer-generated image into it, we have to track the camera first. The process of recreating a live camera move in the virtual, computer-generated world is called camera tracking, a term often used interchangeably with matchmoving. How does it actually work? It is part of the post-production process: the matchmover takes information from the real-life set, where the director and the actors shoot the film, and recreates the camera. The new virtual camera has the same focal length, height, tilt, lens, position and motion. This technique is classified as a markerless system, because no sensors or emitters help to track the camera. Instead of using physical sensors on the live set, the camera tracking software derives 3D information from 2D image planes and creates virtual points. In the next step it tracks those points, using their relative motion in the plane as the basis for solving the camera move. When the application has finished tracking, it produces a virtual camera solution. According to Erica Hornung (2010), the term “solution” is used because the software crunches an incredible volume of numbers and equations.
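The virtual camera can be pictured as a simple pinhole model. The sketch below is written under our own assumptions (a camera looking down the negative Z axis, pan only, square pixels, no lens distortion) and shows how focal length, position and pan decide where a 3D point lands in the image. The function project and its parameters are hypothetical names chosen for this example, not part of any tracking software.

```python
import math


def project(point, cam_pos, pan_deg, focal_mm, sensor_w_mm, image_w, image_h):
    """Project a 3D point to pixel coordinates through a pinhole camera.

    Returns (u, v) in pixels, or None if the point is behind the camera.
    """
    # Express the point relative to the camera position.
    x = point[0] - cam_pos[0]
    y = point[1] - cam_pos[1]
    z = point[2] - cam_pos[2]
    # Undo the camera pan (a rotation about the vertical Y axis).
    a = math.radians(pan_deg)
    xc = math.cos(a) * x - math.sin(a) * z
    zc = math.sin(a) * x + math.cos(a) * z
    depth = -zc  # the camera looks down -Z
    if depth <= 0:
        return None
    # Convert the focal length from millimetres to pixels.
    focal_px = focal_mm / sensor_w_mm * image_w
    u = image_w / 2 + focal_px * xc / depth
    v = image_h / 2 - focal_px * y / depth  # image rows grow downwards
    return u, v


# Hypothetical shot: 35 mm lens, 36 mm wide sensor, 1920x1080 image.
cam = (0.0, 1.5, 5.0)
centre = project((0.0, 1.5, 0.0), cam, 0.0, 35.0, 36.0, 1920, 1080)
right = project((1.0, 1.5, 0.0), cam, 0.0, 35.0, 36.0, 1920, 1080)
behind = project((0.0, 1.5, 9.0), cam, 0.0, 35.0, 36.0, 1920, 1080)
print(centre, right, behind)
```

A point straight in front of the camera lands in the middle of the image, and a point to its right lands further right. If the virtual camera has the wrong focal length or position, the projected points no longer fall on the features filmed by the real camera.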

1. Planning

The matchmoving process has several stages. It starts with planning, which happens before any filming occurs. Various departments work together with the director to determine the look of the film. The matchmover has to become familiar with this information in order to recreate the scene virtually after filming.

2. Data Gathering

The second stage is data gathering. It happens on the filming location, where the team measures the props with advanced equipment such as laser survey heads, which engineers and visual effects artists use to locate points in 3D space. Later they can use the data to recreate the environment. They measure basically everything that can be measured, and photograph everything possible on location for later reference. They also keep a record of the camera details: lens type, focal length and focus distance. The more detail recorded, the better.

3. Building the Assets

The next step is asset building. The team heads back to the office and puts all the information, details and references together in order to recreate a virtual scene. They create the props, the character models and the skeleton rigs that will be animated later.

4. Matchmoving

The fourth stage is camera tracking, or matchmoving. A lot of work has been done already. After tracking the real shot, the matchmover recreates a virtual camera, a twin of the real-life camera with the same focal length, lens, height, tilt, position and motion. An image plane is attached to the virtual camera and moves with it. If we look through the virtual camera we can see both the real-life set and the virtual environment. The two sets have to line up with each other (figure 5) so that the computer-generated 3D images can be placed in the scene.


Figure 5

How does the camera tracking software work?

We give the program information; it works through equations using the variables provided and gives us a camera solution. The more and better information we put in, the better the solution we get.

What variables do we have to give the software? We have to point out distinctive spots on the image plane that stand out in the scene, so that the program can track them easily. According to Hornung (2010), any point that stands out from its surroundings, such as a crack in a wall or the corner of an object, works perfectly. These points are important because they match certain points in the virtual set. The program then starts the 2D tracking, which tells us where these features are on every frame of the image sequence. If we connect a point on the 2D image plane to the virtual camera with an imaginary string, the string passes through the virtual scene (the virtual set lies between the virtual camera and the 2D image plane, as in figures 6 and 7) and hits exactly the point that is marked on the 2D plane. That virtual point is called a 3D locator. When the virtual camera moves or rotates, the image plane moves with it, since they are connected to each other. In every frame the virtual scene lines up with the image plane perfectly.

Figure 6

Figure 7
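The imaginary string can be written as a ray from the camera through a pixel. The following sketch assumes a camera at a known position with no rotation, looking down the negative Z axis, and a focal length given in pixels. It measures how far a 3D locator lies from the string through its 2D track. The names pixel_ray and distance_to_ray, and all the numbers, are made up for illustration.

```python
import math


def pixel_ray(u, v, focal_px, image_w, image_h):
    """Unit direction of the ray from the camera through pixel (u, v)."""
    d = (u - image_w / 2, -(v - image_h / 2), -focal_px)
    length = math.sqrt(sum(c * c for c in d))
    return tuple(c / length for c in d)


def distance_to_ray(point, origin, direction):
    """Shortest distance from a 3D point to the ray origin + t * direction."""
    rel = tuple(p - o for p, o in zip(point, origin))
    # Distance along the ray of the closest point (never behind the camera).
    t = max(0.0, sum(r * d for r, d in zip(rel, direction)))
    closest = tuple(o + t * d for o, d in zip(origin, direction))
    return math.dist(point, closest)


cam_pos = (0.0, 0.0, 0.0)
locator = (1.0, 0.5, -4.0)          # hypothetical 3D locator in the virtual set

good = pixel_ray(1210, 415, 1000.0, 1920, 1080)     # correct 2D track
drifted = pixel_ray(1230, 415, 1000.0, 1920, 1080)  # track that slipped 20 px

good_error = distance_to_ray(locator, cam_pos, good)
drift_error = distance_to_ray(locator, cam_pos, drifted)
print(good_error, drift_error)
```

When the 2D track and the virtual camera agree, the string passes through the locator and the distance is practically zero. A track that has drifted, or a camera that is not yet solved correctly, leaves a visible gap.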

In Hornung’s words (2010):

“For every 2D position on a plane and every associated 3D locator in the virtual set, the software stretches a string from the camera through the virtual location and to the plane, manoeuvring the virtual camera until all the strings line up correctly. The technical term for this is triangulation.”

“Triangulation is a process of determining the locations of a point in 3D space by calculating its relationship to known points.”

“Software, this bathroom tile here on the plane is right here on my model. This one right here is here on my model. This green box is right here on my model. I know how wide these tiles are, and how big the box is, so you can figure out the rest. Match it up – I’m going for a coffee.”
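As a small illustration of the definition quoted above, the sketch below triangulates one 3D point from two strings (rays) leaving two known camera positions. It uses the midpoint of the shortest segment between the two rays, which is one simple method among several and an assumption on our part; the source does not say which method the software uses. The function name triangulate and the sample positions are hypothetical.

```python
def triangulate(o1, d1, o2, d2):
    """Estimate the 3D point where two rays (origin o, direction d) meet.

    Returns the midpoint of the shortest segment between the two rays,
    or None if the rays are parallel and no single point can be found.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w = tuple(a - b for a, b in zip(o1, o2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    dw, ew = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        return None
    # Parameters of the closest points on each ray.
    t1 = (b * ew - c * dw) / denom
    t2 = (a * ew - b * dw) / denom
    p1 = tuple(o + t1 * k for o, k in zip(o1, d1))
    p2 = tuple(o + t2 * k for o, k in zip(o2, d2))
    return tuple((x + y) / 2 for x, y in zip(p1, p2))


# Two hypothetical camera positions looking at the same feature.
cam_a, ray_a = (0.0, 0.0, 0.0), (1.0, 2.0, -5.0)
cam_b, ray_b = (4.0, 0.0, 0.0), (-3.0, 2.0, -5.0)

point = triangulate(cam_a, ray_a, cam_b, ray_b)
print(point)
```

With perfect data the two strings cross exactly at the feature. With real footage they only pass close to each other, which is why the midpoint is used as the estimate.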

What kinds of matchmove tasks are there? There is not just one type of matchmove; the term actually comprises a few different functions. The most common type is camera tracking, where we want to create a duplicate of the live-action camera. So far we have discussed surveyed camera tracking in depth. There are also surveyless matchmoves: occasionally we get no survey data, or no information about the camera at all. According to Tim Dobbert (2005), when there is no information about the camera at all, the matchmove team should use the automated camera tracking method. The automatic tracking system picks 2D points automatically and tracks them through the image sequence (figure 8). It then uses mathematical algorithms to create the virtual camera and the point cloud (a group of 3D locators, figure 9).


Figure 8


Figure 9
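A very simple way to follow a 2D point from frame to frame is to search the next frame for the small image patch that looks most like the feature. The sketch below does this with a sum of squared differences on tiny greyscale frames stored as lists of lists. It is only an illustration under our own assumptions; commercial tracking software is far more sophisticated, and the names ssd, track and make_frame are hypothetical.

```python
def ssd(frame, top, left, patch):
    """Sum of squared differences between patch and a window of frame."""
    return sum(
        (frame[top + r][left + c] - patch[r][c]) ** 2
        for r in range(len(patch))
        for c in range(len(patch[0]))
    )


def track(frame, patch, prev_top, prev_left, search=2):
    """Find the window near the previous position that best matches patch."""
    ph, pw = len(patch), len(patch[0])
    best = None
    for top in range(max(0, prev_top - search),
                     min(len(frame) - ph, prev_top + search) + 1):
        for left in range(max(0, prev_left - search),
                          min(len(frame[0]) - pw, prev_left + search) + 1):
            score = ssd(frame, top, left, patch)
            if best is None or score < best[0]:
                best = (score, top, left)
    return best[1], best[2]


def make_frame(top, left, size=6):
    """Build a dull test frame with one distinctive 2x2 feature."""
    frame = [[0.1] * size for _ in range(size)]
    frame[top][left] = 1.0
    frame[top][left + 1] = 0.8
    frame[top + 1][left] = 0.6
    frame[top + 1][left + 1] = 0.9
    return frame


frame_a = make_frame(1, 1)                        # feature at row 1, column 1
frame_b = make_frame(2, 3)                        # feature has moved
patch = [row[1:3] for row in frame_a[1:3]]        # cut the feature out of frame A

new_pos = track(frame_b, patch, 1, 1)
print(new_pos)
```

The feature is found again at its new position in the second frame. This also shows why distinctive points such as corners are preferred: on a plain wall every window looks the same and the best match is ambiguous.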

Conclusion

Real-time augmented reality and non-real-time augmented reality in films are different topics, but they are related to each other. In both there is a system that tracks the live scene so that a computer-generated image can be placed with the correct position, scale and orientation relative to the objects on the live set. The goal of augmented reality is to create a high level of consistency between real and virtual environments.

Reference list:

Text

Sutherland, I. E. (1965). "The Ultimate Display". Proceedings of IFIP 65, vol. 2, pp. 506-508.

Dobbert, T. (2005). Matchmoving: The Invisible Art of Camera Tracking. SYBEX Inc.

Bimber, O. and Raskar, R. (2005). Spatial Augmented Reality. A K Peters, Ltd.

Hornung, E. (2010). The Art and Technique of Matchmoving. Focal Press.

Cawood, S. (2007). Augmented Reality: A Practical Guide. The Pragmatic Bookshelf.

Images/Figures

1. http://blog.tmcnet.com/blog/tom-keating/images/r2-d2-princess-leia-hologram.jpg

2. http://www.jvrb.org/articles/34/figure2.jpg

3. http://handheldar.icg.tugraz.at/images/artoolkitplus_Smartphone.jpg

4. http://www.uni-weimar.de/cms/uploads/RTEmagicC_sarc.JPG.jpg

5. 6. 7. 8. The Art and Technique of Matchmoving. Focal Press.

Research Practice Blog

http://foridesignblog.blogspot.com/


Thursday, 29 September 2011
