Fast and Scalable Structure-from-Motion for High-precision Mobile Augmented Reality Systems

TR Number
Date
2014-04-24
Journal Title
Journal ISSN
Volume Title
Publisher
Virginia Tech
Abstract

A key problem in mobile computing is providing people access to necessary cyber-information associated with their surrounding physical objects. Mobile augmented reality is one of the emerging techniques that address this key problem by allowing users to see the cyber-information associated with real-world physical objects by overlaying that cyber-information on the physical objects's imagery. As a consequence, many mobile augmented reality approaches have been proposed to identify and visualize relevant cyber-information on users' mobile devices by intelligently interpreting users' positions and orientations in 3D and their associated surroundings. However, existing approaches for mobile augmented reality primarily rely on Radio Frequency (RF) based location tracking technologies (e.g., Global Positioning Systems or Wireless Local Area Networks), which typically do not provide sufficient precision in RF-denied areas or require additional hardware and custom mobile devices.

To remove the dependency on external location tracking technologies, this dissertation presents a new vision-based context-aware approach for mobile augmented reality that allows users to query and access semantically-rich 3D cyber-information related to real-world physical objects and see it precisely overlaid on top of imagery of the associated physical objects. The approach does not require any RF-based location tracking modules, external hardware attachments on the mobile devices, and/or optical/fiducial markers for localizing a user's position. Rather, the user's 3D location and orientation are automatically and purely derived by comparing images from the user's mobile device to a 3D point cloud model generated from a set of pre-collected photographs.

A further challenge of mobile augmented reality is creating 3D cyber-information and associating it with real-world physical objects, especially using the limited 2D user interfaces in standard mobile devices. To address this challenge, this research provides a new image-based 3D cyber-physical content authoring method designed specifically for the limited screen sizes and capabilities of commodity mobile devices. This new approach does not only provide a method for creating 3D cyber-information with standard mobile devices, but also provides an automatic association of user-driven cyber-information with real-world physical objects in 3D.

Finally, a key challenge of scalability for mobile augmented reality is addressed in this dissertation. In general, mobile augmented reality is required to work regardless of users' location and environment, in terms of physical scale, such as size of objects, and in terms of cyber-information scale, such as total number of cyber-information entities associated with physical objects. However, many existing approaches for mobile augmented reality have mainly tested their approaches on limited real-world use-cases and have challenges in scaling their approaches. By designing fast direct 2D-to-3D matching algorithms for localization, as well as applying caching scheme, the proposed research consistently supports near real-time localization and information association regardless of users' location, size of physical objects, and number of cyber-physical information items.

To realize all of these research objectives, five research methods are developed and validated: 1) Hybrid 4-Dimensional Augmented Reality (HD4AR), 2) Plane transformation based 3D cyber-physical content authoring from a single 2D image, 3) Cached k-d tree generation for fast direct 2D-to-3D matching, 4) double-stage matching algorithm with a single indexed k-d tree, and 5) K-means Clustering of 3D physical models with geo-information. After discussing each solution with technical details, the perceived benefits and limitations of the research are discussed with validation results.

Description
Keywords
3D Reconstruction, Structure-from-Motion, 3D Cyber-physical Modeling, Direct 2D-to-3D Matching, Image-based Localization, Mobile Augmented Reality
Citation