Browsing by Author "Cao, Yong"
Now showing 1 - 20 of 40
- An Analysis of Conventional & Heterogeneous Workloads on Production Supercomputing Resources. Berkhahn, Jonathan Allen (Virginia Tech, 2013-06-06). Cloud computing setups are a huge investment of resources and personnel to maintain. As the workload on a system is a major contributing factor to both the performance of the system and a representation of the needs of system users, a clear understanding of the workload is critical to organizations that support supercomputing systems. In this paper, we analyze traces from two production-level supercomputers to infer the characteristics of their workloads, and make observations as to the needs of supercomputer users based on them. We particularly focus on the usage of graphical processing units by domain scientists. Based on this analysis, we generate a synthetic workload that can be used for testing future systems, and make observations as to efficient resource provisioning.
- Analysis of the Relationships between Changes in Distributed System Behavior and Group Dynamics. Lazem, Shaimaa (Virginia Tech, 2012-04-06). The rapid evolution of portable devices and social media has enabled pervasive forms of distributed cooperation. A group could perform a task using a heterogeneous set of devices (desktop, mobile), connections (wireless, wired, 3G) and software clients. We call such systems Distributed Dynamic Cooperative Environments (DDCEs). Content in DDCEs is created and shared by the users. The content could be static (e.g., video or audio), dynamic (e.g., wikis), and/or Objects with behavior. Objects with behavior are programmed objects that take advantage of the available computational services (e.g., cloud-based services). Providing a desired Quality of Experience (QoE) in DDCEs is a challenge for cooperative systems designers. DDCEs are expected to provide groups with the utmost flexibility in conducting their cooperative activities. More flexibility at the user side means less control and predictability of the groups' behavior at the system side. Due to the lack of Quality of Service (QoS) guarantees in DDCEs, groups may experience changes in the system behavior that are usually manifested as delays and inconsistencies in the shared state. We question the extent to which cooperation among group members is sensitive to system changes in DDCEs. We argue that a QoE definition for groups should account for cooperation emergence and sustainability. An experiment was conducted, where fifteen groups performed a loosely-coupled task that simulates social traps in a 3D virtual world. The groups were exposed to two forms of system delays. Exo-content delays are exogenous to the provided content (e.g., network delay). Endo-content delays are endogenous to the provided content (e.g., delay in processing time for Objects with behavior). Groups' performance in the experiment and their verbal communication were recorded and analyzed. The results demonstrate the nonlinearity of groups' behavior when dealing with endo-content delays. System interventions are needed to maintain QoE even though that may increase the cost or the required resources. Systems are designed to be used rather than understood by users. When the system behavior changes, designers have two choices. The first is to expect the users to understand the system behavior and adjust their interaction accordingly. That did not happen in our experiment. Understanding the system behavior informed groups' behavior and partially influenced how the groups succeeded or failed in accomplishing their goal. The second choice is to understand the semantics of the application and provide guarantees based on these semantics. Based on our results, we introduce the following design guidelines for QoE provision in DDCEs. • If possible, the system should keep track of information about group goals and add guarding constraints to protect these goals. • QoE guarantees should be provided based on the semantics of the user-generated content that constitutes the group activity. • Users should be given the option to define the content that is sensitive to system changes (e.g., Objects with behavior that are sensitive to delays or require intensive computations) to avoid the negative impacts of endo-content delays. • Users should define the Objects with behavior that contribute to the shared state in order for the system to maintain the consistency of the shared state. • Endo-content delays were proven to have significantly negative impacts on the groups in our experiment compared to exo-content delays. We argue that system designers, if they have the choice, should trade processing time needed for Objects with behavior for exo-content delay.
- An Application-Oriented Approach for Accelerating Data-Parallel Computation with Graphics Processing UnitPonce, Sean; Jing, Huang; Park, Seung In; Khoury, Chase; Quek, Francis; Cao, Yong (Department of Computer Science, Virginia Polytechnic Institute & State University, 2009-03-01)This paper presents a novel parallelization and quantitative characterization of various optimization strategies for data-parallel computation on a graphics processing unit (GPU) using NVIDIA's new GPU programming framework, Compute Unified Device Architecture (CUDA). CUDA is an easy-to-use development framework that has drawn the attention of many different application areas looking for dramatic speed-ups in their code. However, the performance tradeoffs in CUDA are not yet fully understood, especially for data-parallel applications. Consequently, we study two fundamental mathematical operations that are common in many data-parallel applications: convolution and accumulation. Specifically, we profile and optimize the performance of these operations on a 128-core NVIDIA GPU. We then characterize the impact of these operations on a video-based motion-tracking algorithm called vector coherence mapping, which consists of a series of convolutions and dynamically weighted accumulations, and present a comparison of different implementations and their respective performance profiles.
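The entry above names convolution and weighted accumulation as the two data-parallel primitives it profiles with CUDA, but the abstract shows no code. The sketch below is a minimal illustration of what such kernels typically look like, not the paper's implementation; the filter radius, kernel names, and the single atomic accumulator are assumptions made only for this example.

```cuda
// Illustrative only: simple CUDA kernels for the two primitives the paper
// profiles (convolution and weighted accumulation). The paper's tuned,
// shared-memory variants and the vector coherence mapping pipeline are not
// reproduced here.
#define R 3  // assumed filter radius (filter length 2*R + 1)

__global__ void conv1d(const float* in, const float* filt, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float acc = 0.0f;
    for (int k = -R; k <= R; ++k) {
        int j = i + k;
        if (j >= 0 && j < n)                  // clamp at the borders
            acc += in[j] * filt[k + R];
    }
    out[i] = acc;
}

// Dynamically weighted accumulation into a single bin.
__global__ void accumulate(const float* vals, const float* weights,
                           float* sum, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(sum, vals[i] * weights[i]);
}
```

Much of the paper's analysis concerns how choices such as memory layout and thread-block shape change the performance of kernels of exactly this kind.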
- Architecture-Aware Mapping and Optimization on Heterogeneous Computing SystemsDaga, Mayank (Virginia Tech, 2011-04-27)The emergence of scientific applications embedded with multiple modes of parallelism has made heterogeneous computing systems indispensable in high performance computing. The popularity of such systems is evident from the fact that three out of the top five fastest supercomputers in the world employ heterogeneous computing, i.e., they use dissimilar computational units. A closer look at the performance of these supercomputers reveals that they achieve only around 50% of their theoretical peak performance. This suggests that applications that were tuned for erstwhile homogeneous computing may not be efficient for today's heterogeneous computing and hence, novel optimization strategies are required to be exercised. However, optimizing an application for heterogeneous computing systems is extremely challenging, primarily due to the architectural differences in computational units in such systems. This thesis intends to act as a cookbook for optimizing applications on heterogeneous computing systems that employ graphics processing units (GPUs) as the preferred mode of accelerators. We discuss optimization strategies for multicore CPUs as well as for the two popular GPU platforms, i.e., GPUs from AMD and NVIDIA. Optimization strategies for NVIDIA GPUs have been well studied but when applied on AMD GPUs, they fail to measurably improve performance because of the differences in underlying architecture. To the best of our knowledge, this research is the first to propose optimization strategies for AMD GPUs. Even on NVIDIA GPUs, there exists a lesser known but an extremely severe performance pitfall called partition camping, which can affect application performance by up to seven-fold. To facilitate the detection of this phenomenon, we have developed a performance prediction model that analyzes and characterizes the effect of partition camping in GPU applications. We have used a large-scale, molecular modeling application to validate and verify all the optimization strategies. Our results illustrate that if appropriately optimized, AMD and NVIDIA GPUs can provide 371-fold and 328-fold improvement, respectively, over a hand-tuned, SSE-optimized serial implementation.
- AVIST: A GPU-Centric Design for Visual Exploration of Large Multidimensional DatasetsMi, Peng; Sun, Maoyuan; Masiane, Moeti; Cao, Yong; North, Christopher L. (MDPI, 2016-10-07)This paper presents the Animated VISualization Tool (AVIST), an exploration-oriented data visualization tool that enables rapidly exploring and filtering large time series multidimensional datasets. AVIST highlights interactive data exploration by revealing fine data details. This is achieved through the use of animation and cross-filtering interactions. To support interactive exploration of big data, AVIST features a GPU (Graphics Processing Unit)-centric design. Two key aspects are emphasized on the GPU-centric design: (1) both data management and computation are implemented on the GPU to leverage its parallel computing capability and fast memory bandwidth; (2) a GPU-based directed acyclic graph is proposed to characterize data transformations triggered by users’ demands. Moreover, we implement AVIST based on the Model-View-Controller (MVC) architecture. In the implementation, we consider two aspects: (1) user interaction is highlighted to slice big data into small data; and (2) data transformation is based on parallel computing. Two case studies demonstrate how AVIST can help analysts identify abnormal behaviors and infer new hypotheses by exploring big datasets. Finally, we summarize lessons learned about GPU-based solutions in interactive information visualization with big data.
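As a rough illustration of the "slice big data into small data" idea in the AVIST entry above, a GPU cross-filter can be expressed as a per-record predicate pass. This is only a sketch under an assumed column-per-dimension layout and assumed names; AVIST's actual data management and its GPU-based directed acyclic graph are not described in enough detail in the abstract to reproduce.

```cuda
// Illustrative sketch, not AVIST's code: a brute-force GPU range filter.
// Each thread tests one record against [lo, hi] on one dimension and ANDs
// the result into a 0/1 mask (assumed to be initialized to 1). Chaining one
// such pass per active filter yields the cross-filtered subset.
__global__ void rangeFilter(const float* column, int n,
                            float lo, float hi, unsigned char* mask)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        mask[i] &= (column[i] >= lo && column[i] <= hi) ? 1 : 0;
}
```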
- Characterization and Exploitation of GPU Memory Systems. Lee, Kenneth Sydney (Virginia Tech, 2012-07-06). Graphics Processing Units (GPUs) are workhorses of modern performance due to their ability to achieve massive speedups on parallel applications. The massive number of threads that can be run concurrently on these systems allows applications that have data-parallel computations to achieve better performance when compared to traditional CPU systems. However, the GPU is not perfect for all types of computation. The massively parallel SIMT architecture of the GPU can still be constraining in terms of achievable performance. GPU-based systems will typically only be able to achieve between 40%-60% of their peak performance. One of the major problems affecting this efficiency is the GPU memory system, which is tailored to the needs of graphics workloads instead of general-purpose computation. This thesis intends to show the importance of memory optimizations for GPU systems. In particular, this work addresses the problems of data transfer and global atomic memory contention. Using the novel AMD Fusion architecture, we gain overall performance improvements over discrete GPU systems for data-intensive applications. The fused architecture systems offer an interesting trade-off by increasing data transfer rates at the cost of some raw computational power. We characterize the performance of the different memory paths that are possible because of the shared memory space present on the fused architecture. In addition, we provide a theoretical model which can be used to correctly predict the comparative performance of memory movement techniques for a given data-intensive application and system. In terms of global atomic memory contention, we show improvements in scalability and performance for global synchronization primitives by avoiding contentious global atomic memory accesses. In general, this work shows the importance of understanding the memory system of the GPU architecture to achieve better application performance.
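The thesis above mentions avoiding contentious global atomic accesses but the abstract does not show how. A common pattern, sketched below purely for illustration (the thesis' actual primitives and benchmarks are not reproduced, and the kernel name is an assumption), is to aggregate within each thread block in shared memory and issue a single global atomic per block.

```cuda
// Illustrative sketch of reducing global atomic contention: threads first
// combine values in shared memory within a block, and only one thread per
// block touches the global counter.
__global__ void histogramBin(const int* data, int n, int bin, int* globalCount)
{
    __shared__ int blockCount;
    if (threadIdx.x == 0) blockCount = 0;
    __syncthreads();

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && data[i] == bin)
        atomicAdd(&blockCount, 1);           // contention confined to one block
    __syncthreads();

    if (threadIdx.x == 0 && blockCount > 0)
        atomicAdd(globalCount, blockCount);  // one global atomic per block
}
```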
- Clustered Layout Word Cloud for User Generated Online Reviews. Wang, Ji (Virginia Tech, 2012-11-20). User generated reviews, like those found on Yelp and Amazon, have become important reference material in casual decision making, like dining, shopping and entertainment. However, the very large number of reviews makes the review reading process time-consuming. A text visualization can speed up the review reading process. In this thesis, we present the clustered layout word cloud -- a text visualization that quickens decision making based on user generated reviews. We used a natural language processing approach, called grammatical dependency parsing, to analyze user generated review content and create a semantic graph. A force-directed graph layout was applied to the graph to create the clustered layout word cloud. We conducted a two-task user study to compare the clustered layout word cloud to two alternative review reading techniques: a random layout word cloud and normal block-text reviews. The results showed that the clustered layout word cloud offers faster task completion time and better user satisfaction than the other two review reading techniques.
- Co-Located Many-Player Gaming on Large High-Resolution DisplaysMachaj, David Andrew (Virginia Tech, 2009-05-04)Two primary types of multiplayer gaming have emerged over the years. The first type involves co-located players on a shared display, and typically caps at four players. The second type of gaming provides a single display for each player. This type scales well beyond four players, but places no requirement on co-location. This paper will attempt to combine the best of both worlds via high-resolution, highly-multiplayer gaming. Over the past few years, there has been a rise in the number of extremely high-resolution, tiled displays. These displays provide an enormous amount of screen space to work with. This space was used to allow twelve co-located players to play a game together. This study accomplishes three things: we designed and built PyBomber, a high-resolution and highly multiplayer game for up to twelve players; secondly, user trials were conducted to see whether this type of gaming is enjoyable as well as to learn what sorts of social interactions take place amongst so many players; lastly, the lessons learned were generalized into design criteria for future high-resolution games. Results show that with more people, much more of the time during a game was filled with vocal interactions between players. There were also more physical movements in the larger games. Over the course of this study, we learned that good high-resolution games will: decide between a singular gameplay area and split views, use the physical space in front of the display, provide feedback that is localized to each player, and utilize input devices appropriately.
- Context Sensitive Interaction Interoperability for Distributed Virtual Environments. Ahmed, Hussein Mohammed (Virginia Tech, 2010-05-28). The number and types of input devices and related interaction techniques are growing rapidly. Innovative input devices such as game controllers are no longer used just for games, proprietary consoles and specific applications; they are also used in many distributed virtual environments, especially the so-called serious virtual environments. In this dissertation, a distributed, service-based framework is presented to offer context-sensitive interaction interoperability that can support mapping between input devices and suitable application tasks given the attributes (devices, applications, users, and interaction techniques) and the current user context, without negatively impacting the performance of large-scale distributed environments. The mapping is dynamic and context sensitive, taking into account the context dimensions of both the virtual and real planes. Which device or device component to use, and how and when to use them, depend on the application, the task performed, the user and the overall context, including location and presence of other users. Another use of interaction interoperability is as a testbed for input devices and interaction techniques, making it possible to test reality-based interfaces and interaction techniques with legacy applications. The dissertation provides a description of how the framework provides these affordances and a discussion of the motivations, goals and the challenges addressed. Several proof-of-concept implementations were developed, and an evaluation of the framework performance (in terms of system characteristics) demonstrates viability, scalability and negligible delays.
- Design Study to Visualize Stock Market Bubble Formations and BurstsIyer, Sruthi Ganesan (Virginia Tech, 2014-06-23)The stock market is a very complex and continuously changing environment in which many varying factors shape its growth and decline. Studying interesting trends and analyzing the intricate movements of the market, while ignoring distracting and uninteresting patterns has the potential to save large amounts of money for individuals as well as corporations and governments. This thesis describes research that was conducted with the goal to visualize stock market data in such a way that it is able to show how behavior and movement of various market entities affects the condition of the market as a whole. Different visualizations have been proposed, some that improve on existing traditional methods used by the Finance community and others that are novel in their layout and representation of data and interactions. The proposed design, by the use of interactive multiple coordinated views showing overviews and details of the stock market data using animated bubble charts and statistics, aims to enable the user to visualize market conditions that lead to the formation of a bubble in the market, how they lead to a crash and how the market corrects itself after such a crash.
- Electric Propulsion Plume Simulations Using Parallel Computer. Wang, Joseph J.; Cao, Yong; Kafafy, Raed; Decyk, Viktor (Hindawi, 2007-01-01). A parallel, three-dimensional electrostatic PIC code is developed for large-scale electric propulsion simulations using parallel supercomputers. This code uses a newly developed immersed-finite-element particle-in-cell (IFE-PIC) algorithm designed to handle complex boundary conditions accurately while maintaining the computational speed of the standard PIC code. Domain decomposition is used in both the field solve and the particle push to divide the computation among processors. Two simulation studies are presented to demonstrate the capability of the code. The first is a full particle simulation of a near-thruster plume using the real ion-to-electron mass ratio. The second is a high-resolution simulation of multiple ion thruster plume interactions for a realistic spacecraft using a domain enclosing the entire solar array panel. Performance benchmarks show that the IFE-PIC code achieves a high parallel efficiency of ≥ 90%.
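For readers unfamiliar with PIC codes, the "particle push" phase mentioned above is sketched below. This is the generic leapfrog update that any electrostatic PIC code performs, written as a CUDA kernel only for illustration; the paper's IFE-PIC code runs on parallel supercomputers with domain decomposition rather than on a GPU, and its immersed-finite-element field solve and boundary handling are not represented. The struct and field names are assumptions.

```cuda
// Generic 1D leapfrog particle push, shown only to make the "particle push"
// phase concrete. Not the IFE-PIC implementation.
struct Particles {          // struct-of-arrays layout (assumed)
    float *x, *v, *E;       // position, velocity, interpolated electric field
};

__global__ void pushParticles(Particles p, int n, float qm, float dt)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    p.v[i] += qm * p.E[i] * dt;   // accelerate in the local electric field
    p.x[i] += p.v[i] * dt;        // advance position
}
```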
- Exploring Performance Portability for Accelerators via High-level Parallel Patterns. Hou, Kaixi (Virginia Tech, 2018-08-27). Nowadays, parallel accelerators have become prominent and ubiquitous, e.g., multi-core CPUs, many-core GPUs (Graphics Processing Units) and Intel Xeon Phi. The performance gains from them can be as high as many orders of magnitude, attracting extensive interest from many scientific domains. However, the gains are closely followed by two main problems: (1) a complete redesign of existing codes might be required if a new parallel platform is used, leading to a nightmare for developers; (2) parallel codes that execute efficiently on one platform might be either inefficient or even non-executable on another platform, causing portability issues. To handle these problems, in this dissertation, we propose a general approach using parallel patterns, an effective and abstracted layer that eases the generation of efficient parallel codes for given algorithms across architectures. From algorithms to parallel patterns, we exploit domain expertise to analyze the computational and communication patterns in the core computations and represent them in a DSL (Domain Specific Language) or algorithmic skeletons. This preserves the essential information, such as data dependencies, types, etc., for subsequent parallelization and optimization. From parallel patterns to actual codes, we use a series of automation frameworks and transformations to determine which levels of parallelism can be used, what the optimal instruction sequences are, how the implementation should change to match different architectures, and so on. We evaluate our approaches by investigating a couple of important computational kernels, including sort (and segmented sort), sequence alignment, stencils, etc., across various parallel platforms (CPUs, GPUs, Intel Xeon Phi).
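To make the "parallel patterns" idea above concrete, the sketch below shows a map skeleton in CUDA: the traversal is written once and the per-element operation is supplied as a functor, so the same pattern can be retargeted or retuned per architecture. This illustrates the general concept only; the dissertation's DSL, algorithmic skeletons, and code-generation frameworks are not shown in the abstract, and all names here are assumptions.

```cuda
// Minimal "map" parallel pattern: the skeleton is generic, the operation is
// plugged in. Not the dissertation's framework, just the underlying idea.
template <typename T, typename Op>
__global__ void mapPattern(const T* in, T* out, int n, Op op)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = op(in[i]);
}

struct Scale {                       // example operation (assumed)
    float s;
    __device__ float operator()(float x) const { return s * x; }
};

// usage: mapPattern<<<blocks, threads>>>(d_in, d_out, n, Scale{2.0f});
```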
- Exploring the Effects of Higher-Fidelity Display and Interaction for Virtual Reality GamesMcMahan, Ryan Patrick (Virginia Tech, 2011-12-05)In recent years, consumers have witnessed a technological revolution that has delivered more-realistic experiences in their own homes. Expanding technologies have provided larger displays with higher resolutions, faster refresh rates, and stereoscopic capabilities. These advances have increased the level of display fidelity—the objective degree of exactness with which real-world sensory stimuli are reproduced by a display system. Similarly, the latest generation of video game systems (e.g., Nintendo Wii and Xbox Kinect) with their natural, gesture-based interactions have delivered increased levels of interaction fidelity—the objective degree of exactness with which real-world interactions can be reproduced in an interactive system. Though this technological revolution has provided more realistic experiences, it is not completely clear how increased display fidelity and interaction fidelity impact the user experience because the effects of increasing fidelity to the real world have not been empirically established. The goal of this dissertation is to provide a better understanding of the effects of both display fidelity and interaction fidelity on the user experience. For the context of our research, we chose virtual reality (VR) games because immersive VR allows for high levels of fidelity to be achieved while games usually involve complex, performance-intensive tasks. In regard to the user experience, we were concerned with objective performance metrics and subjective responses such as presence, engagement, perceived usability, and overall preferences. We conducted five systematically controlled studies that evaluated display and interaction fidelity at contrasting levels in order to gain a better understanding of their effects. In our first study, which involved a 3D object manipulation game within a three-sided CAVE, we found that stereoscopy and the total size of the visual field surrounding the user (i.e., field of regard or FOR) did not have a significant effect on manipulation times but two high-fidelity interaction techniques based on six degrees-of-freedom (DOF) input outperformed a low-fidelity technique based on keyboard and mouse input. In our second study, which involved a racing game on a commercial game console, we solely investigated interaction fidelity and found that two low-fidelity steering techniques based on 2D joystick input outperformed two high-fidelity steering techniques based on 3D accelerometer data in terms of lap times and driving errors. Our final three studies involved a first-person shooter (FPS) game implemented within a six-sided CAVE. In the first of these FPS studies, we evaluated display fidelity and interaction fidelity independently, at extremely high and low levels, and found that both significantly affected strategy, performance, presence, engagement, and perceived usability. In particular, performance results were strongly in favor of two conditions: low-display, low-interaction fidelity (representative of desktop FPS games) and high-display, high-interaction fidelity (similar to the real world). In the second FPS study, we investigated the effects of FOR and pointing fidelity on the subtasks of searching, aiming, and firing. 
We found that increased FOR affords faster searching and that high-fidelity pointing based on 6-DOF input provided faster aiming than low-fidelity mouse pointing and a mid-fidelity mouse technique based on the heading of the user. In the third FPS study, we investigated the effects of FOR and locomotion fidelity on the subtasks of long-distance navigation and maneuvering. Our results indicated that increased FOR increased perceived usability but had no significant effect on actual performance, while low-fidelity keyboard-based locomotion outperformed our high-fidelity locomotion technique developed for our original FPS study. The results of our five studies show that increasing display fidelity tends to have a positive correlation to user performance, especially for some components such as FOR. Contrastingly, our results have indicated that interaction fidelity has a non-linear correlation to user performance, with users performing better with "traditional", extremely low-fidelity techniques and "natural", extremely high-fidelity techniques while performing worse with mid-fidelity interaction techniques. These correlations demonstrate that the display fidelity and interaction fidelity continua appear to have differing effects on the user experience for VR games. In addition to learning more about the effects of display fidelity and interaction fidelity, we have also developed the Framework for Interaction Fidelity Analysis (FIFA) for comparing interaction techniques to their real-world counterparts. There are three primary factors of concern within FIFA: biomechanical symmetry, control symmetry, and system appropriateness. Biomechanical symmetry involves the comparison of the kinematic, kinetic, and anthropometric aspects of two interactions. Control symmetry compares the dimensional, transfer function, and termination characteristics of two interactions. System appropriateness is concerned with how well a VR system matches the interaction space and objects of the real-world task (e.g., a driving simulator is more appropriate than a 2D joystick for a steering task). Although consumers have witnessed a technological revolution geared towards more realistic experiences in recent years, we have demonstrated with this research that there is still much to be learned about the effects of increasing a system's fidelity to the real world. The results of our studies show that the levels of display and interaction fidelity are significant factors in determining performance, presence, engagement, and usability.
- Generalizing the Utility of Graphics Processing Units in Large-Scale Heterogeneous Computing Systems. Xiao, Shucai (Virginia Tech, 2013-07-03). Today, heterogeneous computing systems are widely used to meet the increasing demand for high-performance computing. These systems commonly use powerful and energy-efficient accelerators to augment general-purpose processors (i.e., CPUs). The graphics processing unit (GPU) is one such accelerator. Originally designed solely for graphics processing, GPUs have evolved into programmable processors that can deliver massive parallel processing power for general-purpose applications. Using SIMD (Single Instruction Multiple Data) based components as building units, the current GPU architecture is well suited for data-parallel applications where the execution of each task is independent. With the delivery of programming models such as Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL), programming GPUs has become much easier than before. However, developing and optimizing an application on a GPU is still a challenging task, even for well-trained computing experts. Such programming tasks will be even more challenging in large-scale heterogeneous systems, particularly in the context of utility computing, where GPU resources are used as a service. These challenges are largely due to the limitations in the current programming models: (1) there are no intra- and inter-GPU cooperative mechanisms that are natively supported; (2) current programming models only support the utilization of GPUs installed locally; and (3) to use GPUs on another node, application programs need to explicitly call application programming interface (API) functions for data communication. To reduce the mapping efforts and to better utilize the GPU resources, we investigate generalizing the utility of GPUs in large-scale heterogeneous systems with GPUs as accelerators. We generalize the utility of GPUs through the transparent virtualization of GPUs, which can enable applications to view all GPUs in the system as if they were installed locally. As a result, all GPUs in the system can be used as local GPUs. Moreover, GPU virtualization is a key capability to support the notion of "GPU as a service." Specifically, we propose the virtual OpenCL (or VOCL) framework for the transparent virtualization of GPUs. To achieve good performance, we optimize and extend the framework in three aspects: (1) optimize VOCL by reducing the data transfer overhead between the local node and remote node; (2) propose GPU synchronization to reduce the overhead of switching back and forth if multiple kernel launches are needed for data communication across different compute units on a GPU; and (3) extend VOCL to support live virtual GPU migration for quick system maintenance and load rebalancing across GPUs. With the above optimizations and extensions, we thoroughly evaluate VOCL along three dimensions: (1) show the performance improvement for each of our optimization strategies; (2) evaluate the overhead of using remote GPUs via several microbenchmark suites as well as a few real-world applications; and (3) demonstrate the overhead as well as the benefit of live virtual GPU migration. Our experimental results indicate that VOCL can generalize the utility of GPUs in large-scale systems at a reasonable virtualization and migration cost.
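One of the optimizations listed above, GPU synchronization that avoids repeated kernel launches, is commonly realized as a software barrier across thread blocks. The sketch below is a simplified, single-use atomic-counter barrier offered only as an illustration of that idea; it is not the VOCL implementation, the names are assumptions, and it is only safe when every block of the grid is simultaneously resident on the GPU (otherwise it deadlocks).

```cuda
// Illustrative software global barrier between thread blocks. Single-use:
// the counter is never reset. Requires all blocks to be co-resident.
__device__ volatile int g_arrived = 0;

__device__ void globalBarrier(int numBlocks)
{
    __threadfence();                       // publish this block's prior writes
    __syncthreads();
    if (threadIdx.x == 0) {
        atomicAdd((int*)&g_arrived, 1);    // announce arrival
        while (g_arrived < numBlocks) {}   // spin until every block arrives
    }
    __syncthreads();                       // release the rest of the block
}
// Called by every block between two dependent phases of one kernel, in place
// of ending the kernel and launching a second one.
```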
- GPU Based Large Scale Multi-Agent Crowd Simulation and Path PlanningGusukuma, Luke (Virginia Tech, 2015-04-26)Crowd simulation is used for many applications including (but not limited to) videogames, building planning, training simulators, and various virtual environment applications. Particularly, crowd simulation is most useful for when real life practices wouldn't be practical such as repetitively evacuating a building, testing the crowd flow for various building blue prints, placing law enforcers in actual crowd suppression circumstances, etc. In our work, we approach the fidelity to scalability problem of crowd simulation from two angles, a programmability angle, and a scalability angle, by creating new methodology building off of a struct of arrays approach and transforming it into an Object Oriented Struct of Arrays approach. While the design pattern itself is applied to crowd simulation in our work, the application of crowd simulation exemplifies the variety of applications for which the design pattern can be used.
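The thesis above builds on a struct-of-arrays approach; the contrast below, a CUDA sketch with assumed field names, shows why that layout matters on the GPU: with one array per field, adjacent threads touch adjacent memory and loads coalesce. The thesis' Object Oriented Struct of Arrays additionally wraps this layout behind an object-style interface, which is not reproduced here.

```cuda
// Array-of-structs vs. struct-of-arrays agent storage (illustrative only).
struct AgentAoS { float x, y, vx, vy; };   // one object per agent (AoS)

struct AgentsSoA {                          // one array per field (SoA)
    float *x, *y, *vx, *vy;
};

__global__ void stepSoA(AgentsSoA a, int n, float dt)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    a.x[i] += a.vx[i] * dt;   // neighbouring threads read neighbouring
    a.y[i] += a.vy[i] * dt;   // addresses, so global loads coalesce
}
```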
- GPU Based Methods for Interactive Information Visualization of Big DataMi, Peng (Virginia Tech, 2016-01-19)Interactive visual analysis has been a key component of gaining insights in information visualization area. However, the amount of data has increased exponentially in the past few years. Existing information visualization techniques lack scalability to deal with big data, such as graphs with millions of nodes, or millions of multidimensional data records. Recently, the remarkable development of Graphics Processing Unit (GPU) makes GPU useful for general-purpose computation. Researchers have proposed GPU based solutions for visualizing big data in graphics and scientific visualization areas. However, GPU based big data solutions in information visualization area are not well investigated. In this thesis, I concentrate on the visualization of big data in information visualization area. More specifically, I focus on visual exploration of large graphs and multidimensional datasets based on the GPU technology. My work demonstrates that GPU based methods are useful for sensemaking of big data in information visualization area.
- GPU-based Streaming for Parallel Level of Detail on Massive Model Rendering. Peng, Chao; Cao, Yong (Department of Computer Science, Virginia Polytechnic Institute & State University, 2011). Rendering massive 3D models in real-time has long been recognized as a very challenging problem because of the limited computational power and memory space available in a workstation. Most existing rendering techniques, especially level of detail (LOD) processing, suffer from their sequential execution nature and do not scale well with the size of the models. We present a GPU-based progressive mesh simplification approach which enables the interactive rendering of large 3D models with hundreds of millions of triangles. Our work contributes to massive model rendering research in two ways. First, we develop a novel data structure to represent the progressive LOD mesh, and design a parallel mesh simplification algorithm for the GPU architecture. Second, we propose a GPU-based streaming approach which adopts a frame-to-frame coherence scheme in order to minimize the high communication cost between the CPU and GPU. Our results show that the parallel mesh simplification algorithm and GPU-based streaming approach significantly improve the overall rendering performance.
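As a toy illustration of GPU-parallel LOD selection for the report above, the kernel below assigns each object a triangle budget from its distance to the camera, one thread per object. It is a sketch only; the paper's progressive-mesh data structure, simplification algorithm and CPU-GPU streaming scheme are far more involved, and the names and the inverse-distance heuristic are assumptions.

```cuda
// Illustrative per-object LOD selection: closer objects get more triangles.
__global__ void selectLod(const float* distance, int* triangleCount,
                          int n, int maxTriangles)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float d = fmaxf(distance[i], 1.0f);
    // crude inverse-distance budget, clamped to the full-resolution count
    triangleCount[i] = min(maxTriangles, (int)(maxTriangles / d));
}
```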
- Immersive Virtual Reality and 3D Interaction for Volume Data AnalysisLaha, Bireswar (Virginia Tech, 2014-09-04)This dissertation provides empirical evidence for the effects of the fidelity of VR system components, and novel 3D interaction techniques for analyzing volume datasets. It provides domain-independent results based on an abstract task taxonomy for visual analysis of scientific datasets. Scientific data generated through various modalities e.g. computed tomography (CT), magnetic resonance imaging (MRI), etc. are in 3D spatial or volumetric format. Scientists from various domains e.g., geophysics, medical biology, etc. use visualizations to analyze data. This dissertation seeks to improve effectiveness of scientific visualizations. Traditional volume data analysis is performed on desktop computers with mouse and keyboard interfaces. Previous research and anecdotal experiences indicate improvements in volume data analysis in systems with very high fidelity of display and interaction (e.g., CAVE) over desktop environments. However, prior results are not generalizable beyond specific hardware platforms, or specific scientific domains and do not look into the effectiveness of 3D interaction techniques. We ran three controlled experiments to study the effects of a few components of VR system fidelity (field of regard, stereo and head tracking) on volume data analysis. We used volume data from paleontology, medical biology and biomechanics. Our results indicate that different components of system fidelity have different effects on the analysis of volume visualizations. One of our experiments provides evidence for validating the concept of Mixed Reality (MR) simulation. Our approach of controlled experimentation with MR simulation provides a methodology to generalize the effects of immersive virtual reality (VR) beyond individual systems. To generalize our (and other researchers') findings across disparate domains, we developed and evaluated a taxonomy of visual analysis tasks with volume visualizations. We report our empirical results tied to this taxonomy. We developed the Volume Cracker (VC) technique for improving the effectiveness of volume visualizations. This is a free-hand gesture-based novel 3D interaction (3DI) technique. We describe the design decisions in the development of the Volume Cracker (with a list of usability criteria), and provide the results from an evaluation study. Based on the results, we further demonstrate the design of a bare-hand version of the VC with the Leap Motion controller device. Our evaluations of the VC show the benefits of using 3DI over standard 2DI techniques. This body of work provides the building blocks for a three-way many-many-many mapping between the sets of VR system fidelity components, interaction techniques and visual analysis tasks with volume visualizations. Such a comprehensive mapping can inform the design of next-generation VR systems to improve the effectiveness of scientific data analysis.
- Increasing Selection Accuracy and Speed through Progressive RefinementBacim de Araujo e Silva, Felipe (Virginia Tech, 2015-07-21)Although many selection techniques have been proposed and developed over the years, selection by pointing is perhaps the most popular approach for selection. In 3D interfaces, the laser-pointer metaphor is commonly used, since users only have to point to their target from a distance. However, the task of selecting objects that have a small visible area or that are in highly cluttered environments is hard when using pointing techniques. With both indirect and direct pointing techniques in 3D interfaces, smaller targets require higher levels of pointing precision from the user. In addition, issues such as target occlusion as well as hand and tracker jitter negatively affect user performance. Therefore, requiring the user to perform selection in a single precise step may result in users spending more time to select targets so that they can be more accurate (effect known as the speed-accuracy trade-off). We describe an approach to address this issue, called Progressive Refinement. Instead of performing a single precise selection, users gradually reduce the set of selectable objects to reduce the required precision of the task. This approach, however, has an inherent trade-off when compared to immediate selection techniques. Progressive refinement requires a gradual process of selection, often using multiple steps, although each step can be fast, accurate, and nearly effortless. Immediate techniques, on the other hand, involve a single-step selection that requires effort and may be slower and more error-prone. Therefore, the goal of this work was to explore this trade-off. The research includes the design and evaluation of progressive refinement techniques for 3D interfaces, using both pointing- and gesture-based interfaces for single-object selection and volume selection. Our technique designs and other existing selection techniques that can be classified as progressive refinement were used to create a design space. We designed eight progressive refinement techniques and compared them to the most commonly used techniques (for a baseline comparison) and to other state-of-the-art selection techniques in a total of four empirical studies. Based on the results of the studies, we developed a set of design guidelines that will help other researchers design and use progressive refinement techniques.
- Interactive Graph Layout of a Million NodesMi, Peng; Sun, Maoyuan; Masiane, Moeti; Cao, Yong; North, Christopher L. (MDPI, 2016-12-20)Sensemaking of large graphs, specifically those with millions of nodes, is a crucial task in many fields. Automatic graph layout algorithms, augmented with real-time human-in-the-loop interaction, can potentially support sensemaking of large graphs. However, designing interactive algorithms to achieve this is challenging. In this paper, we tackle the scalability problem of interactive layout of large graphs, and contribute a new GPU-based force-directed layout algorithm that exploits graph topology. This algorithm can interactively layout graphs with millions of nodes, and support real-time interaction to explore alternative graph layouts. Users can directly manipulate the layout of vertices in a force-directed fashion. The complexity of traditional repulsive force computation is reduced by approximating calculations based on the hierarchical structure of multi-level clustered graphs. We evaluate the algorithm performance, and demonstrate human-in-the-loop layout in two sensemaking case studies. Moreover, we summarize lessons learned for designing interactive large graph layout algorithms on the GPU.
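For context on the repulsive-force computation the paper above accelerates, the kernel below is the brute-force O(n²) pass of a Fruchterman-Reingold style layout, one thread per vertex. The paper's contribution is precisely to avoid this loop by approximating repulsion over a multi-level cluster hierarchy; that approximation and the interaction machinery are not shown here, and all names are assumptions.

```cuda
// Illustrative brute-force repulsion pass of a force-directed layout.
// k2 is the squared ideal edge length; the paper replaces this O(n^2) loop
// with a hierarchical approximation to reach millions of nodes interactively.
__global__ void repulsion(const float2* pos, float2* force, int n, float k2)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float2 f = make_float2(0.0f, 0.0f);
    for (int j = 0; j < n; ++j) {
        if (j == i) continue;
        float dx = pos[i].x - pos[j].x;
        float dy = pos[i].y - pos[j].y;
        float d2 = dx * dx + dy * dy + 1e-6f;   // avoid division by zero
        float s  = k2 / d2;                      // Fruchterman-Reingold style
        f.x += dx * s;
        f.y += dy * s;
    }
    force[i] = f;
}
```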