Browsing by Author "Lu, Tianjun"
- Bicycle and Pedestrian Traffic Monitoring and AADT Estimation in a Small Rural College Town
Lu, Tianjun (Virginia Tech, 2016-08-15)
Non-motorized (i.e., bicycle and pedestrian) traffic patterns are an understudied but important part of transportation systems. A key need for transportation planners is traffic monitoring programs similar to those for motorized traffic. Count campaigns can help estimate mode choice, measure infrastructure performance, track changes in volume, prioritize projects, analyze travel patterns (e.g., annual average daily traffic [AADT] and miles traveled [MT]), and conduct safety analysis (e.g., crash, injury and collision). However, unlike motorized traffic, non-motorized traffic has not been comprehensively monitored in communities throughout the U.S., and counting is generally performed in an ad hoc fashion. My thesis explores how to (1) best count bicycles and pedestrians on the entire transportation network, rather than focusing only on off-street trail systems or specific transportation corridors, and (2) estimate AADT of bicycles and pedestrians in a small college town (i.e., Blacksburg, VA). I used a previously developed count campaign in Blacksburg, VA to collect bicycle and pedestrian counts using existing monitoring technologies (e.g., pneumatic tubes, passive infrared, and RadioBeam). I then summarized those counts to (1) identify seasonal, daily, and hourly patterns of non-motorized traffic and (2) develop scaling factors (analogous to those used in motor vehicle count programs) derived from the continuous reference sites to estimate long-term averages (i.e., AADT) for short-duration count sites. I collected ~40,000 hours of bicycle and pedestrian counts from early September 2014 to January 2016. The count campaign included 4 continuous reference sites (~ full year-2015 counts) and 97 short-duration sites (≥ 1-week counts) that covered different road and trail types (i.e., major road, local road, and off-street trails).
I used 25 commercially available counters (i.e., 12 MetroCount MC 5600 Vehicle Classifier System [pneumatic tube counters], 10 Eco-counter 'Pyro' [passive infrared counters], and 3 Chambers RadioBeam Bicycle-People Counter [radiobeam counters]) to conduct the traffic count campaign. Three MetroCount, 4 Eco-counter, and 1 RadioBeam counter were installed at the 4 continuous reference sites; the remaining counters were rotated on a weekly basis at the short-duration count sites. I validated automated counts with field-based manual counts for all counters (210 total hours of validation counts). The validation counts were used to adjust automated counts for systematic counter errors (e.g., occlusion) by developing correction equations for each type of counter. All automated counters were well correlated with the manual counts (MetroCount R² [absolute error]: 0.90 [38%]; Eco-counter: 0.97 [24%]; RadioBeam bicycle: 0.92 [19%]; RadioBeam pedestrian: 0.92 [22%]). I compared three bicycle-based classification schemes provided by MetroCount (i.e., ARX Cycle, BOCO and Bicycle 15). Based on the validation counts, the BOCO (Boulder County, CO) classification scheme (hourly counts) had a similar R² using a polynomial correction equation (0.898) as compared to ARX Cycle (0.895) and Bicycle 15 (0.897). Using a linear fit, the slope was smallest for BOCO (1.26) as compared to ARX Cycle (1.29) and Bicycle 15 (1.31). Therefore, I used the BOCO classification scheme to adjust the automated hourly bicycle counts from MetroCount. To ensure a valid count dataset was used for further analysis, I applied quality assurance and quality control (QA/QC) protocols to the raw dataset. Overall, the continuous reference sites demonstrated good temporal coverage during the period the counters were deployed (bicycles: 96%; pedestrians: 87%) and for the calendar year-2015 (bicycles: 75%; pedestrians: 87%).
For short-duration sites, 98% and 94% of sites had at least 7 days of monitoring for bicycles and pedestrians, respectively; no site had 5 or fewer days of counts. I analyzed the traffic patterns and estimated AADT for all monitoring sites. I calculated average daily traffic, mode share, weekend to weekday ratio and hourly traffic curves to assess monthly, daily, and hourly patterns of bicycle and pedestrian traffic at the continuous reference sites. I then classified short-duration count sites into factor groups (i.e., commute [28%], recreation [11%], and mixed [61%]). Factor groups are normally used to match short-duration sites to continuous reference sites with the same traffic patterns so that the corresponding scaling factors can be applied. However, because of the small number (n=4) of continuous reference sites, the factor groups were only used as supplemental information in this analysis. To impute missing days at the 4 continuous reference sites and build a full year-2015 (i.e., 365 days) dataset, I built 8 site-specific negative binomial regression models (4 for bicycles and 4 for pedestrians) using temporal and weather variables (i.e., daily max temperature, daily temperature variation compared to the normal 30-year averages [1980-2010], precipitation, wind speed, weekend, and university in session). In general, the goodness-of-fit was better for the bicycle traffic models (validation R² = ~0.70) than for the pedestrian traffic models (validation R² = ~0.30). The selected variables were correlated with bicycle and pedestrian traffic, and cyclists were more sensitive to weather conditions than pedestrians. Adding model-generated estimates of missing days to the observed reference site counts allowed for calculating AADT for each continuous reference site (bicycle volumes ranged from 21 to 179; pedestrian volumes ranged from 98 to 4,232).
Since a full year-2015 dataset was not available at the short-duration sites, I developed day-of-year scaling factors from the 4 continuous reference sites to apply to the short-duration counts. The scaling factors were used to estimate site-specific AADT for each day of the short-duration count sites (~7 days of counts per location). I explored the spatial relationships among bicycle and pedestrian AADT, road and trail types, and bike facilities (i.e., bike lanes). The results indicated that bicycle AADT is significantly higher (p < 0.01) on roads with a bike lane (mean: 72) as compared to roads without (mean: 30); bicycle AADT is significantly higher (p < 0.01) on off-street trails (mean: 72) as compared to major roads (mean: 33). Pedestrian AADT is significantly higher (p < 0.01) on local roads (mean: 693) as compared to off-street trails (mean: 111); this finding is likely because most roads on the Virginia Tech campus are classified as local roads. In Chapter 5, I conclude with (1) recommendations for implementation (e.g., counter installation and data analysis), (2) key findings of bicycle and pedestrian traffic analysis in Blacksburg and (3) strengths, limitations, and directions for future research. This research has the potential to influence urban planning; for example, by offering guidance on developing routine non-motorized traffic monitoring, estimating bicycle and pedestrian AADT, prioritizing projects and measuring performance. However, this work could be expanded in several ways; for example, by deploying more continuous reference sites, exploring ways to monitor or estimate pedestrian volumes where no sidewalks exist and incorporating other spatial variables (e.g., land use variables) to study pedestrian volumes in future research.
The overarching goal of my research is to yield guidance for jurisdictions that seek to implement systematic bicycle and pedestrian monitoring campaigns and to support decision making that encourages healthy, safe, and harmonious communities.
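The day-of-year scaling step described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the thesis's actual procedure: it assumes each scaling factor is a reference site's daily count divided by that site's AADT, and that a short-duration site's AADT is the mean of its daily counts divided by the matching factors. The function names and all numbers are hypothetical.

```python
from statistics import mean


def day_of_year_factors(reference_daily_counts):
    """Return {day_of_year: daily count / AADT} for one reference site.

    Assumption: the reference series covers the full year, so its mean
    daily count is the site's AADT.
    """
    aadt = mean(reference_daily_counts.values())
    return {day: count / aadt for day, count in reference_daily_counts.items()}


def estimate_aadt(short_counts, factors):
    """Estimate AADT at a short-duration site.

    Each counted day gives one estimate (count / factor for that day);
    the site estimate is the mean of those daily estimates. Days with a
    missing or zero factor are skipped.
    """
    daily_estimates = [
        count / factors[day]
        for day, count in short_counts.items()
        if factors.get(day)
    ]
    return mean(daily_estimates)


# Hypothetical reference site with a 4-day "year" (AADT = 100).
reference = {1: 50, 2: 100, 3: 150, 4: 100}
factors = day_of_year_factors(reference)

# Hypothetical short-duration site counted on days 2 and 3 only.
short_site = {2: 40, 3: 60}
aadt_estimate = estimate_aadt(short_site, factors)
print(round(aadt_estimate, 1))
```

With several reference sites, one plausible extension is to average the factors across sites for each day before applying them; whether the thesis did so is not stated in the abstract.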
- Combining expert and crowd-sourced training data to map urban form and functions for the continental US
Demuzere, Matthias; Hankey, Steven C.; Mills, Gerald; Zhang, Wenwen; Lu, Tianjun; Bechtel, Benjamin (2020-08-11)
Although continental urban areas are relatively small, they are major drivers of environmental change at local, regional and global scales. Moreover, they are especially vulnerable to these changes owing to the concentration of population and their exposure to a range of hydro-meteorological hazards, emphasizing the need for spatially detailed information on urbanized landscapes. These data need to be consistent in content and scale and provide a holistic description of urban layouts to address different user needs. Here, we map the continental United States into Local Climate Zone (LCZ) types at a 100 m spatial resolution using expert and crowd-sourced information. There are 10 urban LCZ types, each associated with a set of relevant variables such that the map represents a valuable database of urban properties. These data are benchmarked against continental-wide existing and novel geographic databases on urban form. We anticipate the dataset provided here will be useful for researchers and practitioners to assess how the configuration, size, and shape of cities impact the important human and environmental outcomes.
- Land Use Regression models for 60 volatile organic compounds: Comparing Google Point of Interest (POI) and city permit data
Lu, Tianjun; Lansing, Jennifer; Zhang, Wenwen; Bechle, Matthew J.; Hankey, Steven C. (2019-08-10)
Land Use Regression (LUR) models of Volatile Organic Compounds (VOC) normally focus on land use (e.g., industrial area) or transportation facilities (e.g., roadway); here, we incorporate area sources (e.g., gas stations) from city permitting data and Google Point of Interest (POI) data to compare model performance. We used measurements from 50 community-based sampling locations (2013-2015) in Minneapolis, MN, USA to develop LUR models for 60 VOCs. We used three sets of independent variables: (1) base-case models with land use and transportation variables, (2) models that add area source variables from local business permit data, and (3) models that use Google POI data for area sources. The models with Google POI data performed best; for example, the total VOC (TVOC) model has better goodness-of-fit (adj-R²: 0.56; Root Mean Square Error [RMSE]: 0.32 µg/m³) as compared to the permit data model (0.42; 0.37) and the base-case model (0.26; 0.41). Area source variables were selected in over two thirds of models among the 60 VOCs at small-scale buffer sizes (e.g., 25 m-500 m). Our work suggests that VOC LUR models can be developed using community-based sampling and that models improve by including area sources as measured by business permit and Google POI data. © 2019 The Authors. Published by Elsevier B.V.
- New Opportunities in Crowd-Sourced Monitoring and Non-government Data Mining for Developing Urban Air Quality Models in the US
Lu, Tianjun (Virginia Tech, 2020-05-15)
Ambient air pollution is among the top 10 health risk factors in the US. With increasing concerns about adverse health effects of ambient air pollution among stakeholders including environmental scientists, health professionals, urban planners and community residents, improving air quality is a crucial goal for developing healthy communities. The US Environmental Protection Agency (EPA) aims to reduce air pollution by regulating emissions and continuously monitoring air pollution levels. Local communities also benefit from crowd-sourced monitoring to measure air pollution, particularly with the help of rapidly developing low-cost sampling technologies. The shift from relying only on government-based regulatory monitoring to crowd-sourced efforts has provided new opportunities for air quality data. In addition, the fast-growing data sciences (e.g., data mining) allow for leveraging open data from different sources to improve air pollution exposure assessment. My dissertation investigates how new data sources of air quality (e.g., community-based monitoring, low-cost sensor platforms) and model predictor variables (e.g., non-government open data) based on emerging modeling approaches (e.g., machine learning [ML]) could be used to improve air quality models (i.e., land use regression [LUR]) at local, regional, and national levels for refined exposure assessment. LUR models are commonly used for predicting air pollution concentrations at locations without monitoring data based on neighboring land use and geographic variables. I explore the use of crowd-sourced low-cost monitoring data, new/open datasets from government and non-government sponsored platforms, and emerging modeling techniques to develop LUR models in the US.
I focus on testing whether: (1) air quality data from community-based monitoring is feasible for developing LUR models, (2) air quality data from non-government crowd-sourced low-cost sensor platforms could supplement regulatory monitors for LUR development, and (3) new/open data extracted from non-government sponsored platforms could serve as alternative datasets to traditional predictor variable sources (e.g., land use and geographic features) in LUR models. In Chapter 3, I developed LUR models using community-based sampling (n = 50) for 60 volatile organic compounds (VOC) in the city of Minneapolis, US. I assessed whether adding area source-related features improves LUR model performance and compared model performance using variables featuring area sources from government vs. non-government sponsored platforms. I developed three sets of models: (1) base-case models with land use and transportation variables, (2) base-case models adding area source variables from local business permit data (government sponsored platform), and (3) base-case models adding Google point of interest (POI) data for area sources. Models with Google POI data performed the best; for example, the total VOC (TVOC) model had better goodness-of-fit (adj-R²: 0.56; Root Mean Square Error [RMSE]: 0.32 µg/m³) as compared to the permit data model (0.42; 0.37) and the base-case model (0.26; 0.41). This work suggests that VOC LUR models can be developed using community-based samples and adding Google POI could improve model performance as compared to using local business permit data. In Chapter 4, I evaluated a national LUR model using annual average PM2.5 concentrations from low-cost sensors (i.e., PurpleAir platform) in 6 US urban areas (n = 149) and tested the feasibility of using low-cost sensor data for developing LUR models. I compared LUR models using only the PurpleAir sensors vs. hybrid LUR models (combining both the EPA regulatory monitors and the PurpleAir sensors).
I found that the low-cost sensor network could serve as a promising alternative to fill the gaps of existing regulatory networks. For example, the national regulatory monitor-based LUR (i.e., CACES LUR developed as part of the Center for Air, Climate, and Energy Solutions) may fail to capture locations with high PM2.5 concentrations and the within-city spatial variability. Developing LUR models using the PurpleAir sensors was reasonable (PurpleAir sensors only: 10-fold CV R² = 0.66, MAE = 2.01 µg/m³; PurpleAir and regulatory monitors: R² = 0.85, MAE = 1.02 µg/m³). I also observed that incorporating PurpleAir sensor data into LUR models could help capture within-city variability and that areas of disagreement with the regulatory monitors merit further investigation. This work suggests that the use of crowd-sourced low-cost sensor networks for LUR models could potentially help exposure assessment and inform environmental and health policies, particularly for places (e.g., developing countries) where the regulatory monitoring network is limited. In Chapter 5, I developed national LUR models to predict annual average concentrations of 6 criteria pollutants (NO2, PM2.5, O3, CO, SO2 and PM10) in the US to compare models using new data (Google POI, Google Street View [GSV] and Local Climate Zone [LCZ]) vs. traditional geographic variables (e.g., road lengths, area of built land) based on different modeling approaches (partial least squares [PLS], stepwise regression and machine learning [ML] with and without kriging). Model performance was similar for both variable scenarios (e.g., random 10-fold CV R² of ML-kriging models for NO2, new vs. traditional: 0.89 vs. 0.91); adding the new variables to the traditional LUR models did not necessarily improve model performance. Models with kriging outperformed those without (e.g., CV R² for PM2.5 using the new variables, ML-kriging vs. ML: 0.83 vs. 0.67).
The importance of the new variables to LUR models highlights the potential of substituting traditional variables, thus enabling LUR models for areas with limited or no data (e.g., developing countries) and across cities. The dissertation presents the integration of new/open data from non-government sponsored platforms and crowd-sourced low-cost sensor networks in LUR models based on different modeling approaches for predicting ambient air pollution. The analyses provide evidence that using new data sources of both air quality and predictor variables could serve as promising strategies to improve LUR models for tracking exposures more accurately. The results could inform environmental scientists, health policy makers, as well as urban planners interested in promoting healthy communities.
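The 10-fold cross-validated R² and MAE figures quoted in the abstract above come from evaluating predictions on held-out monitors. The sketch below shows that evaluation pattern only; it is not the dissertation's model. It assumes a single hypothetical predictor fitted by ordinary least squares, whereas real LUR models select among many land use variables, and all data are synthetic.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def k_fold_cv(xs, ys, k=10, seed=0):
    """Return (R², MAE) computed from out-of-fold predictions."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]

    predictions = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        # Predict only the sites that were left out of this fit.
        for i in fold:
            predictions[i] = intercept + slope * xs[i]

    y_mean = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    mae = mean(abs(y - p) for y, p in zip(ys, predictions))
    return 1 - ss_res / ss_tot, mae


# Synthetic example: 50 hypothetical monitoring sites, one predictor
# (e.g., a road-length buffer variable) plus Gaussian noise.
rng = random.Random(1)
predictor = [float(i) for i in range(50)]
concentration = [2.0 + 0.1 * x + rng.gauss(0, 0.3) for x in predictor]

cv_r2, cv_mae = k_fold_cv(predictor, concentration, k=10)
print(f"10-fold CV R2 = {cv_r2:.2f}, MAE = {cv_mae:.2f}")
```

Because every prediction is made by a model that never saw that site, the resulting R² is typically lower than the in-sample fit, which is why the abstract reports CV statistics when comparing variable sets.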