Browsing by Author "Wu, Jian"
- Automatic Metadata Extraction Incorporating Visual Features from Scanned Electronic Theses and Dissertations
  Choudhury, Muntabir; Jayanetti, Himarsha R.; Wu, Jian; Ingram, William A.; Fox, Edward (IEEE, 2021-09-27)
  Electronic Theses and Dissertations (ETDs) contain domain knowledge that can be used for many digital library tasks, such as analyzing citation networks and predicting research trends. Automatic metadata extraction is important to build scalable digital library search engines. Most existing methods, such as GROBID, CERMINE, and ParsCit, are designed for born-digital documents, so they often fail to extract metadata from scanned documents such as ETDs. Traditional sequence tagging methods mainly rely on text-based features. In this paper, we propose a conditional random field (CRF) model that combines text-based and visual features. To verify the robustness of our model, we extended an existing corpus and created a new ground truth corpus consisting of 500 ETD cover pages with human-validated metadata. Our experiments show that CRF with visual features outperformed both a heuristic baseline and a CRF model with only text-based features. The proposed model achieved 81.3%-96% F1 measure on seven metadata fields. The data and source code are publicly available on Google Drive and a GitHub repository.
- Building A Large Collection of Multi-domain Electronic Theses and Dissertations
  Uddin, Sami; Banerjee, Bipasha; Wu, Jian; Ingram, William A.; Fox, Edward A. (IEEE, 2021-12-15)
  In this work, we report our progress on building a collection containing over 450k Electronic Theses and Dissertations (ETDs), including full-text and metadata. Our goal is to close the gap of accessibility between long text and short text documents, and to create a new research opportunity for the scholarly community. For that, we developed an ETD Ingestion Framework (EIF) that automatically harvests metadata and PDFs of ETDs from university libraries. We faced multiple challenges and learned many lessons during the process, which led to proposed solutions to overcome or mitigate the limitations of the current data. We also describe the data that we have collected. We hope our methods will be useful for building similar collections from university libraries and that the data can be used for research and education.
- Decision Making Tools for Optimizing Environmental Sampling Plans for Listeria in Poultry Processing Plants
  Al Wahaimed, Abdullah Saud (Virginia Tech, 2022-07-08)
  Meat and poultry slaughtering and processing practices have been associated with microbial contamination by Listeria spp. Ready-to-eat poultry products have been considered a primary agent associated with Listeria monocytogenes illness outbreaks. Developing environmental monitoring programs (EMPs) that are based on product and/or process risk level analysis is a useful approach to reduce contamination in poultry processing plants and enhance food safety. Sampling criteria based on product risk levels and process control in ready-to-eat poultry processing facilities were developed to allow users to design and conduct appropriate sampling plans to target Listeria spp. After developing the criteria, an internet-based environmental monitoring program ("EZSafety") was developed to allow poultry producers to enhance their sample collection and analysis of test results over time and conduct appropriate sampling plans for Listeria spp. and other microbiological indicators. The frontend of the program website was built using React Native (an open-source JavaScript library for building user interfaces). The backend of the program website was built using Node.js, which executes JavaScript code outside a web browser. MongoDB was used as a document-oriented database for the website. The program was evaluated by 20 food safety professionals to assess its ability to develop appropriate sampling plans to target Listeria spp. The majority of these participants believed that EZSafety has several tools that are effective for targeting Listeria spp. and other indicators and enhancing environmental monitoring. Additionally, most participants agreed that EZSafety is organized and user-friendly. EMPs can play a significant role in improving the detection rate and the prevention of Listeria spp. and other indicators in poultry processing plants.
- Evaluating Pediococcus acidilactici and Enterococcus faecium NRRL B-2354 as Surrogates for Salmonella enterica on Low Temperature Saturated Steam Pasteurized Cashews and Macadamia Nuts
  Saunders, Thomas P.; Wu, Jian; Williams, Robert C.; Huang, Haibo; Ponder, Monica A. (2023-02)
  Low temperature saturated steam pasteurization reduces numbers of Salmonella enterica on macadamia nuts and cashews. Lethality of pasteurization must be verified, necessitating identification of non-pathogenic surrogates. Whole cashews and macadamia nuts were co-inoculated with S. enterica and one of three potential surrogates (Enterococcus faecium, Pediococcus acidilactici, and Staphylococcus carnosus), then dried to the original water activity (aw) of around 0.44 to 0.51. Nuts were packaged in woven polypropylene bags and commercially processed using saturated steam at 85 ± 5°C. Bacteria were enumerated by plating onto TSA with an overlay of XLT-4 for Salmonella, or on media selective for the potential surrogates. Mean reduction (log CFU·g⁻¹) of Salmonella (6.0 ± 0.14) was significantly larger than that of E. faecium (4.3 ± 0.12) or P. acidilactici (3.7 ± 0.14) on cashews. Mean reduction of Salmonella (5.9 ± 0.18) was significantly larger than that of P. acidilactici (4.4 ± 0.18) on macadamia nuts, but instances occurred where E. faecium reduction on macadamia nuts exceeded that of Salmonella. S. carnosus reduction exceeded that of Salmonella, and it is therefore not an appropriate surrogate. Based on these studies, caution should be applied when choosing a surrogate for different types of nuts, as acceptability of E. faecium as a surrogate for Salmonella varied between cashews and macadamia nuts.
- Evaluation of Listeria monocytogenes and Staphylococcus aureus Survival and Growth during Cooling of Hams Cured with Natural-Source Nitrite
  Wu, Jian; Acuff, Jennifer C.; Waterman, Kim M.; Ponder, Monica A. (International Association of Food Protection, 2021-02-01)
  Growing consumer demand for clean-label “natural” products has encouraged more meat processors to cure meat products with natural sources of nitrate or nitrite such as celery juice powder. One challenge for these producers is to identify safe cooling rates in products cured with celery juice powder where extended cooling could allow growth of pathogens. The Food Safety and Inspection Service of the U.S. Department of Agriculture recently added guidelines for stabilization of meat products cured using naturally occurring nitrites based on control of Clostridium spp. However, a knowledge gap exists for safe cooling rates that prevent the growth of Listeria monocytogenes and Staphylococcus aureus, potential postlethality contaminants, in naturally cured ham. The study was conducted to investigate the temperature profiles of naturally cured hams of typical sizes during refrigerator cooling and to determine the behavior of S. aureus and L. monocytogenes on ham during these cooling periods. Whole hams (14 lb [6,300 g]), half hams (6 lb [2,700 g]), and quarter hams (3 lb [1,400 g]) were slowly cooked in a smokehouse until internal temperatures reached a minimum of 140°F (60°C) and then were immediately transferred into a walk-in cooler (38°F [3.3°C]). Cooling times for hams of all sizes were within the requirements for cured products but not for uncured products. Worst-case scenarios of postprocessing surface contamination were simulated by inoculating small naturally cured ham samples with S. aureus or L. monocytogenes. These inoculated hams were then cooled under controlled conditions of 130 to 45°F (54.4 to 7.2°C) for 720 to 900 min. By the end of cooling, small decreases (0.5 to 0.6 log CFU/g) were found for each inoculum. These findings may help small ham processors evaluating production and quality control methods to determine whether recommended concentrations of natural curing agents used to prevent growth of clostridial pathogens may also prevent growth of other pathogens during meat cooling.
- Improving Access to ETD Elements Through Chapter Categorization and Summarization
  Banerjee, Bipasha (Virginia Tech, 2024-08-07)
  The field of natural language processing and information retrieval has made remarkable progress since the 1980s. However, most of the theoretical investigation and applied experimentation is focused on short documents like web pages, journal articles, or papers in conference proceedings. Electronic Theses and Dissertations (ETDs) contain a wealth of information. These book-length documents describe research conducted in a variety of academic disciplines. While current digital library systems can be directly used to find a document of interest, they do not also facilitate discovering what specific parts or segments are of particular interest. This research aims to improve access to ETD components by providing users with chapter-level classification labels and summaries to help easily find portions of interest. We explore the challenges such documents pose, especially when dealing with a highly specialized academic vocabulary. We use large language models (LLMs) and fine-tune pre-trained models for these downstream tasks. We also develop a method to connect the ETD discipline and the department information to an ETD-centric classification system. To help guide the summarization model to create better chapter summaries, for each chapter, we try to identify relevant sentences of the document abstract, plus the titles of cited references from the bibliography. We leverage human feedback that helps us evaluate models qualitatively on top of using traditional metrics. We provide users with chapter classification labels and summaries to improve access to ETD chapters. We generate the top three classification labels for each chapter that reflect the interdisciplinarity of the work in ETDs. Our evaluation proves that our ensemble methods yield summaries that are preferred by users. Our summaries also perform better than summaries generated by using a single method when evaluated on several metrics using an LLM-based evaluation methodology.
- Inactivation of Salmonella enterica and Surrogate Enterococcus faecium on Whole Black Peppercorns and Cumin Seeds Using Vacuum Steam Pasteurization
  Newkirk, Jordan J.; Wu, Jian; Acuff, Jennifer C.; Caver, Chris B.; Mallikarjunan, Kumar; Wiersema, Brian D.; Williams, Robert C.; Ponder, Monica A. (Frontiers, 2018-08-07)
  Spices, including black pepper and cumin seeds, have been implicated in outbreaks of salmonellosis and prompted recalls of ready-to-eat products containing contaminated spices. Vacuum-assisted steam pasteurization is performed to improve the safety and quality of many low water activity products; however, process parameters associated with inactivation on whole spices are not well described. The objective of this study was to determine the effectiveness of a lab-scale vacuum-assisted steam process for the inactivation of Salmonella enterica and its potential surrogate Enterococcus faecium ATCC 8459 inoculated onto the surface of whole peppercorns and cumin seeds. In addition, the effect of two inoculation preparation methods [growth on tryptic soy agar (TSA) or inclusion within a native microbiota biofilm] on the reduction of S. enterica serovars or E. faecium was compared on steam pasteurized whole black peppercorns. Spices were processed using steam under a vacuum to achieve a mean product temperature of 86.7 ± 2.8°C for different dwell times. Salmonella inoculated using the TSA-grown method required 83 and 70 s, respectively, to achieve a 5-log reduction on peppercorns and cumin seeds. Longer time periods were needed to achieve a 5-log reduction of Salmonella when it was present in a native biofilm on whole peppercorns. Survivor estimations were best predicted by the Weibull models. The mean log reductions of E. faecium were 0.9 log CFU/g lower than Salmonella on whole black peppercorns inoculated using the TSA-grown cells (P = 0.0021). The mean log reductions of Salmonella and E. faecium prepared using the biofilm-inclusion method were not significantly different (P = 0.76). E. faecium log CFU/g reductions were not significantly different compared to Salmonella on whole cumin seeds (P = 0.42), indicating that while reductions are comparable, the surrogate may not always provide a conservative indication of complete Salmonella elimination for all spices processed using vacuum-assisted steam.
- Inhibiting foodborne pathogens Vibrio parahaemolyticus and Listeria monocytogenes using extracts from traditional medicine: Chinese gallnut, pomegranate peel, Baikal skullcap root and forsythia fruit
  Wu, Jian; Goodrich, Katheryn M.; Eifert, Joseph D.; Jahncke, Michael L.; O'Keefe, Sean F.; Welbaum, Gregory E.; Neilson, Andrew P. (De Gruyter, 2018-06-21)
  Foodborne illnesses have been a heavy burden in the United States and globally. Many medicinal herbs have been cultivated in the US, and many of them contain antimicrobial compounds with the potential to be used for food preservation. Methanol/water extracts of pomegranate peel (“PP”, Punica granatum L.), Chinese gallnut (“CG”, Galla chinensis), Forsythia fruit (“FF”, Forsythia suspensa) and Baikal skullcap root (“BS”, Scutellaria baicalensis) were tested for antimicrobial activity using the agar diffusion assay on tryptic soy agar (TSA) and microdilution assay in tryptic soy broth (TSB). CG and PP extracts showed good to excellent inhibitory effect against Vibrio parahaemolyticus and Listeria monocytogenes in both assays, with a minimum inhibitory concentration (MIC) range from 0.04 to 5 mg/mL. BS had moderate inhibitory effects against V. parahaemolyticus with an MIC of 5 mg/mL in TSB, and against L. monocytogenes with an MIC of 20 mg/mL on TSA. CG was analyzed using LC-MS and fractionated using HPLC. The major components were identified as gallic acid, digallic acid, methyl gallate, and gallotannins (oligo-galloyl-D-glucose, nGG, n = 1-10). Six fractions (I - VI) were collected and their antibacterial activities were tested against L. monocytogenes and V. parahaemolyticus both on TSA and in TSB. On TSA, fractions III, IV and V inhibited V. parahaemolyticus but no fraction inhibited L. monocytogenes. In TSB, all fractions inhibited V. parahaemolyticus and fractions II - V inhibited L. monocytogenes. Future studies are needed to investigate the effects of medicinal plants on food products.
- Inhibiting Listeria monocytogenes, Vibrio parahaemolyticus and Morganella morganii with Aqueous Methanol Extracts of Punica granatum and Galla chinensis
  Wu, Jian (Virginia Tech, 2014-12-08)
  Listeria monocytogenes, Vibrio parahaemolyticus and Morganella morganii are closely related to foodborne illnesses caused by the consumption of seafood and ready-to-eat (RTE) food. Traditional Chinese medicines (TCM) have been widely studied as complementary and alternative medicines, and many of them have been verified to have antimicrobial properties. The purpose of this research was to study antimicrobial effects of plant extracts as potential preservatives in seafood products and to identify the primary antimicrobial compounds in plant extracts. Four plants, Pomegranate peel (PP, Punica granatum L.), Chinese gallnut (CG, Galla chinensis), forsythia fruit (FS, Forsythia suspensa) and Baikal skullcap root (BS, Scutellaria baicalensis), were each ground and extracted with 70% methanol. The extracts were diluted and tested for antimicrobial activities on V. parahaemolyticus, L. monocytogenes and M. morganii both in agar diffusion assay using tryptic soy agar (TSA), and in microdilution assay using tryptic soy broth (TSB). Both CG and PP extracts, with concentrations no lower than 1 mg/ml, significantly inhibited both V. parahaemolyticus and L. monocytogenes (P<0.01) and reduced the bacterial population by up to 4 logs. No significant inhibition was observed with FS and BS extracts, except for BS at 5 mg/ml on V. parahaemolyticus. None of the extracts showed significant inhibition against M. morganii. The antibacterial activities of CG and PP 70% methanol extracts were tested in ground raw tuna and cooked tail-on shrimp. The extracts were mixed in tuna with final concentration at 1.7 mg/ml, and applied as soaking treatments (5 mg/ml) for shrimp. Both CG and PP extracts inhibited V. parahaemolyticus on both food matrices while only CG significantly inhibited L. monocytogenes. The 70% methanol crude extract of CG was analyzed by HPLC and LC-MS. Oligo-galloyl-O-glucose (nGG, n=1-10) are the major compounds in CG. The crude CG extract was fractionated using HPLC and the fractions were collected based on elution time and tested for their antimicrobial activities against V. parahaemolyticus and L. monocytogenes using agar diffusion methods. The fractions containing 3GG-8GG were the most active antimicrobials on both bacteria.
- Maximizing Equitable Reach and Accessibility of ETDs
  Ingram, William A.; Wu, Jian; Fox, Edward A. (ACM, 2023)
  This poster addresses accessibility issues of electronic theses and dissertations (ETDs) in digital libraries (DLs). ETDs are available primarily as PDF files, which present barriers to equitable access, especially for users with visual impairments, cognitive or learning disabilities, or for anyone needing more efficient and effective ways of finding relevant information within these long documents. We propose using AI techniques, including natural language processing (NLP), computer vision, and text analysis, to convert PDFs into machine-readable HTML documents with semantic tags and structure, extracting figures and tables, and generating summaries and keywords. Our goal is to increase the accessibility of ETDs and to make this important scholarship available to a wider audience.
- MetaEnhance: Metadata Quality Improvement for Electronic Theses and Dissertations of University Libraries
  Choudhury, Muntabir Hasan; Salsabil, Lamia; Jayanetti, Himarsha R.; Wu, Jian; Ingram, William A.; Fox, Edward A. (ACM, 2023)
  Metadata quality is crucial for discovering digital objects through digital library (DL) interfaces. However, for various reasons, the metadata of digital objects often exhibits incomplete, inconsistent, and incorrect values. We investigate methods to automatically detect, correct, and canonicalize scholarly metadata, using seven key fields of electronic theses and dissertations (ETDs) as a case study. We propose MetaEnhance, a framework that utilizes state-of-the-art artificial intelligence (AI) methods to improve the quality of these fields. To evaluate MetaEnhance, we compiled a metadata quality evaluation benchmark containing 500 ETDs, by combining subsets sampled using multiple criteria. We evaluated MetaEnhance against this benchmark and found that the proposed methods achieved nearly perfect F1-scores in detecting errors and F1-scores ranging from 0.85 to 1.00 for correcting five of seven key metadata fields. The code and data are publicly available on GitHub at https://github.com/lamps-lab/ETDMiner/tree/master/metadata-correction.
- Performance of cost-effective PET packaging with light protective additives to limit photo-oxidation in UHT milk under refrigerated LED-lighted storage condition
  Wang, Aili; Stancik, Cheryl M.; Yin, Yun; Wu, Jian; Duncan, Susan E. (Elsevier, 2022-03-01)
  Protection efficiency of polyethylene terephthalate (PET) packaging comprising light protective additives (LPA) was investigated for ultra-high temperature (UHT) 2% milk over 26 weeks in refrigerated LED-lighted retail case storage. For the first time, synergistic efficacy of titanium dioxide (TiO2) and carbon black (CB) pigments in PET packaging in protecting milk nutrients and flavor was evaluated in this study. Combination of TiO2 (6.8 wt%) and CB (23 ppm) pigments in PET packaging (S-PET) protected milk quality as effectively as light-proof packaging up to 13 weeks of storage, as demonstrated by improved riboflavin retention, lower TBARS value, and reduced production of volatiles associated with off-flavor, including aldehyde compounds and dimethyl disulfide. Photolysis of riboflavin in milk packaged with S-PET packaging was limited to 7.0% within 13 weeks of storage under exposure to high-intensity LED light. The study demonstrated that selection of appropriate LPA combined with PET packaging provided a cost-effective solution for light protection of 2% milkfat UHT milk stored in refrigerated LED-lighted retail conditions for up to 26 weeks.
- Segmenting Electronic Theses and Dissertations By Chapters
  Manzoor, Javaid Akbar (Virginia Tech, 2023-01-18)
- A Study of Computational Reproducibility using URLs Linking to Open Access Datasets and Software
  Salsabil, Lamia; Wu, Jian; Choudhury, Muntabir; Ingram, William A.; Fox, Edward A.; Rajtmajer, Sarah; Giles, C. Lee (ACM, 2022-04-25)
  Datasets and software packages are considered important resources that can be used for replicating computational experiments. With the advocacy of Open Science and the growing interest in investigating reproducibility of scientific claims, including URLs linking to publicly available datasets and software packages has become an institutionalized part of research publications. In this preliminary study, we investigated the disciplinary dependency and chronological trends of including open access datasets and software (OADS) in electronic theses and dissertations (ETDs), based on a hybrid classifier called OADSClassifier, consisting of a heuristic and a supervised learning model. The classifier achieves the best F1 of 0.92. We found that the inclusion of OADS-URLs exhibited a strong disciplinary dependence and the fraction of ETDs containing OADS-URLs has been gradually increasing over the past 20 years. We developed and share a ground truth corpus consisting of 500 manually labeled sentences containing URLs from scientific papers. The dataset and source code are available at https://github.com/lamps-lab/oadsclassifier.
- Who can submit an excellent review for this manuscript in the next 30 days? - Peer Reviewing in the age of overload
  Alhoori, Hamed; Fox, Edward A.; Frommholz, Ingo; Liu, Haiming; Coupette, Corinna; Rieck, Bastian A.; Ghosal, Tirthankar; Wu, Jian (ACM, 2023)
  With millions of research articles published yearly, the peer review process is in danger of collapsing, especially in 'hot' areas with popular conferences. Challenges arise from the large number of manuscripts submitted, skyrocketing use of preprint archives and institutional repositories, problems regarding the identification and availability of experts, conflicts of interest, and bias in reviewing. Such issues can affect the integrity of the reviewing process as well as the timeliness, quality, credibility, and reproducibility of research articles. Several solutions and systems have been suggested, but none work well, and neither authors nor editors are happy with how long it takes to complete reviewing the submitted research. This panel addresses these challenges and potential solutions, including digital libraries that recommend reviewers, and broader issues such as opportunities for identifying peer reviewers for scholarly journals by engaging doctoral students and postdocs, as well as those who recently completed their Ph.D.