The Internal Revenue Service (IRS) publishes a large volume of data on tax-exempt organizations in the form of IRS Form 990 tax filings in Extensible Markup Language (XML) format, hosted on its website and on Amazon Web Services (AWS). These sources contain filings from tax year 2012 through tax year 2020, the most recent year filed and uploaded, which defines the project’s study window as 2012-2020. The primary goal of this project is to create a database of Form 990 filings to support research on tourism offices and other tax-exempt organizations. The primary challenge is to process filings from every year in the study window and load them into the database in a unified manner. The database is developed with Jupyter Notebooks, SQLite, and various Python libraries for scraping, preprocessing, and analysis. Because of the number of different return types and the sheer amount of data contained in the forms, the forms are very difficult to understand in their standard format. In addition, most documentation about Form 990 is written for accountants or tax experts who are well versed in financial jargon. The same problem extends to the XML data files: many of the XML tags are heavily abbreviated, and cross-referencing each one with its corresponding location on Form 990 is a tedious, nearly impossible task. The project addresses these problems by archiving the data while keeping it accessible for use.
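
As a rough illustration of what loading filings from different years "in a unified manner" could look like, the sketch below parses two heavily simplified XML filings with Python's standard library and inserts them into one SQLite table. It is not the project's actual code. The sample documents, the `filings` table, the `FIELD_TAGS` mapping, and the helper names are all hypothetical, and the tag names (`TaxYear` versus `TaxYr`, `ReturnType` versus `ReturnTypeCd`) are assumptions about how tags may differ between schema versions, so they should be checked against the real files.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified filings; real Form 990 XML files are far larger.
SAMPLE_2012 = """<Return xmlns="http://www.irs.gov/efile">
  <ReturnHeader>
    <TaxYear>2012</TaxYear>
    <ReturnType>990</ReturnType>
    <Filer><EIN>123456789</EIN></Filer>
  </ReturnHeader>
</Return>"""

SAMPLE_2020 = """<Return xmlns="http://www.irs.gov/efile">
  <ReturnHeader>
    <TaxYr>2020</TaxYr>
    <ReturnTypeCd>990EZ</ReturnTypeCd>
    <Filer><EIN>987654321</EIN></Filer>
  </ReturnHeader>
</Return>"""

# Assumed mapping: unified column name -> tag names that may carry the value
# in different schema versions. Extend it as more variants are discovered.
FIELD_TAGS = {
    "ein": ("EIN",),
    "tax_year": ("TaxYear", "TaxYr"),
    "return_type": ("ReturnType", "ReturnTypeCd"),
}


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_filing(xml_text):
    """Return one dict per filing, keyed by the unified column names."""
    root = ET.fromstring(xml_text)
    found = {}
    for elem in root.iter():
        # Keep the first occurrence of each tag name.
        found.setdefault(local_name(elem.tag), (elem.text or "").strip())
    return {
        column: next((found[t] for t in tags if t in found), None)
        for column, tags in FIELD_TAGS.items()
    }


def load_filings(conn, xml_texts):
    """Insert every filing into a single table, whatever its schema year."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS filings "
        "(ein TEXT, tax_year INTEGER, return_type TEXT)"
    )
    rows = [extract_filing(text) for text in xml_texts]
    conn.executemany(
        "INSERT INTO filings VALUES (:ein, :tax_year, :return_type)", rows
    )
    conn.commit()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load_filings(connection, [SAMPLE_2012, SAMPLE_2020])
    for row in connection.execute("SELECT * FROM filings ORDER BY tax_year"):
        print(row)
```

Keeping the year-specific tag variants in one mapping means a newly discovered abbreviation only requires a change to `FIELD_TAGS`, not to the parsing or loading logic.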