Theses and Dissertations

Date of Award

12-1-2025

Document Type

Thesis

Degree Name

Master of Science in Engineering (MSE)

Department

Computer Science

First Advisor

Yifeng Gao

Second Advisor

Marzieh Ayati

Third Advisor

Timothy Wylie

Abstract

This thesis presents an automated and interpretable pipeline that links natural-language weather narratives with local meteorological sensor time series. Using large language models, NOAA-style event reports are transformed into structured records capturing event type, timing, descriptive context, and uncertainty. Each extracted event is aligned with harmonized temperature, precipitation, and wind measurements from nearby weather stations, enabling systematic comparisons between narrative evidence and observed atmospheric conditions.
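The alignment step described above can be illustrated with a minimal sketch. It is not the thesis's implementation. The `EventRecord` fields, the assumption that each event carries a latitude and longitude, the six-hour padding, and the helper names `nearest_station` and `window` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass
class EventRecord:
    """Hypothetical structured record extracted from one narrative."""
    event_type: str
    start: datetime
    end: datetime
    lat: float          # assumed: event location is available
    lon: float
    uncertainty: str    # free-text note, e.g. "time approximate"


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def nearest_station(event, stations):
    """Return (station_id, distance_km) for the station closest to the event.

    `stations` maps station_id -> (lat, lon).
    """
    return min(
        ((sid, haversine_km(event.lat, event.lon, lat, lon))
         for sid, (lat, lon) in stations.items()),
        key=lambda pair: pair[1],
    )


def window(series, event, pad=timedelta(hours=6)):
    """Keep (timestamp, value) readings inside the event interval plus padding."""
    lo, hi = event.start - pad, event.end + pad
    return [(t, v) for t, v in series if lo <= t <= hi]
```

Returning the distance together with the station identifier keeps the event-to-station separation available for later analysis, which matters here because the abstract attributes most mismatch to spatial separation.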

Across roughly fifty stations and more than two thousand events, the analyses show that discrepancies between narrative descriptions and sensor behavior arise primarily from spatial separation rather than from temporal offsets, sensor preprocessing artifacts, or limitations in the extraction workflow. Rule-based diagnostics, narrative-derived distance estimates, geographic visualization, misalignment stratification, and a controlled temporal-shift experiment collectively support this conclusion. Repeating the analyses on the updated “smoother-boundary” dataset further reduced edge noise while preserving the same spatial trends.
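One way to picture a temporal-shift check is sketched below. The abstract does not specify the experiment's design, so the scoring rule (mean sensor value inside the window) and the names `window_mean` and `shift_scores` are assumptions made for illustration only. The idea is that if moving the event window earlier or later does not raise the score, a timing offset is unlikely to explain the mismatch.

```python
from datetime import datetime, timedelta


def window_mean(series, start, end):
    """Mean of readings with start <= timestamp <= end, or None if none fall inside."""
    values = [v for t, v in series if start <= t <= end]
    return sum(values) / len(values) if values else None


def shift_scores(series, start, end, shifts_hours):
    """Map each shift (in hours) to the window mean after moving the window by that shift."""
    scores = {}
    for h in shifts_hours:
        delta = timedelta(hours=h)
        scores[h] = window_mean(series, start + delta, end + delta)
    return scores
```

A score that peaks at a nonzero shift would point toward a timing offset, while a flat or zero-centred profile would be consistent with the spatial explanation reported above.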

This work provides a reproducible multimodal framework for linking text and time-series data, an empirical characterization of narrative–sensor mismatch across U.S. stations, and a methodological foundation for future multimodal benchmarks, event-alignment studies, and weather-focused retrieval or forecasting systems.

Comments

Copyright 2025 Juan Luis Garza. All Rights Reserved. https://proquest.com/docview/3292628012
