All are invited to attend the Dissertation Defense in the Department of Computer Science. 

Student: Kehinde Ajayi
Date and Time: Wed, Oct 29, 2025 at 12:30 PM
Location: ECSB 1st floor Auditorium

Committee Chair: Dr. Jian Wu
Committee Members:
Dr. Michael Nelson
Dr. Michele Weigle
Dr. Sampath Jayarathna
Dr. Yi He (College of William & Mary)

Title: SCITEUQ: TOWARD UNCERTAINTY-AWARE COMPLEX SCIENTIFIC TABLE DATA EXTRACTION AND UNDERSTANDING

Abstract: Scientific tables report critical research insights, data, and findings that drive scientific progress. Because PDF is the de facto standard format for publishing scientific papers, there is a growing need for automatic methods to extract data from PDF files. A significant fraction of scientific tables exhibit complex structure and content, making it challenging for machine learning tools to accurately extract their content directly from PDF files. Despite advances in Table Structure Recognition (TSR), automated extraction of data from complex scientific tables remains a challenge due to variations in table structure and content. In this dissertation, we developed SCITEUQ, a software framework that addresses these challenges by enabling automated, accurate, and uncertainty-aware extraction of data from complex scientific tables across multiple disciplines. By integrating TSR with Optical Character Recognition and Uncertainty Quantification (UQ), SCITEUQ aims to significantly improve the quality of data extraction while significantly reducing the human workload required to verify extracted data. We also developed SciTableQA, a benchmark for evaluating the question-answering (QA) and reasoning capabilities of Large Language Models on complex scientific tables. This research advances the field of information extraction for complex scientific tables and could benefit scientific data compilation in a wide range of scientific domains.