Evaluation of a text-mining application for the rapid analysis of free-text wildlife necropsy reports
Journal article   Open access   Peer reviewed

Stefan Saverimuttu, Kate McInnes, Kristin Warren, Lian Yeap, Stuart Hunter, Brett Gartrell, An Pas, James Chatterton and Bethany Jackson
PloS one, Vol.20(11), e0337720
2025
Published (Version of Record), CC BY 4.0, Open Access

Abstract

The ability to efficiently derive insights from wildlife necropsy data is essential for advancing conservation and One Health objectives, yet close reading remains the mainstay of knowledge retrieval from ubiquitous free-text clinical data. This time-consuming process poses a barrier to the efficient utilisation of such valuable resources. This study evaluates part of a bespoke text-mining application, DEE (Describe, Explore, Examine), designed for extracting insights from free-text necropsy reports housed in Aotearoa New Zealand’s Wildbase Pathology Register. In a pilot test, nine veterinary professionals used DEE to quantify the occurrence of four clinicopathologic findings (external oiling, trauma, diphtheritic stomatitis, and starvation) across two species datasets, and the results were compared with manual review. Performance metrics (recall, precision, and F1-score) were calculated and analysed alongside tester-driven misclassification patterns. Findings reveal that while DEE (and the principles underlying its function) offers time-efficient data retrieval, its performance is influenced by search term selection and by the breadth of vocabulary that may describe a clinicopathologic finding. Findings characterized by limited terminological variance, such as external oiling, yielded the highest performance scores and the most consistency across application testers. Mean F1-scores across all tested findings and application testers ranged from 0.63 to 0.93. The results highlight the utility and limitations of term-based text-mining approaches and suggest that enhancements to automatically capture this terminological variance may be necessary for broader implementation. This pilot study highlights the potential of relatively simple, rule-based text-mining approaches to derive insights from natural-language wildlife data in support of One Health goals.
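
The comparison described in the abstract, a term-based search scored against manual review, can be illustrated with a minimal sketch. This is not DEE's implementation: the function names, search terms, and the four toy reports below are all invented for illustration, and the only assumption taken from the abstract is that retrieval is scored with recall, precision, and F1-score against a manually reviewed reference set.

```python
import re


def term_search(reports, terms):
    """Return IDs of reports whose text contains any search term (case-insensitive).

    Hypothetical helper: plain substring matching, not DEE's actual logic.
    """
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return {rid for rid, text in reports.items() if pattern.search(text)}


def precision_recall_f1(retrieved, relevant):
    """Score a retrieved set of report IDs against a manually reviewed set."""
    tp = len(retrieved & relevant)   # found by search and confirmed by review
    fp = len(retrieved - relevant)   # found by search but not confirmed
    fn = len(relevant - retrieved)   # confirmed by review but missed by search
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented toy reports; they do not come from the Wildbase Pathology Register.
reports = {
    "R1": "Plumage heavily oiled over the ventral surface.",
    "R2": "Fractured left humerus consistent with trauma.",
    "R3": "Feathers contaminated with oil; bird emaciated.",
    "R4": "Stomach contains oily fish remains; no external contamination.",
}
manual_review = {"R1", "R3"}  # reports a reviewer would label as external oiling

retrieved = term_search(reports, ["oiled", "oil"])
precision, recall, f1 = precision_recall_f1(retrieved, manual_review)
print(sorted(retrieved), round(precision, 2), round(recall, 2), round(f1, 2))
```

In this toy case the broad term "oil" also matches "oily" in R4, so recall is perfect while precision drops. That is one way search term selection can shape performance, as the abstract notes.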

Details

UN Sustainable Development Goals (SDGs)

This output has contributed to the advancement of the following goals:

#15 Life on Land

InCites Highlights

These are selected metrics from InCites Benchmarking & Analytics tool, related to this output

Collaboration types: Domestic collaboration; International collaboration
Citation topics: 1 Clinical & Life Sciences; 1.228 Virology - Tropical Diseases; 1.228.994 Viral Hemorrhagic Fevers
Web of Science research areas: Veterinary Sciences
ESI research areas: Plant & Animal Science