Reinventing the necropsy review; a case study of data extraction optimisation from penguin records in Aotearoa New Zealand
Doctoral Thesis   Open access

Stefan D Saverimuttu
Doctor of Philosophy (PhD), Murdoch University
2024

Abstract

Subjects: Animal health surveillance--New Zealand; Information storage and retrieval systems--Animal health; Penguins--Autopsy--New Zealand
Wildlife health data can be a powerful contributor to conservation outcomes, disease surveillance, and the broader objectives of One Health – an aim to achieve optimal health for people, animals, and the environment. Collections of wildlife necropsy reports are a common source of such data. However, the nature of these data creates a resource barrier between acquisition and analysis, resulting in sporadic extraction to inform such outcomes. To evaluate challenges and opportunities provided by wildlife health data, necropsy records for all the culturally, economically, and ecologically significant Sphenisciformes within the Wildbase Pathology Register of Aotearoa New Zealand were extracted, validated, and analysed. This manual process highlighted the dominance of a threatened and arguably cryptic species in the database (hoiho or yellow-eyed penguin, Megadyptes antipodes), and showed that infectious/inflammatory diagnoses were the most frequently encountered across all reviewed reports (35.7%, 523/1463). The free-text nature of many fields complicated analysis through high rates of typographical variance requiring manual resolution. The manual review shed light on threats to Sphenisciformes in Aotearoa, but also highlighted the time cost of knowledge extraction. Using these insights, an application was developed to facilitate time-sensitive knowledge extraction, using text-mining and a dashboard approach to analyse and display information akin to the manual necropsy review. Simple algorithms were used to derive categorical fields such as species and sex, while more complex query-based algorithms were used to quantify subjective elements, such as the prevalence of specific clinicopathologic findings. To evaluate the performance of this application, nine professionals in wildlife health were recruited to a pilot study to quantify the occurrence of four clinicopathological findings across two species datasets extracted from the Wildbase Pathology Register.
Results from the testers were compared to the manual review (a “gold standard”) to determine the proportion of false negative and false positive records returned by each tester across the four clinicopathological findings that they had been assigned to find. Mean F1-scores, which indicate the level of agreement between the tester and the manual review, ranged from 0.63 to 0.93. Agreement was affected by both the tester and the clinicopathologic finding being examined. The majority of misclassification (false positive or false negative records) was attributed to inappropriate search term selection and differences in interpretation of records. Further, the linguistically simple clinicopathologic findings (e.g., ‘oiled’) performed more consistently across users and in greater agreement with the manual review when compared to the other findings tested. The pilot testing affirmed the value of the application, but also highlighted the potential for individual users, and for clinicopathological findings with high rates of synonymy (e.g., “Starvation”), to impact performance. Overall, the development and testing of this application demonstrates the under-recognised value of automated methods in the extraction of knowledge from painstakingly acquired wildlife health data. This utility may be improved through the implementation of more sophisticated semantic-based techniques for data extraction, as compared to the relatively simple term-based approaches utilised here. With early warning a recognised precursor to the success of any intervention, whether for conservation or infectious disease purposes, approaches that fast-track evidence-based adaptive management are a global priority for wildlife health.
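
As an illustration only, the comparison described above (a tester's term-based search scored against the manual review) might be sketched as follows. This is not the thesis application: the function names, the example records, and the search terms are all hypothetical, and the real system may match and score records differently. The F1-score here is the standard harmonic mean of precision and recall.

```python
import re


def search_records(records, terms):
    """Return IDs of records whose free text contains any search term.

    Hypothetical helper: a plain case-insensitive substring match,
    standing in for a simple term-based query.
    """
    if not terms:
        return set()
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return {rid for rid, text in records.items() if pattern.search(text)}


def f1_score(returned, gold):
    """F1-score of a tester's returned record IDs against the gold standard."""
    true_pos = len(returned & gold)    # found by both tester and manual review
    false_pos = len(returned - gold)   # returned by tester only
    false_neg = len(gold - returned)   # missed by tester
    precision = true_pos / (true_pos + false_pos) if returned else 0.0
    recall = true_pos / (true_pos + false_neg) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented example records (not from the Wildbase Pathology Register).
records = {
    1: "Bird found oiled on beach, feathers matted.",
    2: "Emaciated, no fat reserves; consistent with starvation.",
    3: "Poor body condition, empty stomach.",
    4: "Trauma to left flipper.",
}
gold_starvation = {2, 3}  # records the manual review classed as starvation

narrow = search_records(records, ["starvation"])
broad = search_records(records, ["starvation", "poor body condition"])

print(f1_score(narrow, gold_starvation))  # single term misses record 3
print(f1_score(broad, gold_starvation))   # synonym recovers it
```

In this toy case the single search term misses a record described only by a synonym, which lowers recall and therefore the F1-score. This mirrors the abstract's observation that findings with high rates of synonymy are sensitive to search term selection.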
