Development and implementation of a data parsing protocol for companion animal cancer data
Journal article   Open access   Peer reviewed

Chiara Palmieri, Matt Taylor, Mike Rickerby, Peter Bennett, Mieghan Bruce, Mark Krockenberger, Philippa McLaren, Thelma Meiring, Kerrie Mengersen, Gabriele Rossi, …
Veterinary pathology, p.3009858251413572
2026
PMID: 41574631
CC BY-NC V4.0 Open Access

Abstract

Keywords: cancer, cancer registry, cat, data, database, dog
Companion animal cancer diagnostic reports are text-based documents containing essential information on tumor classification and diagnosis. Establishing an animal cancer registry requires extracting structured data from diverse report formats and integrating it across multiple providers. This study presents an object-oriented programming approach to standardize and automate cancer data collection for canine and feline patients, enabling the creation of the Australian Companion Animal Registry of Cancers (ACARCinom), Australia’s first national registry of cat and dog cancers. The approach was implemented in C# and tested on sample data from 6 data providers. The initial programming phase focused on designing a parser that identified report sections using regular expressions based on standardized headings. The text was then cleaned to remove unnecessary formatting and HTML tags. Data dictionaries containing preferred terms and synonyms were used to extract key information such as diagnosis, topography, grade, and metastasis, so that variant wordings map to a single consistent term. A coordinate map recording the position of each extracted term was generated to analyze spatial relationships within the report, allowing prioritization of diagnoses. The system also logged parsing decisions and potential issues for expert review. Markup using HTML tags enabled clear visualization of parsed content within the original reports. Extracted data and patient metadata were stored in an intermediary database table, allowing veterinary pathology experts to review and refine entries before final import. This automated solution streamlines data extraction and standardization from diverse sources, enabling efficient analysis of cancer records and enhancing research and surveillance capacity in veterinary oncology.
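
The parsing steps described in the abstract (heading-based section detection, HTML cleanup, dictionary lookup, and a coordinate map of term positions) can be illustrated with a minimal sketch. This is not the authors' implementation, which was written in C#; it is a Python approximation. The heading names, dictionary entries, sample report, and the proximity heuristic in `nearest` are all hypothetical assumptions made for illustration.

```python
import re
from html import unescape

# Hypothetical section headings; real provider reports will differ.
HEADING_RE = re.compile(
    r"^[ \t]*(HISTORY|GROSS DESCRIPTION|MICROSCOPIC DESCRIPTION|DIAGNOSIS|COMMENT)"
    r"[ \t]*:?[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)
TAG_RE = re.compile(r"<[^>]+>")

# Hypothetical data dictionaries: preferred term -> list of synonyms.
DIAGNOSIS_TERMS = {
    "mast cell tumor": ["mast cell tumour", "mastocytoma"],
    "lymphoma": ["lymphosarcoma"],
}
TOPOGRAPHY_TERMS = {
    "skin": ["cutaneous", "dermis"],
    "lymph node": ["lymph nodes"],
}


def clean_text(raw):
    """Strip HTML tags, decode entities, collapse runs of spaces and tabs."""
    text = unescape(TAG_RE.sub(" ", raw))
    return re.sub(r"[ \t]+", " ", text)


def split_sections(text):
    """Return {HEADING: body}, using heading lines as section delimiters."""
    matches = list(HEADING_RE.finditer(text))
    sections = {}
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[m.group(1).upper()] = text[m.end():end].strip()
    return sections


def find_terms(body, dictionary, kind):
    """Coordinate map: sorted (start, end, kind, preferred term) per dictionary hit."""
    hits = []
    for preferred, synonyms in dictionary.items():
        for variant in [preferred, *synonyms]:
            pattern = r"\b" + re.escape(variant) + r"\b"
            for m in re.finditer(pattern, body, re.IGNORECASE):
                hits.append((m.start(), m.end(), kind, preferred))
    return sorted(hits)


def nearest(hit, candidates):
    """Candidate whose start offset is closest to the hit (assumed heuristic)."""
    return min(candidates, key=lambda c: abs(c[0] - hit[0]), default=None)


# Hypothetical report fragment with HTML residue.
report = (
    "<p><b>HISTORY:</b></p>\nMass on left flank.\n"
    "<p><b>DIAGNOSIS:</b></p>\nCutaneous mastocytoma, grade 2.\n"
)
sections = split_sections(clean_text(report))
diagnoses = find_terms(sections["DIAGNOSIS"], DIAGNOSIS_TERMS, "diagnosis")
sites = find_terms(sections["DIAGNOSIS"], TOPOGRAPHY_TERMS, "topography")
for d in diagnoses:
    print(d[3], "->", nearest(d, sites))
```

Recording character offsets rather than only the matched terms is what makes spatial reasoning possible: a diagnosis can be paired with the closest topography term, and the same offsets could later be used to wrap matches in HTML markup for expert review.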
