GoDataX Community Members

For now, this section features real-world data projects and insightful articles from the founder. In the future, it will become a platform for publishing contributions from members.

Modeling techniques and environment for multidimensional analysis

Scientific article presented as a partial requirement for obtaining the Executive MBA certificate in Business Intelligence and Data Warehouse from the lato sensu postgraduate course at SJT College.

Abstract

Veterinary clinical data extracted from medical records is usually stored in relational databases. Treatment progress and anamneses are recorded as free text, which makes it difficult to analyze sets of similar cases. This study proposes a data model for multidimensional analysis of medical records and veterinary research, so that cross-referenced, consolidated analytical information can be used to track how treatments evolve. Analyzing the evolution of diseases and treatments in animals matters because it makes informed decisions during veterinary consultations easier. The model will allow veterinarians to examine precedent cases (a clinical "jurisprudence") involving animals of the same size, breed, geographic location, and other relevant characteristics. Users will also have access to real cases that can serve as a basis for future studies.

Keywords:

Data warehouse, unstructured data, veterinary medical records

Definitions of Terms

  • Business Intelligence: Refers to the process of collecting, organizing, analyzing, sharing, and monitoring information that supports business management.
  • Data Warehouse: A computing system used to store consolidated information about an organization's activities in a database. Its structure favors reporting, analyzing large data volumes, and obtaining strategic insights to aid decision-making.
  • Data Mining: The process of exploring large datasets to find consistent patterns, such as association rules or temporal sequences, to detect systematic relationships between variables.
  • Potential Limitations: The study does not consider control software responsible for data inclusion and extraction in the project.
  • OLAP Tool: OLAP (Online Analytical Processing) is software technology that allows business analysts, managers, and executives to analyze corporate data interactively and dynamically, enabling multidimensional data exploration.
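The association rules mentioned under Data Mining are usually scored with two measures, support and confidence. The sketch below is only an illustration: the case data and finding names are hypothetical and do not come from the study.

```python
# Hypothetical clinical cases: each case is the set of findings recorded for it.
cases = [
    {"cough", "lethargy", "heartworm_positive"},
    {"cough", "heartworm_positive"},
    {"cough", "lethargy"},
    {"skin_lesion", "weight_loss"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also have the consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)

# Rule under test: {cough} -> {heartworm_positive}
rule_support = support({"cough", "heartworm_positive"}, cases)
rule_confidence = confidence({"cough"}, {"heartworm_positive"}, cases)
print(rule_support, rule_confidence)
```

Here two of the four cases contain both findings, and two of the three cases with a cough are also heartworm positive.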

Introduction

Veterinary students, professors, pet owners, and veterinarians currently lack a centralized, self-updating knowledge base that can be consulted in real time. This study proposes a multidimensional model that supports consolidated analytical information and statistical data on the success and failure of treatments and disease management.

Objective

This study details the multidimensional model of the "PetScience" project, which seeks to create a framework for multidimensional analysis of medical records and veterinary research. The model will provide researchers, students, and veterinary professionals with consolidated analytical information to support their studies and the advancement of veterinary medicine. The study describes the modeling techniques needed to build the project's first version, including the storage of textual data so that qualitative information can be transformed for analysis.

Problem Statement

Information on animal diagnoses is currently hard to access, which makes it difficult to study many diseases. No software currently offers real-time strategic analysis of this information, which makes "PetScience" a vital solution for advancing the veterinary sector.

Theoretical Study and Research

To understand the needs of veterinarians and the challenges in controlling specific diseases, theoretical research was conducted to develop an analytical model capable of tracking infected animals' behaviors. The study focused on diseases such as dirofilariasis, filariasis, leishmaniasis, and polyarthritis. After reviewing relevant literature, an interview was conducted with Professor Argemiro Sanavria, a national and international reference in veterinary research.

Interview with Professor Argemiro Sanavria

Dr. Argemiro Sanavria holds a degree in Veterinary Medicine from the Federal University of Santa Maria (UFSM-RS), a Master’s in Rural Extension from UFSM-RS, a PhD in Veterinary Sciences from the Federal Rural University of Rio de Janeiro (UFRRJ), and a Postdoctoral degree from the National Agrarian University of Havana (UNAH) in Cuba. During the interview, he provided insights into the studies reviewed, the challenges in isolating vector-borne diseases, and the importance of environmental control in preventing infections.

Solution

The proposed solution involves creating a data analysis environment that integrates structured and textual data, using taxonomy facets for clinical case analysis. The system will enable online storage of analytical data, allowing veterinarians to conduct data mining and access reports and dashboards, facilitating real-time analysis of cases based on factors such as species, breed, geographic location, and medical history.
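One way to picture how taxonomy facets could turn textual records into analyzable attributes is a simple keyword lookup. This is a minimal sketch, not the project's method: the taxonomy contents, the `tag_note` helper, and the sample note are all hypothetical.

```python
import re

# Hypothetical taxonomy: facet -> canonical term -> keywords found in notes.
TAXONOMY = {
    "symptom": {
        "cough": ["cough", "coughing"],
        "lethargy": ["lethargy", "lethargic"],
    },
    "disease": {
        "dirofilariasis": ["dirofilariasis", "heartworm"],
        "leishmaniasis": ["leishmaniasis"],
    },
}

def tag_note(text):
    """Return {facet: sorted canonical terms} found in a free-text note."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    tags = {}
    for facet, terms in TAXONOMY.items():
        found = sorted(
            term for term, keywords in terms.items() if words & set(keywords)
        )
        if found:
            tags[facet] = found
    return tags

note = "Dog is lethargic and coughing; heartworm antigen test positive."
tags = tag_note(note)
print(tags)
```

A real system would need a curated veterinary vocabulary and more robust text processing; the point is only that each note ends up with structured facet values that can be joined to the other case attributes.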

Importance

This project emphasizes the importance of building a "jurisprudence" of veterinary cases, that is, a body of precedents from both successful and unsuccessful treatments. The system will give veterinarians real-time access to these precedents, enabling informed decisions based on breed, treatment history, and geographic factors. Veterinary students will also gain access to real cases that can serve as a foundation for their studies, contributing to the continuous evolution of veterinary medicine.

Multidimensional Model

The proposed analytical multidimensional model (Veterinary Clinical Data Warehouses - DWC) will enable the construction of facets (Multifaceted Analysis of Veterinary Medical Records). It will allow for the correlation of analytical information through perspectives and dimensions, integrating diagnoses, tests, and treatment outcomes.
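Correlating outcomes through different perspectives amounts to grouping the same case records by different combinations of dimensions. The following sketch assumes a flat list of already-structured records; the field names, values, and the `roll_up` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical case records, one per treatment.
records = [
    {"breed": "Labrador", "location": "Rio de Janeiro",
     "diagnosis": "dirofilariasis", "outcome": "success"},
    {"breed": "Labrador", "location": "Rio de Janeiro",
     "diagnosis": "dirofilariasis", "outcome": "failure"},
    {"breed": "Poodle", "location": "Sao Paulo",
     "diagnosis": "leishmaniasis", "outcome": "success"},
    {"breed": "Labrador", "location": "Sao Paulo",
     "diagnosis": "dirofilariasis", "outcome": "success"},
]

def roll_up(rows, dimensions):
    """Group rows by the chosen dimensions; return {key: (cases, success_rate)}."""
    totals = defaultdict(lambda: [0, 0])  # key -> [cases, successes]
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key][0] += 1
        totals[key][1] += row["outcome"] == "success"
    return {key: (n, ok / n) for key, (n, ok) in totals.items()}

by_breed = roll_up(records, ["breed"])
by_breed_location = roll_up(records, ["breed", "location"])
print(by_breed)
print(by_breed_location)
```

Changing the `dimensions` argument is the in-memory equivalent of switching perspective in an OLAP tool: the same facts are re-aggregated along a different set of axes.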


Final Considerations

This study identified the most critical dimensions for developing a multidimensional model for veterinary medical record analysis: coat, size, tests, breed, location, time, and medications. The model will improve disease identification, particularly for dirofilariasis, filariasis, leishmaniasis, and polyarthritis, by incorporating analytical storage techniques and OLAP tools.
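The dimensions listed above could be arranged as a star schema around a single fact table. The sketch below uses Python's built-in `sqlite3` module; the table names, columns, the `success` measure, and the sample rows are assumptions made for illustration, not the schema of the study.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One small table per dimension named in the text; the columns are assumptions.
DIMENSIONS = ["coat", "size", "test", "breed", "location", "time", "medication"]
for dim in DIMENSIONS:
    conn.execute(
        f"CREATE TABLE dim_{dim} ({dim}_id INTEGER PRIMARY KEY, label TEXT NOT NULL)"
    )

# Hypothetical fact table: one row per treatment, with a success flag as measure.
fk_columns = ", ".join(
    f"{dim}_id INTEGER REFERENCES dim_{dim}({dim}_id)" for dim in DIMENSIONS
)
conn.execute(
    f"CREATE TABLE fact_treatment "
    f"(treatment_id INTEGER PRIMARY KEY, {fk_columns}, success INTEGER NOT NULL)"
)

# Made-up dimension members; ids are assigned 1, 2, ... in insertion order.
labels = {
    "coat": ["short"], "size": ["large", "small"], "test": ["antigen test"],
    "breed": ["Labrador", "Poodle"], "location": ["Rio de Janeiro"],
    "time": ["2015-03"], "medication": ["drug A"],
}
for dim, values in labels.items():
    conn.executemany(
        f"INSERT INTO dim_{dim} (label) VALUES (?)", [(v,) for v in values]
    )

facts = [
    # coat, size, test, breed, location, time, medication, success
    (1, 1, 1, 1, 1, 1, 1, 1),
    (1, 1, 1, 1, 1, 1, 1, 0),
    (1, 2, 1, 2, 1, 1, 1, 1),
]
columns = ", ".join(f"{dim}_id" for dim in DIMENSIONS)
conn.executemany(
    f"INSERT INTO fact_treatment ({columns}, success) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    facts,
)

# Success rate per breed: join the fact table to one dimension and aggregate.
rows = conn.execute("""
    SELECT b.label, COUNT(*) AS cases, AVG(f.success) AS success_rate
    FROM fact_treatment AS f
    JOIN dim_breed AS b ON b.breed_id = f.breed_id
    GROUP BY b.label
    ORDER BY b.label
""").fetchall()
print(rows)
```

Joining further dimension tables in the same query would narrow the analysis to, for example, a given location or period.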

Extending Terminology Definitions - Multidimensional Modeling Techniques

  • Snowflaking: Organizing low-cardinality attributes into separate linked tables to optimize storage while potentially complicating the data model.
  • Role-Playing Dimensions: Dimensions used in multiple roles, such as a "Time Dimension" serving as both "Order Date" and "Delivery Date."
  • Causal Dimensions: Factors influencing events, such as promotions, weather conditions, or marketing campaigns.
  • Junk Dimensions: A single dimension that groups miscellaneous flags and textual attributes that do not fit coherently into any other dimension.
  • Audit Dimension: A special dimension that tracks metadata such as data lineage, reliability, and transformation history.
  • Degenerate Dimensions: Transactional document identifiers stored directly in fact tables without additional attributes.
  • Slowly Changing Dimensions: Dimensions that evolve over time, such as changes in product descriptions or customer profiles.
  • Many-to-Many Dimensions: Linking fact tables with multiple dimension values via bridge tables to manage complex relationships.
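Two of the techniques above, role-playing dimensions and many-to-many bridge tables, can be sketched with Python's built-in `sqlite3` module. The example adapts them to a veterinary setting as an assumption: the admission and discharge roles, the table names, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, iso_date TEXT NOT NULL);
    CREATE TABLE dim_medication (medication_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Role-playing: the same date dimension is referenced once per role.
    CREATE TABLE fact_treatment (
        treatment_id      INTEGER PRIMARY KEY,
        admission_date_id INTEGER REFERENCES dim_date(date_id),
        discharge_date_id INTEGER REFERENCES dim_date(date_id)
    );

    -- Bridge table: one treatment can involve several medications.
    CREATE TABLE bridge_treatment_medication (
        treatment_id  INTEGER REFERENCES fact_treatment(treatment_id),
        medication_id INTEGER REFERENCES dim_medication(medication_id),
        PRIMARY KEY (treatment_id, medication_id)
    );

    INSERT INTO dim_date VALUES (1, '2015-03-01'), (2, '2015-03-10');
    INSERT INTO dim_medication VALUES (1, 'drug A'), (2, 'drug B');
    INSERT INTO fact_treatment VALUES (100, 1, 2);
    INSERT INTO bridge_treatment_medication VALUES (100, 1), (100, 2);
""")

# The date dimension is joined twice under different aliases, one per role.
dates = conn.execute("""
    SELECT adm.iso_date, dis.iso_date
    FROM fact_treatment AS f
    JOIN dim_date AS adm ON adm.date_id = f.admission_date_id
    JOIN dim_date AS dis ON dis.date_id = f.discharge_date_id
    WHERE f.treatment_id = 100
""").fetchone()

# The bridge table resolves the many-to-many link to medications.
meds = [name for (name,) in conn.execute("""
    SELECT m.name
    FROM bridge_treatment_medication AS b
    JOIN dim_medication AS m ON m.medication_id = b.medication_id
    WHERE b.treatment_id = 100
    ORDER BY m.name
""")]
print(dates, meds)
```

Only one physical date table exists; each role is just a separate foreign key and join alias, which keeps date attributes consistent across roles.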

References

  • REBECA BACCHI, 2015.
  • BRUM, W. M.; PEREIRA, M. A. V. C.; VITA, G. F.; FERREIRA, I.; MELLO, E. R.; AURNHEIMER, R. C. M.; SANAVRIA, Argemiro; PADUA, E. D. Parasitismo em aves silvestres residentes e migratórias da Ilha da Marambaia, Estado do Rio de Janeiro. Pesquisa Veterinária Brasileira (Online), v. 36, p. 1101-1108, 2016.
  • LARSSON, M.H.M.A. Prevalência de microfilárias e Dirofilaria immitis em cães do Estado de São Paulo. Braz. J. Vet. Res. Anim. Sci., São Paulo, 27(2): 183-186, 1990.
  • Leishmaniose Visceral Canina: Estudo imagiológico em cães naturalmente infectados, ficha catalográfica elaborada pela seção técnica de aquisição e tratamento da informação. Divisão de Biblioteca e Documentação – Campus de Botucatu - UNESP. Bibliotecária responsável: Rosemeire Aparecida Vicente - CRB 8/5651.
  • SANAVRIA, Argemiro; VITA, G. F.; THOME, S. M. G.; ANGELO, I. C.; PADUA, E. D.; GAIOTTE, D. G.; SANAVRIA, T. E. C.; HELENAELECTO, E.; NOGUEIRA, L. C. R.; SILVA, C. B. Intelligent monitoring of Aedes aegypti in a rural area of Rio de Janeiro State, Brazil. Revista do Instituto de Medicina Tropical de São Paulo, v. 59, p. 1-9, 2017.
  • ALEXANDRE REDSON, 2014.
  • SILVA, Rodrigo Costa da; LANGONI, Helio. Dirofilariose: Zoonose emergente negligenciada. Ciência Rural, Santa Maria, v. 39, n. 5, p. 1614-1623, ago. 2009. ISSN 0103-8478.
  • VITA, Gilmar F.; PEREIRA, Maria Angélica Vieira da Costa; FERREIRA, Ildemar; SANAVRIA, A.; AURNHEIMER, R. C. Status da Leishmaniose Tegumentar Americana no Estado do Rio de Janeiro. Revista do Instituto de Medicina Tropical de São Paulo, v. 58, p. 1-8, 2016.
  • VITA, Gilmar F.; FERREIRA, Ildemar; PEREIRA, Maria Angélica V. da Costa; SANAVRIA, Argemiro; AURNHEIMER, Rita de Cássia M.; BARBOSA, Celso G.; GALLO, Samira S. M.; VASCONCELLOS, Henrique V. G. Eficácia de Chenopodium ambrosioides (erva-de-santa-maria) no controle de endoparasitos de Coturnix japonica (codorna japonesa). Pesquisa Veterinária Brasileira (Online), v. 35, p. 424-430, 2015.