COVID-19 Special Collection (Colección especial COVID-19)
This special collection gathers all kinds of materials related to COVID-19, or to coronaviruses in general, in which the UPV has taken part, such as research articles and reports or more outreach-oriented materials, as a contribution to a better and broader understanding of these diseases.
RDA recommendations and guidelines on data sharing for COVID-19
Browsing Colección especial COVID-19 by Sponsor "BANCO SANTANDER, S.A."
- Publication: Potential limitations in COVID-19 machine learning due to data source variability: A case study in the nCov2019 dataset (Oxford University Press, 2021-02). Sáez Silvestre, Carlos; Romero, Nekane; Conejero Casares, José Alberto; García Gómez, Juan Miguel; Dpto. de Física Aplicada; Instituto Universitario de Tecnologías de la Información y Comunicaciones; Dpto. de Matemática Aplicada; Instituto Universitario de Matemática Pura y Aplicada; Escuela Técnica Superior de Ingeniería Industrial; Escuela Técnica Superior de Ingeniería Informática; BANCO SANTANDER, S.A.; Universitat Politècnica de València. [EN] Objective: The lack of representative coronavirus disease 2019 (COVID-19) data is a bottleneck for reliable and generalizable machine learning. Data sharing is insufficient without data quality, in which source variability plays an important role. We showcase and discuss potential biases from data source variability for COVID-19 machine learning. Materials and Methods: We used the publicly available nCov2019 dataset, including patient-level data from several countries. We aimed at the discovery and classification of severity subgroups using symptoms and comorbidities. Results: Cases from the 2 countries with the highest prevalence were divided into separate subgroups with distinct severity manifestations. This variability can reduce the representativeness of training data with respect to the model target populations and increase model complexity at risk of overfitting. Conclusions: Data source variability is a potential contributor to bias in distributed research networks. We call for systematic assessment and reporting of data source variability and data quality in COVID-19 data sharing, as key information for reliable and generalizable machine learning.
- Publication: Subphenotyping of Mexican Patients With COVID-19 at Preadmission To Anticipate Severity Stratification: Age-Sex Unbiased Meta-Clustering Technique (JMIR Publications Inc., 2022-03). Zhou, Lexin; Romero-Garcia, Nekane; Martínez-Miranda, Juan; Conejero Casares, José Alberto; García Gómez, Juan Miguel; Sáez Silvestre, Carlos; Dpto. de Física Aplicada; Instituto Universitario de Tecnologías de la Información y Comunicaciones; Dpto. de Matemática Aplicada; Instituto Universitario de Matemática Pura y Aplicada; Escuela Técnica Superior de Ingeniería Industrial; Escuela Técnica Superior de Ingeniería Informática; BANCO SANTANDER, S.A.; Universitat Politècnica de València. [EN] Background: The COVID-19 pandemic has led to an unprecedented global health care challenge for both medical institutions and researchers. Recognizing different COVID-19 subphenotypes (the division of populations of patients into more meaningful subgroups driven by clinical features) and their severity characterization may assist clinicians during the clinical course, the vaccination process, research efforts, the surveillance system, and the allocation of limited resources. Objective: We aimed to discover age-sex unbiased COVID-19 patient subphenotypes based on easily available phenotypical data before admission, such as pre-existing comorbidities, lifestyle habits, and demographic features, to study the potential early severity stratification capabilities of the discovered subgroups through characterizing their severity patterns, including prognostic, intensive care unit (ICU), and morbimortality outcomes. Methods: We used the Mexican Government COVID-19 open data, including 778,692 SARS-CoV-2 population-based patient-level data as of September 2020.
We applied a meta-clustering technique that consists of a 2-stage clustering approach combining dimensionality reduction (ie, principal components analysis and multiple correspondence analysis) and hierarchical clustering using the Ward minimum variance method with Euclidean squared distance. Results: In the independent age-sex clustering analyses, 56 clusters supported 11 clinically distinguishable meta-clusters (MCs). MCs 1-3 showed high recovery rates (90.27%-95.22%), including healthy patients of all ages, children with comorbidities and priority in receiving medical resources (ie, higher rates of hospitalization, intubation, and ICU admission) compared with other adult subgroups that have similar conditions, and young obese smokers. MCs 4-5 showed moderate recovery rates (81.30%-82.81%), including patients with hypertension or diabetes of all ages and obese patients with pneumonia, hypertension, and diabetes. MCs 6-11 showed low recovery rates (53.96%-66.94%), including immunosuppressed patients with high comorbidity rates, patients with chronic kidney disease with a poor survival length and probability of recovery, older smokers with chronic obstructive pulmonary disease, older adults with severe diabetes and hypertension, and the oldest obese smokers with chronic obstructive pulmonary disease and mild cardiovascular disease. Group outcomes conformed to the recent literature on dedicated age-sex groups. Mexican states and several types of clinical institutions showed relevant heterogeneity regarding severity, potentially linked to socioeconomic or health inequalities. Conclusions: The proposed 2-stage cluster analysis methodology produced a discriminative characterization of the sample and explainability over age and sex. 
These results may help in understanding patients' clinical profiles and in stratifying them for automated early triage, both before further tests and laboratory results are available and in locations where additional tests are not available. They may also help decide resource allocation among vulnerable subgroups, for example when prioritizing vaccination or treatments.
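
The 2-stage approach described in the abstract above (dimensionality reduction followed by Ward hierarchical clustering within each group, then a second clustering over the resulting clusters) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `pca_reduce`, `ward_labels`, and `meta_cluster`, the default parameters, and the use of cluster centroids as the input to the second stage are all assumptions made for illustration, and only principal components analysis is shown (the multiple correspondence analysis step for categorical data is omitted).

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def pca_reduce(X, n_components):
    """Project the rows of X onto their first principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


def ward_labels(X, n_clusters):
    """Hierarchical clustering with Ward's minimum variance criterion."""
    Z = linkage(X, method="ward")  # SciPy's Ward linkage works on Euclidean observations
    return fcluster(Z, t=n_clusters, criterion="maxclust")


def meta_cluster(X, groups, n_components=2, k_per_group=2, n_meta=2):
    """Hypothetical 2-stage scheme: cluster each group, then cluster the centroids."""
    centroids, owners = [], []
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        # Stage 1: reduce and cluster the patients of one group (e.g. one age-sex group).
        labels = ward_labels(pca_reduce(X[idx], n_components), k_per_group)
        for c in np.unique(labels):
            members = idx[labels == c]
            centroids.append(X[members].mean(axis=0))  # centroid in the original feature space
            owners.append(members)
    # Stage 2: cluster the stage-1 centroids into meta-clusters.
    meta = ward_labels(np.vstack(centroids), n_meta)
    out = np.empty(len(X), dtype=int)
    for members, m in zip(owners, meta):
        out[members] = m
    return out
```

Calling `meta_cluster(X, groups)` with a numeric feature matrix `X` and one group label per row returns one meta-cluster label per row. Clustering each group separately before merging is what keeps the grouping variable from dominating the first-stage clusters in this sketch.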