Learning healthcare delivery network with longitudinal electronic health records data

The Annals of Applied Statistics

Published On 2024/3

This Supplementary Material includes details for marginal pseudo-likelihood, comparison of MIC to cAIC and cBIC, and sensitivity analysis of the choice of basis functions in the real data application.

Journal

The Annals of Applied Statistics

Volume

18

Issue

1

Page

882-898

Authors

Katherine P. Liao, MD, MPH

Harvard University

H-Index

50

Research Interests

rheumatoid arthritis

cardiovascular disease

applications of bioinformatics for clinical research

University Profile Page

Other Articles from authors

Katherine P. Liao, MD, MPH

Harvard University

Seminars in Arthritis and Rheumatism

Reasons for multiple biologic and targeted synthetic DMARD switching and characteristics of treatment refractory rheumatoid arthritis

Objective: Switching biologic and targeted synthetic DMARD (b/tsDMARD) medications occurs commonly in RA patients; however, data are limited on the reasons for these changes. The objective of the study was to identify and categorize reasons for b/tsDMARD switching and to investigate characteristics associated with treatment refractory RA. Methods: In a multi-hospital RA electronic health record (EHR) cohort, we identified RA patients prescribed ≥1 b/tsDMARD between 2001 and 2017. Consistent with the EULAR “difficult to treat” (D2T) RA definition, we further identified patients who discontinued ≥2 b/tsDMARDs with different mechanisms of action. We performed manual chart review to determine reasons for medication discontinuation. We defined “treatment refractory” RA as not achieving low disease activity (<3 tender or swollen joints on <7.5 mg of daily prednisone equivalent) despite treatment with two different b …

Katherine P. Liao, MD, MPH

Harvard University

Biometrics

Semisupervised transfer learning for evaluation of model classification performance

In many modern machine learning applications, changes in covariate distributions and difficulty in acquiring outcome information have posed challenges to robust model training and evaluation. Numerous transfer learning methods have been developed to robustly adapt the model itself to some unlabeled target populations using existing labeled data in a source population. However, there is a paucity of literature on transferring performance metrics, especially receiver operating characteristic (ROC) parameters, of a trained model. In this paper, we aim to evaluate the performance of a trained binary classifier on unlabeled target population based on ROC analysis. We proposed Semisupervised Transfer lEarning of Accuracy Measures (STEAM), an efficient three-step estimation procedure that employs (1) double-index modeling to construct calibrated density ratio weights and (2) robust imputation to leverage …

Katherine P. Liao, MD, MPH

Harvard University

Journal of the American Heart Association

Biomarkers of Cardiovascular Risk in Patients With Rheumatoid Arthritis: Results From the TARGET Trial

Background Cardiovascular disease remains an important comorbidity in patients with rheumatoid arthritis (RA), but traditional models do not accurately predict cardiovascular risk in patients with RA. The addition of biomarkers could improve prediction. Methods and Results The TARGET (Treatments Against RA and Effect on FDG PET/CT) trial assessed whether different treatment strategies in RA differentially impact cardiovascular risk as measured by the change in arterial inflammation on arterial target to background ratio on fluorodeoxyglucose positron emission tomography/computed tomography scans conducted 24 weeks apart. A group of 24 candidate biomarkers supported by prior literature was assessed at baseline and 24 weeks later. Longitudinal analyses examined the association between baseline biomarker values, measured in plasma EDTA, and the change in arterial inflammation target to …

Katherine P. Liao, MD, MPH

Harvard University

Rheumatology

Association of the multi-biomarker disease activity score with arterial 18-fluorodeoxyglucose uptake in rheumatoid arthritis

Objectives Rheumatoid arthritis (RA) and atherosclerosis share many common inflammatory pathways. We studied whether a multi-biomarker panel for RA disease activity (MBDA) would associate with changes in arterial inflammation in an interventional trial. Methods In the TARGET Trial, RA patients with active disease despite methotrexate were randomly assigned to the addition of either a TNF inhibitor or sulfasalazine+hydroxychloroquine (triple therapy). Baseline and 24-week follow-up 18F-fluorodeoxyglucose (FDG) positron emission tomography/computed tomography scans were assessed for change in arterial inflammation measured as the maximal arterial target-to-blood background ratio of FDG uptake in the most diseased segment of the carotid arteries or aorta (MDS-TBRmax). The MBDA test, measured at baseline and weeks 6, 18, and 24, was assessed for its …

Katherine P. Liao, MD, MPH

Harvard University

Patterns

LATTE: Label-efficient incident phenotyping from longitudinal electronic health records

Electronic health record (EHR) data are increasingly used to support real-world evidence studies but are limited by the lack of precise timings of clinical events. Here, we propose a label-efficient incident phenotyping (LATTE) algorithm to accurately annotate the timing of clinical events from longitudinal EHR data. By leveraging the pre-trained semantic embeddings, LATTE selects predictive features and compresses their information into longitudinal visit embeddings through visit attention learning. LATTE models the sequential dependency between the target event and visit embeddings to derive the timings. To improve label efficiency, LATTE constructs longitudinal silver-standard labels from unlabeled patients to perform semi-supervised training. LATTE is evaluated on the onset of type 2 diabetes, heart failure, and relapses of multiple sclerosis. LATTE consistently achieves substantial improvements over …

Katherine P. Liao, MD, MPH

Harvard University

Scientific Reports

Heterogeneous associations between interleukin-6 receptor variants and phenotypes across ancestries and implications for therapy

The Phenome-Wide Association Study (PheWAS) is increasingly used to broadly screen for potential treatment effects, e.g., IL6R variant as a proxy for IL6R antagonists. This approach offers an opportunity to address the limited power in clinical trials to study differential treatment effects across patient subgroups. However, limited methods exist to efficiently test for differences across subgroups in the thousands of multiple comparisons generated as part of a PheWAS. In this study, we developed an approach that maximizes the power to test for heterogeneous genotype–phenotype associations and applied this approach to an IL6R PheWAS among individuals of African (AFR) and European (EUR) ancestries. We identified 29 traits with differences in IL6R variant-phenotype associations, including a lower risk of type 2 diabetes in AFR (OR 0.96) vs EUR (OR 1.0, p-value for heterogeneity = 8.5 × 10^−3), and higher …

Katherine P. Liao, MD, MPH

Harvard University

Pharmacoepidemiology and Drug Safety

Improving the accuracy of automated gout flare ascertainment using natural language processing of electronic health records and linked Medicare claims data

Background We aimed to determine whether integrating concepts from the notes from the electronic health record (EHR) data using natural language processing (NLP) could improve the identification of gout flares. Methods Using Medicare claims linked with EHR, we selected gout patients who initiated the urate‐lowering therapy (ULT). Patients' 12‐month baseline period and on‐treatment follow‐up were segmented into 1‐month units. We retrieved EHR notes for months with gout diagnosis codes and processed notes for NLP concepts. We selected a random sample of 500 patients and reviewed each of their notes for the presence of a physician‐documented gout flare. Months containing at least 1 note mentioning gout flares were considered months with events. We used 60% of patients to train predictive models with LASSO. We evaluated the models by the area under the curve (AUC) in the validation data …

Katherine P. Liao, MD, MPH

Harvard University

Journal of the American Medical Informatics Association

Centralized Interactive Phenomics Resource: an integrated online phenomics knowledgebase for health data users

Objective Development of clinical phenotypes from electronic health records (EHRs) can be resource intensive. Several phenotype libraries have been created to facilitate reuse of definitions. However, these platforms vary in target audience and utility. We describe the development of the Centralized Interactive Phenomics Resource (CIPHER) knowledgebase, a comprehensive public-facing phenotype library, which aims to facilitate clinical and health services research. Materials and Methods The platform was designed to collect and catalog EHR-based computable phenotype algorithms from any healthcare system, scale metadata management, facilitate phenotype discovery, and allow for integration of tools and user workflows. Phenomics experts were engaged in the development and testing of the site. Results The knowledgebase stores phenotype …

Katherine P. Liao, MD, MPH

Harvard University

The Prevalence of Atherosclerosis Identified on Coronary CT Angiography Among Patients With Psoriatic Disease and Impact on Statin Utilization

Methods: We included patients with PsO undergoing CCTA at two large tertiary-care academic medical centers. PsO was identified as 2 ICD-9/10 codes at least 30 days apart in the electronic health record (EHR) prior to or within one year of CCTA. Baseline characteristics and CV risk factors were obtained from the EHR. Patients with a known history of CAD (prior MI, PCI, or CABG) were excluded. Pre- and post-statin use was defined as documented prescriptions within 2 years prior to and within one year after the CCTA. CCTA were classified by presence or absence of plaque. Results: There were 566 patients with PsO who underwent a CCTA, of whom 490 patients had no history of CAD. The mean age was 60±12, 47% female, and 87% were White. Traditional modifiable cardiovascular risk factors included hypertension (78%), dyslipidemia (51%), diabetes (25%), and chronic kidney disease (5%) (Table). A total of …

Katherine P. Liao, MD, MPH

Harvard University

Generate Analysis-Ready Data for Real-world Evidence: Tutorial for Harnessing Electronic Health Records With Advanced Informatic Technologies

Although randomized controlled trials (RCTs) are the gold standard for establishing the efficacy and safety of a medical treatment, real-world evidence (RWE) generated from real-world data has been vital in postapproval monitoring and is being promoted for the regulatory process of experimental therapies. An emerging source of real-world data is electronic health records (EHRs), which contain detailed information on patient care in both structured (eg, diagnosis codes) and unstructured (eg, clinical notes and images) forms. Despite the granularity of the data available in EHRs, the critical variables required to reliably assess the relationship between a treatment and clinical outcome are challenging to extract. To address this fundamental challenge and accelerate the reliable use of EHRs for RWE, we introduce an integrated data curation and modeling pipeline consisting of 4 modules that leverage recent advances in natural language processing, computational phenotyping, and causal modeling techniques with noisy data. Module 1 consists of techniques for data harmonization. We use natural language processing to recognize clinical variables from RCT design documents and map the extracted variables to EHR features with description matching and knowledge networks. Module 2 then develops techniques for cohort construction using advanced phenotyping algorithms to both identify patients with diseases of interest and define the treatment arms. Module 3 introduces methods for variable curation, including a list of existing tools to extract baseline variables from different sources (eg, codified, free text, and medical imaging) and end points …

Katherine P. Liao, MD, MPH

Harvard University

Annals of the Rheumatic Diseases

Reducing cardiovascular risk with immunomodulators: a randomised active comparator trial among patients with rheumatoid arthritis

Objective: Recent large-scale randomised trials demonstrate that immunomodulators reduce cardiovascular (CV) events among the general population. However, it is uncertain whether these effects apply to rheumatoid arthritis (RA) and if certain treatment strategies in RA reduce CV risk to a greater extent. Methods: Patients with active RA despite use of methotrexate were randomly assigned to addition of a tumour necrosis factor (TNF) inhibitor (TNFi) or addition of sulfasalazine and hydroxychloroquine (triple therapy) for 24 weeks. Baseline and follow-up 18F-fluorodeoxyglucose-positron emission tomography/CT scans were assessed for change in arterial inflammation, an index of CV risk, measured as an arterial target-to-background ratio (TBR) in the carotid arteries and aorta. Results: 115 patients completed the protocol. The two treatment groups were well balanced with a median age of 58 years, 71% women, 57 …

Katherine P. Liao, MD, MPH

Harvard University

medRxiv

Knowledge-Driven Online Multimodal Automated Phenotyping System

Though electronic health record (EHR) systems are a rich repository of clinical information with large potential, the use of EHR-based phenotyping algorithms is often hindered by inaccurate diagnostic records, the presence of many irrelevant features, and the requirement for a human-labeled training set. In this paper, we describe a knowledge-driven online multimodal automated phenotyping (KOMAP) system that i) generates a list of informative features by an online narrative and codified feature search engine (ONCE) and ii) enables the training of a multimodal phenotyping algorithm based on summary data. Powered by composite knowledge from multiple EHR sources, online article corpora, and a large language model, features selected by ONCE show high concordance with the state-of-the-art AI models (GPT4 and ChatGPT) and encourage large-scale phenotyping by providing a smaller but highly relevant feature set. Validation of the KOMAP system across four healthcare centers suggests that it can generate efficient phenotyping algorithms with robust performance. Compared to other methods requiring patient-level inputs and gold-standard labels, the fully online KOMAP provides a significant opportunity to enable multi-center collaboration.

Katherine P. Liao, MD, MPH

Harvard University

Arthritis & Rheumatology (Hoboken, NJ)

Finding the right fit for genes in rheumatology clinical care.

Vassy JL (1), Knevel R (2), Liao KP (1). Affiliations: (1) Department of Medicine, Veterans Affairs Boston Healthcare System; (2) Department of Rheumatology, Leiden University Medical Centre, Leiden, the Netherlands. …

Katherine P. Liao, MD, MPH

Harvard University

Shared inflammatory pathways of rheumatoid arthritis and atherosclerotic cardiovascular disease

The association between chronic inflammation and increased risk of cardiovascular disease in rheumatoid arthritis (RA) is well established. In the general population, inflammation is an established independent risk factor for cardiovascular disease, and much interest is placed on controlling inflammation to reduce cardiovascular events. As inflammation encompasses numerous pathways, the development of targeted therapies in RA provides an opportunity to understand the downstream effect of inhibiting specific pathways on cardiovascular risk. Data from these studies can inform cardiovascular risk management in patients with RA, and in the general population. This Review focuses on pro-inflammatory pathways targeted by existing therapies in RA and with mechanistic data from the general population on cardiovascular risk. Specifically, the discussions include the IL-1, IL-6 and TNF pathways, as well as the …

Katherine P. Liao, MD, MPH

Harvard University

medRxiv

ARCH: Large-scale knowledge graph via aggregated narrative codified health records analysis

Objective: Electronic health record (EHR) systems contain a wealth of clinical data stored as both codified data and free-text narrative notes, covering hundreds of thousands of clinical concepts available for research and clinical care. The complex, massive, heterogeneous, and noisy nature of EHR data imposes significant challenges for feature representation, information extraction, and uncertainty quantification. To address these challenges, we proposed an efficient Aggregated naRrative Codified Health (ARCH) records analysis to generate a large-scale knowledge graph (KG) for a comprehensive set of EHR codified and narrative features. Methods: The ARCH algorithm first derives embedding vectors from a co-occurrence matrix of all EHR concepts and then generates cosine similarities along with associated p-values to measure the strength of relatedness between clinical features with statistical certainty …

Katherine P. Liao, MD, MPH

Harvard University

From real-world electronic health record data to real-world results using artificial intelligence

With the worldwide digitalisation of medical records, electronic health records (EHRs) have become an increasingly important source of real-world data (RWD). RWD can complement traditional study designs because it captures almost the complete variety of patients, leading to more generalisable results. For rheumatology, these data are particularly interesting as our diseases are uncommon and often take years to develop. In this review, we discuss the following concepts related to the use of EHR for research and considerations for translation into clinical care: EHR data contain a broad collection of healthcare data covering the multitude of real-life patients and the healthcare processes related to their care. Machine learning (ML) is a powerful method that allows us to leverage a large amount of heterogeneous clinical data for clinical algorithms, but requires extensive training, testing, and validation. Patterns …

Katherine P. Liao, MD, MPH

Harvard University

Journal of Machine Learning Research

Augmented transfer regression learning with semi-non-parametric nuisance models

We develop an augmented transfer regression learning (ATReL) approach that introduces an imputation model to augment the importance weighting equation to achieve double robustness for covariate shift correction. More significantly, we propose a novel semi-non-parametric (SNP) construction framework for the two nuisance models. Compared with existing doubly robust approaches relying on fully parametric or fully non-parametric (machine learning) nuisance models, our proposal is more flexible and balanced to address model misspecification and the curse of dimensionality, achieving a better trade-off in terms of model complexity. The SNP construction presents a new technical challenge in controlling the first-order bias caused by the nuisance estimators. To overcome this, we propose a two-step calibrated estimating approach to construct the nuisance models that ensures the effective reduction of potential bias. Under this SNP framework, our ATReL estimator is root-n-consistent when (i) at least one nuisance model is correctly specified and (ii) the nonparametric components are rate-doubly robust. Simulation studies demonstrate that our method is more robust and efficient than existing methods under various configurations. We also examine the utility of our method through a real transfer learning example of the phenotyping algorithm for rheumatoid arthritis across different time windows. Finally, we propose ways to enhance the intrinsic efficiency of our estimator and to incorporate modern machine-learning methods in the proposed SNP framework.

Katherine P. Liao, MD, MPH

Harvard University

Arthritis Care & Research

Social determinants of health documentation among individuals with rheumatic and musculoskeletal conditions in an integrated care management program

Objective Social determinants of health (SDoH), such as poverty, are associated with increased burden and severity of rheumatic and musculoskeletal diseases. We studied the prevalence and documentation of SDoH needs in electronic health records (EHRs) of individuals with these conditions. Methods We randomly selected individuals with ≥1 ICD‐9/10 code for a rheumatic/musculoskeletal condition enrolled in a multihospital integrated care management program that coordinates care for medically and/or psychosocially complex individuals. We assessed SDoH documentation using terms for financial needs, food insecurity, housing instability, transportation, and medication access from EHR note review and ICD‐10 “Z” SDoH billing codes. We used multivariable logistic regression to examine associations between demographic factors (age, gender, race, ethnicity, insurance) and >1 (vs. 0) SDoH needs …

Katherine P. Liao, MD, MPH

Harvard University

medRxiv

Diversity and scale: genetic architecture of 2,068 traits in the VA Million Veteran Program

Genome-wide association studies (GWAS) have underrepresented individuals from non-European populations, impeding progress in characterizing the genetic architecture and consequences of health and disease traits. To address this, we present a population-stratified phenome-wide GWAS followed by a multi-population meta-analysis for 2,068 traits derived from electronic health records of 635,969 participants in the Million Veteran Program (MVP), a longitudinal cohort study of diverse US Veterans genetically similar to the respective African (121,177), Admixed American (59,048), East Asian (6,702), and European (449,042) superpopulations defined by the 1000 Genomes Project. We identified 38,270 independent variants associating with one or more traits at experiment-wide P < 4.6 × 10^−11 significance; fine-mapping 6,318 signals identified from 613 traits to single-variant resolution. Among these, a third (2 …

Other articles from The Annals of Applied Statistics journal

Kun Chen

University of Connecticut

The Annals of Applied Statistics

Tensor regression for incomplete observations with application to longitudinal studies

The supplementary material contains the extension of the proposed method to log-contrast models, the derivation of the model fitting algorithms, and additional numerical results for simulation and real data studies.

Peter A. Sims

Columbia University in the City of New York

The Annals of Applied Statistics

RZiMM-scRNA: A regularized zero-inflated mixture model framework for single-cell RNA-seq data

We provide additional plots and tables for the results section in Supplementary Information.

Wensheng Guo

University of Pennsylvania

The Annals of Applied Statistics

Semiparametric bivariate hierarchical state space model with application to hormone circadian relationship

In the supplementary material, we provide detailed calculations for the EM algorithm and the likelihood ratio test. Additionally, we include further results from the application analysis and simulation studies.

Tianchen Xu

Columbia University in the City of New York

The Annals of Applied Statistics

Tensor regression for incomplete observations with application to longitudinal studies

The supplementary material contains the extension of the proposed method to log-contrast models, the derivation of the model fitting algorithms, and additional numerical results for simulation and real data studies.

Florian Steinke

Technische Universität Darmstadt

The Annals of Applied Statistics

Generative machine learning methods for multivariate ensemble postprocessing

Ablation studies regarding the architecture and hyperparameter choices of the conditional generative model and some results not shown in the paper are provided.

Jorge Mateu

Universidad Jaime I

The Annals of Applied Statistics

A nonseparable first-order spatiotemporal intensity for events on linear networks: An application to ambulance interventions

The supplementary material summarises the results obtained when testing the spatial predictive accuracy considering different time periods and alternative training sets. Moreover, it includes additional details on the comparison with planar and separable modelling approaches. Finally, we also report more precise details regarding the computing times on the extended road network.

Paola Crippa

University of Notre Dame

The Annals of Applied Statistics

Sensitivity analysis of wind energy resources with Bayesian non-Gaussian and nonstationary functional ANOVA

The supplement contains additional analyses and plots in support of the main findings in the paper. The code for this work is available at the following GitHub repository: github.com/Env-an-Stat-group/24.Zhang.AoAS.

Jian Kang

University of Michigan

The Annals of Applied Statistics

Latent subgroup identification in image-on-scalar regression

In the supplementary material, we provide supplemental information about the Hermite polynomials and basis function construction, the process of using the GPfit package to estimate the smoothing parameter, sensitivity analysis with varying hyperparameter values, and additional figures describing the detailed simulation and application results.

Stefano Mazzuco

Università degli Studi di Padova

The Annals of Applied Statistics

Functional concurrent regression with compositional covariates and its application to the time-varying effect of causes of death on human longevity

This directory contains the R package fcrc, including the code for reproducing the analysis, the simulation studies and all images of the paper.

Jon Wakefield

University of Washington

The Annals of Applied Statistics

A Bayesian hierarchical small area population model accounting for data source specific methodologies from American Community Survey, Population Estimates Program, and …

Appendices A–H. Appendix A: Notation Table, Appendix B: Modeling Approaches for Spatiotemporal Interaction Terms, Appendix C: Summary of Model Assumptions and Justifications, Appendix D: Naive Model Description, Appendix E: Comparison of Errors, Appendix F: Validation Results for Selected Counties, Appendix G: Maps of Relative Differences, Appendix H: County Estimates for Georgia.

Katherine P. Liao, MD, MPH

Harvard University

The Annals of Applied Statistics

Learning healthcare delivery network with longitudinal electronic health records data

This Supplementary Material includes details for marginal pseudo-likelihood, comparison of MIC to cAIC and cBIC, and sensitivity analysis of the choice of basis functions in the real data application.

Jeffrey P. Spence

Stanford University

The Annals of Applied Statistics

A simple and flexible test of sample exchangeability with applications to statistical genomics

The Supplementary Information PDF includes technical details, proofs, and supplementary figures for our work.

Marc G Genton

King Abdullah University of Science and Technology

The Annals of Applied Statistics

Sensitivity analysis of wind energy resources with Bayesian non-Gaussian and nonstationary functional ANOVA

The supplement contains additional analyses and plots in support of the main findings in the paper. The code for this work is available at the following GitHub repository: github.com/Env-an-Stat-group/24.Zhang.AoAS.

Jie Peng

University of California, Davis

The Annals of Applied Statistics

Estimating fiber orientation distribution with application to study brain lateralization using HCP D-MRI data

A supplementary text with additional details on FOD estimators, synthetic experiment results, and the HCP D-MRI application.

Bijan Niknam

Harvard University

The Annals of Applied Statistics

Privacy-preserving, communication-efficient, and target-flexible hospital quality measurement

The Supplementary Material consists of five appendices. In Appendix I we derive the form of the influence functions. In Appendix II, we show how patient-level information is not required to solve for the data-adaptive hospital-level weights (i.e., summary-level information is sufficient). In Appendix III, we prove the data-adaptive property of the hospital-level weights. Appendix IV and Appendix V contain additional results from the simulation study and real data analysis, respectively.

Danica M. Ommen

Iowa State University

The Annals of Applied Statistics

Density-based matching rule: Optimality, estimation, and application in forensic problems

The online supplement provides additional figures described in this article and the details of non-Gaussian distributions used in the simulation study.

Brian J Reich

North Carolina State University

The Annals of Applied Statistics

Modeling extremal streamflow using deep learning approximations and a flexible spatial process

The Supplementary Material consists of three appendices. Appendix A goes over some properties of the PMM and gives an overview of the variable importance measure used in the text. Appendix B presents supplementary simulation studies detailing the performance of the PMM in various density estimation and parameter estimation scenarios. Appendix C consists of additional results from the HCDN data analysis, including MCMC convergence, model comparison and model fit results, and selected results from analyzing the extremal streamflow data in its original scale.

Weining Shen

University of California, Irvine

The Annals of Applied Statistics

Risk-aware restricted outcome learning for individualized treatment regimes of schizophrenia

Additional numerical results and the algorithm description are provided.

Theo Economou

University of Exeter

The Annals of Applied Statistics

A hierarchical spline model for correcting and hindcasting temperature data

All the data, code and Supplementary Material (including Figures S1–S5) are available online (Economou, Johnson and Dyson (2024)) but can also be accessed at Zenodo (Economou (2023)) with DOI: 10.5281/zenodo.10074436. Note that the station names have been anonymised for confidentiality purposes. This repository comprises a single zipped file, code_data_supplementary_plots_v2.zip, which includes all the supplementary figures referenced in the paper: FigureS1.pdf: trace plot of the MCMC samples for the deviance. FigureS2.pdf: predicted vs observed Tmax values for each station. FigureS3.pdf: QQ plot for each station. FigureS4.pdf: Empirical and predicted autocorrelation plots for the 10 stations with long enough time series. FigureS5.pdf: Empirical autocorrelation plots for all stations (except 11, 16, 19 and 21) up to lag 30. Figure3_All_Stations.pdf: Same as Figure 3 but for all stations …