Because mammography is the primary breast cancer screening strategy, it is essential to make full use of mammogram imaging data to better identify women who are at higher and lower than average risk. Our primary goal in this study is to extract mammogram-based features that augment the well-established breast cancer risk factors and thereby improve prediction accuracy. In this article, we propose a supervised functional principal component analysis (sFPCA) method over triangulations that extracts features ordered by the magnitude of their association with the failure time outcome. The proposed method uses flexible bivariate splines over triangulations to accommodate the irregular boundary of the breast area within mammogram images. We also provide a computationally efficient eigenvalue decomposition algorithm. In simulation studies, the proposed method yields a lower Brier score and a higher area under the ROC curve (AUC) than the conventional unsupervised FPCA method. We apply our method to data from the Joanne Knight Breast Health Cohort at Siteman Cancer Center. Our approach not only achieves the best prediction performance compared with unsupervised FPCA and benchmark models but also reveals important risk patterns within the mammogram images. These results demonstrate the importance of using additional supervised image-based features to clarify breast cancer risk.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model).