With the advent of electronic health records, information collected in the course of regular health care is increasingly being used for clinical research. The hope is that the wealth of clinical data and the realistic setting (compared with information derived from highly controlled experiments like randomized trials) will aid in the investigation of determinants of disease and in understanding which treatments are effective in regular practice and for which patients. However, the availability of information in such databases is often driven by how a patient feels and may therefore be associated with the health outcomes being considered. We call this an outcome-dependent visit process, and recent work has shown that ignoring the outcome dependence can produce significant bias in the regression coefficients when fitting longitudinal data models. It is therefore important to have tools to recognize datasets exhibiting outcome dependence. We develop a score statistic to motivate the form of diagnostic test statistics, suggest a variety of approaches for diagnosing such situations, and evaluate their performance. Simple diagnostic tests achieve high power for diagnosing outcome-dependent visit processes. High power is attained at the point where generalized estimating equations methods begin to exhibit bias in estimating regression coefficients and before likelihood-based methods are substantially biased.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)