Summary
Topic modeling is a popular method for describing biological count data. With topic models, the user must specify the number of topics $K$. Since there is no definitive way to choose $K$, and since a true value might not exist, we develop a method, which we call topic alignment, to study the relationships across models with different $K$. In addition, we present three diagnostics based on the alignment. These techniques can show how many topics are consistently present across different models, whether a topic is only transiently present, and whether a topic splits into more topics when $K$ increases. This strategy gives more insight into the data-generating process than choosing a single value of $K$ would. We design a visual representation of these cross-model relationships, show the effectiveness of these tools for interpreting the topics on simulated and real data, and release an accompanying R package, alto.