Abstract
Genome-wide association studies (GWAS) have played an important role in identifying genetic variants underlying complex human traits. However, their success is hindered by weak effects at causal variants and by noise at non-causal variants. To overcome these difficulties, a previous study proposed a regularized regression method that penalizes the difference in signal strength between two consecutive single-nucleotide polymorphisms (SNPs).
We generalize the aforementioned method so that more than two adjacent SNPs can be incorporated, and we study the choice of the optimal number of SNPs. Simulation studies indicate that when consecutive SNPs have similar absolute coefficients, our method performs better than the LASSO penalty; in other situations, its performance remains comparable to that of the LASSO penalty. The practical utility of the proposed method is demonstrated by applying it to the Genetic Analysis Workshop 16 rheumatoid arthritis GWAS data.
An implementation of the proposed method is provided in the R package MWLasso.