We integrated large-scale genomic and plasma proteomic data from over 10,000 individuals to characterize the genetic architecture of host proteins that have been reported either to interact with SARS-CoV-2 proteins in cellular assays or to be related to virus entry, the host hyper-immune or procoagulant response, or the severity of COVID-19.
We identified 220 independent genetic variants acting in cis for 97 proteins targeted by 106 aptamers ("SOMAmers"; DNA-based protein affinity reagents). Among these 97 proteins, 38 are the targets of existing drugs, including 15 proteins that were previously identified as interacting with structural or non-structural proteins encoded in the SARS-CoV-2 genome and 16 proteins that are biomarkers related to COVID-19 severity, prognosis, or outcome.
Manhattan plot of cis-association statistics (encoding gene ±500 kb) for 179 proteins. The most significant regional sentinel protein quantitative trait loci (pQTL) acting in cis are annotated by larger dots for 104 unique protein targets (p < 5×10⁻⁸). Starred genes indicate those targeted by multiple aptamers.
Data access
Use tabix to retrieve associations for specific genomic regions without downloading the whole data set.
Example:
tabix http://omicscience.org/apps/covidpgwas/data/all.grch37.tabix.gz 6:1000000-1100000
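To query the cis window around a particular gene rather than a hand-typed interval, the region string can be built from the gene's coordinates. The following is a minimal sketch, assuming the ±500 kb cis definition used for the Manhattan plot; the helper name cis_region and the gene coordinates in the comments are hypothetical and only illustrate the idea.

```shell
#!/bin/sh
# Build a tabix region string covering a gene plus a flanking window.
# Arguments: chromosome, gene start, gene end, optional flank in bp.
# The default flank of 500000 bp is an assumption taken from the
# "encoding gene ±500 kb" cis definition; adjust it as needed.
cis_region() {
    chrom=$1
    gene_start=$2
    gene_end=$3
    flank=${4:-500000}
    region_start=$((gene_start - flank))
    # tabix regions are 1-based, so clamp the start at 1.
    if [ "$region_start" -lt 1 ]; then
        region_start=1
    fi
    region_end=$((gene_end + flank))
    printf '%s:%s-%s\n' "$chrom" "$region_start" "$region_end"
}

# Example usage with hypothetical GRCh37 gene coordinates:
# region=$(cis_region 6 1000000 1100000)
# tabix http://omicscience.org/apps/covidpgwas/data/all.grch37.tabix.gz "$region"
```

Chromosome names are written without a "chr" prefix here, following the example above. Adding tabix's -h option to the query also prints the header lines of the file, if it has any.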