- Open Access
Using genetic variants to evaluate the causal effect of serum vitamin D concentration on COVID-19 susceptibility, severity and hospitalization traits: a Mendelian randomization study
Journal of Translational Medicine volume 19, Article number: 300 (2021)
The coronavirus disease 2019 (COVID-19) pandemic has struck globally and is exerting a devastating toll on humans. The pandemic has led to calls for widespread vitamin D supplementation in public. However, evidence supporting the role of vitamin D in the COVID-19 pandemic remains controversial.
We performed a two-sample Mendelian randomization (MR) analysis to analyze the causal effect of the 25-hydroxyvitamin D [25(OH)D] concentration on COVID-19 susceptibility, severity and hospitalization traits by using summary-level GWAS data. The causal associations were estimated with inverse variance weighted (IVW) with fixed effects (IVW-fixed) and random effects (IVW-random), MR-Egger, weighted edian and MR Robust Adjusted Profile Score (MR.RAPS) methods. We further applied the MR Steiger filtering method, MR Pleiotropy RESidual Sum and Outlier (MR-PRESSO) global test and PhenoScanner tool to check and remove single nucleotide polymorphisms (SNPs) that were horizontally pleiotropic.
We found no evidence to support the causal associations between the serum 25(OH)D concentration and the risk of COVID-19 susceptibility [IVW-fixed: odds ratio (OR) = 0.9049, 95% confidence interval (CI) 0.8197–0.9988, p = 0.0473], severity (IVW-fixed: OR = 1.0298, 95% CI 0.7699–1.3775, p = 0.8432) and hospitalized traits (IVW-fixed: OR = 1.0713, 95% CI 0.8819–1.3013, p = 0.4878) using outlier removed sets at a Bonferroni-corrected p threshold of 0.0167. Sensitivity analyses did not reveal any sign of horizontal pleiotropy.
Our MR analysis provided precise evidence that genetically lowered serum 25(OH)D concentrations were not causally associated with COVID-19 susceptibility, severity or hospitalized traits. Our study did not provide evidence assessing the role of vitamin D supplementation during the COVID-19 pandemic. High-quality randomized controlled trials are necessary to explore and define the role of vitamin D supplementation in the prevention and treatment of COVID-19.
The coronavirus disease 2019 (COVID-19) pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has struck globally and is exerting a devastating toll on humans . As of February 12, 2021, this emerging highly infectious disease has spread to six continents, and there have been 107,252,265 confirmed cases of COVID-19, including 2,355,339 deaths, reported to the World Health Organization (WHO) . The pandemic situation has engulfed the global community into an accelerated search for preventive and therapeutic strategies and has led to calls for widespread vitamin D supplementation . On Dec 17, 2020, the National Institute for Health and Care Excellence (NICE) published an updated rapid review of recent studies on vitamin D and COVID-19 in collaboration with Public Health England and the Scientific Advisory Committee on Nutrition . They support the advice for everyone to take vitamin D supplements to maintain bone and muscle health during the autumn and winter months . The UK government also released and updated new guidance allowing extremely clinically vulnerable people to opt in to receive a free 4-month supply of daily vitamin D supplementation .
Humans obtain vitamin D from exposure to sunlight by the action of ultraviolet B on the skin, from their diet, and from dietary supplementation. The vitamin D status is reflected by the level of serum 25-hydroxyvitamin D [25(OH)D], which is produced by the hepatic hydroxylation of vitamin D . The beneficial role of vitamin D in bone growth and maintenance is undisputed and has influenced practical clinical guidelines and public health policies over the years . However, the evidence supporting the role of vitamin D in other health and disease processes, in particular in acute respiratory tract infection or COVID-19, remains controversial . One meta-analysis of 25 randomized controlled trials (RCTs) of over 11,000 participants showed that vitamin D supplementation could reduce the risk of acute respiratory infections. The protective effects were stronger in those with baseline 25(OH)D concentrations < 25 nmol/l . However, another prespecified analysis from the D-Health RCT in more than 20,000 Australian adults recruited from the general population suggested that monthly doses of 60,000 IU of vitamin D did not reduce the risk or severity of acute respiratory tract infections. Although the analysis showed a statistically significant effect on the overall duration of symptoms, the reduction was small (0.5 days) and unlikely to be clinically meaningful .
RCTs are the optimal study design to explore and define the role of vitamin D supplementation in preventing the occurrence and severe course of COVID-19, but they have challenges . Mendelian randomization (MR) analysis can overcome the limitations of conventional studies by using single nucleotide polymorphisms (SNPs) as instrumental variables (IVs) for assessing the causal effect of a risk factor (exposure) on an outcome. MR relies on three assumptions: (1) the genetic variant is associated with the exposures; (2) the genetic variant is not associated with confounders; and (3) the genetic variant influences the outcome only through the exposures . A two-sample MR obtains IV-exposure and IV-outcome associations from two different sets of participants. The IVs used in MR are available due to the genome-wide association studies (GWAS) being conducted and high-throughput genomic technologies being developed. In this study, we used the MR approach to explore the causal effect of the 25(OH)D concentration on COVID-19 susceptibility, severity and hospitalization traits. This approach may provide estimates of the effect of 25(OH)D while reducing bias due to confounding and reverse causation.
We performed a two-sample MR analysis to study the effect of the 25(OH)D concentration on COVID-19 susceptibility, severity and hospitalization traits. Our approach relied upon summary-level GWAS data to obtain MR estimates [13, 14]. We used all SNPs that strongly and independently (R2 < 0.001) predicted exposures at genome-wide significance (p < 5E−08). Proxy SNPs (R2 > 0.9) from LDlink (https://ldlink.nci.nih.gov/) were used when the SNPs were not available for the outcome . The palindromic SNPs with intermediate allele frequencies (palindromic SNPs referred to the SNPs with A/T or G/C alleles and “intermediate allele frequencies” referred to 0.01 < allele frequency < 0.30) were excluded from the above selected instrument SNPs. SNPs with a minor allele frequency (MAF) of < 0.01 were also excluded to avoid potential the statistical bias from the original GWAS since they usually carry with low confidence. We also calculated the F statistics for the SNPs to measure the strength of the instruments. IVs with a F statistic less than 10 were excluded and were often labeled “weak instruments” . Further, we used the PhenoScanner tool [17, 18] to exclude any of the selected SNPs were associated with other phenotypes at risk of affecting COVID-19 susceptibility, severity or hospitalization independently of the 25(OH)D concentration. These rigorously selected SNPs were used as the final instrumental SNPs for the subsequent MR analysis.
We retrieved summary data for the association between SNPs and the serum 25(OH)D concentration from the UK Biobank (UKB) with phenotype, genotype and clinical information on 417,580 individuals of European ancestry (age range from 40 to 69 years old) . Individuals were identified by projecting the UKB sample to the first two principal components of the 1,000 Genome Project using Hap Map 3 SNPs with MAF > 0.01 in both data sets. Genotype data were quality-controlled and imputed to the Haplotype Reference Consortium and UK10K reference panels by the UKB group. Genetic variants with a minor allele count > 5 and imputation score > 0.3 for all individuals were extracted and converted genotype probabilities to hard-call genotypes using PLINK2. In total, 8,806,780 variants, including 260,713 SNPs on the X chromosome, were available for analysis. A linear mixed model GWAS was performed to identify the associations between genetic variants and the 25(OH)D concentration. Meta-analysis with SUNLIGHT consortium GWAS results  was also performed. Information regarding the quality control and statistical analyses has been reported previously .
We obtained estimates of the effect of the 25(OH)D concentration on COVID-19 by obtaining effect coefficients from the above SNPs in GWAS meta-analyses from the COVID-19 Host Genetics Initiative (COVID-19 HGI) . The latest summary statistics were from the fifth round of GWAS meta-analysis shared publicly on January 18, 2021. Detailed information on participating studies, quality control, and analyses has been provided on the COVID-19 HGI website (https://www.covid19hg.org/results/). The COVID-19 HGI used different case/control definitions to identify genetic variants associated with COVID-19 susceptibility, disease severity and hospitalized cases of European ancestry and all ancestries. In our study, we only included European ancestry. We used a susceptibility phenotype that compared 38,984 confirmed COVID-19 cases, defined as individuals with laboratory confirmation of SARS-CoV-2 infection based on nucleic acid amplification- or serology-based tests or by electrical health records [using International Classification of Diseases (ICD) or physician notes], chart review or self-reporting, with 1,644,784 controls enrolled in the cohorts and not included as cases. To assess COVID-19 severity, we used the severe phenotype with 5101 patients, who were defined as patients with COVID-19 with very severe respiratory symptoms and requiring respiratory support, including intubation, CPAP, BiPAP, continuous external negative pressure or high-flow nasal cannula. Controls were 1,383,241 individuals enrolled in the cohorts and not included as cases. The hospitalized phenotype compared 9,986 hospitalized patients with COVID-19, with 1,877,672 controls enrolled in the cohorts and not included as cases. The study details included in the COVID-19 HGI GWAS meta-analyses and three phenotypes are shown in Additional file 1: Tables S1, Additional file 2: Tables S2, Additional file 3: Tables S3. We extracted 114 independent SNPs from the 25(OH)D GWAS as IVs for the three COVID-19 phenotypes. One SNP (rs182244780) was not available in the GWAS of the hospitalized phenotype, and we did not find the proxy SNP from LDlink; therefore, it was left out of the analysis. After harmonizing the exposure and outcomes datasets, two SNPs (rs11606, rs2246832) were removed for being palindromic with intermediate allele frequencies. Finally, 112, 112 and 111 SNPs were the “Complete sets” involved in the MR analyses (Table 1).
We applied the principles of univariable two-sample MR to obtain estimates of the causal effect of the 25(OH)D concentration on COVID-19 susceptibility, severity and hospitalization traits separately. Analyses were performed using TwoSampleMR R package of MR-Base  (https://github.com/MRCIEU/TwoSampleMR). The statistical tests of the MR analysis were two-sided, and the results of the MR analyses were considered statistically significant at a Bonferroni-corrected p < 0.0167 (e.g., 0.05/3 outcomes). The causal associations were estimated with inverse variance weighted (IVW) with fixed effects (IVW-fixed) and random effects (IVW-random) [14, 22], MR-Egger [23, 24] and weighted median estimate methods . The IVW method uses a meta-analysis approach to combine the Wald ratios of the causal effects of each SNP and can provide the most precise estimates [14, 22]. The weighted median estimate provides a reliable effect estimate of the causal effect when at least 50% of the weight in the analysis comes from effective IVs . MR-Egger regression is used to create a weighted linear regression of the outcome coefficients with the exposure coefficients. The slope of the weighted regression line provides an asymptotically unbiased causal estimate of the exposure on the outcome if the INSIDE (instrument strength is independent of direct effect) assumption is met. In addition, the intercept of the MR-Egger regression line was used to quantify the amount of horizontal pleiotropy present in the data averaged across the genetic instruments [23, 24]. Under the INSIDE assumption, the MR-Egger intercept test identifies horizontal pleiotropy if the intercept from the MR-Egger analysis is not equal to zero . We also calculated the Robust Adjusted Profile Score (MR.RAPS) to estimate the causal effects, which can lead to considerably higher statistical power than the conventional MR analysis . MR.RAPS considers the measurement error in SNP-exposure effects and is unbiased when there are many weak instruments as well as robust to systematic and idiosyncratic pleiotropy . The MR.RAPS method can also alleviate but cannot solve the problem of horizontal pleiotropy .
When selecting SNPs from the GWAS, especially in very large GWAS, it can be difficult to determine whether a SNP has its primary association with the exposure being studied or the outcome . For example, in our study, if COVID-19 phenotypes exert a causal effect on the serum 25(OH)D concentration, then there is a possibility that some SNPs primarily associated with COVID-19 may also pass the genome-wide significance threshold in a GWAS of the 25(OH)D concentration with a large sample size. These SNPs can then misleadingly be applied as genetic instruments for the 25(OH)D concentration when they should actually be applied as IVs for COVID-19. Therefore, we applied MR Steiger filtering  as implemented in the TwoSampleMR R package to test the causal direction of each of the extracted SNPs on the exposures [25(OH)D] and outcome (COVID-19 phenotypes). This approach calculates the variance explained in the exposure and the outcome by the instrumenting SNPs and tests if the variance in the outcome is less than the exposure. For any SNP that had less variance in the exposure than the outcome (which means it showed evidence of primarily affecting outcomes rather than exposures), we removed those lipid SNPs and conducted IVW, MR-Egger, weighted median and MR.RAPS using the remaining instruments (“Steiger filtered” sets). IVs with “TRUE” meant the evidence for causality in the expected direction, while “FALSE” meant the evidence for causality in the reverse direction (Additional file 4: Tables S4, Additional file 5: Tables S5, Additional file 6: Tables S6). After Steiger filtering, 109, 109 and 109 SNPs (rs182244780, rs1894100 and rs4694423 were excluded) were involved in the MR analysis (Table 1).
We further applied the MR Pleiotropy RESidual Sum and Outlier (MR-PRESSO) global test  to reduce heterogeneity in the estimate of the causal effect to remove SNPs that were horizontal pleiotropic outliers. We conducted this analysis by using the MR-PRESSO R package (https://github.com/rondolab/MR-PRESSO). The number of distributions was set to 1000 and the threshold was set to 0.05. We also conducted IVW, MR-Egger, weighted median and MR.RAPS using the remaining instruments (“Outlier removed” sets). None of the horizontal pleiotropic outliers were identified for the COVID-19 severity and hospitalization, while three SNPs (rs532436, rs12949853 and rs72997688) were removed for COVID-19 susceptibility (Table 1). We estimated the power of our study using the outlier removed sets according to a method suggested by Brion et al. . The equations using an approximate linear model on the observed binary (0–1) scale were adapted for binary outcomes, which require several parameters to estimate. These parameters include the proportion of phenotypic variation explained by IV SNPs, the effect size of the exposure to the outcome at the epidemiological level, sample size, and the standard deviation (SD) of the exposure and outcome. The three outlier removed sets of SNPs for COVID-19 susceptibility, severity and hospitalization collectively explained 0.0183, 0.0185 and 0.0185 of the variance in the 25(OH)D concentration, respectively (Additional file 4: Tables S4, Additional file 5: Tables S5, Additional file 6: Tables S6).
We used the IVW, MR-Egger and Maximum likelihood  methods to detect heterogeneity. Heterogeneity was quantified by the Cochran Q statistic. To identify potentially influential SNPs, we performed a “leave-one-out” sensitivity analysis in which we excluded one SNP at a time and performed an IVW-random method on the remaining SNPs to identify the potential influence of outlying variants on the estimates. To validate the MR results, we conducted sensitivity analyses using genetic variants associated with 25(OH)D that were identified in the SUNLIGHT consortium GWAS results . The large, multicenter, GWAS analysis considering phenotype data from 79,366 individuals of European ancestry included 31 studies from epidemiological cohorts from Europe, Canada, and the USA . We first extracted 7 SNPs associated with 25(OH)D at genome-wide significance (p < 5E−08) and pruned for linkage disequilibrium at a R2 coefficient of correlation of < 0.001 (Additional file 7: Table S7). Then, to reduce the limitations of SNP numbers, we pruned SNPs to a R2 of 0.01 and extracted 10 SNPs associated with 25(OH)D (Additional file 8: Table S8). We performed IVW-fixed, IVW-random [14, 22], MR-Egger [23, 24] and weighted median methods  to calculate the causal estimates.
Further, we used the PhenoScanner tool [17, 18] to check whether any of the 109 identified SNPs in the outlier removed sets in the current study were associated with other phenotypes at risk of affecting the three COVID-19 phenotypes independent of the 25(OH)D concentration. We assessed SNPs at a threshold of p < 5E−08 for their association with any other phenotypes. Using the PhenoScanner tool, we found that 35 SNPs (rs10454087, rs1047891, rs10822145, rs12056768, rs12317268, rs1260326, rs142004400, rs1800588, rs182050989, rs2037511, rs2131925, rs2229742, rs2418929, rs2642439, rs2756119, rs28407950, rs2952289, rs3745669, rs41301394, rs429358, rs4418728, rs4616820, rs4846917, rs512083, rs532436, rs5771043, rs58387006, rs6834488, rs71297391, rs73413596, rs804281, rs8063565, rs8107974, rs964184 and rs9861009) were significantly associated with hematological traits (e.g., white blood cell count, platelet count and granulocyte count). Changes in hematological parameters in SARS-CoV-2 infected patients are very common, and several studies have shown that hematological parameters are markers of disease severity and progression. It was confirmed that SARS-CoV-2 infection was associated with alterations in the blood cell count. One in four COVID-19 patients experienced some form of leukopenia (WBC < 4.00E+09 cells/l), with the majority (63.0%) exhibiting lymphocytopenia (lymphocyte count < 1.00E+09 cells/l) . Additionally, a reduced number of eosinophils has been reported in more than half (52.9%) of patients who tested positive for COVID-19 . Zhou et al.  also reported a significant platelet count reduction between patients with severe disease and those exhibiting mild disease. To avoid horizontal pleiotropy caused by possible causal mechanistic associations between hematological traits and COVID-19 phenotypes, we performed univariable two-sample MR with IVW-fixed and IVW-random methods to assess the epidemiological correlations between hematological traits and COVID-19 phenotypes. Genome-wide significant (p < 5E−08) and independent (R2 < 0.001) genetic variants of predicted exposures of 34 blood cell indices, including white blood cell count, platelet count and lymphocyte count were extracted from the Astle et al. GWAS . The GWAS on blood cell traits was performed on a population of approximately 173,000 individuals of European ancestry, excluding those with blood cancers or major blood disorders, largely from the UK. The significance threshold was set using Bonferroni correction at p < 4.90E−04 (0.05/[34 × 3]). We excluded SNPs significantly associated with blood cell traits if causal links were identified between blood cell traits and COVID-19 phenotypes.
As we mentioned above, outlier removed sets included 106,109 and 109 SNPs, respectively (Table 1). The detailed characteristics of SNPs in outlier removal sets associated with the 25(OH)D concentration are shown in Additional file 4: Tables S4, Additional file 5: Tables S5, Additional file 6: Tables S6. For these IVs, all of the F statistics were above 10 (ranging from 29.7798 to 1349.6840 for COVID-19 susceptibility; ranging from 29.7798 to 1349.6840 for COVID-19 severity; and ranging from 29.7798 to 1349.6840 for COVID-19 hospitalization) with average F statistics of 89.1985, 87.7882 and 87.7882, respectively (Additional file 4: Tables S4, Additional file 5: Tables S5, Additional file 6: Tables S6), indicating that they satisfy the strong relevance assumption of MR and that weak instrument bias would not substantially influence the estimations of causal effects. The results of MR analyses of the 25(OH)D concentration and three COVID-19 phenotypes with three sets of SNPs (complete, Steiger filtered and outlier removed sets) are presented in Table 1. We estimated the overall odds ratio (OR) of COVID-19 phenotypes per 1-SD increase in the 25(OH)D concentration. IVW-fixed, IVW-random, MR-Egger, weighted median and MR.RAPS using the complete sets of SNPs did not suggest the causal effect of increased/decreased 25(OH)D concentration on COVID-19 susceptibility, severity or hospitalization traits at a Bonferroni-corrected p threshold of 0.0167 [IVW-fixed: OR = 0.9369, 95% confidence interval (CI) 0.8656–1.0142, p = 0.1070; MR-Egger: OR = 0.9404, 95% CI 0.7923–1.1161, p = 0.4834; weighted median: OR = 0.9738, 95% CI 0.8560–1.1077, p = 0.6860; MR.RAPS: OR = 0.9206, 95% CI 0.8353–1.0146, p = 0.0952 for COVID-19 susceptibility] (Table 1). The results did not provide evidence for the causal effects, and were stable and consistent using the Steiger filtered (IVW-fixed: OR = 0.9307, 95% CI 0.8437–1.0268, p = 0.1519 for COVID-19 susceptibility) and outlier removed sets (IVW-fixed: OR = 0.9049, 95% CI 0.8197–0.9988, p = 0.0473 for COVID-19 susceptibility) for the three COVID-19 traits (Table 1 and Fig. 1).
Table 2 displays the heterogeneity analysis results using the IVW, MR-Egger and Maximum likelihood methods, and pleiotropy analysis using the MR-Egger intercept test. At the Bonferroni-corrected p threshold of 0.0167, we observed strong evidence of heterogeneity across complete SNP sets for COVID-19 susceptibility (IVW: Q = 214.2548, p = 1.49E−08, MR-Egger: Q = 214.2489, p = 1.06E−08, Maximum likelihood: Q = 214.2201, p = 1.50E−08), but evidence of heterogeneity was lacking for the COVID-19 severity and hospitalization (IVW: Q = 137.6576, p = 0.0439, MR-Egger: Q = 137.6540, p = 0.0382, Maximum likelihood: Q = 137.6563, p = 0.0439 for COVID-19 severity; IVW: Q = 136.2421, p = 0.0456, MR-Egger: Q = 135.2766, p = 0.0447, Maximum likelihood: Q = 136.2287, p = 0.0456 for COVID-19 hospitalization). After Steiger filtering and outlier removing, we did not observe the evidence of heterogeneity using three SNP sets for the three COVID-19 phenotypes (for COVID-19 susceptibility using outlier removed sets, IVW: Q = 128.5356, p = 0.0592, MR-Egger: Q = 128.4032, p = 0.0525, Maximum likelihood: Q = 128.4887, p = 0.0595). Based on the MR-Egger intercept test, we did not find evidence of directional pleiotropy between the 25(OH)D concentration and three COVID-19 phenotypes using three SNPs sets (complete SNPs sets: intercept = − 0.0001, p = 0.9561 for COVID-19 susceptibility; intercept = 0.0033, p = 0.9573 for COVID-19 severity; intercept = − 0.0037, p = 0.3797 for COVID-19 hospitalization).
We performed “leave-one-out” analysis based on the IVW-random method using the outlier removed sets for three COVID-19 phenotypes and found that there was no potentially influential SNP driving the causal link, and the result was stable (Additional file 9: Tables S9, Additional file 10: Tables S10, Additional file 11: Tables S11). Extended MR analysis using 25(OH)D-related variants from the SUNLIGHT consortium GWAS  also reported no evidence of causal association between 25(OH)D and COVID-19 traits (Additional file 7: Table S7, Additional file 8: Table S8). To avoid horizontal pleiotropy, we performed univariable two-sample MR between hematological traits and COVID-19 phenotypes due to 35 SNPs associated with blood traits. Using the IVW-fixed and IVW-random methods, we found the evidence to report the significant causal associations between the monocyte percentage of white cells and COVID-19 severity risk (beta = 0.1685, IVW-fixed p = 0.0002, IVW-random p = 0.0004) (p < 4.90E−04) (Additional file 12: Table S12, Additional file 13: Tables S13, Additional file 14: Tables S14). Therefore, we removed three SNPs (rs10822145, rs2642439 and rs2756119) associated with the monocyte percentage of white cells in the outlier removed sets and re-performed the MR analysis using IVW-fixed, IVW-random, MR-Egger, weighted median and MR.RAPS methods. The final results were consistent with the outlier removed sets and did not provide evidence for the causal effects between the effect of the serum 25(OH)D concentration on three COVID-19 traits (Additional file 15: Table S15).
Table 3 shows the sample size in the current analysis for the three COVID-19 traits. The total COVID-19 susceptibility group sample size was 1,683,768, of which 38,984 were COVID-19 cases. The overall COVID-19 severity group sample size was 1,388,342, of which 5101 were COVID-19 severe cases. The sample size of all COVID-19 hospitalization was 1,887,658, of which 9986 were COVID-19 hospitalized cases. We presented the power estimations for a range of proportions of 25(OH)D variation explained by genetic variants. Under the current sample size with outlier removed sets, our study had 80% power to detect a causal effect of a relative 12% (ORs less than 0.8994) decrease in COVID-19 susceptibility risk, a relative 32% (ORs less than 0.7251) decrease in COVID-19 severity risk and a relative 22% (ORs less than 0.8004) decrease in COVID-19 hospitalization risk per 1-SD increase in the 25(OH)D concentration.
There is no doubt that immune modulation is the fulcrum of most diseases, and COVID-19 is no exception. Vitamin D has many mechanisms to modulate immunity, such as physical barriers and innate and adaptive immunity, to reduce the risk of microbial and viral infection [36, 37]. Vitamin D was reported to help maintain tight junctions, gap junctions, and adherens junctions to resist infections . It can enhance cellular innate immunity through the release of antimicrobial peptides, such as human cathelicidin , which exhibits direct antimicrobial activities by killing invading pathogens by perturbing the cell membranes and viral envelope, such as SARS-CoV-2 . Vitamin D could also activate T cells, B cells, dendritic cells and macrophages by the vitamin D receptor (VDR) expressed by most immune cells . These cells are able to produce a wide repertoire of responses that ultimately determine the nature and duration of the immune responses. Moreover, vitamin D exerts an inhibitory, anti-inflammatory action on the adaptive immune system. Vitamin D can reduce the production of proinflammatory T helper type 1 (Th1) cytokines, such as tumor necrosis factor α and interferon γ, and increase the expression of anti-inflammatory cytokines by macrophages . It can also promote cytokine production by T helper type 2 (Th2) cells, which helps enhance the suppression of Th1 cells indirectly .
In this MR study, we used the strong IVs from the summary statistics of the largest GWAS conducted for vitamin D and COVID-19 phenotypes in European populations. We aimed to determine whether the relationship between 25(OH)D and COVID-19 was causal by employing a range of two-sample MR methods. We employed a range of methods known to control for pleiotropy, checked the heterogeneity and obtained highly consistent results. Pleiotropic effects were detected by using the MR-Egger intercept and MR-PRESSO method. Using the MR design, we could mitigate the confounding factors due to the application of Mendel’s second law of the random assortment of alleles. Reverse causality was also prevented because genetic variants were fixed at conception and cannot be affected by disease processes. We used the PhenoScanner tool to detect potential pleiotropic SNPs. Some SNPs were found to be associated with hematological parameters. We performed univariable MR and found the causal association between the monocyte percentage of white cells and COVID-19 severity risk. we removed the SNPs associated with the monocyte percentage of white cells in the outlier removed sets and re-performed the MR analysis. The results above showed that the presence of pleiotropic SNPs was minimal. Taken together, our MR results did not support the evidence that the serum 25(OH)D concentration was a causal factor for COVID-19 susceptibility, severity or hospitalization traits. Based on the evidence above, our study did not recommend the vitamin D supplementation during the pandemic.
This is the first study to illustrate the causal relationship between the serum 25(OH)D concentration and COVID-19 using MR analysis. Multiple published traditional epidemiology studies have demonstrated the role of vitamin D supplementation in reducing the risk of COVID-19 and its severity. However, the results were inconsistent and likely to be confounded by multiple unmeasured or improperly controlled variables. Hastie et al. [43, 44] used individual data from UK Biobank participants to correlate historical vitamin D levels checked between 2006 and 2010 with risk for COVID-19 positivity. They found that the 25(OH)D concentration was univariably associated with severe COVID-19 and mortality, but statistical significance was lost after adjustment for confounders. A retrospective, observational analysis including 191,779 patients from across all US states used data from tests performed at a national clinical laboratory . They found a strong association between lower 25(OH)D concentrations and an increased rate of SARS-CoV-2 positivity. This remained significant after adjustment for sex, age, latitude and ethnicity. A retrospective cohort from Chicago included 489 patients tested for COVID-19 who had a vitamin D level checked in the year before testing. The relative risk of testing positive for COVID-19 was 1.77 times greater for patients with likely deficient vitamin D status (25-hydroxycholecalciferol < 20 ng/ml) compared with patients with likely sufficient vitamin D status, a difference that was statistically significant . Carpagnano et al.  analyzed the demographic, clinical and laboratory data of 42 patients with acute respiratory failure due to COVID-19 in Italy. They reported that patients with severe vitamin D deficiency had a significantly higher mortality risk. However, the small sample size and low follow-up of enrolled patients might limit the power of the study. Despite major efforts to control for confounding, such observational studies still remained prone to residual confounding by uncontrolled or imperfectly measured covariates. Additionally, collider bias could arise when researchers restrict analyses on a collider variable in observational studies. The collider bias referred to restricting analyses to some people who experienced a specific event, such as hospitalization with COVID-19, or volunteered their participation in a large-scale study. The association effects in the specific sample will not be a reliable indication of the individual-level causal effects. Therefore, collider bias caused associations to fail to generalize beyond the sample and for causal inferences to be inaccurate within the sample . Reverse causal effects would also violate the results of observational studies. Vitamin D binding protein was reported as a positive acute phase reactant after infections, which might lower the level of 25(OH)D. A decrease in 25(OH)D may also be the result of COVID-19 .
In one quasi-experimental study, Annweiler et al.  included sixty-six residents with COVID-19 from a French nursing home. They reported that vitamin D supplementation represented an effective, accessible and well-tolerated treatment for COVID-19 to reduce the severe COVID-19 cases and improve survival rates. To the best of our knowledge, most RCTs of vitamin D in the community are unlikely to be complete until spring 2021 , although we noted the positive results of a randomized trial  reporting that the administration of a high dose of calcifediol or 25(OH)D significantly reduced the need for intensive care unit (ICU) treatment of patients requiring hospitalization due to COVID-19. However, imperfect double-blinding, uneven distribution and imperfect control for potential confounders were limitations of this trial. Well-designed studies are paramount for more fully exploring and defining the role of vitamin D supplementation in preventing the occurrence and severe course of COVID-19.
Despite the MR design being less susceptible to confounding than other observational studies, limitations still exist. First, potential pleiotropy is the common limitation on all MRs, and our results may still have been affected by unmeasured horizontal pleiotropy. To assess this bias, we assessed potential pleiotropy using the MR-Egger method and MR-PRESSO method. We also used MR Steiger filtering and the PhenoScanner tool and observed no evidence that pleiotropic SNPs existed. Hence, whereas the risk of a residual horizontal pleiotropic effect cannot be ruled out, we still believe it is unlikely to change the conclusions of this study in a clinically meaningful way. Second, the low power of MR might be the reason for null results rather than a true null. However, our study had 80% power to detect the decreased risk of COVID-19 susceptibility, severity and hospitalization traits with ORs of 0.89, 0.72 and 0.80 per SD increase in the serum 25(OH)D concentration. Third, genetic associations may be affected by population stratification, which is another potential bias-inducing factor for MR analyses since the MAF differences among different ancestries and may differ between the populations in the exposure and outcome GWAS. The limitation was minimized by utilizing genetic associations from studies mainly of people of European ancestry with genomic control in our study. However, extending our results to people outside Europe should proceed with caution. Lastly, the MR study only tested the linear effect of serum vitamin D concentration in the general population. Future studies may be designed to comprehensively evaluate any nonlinear relationships between the vitamin D concentration and COVID-19 traits.
In summary, we provided precise evidence that genetically lowered serum 25(OH)D concentrations were not causally associated with COVID-19 susceptibility, severity or hospitalization traits. In balance, the current evidence does not support the need to take the vitamin D supplements in order to reduce the risk of COVID-19 susceptibility, severity and hospitalization. We are looking forward to the results of ongoing well-designed randomized controlled trials providing more evidence concerning the role of vitamin D in the prevention and treatment of COVID-19.
Availability of data and materials
All data generated or analyzed during this study are included in this published article and its additional material.
Coronavirus disease 2019
Genome-wide association studies
Inverse variance weighted
MR Robust Adjusted Profile Score
MR Pleiotropy RESidual Sum and Outlier
Severe acute respiratory syndrome coronavirus 2
World Health Organization
National Institute for Health and Care Excellence
Randomized controlled trials
Single nucleotide polymorphism
Minor allele frequency
- COVID-19 HGI:
COVID-19 Host Genetics Initiative
International Classification of Diseases
Instrument strength is independent of direct effect
Vitamin D receptor
T helper type 1
T helper type 2
Intensive care unit
Wiersinga W, Rhodes A, Cheng A, Peacock S, Prescott H. Pathophysiology, transmission, diagnosis, and treatment of Coronavirus disease 2019 (COVID-19): a review. JAMA. 2020;324:782–93.
WHO. Coronavirus disease 2019 (COVID-19) situation report as of 10:25 am CET, 12 February 2021. https://covid19.who.int. Accessed 12 Feb 2021.
Grant WB, Lahore H, McDonnell SL, Baggerly CA, French CB, Aliano JL, et al. Evidence that vitamin D supplementation could reduce risk of influenza and COVID-19 infections and deaths. Nutrients. 2020;12:988.
London: National Institute for Health and Care Excellence (UK). COVID-19 rapid guideline: vitamin D. https://www.nice.org.uk/covid-19. Accessed 15 Feb 2021.
Guidance on vitamin D supplementation for clinically extremely vulnerable groups. https://www.gov.uk/government/publications/vitamin-d-for-vulnerable-groups/vitamin-d-and-clinically-extremely-vulnerable-cev-guidance. Accessed 17 Feb 2021.
Holick M. Vitamin D deficiency. N Engl J Med. 2007;357:266–81.
Bouillon R, Marcocci C, Carmeliet G, Bikle D, White J, Dawson-Hughes B, Lips P, Munns C, Lazaretti-Castro M, Giustina A, et al. Skeletal and extraskeletal actions of vitamin D: current evidence and outstanding questions. Endocr Rev. 2019;40:1109–51.
Lordan R. Notable developments for vitamin D amid the COVID-19 pandemic, but caution warranted overall: a narrative review. Nutrients. 2021;13:740.
Martineau A, Jolliffe D, Hooper R, Greenberg L, Aloia J, Bergman P, Dubnov-Raz G, Esposito S, Ganmaa D, Ginde A, et al. Vitamin D supplementation to prevent acute respiratory tract infections: systematic review and meta-analysis of individual participant data. BMJ. 2017;356:i6583.
Pham H, Waterhouse M, Baxter C, Duarte Romero B, McLeod D, Armstrong B, Ebeling P, English D, Hartel G, Kimlin M, et al. The effect of vitamin D supplementation on acute respiratory tract infection in older Australian adults: an analysis of data from the D-Health Trial. Lancet Diabetes Endocrinol. 2021;9:69–81.
Camargo CAJ, Martineau A. Vitamin D to prevent COVID-19: recommendations for the design of clinical trials. FEBS J. 2020;287:3689–92.
Emdin C, Khera A, Kathiresan S. Mendelian randomization. JAMA. 2017;318:1925–6.
Lawlor D, Harbord R, Sterne J, Timpson N, Davey SG. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med. 2008;27:1133–63.
Hemani G, Zheng J, Elsworth B, Wade K, Haberland V, Baird D, Laurin C, Burgess S, Bowden J, Langdon R, et al. The MR-Base platform supports systematic causal inference across the human phenome. Elife. 2018;7:e34408.
Machiela M, Chanock S. LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants. Bioinformatics. 2015;31:3555–7.
Burgess S, Small D, Thompson S. A review of instrumental variable estimators for Mendelian randomization. Stat Methods Med Res. 2017;26:2333–55.
Staley J, Blackshaw J, Kamat M, Ellis S, Surendran P, Sun B, Paul D, Freitag D, Burgess S, Danesh J, et al. PhenoScanner: a database of human genotype-phenotype associations. Bioinformatics. 2016;32:3207–9.
Kamat M, Blackshaw J, Young R, Surendran P, Burgess S, Danesh J, Butterworth A, Staley J. PhenoScanner V2: an expanded tool for searching human genotype-phenotype associations. Bioinformatics. 2019;35:4851–3.
Revez J, Lin T, Qiao Z, Xue A, Holtz Y, Zhu Z, Zeng J, Wang H, Sidorenko J, Kemper K, et al. Genome-wide association study identifies 143 loci associated with 25 hydroxyvitamin D concentration. Nat Commun. 2020;11:1647.
Jiang X, O’Reilly P, Aschard H, Hsu Y, Richards J, Dupuis J, Ingelsson E, Karasik D, Pilz S, Berry D, et al. Genome-wide association study in 79,366 European-ancestry individuals informs the geneticarchitecture of 25-hydroxyvitamin D levels. Nat Commun. 2018;9:260.
COVID-19 Host Genetics Initiative. The COVID-19 Host Genetics Initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic. Eur J Hum Genet. 2020;28:715–8.
Burgess S, Butterworth A, Thompson S. Mendelian randomization analysis with multiple genetic variants using summarized data. Genet Epidemiol. 2013;37:658–65. https://doi.org/10.1002/gepi.21758.
Bowden J, Del Greco M, Minelli C, Davey Smith G, Sheehan N, Thompson J. Assessing the suitability of summary data for two-sample Mendelian randomization analyses using MR-Egger regression: the role of the I2 statistic. Int J Epidemiol. 2016;45:1961–74.
Burgess S, Thompson SG. Interpreting findings from Mendelian randomization using the MR-Egger method. Eur J Epidemiol. 2017;32:377–89.
Bowden J, Davey Smith G, Haycock P, Burgess S. Consistent estimation in Mendelianrandomization with some invalid instruments using a Weighted median estimator. Genet Epidemiol. 2016;40:304–14.
Zhao Q, Chen Y, Wang J, Small D. Powerful three-sample genome-wide design and robust statistical inference in summary-data Mendelian randomization. Int J Epidemiol. 2019;48:1478–92.
Zheng J, Brion M, Kemp J, Warrington N, Borges M, Hemani G, Richardson T, Rasheed H, Qiao Z, Haycock P, et al. The effect of plasma lipids and lipid-lowering interventions on bone mineral density: a Mendelian randomization study. J Bone Miner Res. 2020;35:1224–35.
Hemani G, Tilling K, Davey Smith G. Orienting the causal relationship between impreciselymeasured traits using GWAS summary data. PLoS Genet. 2017;13:e1007081.
Verbanck M, Chen C, Neale B, Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet. 2018;50:693–8.
Brion M, Shakhbazov K, Visscher P. Calculating statistical power in Mendelian randomization studies. Int J Epidemiol. 2013;42:1497–501.
Pierce B, Burgess S. Efficient design for Mendelian randomization studies: subsample and 2-sample instrumental variable estimators. Am J Epidemiol. 2013;178:1177–84.
Huang C, Wang Y, Li X, Ren L, Zhao J, Hu Y, Zhang L, Fan G, Xu J, Gu X, et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet. 2020;395:497–506.
Zhang J, Dong X, Cao Y, Yuan Y, Yang Y, Yan Y, Akdis C, Gao Y. Clinical characteristics of 140patients infected with SARS-CoV-2 in Wuhan, China. Allergy. 2020;75:1730–41.
Zhou F, Yu T, Du R, Fan G, Liu Y, Liu Z, Xiang J, Wang Y, Song B, Gu X, et al. Clinical course andrisk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet. 2020;395:1054–62.
Astle WJ, Elding H, Jiang T, Allen D, Ruklisa D, Mann AL, Mead D, Bouman H, Riveros-Mckay F, Kostadima MA, et al. The allelic landscape of human blood cell trait variation and links to common complex disease. Cell. 2016;167:1415–29.
Gombart A, Pierre A, Maggini S. A review of micronutrients and the immune system-working in harmony to reduce the risk of infection. Nutrients. 2020;12:236.
Bilezikian J, Bikle D, Hewison M, Lazaretti-Castro M, Formenti A, Gupta A, Madhavan M, Nair N, Babalyan V, Hutchings N, et al. Mechanisms in endocrinology: vitamin D and COVID-19. Eur J Endocrinol. 2020;183:R133–47.
Schwalfenberg G. A review of the critical role of vitamin D in the functioning of the immune system and the clinical implications of vitamin D deficiency. Mol Nutr Food Res. 2011;55:96–108.
Liu P, Stenger S, Li H, Wenzel L, Tan B, Krutzik S, Ochoa M, Schauber J, Wu K, Meinken C, et al. Toll-like receptor triggering of a vitamin D-mediated human antimicrobial response. Science. 2006;311:1770–3.
Ahmed A, Siman-Tov G, Hall G, Bhalla N, Narayanan A. Human antimicrobial peptides as therapeutics for viral infections. Viruses. 2019;11:704.
Cantorna M, Snyder L, Lin Y, Yang L. Vitamin D and 1,25(OH)2D regulation of T cells. Nutrients. 2015;7:3011–21.
Jeffery L, Burke F, Mura M, Zheng Y, Qureshi O, Hewison M, Walker L, Lammas D, Raza K, Sansom DM. 1,25-Dihydroxyvitamin D3 and IL-2 combine to inhibit T cell production of inflammatory cytokines and promote development of regulatory T cells expressing CTLA-4 and FoxP3. J Immunol. 2009;183:5458–67.
Hastie C, Pell J, Sattar N. Vitamin D and COVID-19 infection and mortality in UK Biobank. Eur JNutr. 2021;60:545–8.
Hastie C, Mackay D, Ho F, Celis-Morales C, Katikireddi S, Niedzwiedz C, Jani B, Welsh P, Mair F, Gray S, et al. Vitamin D concentrations and COVID-19 infection in UK Biobank. Diabetes Metab Syndr. 2020;14:561–5.
Kaufman H, Niles J, Kroll M, Bi C, Holick MF. SARS-CoV-2 positivity rates associated with circulating 25-hydroxyvitamin D levels. PLoS ONE. 2020;15:e239252.
Meltzer DO, Best TJ, Zhang H, Vokes T, Arora V, Solway J. Association of vitamin D status and other clinical characteristics with COVID-19 test results. JAMA Netw Open. 2020;3:e2019722.
Carpagnano G, Di Lecce V, Quaranta V, Zito A, Buonamico E, Capozza E, Palumbo A, Di Gioia G, Valerio V, Resta O. Vitamin D deficiency as a predictor of poor prognosis in patients with acute respiratory failure due to COVID-19. J Endocrinol Invest. 2020;44:765–71.
Griffith G, Morris T, Tudball M, Herbert A, Mancano G, Pike L, Sharp G, Sterne J, Palmer T, Davey Smith G, et al. Collider bias undermines our understanding of COVID-19 disease risk and severity. Nat Commun. 2020;11:5749.
Silva M, Furlanetto T. Does serum 25-hydroxyvitamin D decrease during acute-phase response? A systematic review. Nutr Res. 2015;35:91–6.
Annweiler C, Hanotte B, Grandin D, L’Eprevier C, Sabatier J, Lafaie L, Célarier T. Vitamin D and survival in COVID-19 patients: A quasi-experimental study. J Steroid Biochem Mol Biol. 2020;204:105771.
Chakhtoura M, Napoli N, El Hajj FG. Commentary: myths and facts on vitamin D amidst the COVID-19 pandemic. Metabolism. 2020;109:154276.
Entrenas C, Entrenas C, Vaquero B, Alcalá D, López M, Bouillon R, Quesada G. Effect ofcalcifediol treatment and best available therapy versus best available therapy on intensive care unit admission and mortality among patients hospitalized for COVID-19: a pilot randomized clinical study. J Steroid Biochem Mol Biol. 2020;203:105751.
The authors wish to thank Dr Siying Zhuang for the support in the preparation of this manuscript.
This work was supported by grants from Beijing Municipal Science and Technology Commission (Z181100001718195).
Ethics approval and consent to participate
This analysis of publicly available data does not require ethical approval.
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1: Table S1.
The study details included in COVID-19 HGI GWAS meta-analyses of COVID-19 susceptibility.
Additional file 2: Table S2.
The study details included in COVID-19 HGI GWAS meta-analyses of COVID-19 severity.
Additional file 3: Table S3.
The study details included in COVID-19 HGI GWAS meta-analyses of COVID-19 hospitalization.
Additional file 4: Table S4.
Characteristics of SNPs in outlier removed sets used in Mendelian randomization analysis of effect of the serum 25(OH)D concentration on COVID-19 susceptibility risk.
Additional file 5: Table S5.
Characteristics of SNPs in outlier removed sets used in Mendelian randomization analysis of effect of the serum 25(OH)D concentration on COVID-19 severity risk.
Additional file 6: Table S6.
Characteristics of SNPs in outlier removed sets used in Mendelian randomization analysis of effect of serum 25(OH)D concentration on COVID-19 hospitalization risk.
Additional file 7: Table S7.
MR results between the serum 25(OH)D concentration and COVID-19 phenotypes from sensitivity analysis with linkage-disequilibrium coefficient (R2) < 0.001.
Additional file 8: Table S8.
MR results between the serum 25(OH)D concentration and COVID-19 phenotypes from sensitivity analysis with linkage-disequilibrium coefficient (R2) < 0.01.
Additional file 9: Table S9.
"Leave-one-out" analysis of causal associations between the serum 25(OH)D concentration and COVID-19 susceptibility risk using outlier removed sets based on IVW-random method.
Additional file 10: Table S10.
"Leave-one-out" analysis of causal associations between the serum 25(OH)D concentration and COVID-19 severity risk using outlier removed sets based on IVW-random method.
Additional file 11: Table S11.
"Leave-one-out" analysis of causal associations between the serum 25(OH)D concentration and COVID-19 hospitalization risk using outlier removed sets based on IVW-random method.
Additional file 12: Table S12.
Univariable MR results of hematological traits and COVID-19 susceptibility risk.
Additional file 13: Table S13.
Univariable MR results of hematological traits and COVID-19 severity risk.
Additional file 14: Table S14.
Univariable MR results of hematological traits and COVID-19 hospitalization risk.
Additional file 15: Table S15.
Univariable Mendelian randomization of the causal effect of the serum 25(OH)D concentration on three COVID-19 phenotypes after removing the SNPs associated with hematological traits.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Cui, Z., Tian, Y. Using genetic variants to evaluate the causal effect of serum vitamin D concentration on COVID-19 susceptibility, severity and hospitalization traits: a Mendelian randomization study. J Transl Med 19, 300 (2021). https://doi.org/10.1186/s12967-021-02973-5
- Vitamin D
- Mendelian randomization