Simultaneous diagnosis of tuberculous pleurisy and malignant pleural effusion using metagenomic next-generation sequencing (mNGS)
Journal of Translational Medicine volume 21, Article number: 680 (2023)
Metagenomic next-generation sequencing (mNGS) has become a powerful tool for pathogen detection, but the value of human sequencing reads generated from it is underestimated.
A total of 138 patients with pleural effusion (PE) were diagnosed with tuberculous pleurisy (TBP, N = 82), malignant pleural effusion (MPE, N = 35), or non-TB infection (N = 21), and PE samples from all of them underwent mNGS analysis. Clinical TB tests, including culture, the Acid-Fast Bacillus (AFB) test, Xpert, and T-SPOT, were performed. To utilize mNGS for MPE identification, 25 non-MPE samples (20 TBP and 5 non-TB infection) were randomly selected to set the human chromosome copy number baseline, and generalized linear modeling was performed using copy number variant (CNV) features of the remaining 113 samples (35 MPE and 78 non-MPE).
The performance of TB detection was compared among five methods. T-SPOT demonstrated the highest sensitivity (61% vs. culture 32%, AFB 12%, Xpert 35%, and mNGS 49%) but also the highest false-positive rate (10%). In contrast, mNGS detected the TB genome in nearly half (40/82) of the PE samples from the TBP subgroup, with 100% specificity. To evaluate the performance of using CNV features of the human genome for MPE prediction, we performed leave-one-out cross-validation (LOOCV) in the subcohort that excluded the 25 non-MPE samples used for setting copy number standards, which demonstrated 51.4% sensitivity, 80.8% specificity, 71.7% accuracy, and an AUC of 0.851.
In summary, we exploited the value of human and non-human sequencing reads generated from mNGS, which showed promising ability in simultaneously detecting TBP and MPE.
Clinically, patients with pleural effusions (PE) are commonly suspected of having malignant neoplasms or infectious diseases, e.g., tuberculous pleurisy (TBP) [1, 2]. Current clinical diagnostic methods for tuberculosis (TB) infection mainly include microbial culture, the Acid-Fast Bacillus (AFB) test, the Xpert MTB/RIF (Xpert) assay, and the T-SPOT.TB test (T-SPOT). However, the diagnosis of TBP remains difficult because each approach has drawbacks. For instance, TB culturing has very high specificity but requires a long processing time (up to weeks). The AFB test is fast, but its sensitivity is only around 30% and its ability to differentiate between TB and non-TB infection is limited. The Xpert assay is recommended by the World Health Organization, but its diagnostic sensitivity is suboptimal. Therefore, the development of optimized TB-detection assays is warranted. Metagenomic next-generation sequencing (mNGS) has become a powerful tool for broad pathogen detection, and multiple studies evaluating its diagnostic value in TBP have reported higher sensitivity than conventional clinical approaches [4, 7, 8].
The identification of malignant PE (MPE) currently relies mainly on pathological and cytologic examinations, which have limited diagnostic sensitivity. Genome instability, considered an important genetic marker of malignant neoplasms, has been studied widely using various approaches, such as whole-genome sequencing and fluorescent in situ hybridization [10,11,12]. Because the large number of human reads sequenced by mNGS are usually discarded without further interpretation, several studies have explored the possibility of repurposing mNGS-derived human reads for copy number variant (CNV) analysis and cancer identification [13,14,15]. Herein, by taking advantage of both human and microbial sequencing reads, we evaluated the diagnostic performance of mNGS for simultaneously identifying TBP and MPE in this retrospective study.
Patients and study design
A total of 138 patients with PE who were diagnosed with TBP or other pathogen infections or MPE were enrolled in this study at Beijing Chest Hospital from June 2020 to July 2022. Patients’ demographic characteristics, clinical laboratory results, imaging data, and other medical records were retrospectively reviewed. This study was approved by the Institutional Review Board of Beijing Chest Hospital (Approval ID: 2021LSKY-58). All samples were obtained with the patient’s consent.
Routine TB detection
Microbial culture using the MGIT 960 system (Becton Dickinson, Sparks, MD, USA), AFB with Ziehl–Neelsen stain (BASO, Zhuhai, China), Xpert on the GeneXpert system (Cepheid, Sunnyvale, CA, USA), and the T-SPOT assay (Oxford Immunotec Ltd., Abingdon, UK) were routinely performed by the Department of Pathology for TB detection with PE, sputum, and/or bronchoalveolar lavage fluid (BALF) samples, according to the standard procedures and manufacturers' protocols. Patients were assigned to the TBP-positive subgroup if they (1) had a positive TB culture or Xpert result (defined as the test-defined TBP subgroup), which represents the gold standard of TB diagnosis according to the WHO guidelines [16, 17]; or (2) were diagnosed on the basis of a comprehensive evaluation of clinical manifestations, auxiliary test results (including AFB, T-SPOT, and mNGS), and outcome assessment after TB drug administration (defined as the comprehensive diagnosis TBP subgroup).
Malignant tumor identification
The diagnosis of MPE was confirmed by pathological examinations with either tissue biopsies or PE sediment specimens using hematoxylin and eosin stain for histomorphology.
Patients with non-TB infection either had a positive laboratory culture or mNGS result for a non-TB pathogen, or were diagnosed on the basis of a comprehensive evaluation of clinical manifestations and outcome assessment after non-TB drug administration.
mNGS for TB detection
DNA was extracted from PE samples using the QIAamp DNeasy Blood & Tissue Kit (Qiagen). DNA libraries were constructed using the KAPA Hyper Prep kit (KAPA Biosystems) according to the manufacturer's protocols and sequenced on an Illumina NovaSeq (Illumina). The basic procedure of mNGS is illustrated in Fig. 1A.
The bioinformatic process for pathogen detection in this mNGS pipeline was described in previous studies [18, 19]. In brief, quality control for sequencing reads was conducted by removing low-quality reads, adapter sequences, and duplicated or short (< 36 bp) reads. The remaining qualified reads were first mapped to the human reference genome (hs37d5) using bowtie2, and the non-human reads were then aligned to the microorganism genome database for pathogen identification. A sample was identified as TB-positive if at least three non-overlapping reads mapped to the TB genome and its TB read count exceeded tenfold that of the no-template control.
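The TB-positive criterion described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, the inputs are taken to be raw counts of non-overlapping reads mapped to the TB genome, and "over tenfold" is read as strictly greater than ten times the control count. The text does not say whether counts are normalized to sequencing depth before comparison, so no normalization is applied here.

```python
def is_tb_positive(sample_tb_reads: int,
                   ntc_tb_reads: int,
                   min_reads: int = 3,
                   min_fold: float = 10.0) -> bool:
    """Apply the two-part TB-positive rule to one sample.

    sample_tb_reads: non-overlapping reads mapped to the TB genome (assumed raw count)
    ntc_tb_reads:    TB reads observed in the no-template control (assumed raw count)
    """
    # Criterion 1: at least `min_reads` non-overlapping TB reads.
    if sample_tb_reads < min_reads:
        return False
    # Criterion 2: strictly more than `min_fold` times the control count.
    # With a control count of zero, any sample passing criterion 1 qualifies.
    return sample_tb_reads > min_fold * ntc_tb_reads


if __name__ == "__main__":
    for sample, ntc in [(3, 0), (2, 0), (10, 1), (11, 1)]:
        print(sample, ntc, is_tb_positive(sample, ntc))
```

Requiring both an absolute read floor and a fold-change over the control guards against calling a sample positive on background contamination alone.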
mNGS-derived CNV for identifying malignant PE
Sequencing reads that mapped to the human genome were used for genome copy number analysis using the software WisecondorX. We randomly selected 25 non-malignant PE samples (20 TBP and 5 non-TB infection) to serve as the human genome copy number baseline for identifying CNV features in the remaining 113 PE samples. CNV features present in fewer than 20% of samples were excluded, and the remaining 2662 CNV features were used for malignancy prediction by generalized linear modeling (GLM, h2o.glm function in R). Model performance was evaluated by leave-one-out cross-validation (LOOCV, pROC package in R).
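The feature filtering and LOOCV steps can be illustrated with a minimal Python sketch. It is not the authors' R workflow: the helper names are hypothetical, CNV features are assumed to be encoded as 0/1 presence flags per sample, and a deliberately simple stand-in classifier (a threshold on the number of altered features) replaces the GLM so that the example stays self-contained.

```python
from typing import Callable, Dict, List, Sequence


def filter_features(presence: Dict[str, Sequence[int]],
                    min_fraction: float = 0.20) -> List[str]:
    """Keep features present in at least `min_fraction` of samples."""
    kept = []
    for name, flags in presence.items():
        if len(flags) == 0:
            continue
        fraction = sum(1 for f in flags if f) / len(flags)
        if fraction >= min_fraction:
            kept.append(name)
    return kept


def fit_burden_threshold(X: List[List[int]], y: List[int]) -> float:
    """Stand-in model (NOT a GLM): midpoint between class-mean CNV burdens."""
    pos = [sum(row) for row, label in zip(X, y) if label == 1]
    neg = [sum(row) for row, label in zip(X, y) if label == 0]
    if not pos or not neg:
        raise ValueError("training fold needs both classes")
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_burden(threshold: float, row: List[int]) -> int:
    """Predict malignant (1) when the CNV burden reaches the threshold."""
    return 1 if sum(row) >= threshold else 0


def leave_one_out(X: List[List[int]], y: List[int],
                  fit: Callable, predict: Callable) -> List[int]:
    """Hold out each sample once, train on the rest, predict the held-out one."""
    predictions = []
    for i in range(len(X)):
        train_X = [row for j, row in enumerate(X) if j != i]
        train_y = [label for j, label in enumerate(y) if j != i]
        model = fit(train_X, train_y)
        predictions.append(predict(model, X[i]))
    return predictions


if __name__ == "__main__":
    toy_X = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [0, 1, 0]]  # toy data, not study data
    toy_y = [1, 1, 0, 0]
    print(leave_one_out(toy_X, toy_y, fit_burden_threshold, predict_burden))
```

Because every sample is predicted by a model that never saw it, LOOCV gives an out-of-sample estimate without setting aside a separate test cohort, which matters when the cohort is small.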
From June 2020 to July 2022, a total of 138 patients were enrolled in this study, 82 of whom were diagnosed with TBP, 21 with non-TB infection, and 35 having MPE. The clinical characteristics of patients were summarized in Table 1 and the detailed clinical and diagnostic information of each patient including final diagnosis and test results were provided in Additional file 1: Table S1. The median age for the entire cohort was 58 years old, ranging from 19 to 92, and over two-thirds (95/138) were male. Underlying diseases such as diabetes, hypertension, liver diseases, etc., were reported in approximately 60% (84/138) of patients. Blood tests for white blood cell count, plateletcrit (PCT), and C-reactive protein (CRP) levels were routinely performed.
TB-detection performance comparison between mNGS and clinical tests
In this study, multiple clinical tests, including culture, AFB, Xpert, and T-SPOT, as well as mNGS using PE samples, were performed for TB detection. Owing to the retrospective nature of the study, the results of culture, AFB, T-SPOT, and Xpert were undetermined in 21, 24, 40, and 18 patients, respectively (Additional file 1: Table S1; Table 2). In the TB-positive subgroup (N = 82), 45% of patients (37/82) were classified as test-defined TBP because they had a positive TB culture or Xpert result, while the remaining 45 TBP patients (55%) were diagnosed based on comprehensive clinical evidence (see Methods). The T-SPOT assay demonstrated the highest positive detection rate (61%) among all clinical tests (culture 32%, Xpert 35%, and AFB 12%), which was also slightly higher than that of mNGS (49%, Fig. 2). Notably, no false positive TB-detection events were observed in the TB-negative subgroup (N = 56) using mNGS, culture, or Xpert assays, whereas the false positive rate of the T-SPOT assay reached 11%, well above that of the other approaches (AFB: 2%).
Among the 52 TB-positive patients whose test results were available for all five TB-detection methods, only five (9.6%) showed consistently positive results on all tests. Approximately 63.5% (33/52) had at least two positive results from the five methods (Fig. 2).
mNGS CNV modeling for identifying malignant PE
To take advantage of the human genome sequencing reads obtained from mNGS, we developed an mNGS-CNV pipeline to assess the genome copy number along the chromosomes. As described in the Methods section, 25 non-MPE samples were randomly chosen as the baseline to normalize chromosome copy number in the remaining 113 patients (35 MPE and 78 non-MPE). As shown in Fig. 3A, CNV events (both copy number gain and loss) were frequently observed in the representative patient with MPE.
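The idea of normalizing a sample's copy number against a baseline panel can be illustrated with a short sketch. It is not WisecondorX's algorithm, whose normalization is more elaborate; it only assumes that reads have been counted in fixed genomic bins (a hypothetical input format) and reports, for each bin, the log2 ratio of the sample's read fraction to the mean read fraction across the baseline samples.

```python
import math
from typing import List, Optional, Sequence


def read_fractions(counts: Sequence[int]) -> List[float]:
    """Convert per-bin read counts into fractions of the sample's total reads."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [c / total for c in counts]


def log2_ratios(sample_counts: Sequence[int],
                baseline_counts: Sequence[Sequence[int]]) -> List[Optional[float]]:
    """Per-bin log2(sample fraction / mean baseline fraction).

    Bins with a zero sample or baseline fraction return None, because the
    log ratio is undefined there.
    """
    sample = read_fractions(sample_counts)
    baseline = [read_fractions(b) for b in baseline_counts]
    ratios: List[Optional[float]] = []
    for j, fraction in enumerate(sample):
        reference = sum(b[j] for b in baseline) / len(baseline)
        if fraction == 0 or reference == 0:
            ratios.append(None)
        else:
            ratios.append(math.log2(fraction / reference))
    return ratios


if __name__ == "__main__":
    # Toy counts for three bins; values are illustrative, not study data.
    baseline = [[100, 100, 100], [100, 100, 100]]
    print(log2_ratios([100, 200, 100], baseline))
```

In this toy case the middle bin has a positive log2 ratio (relative gain) and the flanking bins have negative ratios (relative loss), which is the kind of signal a CNV caller segments into gain and loss events.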
GLM was performed to construct a prediction model using the filtered CNV features (frequency ≥ 20%), the predictive power of which was evaluated by LOOCV. Compared to the clinical pathology diagnosis, the mNGS-CNV modeling demonstrated 51.4% sensitivity, 80.8% specificity, and 71.7% accuracy (Fig. 3B, C), with an area under the curve (AUC) of 0.581 based on the receiver-operating characteristic (ROC) curve (Fig. 3D).
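The reported sensitivity, specificity, and accuracy can be reproduced from a confusion matrix. The counts below are an assumption: they are not stated in the text but are the integer counts consistent with the reported percentages and the group sizes (35 MPE and 78 non-MPE samples).

```python
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Compute sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all MPE samples
        "specificity": tn / (tn + fp),   # true negatives among all non-MPE samples
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


if __name__ == "__main__":
    # Inferred counts (assumption): 18 of 35 MPE and 63 of 78 non-MPE classified correctly.
    metrics = diagnostic_metrics(tp=18, fn=17, tn=63, fp=15)
    for name, value in metrics.items():
        print(f"{name}: {value * 100:.1f}%")
```

With these counts, 18/35, 63/78, and 81/113 round to 51.4%, 80.8%, and 71.7%, matching the values given for the LOOCV evaluation.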
In this retrospective study, we explored the diagnostic utility of mNGS in detecting TBP and MPE simultaneously using a single PE sample. In terms of TB diagnostic performance, mNGS produced a sensitivity of 49% and a specificity of 100% on PE samples, which was comparable to previous clinical studies [7, 8]. Shi et al. reported that mNGS on BALF samples had the best diagnostic performance (sensitivity 47.9%) compared to conventional microbiological tests (sensitivity from 29.2% to 46.8%) with BALF or sputum samples. Another prospective study using various clinical samples (BALF, PE, cerebrospinal fluid, ascites, etc.) demonstrated an overall sensitivity of 44% and a specificity of 98% for mNGS across all sample types. That study also noted that positive blood T-SPOT results were observed in 82% of patients with active TB infection and 33% of those without. The relatively high false positive rate of T-SPOT makes it unsuitable as a stand-alone tool for diagnosing TB infection, although it could serve as a complementary diagnostic method. In our cohort, T-SPOT produced the highest sensitivity and the lowest specificity among all tested approaches, suggesting the importance of combining multiple methods to detect TB efficiently and accurately in clinical practice. Similarly, AFB alone is not sufficient for TB diagnosis because of its sub-optimal performance [22, 23].
Previous studies have reported genomic instability, with both copy number gain and loss, as a molecular marker of malignant neoplasms, but CNV analysis based on mNGS-derived human reads has been less investigated. With this strategy, pathogen detection and malignancy prediction are carried out simultaneously in a single experiment from sample collection to sequencing, shortening the processing time, which is critical in severe conditions. Herein, we explored the diagnostic performance of mNGS-CNV modeling on MPE prediction, which showed 51.4% sensitivity, 80.8% specificity, and 71.7% accuracy. In contrast, Guo et al. reported higher sensitivity (83.7%) and specificity (97.6%) of mNGS CNV analysis on lung biopsy tissue samples instead of PE. Another study using various body fluids such as BALF, PE, and peritoneal fluid showed that the mNGS-CNV test was able to identify 68% of cancer patients who were negative on conventional tests. Furthermore, mNGS has also been shown to detect central nervous system malignant neoplasms using cerebrospinal fluid, with sensitivity reaching 75% at 100% specificity. Together with our study, these findings suggest that mNGS CNV analysis has potential for predicting malignant neoplasms across diverse sample types. Optimizing the bioinformatic pipeline may further improve the diagnostic performance, but validation in larger cohorts is warranted.
Several limitations of this study need to be noted. First, as this was a retrospective study, the clinical TB detection tests were performed on multiple specimen types, including PE, sputum, and BALF, and their results were undetermined in some patients. Second, owing to the restricted cohort size, we could not split the cohort into training and testing sets for mNGS-CNV modeling, especially after excluding the 25 non-MPE samples used for setting the genome copy number baseline; we therefore performed LOOCV to evaluate the performance of the mNGS-CNV analysis. Lastly, this was a pilot study investigating the potential of repurposing human reads generated from a well-established pathogen-detection mNGS pipeline, without any optimization to better interpret human sequences. We believe that further experimental and bioinformatic improvements, evaluated in an external validation cohort, will increase the sensitivity and specificity.
In conclusion, we presented the possibility of detecting TBP and MPE simultaneously using mNGS on PE specimens, with relatively good diagnostic performance. Our study promoted mNGS as a promising tool for pathogen detection and cancer diagnosis, but prospective clinical studies and large-cohort validation are needed in the future.
Availability of data and materials
The data generated in this study have been uploaded to the NCBI database (Accession: PRJNA944842).
Abbreviations
AUC: Area under the curve
BALF: Bronchoalveolar lavage fluid
CNV: Copy number variant
GLM: Generalized linear modeling
mNGS: Metagenomic next-generation sequencing
MPE: Malignant pleural effusion
LOOCV: Leave-one-out cross-validation
Light RW. Pleural effusions. Med Clin North Am. 2011;95:1055–70.
Jany B, Welte T. Pleural effusion in adults-etiology, diagnosis, and treatment. Dtsch Arztebl Int. 2019;116:377–86.
Udwadia ZF, Sen T. Pleural tuberculosis: an update. Curr Opin Pulm Med. 2010;16:399–406.
Xu P, Yang K, Yang L, et al. Next-generation metagenome sequencing shows superior diagnostic performance in acid-fast staining sputum smear-negative pulmonary tuberculosis and non-tuberculous mycobacterial pulmonary disease. Front Microbiol. 2022;13: 898195.
Lee HS, Kee SJ, Shin JH, et al. Xpert MTB/RIF Assay as a Substitute for Smear Microscopy in an Intermediate-Burden Setting. Am J Respir Crit Care Med. 2019;199:784–94.
Blauwkamp TA, Thair S, Rosen MJ, et al. Analytical and clinical validation of a microbial cell-free DNA sequencing test for infectious disease. Nat Microbiol. 2019;4:663–74.
Zhou X, Wu H, Ruan Q, et al. Clinical evaluation of diagnosis efficacy of active Mycobacterium tuberculosis complex infection via metagenomic next-generation sequencing of direct clinical samples. Front Cell Infect Microbiol. 2019;9:351.
Shi CL, Han P, Tang PJ, et al. Clinical metagenomic sequencing for diagnosis of pulmonary tuberculosis. J Infect. 2020;81:567–74.
Ferreiro L, Suárez-Antelo J, Valdés L. Pleural procedures in the management of malignant effusions. Ann Thorac Med. 2017;12:3–10.
Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell. 2011;144:646–74.
Zhao EY, Jones M, Jones SJM. Whole-genome sequencing in cancer. Cold Spring Harb Perspect Med. 2019;9: a034579.
Torres-Ruiz R, Grazioso TP, Brandt M, Martinez-Lage M, Rodriguez-Perales S, Djouder N. Detection of chromosome instability by interphase FISH in mouse and human tissues. STAR Protoc. 2021;2: 100631.
Guo Y, Li H, Chen H, et al. Metagenomic next-generation sequencing to identify pathogens and cancer in lung biopsy tissue. EBioMedicine. 2021;73: 103639.
Gu W, Talevich E, Hsu E, et al. Detection of cryptogenic malignancies from metagenomic whole genome sequencing of body fluids. Genome Med. 2021;13:98.
Gu W, Rauschecker AM, Hsu E, et al. Detection of neoplasms by metagenomic next-generation sequencing of cerebrospinal fluid. JAMA Neurol. 2021;78:1355–66.
World Health Organization. Automated real-time nucleic acid amplification technology for rapid and simultaneous detection of tuberculosis and rifampicin resistance: Xpert MTB/RIF assay for the diagnosis of pulmonary and extrapulmonary TB in adults and children: policy update. Geneva, 2013.
WHO consolidated guidelines on tuberculosis: Module 3: Diagnosis - Tests for tuberculosis infection. Geneva, 2022.
Zeng X, Wu J, Li X, et al. Application of metagenomic next-generation sequencing in the etiological diagnosis of infective endocarditis during the perioperative period of cardiac surgery: a prospective cohort study. Front Cardiovasc Med. 2022;9: 811492.
Ren D, Ren C, Yao R, et al. The microbiological diagnostic performance of metagenomic next-generation sequencing in patients with sepsis. BMC Infect Dis. 2021;21:1257.
Lennart R, Annelies D, et al. WisecondorX: improved copy number detection for routine shallow whole-genome sequencing. Nucleic Acids Res. 2018;47:1605.
Jiang J, Shi HZ, Liang QL, Qin SM, Qin XJ. Diagnostic value of interferon-gamma in tuberculous pleurisy: a metaanalysis. Chest. 2007;131:1133–41.
Lipsky BA, Gates J, Tenover FC, Plorde JJ. Factors affecting the clinical value of microscopy for acid-fast bacilli. Rev Infect Dis. 1984;6:214–22.
Gladwin MT, Plorde JJ, Martin TR. Clinical application of the Mycobacterium tuberculosis direct test: case report, literature review, and proposed clinical algorithm. Chest. 1998;114:317–23.
We would like to thank all research staff involved in this study and all participants.
This work was supported by the National Natural Science Foundation of China (Grant Number: 82072381) and Beijing Municipal Science and Technology Project (Grant Number: Z191100006619079).
Ethics approval and consent to participate
This study was approved by the Institutional Review Board of Beijing Chest Hospital (Approval ID: 2021LSKY-58). All participants signed informed consent forms for participation.
Consent for publication
All participants signed informed consent forms for publishing this study.
Yutong Ma, Qiuxiang Ou, and Yaqiong Jia are employees of Dinfectome Inc., Nanjing, Jiangsu, China. The remaining authors have nothing to disclose.
Xu, F., Wang, Q., Zhang, N. et al. Simultaneous diagnosis of tuberculous pleurisy and malignant pleural effusion using metagenomic next-generation sequencing (mNGS). J Transl Med 21, 680 (2023). https://doi.org/10.1186/s12967-023-04492-x