
Simultaneous diagnosis of tuberculous pleurisy and malignant pleural effusion using metagenomic next-generation sequencing (mNGS)



Metagenomic next-generation sequencing (mNGS) has become a powerful tool for pathogen detection, but the value of human sequencing reads generated from it is underestimated.


A total of 138 patients with pleural effusion (PE) were diagnosed with tuberculous pleurisy (TBP, N = 82), malignant pleural effusion (MPE, N = 35), or non-TB infection (N = 21), and PE samples from all of them underwent mNGS analysis. Clinical TB tests, including culture, Acid-Fast Bacillus (AFB) test, Xpert, and T-SPOT, were performed. To utilize mNGS for MPE identification, 25 non-MPE samples (20 TBP and 5 non-TB infection) were randomly selected to set the human chromosome copy number baseline, and generalized linear modeling was performed using copy number variant (CNV) features of the remaining 113 samples (35 MPE and 78 non-MPE).


The performance of TB detection was compared among five methods. T-SPOT demonstrated the highest sensitivity (61% vs. culture 32%, AFB 12%, Xpert 35%, and mNGS 49%) but also the highest false-positive rate (10%). In contrast, mNGS detected the TB genome in nearly half (40/82) of the PE samples from the TBP subgroup, with 100% specificity. To evaluate the performance of human-genome CNV features for MPE prediction, we performed leave-one-out cross-validation (LOOCV) in the subcohort that excluded the 25 non-MPE samples used to set copy number standards, which demonstrated 51.4% sensitivity, 80.8% specificity, 71.7% accuracy, and an AUC of 0.851.


In summary, we exploited the value of human and non-human sequencing reads generated from mNGS, which showed promising ability in simultaneously detecting TBP and MPE.


Clinically, patients with pleural effusions (PE) are commonly suspected of having malignant neoplasms or infectious diseases, e.g., tuberculous pleurisy (TBP) [1, 2]. Current clinical diagnostic methods for tuberculous (TB) infection mainly include microbial culture, the Acid-Fast Bacillus (AFB) test, the Xpert MTB/RIF (Xpert) assay, and the T-SPOT.TB test (T-SPOT). However, the diagnosis of TBP remains difficult because each approach has pros and cons. For instance, TB culture has very high specificity but requires a long processing time (up to weeks) [3]. The AFB test is fast, but its sensitivity is only around 30% and its ability to differentiate between TB and non-TB infection is limited [4]. The Xpert assay is recommended by the World Health Organization, but its diagnostic sensitivity is suboptimal [5]. Therefore, the development of optimized TB-detection assays is warranted. Metagenomic next-generation sequencing (mNGS) has become a powerful tool for broad pathogen detection [6], and multiple studies evaluating its diagnostic value in TBP have reported higher sensitivity than conventional clinical approaches [4, 7, 8].

The identification of malignant PE (MPE) currently relies mainly on pathological and cytologic examinations, which have limited diagnostic sensitivity [9]. Genome instability, considered an important genetic marker of malignant neoplasms, has been studied widely using various approaches, such as whole-genome sequencing and fluorescent in situ hybridization [10,11,12]. Because the large number of human reads sequenced by mNGS are usually discarded without further interpretation, several studies have explored the possibility of repurposing mNGS-derived human reads for copy number variant (CNV) analysis and cancer identification [13,14,15]. Herein, by taking advantage of both human and microbial sequencing reads, we evaluated the diagnostic performance of mNGS for simultaneously identifying TBP and MPE in this retrospective study.


Patients and study design

A total of 138 patients with PE who were diagnosed with TBP or other pathogen infections or MPE were enrolled in this study at Beijing Chest Hospital from June 2020 to July 2022. Patients’ demographic characteristics, clinical laboratory results, imaging data, and other medical records were retrospectively reviewed. This study was approved by the Institutional Review Board of Beijing Chest Hospital (Approval ID: 2021LSKY-58). All samples were obtained with the patient’s consent.

Routine TB detection

Microbial culture using the MGIT 960 system (Becton Dickinson, Sparks, MD, USA), AFB with Ziehl–Neelsen stain (BASO, Zhuhai, China), Xpert on the GeneXpert system (Cepheid, Sunnyvale, CA, USA), and the T-SPOT assay (Oxford Immunotec Ltd., Abingdon, UK) were routinely performed by the Department of Pathology for TB detection with PE, sputum, and/or bronchoalveolar lavage fluid (BALF) samples, according to the standard procedures and manufacturer’s protocols. Patients were assigned to the TBP-positive subgroup if they: (1) had a positive TB culture or Xpert result (defined as the test-defined TBP subgroup), which represents the gold standard of TB diagnosis according to the WHO guidelines [16, 17]; or (2) were diagnosed based on a comprehensive evaluation of clinical manifestations, auxiliary test results (including AFB, T-SPOT, and mNGS), and outcome assessment after TB drug administration (defined as the comprehensive diagnosis TBP subgroup).

Malignant tumor identification

The diagnosis of MPE was confirmed by pathological examinations with either tissue biopsies or PE sediment specimens using hematoxylin and eosin stain for histomorphology.

Non-TB infection

Patients with non-TB infection had either a positive laboratory culture or mNGS result for a non-TB pathogen, or a diagnosis based on comprehensive evaluation of clinical manifestations and outcome assessment after non-TB drug administration.

mNGS for TB detection

DNA was extracted from PE samples using the QIAamp DNeasy Blood & Tissue Kit (Qiagen). DNA libraries were constructed using the KAPA Hyper Prep kit (KAPA Biosystems) according to the manufacturer’s protocols and sequenced on an Illumina NovaSeq (Illumina). The basic procedure of mNGS is illustrated in Fig. 1A.

Fig. 1 Workflow of mNGS for TB detection and malignant prediction on PE samples and the cohort overview. A The illustration of mNGS analysis from PE sample collection to the bioinformatic pipeline is shown, where microbial sequencing reads and human genome reads are used for TB detection (left) and CNV analysis (right), respectively. B The cohort overview shows the subgrouping for TB-detection performance comparison (left) and mNGS-CNV modeling (right). PE: pleural effusion; TB: tuberculosis; CNV: copy number variant; TBP: tuberculous pleurisy; MPE: malignant pleural effusion; LOOCV: leave-one-out cross validation

The bioinformatic process for pathogen detection in this mNGS pipeline has been described in previous studies [18, 19]. In brief, quality control of sequencing reads was conducted by removing low-quality reads, adapter sequences, and duplicated or short (< 36 bp) reads. The remaining qualified reads were first mapped to the human reference genome (hs37d5) using bowtie2, and the non-human reads were then aligned to the microorganism genome database for pathogen identification. A sample was identified as TB-positive if it had at least three non-overlapping reads mapped to the TB genome and more than tenfold as many TB reads as the no-template control.
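The TB-positive calling rule described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the representation of a read as a (start, end) alignment interval, the comparison of raw read counts without depth normalization, and the handling of a no-template control with zero TB reads are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end) of a read alignment on the TB
# genome, with end exclusive.
Interval = Tuple[int, int]


def count_non_overlapping(reads: List[Interval]) -> int:
    """Greedily count the largest set of mutually non-overlapping reads."""
    count = 0
    last_end = -1
    # Sorting by end coordinate makes the greedy choice optimal.
    for start, end in sorted(reads, key=lambda r: r[1]):
        if start >= last_end:
            count += 1
            last_end = end
    return count


def is_tb_positive(
    sample_reads: List[Interval],
    ntc_read_count: int,
    min_reads: int = 3,
    min_fold: float = 10.0,
) -> bool:
    """Apply both criteria: >= min_reads non-overlapping reads and
    more than min_fold times the TB reads seen in the no-template control."""
    if count_non_overlapping(sample_reads) < min_reads:
        return False
    if ntc_read_count == 0:
        # Assumption: with a clean control, the fold criterion is met.
        return True
    return len(sample_reads) / ntc_read_count > min_fold


example = [(0, 150), (100, 250), (300, 450), (500, 650)]
print(count_non_overlapping(example))            # 3
print(is_tb_positive(example, ntc_read_count=0))  # True
print(is_tb_positive(example, ntc_read_count=1))  # False (only fourfold)
```

Requiring non-overlapping reads guards against a single contaminating fragment being counted several times, while the fold criterion controls for background TB reads in reagents.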

mNGS-derived CNV for identifying malignant PE

Sequencing reads that mapped to the human genome were used for genome copy number analysis using the software WisecondorX [20]. We randomly selected 25 non-malignant PE samples (20 TB and 5 non-TB) to serve as the human genome copy number baseline for identifying CNV features in the remaining 113 PE samples. CNV feature filtering excluded features that were present in fewer than 20% of samples, and the remaining 2662 CNV features were included for malignant prediction using generalized linear modeling (GLM, h2o.glm function in R). Model performance was evaluated by leave-one-out cross-validation (LOOCV, pROC package in R).
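The two modeling steps described above, prevalence filtering of CNV features and LOOCV, can be sketched in plain Python as follows. The study itself used h2o.glm and pROC in R; here a simple CNV-burden threshold stands in for the GLM so that the sketch stays self-contained, and the toy matrix, function names, and the choice to filter once before cross-validation are assumptions for illustration only.

```python
from typing import Callable, List

Matrix = List[List[float]]  # rows = samples, columns = CNV features


def filter_by_prevalence(matrix: Matrix, min_fraction: float = 0.20) -> List[int]:
    """Return indices of features that are non-zero in >= min_fraction of samples."""
    n_samples = len(matrix)
    keep = []
    for j in range(len(matrix[0])):
        present = sum(1 for row in matrix if row[j] != 0)
        if present / n_samples >= min_fraction:
            keep.append(j)
    return keep


def fit_burden_threshold(X: Matrix, y: List[int]) -> float:
    """Stand-in model (NOT a GLM): midpoint between class-mean CNV burdens."""
    burden = [sum(abs(v) for v in row) for row in X]
    pos = [b for b, label in zip(burden, y) if label == 1]
    neg = [b for b, label in zip(burden, y) if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_score(threshold: float, row: List[float]) -> float:
    """Positive score means the sample is predicted malignant."""
    return sum(abs(v) for v in row) - threshold


def loocv_scores(X: Matrix, y: List[int],
                 fit: Callable, predict: Callable) -> List[float]:
    """Hold out each sample once, fit on the rest, and score the held-out sample."""
    scores = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        model = fit(train_X, train_y)
        scores.append(predict(model, X[i]))
    return scores


# Toy data: feature 0 is present in only 1 of 6 samples and is filtered out.
X = [[0.0, 0.1, 0.0], [0.0, 0.0, 0.1], [0.0, 0.1, 0.0],
     [0.9, 0.8, 0.7], [0.0, 0.9, 0.8], [0.0, 0.7, 0.9]]
y = [0, 0, 0, 1, 1, 1]

keep = filter_by_prevalence(X)
X_filtered = [[row[j] for j in keep] for row in X]
scores = loocv_scores(X_filtered, y, fit_burden_threshold, predict_score)
predictions = [1 if s > 0 else 0 for s in scores]
print(keep, predictions)
```

In LOOCV each of the N samples is scored by a model that never saw it, which is why the approach suits a cohort too small to split into separate training and testing sets.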


Patients' characteristics

From June 2020 to July 2022, a total of 138 patients were enrolled in this study, 82 of whom were diagnosed with TBP, 21 with non-TB infection, and 35 with MPE. The clinical characteristics of the patients are summarized in Table 1, and the detailed clinical and diagnostic information of each patient, including final diagnosis and test results, is provided in Additional file 1: Table S1. The median age of the entire cohort was 58 years (range 19 to 92), and over two-thirds (95/138) were male. Underlying diseases such as diabetes, hypertension, and liver diseases were reported in approximately 60% (84/138) of patients. Blood tests for white blood cell count, plateletcrit (PCT), and C-reactive protein (CRP) levels were routinely performed.

Table 1 Clinical characteristics of patients enrolled in this study

TB-detection performance comparison between mNGS and clinical tests

In this study, multiple clinical tests, including culture, AFB, Xpert, and T-SPOT, as well as mNGS on PE samples, were performed for TB detection. Owing to the retrospective nature of the study, the results of culture, AFB, T-SPOT, and Xpert were undetermined in 21, 24, 40, and 18 patients, respectively (Additional file 1: Table S1; Table 2). In the TB-positive subgroup (N = 82), 45% of patients (37/82) were classified as test-defined TBP, having a positive TB culture or Xpert result, while the remaining 45 TBP patients (55%) were diagnosed based on comprehensive clinical evidence (see Methods). The T-SPOT assay demonstrated the highest positive detection rate (61%) among all clinical tests (culture 32%, Xpert 35%, and AFB 12%), which was also higher than that of mNGS (49%, Fig. 2). Notably, no false-positive TB-detection events were observed in the TB-negative subgroup (N = 56) using mNGS, culture, or Xpert. In contrast, the false-positive rate of the T-SPOT assay reached 10%, well above that of the other approaches (AFB: 2%).

Table 2 TB-detection performance comparison between mNGS and clinical tests
Fig. 2 TB-detection results of mNGS and conventional clinical tests. Positive and negative detection of TB of each method is labeled in green and blue, respectively. The top panel represents the clinically diagnosed TBP patients and the bottom includes 35 MPE and 21 non-TB infected patients as the TB-negative subgroup. Positive and negative detection rates are shown on the right with scaling colors

Among the 52 TB-positive patients whose results were available for all five TB-detection methods, only five (9.6%) were positive on all tests. Approximately 63.5% (33/52) had at least two positive results from the five methods (Fig. 2).
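The tallies above depend on how undetermined results are handled. The following Python sketch shows one way to compute per-method positive rates and cross-method concordance when some results are missing. The data are invented toy values, not patient records, and excluding undetermined results from the denominator of each rate is an assumption rather than a procedure stated in the text.

```python
from typing import Dict, List, Optional, Tuple

# True = positive, False = negative, None = undetermined (test not available)
Result = Optional[bool]
METHODS = ["culture", "AFB", "Xpert", "T-SPOT", "mNGS"]


def positive_rate(results: List[Result]) -> float:
    """Positive rate among determined results only (assumed denominator)."""
    determined = [r for r in results if r is not None]
    return sum(determined) / len(determined)


def concordance(patients: List[Dict[str, Result]]) -> Tuple[int, int, int]:
    """Return (patients with all five results, positive on all, positive on >= 2)."""
    complete = [p for p in patients if all(p[m] is not None for m in METHODS)]
    all_positive = sum(1 for p in complete if all(p[m] for m in METHODS))
    at_least_two = sum(1 for p in complete if sum(p[m] for m in METHODS) >= 2)
    return len(complete), all_positive, at_least_two


# Toy cohort of four hypothetical TB-positive patients.
patients = [
    {"culture": True, "AFB": True, "Xpert": True, "T-SPOT": True, "mNGS": True},
    {"culture": False, "AFB": False, "Xpert": True, "T-SPOT": True, "mNGS": False},
    {"culture": None, "AFB": False, "Xpert": True, "T-SPOT": None, "mNGS": True},
    {"culture": False, "AFB": False, "Xpert": False, "T-SPOT": False, "mNGS": True},
]

culture_rate = positive_rate([p["culture"] for p in patients])
summary = concordance(patients)
print(round(culture_rate, 3), summary)
```

Restricting the concordance count to patients with complete results, as the text does for its 52 patients, avoids counting a missing test as a negative one.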

mNGS CNV modeling for identifying malignant PE

To take advantage of the human genome sequencing reads obtained from mNGS, we developed an mNGS-CNV pipeline to assess the genome copy number along the chromosomes. As described in the Methods section, 25 non-MPE samples were randomly chosen as the baseline to normalize chromosome copy number in the remaining 113 patients (35 MPE and 78 non-MPE). As shown in Fig. 3A, CNV events (both copy number gain and loss) were frequently observed in the representative patient with MPE.

Fig. 3 Diagnostic performance of mNGS-CNV analysis. A A representative chromosome copy number plot of an MPE patient with both copy number gain and loss events. B A contingency table shows the mNGS-CNV modeling results compared to clinical pathological diagnosis. C The sensitivity, specificity, and accuracy of mNGS-CNV prediction based on B are shown by the bar plot. D The Receiver Operating Characteristic (ROC) curve shows the performance of mNGS-CNV LOOCV result with an area under the curve (AUC) of 0.851

GLM was performed to construct a prediction model using the filtered CNV features (frequency ≥ 20%), the predictive power of which was evaluated by LOOCV. Compared to the clinical pathology diagnosis, the mNGS-CNV modeling demonstrated 51.4% sensitivity, 80.8% specificity, and 71.7% accuracy (Fig. 3B, C), with an area under the curve (AUC) of 0.851 based on the receiver-operating characteristic (ROC) curve (Fig. 3D).
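As a worked check of how these metrics relate to a contingency table, the Python sketch below computes sensitivity, specificity, and accuracy from confusion-matrix counts, together with a rank-based AUC. The counts used (18 true positives, 17 false negatives, 63 true negatives, 15 false positives) are inferred here from the reported percentages and the 35 MPE / 78 non-MPE split; they are not copied from Fig. 3B, and the AUC inputs are toy scores rather than study data.

```python
from typing import Dict, List


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative (ties count one half); equivalent to the area under the ROC curve."""
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Counts inferred from the reported percentages (assumption, see lead-in).
metrics = diagnostic_metrics(tp=18, fn=17, tn=63, fp=15)
percentages = {name: round(100 * value, 1) for name, value in metrics.items()}
print(percentages)
# {'sensitivity': 51.4, 'specificity': 80.8, 'accuracy': 71.7}

# Toy scores to illustrate the AUC calculation only.
toy_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
print(toy_auc)  # 0.75
```

Because the AUC is computed from continuous scores across all thresholds, it can be considerably higher than the accuracy at the single threshold used for the contingency table.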


In this retrospective study, we explored the diagnostic utility of mNGS in detecting TBP and MPE simultaneously using a single PE sample. In terms of TB diagnostic performance, mNGS produced a sensitivity of 49% and a specificity of 100% on PE samples, which was comparable to previous clinical studies [7, 8]. Shi et al. reported that mNGS on BALF samples achieved the best diagnostic performance (sensitivity 47.9%) compared to conventional microbiological tests (sensitivity from 29.2% to 46.8%) with BALF or sputum samples [8]. Another prospective study using various clinical samples (BALF, PE, cerebrospinal fluid, ascites, etc.) demonstrated an overall mNGS sensitivity of 44% and specificity of 98% across all sample types [7]. The same study reported positive blood T-SPOT results in 82% of patients with active TB infection and 33% of those without. The relatively high false-positive rate of T-SPOT makes it unsuitable as a stand-alone tool for diagnosing TB infection, although it could serve as a complementary diagnostic method [21]. In our cohort, T-SPOT produced the highest sensitivity and the lowest specificity among all tested approaches, suggesting the importance of combining multiple methods to detect TB efficiently and accurately in clinical practice. Similarly, AFB alone is not sufficient for TB diagnosis because of its suboptimal performance [22, 23].

Previous studies have reported genomic instability as a molecular marker of malignant neoplasms with both copy number gain and loss [10], but CNV analysis based on mNGS-derived human reads has been less investigated. With this strategy, pathogen detection and malignancy prediction are performed simultaneously in a single experiment from sample collection to sequencing, which shortens the processing time, a critical advantage in severe conditions. Herein, we explored the diagnostic performance of mNGS-CNV modeling on MPE prediction, which showed 51.4% sensitivity, 80.8% specificity, and 71.7% accuracy. In contrast, Guo et al. [13] reported higher sensitivity (83.7%) and specificity (97.6%) of mNGS CNV analysis on lung biopsy tissue samples instead of PE. Another study using various body fluids, such as BALF, PE, and peritoneal fluid, showed that the mNGS-CNV test was able to identify 68% of cancer patients who were negative on conventional tests [14]. Furthermore, mNGS has also been shown to detect central nervous system malignant neoplasms using cerebrospinal fluid, with sensitivity reaching 75% and 100% specificity [15]. Together with our study, these results indicate that mNGS CNV analysis has great potential for predicting malignant neoplasms in diverse sample types. Optimizing the bioinformatic pipeline may further improve the diagnostic performance, but validation in larger cohorts is warranted.

Several limitations of this study need to be noted. First, as this was a retrospective study, the clinical TB detection tests were performed on multiple specimen types, including PE, sputum, and BALF, and their results were undetermined in some patients. Second, because of the restricted cohort size, we could not split the cohort into training and testing sets for mNGS-CNV modeling, especially after excluding the 25 non-MPE samples used to set the genome copy number baseline. We therefore performed LOOCV to evaluate the performance of the mNGS-CNV analysis. Lastly, this was a pilot study investigating the potential of repurposing human reads generated from a well-established pathogen-detection mNGS pipeline, without any optimization for interpreting human sequences. We believe that further experimental and bioinformatic improvements, evaluated with an external validation cohort, will increase the sensitivity and specificity.


In conclusion, we demonstrated the possibility of detecting TBP and MPE simultaneously using mNGS on PE specimens, with relatively good diagnostic performance. Our study supports mNGS as a promising tool for pathogen detection and cancer diagnosis, but prospective clinical studies and large-cohort validation are needed in the future.

Availability of data and materials

The data generated in this study have been uploaded to the NCBI database (Accession: PRJNA944842).



AFB: Acid-Fast Bacillus

AUC: Area under the curve

BALF: Bronchoalveolar lavage fluid

CNV: Copy number variant

CRP: C-reactive protein

GLM: Generalized linear modeling

mNGS: Metagenomic next-generation sequencing

MPE: Malignant pleural effusion

LOOCV: Leave-one-out cross validation

PE: Pleural effusion

ROC: Receiver-operating characteristic

TBP: Tuberculous pleurisy


  1. Light RW. Pleural effusions. Med Clin North Am. 2011;95:1055–70.

  2. Jany B, Welte T. Pleural effusion in adults-etiology, diagnosis, and treatment. Dtsch Arztebl Int. 2019;116:377–86.

  3. Udwadia ZF, Sen T. Pleural tuberculosis: an update. Curr Opin Pulm Med. 2010;16:399–406.

  4. Xu P, Yang K, Yang L, et al. Next-generation metagenome sequencing shows superior diagnostic performance in acid-fast staining sputum smear-negative pulmonary tuberculosis and non-tuberculous mycobacterial pulmonary disease. Front Microbiol. 2022;13:898195.

  5. Lee HS, Kee SJ, Shin JH, et al. Xpert MTB/RIF Assay as a Substitute for Smear Microscopy in an Intermediate-Burden Setting. Am J Respir Crit Care Med. 2019;199:784–94.

  6. Blauwkamp TA, Thair S, Rosen MJ, et al. Analytical and clinical validation of a microbial cell-free DNA sequencing test for infectious disease. Nat Microbiol. 2019;4:663–74.

  7. Zhou X, Wu H, Ruan Q, et al. Clinical evaluation of diagnosis efficacy of active Mycobacterium tuberculosis complex infection via metagenomic next-generation sequencing of direct clinical samples. Front Cell Infect Microbiol. 2019;9:351.

  8. Shi CL, Han P, Tang PJ, et al. Clinical metagenomic sequencing for diagnosis of pulmonary tuberculosis. J Infect. 2020;81:567–74.

  9. Ferreiro L, Suárez-Antelo J, Valdés L. Pleural procedures in the management of malignant effusions. Ann Thorac Med. 2017;12:3–10.

  10. Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell. 2011;144:646–74.

  11. Zhao EY, Jones M, Jones SJM. Whole-genome sequencing in cancer. Cold Spring Harb Perspect Med. 2019;9:a034579.

  12. Torres-Ruiz R, Grazioso TP, Brandt M, Martinez-Lage M, Rodriguez-Perales S, Djouder N. Detection of chromosome instability by interphase FISH in mouse and human tissues. STAR Protoc. 2021;2:100631.

  13. Guo Y, Li H, Chen H, et al. Metagenomic next-generation sequencing to identify pathogens and cancer in lung biopsy tissue. EBioMedicine. 2021;73:103639.

  14. Gu W, Talevich E, Hsu E, et al. Detection of cryptogenic malignancies from metagenomic whole genome sequencing of body fluids. Genome Med. 2021;13:98.

  15. Gu W, Rauschecker AM, Hsu E, et al. Detection of neoplasms by metagenomic next-generation sequencing of cerebrospinal fluid. JAMA Neurol. 2021;78:1355–66.

  16. Automated Real-time nucleic acid amplification technology for rapid and simultaneous detection of tuberculosis and rifampicin resistance: Xpert MTB/RIF assay for the diagnosis of pulmonary and extrapulmonary TB in adults and children: policy update. Geneva, 2013.

  17. WHO consolidated guidelines on tuberculosis: Module 3: Diagnosis - Tests for tuberculosis infection. Geneva, 2022.

  18. Zeng X, Wu J, Li X, et al. Application of metagenomic next-generation sequencing in the etiological diagnosis of infective endocarditis during the perioperative period of cardiac surgery: a prospective cohort study. Front Cardiovasc Med. 2022;9:811492.

  19. Ren D, Ren C, Yao R, et al. The microbiological diagnostic performance of metagenomic next-generation sequencing in patients with sepsis. BMC Infect Dis. 2021;21:1257.

  20. Lennart R, Annelies D, et al. WisecondorX: improved copy number detection for routine shallow whole-genome sequencing. Nucleic Acids Res. 2018;47:1605.

  21. Jiang J, Shi HZ, Liang QL, Qin SM, Qin XJ. Diagnostic value of interferon-gamma in tuberculous pleurisy: a metaanalysis. Chest. 2007;131:1133–41.

  22. Lipsky BA, Gates J, Tenover FC, Plorde JJ. Factors affecting the clinical value of microscopy for acid-fast bacilli. Rev Infect Dis. 1984;6:214–22.

  23. Gladwin MT, Plorde JJ, Martin TR. Clinical application of the Mycobacterium tuberculosis direct test: case report, literature review, and proposed clinical algorithm. Chest. 1998;114:317–23.


We would like to thank all research staff involved in this study and all participants.


This work was supported by the National Natural Science Foundation of China (Grant Number: 82072381) and Beijing Municipal Science and Technology Project (Grant Number: Z191100006619079).

Author information




FX: Conceptualization, formal analysis, writing—original draft. QW: Conceptualization, validation, visualization, writing—original draft. NZ: Conceptualization, data curation, writing—original draft. XX: Data curation, Validation. ZL: Data curation, formal analysis. KL: Validation, visualization. YM: Visualization, writing—review and editing. QO: Validation, writing—review and editing. YJ: Visualization. XC: Data curation. CZ: Data curation, Writing—review and editing. JP: Writing—review and editing, Project administration. NC: Supervision, writing—review and editing. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Junhua Pan or Nanying Che.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the Institutional Review Board of Beijing Chest Hospital (Approval ID: 2021LSKY-58). All participants signed informed consent forms for participation.

Consent for publication

All participants signed informed consent forms for publishing this study.

Competing interests

Yutong Ma, Qiuxiang Ou, and Yaqiong Jia are employees of Dinfectome Inc., Nanjing, Jiangsu, China. The remaining authors have nothing to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Table S1.

Patient's clinical characteristics and diagnostic results.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Xu, F., Wang, Q., Zhang, N. et al. Simultaneous diagnosis of tuberculous pleurisy and malignant pleural effusion using metagenomic next-generation sequencing (mNGS). J Transl Med 21, 680 (2023).
