INTRODUCTION
Gold-standard morphological mold identification in the clinical mycology laboratory is slow and laborious, relying on intensive training and extensive experience for accurate identification of an ever-widening spectrum of fungal pathogens. This decade alone has seen the emergence of numerous new species and species complexes, many of which (i) are resistant to antifungals or display atypical susceptibility profiles or (ii) are phenotypically similar but genetically and possibly pathogenically different from their counterparts (1–6). Such factors have compromised the accuracy of traditional phenotypic mold identification, compelling laboratories, where possible, to adopt methods such as DNA sequencing or matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) to enhance discriminatory power.
MALDI-TOF MS has become a powerful tool in the clinical microbiology setting and has revolutionized workflow in our laboratory, enabling rapid identification of bacteria (7, 8), yeasts (9), rapidly growing mycobacteria (10), and Nocardia (11). The accuracy of this technique compares favorably with that of genomic sequencing and is obtained at a significantly lower cost (12). However, its clinical application for the identification of filamentous fungi has lagged due to challenges in developing an efficient protein extraction method and the limited databases available. Consequently, many groups have developed in-house supplementary databases that target only select pathogens that are most prominent in their patient population (13–17).
We describe here the development and clinical evaluation of a comprehensive database for the identification of molds grown on solid media by MALDI-TOF MS. Our study highlights the many benefits of MALDI-TOF MS for rapid and unambiguous mold identification when an adequate database is available.
(This work was presented in part at both the 112th General Meeting of the American Society for Microbiology, San Francisco, CA, 16 to 19 June 2012, and at the Next Generation Dx Summit: Mass Spectrometry in Diagnostics of Infectious Disease, Washington, DC, 21 to 23 August 2012.)
RESULTS
Construction of the NIH mold database.
A total of 294 fungal isolates was used to create the NIH mold database. The desired number of acceptable spectra was obtained for all but 11 of the 294 isolates (Aspergillus sclerotiorum, Chaetomium nigricolor, Cladophialophora sp., Cladosporium sp., Exophiala dermatitidis, Exophiala pisciphila, hyaline septate mold [no further differentiation], Microsporum canis, Penicillium marneffei, Phialophora verrucosa, and Pithomyces sp.), for which the required number of quality peaks was lowered to a 20:15 ratio (see Materials and Methods) because of the limited number of peaks available for detection. A minimum of 10 quality spectra was obtained for all isolates before entry into the database, except for Pyrenochaeta romeroi, Curvularia lunata, Histoplasma capsulatum, Penicillium marneffei, and Phialophora verrucosa, for which only seven to nine spectra of sufficient quality were obtained.
To confirm specificity, all spectra included in the NIH mold database were analyzed against the NIH database plus the Bruker library (n = 5,118 MSPs). While all spectra matched to an exceptionally high degree with their own corresponding MSPs, cross-identifications at a score of ≥2.0 were noted for some isolates. These included bidirectional cross-identifications between the following: (i) members of the Aspergillus section Flavi (Aspergillus flavus, Aspergillus oryzae, and Aspergillus sojae), (ii) Aspergillus nidulans and Emericella quadrilineata, (iii) Fusarium oxysporum and Fusarium proliferatum, and (iv) Paecilomyces spectabilis and Paecilomyces variotii. Aspergillus fumigatiaffinis, Aspergillus ochraceus, and Ulocladium cross-identified with Aspergillus viridinutans, Aspergillus westerdijkiae, and Alternaria, respectively; however, no cross-identifications were observed in reverse. As expected for teleomorphs and anamorphs of the same mold, cross-matching patterns were observed between Geosmithia argillacea/Talaromyces eburneus and Pseudallescheria boydii complex/Scedosporium apiospermum (recently renamed Pseudallescheria apiosperma [21, 22]).
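The distinction between bidirectional and unidirectional cross-identifications can be made explicit with a small script. The following is a minimal sketch, not the analysis performed in the study: it assumes that each cross-identification has already been reduced to a hypothetical (query species, matched species) pair scoring ≥2.0, and the function name `classify_cross_ids` is illustrative only. The example pairs are taken from the lists above.

```python
from typing import Iterable


def classify_cross_ids(hits: Iterable[tuple[str, str]]) -> dict[str, set]:
    """Split directed cross-identifications into bidirectional and unidirectional.

    Each hit is a hypothetical (query, match) pair meaning that spectra of
    `query` matched the MSP of `match` at a score of >= 2.0.
    """
    # Ignore self-matches: a spectrum matching its own MSP is not a cross-ID.
    directed = {(query, match) for query, match in hits if query != match}
    bidirectional = set()
    unidirectional = set()
    for query, match in directed:
        if (match, query) in directed:
            # Store as an unordered pair so A<->B is counted once.
            bidirectional.add(frozenset((query, match)))
        else:
            unidirectional.add((query, match))
    return {"bidirectional": bidirectional, "unidirectional": unidirectional}


# Example pairs taken from the text above.
hits = [
    ("Fusarium oxysporum", "Fusarium proliferatum"),
    ("Fusarium proliferatum", "Fusarium oxysporum"),
    ("Aspergillus ochraceus", "Aspergillus westerdijkiae"),
    ("Ulocladium", "Alternaria"),
]
result = classify_cross_ids(hits)
```

Storing bidirectional matches as unordered pairs avoids counting the same species pair twice, whereas unidirectional matches keep their direction because it carries information (which species' spectra matched the other's MSP).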
Clinical performance and validation.
When blindly challenged against 421 clinical isolates, the NIH mold database provided species-level (score of ≥2.0) identification for 370 isolates (87.9%), while the most recent Bruker library alone (August 2012) identified only 3 isolates (0.7%) (Tables 2 and 3). Using the NIH mold database, an additional 18 isolates (4.3%) were identified to the genus level (score between 1.7 and 1.99) (Tables 2 and 3). No isolates were misidentified by MALDI-TOF MS. A total of 392 isolates (93.1%) failed to yield an identification (score of <1.7) when spectra were analyzed against the Bruker database alone (Tables 2 and 3).
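The score thresholds used throughout this section can be summarized as a simple decision rule. The sketch below is illustrative only: the cutoffs (≥2.0 for species, ≥1.7 for genus) come from the text, while the function name, constant names, and example scores are hypothetical and are not output from the Biotyper software.

```python
from collections import Counter

SPECIES_CUTOFF = 2.0  # species-level identification (from the text)
GENUS_CUTOFF = 1.7    # genus-level identification (from the text)


def classify_score(score: float) -> str:
    """Map a match score to an identification level using the stated cutoffs."""
    if score >= SPECIES_CUTOFF:
        return "species"
    if score >= GENUS_CUTOFF:
        return "genus"
    return "no identification"


# Hypothetical scores chosen to exercise each boundary.
example_scores = [2.31, 2.00, 1.99, 1.70, 1.69]
tally = Counter(classify_score(s) for s in example_scores)
```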
Of the 236 clinical isolates included in the NIH mold database, 109 strains were subsequently added to the database after having been initially tested during the blinded clinical validation. To account for this possible bias, these strains were removed, and analysis was adjusted so that a total of 312 blinded clinical isolates were evaluated. Of these, 262 isolates (84%) were identified to the species level when analyzed against the NIH mold database alone, compared with the Bruker library that identified only 3 isolates (1%).
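The bias adjustment amounts to excluding the isolates that were later added to the database and recomputing the identification rate on the remainder. The following sketch shows that logic on hypothetical toy records (the record layout, identifiers, and function name are assumptions for illustration), and then checks the arithmetic using only the counts reported above.

```python
def species_level_rate(isolates, exclude_ids):
    """Return (isolates kept, species-level IDs, percentage) after exclusions."""
    kept = [iso for iso in isolates if iso["id"] not in exclude_ids]
    if not kept:
        raise ValueError("no isolates left after exclusion")
    identified = sum(1 for iso in kept if iso["level"] == "species")
    return len(kept), identified, round(100 * identified / len(kept), 1)


# Hypothetical toy records; not data from the study.
isolates = [
    {"id": "iso-1", "level": "species"},
    {"id": "iso-2", "level": "species"},
    {"id": "iso-3", "level": "genus"},
    {"id": "iso-4", "level": "none"},
    {"id": "iso-5", "level": "species"},
]
later_added = {"iso-1", "iso-5"}
toy = species_level_rate(isolates, later_added)

# Arithmetic check using the counts reported in the text.
adjusted_total = 421 - 109                            # blinded isolates remaining
adjusted_pct = round(100 * 262 / adjusted_total, 1)   # species-level rate
```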
Four cases of morphological misidentification were detected. These included three Aspergillus niger and one Aspergillus versicolor isolates that were correctly reidentified by MALDI-TOF MS and confirmed by genomic sequencing as Aspergillus aculeatus, Aspergillus sclerotiorum, Aspergillus tubingensis, and Aspergillus sydowii, respectively. Of the 33 isolates (7.8%) for which there was no identification by MALDI-TOF MS, 25 were basidiomycetes not associated with clinical disease, and the remaining 8 were Penicillium species not represented in the database.
Because the NIH mold database was more diverse and contained more than twice as many MSPs as the Bruker eukaryotic library (n = 133), numbers from the blinded challenge were adjusted to incorporate only those isolates for which there were representative spectra in the manufacturer's library (Table 4). Of the 156 isolates included, the NIH mold database provided species- and genus-level identifications for 144 (92.3%) and 4 (2.6%) isolates, respectively. The remaining eight strains were Penicillium sp. that failed to meet criteria for identification (score of <1.7). When the same 156 spectra were analyzed against the Bruker library alone, only 3 isolates (1.9%) were identified to the species level, and another 26 isolates (16.7%) were identified to the genus level. The remaining 127 (81.4%) were not identified despite having representative spectra in the Bruker library.
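The percentages for this 156-isolate subset follow directly from the reported counts. As a quick consistency check, the sketch below recomputes them; the counts are taken from the paragraph above, while the helper name `percent_breakdown` and the dictionary layout are illustrative assumptions.

```python
def percent_breakdown(counts: dict[str, int]) -> dict[str, float]:
    """Convert per-level counts to percentages of the total, to one decimal."""
    total = sum(counts.values())
    return {level: round(100 * n / total, 1) for level, n in counts.items()}


# Counts reported for the 156 isolates represented in the Bruker library.
nih = percent_breakdown({"species": 144, "genus": 4, "none": 8})
bruker = percent_breakdown({"species": 3, "genus": 26, "none": 127})
```

Both sets of counts sum to 156, so the two libraries are compared on exactly the same spectra.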
The new Bruker fungal library released in July 2012 was not available during clinical validation but was obtained post-manuscript submission. Retrospective analysis against this expanded library did show improved sensitivity as 68 (16.2%) and 80 (19%) isolates were identified to the species and genus levels, respectively. For the 287 isolates that had representative spectra in the new fungal library, 23.7% and 27.9% were identified to the species and genus levels, respectively. However, 48.4% of isolates failed to be identified despite having representative spectra in the new Bruker library.
Different culture conditions, including various media (Sabouraud dextrose, Candida chromogenic agar, buffered charcoal yeast extract, and brain heart infusion with blood, chloramphenicol, and gentamicin) that were incubated at various temperatures (27 to 42°C) and were of various colony ages (2 to 7 days), did not appear to affect the ability to obtain good identification results for the blinded isolates tested. Using the NIH mold database, the manufacturer's original cutoff scores of ≥2.0 for species and ≥1.7 for genus-level identifications were maintained so that specificity was not compromised to improve sensitivity. Matching threshold results between duplicate spots were achieved for 406 (96.5%) samples.
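One way to quantify agreement between duplicate spots is to check whether both spots of a sample fall into the same score category. The sketch below assumes that this is what "matching threshold results" means; that interpretation, the function names, and the example score pairs are all hypothetical and are not taken from the study data.

```python
def classify_score(score: float) -> str:
    """Apply the cutoffs stated in the text (>=2.0 species, >=1.7 genus)."""
    if score >= 2.0:
        return "species"
    if score >= 1.7:
        return "genus"
    return "no identification"


def duplicate_concordance(pairs):
    """Fraction of samples whose two spots fall in the same score category."""
    if not pairs:
        raise ValueError("at least one pair of scores is required")
    matching = sum(1 for a, b in pairs if classify_score(a) == classify_score(b))
    return matching / len(pairs)


# Hypothetical duplicate-spot scores: three concordant pairs, one discordant.
pairs = [(2.21, 2.05), (1.85, 1.72), (2.01, 1.98), (1.40, 1.55)]
rate = duplicate_concordance(pairs)
```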
DISCUSSION
To our knowledge, we have developed the most comprehensive mold database to date to supplement the Bruker Biotyper library for the identification of filamentous fungi grown on solid media using MALDI-TOF MS. When challenged, the NIH mold database provided accurate species-level identification for 370 isolates (87.9%), clearly outperforming the Bruker library, which identified only 3 isolates (0.7%) (Tables 2 and 3). This was partly due to the wider diversity of molds included in the NIH database (294 profiles) than in the Bruker Biotyper library (113 profiles). However, when 156 samples represented in the manufacturer's database were tested, strong performance was maintained by the NIH mold database (92.3% species-level identification), while the Bruker library continued to produce inadequate results (1.9%) (Table 4). Other investigators have also observed this phenomenon (13, 15, 17). This discrepancy may be because Bruker utilized liquid mold cultures during database construction in an effort to minimize the effect of culture conditions and to aid in the production of uniform mycelium (23). Liquid mold cultures, however, are rarely employed in clinical mycology laboratories due to the increased risk of aerosolized spore contamination and the inability to visualize phenotypic macro- and microscopic characteristics. The discrepancy between methods used for clinical testing and database construction may explain why the Bruker database failed to identify 127 isolates (81.4%) for which there were representative data (Table 4). In July 2012, Bruker launched a separate library for the identification of molds grown in liquid media (23) that must be purchased separately from the primary library. Retrospective analysis against this expanded library (obtained post-manuscript submission) did show improved sensitivity of the Biotyper due to its wider representation of fungal species; however, 48.4% of isolates were not identified despite having representative spectra in the new Bruker library. This, again, may be due to the library's reliance on liquid cultures.
Traditional phenotypic mold identification is laborious and requires considerable training and expertise. Identification is further complicated by sterile molds (24) and organisms that are genetically distinct from morphologically similar species. Several reports of mistaken identities have drawn attention (1–3, 25) because these masquerading molds are often refractory to antifungal agents (6, 24, 25). Correct identification is therefore imperative for appropriate disease management. In our study, Aspergillus aculeatus, Aspergillus sclerotiorum, and Aspergillus tubingensis were morphologically mistaken for Aspergillus niger, and Aspergillus sydowii was morphologically mistaken for A. versicolor. In most clinical laboratories, black aspergilli are generally reported as Aspergillus niger without further differentiation within Aspergillus section Nigri. Analysis of more A. niger isolates in our archive will likely reveal similar cases of mistaken identities and raise questions as to their clinical relevance. The phenotypic misidentification of A. sydowii as A. versicolor likely arose from the failure to wait for color formation of the colony, which typically requires an extended incubation for an additional 3 to 5 days. Our study showed that correct identification using the NIH mold database was obtainable from colonies as young as 2 days old on solid media, avoiding the time required for color production, sporulation, and/or temperature studies. This proved particularly useful for rapidly identifying Fusarium solani complex, Histoplasma capsulatum, Coccidioides immitis/posadasii, and members of the Mucorales during our blind validation. In addition, the number of isolates requiring DNA sequencing and molecular probes has decreased since MALDI-TOF MS has been incorporated into routine workflow.
Several cross-identifications were observed when all spectra included in the NIH mold database were analyzed against the entire NIH database plus the Bruker library (n = 5,118 MSPs). Although some bidirectional and unidirectional cross-identifications may be a cause for concern, the correct identification always achieved the higher score (by at least 10%). We do not know whether other groups have observed a similar cross-identification problem, as their databases included only some of the species listed above and/or intralibrary specificity was not tested (13, 15, 16). However, neither the aforementioned studies nor our study has documented any instances of mold misidentification by MALDI-TOF MS. In addition, unlike other investigators who lowered the score criteria to improve sensitivity (16, 17), we were able to maintain the manufacturer's original cutoff scores for species and genus identification without compromising sensitivity. Given time and the continual expansion of the database with more representative isolates, we predict that cross-identifications such as these will diminish.
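The observation that the correct identification always scored at least 10% higher than a cross-identification can be expressed as a simple margin check. The sketch below assumes that the 10% figure is a relative difference between the two scores, which the text does not state explicitly; the function name, parameter names, and example scores are hypothetical.

```python
def correct_match_wins(correct_score: float, cross_score: float,
                       min_margin: float = 0.10) -> bool:
    """Return True if the correct match beats the cross-match by min_margin.

    Assumption: the margin is relative, i.e. (correct - cross) / cross.
    """
    if cross_score <= 0:
        raise ValueError("cross_score must be positive")
    return (correct_score - cross_score) / cross_score >= min_margin


# Hypothetical scores: the first pair clears the 10% margin, the second does not.
clear_winner = correct_match_wins(2.60, 2.10)
too_close = correct_match_wins(2.15, 2.05)
```

A check of this kind could be used to flag isolates whose top two matches are too close to call and that may warrant confirmatory testing.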
While we strove to encompass both common and unusual isolates in our library, we recognize that the NIH mold database is not exhaustive and that some organisms (e.g., dermatophytes) were not widely represented due to their rarity at the NIH. It is therefore not surprising that 25 nonpathogenic basidiomycetes and eight Penicillium species were not identified during clinical validation, as many species in these groups were not represented in our database at all. We believe, however, that our database has the capacity to identify at least 90% of filamentous fungi isolated in most clinical mycology laboratories. The inherent expandability of the Biotyper software will also allow for inclusion of new species and complexes into our existing database.
The use of extracted protein suspensions versus direct colony deposition (“toothpick method”) for MALDI-TOF MS has remained controversial for many years. Comparative studies on bacteria have shown better identification scores from extracted protein suspensions (26, 27), presumably because cleaner spectra are produced without interference from salts, lipids, and other cell constituents. No comparative studies have been performed on molds thus far; however, MALDI-TOF MS of water suspensions of mycelia and/or conidia has shown promise (13, 15). Nonetheless, the risk of spore aerosolization and potential laboratory contamination will likely result in the continued use of protein extraction for filamentous fungi, with the added advantage of better quality spectra and enhanced sensitivity and specificity. While many different extraction procedures have been developed (14, 17, 28), our alternate extraction procedure is not restricted to specific portions of mycelia or growth in liquid media. Future comparative versatility studies are needed to help standardize the application of MALDI-TOF MS for mold identification in a clinical setting. We also showed that duplicate spotting of protein extracts is not required, since matching threshold results were achieved 96.5% of the time. Duplicate spotting, however, may be useful for assessing method accuracy and reproducibility during the clinical validation phase.
In summary, the NIH mold database is, to our knowledge, the most comprehensive library developed to date for the identification of molds from solid media by MALDI-TOF MS. Since implementation, laboratory efficiency has improved, with decreased turnaround time for identification and accuracy comparable to that of genomic sequencing. Our protocol is easily adaptable, and the database can be made available to any clinical laboratory for future multicenter studies. We are optimistic that our procedure will be of significant benefit, especially in laboratories with limited mycological expertise.