Deep Sequencing of Plant and Animal DNA Contained within Traditional Chinese Medicines Reveals Legality Issues and Health Safety Concerns
Traditional Chinese medicine (TCM) has been practiced for thousands of years, but only within the last few decades has its use become more widespread outside of Asia. Concerns continue to be raised about the efficacy, legality, and safety of many popular complementary alternative medicines, including TCMs. Ingredients of some TCMs are known to include derivatives of endangered, trade-restricted species of plants and animals, and therefore contravene the Convention on International Trade in Endangered Species (CITES) legislation. Chromatographic studies have detected the presence of heavy metals and plant toxins within some TCMs, and there are numerous cases of adverse reactions. It is in the interests of both biodiversity conservation and public safety that techniques are developed to screen medicinals like TCMs. Targeting both the p-loop region of the plastid trnL gene and the mitochondrial 16S ribosomal RNA gene, over 49,000 amplicon sequence reads were generated from 15 TCM samples presented in the form of powders, tablets, capsules, bile flakes, and herbal teas. Here we show that second-generation, high-throughput sequencing (HTS) of DNA represents an effective means to genetically audit organic ingredients within complex TCMs. Comparison of DNA sequence data to reference databases revealed the presence of 68 different plant families and included genera, such as Ephedra and Asarum, that are potentially toxic. Similarly, animal families were identified that include genera that are classified as vulnerable, endangered, or critically endangered, including Asiatic black bear (Ursus thibetanus) and Saiga antelope (Saiga tatarica). Bovidae, Cervidae, and Bufonidae DNA were also detected in many of the TCM samples and were rarely declared on the product packaging. 
This study demonstrates that deep sequencing via HTS is an efficient and cost-effective way to audit highly processed TCM products and will assist in monitoring their legality and safety especially when plant reference databases become better established.
Published in the journal:
PLoS Genet 8(4): e1002657. doi:10.1371/journal.pgen.1002657
Category:
Research Article
doi:
https://doi.org/10.1371/journal.pgen.1002657
Introduction
Traditional Chinese medicines (TCMs) have been an integral part of Chinese culture and the primary medicinal treatment for a large portion of the population for more than 3000 years [1], [2]. Outside of Asia there has been, in recent decades, a growing use of TCMs where they are being taken in conjunction with, or as an alternative to, conventional Western medicine [3], [4]. The increasing popularity of TCM products has seen the monetary value of the industry increase to hundreds of millions of dollars per annum [5], its growth paralleled by the global increase in the use of complementary and alternative medicines. Despite its increased uptake, the therapeutic benefits of only a small number of TCM products have been scientifically validated [6], with their perceived efficacy being based largely on long-standing beliefs [7].
Chinese herbal medicines often contain numerous different plant and animal-derived products that combine to act synergistically to produce a desired outcome [8], [9]. However, due to the proprietary nature of TCM manufacture, coupled with a lack of industry regulation, the biological origin of contents can be difficult to determine with confidence, leading to questions regarding TCM quality, efficacy and safety [10], [11]. Undeclared or misidentified TCM ingredients and adulterants can pose serious health risks to consumers [10], [12], [13]. These include allergenic substances [14], plant toxins [7], heavy metals such as mercury, lead, copper and arsenic [15], and pharmaceutically active compounds of undetermined concentration [5]. In the early 1990s the toxic herb Aristolochia fangchi was mistaken for the anti-inflammatory agent Stephania tetrandra, which led more than a hundred women to suffer kidney failure, with many later developing cancer of the urinary system [13].
In addition to safety concerns, issues of legality also surround TCMs. These concerns fall into three main categories: matters relating to the trade of endangered species; issues pertaining to honesty of food labelling; and adulteration of samples with drugs. Some TCMs contain plant and animal species [16]–[18] that fall under the jurisdiction of the Convention on International Trade in Endangered Species (CITES). CITES-listed species (see appendices at www.cites.org) that have had long-standing associations and use within TCM include: Asiatic black bear (Ursus thibetanus, Appendix I listed), Saiga antelope (Saiga tatarica, Appendix II listed), rhinoceros (all species, Appendix I listed), and non-cultivated varieties of the plant genus Panax, namely P. ginseng and P. quinquefolius (Appendix II listed) [19]–[23]. The CITES appendices include lists of species afforded different levels or types of protection from over-exploitation. Appendix I species are deemed the most endangered and threatened with extinction, with Appendix II and III listed species regarded to be at lower, but still significant, threat levels [24]. With an increased international demand for TCMs, ascertaining the biological origins, and hence the CITES status, of ingredients contained variously in capsules, powders, liquids, and tablets represents a complex problem for customs officials. The second issue of legality concerns the mislabelling of TCMs. This might be done intentionally in order to reduce manufacturing costs or to circumvent customs' scrutiny, or it might occur inadvertently if the TCM practitioner unwittingly uses a misidentified product [25]. For CITES member states to enforce legislation and to prosecute cases of illegal trade, reliable methods of species identification are needed [26].
Lastly, a number of TCM products appear to have been intentionally adulterated with drugs of known pharmacological activity such as anti-hyperglycaemic agents (anti-diabetic medication) and corticosteroids [5], presumably as a means to increase their efficacy.
To date, many of the analyses and identification of botanical components in TCM products have employed chromatographic methods [9], [27]. However, these methods may not be able to identify animal species, or be able to uncover all of the ingredients within heterogeneous samples. DNA technology has the potential to provide information about species composition and the honesty of ingredient declarations. For the identification of botanical constituents used in TCMs, the genetic techniques employed include fragment length polymorphism analysis, dot-blot hybridization, micro-arrays, and sequencing of plastid DNA genes [25], [28]–[33]. Likewise, genetic identification of animal species commonly involves DNA sequencing and characterisation of mitochondrial DNA (mtDNA) genes [1], [32], [34]. Despite the variety of genetic work that has been conducted to date, investigative research seems to have focused on detecting the DNA of specific targets within TCMs [22], [28], [30], [35]–[38] or herbal teas [39] rather than investigating all of the contributing species within a sample simultaneously.
The advent of second-generation, high-throughput sequencing (HTS) platforms has enabled the rapid sequencing of genes, genomes and metagenomes [40]. The ability of these technologies to deep-sequence PCR-amplified plastid and mtDNA markers (using molecular identifier [MID] tags) has allowed the species composition of a variety of complex substrates, including faecal material [41], sediments [42] and even, in a forensic context, microbial communities on computer keyboards [43], to be determined. The application of HTS technologies to analyse complementary medicines has not been previously attempted, but is likely to prove to be the best approach by which to genetically audit the species composition of multiple TCM samples in parallel.
Given the worldwide popularity, growing use and increasing financial significance of TCMs, an effective means of evaluating these medicines is urgently needed – a sentiment echoed by strategy reports from the World Health Organization (WHO) [11]. This study sets out to explore the probative value of HTS approaches by generating species audits from 15 TCMs (Figure 1; Table 1) seized by border protection officials upon entry into Australia.
Results/Discussion
General overview of HTS results
An in-depth genetic audit of the species constituents contained within 15 TCM samples (Figure 1, Table 1) was conducted by amplifying the trnL (p-loop, plastid) and 16S rRNA (mtDNA) genes, followed by deep sequencing via HTS (see methods). More than 49,000 sequence reads were obtained from the HTS approach using both trnL c/h and 16S primers, with the analysis of the plant and animal constituents discussed separately below. The DNA isolated from the various TCM samples was highly variable in quality. Using trnL and 16S primers in qPCR assays, DNA of sufficient quality was obtained from 15 of 28 (54%) samples attempted. Some of the TCMs failed to amplify due to severe PCR inhibition, while others yielded little or no DNA. As with many other degraded or processed substrates, it may be necessary to optimise DNA extraction methodologies depending on the physical and chemical properties of the TCM.
To our knowledge, this is the first study to apply an HTS approach to ascertain the species composition of medicinal products. Until recently, to dissect the molecular components of heterogeneous biological samples (such as TCMs) it has been necessary to clone amplicons into plasmid vectors and then sequence the insert. In contrast to these cloning-based methodologies, HTS provides deeper coverage of more samples in a shorter time period, and represents a cost-effective way to audit DNA in heterogeneous samples. The sequencing of indexed (MID-tagged) PCR amplicons [44] allows for the sequencing of multiple samples in parallel, with the GS Junior or Ion Torrent conservatively generating ∼50,000 reads for c. US$1000 [45]. DNA isolation and quantification of 15 TCM samples, followed by a single HTS run of the pooled and tagged PCR products, was estimated, in this case, to cost less than $35 per sample (excluding labour). After an initial outlay for MID-tagged primers, the approach described here is therefore cost-effective and accessible, and can be easily adapted to profile the molecular constituents of other biologically derived complementary and alternative medicines. One of the aims of this study was to determine the efficacy of HTS auditing approaches, specifically with the goal of screening additional samples whose constituents might need to be identified in cases involving illegal imports, food fraud, medicine fraud and forensics.
Taxonomic assignment of DNA sequences to a family, genus or species represents a complex problem, the accuracy of which largely depends on the level of coverage afforded by reference databases, the analytic method used [46] and the accuracy of the underlying taxonomic framework. In the TCM data generated here the vertebrate assignments were relatively straightforward, in contrast to the plant assignments, which were particularly challenging. The detection and identification, to the family level, of genetically well-characterised plants and animals is generally uncomplicated. In contrast, if species-level assignments (without uncertainties) are required for each trnL sequence, the task is largely unachievable with current databases. While the MEtaGenome ANalyzer (MEGAN) [47] based assignment approach is not without problems, it is currently the best way to parse thousands of sequence reads. Alternative methods for assigning sequences are also available, such as SAP [48] and QIIME [49], although all of these methods are computationally intensive when challenged with large volumes of data. Irrespective of the species assignment methodology used, whether it be phenetic or character-based, all are ultimately dependent on good reference database coverage and a robust taxonomy.
There are a number of caveats with regards to HTS technology that need to be considered when analysing data. Firstly the error rate of 454 Titanium chemistry is estimated to be ∼0.5–1% [50]. On top of this there is the possibility that recombination might occur, albeit at a low (∼0.3% on an Illumina platform) frequency [51]. The likelihood of error and recombination should at least be acknowledged, but with respect to the plastid trnL data presented here it is debatable how significant an impact this is going to have on species assignments due to the presence of both sequence and length polymorphisms in the p-loop region. Lastly, caution also needs to be exercised with drawing correlations between the genetic profiles detected by HTS approaches and the actual composition of the TCM. No genetic audit can detect DNA when it has been completely degraded (for example by processing procedures) and there will always be variation in the DNA concentrations between ingredients. The results should therefore be regarded as a qualitative, and potentially incomplete assessment of composition rather than a quantitative measure of each ingredient.
Within the confines of a manuscript it is impossible to document the significance of each of the ∼50,000 reads in this audit. Instead, a summary of the data is presented (Table 2 and Table 3, and Figure S1A–S1N) and the discussion will focus on some of the more common, illegal or hazardous ingredients.
Analysis of plant DNA in the TCM samples
A total of 68 plant families were identified in this study, with 48,682 DNA sequence reads (on average 3,745 per TCM sample) generated using the trnL c/h primer set [52] for the 13 analysed samples (Table 2). The six most common plant families identified were Fabaceae, Asteraceae, Poaceae, Lamiaceae, Solanaceae, and Apiaceae, with 70% of the samples containing at least three of these families (Table 2). Some of the most common plant genera identified in the TCM samples were Glycyrrhiza (liquorice root, Family Fabaceae), found in 62% of samples, Mentha (mint, Family Lamiaceae), found in 46% of samples, and Asarum (wild ginger, Family Aristolochiaceae), found in 31% of samples. Mint is commonly included in medicines and is used in TCM to treat gastrointestinal upset, gallbladder problems and upper respiratory symptoms [53]. Likewise Glycyrrhiza uralensis, or Chinese liquorice root, is a common component of TCM remedies and is classified as one of the Chinese 50 fundamental herbs [54]. G. uralensis contains glycyrrhizin, which can be processed by microbes into 18β-glycyrrhetic acid, a compound that is effective in the treatment of peptic ulcers and has antiviral and antifungal activities [55]. Heavy harvesting of G. uralensis from the wild for TCM products has resulted in the threat of species extirpation in Chinese provinces such as Gansu [56].
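The summary figures in the preceding paragraph are simple proportions over the 13 samples analysed with trnL, and can be reproduced with a few lines of Python. This is only an illustrative check: the read total and sample number come from the text, whereas the per-genus sample counts (8, 6 and 4) are assumptions inferred from the reported percentages and are not stated in the article.

```python
# Figures quoted in the text.
TOTAL_TRNL_READS = 48_682
SAMPLES_ANALYSED = 13


def mean_reads_per_sample(total_reads: int, n_samples: int) -> int:
    """Average read depth per sample, rounded to the nearest read."""
    return round(total_reads / n_samples)


def prevalence_percent(n_positive: int, n_samples: int) -> int:
    """Percentage of samples in which a taxon was detected."""
    return round(100 * n_positive / n_samples)


# ASSUMPTION: counts inferred from the reported percentages (62%, 46%, 31%);
# the article itself reports only the percentages.
inferred_counts = {"Glycyrrhiza": 8, "Mentha": 6, "Asarum": 4}

if __name__ == "__main__":
    print("mean reads per sample:",
          mean_reads_per_sample(TOTAL_TRNL_READS, SAMPLES_ANALYSED))
    for genus, n_positive in inferred_counts.items():
        print(genus, prevalence_percent(n_positive, SAMPLES_ANALYSED), "%")
```

Under these assumed counts the rounded values agree with those quoted in the text.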
The results of the trnL audit on four samples, Yatong Yili Wan (TCM-016), Laryngitis pills (TCM-006, TCM-021), and Lingxin Mingmu Shangging Wan (TCM-013), indicated they contained DNA with close (>98%) similarity to the genera Ephedra and/or Asarum (Table 2). These TCMs could potentially pose a risk, as compounds from these genera can be poisonous or toxic at high dosages. Ephedra is classed as a poisonous herb, with Ephedra-containing products having been banned by the U.S. Food and Drug Administration (FDA) since 2004 [57]. Remedies that contain Ephedra should only be prescribed by experienced practitioners, as the therapeutic dose range is narrow [8]. Certain species of Asarum may contain aristolochic acid, the compound also found in Aristolochia species and a known nephrotoxin, hepatotoxin, and carcinogen [27], [58]. Further compound-specific testing (via metabolomics) of the TCMs from which Asarum DNA was detected (TCM-006; TCM-013; TCM-016; TCM-021, Figure 2, Table 2) would be required to determine whether this acid is actually present in the TCMs analysed here.
One trade-restricted plant species commonly found in TCM preparations is Panax ginseng (CITES Appendix II). Non-cultivated P. ginseng is subject to CITES regulation only when in the form of a whole root, or sliced parts of the root, and not after processing and manufacture [23]. It was not possible, using the conservative assignment criteria implemented in MEGAN, to definitively identify the genus Panax, primarily because the bit-score match was equally good with the genus Hedera (ivies). Both Panax and Hedera are in the family Araliaceae, and further molecular characterisation is required to distinguish whether one or both of these genera are present in TCM-001, TCM-011, TCM-018 and TCM-027. Even if Panax is confirmed, the fact that all the TCMs containing Araliaceae sequences are in powdered form renders them technically not subject to CITES legislation.
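The Panax/Hedera case illustrates how a conservative, lowest-common-ancestor (LCA) style of assignment behaves when two reference taxa match a read equally well. The Python sketch below is not MEGAN itself and does not reproduce the settings used in this study; the lineages, the bit scores and the `top_percent` threshold are hypothetical values chosen only to show the principle.

```python
from typing import Dict, List, Tuple

# ASSUMPTION: abbreviated, illustrative lineages (root -> tip). A real
# pipeline would read these from a curated taxonomy database.
LINEAGES: Dict[str, List[str]] = {
    "Panax": ["Viridiplantae", "Apiales", "Araliaceae", "Panax"],
    "Hedera": ["Viridiplantae", "Apiales", "Araliaceae", "Hedera"],
    "Glycyrrhiza": ["Viridiplantae", "Fabales", "Fabaceae", "Glycyrrhiza"],
}


def lowest_common_ancestor(taxa: List[str]) -> str:
    """Return the deepest rank shared by the lineages of all `taxa`."""
    shared = "root"
    for ranks in zip(*(LINEAGES[t] for t in taxa)):
        if len(set(ranks)) != 1:
            break
        shared = ranks[0]
    return shared


def assign_read(hits: List[Tuple[str, float]], top_percent: float = 10.0) -> str:
    """Assign a read to the LCA of all hits whose bit score lies within
    `top_percent` of the best hit (hypothetical threshold)."""
    if not hits:
        return "unassigned"
    best = max(score for _, score in hits)
    cutoff = best * (1 - top_percent / 100)
    kept = [taxon for taxon, score in hits if score >= cutoff]
    return lowest_common_ancestor(kept)


if __name__ == "__main__":
    # Equal bit scores for Panax and Hedera: the read stops at family level.
    print(assign_read([("Panax", 200.0), ("Hedera", 200.0)]))
    # A clearly better Panax hit would allow a genus-level call.
    print(assign_read([("Panax", 200.0), ("Hedera", 150.0)]))
```

With tied scores the read is placed at Araliaceae rather than at either genus, which mirrors the family-level outcome described for these four samples.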
Additional plant taxa with purported medicinal activity identified in the samples include Xanthorhiza simplicissima (Ranunculaceae) and Sophora flavescens (Fabaceae). Xanthorhiza simplicissima (Yellowroot) is a native American medicinal containing berberine, which is anti-inflammatory, astringent, hemostatic, antimicrobial, anticonvulsant, immunostimulant and uterotonic, and can temporarily lower blood pressure [59]. The roots of Sophora flavescens contain alkaloids such as oxymatrine and are commonly used to treat fever, asthma, cancer and viral myocarditis [60], [61]. Plant DNA assigning to the families Cannabaceae, Ranunculaceae, and Solanaceae, which are known to contain medicinally important species, was also recovered. However, to resolve these sequences beyond the family level another gene region would need to be targeted, and this might reveal, for example, whether the Solanaceae (Nightshade family) identified in four of the TCM samples comprised Solanum chrysotrichum (Giant Devil's Fig), which has known pharmacological activity [62], or perhaps less exotic species such as potato or tomato.
The complexity and risk of possible drug interactions for consumers using TCMs in combination with conventional medicines could be heightened when there are poisonous or toxic ingredients of unknown concentrations in herbal remedies that may not be listed on the packaging (Table 1). In addition to potential adverse drug interactions, there is the possibility of ingesting allergenic substances within herbal remedies, such as nuts, which can cause anaphylaxis in those with severe allergy. DNA from the Anacardiaceae (the cashew or sumac family) was detected in two TCMs; nut proteins from this family are known allergens [63]. Likewise, Glycine (soybean) was detected in four TCMs and is known to contain at least 16 potential protein allergens with the potential to cause adverse reactions ranging from mild rashes to life-threatening systemic anaphylaxis [64]. However, our results were unable to determine whether the recovered DNA is derived directly from the nut/bean, or originates from other plant tissue.
The variety of species that the HTS technique can reveal when analysing TCMs is demonstrated by the results obtained for the Yatong Yili Wan pills (TCM-016). This sample was one of the most botanically complex, containing 16 identifiable plant families. A total of 2,124 DNA sequence reads were assigned to a GenBank reference database sequence (Table 2; Figure 2), based on cut-offs in MEGAN (see methods). SAP analysis was also conducted on representative sequences from each of the terminal nodes. Results generated by SAP were in close accordance with the MEGAN assignments, with high posterior support. The two cases where no assignment was made were the result of insufficient database coverage, as the method is reliant upon sufficient sequence coverage to construct a phylogeny. A third assignment method, QIIME, was also implemented, the results of which were also in broad agreement with the MEGAN and SAP assignments (Figure 2).
What is clear from the plant assignments of the HTS data is that better reference databases involving multiple genes (such as trnL, rbcL, ITS and matK) are required to improve the species assignment. A medicinal materials DNA barcode database (MMDBD) is currently being generated and compiled to include thousands of DNA reference sequences for these and other genes covering species of plants, animals, insects and fungi that are commonly used in TCM (available at http://www.cuhk.edu.hk/icm/mmdbd.htm) [31]. The recent work of the China barcode of life group [65], which has sequenced ∼6000 species, may soon rectify inadequacies in the plant databases. Despite the constantly improving databases, the taxonomic framework under which the DNA assignments operate also needs to be scrutinised. What is reassuring about HTS data is that, while the required resolution may not currently be available, databases and the underpinning taxonomies are continually improving, and hence the accuracy of assignments can only get better.
With the potentially enormous volumes of plant data produced (7,662 reads in the case of TCM-006), it is tempting to look for quantitative signals in results. However, owing to various factors, including differential preservation of DNA in the raw ingredients, different processing techniques, and variation in PCR efficiency (due to amplicon length variation and primer binding site polymorphisms), a universal primer approach should be viewed as semi-quantitative at best. In the worst-case scenario a constituent may be entirely undetected, especially if it occurs at a very low abundance.
Analysis of vertebrate DNA in the TCM samples
With the exception of human-derived sequences (which were excluded), vertebrate genetic signatures were detected in nine samples tested using two universal 16S rRNA primer pairs [66], [67]. A total of eight animal genera were identified from 539 DNA sequences (Table 3). The taxonomic assignments of the vertebrate sequences were simpler in comparison to the plant assignments, due to substantially better GenBank coverage, but as with other forensic studies caution still needs to be exercised when assigning a species in casework [68], [69]. This study identified four TCM samples, Saiga Antelope Horn powder (TCM-011), Bear Bile powder (TCM-015), powder in box with bear outline (TCM-024) and Chu Pak Hou Tsao San powder (TCM-027), that were found to contain DNA from known CITES-listed species. Of these CITES species, Ursus thibetanus is classified by the IUCN Red List as vulnerable and Saiga tatarica as critically endangered (Table 3). The threat posed to the survival of these and other animal species by the demand for TCM products is high [7], [18]. This highlights a serious concern for the conservation of these species, and it is evident that illegal hunting still persists despite a high level of legal protection [70]. One hundred and seventy-five countries are signatories to CITES, including China (member party since 1981) [24], yet penalties for illegal trafficking are relatively minor and are rarely enforced [18]. DNA testing of highly processed medicines may assist in the successful prosecution of individuals who seek to profit from the illegal trade in endangered taxa. Likewise, such genetic screens will help to legitimise those medicines that contain components that are not trade restricted, but may still be confiscated on grounds of suspicion (e.g. TCM-003, 006 and 021).
Of the samples analysed using the 16S rRNA primers, 44% contained two or more animal species within the same sample (Table 3). Some of these species, such as water buffalo (Bubalus bubalis), Asiatic toad (of the genus Bufo), and domestic cow (Bos taurus), are known for their use in medicinal products [27], [71], whereas use of goat (Capra hircus) is less well represented in the literature, and goat may be used as a substitute for traditionally used animal species. As with all animal-containing products, the consumer needs to be aware of the possibility of zoonotic pathogens; such concerns have been raised previously in the context of TCM [7].
Consumers of TCMs need to be wary of honesty of food labelling [72], as in 78% of samples animal DNA was identified that had not been clearly labelled on the packaging (in either English or Chinese). This adulteration of medicine occurred in the Saiga Antelope Horn powder (TCM-011; Table 1), which claimed to be 100% pure, yet was found to also contain significant quantities of goat (caprine) and sheep (ovine) DNA (Table 3). In some TCMs, undeclared ingredients are used to reduce the cost of manufacture of the medicine by increasing the bulk of the powder, but it is impossible to determine why caprine and ovine DNA appeared in this product. Water buffalo (Bubalus bubalis), domestic cow (Bos taurus) and deer species were also not listed on the packaging of samples in which they were genetically identified (Table 1 and 3). The inadvertent consumption of undeclared animal products, such as bovid, found in 78% of the medicines risks violating certain religious and/or cultural strictures.
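The comparison between label claims and genetic detections described above amounts to a set difference per sample. A minimal Python sketch follows; the function names are hypothetical, and the single example record is a simplified rendering of the TCM-011 case described in the text (declared as pure Saiga antelope horn, with goat and sheep DNA also detected), not the full data from Table 1 or Table 3.

```python
from typing import Dict, Set, Tuple


def undeclared_taxa(detected: Set[str], declared: Set[str]) -> Set[str]:
    """Taxa found by sequencing that do not appear on the label."""
    return detected - declared


def percent_with_undeclared(samples: Dict[str, Tuple[Set[str], Set[str]]]) -> int:
    """Percentage of samples carrying at least one undeclared taxon.

    `samples` maps a sample ID to (detected taxa, declared taxa).
    """
    flagged = sum(1 for detected, declared in samples.values()
                  if undeclared_taxa(detected, declared))
    return round(100 * flagged / len(samples))


if __name__ == "__main__":
    # Simplified illustration of the TCM-011 case from the text.
    detected = {"Saiga antelope", "goat", "sheep"}
    declared = {"Saiga antelope"}
    print(sorted(undeclared_taxa(detected, declared)))
```

Because the genetic audit is qualitative, a comparison of this kind can flag undeclared taxa but says nothing about how much of each ingredient is present.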
Concluding remarks
The results of this study demonstrate that high-throughput DNA sequencing methods are an invaluable tool for analysing constituents within complex TCMs. The techniques used enabled the identification of a larger number of animal and plant taxa than would have been possible through morphological and/or biochemical means. HTS methodology is well suited to the analysis of highly processed and degraded DNA from TCMs, including powders, crystals, capsules, tablets, and herbal tea. If TCMs contain trade-restricted biological materials, or DNA from species known to synthesise toxic compounds, then better methods of detection are urgently required. Even in the 15 TCMs tested here, the occurrence of CITES-listed species, potentially toxic/allergenic plants and non-declared constituents was all too common. However, it should also be noted that the detection of DNA from a pharmaceutically active species does not necessarily indicate the presence of bioactive compounds: metabolomic analyses can be used in addition for the detection of specific compounds. For example, the bear-bile powder (TCM-015; Table 1 and Table 3) containing Asiatic black bear DNA was analysed using Gas Chromatography Mass Spectrometry and yielded a mass spectrum consistent with ursodeoxycholic acid (data not shown), an active component of bile that has been reported to reduce pain and inflammation [73].
In the future, TCM screening approaches that involve both genetic (for species composition) and metabolomic (for compound detection) approaches could represent the best way to audit complementary medicines. With regard to TCMs and complementary medicines as a whole, controls need to be implemented to ensure consumer safety and to minimise impacts on protected biota. It is also important that consumers are made fully aware of legal and health safety concerns that surround TCMs before adopting them as a treatment option. A recent opinion piece [74] stated “if TCM is to take its place in the modern medicine cabinet, then it must develop ways to prove itself” – we endorse this view and note that it applies equally to safety as it does to medical efficacy.
Materials and Methods
Sample collection, DNA extraction, and quantification
Twenty-eight TCM samples were obtained from the Wildlife trade section of the Department of Sustainability, Environment, Water, Population and Communities after being seized by Australian Customs and Border Protection Service at airports and seaports across Australia. The samples were seized because they contravened Australia's international wildlife trade laws as outlined under Part 13A of the Environment Protection and Biodiversity Conservation Act 1999 (EPBC Act). The samples were stored in a quarantine-approved facility within the laboratory after being catalogued. TCM sample types included: powders, bile flakes, capsules, tablets, and herbal tea. Small amounts of each sample (between 70–290 mg) were dispensed into 2.0 mL Eppendorf tubes and digested overnight, on a shaking heat block at 55°C, in 700 µl–1500 µl of tissue digest buffer consisting of; 1 mg per mL proteinase K (Amresco, OH, USA), 20 mM Tris pH 8.0 (Sigma, MO, USA), 2.5 mM EDTA (Invitrogen, CA, USA), 5 mM CaCl2 (Sigma), 20 mM DTT solution (Thermo Fisher Scientific, MA, USA), 1% SDS (Invitrogen), and milliQ water.
All samples were centrifuged after digestion for 3 minutes at 16,813×g. 200 µL of supernatant was mixed with 1 mL of Qiagen (CA, USA) PB buffer and transferred to a Qiagen (PCR cleanup) spin column and centrifuged for 1 minute at 16,813×g. Two wash steps followed (Qiagen AWI then AWII buffer) prior to elution of DNA from the spin column membrane with 50 µL of 10 mM Tris pH 8.0. The DNA extracts were then quantified via real-time quantitative polymerase chain reaction (qPCR; Applied Biosystems [ABI], USA) using trnL g/h [52] and 16S ribosomal RNA (rRNA) [66], [67] primers (Integrated DNA Technologies [IDT], USA) (Primer sequences displayed in Table S1). Samples were assessed for quality and quantity of DNA using qPCR at three DNA dilutions (undiluted, 1/10, 1/100) to determine if successful isolation of DNA was achieved, and to investigate the presence of PCR inhibition. The trnL g/h qPCR assay was conducted in 25 µL reactions using ABI Power SYBR master mix together with 0.8 µM of trnL g and trnL h primers and cycled at 95°C for 5 minutes followed by 40 cycles of 95°C for 30 s, 50°C for 30 s, 72°C for 30 s, with a 1°C melt curve stage and a 10 minute final extension at 72°C. The 16S qPCR was conducted using the same conditions, except for the primer concentration used, which was 0.4 µM and an annealing temperature of 57°C. An optimal DNA concentration, free of inhibition was selected and used for further analysis. Samples with low template amounts and/or severe inhibition were not processed further.
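The dilution series described above (undiluted, 1/10, 1/100) can reveal PCR inhibition because, in an uninhibited reaction, each ten-fold dilution should delay the quantification cycle (Cq) by about log2(10) ≈ 3.32 cycles at 100% amplification efficiency. The Python sketch below illustrates that reasoning only; the function names, the `tolerance` of one cycle and the example Cq values are assumptions for illustration and are not criteria or data reported in this study.

```python
import math


def expected_cq_shift(dilution_factor: float, efficiency: float = 1.0) -> float:
    """Cq delay expected after diluting template by `dilution_factor`,
    when each cycle multiplies product by (1 + efficiency)."""
    return math.log(dilution_factor) / math.log(1.0 + efficiency)


def looks_inhibited(cq_neat: float, cq_diluted: float,
                    dilution_factor: float = 10.0,
                    tolerance: float = 1.0) -> bool:
    """Flag possible inhibition when dilution delays Cq by much less than
    expected, i.e. the diluted extract amplifies disproportionately well.

    ASSUMPTION: `tolerance` (in cycles) is an arbitrary illustrative value.
    """
    observed_shift = cq_diluted - cq_neat
    return observed_shift < expected_cq_shift(dilution_factor) - tolerance


if __name__ == "__main__":
    print(round(expected_cq_shift(10.0), 2))   # ideal shift per 10-fold dilution
    # Hypothetical Cq values for two extracts (neat vs 1/10 dilution).
    print(looks_inhibited(30.0, 30.5))  # barely delayed: inhibition suspected
    print(looks_inhibited(25.0, 28.3))  # close to the ideal shift
```

In practice the dilution showing the expected shift, without evidence of inhibition, would be the one carried forward, which corresponds to the selection of an optimal DNA concentration described in the text.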
Amplicon generation
Fusion primers with unique 6 bp MID tags were designed [74] for both the 16S rRNA [65], [66] (∼150 bp product for 16Smam, ∼250 bp product for 16S1/2 degenerate primers [Table S1]) and the p-loop region of trnL [52] (c/h primers generating a size-variable product averaging ∼250 bp [Table S1]) (IDT, Australia). The trnL c/h primer set was used to generate a longer PCR amplicon for subsequent HTS, instead of the trnL g/h primer set (∼100 bp), which was used only for initial quantification. For the most part, qPCR with the c/h and g/h primer sets showed no significant drop in detected copy number for the longer product. For this reason we selected the longer c/h set, as it affords greater taxonomic resolution. Ten samples were PCR amplified using both the trnL c/h and 16S fusion primers, three samples were PCR amplified using trnL c/h fusion primers only, and two samples were PCR amplified with 16S fusion primers only. Amplicons were generated via PCR for each sample in triplicate (Corbett Research, NSW, Australia) and pooled in an attempt to reduce the effect of PCR stochasticity. The trnL c/h PCR was carried out in a 25 µL total volume including 4 µL of template DNA, with the following reagents: 2 mM MgCl2 (Fisher Biotec, Australia), 1× Taq polymerase buffer (Fisher Biotec, Australia), 0.4 µM dNTPs (Astral Scientific, Australia), 0.1 mg BSA (Fisher Biotec, Australia), 0.4 µM of each primer, and 0.25 µL of Taq DNA polymerase (Fisher Biotec, Australia). The PCR conditions were as follows: initial denaturation at 95°C for 5 minutes, followed by 50 cycles of 95°C for 30 s, 50°C for 30 s, 72°C for 30 s, and a final extension at 72°C for 10 minutes (Corbett Research, NSW, Australia). The 16S PCR was carried out in a 25 µL total volume including 4 µL of template DNA, with the same dNTP, primer and buffer concentrations, but with 2.5 mM MgCl2, 0.4 mg BSA, and 0.25 µL of AmpliTaq Gold DNA polymerase (ABI) instead.
The PCR conditions were as follows: initial denaturation at 95°C for 5 minutes, followed by 40 cycles of 95°C for 30 s, 54°C for 30 s, 72°C for 30 s, and a final extension at 72°C for 10 minutes (Corbett Research, NSW, Australia).
All PCR amplicons were double purified using the Agencourt AMPure XP Bead PCR Purification protocol (Beckman Coulter Genomics, MA, USA). The purified PCR amplicons were then electrophoresed together on the same 2% agarose gel to confirm the presence of the amplicons and to estimate DNA concentrations by comparing band intensities, prior to approximately equimolar amplicon pooling for emulsion PCR.
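The arithmetic behind equimolar pooling can be sketched as follows. Because amplicons differ in length, equal mass does not mean equal molarity: a concentration in ng/µL is converted to nM using the common approximation of 660 g/mol per base pair of double-stranded DNA, and the volume needed for a fixed molar amount follows directly (1 nM equals 1 fmol/µL). This is an illustrative sketch only; the sample names and concentrations are hypothetical, and the study estimated concentrations from gel band intensity, which gives only approximate inputs.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of double-stranded DNA


def nanomolar(ng_per_ul, length_bp):
    """Convert a dsDNA concentration in ng/µL to nM for an amplicon of given length."""
    return ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def pool_volumes(amplicons, fmol_each):
    """Return the volume (µL) of each amplicon that contributes `fmol_each` fmol.

    `amplicons` maps a sample name to (ng_per_ul, length_bp).
    Since 1 nM is 1 fmol/µL, volume = fmol / nM.
    """
    return {
        name: fmol_each / nanomolar(conc, length)
        for name, (conc, length) in amplicons.items()
    }


if __name__ == "__main__":
    # Hypothetical estimates: (ng/µL, amplicon length in bp)
    estimates = {
        "sample_A_trnL": (16.5, 250),
        "sample_A_16S": (9.9, 150),
    }
    for name, volume in pool_volumes(estimates, fmol_each=50).items():
        print(f"{name}: {volume:.2f} µL")
```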
GS Junior run set up for HTS
To achieve the desired bead∶template ratio, pooled PCR amplicons were quantified using a synthetic 200 bp oligonucleotide standard (of known molarity) with the Roche A and B primers engineered at either end [75]. Quantitative PCR on both the standard and the pooled library was required to approximate the optimal bead∶template ratio. The Roche GS Junior run set up included an emulsion PCR step, bead recovery, and the sequencing run. All of these procedures were carried out according to the Roche GS Junior protocols (http://www.454.com).
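Quantifying a pooled library against a standard of known molarity usually comes down to a standard curve: Cq is regressed on log10 of the standard's copy number, and the fitted line is inverted to estimate copies in the unknown. The sketch below shows that calculation under stated assumptions; the dilution points and Cq values are hypothetical, and nothing here is taken from the Roche protocol or from the authors' own analysis.

```python
import math


def fit_standard_curve(copies, cqs):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cqs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cqs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_copies(cq, slope, intercept):
    """Invert the standard curve to estimate template copies from a Cq value."""
    return 10 ** ((cq - intercept) / slope)


def efficiency(slope):
    """Amplification efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Hypothetical tenfold dilution series of the synthetic standard
    standard_copies = [1e3, 1e4, 1e5, 1e6]
    standard_cqs = [30.0, 26.7, 23.4, 20.1]
    slope, intercept = fit_standard_curve(standard_copies, standard_cqs)
    print(f"slope={slope:.2f}, efficiency={efficiency(slope):.2f}")
    print(f"library estimate: {estimate_copies(25.05, slope, intercept):.0f} copies")
```

The estimated copy number per microlitre of library can then be combined with the number of capture beads to approximate the template-to-bead ratio.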
Analysis of GS Junior HTS data
The sequencing output Fasta (.fna) and Quality (.qual) formatted files were processed using the following applications. Reads were quality trimmed using BARTAB [76] with a minimum acceptable quality score of 20, averaged over a window size of five bases, then separated into sample batches using a map file containing sample and primer-MID tag information. A non-redundant data set was also generated for each sample. Primer and MID tag sequences in the batched reads were masked with the cross_match application [77], using a minimum match length of 12 and a minimum score of 20, then trimmed using trimseq [78]. An alternative means of data sorting was also employed, using the “separate by barcode” and primer trim features in Geneious (v5.5). Once deconvoluted (based on MID tags), each batch of reads was searched using BLASTn version 2.2.23 [79] with gap penalties of five for existence and two for extension. The low complexity filter option was set to false, the number of hits was limited to 100, and the expected alignment value (E-value) was required to be less than 1e-10. The BLASTn search was run against the National Center for Biotechnology Information (NCBI) GenBank nucleotide NR database [80], containing all GenBank, EMBL, DDBJ and PDB sequences, a total of 13,504,325 database sequence entries. This dataset contained no EST, STS, GSS, environmental sample, or phase 0, 1 or 2 HTGS sequences; the database was posted on Oct 6, 2010, at 5:44 PM. This pipeline was automated in an Internet-based bioinformatics workflow environment, YABI (https://ccg.murdoch.edu.au/yabi/). The resultant BLAST files were imported into the program MEtaGenome ANalyzer (MEGAN version 4.62.1) [47] for taxonomic analysis and assignment of amplicon plant and animal sequence data, using the following lowest common ancestor parameters: min score of 65, top percent of 5, and min support of 1.
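The lowest common ancestor (LCA) parameters above can be illustrated with a simplified sketch of how such an assignment works for a single read: hits below the minimum bit score are discarded, hits within the top percent of the best remaining score are kept, and the read is placed at the deepest taxon shared by all kept hits. This is a conceptual approximation only and not MEGAN's implementation; lineages are represented here as plain tuples rather than NCBI taxonomy nodes, the example hits and scores are hypothetical, and the min support filter (which operates across reads) is not modelled.

```python
MIN_SCORE = 65.0    # minimum bit score for a hit to be considered
TOP_PERCENT = 5.0   # keep hits scoring within this percentage of the best hit


def lca_assign(hits, min_score=MIN_SCORE, top_percent=TOP_PERCENT):
    """Assign one read to the lowest common ancestor of its best hits.

    `hits` is a list of (bit_score, lineage) pairs, where lineage is a tuple of
    taxon names ordered from the root towards the species.
    Returns the shared lineage prefix, or an empty tuple if no hit qualifies.
    """
    kept = [(score, lineage) for score, lineage in hits if score >= min_score]
    if not kept:
        return ()
    best = max(score for score, _ in kept)
    cutoff = best * (1.0 - top_percent / 100.0)
    lineages = [lineage for score, lineage in kept if score >= cutoff]

    shared = []
    for ranks in zip(*lineages):  # walk down the ranks in parallel
        if len(set(ranks)) != 1:  # lineages diverge here
            break
        shared.append(ranks[0])
    return tuple(shared)


if __name__ == "__main__":
    # Hypothetical hits for one 16S read
    hits = [
        (200.0, ("Mammalia", "Carnivora", "Ursidae", "Ursus", "Ursus thibetanus")),
        (198.0, ("Mammalia", "Carnivora", "Ursidae", "Ursus", "Ursus arctos")),
        (150.0, ("Mammalia", "Carnivora", "Canidae", "Canis", "Canis lupus")),
    ]
    print(lca_assign(hits))
```

In this hypothetical case the two bear hits score within 5% of each other, so the read is placed at the genus rather than at either species, which reflects why short amplicons often resolve only to genus or family level.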
To compare the MEGAN assignments with other algorithms, we conducted a SAP analysis [48] on a subset of data from TCM-016, in which Bayesian trees were constructed from an alignment of at least 30 homologous sequences. QIIME [49] analysis was also implemented; however, establishing a valid reference alignment file proved difficult for the trnL of some of the TCM taxa.
The data described herein are available in processed and annotated form from the Dryad Digital Repository (http://dx.doi.org/10.5061/dryad.8ps58rp2) and in raw form from the Short Read Archive (accession number SRA047476).
Supporting Information
References
1. Peppin L, McEwing R, Carvalho GR, Ogden R (2008) A DNA-Based Approach for the Forensic Identification of Asiatic Black Bear (Ursus thibetanus) in a Traditional Asian Medicine. Journal of Forensic Sciences 53: 1358–1362.
2. Zhang Y, Shaw P, Sze C, Wang Z, Tong Y (2007) Molecular authentication of Chinese herbal materials. Journal of Food and Drug Analysis 15: 1.
3. Tang JL, Liu BY, Ma KW (2008) Traditional Chinese medicine. The Lancet 372: 1938–1940.
4. Mukherjee PK, Houghton PJ (2009) Evaluation of Herbal Medicinal Products: perspectives on quality, safety and efficacy. London: Pharmaceutical Press. 502 p.
5. Ernst E (2004) Risks of herbal medicinal products. Pharmacoepidemiology and drug safety 13: 767–771.
6. Sahoo N, Manchikanti P, Dey S (2010) Herbal drugs: Standards and regulation. Fitoterapia 81: 462–471.
7. Still J (2003) Use of animal products in traditional Chinese medicine: environmental impact and health hazards. Complementary therapies in medicine 11: 118–122.
8. Yang Y (2010) Chinese Herbal Formulas: Treatment Principles and Composition Strategies. London: Churchill Livingstone Elsevier. 450 p.
9. Xie P, Chen S, Liang Y, Wang X, Tian R (2006) Chromatographic fingerprint analysis–a rational approach for quality assessment of traditional Chinese herbal medicine. Journal of Chromatography A 1112: 171–180.
10. Heubl G (2010) New aspects of DNA-based authentication of Chinese medicinal plants by molecular biological techniques. Planta medica 76: 1063–1074.
11. WHO (2002) WHO Traditional Medicine Strategy 2002–2005. Geneva. pp. 1–74.
12. Sakurai M (2011) Perspective: Herbal dangers. Nature 480: S97.
13. Gilbert N (2011) Regulations: Herbal medicine rule book. Nature 480: S98–S99.
14. Ernst E (2000) Adverse effects of herbal drugs in dermatology. British Journal of Dermatology 143: 923–929.
15. Ernst E (2002) Toxic heavy metals and undeclared drugs in Asian herbal medicines. Trends in pharmacological sciences 23: 136–139.
16. Alves RRN, Rosa IML (2007) Biodiversity, traditional medicine and public health: where do they meet? Journal of Ethnobiology and Ethnomedicine 3: 14.
17. Sodhi NS, Koh LP, Brook BW, Ng PKL (2004) Southeast Asian biodiversity: an impending disaster. Trends in Ecology & Evolution 19: 654–660.
18. Graham-Rowe D (2011) Biodiversity: Endangered and in demand. Nature 480: S101–S103.
19. Feng Y, Siu K, Wang N, Ng KM, Tsao SW (2009) Bear bile: dilemma of traditional medicinal use and animal protection. Journal of Ethnobiology and Ethnomedicine 5: 2.
20. Milner-Gulland EJ, Bukreeva OM, Coulson T, Lushchekina AA, Kholodova MV (2003) Reproductive collapse in saiga antelope harems. Nature 422: 135.
21. Amin R, Thomas K, Emslie RH, Foose TJ, Strien N (2006) An overview of the conservation status of and threats to rhinoceros species in the wild. International Zoo Yearbook 40: 96–117.
22. Mihalov JJ, Der Marderosian A, Pierce JC (2000) DNA identification of commercial ginseng samples. J Agric Food Chem 48: 3744–3752.
23. Department of the Environment, Water, Heritage and the Arts (2010) Wildlife trade and conservation: EPBC Act list of threatened fauna. Commonwealth of Australia.
24. United Nations Environment Programme and World Conservation Monitoring Centre (2011) Checklist of CITES species. CITES Secretariat, Geneva, Switzerland, Cambridge, United Kingdom.
25. Sucher NJ, Carles MC (2008) Genome-based approaches to the authentication of medicinal plants. Planta medica 74: 603.
26. Linacre A (2008) The use of DNA from non-human sources. Forensic Science International: Genetics Supplement Series 1: 605–606.
27. Hempen CH, Fischer T (2009) A Materia Medica for Chinese Medicine: Plants, minerals and animal products. Wortman V, translator. Churchill Livingstone Elsevier. 1007 p.
28. Zhu S, Fushimi H, Komatsu K (2008) Development of a DNA microarray for authentication of ginseng drugs based on 18S rRNA gene sequence. Journal of agricultural and food chemistry 56: 3953–3959.
29. Chen SL, Yao H, Han JP, Liu C, Song JY (2010) Validation of the ITS2 region as a novel DNA barcode for identifying medicinal plant species. PLoS ONE 5: e8613. doi:10.1371/journal.pone.0008613
30. Xu H, Ying Y, Wang ZT, Cheng KT (2010) Identification of Dendrobium Species by Dot Blot Hybridization Assay. Biological & Pharmaceutical Bulletin 33: 665–668.
31. Lou SK, Wong KL, Li M, But PPH, Tsui SKW (2010) An integrated web medicinal materials DNA database: MMDBD (Medicinal Materials DNA Barcode Database). BMC Genomics 11: 1–8.
32. Hsieh HM, Lee JCI, Wu JH, Chen CA, Chen YJ (2010) Establishing the pangolin mitochondrial D-loop sequences from the confiscated scales. Forensic Science International: Genetics 5: 303–307.
33. Srirama R, Senthilkumar U, Sreejayan N, Ravikanth G, Gurumurthy BR (2010) Assessing species admixtures in raw drug trade of Phyllanthus, a hepato-protective plant using molecular tools. J Ethnopharmacol 130: 208–215.
34. Zha D, Xing X, Yang F (2010) A multiplex PCR assay for fraud identification of deer products. Food Control 21: 1402–1407.
35. Gao T, Yao H, Song J, Zhu Y, Liu C (2010) Evaluating the feasibility of using candidate DNA barcodes in discriminating species of the large Asteraceae family. BMC Evolutionary Biology 10: 324.
36. Gao T, Yao H, Song J, Liu C, Zhu Y (2010) Identification of medicinal plants in the family Fabaceae using a potential DNA barcode ITS2. Journal of Ethnopharmacology 130: 116–121.
37. Ma XQ, Duan JA, Zhu DY, Dong TTX, Tsim KWK (2000) Species identification of Radix Astragali (Huangqi) by DNA sequence of its 5S-rRNA spacer domain. Phytochemistry 54: 363–368.
38. Zhao KJ, Dong TTX, Tu PF, Song ZH, Lo CK (2003) Molecular genetic and chemical assessment of Radix Angelica (Danggui) in China. J Agric Food Chem 51: 2576–2583.
39. Stoeckle MY, Gamble CC, Kirpekar R, Young G, Ahmed S (2011) Commercial Teas Highlight Plant DNA Barcode Identification Successes and Obstacles. Sci Rep 1.
40. Rothberg JM, Leamon JH (2008) The development and impact of 454 sequencing. Nature Biotechnology 26: 1117–1124.
41. Bohmann K, Monadjem A, Noer CL, Rasmussen M, Zeale MRK (2011) Molecular Diet Analysis of Two African Free-Tailed Bats (Molossidae) Using High Throughput Sequencing. PLoS ONE 6: e21441. doi:10.1371/journal.pone.0021441
42. Haile J, Froese DG, MacPhee RDE, Roberts RG, Arnold LJ (2009) Ancient DNA reveals late survival of mammoth and horse in interior Alaska. Proceedings of the National Academy of Sciences 106: 22352–22357.
43. Fierer N, Lauber CL, Zhou N, McDonald D, Costello EK (2010) Forensic identification using skin bacterial communities. Proceedings of the National Academy of Sciences 107: 6477.
44. Binladen J, Gilbert MTP, Bollback JP, Panitz F, Bendixen C (2007) The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing. PLoS ONE 2: e197. doi:10.1371/journal.pone.0000197
45. Glenn TC (2011) Field guide to next generation DNA sequencers. Molecular Ecology Resources 11: 759–769.
46. Little DP (2011) DNA Barcode Sequence Identification Incorporating Taxonomic Hierarchy and within Taxon Variability. PLoS ONE 6: e20552. doi:10.1371/journal.pone.0020552
47. Huson DH, Auch AF, Qi J, Schuster SC (2007) MEGAN analysis of metagenomic data. Genome Research 17: 377–386.
48. Munch K, Boomsma W, Huelsenbeck JP, Willerslev E, Nielsen R (2008) Statistical Assignment of DNA Sequences Using Bayesian Phylogenetics. Systematic Biology 57: 750–757.
49. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD (2010) QIIME allows analysis of high-throughput community sequencing data. Nat Methods 7: 335–336.
50. Gilles A, Meglecz E, Pech N, Ferreira S, Malausa T (2011) Accuracy and quality assessment of 454 GS-FLX Titanium pyrosequencing. BMC Genomics 12: 245.
51. Kircher M, Sawyer S, Meyer M (2012) Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res 40: e3.
52. Taberlet P, Coissac E, Pompanon F, Gielly L, Miquel C (2007) Power and limitations of the chloroplast trnL (UAA) intron for plant DNA barcoding. Nucleic Acids Research 35: e14.
53. Gehrmann B, Koch W-G, Tschirch CO, Brinkmann H (2005) Medicinal Herbs: a Compendium. New York: The Haworth Herbal Press. 228 p.
54. He SM, Chan E, Zhou SF (2011) ADME Properties of Herbal Medicines in Humans: Evidence, Challenges and Strategies. Current Pharmaceutical Design 17: 357–407.
55. Kim YS, Kim JJ, Cho KH, Jung WS, Moon SK (2008) Biotransformation of ginsenoside Rb1, crocin, amygdalin, geniposide, puerarin, ginsenoside Re, hesperidin, poncirin, glycyrrhizin, and baicalin by human fecal microflora and its relation to cytotoxicity against tumor cells. Journal of Microbiology and Biotechnology 18: 1109–1114.
56. Zhang L, Hua N, Sun S (2008) Wildlife trade, consumption and conservation awareness in southwest China. Biodiversity and Conservation 17: 1493–1516.
57. Bent S, Padula A, Neuhaus J (2004) Safety and efficacy of citrus aurantium for weight loss. The American journal of cardiology 94: 1359–1361.
58. Schaneberg BT, Khan IA (2004) Analysis of products suspected of containing Aristolochia or Asarum species. Journal of Ethnopharmacology 94: 245–249.
59. Foster S, Duke JA (1998) Field Guide to Medicinal Plants: Eastern and Central North America. Houghton Mifflin. 432 p.
60. Zheng P, Niu F, Liu W, Shi Y, Lu L (2005) Anti-inflammatory mechanism of oxymatrine in dextran sulfate sodium-induced colitis of rats. World Gastroenterology 11: 4012.
61. Zhang Y, Zhu H, Ye G, Huang C, Yang Y, Chen R (2006) Antiviral effects of sophoridine against coxsackievirus B3 and its pharmacokinetics in rats. Life Sci 78: 1998–2005.
62. Herrera-Arellano A, Jiménez-Ferrer E, Vega-Pimentel AM, Martínez-Rivera ML, Hernández-Hernández M (2004) Clinical and mycological evaluation of therapeutic effectiveness of Solanum chrysotrichum standardized extract on patients with Pityriasis capitis (dandruff). A double blind and randomized clinical trial controlled with ketoconazole. Planta Medica 70: 483–488.
63. Robotham JM, Wang F, Seamon V, Teuber SS, Sathe SK (2005) Ana o 3, an important cashew nut (Anacardium occidentale L.) allergen of the 2S albumin family. Journal of allergy and clinical immunology 115: 1284–1290.
64. Cordle CT (2004) Soy Protein Allergy: Incidence and Relative Severity. The Journal of Nutrition 134: 1213S–1219S.
65. China Plant BOL Group, Li D-Z, Gao L-M, Li H-T, Wang H (2011) Comparative analysis of a large dataset indicates that internal transcribed spacer (ITS) should be incorporated into the core barcode for seed plants. Proceedings of the National Academy of Sciences 108: 19641–19646.
66. Taylor PG (1996) Reproducibility of ancient DNA sequences from extinct Pleistocene fauna. Molecular Biology and Evolution 13: 283–285.
67. Deagle BE, Gales NJ, Evans K, Jarman SN, Robinson S (2007) Studying seabird diet through genetic analysis of faeces: a case study on macaroni penguins (Eudyptes chrysolophus). PLoS ONE 2: e831. doi:10.1371/journal.pone.0000831
68. Linacre A, Gusmao L, Hecht W, Hellmann AP, Mayr WR (2010) ISFG: Recommendations regarding the use of non-human (animal) DNA in forensic genetic investigations. Forensic Science International: Genetics 5: 501–505.
69. Coghlan ML, White NE, Parkinson L, Haile J, Spencer P (2012) Egg forensics: An appraisal of DNA sequencing to assist in species identification of illegally smuggled eggs. Forensic Science International: Genetics 6: 268–273.
70. Tobe SS, Linacre A (2009) Identifying endangered species from degraded mixtures at low levels. Forensic Science International: Genetics Supplement Series 2: 304–305.
71. Liu R, Wang M, Duan J, Guo J, Tang Y (2010) Purification and identification of three novel antioxidant peptides from Cornu Bubali (water buffalo horn). Peptides 31: 786–793.
72. Yee S-K, Chu S-S, Xu Y-M, Choo P-L (2005) Regulatory control of Chinese Proprietary Medicines in Singapore. Health Policy 71: 133–149.
73. Espinoza EO, Shafer JA, Hagey LR (1993) International trade in bear gall bladders: Forensic source inference. Journal of Forensic Sciences 38: 1363.
74. Xu Z (2011) Modernization: One step at a time. Nature 480: S90–S92.
75. Bunce M, Oskam C, Allentoft M (2012) The Use of Quantitative Real-Time PCR in Ancient DNA Research. In: Shapiro B, Hofreiter M, editors. Methods in Molecular Biology - Ancient DNA 840. Humana Press Series. pp. 121–132.
76. Frank D (2009) BARCRAWL and BARTAB: software tools for the design and implementation of barcoded primers for highly multiplexed DNA sequencing. BMC Bioinformatics 10: 362.
77. de la Bastide M, McCombie WR (2007) Assembling Genomic DNA Sequences with PHRAP. Current Protocols in Bioinformatics 17: 11.4.1–11.4.15.
78. Rice P, Longden I, Bleasby A (2000) EMBOSS: the European molecular biology open software suite. Trends in genetics 16: 276–277.
79. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. Journal of molecular biology 215: 403–410.
80. Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL (2006) GenBank. Nucleic Acids Research 34: D16–D20.
Article published in PLOS Genetics, 2012, Issue 4.