2.39: Genetic Variation - Biology

What helps ensure the survival of a species?

Genetic variation. It is this variation that is the essence of evolution. Without genetic differences among individuals, "survival of the fittest" would not be likely. Either all survive, or all perish.

Genetic Variation

Sexual reproduction generates a practically unlimited range of genetic variation. In other words, sexual reproduction results in offspring that are genetically unique. They differ from both parents and also from each other. This occurs for a number of reasons.

  • When homologous chromosomes form pairs during prophase I of meiosis I, crossing-over can occur. Crossing-over is the exchange of genetic material between homologous chromosomes. It results in new combinations of genes on each chromosome.
  • When cells divide during meiosis, homologous chromosomes are randomly distributed to daughter cells, and different chromosomes segregate independently of each other. This is called independent assortment. It results in gametes that have unique combinations of chromosomes.
  • In sexual reproduction, two gametes unite to produce an offspring. But which two of the millions of possible gametes will it be? This is likely to be a matter of chance. It is obviously another source of genetic variation in offspring. This is known as random fertilization.

All of these mechanisms working together result in an amazing amount of potential variation. Each human couple, for example, has the potential to produce more than 64 trillion genetically unique children. No wonder we are all different!

See Sources of Variation for additional information.


Crossing-over occurs during prophase I, and it is the exchange of genetic material between non-sister chromatids of homologous chromosomes. Recall that during prophase I, homologous chromosomes line up in pairs, gene-for-gene down their entire length, forming a configuration with four chromatids, known as a tetrad. At this point, the chromatids are very close to each other, and some material from two chromatids switches chromosomes; that is, the material breaks off and reattaches at the same position on the homologous chromosome (Figure below). This exchange of genetic material can happen many times within the same pair of homologous chromosomes, creating unique combinations of genes. This process is also known as recombination.

Crossing-over. A maternal strand of DNA is shown in red. A paternal strand of DNA is shown in blue. Crossing over produces two chromosomes that have not previously existed. The process of recombination involves the breakage and rejoining of parental chromosomes (M, F). This results in the generation of novel chromosomes (C1, C2) that share DNA from both parents.
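The exchange described above can be illustrated with a small sketch. It is a toy model, not a description of the molecular mechanism: each chromatid is represented as a string of gene labels, and the function name `single_crossover` and the six-gene chromosomes are assumptions made purely for illustration.

```python
def single_crossover(maternal, paternal, breakpoint):
    """Swap everything after `breakpoint` between two aligned chromatids.

    Toy model: each chromatid is a string with one character per gene.
    The exchange happens at the same position on both chromatids, as in
    the text, so the recombinants keep the original length and gene order.
    """
    if len(maternal) != len(paternal):
        raise ValueError("homologous chromatids must have the same length")
    if not 0 < breakpoint < len(maternal):
        raise ValueError("breakpoint must fall inside the chromatid")
    c1 = maternal[:breakpoint] + paternal[breakpoint:]
    c2 = paternal[:breakpoint] + maternal[breakpoint:]
    return c1, c2


# Upper case = maternal alleles (M), lower case = paternal alleles (F).
M = "ABCDEF"
F = "abcdef"
C1, C2 = single_crossover(M, F, breakpoint=2)
print(C1, C2)  # ABcdef abCDEF
```

Both recombinant chromatids (C1, C2) carry a combination of alleles that neither parental chromosome had, which is the point made in the figure caption.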

Independent Assortment and Random Fertilization

In humans, there are over 8 million configurations in which the chromosomes can line up during metaphase I of meiosis. It is the specific processes of meiosis, resulting in four unique haploid cells, that result in these many combinations. This independent assortment, in which the chromosome inherited from either the father or mother can sort into any gamete, produces the potential for tremendous genetic variation. Together with random fertilization, more possibilities for genetic variation exist between any two people than the number of individuals alive today. Sexual reproduction involves the random fertilization of a gamete from the female by a gamete from the male. In humans, over 8 million (2²³) chromosome combinations exist in the production of gametes in both the male and female. A sperm cell, with over 8 million possible chromosome combinations, fertilizes an egg cell, which also has over 8 million possible chromosome combinations. That is over 64 trillion unique combinations, not counting the unique combinations produced by crossing-over. In other words, each human couple could produce a child with any one of over 64 trillion unique chromosome combinations!
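The arithmetic behind these figures can be checked in a few lines. This is a minimal sketch that counts chromosome combinations only (crossing-over is ignored, as in the text); the variable names are chosen here for illustration.

```python
HOMOLOGOUS_PAIRS = 23  # pairs of homologous chromosomes in a human cell

# Each pair can orient in two ways at metaphase I, independently of the others.
gametes_per_parent = 2 ** HOMOLOGOUS_PAIRS

# Random fertilization: any sperm combination can meet any egg combination.
zygote_combinations = gametes_per_parent ** 2

print(f"{gametes_per_parent:,}")   # 8,388,608 (over 8 million)
print(f"{zygote_combinations:,}")  # 70,368,744,177,664
```

The exact product is about 70 trillion; the "over 64 trillion" in the text comes from rounding each parent's total down to 8 million before multiplying.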

See How Cells Divide: Mitosis vs. Meiosis for an animation comparing the two processes.


  • Sexual reproduction has the potential to produce tremendous genetic variation in offspring.
  • This variation is due to independent assortment and crossing-over during meiosis, and random union of gametes during fertilization.

Explore More

Use this resource to answer the questions that follow.

  • Genetic Variation
  1. What is meant by genetic variation?
  2. Would natural selection occur without genetic variation? Explain your response.
  3. What causes genetic variation?
  4. How would genetic variation result in a change in phenotype?
  5. What are the sources of genetic variation? Explain your response.


  1. What is crossing-over and when does it occur?
  2. Describe how crossing-over, independent assortment, and random fertilization lead to genetic variation.
  3. How many combinations of chromosomes are possible from sexual reproduction in humans?
  4. Create a diagram to show how crossing-over occurs and how it creates new gene combinations on each chromosome.

Epigenetic hereditary transcription profiles II, aging revisited

Previously, we have shown that deviations from the average transcription profile of a group of functionally related genes can be epigenetically transmitted to daughter cells, thereby implicating nuclear programming as the cause. As a first step in further characterizing this phenomenon, it was necessary to determine to what extent such deviations occur in non-tumorigenic tissues derived from normal individuals. To this end, a microarray database derived from 90 human donors aged between 22 and 87 years was used to study deviations from the average transcription profile of the proteasome genes.


Increase in donor age was found to correlate with a decrease in deviations from the general transcription profile with this decline being gender-specific. The age-related index declined at a faster rate for males although it started from a higher level. Additionally, transcription profiles from similar tissues were more alike than those from different tissues, indicating that deviations arise during differentiation.


These findings suggest that aging and differentiation are related to epigenetic changes that alter the transcription profile of proteasomal genes. Since alterations in the structure and function of the proteasome are unlikely, such changes appear to occur without concomitant change in gene function.

These findings, if confirmed, may have a significant impact on our understanding of the aging process.

Open peer review

This article was reviewed by Nathan Bowen (nominated by I. King Jordan), Timothy E. Reddy (nominated by Charles DeLisi) and by Martijn Huynen. For the full reviews, please go to the Reviewers' comments section.


Germinating seedlings of monocotyledons have a coleoptile, which is a sheath-like tissue covering the primary leaf to protect the emerging shoot as it breaks through the soil to the surface. The coleoptile stops growing after the first true leaf pushes through the pore at the tip. The coleoptile is essential for early crop establishment and its length determines the maximum depth at which the seed can be sown [36, 40, 45, 46, 57]. If seeds are sown at a depth greater than their coleoptile length, this may result in a lower emergence rate, reduced early growth, fewer tillers, and decreased grain yield [17, 45, 50]. In agricultural growing areas prone to drought, the topsoil moisture can be insufficient for seed germination, and seeds need to be sown deeper to access enough moisture [31, 51] and lower temperatures [33]. Therefore, varieties with longer coleoptiles are preferable in water-limited growing regions; for example, winter wheat grown in the low water supply areas of the Pacific Northwest of the United States is sown at depths of 10 to 20 cm [52]. In Australian varieties, coleoptile length-promoting genes were found to increase the emergence of wheat seedlings at sowing depths of 12 cm without affecting plant height [12]. Deep seeding may also reduce the threat of damage by mice or other animals [8], and protect the seedlings from pre-emergent herbicides [38].

Auxins are a class of plant hormones that can modify plant cell walls, and are essential for coleoptile cell elongation and expansion [9]. In cereal coleoptiles, the most significant cell wall modifications induced by auxin are the decline of noncellulosic glucan content [23, 35, 48] and the degradation of 1,3:1,4-β-glucan [49], brought about by activating exo- and endo-β-glucanases associated with cell walls [25, 26, 30]. This partial degradation loosens the cell wall, thereby increasing the cells' extensibility. Auxin also increases the level of H⁺-ATPase in the plasma membrane, which raises H⁺ extrusion into the apoplast and adjusts it to an optimum pH for cell wall enlargement [18]. Some studies suggest that the potassium channel gene Zea mays K⁺ channel 1 (ZMK1) is upregulated by auxin and is essential for coleoptile elongation by maintaining K⁺ accumulation and turgor [41]. Some other cell wall-bound proteins were also reported to be key regulators of cell extension, such as α-expansin in rice [24] and β-expansin in wheat coleoptiles [14]. Nucleoside diphosphate (NDP) kinase genes were also reported to be involved in coleoptile elongation [39]. Although auxin is known to activate a group of genes responsible for cell expansion, the exact mechanisms underlying coleoptile elongation remain unclear.

Barley varieties show significant differences for coleoptile length. Paynter and Clarke (2010) [33] determined the coleoptile length for a total of 44 barley cultivars with different breeding origins, early growth habits (erect or prostrate) and pedigrees. In this collection the coleoptile length ranged from 38.7 mm (cultivar Morrell from Western Australia (WA)) to 92.9 mm (cultivar Doolup from WA), with an average of 70.2 mm. They concluded that coleoptile length was not associated with breeding origin and early growth habit in their barley collection. Takeda and Takahashi (1999) [58] scored 5082 barley and 1214 wheat varieties and found significant differences in deep-seeding tolerance, which related to the coleoptile length, first internode length and the seed size. Subsequent studies conducted QTL analyses for coleoptile length using several barley doubled haploid (DH) mapping populations: Takahashi et al. (2001) [57] used two different DH populations (Harrington × TR306 and Steptoe × Morex) and identified QTLs for deep-seeding tolerance, coleoptile length and first internode length on the long arm of chromosome 5H, corresponding with QTLs for abscisic acid and gibberellic acid response. Takahashi et al. (2008) [56] used another Harrington × TR306 population and mapped QTLs for coleoptile elongation on chromosomes 1H, 2H, 4H, 5H, 6H and 7H. However, only 127 markers for the Harrington × TR306 population, and 223 markers for the Steptoe × Morex population were available. As a result, detected QTLs spanned 2 to 5 cM intervals across the seven barley chromosomes, which was too low-resolution to pinpoint candidate genes. To date, no specific candidate genes for barley coleoptile length have been reported.

In this study, we performed genome-wide association mapping with more than 30,000 genetic markers to map the marker-trait associations (MTAs) for coleoptile length. We used a worldwide collection of mainly domesticated barley cultivars (a total of 328 accessions), including a large proportion of barley cultivars grown in the driest regions in the world such as Australia. The aims of this study were (i) to investigate the phenotypic variation of coleoptile length in a diverse worldwide collection of barley genotypes, (ii) to determine genomic regions associated with coleoptile length via GWAS, and (iii) to identify and characterise the most likely candidate genes underlying the MTAs.


MA experiments and whole-genome sequencing

We conducted long-term MA experiments on A. thaliana in both single-seed descent lineages and populations grown under Control (day 23 °C / night 18 °C), Heat (day 32 °C / night 27 °C), and Warming (day 28 °C / night 23 °C) conditions (Fig. 1a, b) (see “Methods”). The elevated temperature treatments, especially Heat (32 °C), resulted in various stress symptoms such as significantly decreased leaf size, shorter siliques (Fig. 1c, d), and shorter generation times. We sequenced 35 A. thaliana genomes, including 15 plants from MA lines at generation 10 (G10; five plants from each treatment) and 15 plants from MA populations [five plants each from G16 (Control, A16), G19 (Warming, C19), and G22 (Heat, B22)], spanning 10–22 successive generations, as well as their ancestor genomes (five individual plants from G0). In total, approximately 165 Gb of clean reads (30 libraries) from 30 genomes of progeny, and 25 Gb of clean reads from five libraries (see “Methods”) representing the genetic background of the ancestor (Additional file 1: Table S1) were obtained. For all MA lines and populations, an average of 99.68% of sequenced reads was mapped to the A. thaliana reference genome, with average depths of 52.5×, 49.7×, 47.4×, 42.7×, 37.4×, and 36.3× per individual in D10 (Control), E10 (Heat), F10 (Warming), A16 (Control), B22 (Heat), and C19 (Warming), respectively. Accordingly, an average of 116 Mb (96.9%) of the reference genome was accessible for variant calling (Additional file 1: Table S1). To obtain sufficient coverage of the genetic background of the ancestor, the five G0 libraries (average coverage 37.1×) were combined. This sequencing depth/coverage and number of accessible reference sites allowed for precise detection of mutations at the whole-genome level.

Schematic illustration and morphological comparison of A. thaliana grown under Control, Heat, and Warming conditions. a Schematic illustration of A. thaliana mutation accumulation (MA) lines and populations. Two MA experiments were conducted in this study (see “Methods”). For MA line experiments, seeds from a single Col-0 ancestor plant were grown independently under Control (D), Heat (E), or Warming (F) conditions for 10 successive generations. Five 10th generation (Generation 10, G10) plants (five MA lines) from each treatment (D10, E10, F10) were used for individual whole-genome sequencing. For MA population experiments, seeds from the same ancestor plant as the MA lines were divided into three groups (35 seedlings per group) and planted under Control conditions (A) for 16 generations, Warming conditions (C) for 19 generations, or Heat conditions (B) for 22 generations [the first 9 generations were grown under gradual warming, i.e., an increase of 1 °C per generation (from 24/18 °C to 32/27 °C [day/night]); the following 13 generations were grown at constant 32/27 °C]. Five 16th, 22nd, and 19th generation plants from each treated population were also randomly selected for sequencing. To maximize coverage and provide progenitor background genetic information (reference genome sequence) for MA experiments, five individuals (G0) were combined for sequencing. Genome-sequenced plants from MA lines and populations are highlighted in yellow- and grey-shaded (blue outline) boxes, respectively; see also Additional file 1: Table S1. b Growth status of MA plants exposed to Control, Heat, and Warming conditions at stage 5 (bolting) and stages 8–9 (silique ripening and senescence). Leaves at stage 5 (major axis ≤ 1 cm) were sampled for DNA extraction and sequencing. Scale bar, 5 cm. c Ripened siliques from the Control, Heat, and Warming treatments. Scale bar, 0.5 cm. d Phenotypic statistics of leaf area and silique length under different temperature treatments. Leaves at stage 5 (bolting) and siliques at stage 9 were measured. The experiments were repeated three times and the data are presented as means ± standard errors of the mean (SEMs; n = 30). Significant differences were revealed using analysis of variance (ANOVA) with post hoc tests (*p < 0.05, **p < 0.01 vs. Control or Warming)

Accumulated mutations and mutation rates in MA lines and populations under elevated temperatures

We obtained a total of 211 homozygous de novo mutations from MA lines under three temperature treatments (Fig. 2a and Additional file 1: Tables S2-S3), including 39 mutations (31 single-nucleotide variants [SNVs] and 8 indels) in D10 (Control), 98 mutations (69 SNVs and 29 indels) in E10 (Heat), and 74 mutations (54 SNVs and 20 indels) in F10 (Warming). Most (85.9%) of the 57 indels in the MA lines were short (1–3 bp) deletions (dels) and insertions (ins) (Fig. 2c). Furthermore, the indels of E10 (25 dels vs. 4 ins) and F10 (17 dels vs. 3 ins) showed strong biases toward dels. In addition, we also detected 376 homozygous de novo mutations in MA populations, including 70 mutations (60 SNVs and 10 indels) in A16 (Control), 183 mutations (130 SNVs and 53 indels) in B22 (Heat), and 123 mutations (88 SNVs and 35 indels) in C19 (Warming) (Fig. 2b and Additional file 1: Tables S2-S3). Similar to the indels identified from MA lines, most indels in the MA populations were short (1–3 bp) and biased toward dels in B22 (42 dels vs. 11 ins) and C19 (22 dels vs. 13 ins) (Fig. 2d). Moreover, we found no novel transposable element (TE) insertion event in any MA line or population.

Distribution across chromosomes of de novo mutations [single-nucleotide variants (SNVs) and small insertions and deletions (indels)] detected in genomes of Arabidopsis from the Heat, Warming, and Control MA lines and populations. a, b Labels indicate the type of mutation; colors indicate the functional class or predicted consequence. Single-base insertions (ins) and deletions (dels) are indicated by base letters preceded by a plus and minus sign, respectively. Large ins and dels are indicated by a plus (with the number of inserted base pairs) and minus sign (with the number of deleted base pairs), respectively. Individual colors indicate intergenic region (red), intron (yellow), synonymous/non-frameshift (orange), nonsynonymous/frameshift/stop gain (blue), UTR3/5 (purple), upstream/downstream (green), splicing (pink), transposable element (violet), and noncoding/pseudogene (lake blue) mutations. Red labels in each MA population indicate the same mutations detected in at least two sequenced samples. c, d Frequencies and categories of ins and dels were determined based on their indel lengths (see also Additional file 1: Table S3)

We further estimated the accuracy of the mutation calling pipelines using two simulation tests [14, 35]. For the first test, we simulated 600 random SNVs using six copies of reference genomes (see “Methods”). After read mapping and SNV filtering against the mutated reference genomes, our pipeline recovered 588 (98%) of 600 expected SNVs (Additional file 1: Table S4). For the second simulation test, we introduced homozygous SNVs and performed heterozygous SNV filtering, resulting in the recovery of 71–91% homozygous SNVs (Additional file 1: Table S5). To confirm our mutation calls, we experimentally examined all SNVs and indels from MA lines by Sanger sequencing. In total, 205 of 211 mutations were confirmed (six mutations were identified as PCR failures) (Additional file 1: Table S6).

We estimated the SNV mutation rate (μSNV) and indel mutation rate (μindel) per site per generation in the MA lines. Multigenerational growth of A. thaliana under Heat conditions caused significant increases relative to Control D in the average rates of SNVs [μE-SNV = 1.18 (± 0.09) × 10⁻⁸ vs. μD-SNV = 5.28 (± 0.95) × 10⁻⁹; two-sample t test, p = 1.0 × 10⁻³] and indels [μE-indel = 4.94 (± 0.50) × 10⁻⁹ vs. μD-indel = 1.36 (± 0.43) × 10⁻⁹, p = 6.3 × 10⁻⁴; Fig. 3a]. Similarly, Warming also increased the SNV [μF-SNV = 9.21 (± 0.68) × 10⁻⁹, p = 9.9 × 10⁻³] and indel [μF-indel = 3.41 (± 0.71) × 10⁻⁹, p = 0.04] mutation rates. Furthermore, the mutation rates of dels were more than 5-fold higher than those of ins in both Heat E [μE-del = 4.26 (± 0.47) × 10⁻⁹ vs. μE-ins = 6.82 (± 1.70) × 10⁻¹⁰] and Warming F [μF-del = 2.90 (± 0.88) × 10⁻⁹ vs. μF-ins = 5.12 (± 3.41) × 10⁻¹⁰], in contrast to the lack of difference observed in Control D. The overall MA mutation rates (SNVs and indels) of the Heat and Warming lines were 1.67 (± 0.06) × 10⁻⁸ (μE-total) and 1.26 (± 0.13) × 10⁻⁸ (μF-total) per site per generation, approximately 2.5-fold (p = 8.6 × 10⁻⁶) and 1.9-fold (p = 4 × 10⁻³) higher than the Control [μD-total = 6.65 (± 0.83) × 10⁻⁹], respectively (Fig. 3a). In addition, we observed significantly higher rates of total mutations and SNVs in Heat E compared to Warming F (p < 0.05), whereas the difference in indel rates was not significant (p = 0.1).
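As a rough cross-check of these figures, the total rates can be approximated from the mutation counts reported for the MA lines (39, 98, and 74 mutations in D10, E10, and F10). The sketch below is a pooled back-of-the-envelope estimate, not the authors' pipeline. It assumes that every sequenced plant has the reported average of 116 Mb accessible sites, and that a rate is simply mutations divided by (plants × generations × sites). The function name `pooled_rate` is introduced here for illustration.

```python
# Homozygous de novo mutations (SNVs + indels) per treatment, from the text.
MUTATION_COUNTS = {"Control D10": 39, "Heat E10": 98, "Warming F10": 74}

ACCESSIBLE_SITES = 116e6   # average accessible reference sites (assumed equal per plant)
PLANTS_PER_TREATMENT = 5   # sequenced MA lines per treatment
GENERATIONS = 10           # generations of mutation accumulation


def pooled_rate(n_mutations, plants=PLANTS_PER_TREATMENT,
                generations=GENERATIONS, sites=ACCESSIBLE_SITES):
    """Approximate mutations per site per generation, pooled over plants."""
    return n_mutations / (plants * generations * sites)


rates = {name: pooled_rate(n) for name, n in MUTATION_COUNTS.items()}
control = rates["Control D10"]

for name, rate in rates.items():
    fold = rate / control
    print(f"{name}: {rate:.2e} per site per generation ({fold:.1f}-fold vs. Control)")
```

Under these assumptions the pooled estimates land within about 2% of the reported per-line means, and the fold changes round to the 2.5-fold and 1.9-fold increases given in the text. The small differences are expected, because the reported values are averages over individual lines with their own accessible-site counts.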

Estimation of mutation rates of observed mutations (SNVs, indels) and molecular spectra in Control, Warming, and Heat MA lines and populations. a, b SNV, indel, and total mutation rates (per site per generation) of de novo mutations in MA lines and populations subjected to different temperature treatments. Significant differences were revealed using a two-tailed Student’s t test (*p < 0.05, **p < 0.01 compared to the Control or Warming treatments). c, d Mutation rates of different mutation types in MA lines and populations subjected to different temperature treatments. Conditional rates of each mutation type per site per generation were estimated by dividing the number of observed mutations by the number of analyzed sites capable of producing a given mutation and the number of generations of MA in each Control, Warming, and Heat lineage and population lineage. Error bars indicate SEM. e, f Mutation frequencies (per genome per generation) of transition and transversion mutations accumulated in MA lines and populations subjected to different temperature treatments. Significant differences were revealed using a two-tailed Student’s t test (*p < 0.05, **p < 0.01 compared to the Control or Warming treatments). g Transition/transversion ratios (Ts/Tv) of SNVs accumulated in MA lines and populations subjected to different temperature treatments

In parallel, we also estimated the average mutation rates in MA populations (Fig. 3b). The plants grown under Heat conditions had significantly higher SNV rates than Control plants [μB-SNV (Heat) = 1.03 (± 0.06) × 10⁻⁸ vs. μA-SNV (Control) = 6.53 (± 0.76) × 10⁻⁹; p = 1.6 × 10⁻⁴], whereas no significant difference was observed between Warming [μC-SNV = 8.08 (± 0.63) × 10⁻⁹] and Control conditions (p = 0.2). However, the total mutation rates of Heat B and Warming C were 1.45 (± 0.09) × 10⁻⁸ (μB-total) and 1.13 (± 0.09) × 10⁻⁸ (μC-total), nearly 2.0- and 1.5-fold (p < 0.05) higher than in Control A [μA-total = 7.61 (± 0.08) × 10⁻⁹], respectively (Fig. 3b). Additionally, compared to Warming C, Heat B had significantly higher total mutation and SNV rates (p < 0.05). However, the SNV rates and total mutation rates were lower in both the Heat and Warming populations than in the MA lines under elevated temperatures.

Molecular spectra of mutations in A. thaliana under elevated temperatures

Base substitution mutation spectra varied after multigenerational growth of A. thaliana under Heat, Warming, and Control conditions. We found a strong C:G → T:A bias (driven by C → T and G → A) in six mutational spectra that commonly occurred in MA lines under all three temperature treatments (Fig. 3c); however, C:G → T:A mutations under elevated temperatures (Heat and Warming) had much higher rates compared to Control. Furthermore, compared to Heat E (μE-C:G → T:A = 1.47 × 10⁻⁸ per site per generation), Warming F exhibited a lower C:G → T:A mutation rate (μF-C:G → T:A = 1.23 × 10⁻⁸). In addition, in Heat E and Warming F, the second most frequent substitution was A:T → T:A (mutation rate, μE-A:T → T:A = 3.20 × 10⁻⁹); however, this differed from Control D, in which the second most frequent substitution was A:T → G:C (μE-A:T → G:C = 2.39 × 10⁻⁹). In general, the mean rate of mutations occurring at C:G sites was nearly 3-fold higher than at A:T sites in Heat E and Warming F (Fig. 3c), in contrast to 2-fold in Control D. In MA populations, we observed similar results under Heat and Warming (Fig. 3d); for example, the most frequent substitutions in Heat B and Warming C were also biased toward C:G → T:A (μB-C:G → T:A = 1.50 × 10⁻⁸; μC-C:G → T:A = 1.54 × 10⁻⁸) and were higher than in Control A (μA-C:G → T:A = 1.01 × 10⁻⁸). The second most frequent substitutions (mutation rate) in Heat B occurred at A:T → T:A (μB-A:T → T:A = 2.37 × 10⁻⁹) sites, similar to Heat MA lines. By contrast, the second most frequent substitutions in Warming C were A:T → G:C (μC-A:T → G:C = 2.16 × 10⁻⁹), somewhat different from Warming MA lines.

We calculated the transition and transversion frequencies (per genome per generation) for the three treatments. In MA lines, the transition (Ts) and transversion (Tv) frequencies in Heat E (0.84 and 0.54, respectively) and Warming F (0.68 and 0.40, respectively) showed obvious increases compared to Control D (0.46 and 0.16, respectively; Fig. 3e, f), resulting in significantly decreased Ts/Tv ratios at both elevated temperatures (Fig. 3g and Additional file 1: Table S7). Moreover, compared to Heat E, Warming F had a higher Ts/Tv ratio, which can be attributed to its lower frequencies of Ts and Tv. The Ts/Tv ratios were higher in Heat and Warming MA populations than in the MA lines (Fig. 3g). Within MA populations, the Ts/Tv ratios were significantly decreased in Heat B (1.83) compared to Control A (2.53; Fig. 3g). Nevertheless, the Warming population showed a high Ts/Tv ratio due to its higher transition and lower transversion rates relative to the Heat and Control populations.
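The direction of the Ts/Tv changes in the MA lines can be reproduced from the mean frequencies quoted above. This is only an illustrative calculation on the treatment means. The ratios in Fig. 3g and Table S7 may have been computed per line and then averaged, so the exact values could differ.

```python
# Mean transition (Ts) and transversion (Tv) frequencies per genome per
# generation in the MA lines, as quoted in the text.
TS_TV_FREQUENCIES = {
    "Control D": (0.46, 0.16),
    "Heat E": (0.84, 0.54),
    "Warming F": (0.68, 0.40),
}


def ts_tv_ratio(ts, tv):
    """Ratio of transition to transversion frequency."""
    return ts / tv


ratios = {name: ts_tv_ratio(ts, tv) for name, (ts, tv) in TS_TV_FREQUENCIES.items()}

for name, ratio in ratios.items():
    print(f"{name}: Ts/Tv = {ratio:.2f}")
```

On these means the ordering is Heat E < Warming F < Control D, which matches the statement that both elevated temperatures lower the ratio and that Warming F sits above Heat E.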

Mutation frequency distribution across different genomic regions in A. thaliana under elevated temperatures

We annotated the mutations and estimated their frequencies across different genomic regions in MA lines and populations. All MA lines showed higher mutation frequencies in intergenic regions than in genic regions. Heat E and Warming F showed > 50% increases in mutation frequencies in both genic (1.4–2.2-fold increases) and intergenic (0.5–1-fold increases) regions compared to Control D (p < 0.05; Fig. 4a; Additional file 1: Table S8A). Notably, within genic regions of Heat E and Warming F, higher mutation frequencies occurred in coding regions than in noncoding regions, different from Control D. The predominance of variants in coding regions of Heat E and Warming F was attributed to the disproportionate occurrence of nonsynonymous mutations (Additional file 1: Table S8A). For example, the nonsynonymous mutations in Heat E (0.26) were significantly more frequent than in Control D (0.02; p < 0.05). In addition, Heat E showed higher mutation frequencies in intergenic and genic regions than Warming F, but this difference was not significant (p > 0.05). In noncoding regions, the frequencies of intronic and untranslated region (UTR) mutations were highest in Heat E. Interestingly, more mutations occurred in transposable elements (TEs) under the Heat treatment, with a significantly increased frequency in Heat E compared to Control D (p = 0.02). Finally, we calculated SNV rates within intergenic, genic, and TE regions (Fig. 4a), all of which increased with temperature.

Comparison of mutation frequencies in various genomic regions among the Control, Heat, and Warming lines (a) and populations (b). The numerical values in the stacked bar chart indicate the frequencies of total mutations (SNVs and indels) in the genomic regions. The mutation frequency of each region in each sample was calculated using the formula m = n/g, where n is the number of identified mutations and g is the number of generations. Accordingly, the mean mutation frequency of each treatment (five samples) was estimated as ∑m/5. Numerical values above the bars indicate SNV rates in the genomic regions. The SNV rates of each genomic region (per site per generation) were estimated by dividing the number of observed mutations by the number of analyzed sites capable of producing a given mutation and the number of generations of MA
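The formula m = n/g and the treatment mean ∑m/5 can be sketched as follows. Per-sample counts are not given in this text, so the example relies on the fact that when all five samples share the same g, the mean ∑m/5 reduces to the treatment total divided by 5g. It uses the SNV totals reported for the MA lines (31, 69, and 54); the function name is an assumption for illustration.

```python
N_SAMPLES = 5  # sequenced plants per treatment


def mean_mutation_frequency(total_mutations, generations, n_samples=N_SAMPLES):
    """Mean of m = n/g over samples, given the treatment total.

    Valid only when every sample has the same number of generations g,
    because then sum(n_i / g) / n_samples == total / (n_samples * g).
    """
    return total_mutations / (n_samples * generations)


# SNV totals in the MA lines at generation 10, from the text.
SNV_TOTALS = {"Control D": 31, "Heat E": 69, "Warming F": 54}

for name, total in SNV_TOTALS.items():
    m = mean_mutation_frequency(total, generations=10)
    print(f"{name}: {m:.2f} SNVs per genome per generation")
```

The results (0.62, 1.38, and 1.08) equal the sums of the transition and transversion frequencies reported for the same lines (0.46 + 0.16, 0.84 + 0.54, and 0.68 + 0.40), which suggests the per-genome frequencies in the text were obtained in this way.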

In MA populations, we found that the mutation frequencies of intergenic regions and TEs in Heat B were significantly higher than those in Control A (p < 0.01; Fig. 4b). By contrast, the frequency of nonsynonymous mutations in Heat B (0.07) was lower than that in Warming C (0.18) (Additional file 1: Table S8B). Consistently, the mutation frequency of coding regions in Heat B (0.13) was lower than that in Warming C (0.26) (Fig. 4b). Notably, the mutation frequencies of coding regions in Heat and Warming populations were also lower than in the Heat and Warming lines, with a significantly lower frequency of nonsynonymous mutations observed in the Heat B population (0.07) than in the Heat E lines (0.26) (Additional file 1: Table S8B). This indicates stronger selection effects on nonsynonymous mutations in MA populations at high temperatures. To further investigate the selection effects on MA populations, we used the KaKs calculator to determine the ratio of nonsynonymous to synonymous substitutions (Ka/Ks ratio). The Heat E lines had a Ka/Ks ratio of 0.92, whereas the Heat B population had a Ka/Ks ratio > 1 (1.51), suggesting that the Heat MA population had been subjected to positive selection.

Nonsynonymous SNVs, gains and losses of stop codons, and indels within coding regions are likely to affect fitness [36]. Therefore, we estimated the rates of diploid genomic mutations affecting fitness under different treatments. These rates were 0.48 (± 0.1) and 0.36 (± 0.2) per generation in Heat E and Warming F, respectively, and these values were higher than those in Control D (0.13 ± 0.1; Heat vs. Control, p = 0.005; Warming vs. Control, p = 0.16). Similarly, genomic mutation rates affecting fitness were significantly higher in Heat B (0.16) and Warming C (0.24) than in Control A (0.05; p < 0.003). In addition, genomic mutation rates affecting fitness were lower in MA populations than in MA lines.

Mutations in functional genes of A. thaliana under elevated temperatures

To investigate the accumulated mutations in functional genes that may be involved in various biological processes underlying high-temperature responses, we performed Gene Ontology (GO) functional analysis of 29, 46, and 55 genes with mutations from the Control, Warming, and Heat MA lines and populations, respectively (Additional file 1: Table S3). We found that these genes were enriched in multiple related terms, including the cellular process, metabolite process, cell part, membrane, binding, and catalytic activity (Fig. 5a). In contrast, elevated temperatures resulted in the enrichment of more genes associated with the “response to stimulus,” “reproductive process,” “development process,” and “biological regulation” terms. Kyoto Encyclopedia of Genes and Genomes (KEGG) functional analysis showed enrichment of common pathways, including “signaling transduction,” “development,” and “replication and repair” at elevated temperatures (Additional file 2: Fig. S1).

Functional enrichment of mutated genes in MA lines and populations. a GO enrichment of mutated genes in the Control (A and D), Heat (B and E), and Warming (C and F) treatments. The arrow indicates an important biological process. b Expression levels of mutated genes under Heat and Warming differed significantly from the Control (expression dataset obtained from NCBI GSE118298). Log10-transformed FPKM expression values for each treatment were visualized using a heatmap. Red color indicates a high expression level, and blue color indicates a low expression level. c Nonsynonymous (orange), frameshift deletion (blue), frameshift insertion (yellow), and stop gain (grey) mutations in gene coding regions of MA lines and populations. Each gene involved in a putative biological process is shown. Defense response- and DNA repair-associated genes are marked in blue, and asterisks indicate the differentially expressed genes shown in (b)

To further determine whether these mutations occurred at genes involved in the transcriptional response to heat and warming, we used a previously obtained RNA-seq dataset to identify potential temperature-responsive (significantly differentially expressed) transcripts among the Heat, Warming, and Control treatments (see “Methods”). Interestingly, 9 (16%) of 55 genes from Heat MA samples showed significantly differential expression between the Control and Heat treatments, and 10 (22%) of 46 genes from Warming MA samples were differentially expressed between Control and Warming (Fig. 5b). In particular, mutations occurred in two genes encoding heat-shock protein 70-17 (HSP70-17) and heat stress transcription factor A-1a (HSFA1A), which were upregulated under Heat treatment and were identified in Heat E and B, respectively.

We further focused on genes with nonsynonymous, frameshift, or stop-gain SNVs or indels (Fig. 5c). In Heat lines, in addition to HSP70-17, described above, a nonsynonymous mutation was found in a gene encoding fumarate hydratase 2 (FUM2), which is associated with respiratory metabolism. Interestingly, a mutation occurred in the gene encoding a proliferating cell nuclear antigen (PCNA) domain-containing protein (AT4G17760), which is associated with DNA repair. Moreover, a frameshift deletion and a nonsynonymous SNV in the defense-related (i.e., disease resistance) protein Toll interleukin 1 receptor-like nucleotide-binding leucine-rich repeat (TIR-NB-LRR; AT5G48770 and AT4G10780) were identified in the Heat E and Warming F lines, respectively. In contrast to the accumulated mutations in MA lines, we found many exonic mutations distributed in genes associated with development and signal transduction, such as those encoding the calcium-dependent lipid-binding family protein (AT1G48090) and WD40 repeat-like superfamily protein (AT3G54190), in MA populations (Fig. 5c). Notably, the mutation distribution patterns among all individuals differed significantly between MA populations and MA lines. For example, some mutations within an MA population were shared by different individuals, whereas MA line mutations were scattered widely among individuals (Fig. 5c, Additional file 1: Table S3). These results demonstrate the distinct mutation landscapes of MA populations and MA lines. Given the experimental design applied to MA populations, we speculate that these common exonic mutations probably originated from a parental individual in the same generation (not the ancestor plant), suggesting that some genetic variants are more likely to spread in populations under selective pressure over multiple generations.

Interaction between methylation and TE annotation

We conducted whole-genome bisulfite sequencing of MA lines and identified more methylated cytosines (mCs) in Heat E (10.54%) and Warming F (10.44%) than in Control D (9.78%; Additional file 1: Table S9). mCs in CG, CHG, and CHH (where H refers to A, T, or C) contexts are summarized in Supplemental Table 8. Spontaneous deamination of methylated cytosine (mC) to thymine is known to be a major source of mutations, resulting in elevated mutation rates at methylated sites [37]. We thus focused on mutations at methylated and nonmethylated sites in the three contexts in the Control, Heat, and Warming MA lines. In the Heat treatment, the proportions of methylation at mutated bases were much greater than the genome-wide occurrence of methylation in the CG (Fisher’s exact test, p = 4.58 × 10⁻⁸), CHG (Fisher’s exact test, p = 1.92 × 10⁻²¹), and CHH (Fisher’s exact test, p = 1.63 × 10⁻³) contexts (Fig. 6b). High frequencies of methylation at mutation sites were also found in the Warming (Fisher’s exact test: CG, p = 3.36 × 10⁻⁴; CHG, p = 1.92 × 10⁻¹⁵; CHH, p = 0.02) and Control (Fisher’s exact test: CG, p = 3.56 × 10⁻¹²; CHG, p = 2.27 × 10⁻²¹; CHH, p = 0.08) treatments (Fig. 6a, c). Because methylation and TEs correlate significantly [35, 38], we further tested the main effects of methylation and TE position (two-way analysis) on mutation rates under elevated temperatures using a logistic regression model. The methylated sites and TE regions were associated positively with mutations in the Control, Heat, and Warming MA lines (Additional file 1: Table S10). In general, methylated sites within and outside TEs had higher mutation rates than did nonmethylated sites in MA lines (Additional file 2: Fig. S2). Compared with those in Control lines, methylated and nonmethylated sites in the Heat and Warming lines showed higher mutation rates regardless of location (within or outside TEs); Heat E had the highest mutation rate on methylated sites outside TEs (Fig. 6d). In addition, we observed that nonmethylated sites within TEs had a higher mutation rate in Warming F than in Heat E, but this difference was not significant.

Estimation of the effects of cytosine methylation and TE region on mutation rates in the Control, Heat, and Warming MA lines. a–c Comparison of cytosine methylation percentages at all bases in the genome and mutated bases in the Control D (A), Heat E (B), and Warming F (C) lines. H refers to A, T, or C. The methylation percentage is much higher at mutated bases than the corresponding genome-wide occurrence for all three contexts: CG (Fisher’s exact test, p = 4.58 × 10⁻⁸), CHG (Fisher’s exact test, p = 1.92 × 10⁻²¹), and CHH (Fisher’s exact test, p = 1.63 × 10⁻³). d Effects of cytosine methylation and TE region on mutation rates in the Control D (D), Heat E (E), and Warming F (F) lines. The x axis shows log-transformed (log10) mutation rates per site per generation. Mutation rates for non-TE and TE positions are marked in orange and blue, respectively. Mutation rates for nonmethylated and methylated CG positions are indicated with triangles and squares, respectively. Differences in mutation rates among Control D, Heat E, and Warming F lines were assessed using Student’s t test. Error bars indicate SEMs. Asterisks indicate significant differences from Control D at p < 0.05 (*) and p < 0.01 (**), respectively

Mutational bias and context effects of A. thaliana under elevated temperatures

We estimated the relative contributions of various genomic properties to mutation frequency, including gene density and GC content. In both MA lines and populations, we found fewer mutations in high than in low gene density regions in all three treatments (Fig. 7a). For MA lines, the mutation rate was significantly biased toward low versus high gene density regions in Heat E (t test, p = 0.02; Fig. 7b) and Warming F (p = 0.03). By contrast, no significant difference in the mutation rate of Control D was observed between the high and low gene density regions (p = 0.45). This result suggests that multigenerational exposure of A. thaliana to high temperatures shifts the accumulation of DNA mutations toward low gene density regions compared with plants under ambient (Control) temperature. We also tested whether GC content (per 1-kb window) affected local mutation rates in the MA lines and populations of each treatment. For all MA lines and populations, the GC contents and observed mutation rates were not well correlated (Additional file 2: Fig. S3), suggesting that GC content did not affect the mutation rate in our MA experiments.

Mutational biases of Control, Heat, and Warming lines and populations. Analysis of correlations between gene density and mutation rates across chromosomes in Control, Heat, and Warming lines and populations. a Distribution of mutations across A. thaliana chromosomes shown in a Circos plot. From outer circle to inner circle, the plot shows the chromosomes, genes (purple bars), and mutations in A16 (green bars), B22 (yellow bars), C19 (red bars), D10 (pink bars), E10 (purple bars), and F10 (blue bars). Each chromosome is divided into multiple bins (bin size = 100 kb), which are grouped into high and low gene density regions. b Comparison of mutation rates between regions with high gene density and those with low gene density. Significant differences were revealed using two-tailed Student’s t test (p < 0.05, high vs. low gene density regions). c–h Neighbor-dependent mutation rates at AT and GC bases estimated for the Control, Heat, and Warming MA lines (c–e) and populations (f–h). The trinucleotide context-dependent mutation rate is shown for each treatment. The x axis shows the focal nucleotides (uppercase, mutation site) and immediate flanking nucleotides (lowercase), regardless of strand orientation (e.g., the tAt class includes the overall mutation rate at tAt and aTa sites). For each treatment, the mutation rates of G/C bases were generally elevated relative to those of A/T bases. Red dots indicate significantly elevated mutation rates

We evaluated the effect of local sequence context on the mutation rates of A/T and G/C positions flanked by different nucleotides on either side, regardless of DNA strand orientation (for example, AAG and its complement CTT both contribute to the mutation rate in the central AT position under the category AAG). As expected, GC bases had significantly higher mutation rates than did AT bases in all treatments (t test, p < 0.03) except Control D (p > 0.14; Fig. 7c–e). In general, mutation rates of AT bases in all contexts were uniform for each MA experiment (G test, p > 0.15), but mutation rates of GC bases were not (p < 0.01), except in the Warming treatments (Warming F, p = 0.98; Warming C, p = 0.44). Moreover, the nucleotides located one position upstream or downstream had significant effects on the mutation rate in the Heat B population (t test, p < 3.32 × 10⁻⁴; Fig. 7d), whereas those in other MA lines and populations did not (p > 0.05). Of all 16 possible combinations of flanking nucleotides, GCG in Control A (two-tailed Z test, p = 5.56 × 10⁻¹³), GCG in Heat B (p = 7.40 × 10⁻¹⁴), and CCG in Warming C (p = 3.97 × 10⁻⁶) had significantly higher mutation rates than did other GC contexts. In contrast to MA populations, the Control D, Heat E, and Warming F lines showed significantly higher mutation rates in the CCC (p = 3.03 × 10⁻⁴), CCG (p = 0.02), and GCT (p = 6.24 × 10⁻⁵) contexts than in other GC contexts (Fig. 7g, h). However, the trinucleotides CCG (or GGC) and GCG (or CGC) appeared to have high mutation rates in all MA groups, regardless of temperature treatment. In addition, we observed that almost all indels within the Heat (E10 and B22), Warming (F10 and C19), and Control groups (D10 and A16) either occurred near simple repeats or involved tandem-repeat deletions and insertions (Additional file 1: Table S11), suggesting that the occurrence of indels is strongly biased toward repeat sequences in A. thaliana.
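The strand-independent grouping of trinucleotide contexts described above can be illustrated with a short sketch. It is not the authors' pipeline: the function names `fold_context` and `count_contexts` are hypothetical, and the convention of reporting each class with A or C as the focal base is an assumption inferred from the classes named in the text (AAG, GCG, CCG, CCC, GCT).

```python
from itertools import product

# Translation table for complementing a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def fold_context(trinucleotide):
    """Map a trinucleotide to its strand-independent class.

    Assumed convention: the class is written with A or C as the focal
    (central) base; contexts centred on T or G are replaced by their
    reverse complement.
    """
    tri = trinucleotide.upper()
    if len(tri) != 3 or set(tri) - set("ACGT"):
        raise ValueError(f"not a valid trinucleotide: {trinucleotide!r}")
    if tri[1] in "AC":
        return tri
    return tri.translate(COMPLEMENT)[::-1]


def count_contexts(sequence):
    """Count folded trinucleotide classes along one strand of a sequence."""
    seq = sequence.upper()
    counts = {}
    for i in range(1, len(seq) - 1):
        tri = seq[i - 1:i + 2]
        if set(tri) <= set("ACGT"):  # skip windows containing N or gaps
            key = fold_context(tri)
            counts[key] = counts.get(key, 0) + 1
    return counts


# All 64 trinucleotides collapse into 32 classes: 16 flanking
# combinations for an A/T focal base and 16 for a G/C focal base.
ALL_CLASSES = {fold_context("".join(t)) for t in product("ACGT", repeat=3)}
```

Dividing the number of mutations observed in each class by the count of that class in the callable genome (and by the number of generations) would then give a context-specific rate; those denominators are not given here, so the sketch stops at the counting step.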

Comparison of de novo mutations with natural genetic variations

The 1001 Genomes Consortium (2016) reported 10,707,430 single-nucleotide polymorphisms (SNPs) and 1,424,879 indels (≤ 40 bp) in 1135 natural accessions of A. thaliana. To compare de novo mutations with natural variations, we merged the mutations from all MA lines and populations into 263 unique SNVs and 93 indels. We found that 64 (24%) of 263 total SNV sites coincide with biallelic SNPs in the 1001 Genomes dataset, and 50 (19% of the total) of these 64 shared SNVs are identical (Fig. 8). Among the 93 indels identified in MA experiments, 40 (43%) overlap with indels from the 1001 Genomes population, 12 (13% of the total) of which are identical. These identical sites (86% of shared SNVs, 75% of shared indels) are derived mainly from the Heat and Warming lines. Compared with the expected overlap (based on a random distribution of mutations and polymorphisms), the overlap between polymorphisms in all of our MA lines and populations with those of natural variants is highly significant (Fisher’s exact test: SNV, p = 2 × 10⁻²⁴; indel, p = 1 × 10⁻¹²; Fig. 8). To determine whether the SNVs identified in our MA results were biased toward conserved or substitution sites in A. thaliana, we compared them with the 219,909 ancestral variants (SNPs occurring at substitution sites in A. thaliana) and 1,799,125 derived variants (SNPs occurring at conserved sites) from the 1001 Genomes biallelic SNP dataset (see “Methods”). Among all SNVs identified in our MA results, only four SNVs (1% of SNVs from Warming C and F and Heat E) overlap with ancestral variants and one SNV (from Heat B) is shared with derived variants, indicating a low frequency of de novo SNVs (identified under elevated temperatures) at the conserved sites.
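As a quick arithmetic check, the overlap percentages quoted above follow directly from the reported counts. The snippet below is only an illustrative sketch; the helper `percent` and the dictionary layout are hypothetical and not part of the original analysis.

```python
# Counts reported in the text: unique MA mutations, those coinciding
# with 1001 Genomes variants, and those that are identical to them.
total_snvs, shared_snvs, identical_snvs = 263, 64, 50
total_indels, shared_indels, identical_indels = 93, 40, 12


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


overlap = {
    "SNVs shared with 1001 Genomes": percent(shared_snvs, total_snvs),
    "SNVs identical (of total)": percent(identical_snvs, total_snvs),
    "indels shared with 1001 Genomes": percent(shared_indels, total_indels),
    "indels identical (of total)": percent(identical_indels, total_indels),
}

for label, value in overlap.items():
    print(f"{label}: {value}%")
```

The significance test reported in the text additionally needs the expected overlap under a random distribution of mutations, which depends on genome-wide counts not restated here, so it is not reproduced in this sketch.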

Overlap between mutations identified in MA lines/populations (SNVs) and variants detected in the 1001 Genomes population (SNPs). Comparison of expected and observed proportions of SNVs and indels that overlap the SNPs and indels in the 1001 Genomes dataset. Numbers at the tops of the bars are absolute overlap values. Asterisks indicate p < 0.05 (*), p < 0.01 (**), and p < 0.001 (***) based on Fisher’s exact test with Bonferroni correction

Results and discussion

Assessment of suitable genetic markers for molecular systematics

Using the desirable properties described in the Materials and Methods section, we assessed the four classes of genetic markers for their suitability for application in molecular systematics of three groups of helminths and provided a guide to the genetic markers’ utility and limitations. Tables 1 and 2 summarize each class of genetic marker and its properties for molecular systematics studies; the utility and limitations of each class of genetic marker for application are listed in Additional file 4: Table S12.

Suitability of genetic marker based on nucleotide substitution saturation

Analysis of nucleotide substitution saturation, which is an indicator of whether a genetic marker is useful for phylogenetic inferences, in the ITS sequences chosen for investigation across the taxa sampled in this study revealed that the nuclear ribosomal ITS regions were saturated (Table 1), with Iss > Iss.c, suggesting multiple substitutions have occurred. These findings indicate that the nuclear ribosomal ITS regions are not suitable genetic markers for molecular systematics studies, particularly at higher taxonomic levels. We obtained a similar result for nematodes, with the nuclear ribosomal ITS being saturated and not useful for molecular systematics. Moreover, Thaenkham et al. [22] compared the nuclear 18S rRNA gene and the ITS2 region for Opisthorchiidae and Heterophyidae and demonstrated that compared to the 18S rRNA gene, the ITS2 region was not suitable for family-level analysis of the superfamily Opisthorchioidea. Conversely, the nuclear rRNA genes, the mitochondrial protein-coding genes and the mitochondrial rRNA genes were not saturated, with Iss < Iss.c, suggesting that they can be useful markers for inferring phylogenetic relationships.

Genetic distances as a measure of a genetic marker’s suitability for molecular systematics

Comparing the mean genetic distances for each marker revealed a similar trend among the three groups of helminths. As presented in Table 2, the largest genetic distances occurred in the nuclear ribosomal ITS regions of ITS1 and ITS2, suggesting that the spacer regions might not be suitable for inferring phylogenetic relationships across a broad taxonomic hierarchy. The finding is in agreement with previous studies showing that the ITS regions are not appropriate for phylogenetic comparisons between distantly related taxa [54,55,56]. Conversely, the mean pairwise proportions of differences in the nuclear 18S and 28S rRNA genes were the smallest, with the 18S rRNA genes having values of 0.029, 0.036 and 0.039 for nematodes, trematodes and cestodes, respectively, and the 28S rRNA genes having values of 0.050 and 0.120 for nematodes and trematodes, respectively. The mean pairwise proportion of differences among the nuclear rRNA genes was statistically different from that of all other genetic markers (χ² = 1519.6, df = 9, P < 0.000001 for nematodes; χ² = 581.7, df = 9, P < 0.000001 for trematodes; χ² = 424.3, df = 8, P < 0.000001 for cestodes). The small genetic distance values of the nuclear rRNA genes can be a limiting factor and might render insufficient resolution for species-level identification.
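The "pairwise proportion of differences" used above is the fraction of aligned sites at which two sequences differ. The following minimal sketch shows one way to compute it and its mean over an alignment. The function names, the toy sequences, and the choice to skip gapped positions pair by pair are all assumptions made for illustration; the study's own software and gap treatment are not stated in this section.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Assumption: positions where either sequence has a gap are ignored
    (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def mean_pairwise_distance(aligned_sequences):
    """Mean p-distance over all unordered pairs of aligned sequences."""
    distances = [p_distance(a, b) for a, b in combinations(aligned_sequences, 2)]
    return sum(distances) / len(distances)


# Toy alignment (hypothetical sequences, not data from the study).
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
toy_mean = mean_pairwise_distance(toy_alignment)
```

In the toy alignment the three pairwise distances are 0.1, 0.2 and 0.1, so the mean is about 0.133; values of this kind are what the marker comparison in Table 2 summarizes per marker and per helminth group.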

For the mitochondrial genes, the genetic distances were significantly higher than those of the nuclear rRNA genes. Among the mitochondrial genes, the genetic distances seen in the mitochondrial rRNA genes were comparable to those in the mitochondrial protein-coding genes.

The number of monophyletic clades as a measure of the genetic marker’s resolution

The recovery of recognized taxa as monophyletic can also indicate the resolution of the genetic marker. The highly conserved nature of the nuclear rRNA genes makes them suitable genetic markers for molecular systematics [6]. The 18S and 28S rRNA genes have been used in the higher-level classification of nematodes, trematodes and cestodes, allowing construction of the phylogenetic framework for each group of helminths [13,14,15]. Our findings show that compared to other genetic markers, the nuclear rRNA genes and the mitochondrial 16S rRNA gene gave the best phylogenetic resolution for trematodes, recovering three out of four suborders as monophyletic (Table 2). For cestodes, the mitochondrial genes gave the best resolution as compared to the nuclear genes. For nematodes, the mitochondrial 12S and 16S rRNA genes exhibited the best resolution of the genetic markers (apart from NAD1 for nematodes), with four out of six orders as monophyletic. The mitochondrial rRNA genes are more conserved than the mitochondrial protein-coding genes, and this slightly more conserved nature has led to the mitochondrial rRNA genes being used for higher-level classification of organisms [57,58,59]. In helminths, the 16S rRNA gene and the nuclear rRNA genes have been used in conjunction to provide increased resolution for cestode phylogenies [60, 61]. Chan et al. also reported that the mitochondrial rRNA genes provide good resolution and can be used for molecular systematics in nematodes [59].

Thus, the results of our assessment of the genetic markers for their suitability for molecular systematics of helminths indicate that the nuclear ribosomal ITS regions might not be suitable for phylogenetic inferences at a higher taxa level due to nucleotide substitution saturation. In addition, the number of monophyletic clades obtained and sufficient genetic distances supported the resolution of the mitochondrial rRNA genes for molecular systematics, making them comparable to the commonly used nuclear rRNA genes.

Assessment of suitable genetic markers for molecular identification

Using the four above-mentioned properties, we assessed the suitability of the genetic markers for molecular identification of nematodes, trematodes and cestodes. The results are summarized in Table 3.

Interspecific genetic distances and phylogenetic placement as a measure for species discrimination

Sufficient sequence variation among species is an important indicator of whether the genetic marker is sufficiently robust for species discrimination [1, 8]. Interspecific genetic distance analyses across the four genetic marker classes indicated that the nuclear rRNA genes had the smallest sequence variation, with mean values that were statistically significantly different from each other (χ² = 161.7, df = 9, P < 0.000001 for nematodes; χ² = 124.5, df = 9, P < 0.000001 for trematodes; χ² = 129.0, df = 8, P < 0.000001 for cestodes). For the nuclear rRNA genes, the average genetic distances between species were < 0.03, suggesting low levels of sequence variation. Moreover, for the closely related taxa, sequence variation using the 18S rRNA gene was low (0.001, 0.002 and 0.003 for nematodes, trematodes and cestodes, respectively), possibly leading to inaccurate phylogenetic placement, which is problematic in terms of species identification. Examples of this are between nematodes, such as Toxocara canis versus T. cati and Ascaris lumbricoides versus A. suum, and between trematodes, such as Opisthorchis viverrini versus Clonorchis sinensis (Additional file 3: Figures S1g and S2g). Previous studies using the 18S rRNA gene have also shown low to no sequence variation among Trichuris spp. and no variation between Trichuris muris and T. arvicolae [30]. Similarly, in the tapeworms Diphyllobothrium dendriticum and D. ditremum, Wicht et al. [27] demonstrated that the 18S rRNA gene had lower species discriminatory power than did the nuclear spacer regions and the mtDNA genetic markers.

Conversely, interspecific genetic distances for the nuclear ribosomal ITS spacer regions and mitochondrial genetic markers were higher than those for the nuclear rRNA genes (except ITS1, which had a lower genetic distance for nematodes). The nuclear ribosomal ITS regions tend to be used for species identification because of their faster evolution rate, resulting in highly variable sequences between species [6]. Moreover, several studies have demonstrated the effectiveness of the nuclear ribosomal ITS for the molecular identification of parasitic helminths, usually with species-specific primers, to discriminate between closely related species [10, 24, 25, 62]. For example, using the ITS1 region, Kang et al. showed that genetic distances among the closely related liver flukes were 0.045 between O. viverrini and O. felineus and 0.056 between O. viverrini and C. sinensis [62]. However, in our study, sequence variation for cestodes was unusually high (> 0.300) using the nuclear ribosomal ITS regions, perhaps due to a lack of representative sequences, thus confounding the results.

For the mitochondrial protein-coding genes, interspecific sequence variation was 0.026–0.036 for nematodes, 0.158–0.195 for trematodes and 0.085–0.132 for cestodes. Closely related species in the three groups of helminths could also be differentiated, with genetic distance values of up to 0.166 with the cytB gene for nematodes, 0.195 with the NAD1 gene for trematodes and 0.132 with the NAD1 gene for cestodes. This higher degree of sequence variation seen for the mitochondrial protein-coding genes compared to the nuclear rRNA genes is a clear illustration of their ability to resolve species-level relationships, even among closely related species. Consequently, it is not surprising that the mitochondrial protein-coding genes have been used widely for molecular identification, both at the species level and the population level, and to differentiate helminths from various host species [7, 26, 28, 30, 63, 64].

For the mitochondrial rRNA genes, the interspecific genetic distance values were slightly smaller than those of the mitochondrial protein-coding genes, with means of 0.015 and 0.021 for the 12S and 16S rRNA gene for nematodes, 0.133 and 0.148 for trematodes, and 0.081 and 0.080 for cestodes, respectively. However, the genetic distances were significantly higher than those for the nuclear rRNA genes, rendering the mitochondrial rRNA genes suitable for species identification. In helminths, the 12S rRNA gene has been used successfully for molecular identification, confirming the phylogenetic placement of Setaria digitata among filarial nematodes [65]. Moreover, Chan et al. [66] showed the suitability of the mitochondrial rRNA genes for species discrimination of closely related species in the Angiostrongylus cantonensis lineage.

Thus, the results of our assessment of the suitability of genetic markers for molecular identification of nematodes, trematodes and cestodes suggest that the nuclear rRNA genes might not be suitable because of low sequence variation for species discrimination. Conversely, the mtDNA genetic markers have higher sequence variation to discriminate among species and closely related species, emphasizing their suitability as markers for molecular identification.

Advantageous properties of genetic markers for molecular systematics and identification purposes

The ease of both universal primer design and sequence alignment, together with the availability of full-length reference sequences, are additional advantages that could affect a genetic marker’s suitability and utility for both molecular systematics and identification (Table 1).

First, the highly conserved sequences of the nuclear rRNA genes, as compared to the other genetic markers, can facilitate primer design that is suitable for amplifying a broad range of taxa. Universal primers for the three helminth groups have been developed using the 18S rRNA gene, and these have been used widely in molecular systematics due to their highly conserved nature [16,17,18,19]. Universal COI primers have also been developed and utilized for molecular-based studies [67, 68]. However, the relatively higher sequence variation in the COI gene in helminths compared to other groups of organisms has led to low PCR amplification success and limited taxa for analyses [42,43,44]. In this respect, the mitochondrial rRNA genes, being slightly less variable, possess an advantage over the more variable mitochondrial protein-coding genes and nuclear spacer regions, enabling the design of universal primer sets. Also, as compared to the more variable sequences of the mitochondrial protein-coding genes and the nuclear ribosomal ITS regions, the less variable sequences of the mitochondrial rRNA genes could increase the success of PCR amplification. Universal primers for the mitochondrial rRNA genes have been designed and utilized successfully for molecular identification and molecular systematics in nematodes [59, 66]. Secondly, the lower proportion of insertions and deletions in the sequences of the mitochondrial genetic markers enables easier sequence alignment than is possible with the nuclear genetic markers. The lower proportion of indels can allow a comparison over a broader range of taxa across taxonomical levels. Lastly, with the increase in the availability of complete mitochondrial genomes in the NCBI database, full-length sequences of the mitochondrial genetic markers are readily available, presenting an advantage over the nuclear genetic markers.

Based on our evaluation of both molecular systematics and molecular identification in the selected helminths, the mitochondrial 12S and 16S rRNA genes show potential and could be suitable for applications in both contexts.

Generation of suitable genetic distance values for future applications

To create a yardstick for guiding users when adopting genetic distances for helminths, we provide essential points to be considered and an alternative method of using genetic distances through the ‘K-means’ clustering algorithm.

Large genetic variation in nematodes at the same taxonomic level

A wide range of genetic distances for nematodes was observed, in contrast to trematodes and cestodes. To further investigate this observation, we selected the nuclear 18S rRNA gene, the mitochondrial 12S rRNA gene and the COI gene as representative genetic markers to illustrate the broad levels of genetic distances in nematodes at the same taxonomic level.

As shown in Fig. 1a, the genetic distances between nematode genera show substantial variation, with statistically significant differences (χ² = 39.8, df = 6, P < 0.000001). The same pattern was observed across the three genetic markers, with Ascaris having the smallest genetic distance and Strongyloides the largest. In contrast, no significant between-genus differences were found for the trematodes and cestodes (Fig. 1b, c). The same finding was also observed at the family level, where there were significant differences between nematode families (Additional file 5: Figure S4). Comparison of values at the same taxonomic level indicates a high degree of sequence variation within nematodes. Thus, our findings reveal that a single general genetic distance threshold might not be suitable and that each group of organisms should have its own genetic distance cut-off values.

Violin-plot of genetic distances of nematodes (a), trematodes (b) and cestodes (c) between genera. Asterisk indicates statistically significant difference between each group, according to the Kruskal–Wallis test with Dunn’s posthoc analysis

Estimation of cut-off values per taxonomic level using the ‘K-means’ clustering algorithm

Previous studies have used genetic distances to determine whether specimens are conspecific, and in most cases, a general genetic distance value has been used as a basis for comparison [8]. In such studies, researchers mainly rely on the genetic distances of organisms that have already been studied and compare their specimens with similar species to judge whether they belong to the same or a different species. To circumvent this, we attempted to utilize a clustering algorithm-based machine learning strategy to estimate suitable cut-off values per taxonomic level for each genetic marker using the ‘K-means’ method and thus provide considerable data for future applications and an alternative method of analyzing genetic distances (Additional file 6: Table S13; Additional file 7: Figures S5–S7).

In our study, each taxonomic level was clearly distinguishable in the three groups of helminths for the 12S and 16S rRNA genes using the ‘K-means’ clustering algorithm, as presented in Fig. 2. Due to the large differences between each nematode order, analyses were performed separately for Trichocephalida, Ascaridida with Spirurida, and Strongylida. Similarly, the other genetic markers also showed distinct clustering patterns for each taxonomic level (Additional file 7: Figures S5–S7). The estimated cut-off values were derived from the minimum and maximum genetic distances of each cluster through the distinct clustering between each taxonomic level, allowing us to provide an estimation of the genetic distance values for each genetic marker, as provided in Additional file 6: Table S13. For example, using the 16S rRNA gene for trematodes, the estimated cut-off values between species ranged from 0.071 to 0.147, with a mean of 0.119, suggesting that the genetic distances between trematode species should fall within the specified range as estimated using the ‘K-means’ method. Likewise, for members of the same genus, the estimated cut-off values using the 16S rRNA gene for trematodes ranged from 0.151 to 0.215, with a mean of 0.181. Thus, using the ‘K-means’ clustering algorithm, we have provided a novel method for analyzing genetic distance values and generated a practical guide for future users with the estimated cut-off values per genetic marker for the helminths studied as a basis for comparison.
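The idea of deriving cut-off ranges from clusters of genetic distances can be sketched as follows. This is a plain one-dimensional K-means (Lloyd's algorithm) written for illustration only; the function names, the deterministic initialisation, and the input distances are hypothetical, and the software and settings used in the study are not described in this section.

```python
def kmeans_1d(values, k, iterations=100):
    """Cluster one-dimensional values into k groups with Lloyd's algorithm.

    Assumption: initial centres are spread evenly across the sorted data
    so that the result is deterministic.
    """
    data = sorted(values)
    centres = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for value in data:
            nearest = min(range(k), key=lambda j: abs(value - centres[j]))
            clusters[nearest].append(value)
        new_centres = [
            sum(c) / len(c) if c else centres[j] for j, c in enumerate(clusters)
        ]
        if new_centres == centres:  # assignments no longer change
            break
        centres = new_centres
    return clusters


def cutoff_ranges(clusters):
    """Return (minimum, maximum, mean) of each non-empty cluster."""
    return [(min(c), max(c), sum(c) / len(c)) for c in clusters if c]


# Hypothetical pairwise distances, not values from the study.
example_distances = [0.08, 0.10, 0.12, 0.13, 0.16, 0.18,
                     0.20, 0.21, 0.26, 0.28, 0.30]
example_clusters = kmeans_1d(example_distances, k=3)
example_cutoffs = cutoff_ranges(example_clusters)
```

With these made-up inputs the three clusters span 0.08–0.13, 0.16–0.21 and 0.26–0.30, mirroring how the minimum and maximum of each cluster are read off as the estimated range for a taxonomic level. Real data would need a justified choice of k and a check that clusters correspond to the intended taxonomic levels.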

Estimated cut-off per taxonomic level of the mitochondrial rRNA genetic markers based on ‘K-means’ algorithm for nematodes belonging to Trichocephalida (a), nematodes belonging to Ascaridida and Spirurida (b), nematodes belonging to Strongylida (c), trematodes (d) and cestodes (e). Each colored circle indicates a genetic distance value that was input into the ‘K-means’ algorithm, and the dashed lines indicate the maximum genetic distance for each taxonomic level estimated with ‘K-means’


This study was limited by the availability and accuracy of the sequences in the NCBI database, which restricted the number of taxa that we could compare and analyze together across the genetic markers. Inadequate sampling can affect clade arrangement as well as the number of taxa recovered as monophyletic. Also, the species complex status for some helminth species was not considered, which could further complicate species delimitation. The results of the assessment of the genetic markers and genetic distance cut-off values were restricted to the helminth taxa that we selected, and future considerations to increase the number of species sampled should be undertaken.


The differences in effects of habitat fragmentation on genetic diversity within metapopulations between selfing and outcrossing Zingiber species

Compared to selfing species, genetic diversity within populations of outcrossing species tends to be higher and differentiation among populations tends to be lower (Clasen et al. 2011). However, our data revealed that although the level of subpopulation genetic diversity in selfing Z. corallinum was significantly lower than that in outcrossing Z. nudicarpum (h = 0.0662 vs. 0.1464, P = 0.028; I = 0.0995 vs. 0.2257, P = 0.023), the level of metapopulation genetic diversity of selfing Z. corallinum was comparable to that of outcrossing Z. nudicarpum (h = 0.2490 vs. 0.2246, P = 0.295; I = 0.3753 vs. 0.3480, P = 0.438). The mating system (i.e. selfing vs. outcrossing) can strongly influence the vulnerability to fragmentation effects on genetic diversity (Aguilar et al. 2008). Although populations of outcrossing species can maintain a high level of genetic diversity through frequent exchange of genes with other populations (Honnay and Jacquemyn 2007), sudden decreases in effective population sizes due to habitat fragmentation have strong negative effects on within-population genetic diversity of outcrossing species (Aguilar et al. 2008), such as tropical Ficus species with specialized pollinator systems (Nason and Hamrick 1997), wind-pollinated outcrossing Fagus sylvatica (Jump and Peñuelas 2006) and insect-pollinated outcrossing Lepidium subulatum (Gómez-Fernández et al. 2016). Because of increasing habitat destruction and decreasing local population size, the exchange of alleles may be reduced, and genetic diversity may decrease without the possibility of replenishing the alleles (Honnay and Jacquemyn 2007). On the other hand, with severe inbreeding depression, inbred individuals harbouring deleterious alleles may die or not reproduce, effectively removing these alleles from the population. Thus, inbreeding will tend to purge populations of enough deleterious recessive mutations to reduce inbreeding depression (Crnokrak and Barrett 2002). Therefore, outcrossing species show stronger negative effects of fragmentation on genetic diversity than selfing species (Aguilar et al. 2008). In contrast, the level of population genetic diversity of mainly selfing species will be less affected by reduced gene flow because each individual contains most of the genetic diversity of the population (Honnay and Jacquemyn 2007).

Without migration among demes of subpopulations in metapopulations of selfing species, any mutation that arises in a particular subpopulation may become fixed in that subpopulation and cannot spread to other subpopulations. Therefore, compared with a large primary population, although an individual small population fragment may become homozygous for a particular allele, the overarching metapopulation can still maintain significant genetic diversity because the various population fragments it encompasses may fix different loci (Frankham et al. 2002). This may be the case for selfing Z. corallinum in our study. The proportion of common loci within subpopulations of selfing Z. corallinum metapopulations was significantly higher than that in outcrossing Z. nudicarpum (67.6 % vs. 37.7 %, P = 0.041). However, both species contained similar levels of common loci in metapopulations (15.8 % vs. 16.2 %, P = 0.982). In addition, the specific band number within subpopulations and metapopulations of selfing Z. corallinum was higher than that in outcrossing Z. nudicarpum, but not significantly so (11.3 vs. 5, P = 0.114 and 39.5 vs. 12.5, P = 0.258, respectively). Together, these results imply that local adaptation and/or neutral mutation may have caused differentiation among subpopulations (patches) by fixation of different loci (Owuor et al. 1999) in selfing Z. corallinum metapopulations. The increased diversity between populations is paralleled by a similarly high level of diversity between allelic classes at polymorphic loci, which have very different allele frequencies among subpopulations, thus resulting in a high level of diversity in the respective metapopulations (Charlesworth et al. 1997). Given the lack of gene flow and the homogeneous habitat, it is unlikely that local adaptation (natural selection) contributed significantly to the increased genetic differentiation between subpopulations within Z. corallinum metapopulations. The Mantel tests also show that selfing Z. corallinum does not exhibit a pattern of isolation by distance among subpopulations within metapopulations, suggesting that the stochastic force of genetic drift is much stronger than gene flow in determining the structure of subpopulations (Pettengill et al. 2016) within selfing Z. corallinum metapopulations. Here, we suggest that genetic diversity of selfing Z. corallinum can be maintained at the metapopulation level through differentiation among subpopulations, and that the genetic variability among subpopulations is expected to increase continuously with time as new mutations continue to arise. At the fine scale, in contrast to the landscape level, selfing Z. corallinum could maintain high genetic diversity through differentiation among subpopulations that is intensified primarily by the stochastic force of genetic drift rather than by local adaptation.
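The argument above, that subpopulations fixed for different variants can each show no diversity while the metapopulation as a whole remains diverse, can be illustrated with a minimal sketch. It assumes Nei's gene diversity at a single locus, h = 1 − Σp², and equally weighted subpopulations. The allele frequencies are hypothetical and are not taken from the study, whose markers and estimators may differ.

```python
def gene_diversity(freqs):
    """Nei's gene diversity at one locus: h = 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def mean_frequencies(subpops):
    """Unweighted mean allele frequencies across subpopulations (an assumption)."""
    n = len(subpops)
    return [sum(column) / n for column in zip(*subpops)]


# Hypothetical data: four subpopulations, one biallelic locus,
# each subpopulation fixed for one of the two alleles.
subpops = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 0.0],
    [0.0, 1.0],
]

# Mean diversity within subpopulations: every subpopulation is fixed, so h = 0.
h_within = sum(gene_diversity(f) for f in subpops) / len(subpops)

# Diversity of the pooled metapopulation: both alleles are present at 0.5.
h_total = gene_diversity(mean_frequencies(subpops))

print(h_within, h_total)
```

In this toy case all of the diversity lies among subpopulations, which is the extreme form of the pattern described for the selfing species.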

Genetic structure patterns within metapopulations of selfing and outcrossing species

Outcrossing plants typically show higher genetic variation within populations or subpopulations, whereas in selfing plants most of the genetic variation is found among populations or subpopulations (Honnay and Jacquemyn 2007). Our AMOVA also revealed that the major portion of genetic variation in selfing Z. corallinum metapopulations resides among subpopulations (66.3–90.5 %), while a lower proportion (9.5–33.7 %) exists within subpopulations. However, the majority of variation was also found among subpopulations (62.5 %) rather than within subpopulations (37.5 %) in the outcrossing Z. nudicarpum metapopulation HNCJ, in which gene flow was seriously eroded by habitat fragmentation (Nm = 0.7180 < 1) and genetic differentiation among subpopulations was higher. The opposite was true in the outcrossing Z. nudicarpum metapopulation HNBT, in which gene flow was not significantly affected by habitat fragmentation (Nm = 1.9734 > 1). The genetic structure of outcrossing Z. nudicarpum metapopulations can be attributed to the short distances of pollen movement via parasitic bees and the constraints of seed dispersal by gravity. In the metapopulation HNBT, three subpopulations are scattered widely across a more or less continuous area of suitable habitat and are separated by 200–550 m (average ca. 320 m). This degree of isolation could not prevent pollen movement of outcrossing Z. nudicarpum between populations, as evidenced by the significant positive autocorrelation of spatial genetic structure within 100–1500 m. However, the two subpopulations in the metapopulation HNCJ are isolated by 450–1000 m (average ca. 725 m) of mountain forest. This greater isolation could significantly restrict pollen migration of outcrossing Z. nudicarpum between populations. In comparison, the genetic structure of selfing Z. corallinum metapopulations is influenced almost entirely by restricted seed dispersal due to gravity alone. Our results suggest that the majority of genetic variation resides among subpopulations in selfing Z. corallinum metapopulations, while the major portion of genetic variation exists either within or among subpopulations in outcrossing Z. nudicarpum metapopulations, most probably depending on whether the degree of subpopulation isolation surpasses the dispersal ability of pollen and seed.
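The Nm values above are read against the conventional threshold of one migrant per generation. As a rough illustration, the sketch below uses Wright's island-model approximation, Nm ≈ (1 − FST) / (4 FST). This is an assumption for illustration only: the text does not state which estimator produced the reported Nm values, and the function names are hypothetical.

```python
def nm_from_fst(fst):
    """Approximate number of migrants per generation under Wright's island model.

    Assumes Nm = (1 - Fst) / (4 * Fst); other estimators (for example those
    based on Gst for dominant markers) use different constants.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("fst must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)


def gene_flow_exceeds_threshold(nm, threshold=1.0):
    """Rule of thumb: Nm > 1 suggests gene flow can counteract drift."""
    return nm > threshold


# Nm values reported in the text for the two outcrossing metapopulations.
for name, nm in [("HNCJ", 0.7180), ("HNBT", 1.9734)]:
    print(name, gene_flow_exceeds_threshold(nm))
```

Under this approximation an FST of 0.2 corresponds to Nm = 1, so stronger differentiation than that implies fewer than one migrant per generation.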

Our cluster analysis showed that neighbouring individuals within subpopulations always grouped together in selfing Z. corallinum metapopulations and that all seedlings clustered with their nearest adults. Moreover, significant positive autocorrelation of spatial genetic structure occurs within only 2–34 m in subpopulations of selfing Z. corallinum. These autocorrelations over short distances reflect the occurrence of patches of genetically similar individuals (Torres et al. 2003) in selfing Z. corallinum metapopulations. Previous studies have also shown that a high level of spatial genetic structure is typical of populations of predominantly selfing and gravity-dispersed plants (Volis et al. 2010, 2016; Barluenga et al. 2011). For selfing Z. corallinum metapopulations, this is the logical consequence of two phenomena: high levels of self-fertilization, which lead to inflated inbreeding in the offspring, and restricted gravity-driven seed dispersal around the parents, which could cause aggregated distribution of offspring in maternal half-sib or full-sib families (Bittencourt and Sebbenn 2007). However, many individuals did not aggregate with their neighbours within subpopulations in outcrossing Z. nudicarpum metapopulations, and significant positive autocorrelation of spatial genetic structure occurred over distances of 100–1500 m. This is consistent with the hypothesis that outcrossing species tend to generate a weaker spatial genetic structure than selfing species (Vekemans and Hardy 2004), presumably due to higher gene flow via pollen (Duminil et al. 2009). In outcrossing plant species, pollen dispersal contributes to overall gene dispersal, whereas in highly selfing species, seed dispersal alone governs overall gene dispersal (Vekemans and Hardy 2004). Genetic analyses using the STRUCTURE software package also showed that all individuals within subpopulations in outcrossing Z. nudicarpum metapopulation HNCJ (Nm = 0.7180 < 1) were assigned to the same genetic cluster, but this was not the case in metapopulation HNBT (Nm = 1.9734 > 1). Moreover, UPGMA and NJ analyses revealed that all individuals within subpopulations in metapopulation HNCJ clustered together as a single clade, but, again, metapopulation HNBT did not conform to this pattern. The results of the PCoA revealed a similar clustering pattern.

In summary, our results indicate that restricted gene flow as a result of gravity-driven seed dispersal contributes to the genetic differentiation between subpopulations or fragments within metapopulations of selfing Z. corallinum. Although limited seed dispersal may have a stronger effect on the genetic structure of a population, pollen movement could promote gene exchange between or within subpopulations or fragments within outcrossing Z. nudicarpum metapopulations. Thus, contrary to our expectations, a weaker genetic structure appears in species like Z. nudicarpum with extensive pollen movement but restricted seed dispersal when such species occur in fragmented habitats.


Plant materials and experimental design

A bread wheat panel of 543 genotypes, including cultivars, regional test lines, and introduced parental lines, was used; the details have been published in our previous paper [20]. Over two growing seasons, the wheat plants were grown at three locations in Hebei Province. The locations were Baoding (115.5°48′E, 38°85′N), Cangzhou (116°80′E, 38°58′N), and Xingtai (118°9′E, 39°42′N). The six environments were designated as follows: 2016 Baoding (E1), 2016 Cangzhou (E2), 2016 Xingtai (E3), 2017 Baoding (E4), 2017 Cangzhou (E5), and 2017 Xingtai (E6). The field trial used a completely randomized design. Each plot contained three 1.5 m rows with 0.25 m between rows. The plant spacing was about 2.5 cm. Wheat plants were cultivated following normal local practices.

Phenotypic evaluation

Twenty-five phenotypic traits were measured, including growth and development-related traits (FLL, FLW, FLA, FA, MTN, HD, MP, GFP, GFR, TGW, PH, FD, FIL, SD, SIL, and TH), yield-related traits (SL, SNS, KNPS, PET, and EPM), and quality-related traits (GV, GPC, WGC, and FC). The data recorded for each trait are summarized in Table S1. The phenotypic traits were assessed in all six environments. The phenotypic data for each environment and the BLUP data were used for the genome-wide association analysis.

Phenotypic data analysis

The descriptive statistical analysis and correlation analysis for the phenotypic data were completed using the SPSS 25.0 software. Pearson’s correlation coefficients were calculated to evaluate the correlations among the traits.
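For readers who want to see what the correlation step computes, here is a minimal pure-Python sketch of Pearson's correlation coefficient. The study itself used SPSS. The function name and the trait values in the example are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical trait values for five genotypes (not data from the study).
trait_a = [10.2, 11.5, 9.8, 12.1, 10.9]
trait_b = [41.0, 44.3, 39.5, 46.0, 42.8]
print(pearson_r(trait_a, trait_b))
```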

SNP genotyping

The wheat 90 K Illumina Infinium SNP array was used to genotype the association panel containing 543 accessions. The SNP data were clustered and automatically called using the Illumina BeadStudio genotyping software (Illumina, San Diego, CA, USA). The data were filtered to remove alleles with a detection rate less than 0.1 and a minor allele frequency less than 0.05 [12]. Additionally, samples with a loss rate greater than 10% and a heterozygosity frequency greater than 20% were eliminated.
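The filtering step can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline. Genotypes are assumed to be coded as 0, 1, or 2 copies of one allele, with `None` for a missing call. Samples are assumed to be dropped if they fail either the missing-rate or the heterozygosity criterion. The marker-level missing-data cutoff of 0.1 is one reading of the "detection rate" wording. All function names are hypothetical.

```python
def missing_rate(genotypes):
    """Fraction of missing calls (None) in a list of genotypes."""
    return sum(g is None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Minor allele frequency from 0/1/2 genotype codes, ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt = sum(called) / (2 * len(called))
    return min(alt, 1.0 - alt)


def heterozygosity(genotypes):
    """Fraction of called genotypes that are heterozygous (code 1)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    return sum(g == 1 for g in called) / len(called)


def filter_panel(matrix, max_missing=0.10, min_maf=0.05, max_het=0.20):
    """Return indices of retained samples and SNPs; matrix[sample][snp]."""
    keep_samples = [
        i for i, row in enumerate(matrix)
        if missing_rate(row) <= max_missing and heterozygosity(row) <= max_het
    ]
    n_snps = len(matrix[0]) if matrix else 0
    keep_snps = []
    for j in range(n_snps):
        column = [matrix[i][j] for i in keep_samples]
        if column and missing_rate(column) <= max_missing \
                and minor_allele_frequency(column) >= min_maf:
            keep_snps.append(j)
    return keep_samples, keep_snps


# Hypothetical toy panel: 4 samples x 3 SNPs.
panel = [
    [0, 2, 0],
    [2, 2, 0],
    [0, 2, 0],
    [1, 1, None],  # too many missing calls and too heterozygous
]
kept_samples, kept_snps = filter_panel(panel)
print(kept_samples, kept_snps)
```

Filtering samples before markers is itself a design choice here. The source does not specify the order.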

Genome-wide association analysis

The population structure, relative kinship, and LD were analyzed in a previous study [20]. In the current study, we completed a GWAS using the GAPIT package [40] in the R software. A mixed linear model program (Q + K) [41], with the population stratification results and kinship as covariates, was used to minimize false positives [40]. The P value threshold was calculated based on the number of markers (P = 1/n, n = total number of SNPs used) as described by Li et al. [42]. Regarding the GWAS results, a P value of 1/11,140 (−log10P = 4.05) was used as the criterion for identifying significant SNPs.
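The significance threshold quoted above follows directly from P = 1/n. The short sketch below reproduces the arithmetic for n = 11,140 SNPs. The function names are hypothetical helpers, not part of GAPIT.

```python
import math


def gwas_threshold(n_snps):
    """Return (P threshold, -log10 P threshold) for the P = 1/n criterion."""
    if n_snps <= 0:
        raise ValueError("n_snps must be positive")
    p = 1.0 / n_snps
    return p, -math.log10(p)


def is_significant(p_value, n_snps):
    """True if a marker's P value is at or below the 1/n threshold."""
    return p_value <= 1.0 / n_snps


p_threshold, neg_log10 = gwas_threshold(11140)
print(round(neg_log10, 2))
```

The printed value rounds to 4.05, which matches the −log10P criterion stated in the text.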

Prediction of candidate genes and expression analysis

The ‘Chinese Spring’ Genome database (IWGSC RefSeq v1.0, http: // was used for predicting candidate genes for the significant sites revealed by the genome-wide association analysis. Specifically, candidate genes around the significant sites were identified according to the differences in the LD decay distance among chromosomal groups. The expression profiles of putative candidate genes were analyzed using a wheat gene expression database available online ( This database, which includes 850 wheat RNA-sequencing samples and an annotated genome, reveals the similarities and differences between homoeolog expression levels in diverse tissues, developmental stages, and cultivars [33, 43].

Annual Review of Plant Biology

AIMS AND SCOPE OF JOURNAL: The Annual Review of Plant Biology, in publication since 1950, covers the significant developments in the field of plant biology, including biochemistry and biosynthesis, genetics, genomics and molecular biology, cell differentiation, tissue, organ and whole plant events, acclimation and adaptation, and methods and model organisms.

Fruit Development and Ripening

Studies of dry fruits (such as of the small weed Arabidopsis) and fleshy fruits (such as our friend the tomato) reveal strong similarities in the molecular circuits that control fruit development and maturation, with implications for crop improvement.

Perennial Grains and Oilseed Crops

Current farming practices are generating unprecedented yields, but come at a price to the ecosystem. Soil erosion, greenhouse gas emissions, and water pollution all result from modern farming. What are the possible alternatives? Some scientists are studying the potential for perennial grains and oilseed crops to alleviate these farming challenges. These can support the health of the ecosystem, but can they also continue to meet a growing population's need for more grain-intensive food?


Fasciola hepatica, the liver fluke, is a trematode parasite of considerable economic importance to the livestock industry and is a re-emerging zoonosis that poses a risk to human health in F. hepatica-endemic areas worldwide. Drug resistance is a substantial threat to the current and future control of F. hepatica, yet little is known about how the biology of the parasite influences the development and spread of resistance. Given that F. hepatica can self-fertilise and therefore inbreed, there is the potential for greater population differentiation and an increased likelihood of recessive alleles, such as drug resistance genes, coming together. This could be compounded by clonal expansion within the snail intermediate host and aggregation of parasites of the same genotype on pasture. Alternatively, the widespread movement of animals that typically occurs in the UK could promote high levels of gene flow and prevent population differentiation. We identified clonal parasites with identical multilocus genotypes in 61% of hosts. Despite this, 84% of 1579 adult parasites had unique multilocus genotypes, which supports high levels of genotypic diversity within F. hepatica populations. Our analyses indicate a selfing rate no greater than 2%, suggesting that this diversity is in part due to the propensity of F. hepatica to cross-fertilise. Finally, although we identified high genetic diversity within a given host, there was little evidence for differentiation between populations from different hosts, indicating a single panmictic population. This implies that, once they emerge, anthelmintic resistance genes have the potential to spread rapidly through liver fluke populations.
