Microsatellite loci are widely used to understand how the genetic diversity of marine organisms is distributed across geographic space, and this information can be useful for the management of marine resources (Canales-Aguirre et al., 2010a, 2010b, 2016; Galleguillos et al., 2011, 2012; Ferrada-Fuentes et al., 2014). Traditional Short Sequence Repeat (SSR) development is time-consuming and requires laborious iterations of genomic DNA library screening with SSR probes to isolate microsatellite-containing sequences (Castoe et al., 2012). Next-generation sequencing technologies are now well developed and are widely used for genome sequencing, transcriptome sequencing, and genome deep-sequencing in animals (Ferrada-Fuentes et al., 2014; Plough & Marko, 2014). Technologies such as Illumina, 454 or SOLiD sequencing can be used to identify and characterize microsatellite loci, quickly yielding a large number of loci without the need to clone a library (Castoe et al., 2012).
The anchovy, Engraulis ringens Jenyns, 1842, is a small pelagic fish exploited for its high commercial value in the Humboldt Current System. This species sustains the world's largest single-species fishery, with an average of 6.5 million tons landed per year over the last decade (Bertrand et al., 2008). The anchovy plays a key ecological role in the Humboldt Current System because it is the major prey of predators such as fish, marine mammals and seabirds (Espinoza & Bertrand, 2008). Thus, the anchovy is essential for maintaining the integrity of this ecosystem (Espinoza & Bertrand, 2008). To date, little is known about its genetic diversity and spatial genetic structure. Only two polymorphic genes (i.e., Calmodulin and Internal Transcribed Spacers, ITS1) have been used; however, these showed low polymorphism at the population level (Ferrada et al., 2002). To help fill this gap in knowledge by providing more polymorphic molecular markers, in this study we report the isolation and characterization of 32 polymorphic microsatellite loci for E. ringens. These loci were developed as a tool for estimating genetic diversity and population genetic structure in this species, with the goal of providing baseline information for management plans aimed at its protection.
The samples used in this study were collected in accordance with the national legislation of the country (Chile). We did not kill fishes for the purpose of this study. We obtained tissue samples from fish caught by authorized commercial purse seine fishing vessels. No specific approval is required for this vertebrate. Total genomic DNA was extracted from four individuals of Engraulis ringens collected in the Arauco Gulf, Chile (36°55.2'S, 73°22.8'W), using the Nucleospin Tissue Kit (Macherey-Nagel), following the manufacturer's protocol. DNA samples were checked with an Agilent 2100 Bioanalyzer. The enriched library was built using 500 ng to 1 μg of DNA and the GS Rapid Library Preparation kit, and part of the genome was sequenced in a single lane run on a Roche 454 GS Junior system. Sequencing was performed at OMICS Solutions (http://omicssolutions.cl). Sequencing generated a total of 80.7 Mb of quality-filtered data, corresponding to 136,537 non-redundant reads. The MISA 4.0 software (http://pgrc.ipk-gatersleben.de/misa/) was used to search for repeated motifs (di-, tri-, tetra-, penta-, and hexanucleotide), and primers were designed using Primer3 (http://bioinfo.ut.ee/primer3-0.4/). A total of 27,352 reads containing microsatellites were detected, of which 13,211 reads were suitable for primer design.
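To illustrate what a motif search of this kind does, the following is a minimal Python sketch that scans a read for perfect di- to hexanucleotide repeats with a regular expression. It is not MISA and does not reproduce its algorithm or output format; the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration, since the thresholds used in this study are not stated.

```python
import re

# Minimum number of repeats per motif length. These thresholds are
# illustrative assumptions, not the MISA settings used in the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # One motif of motif_len bases followed by >= (min_count - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_count - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs built from a shorter unit (e.g. "AAAA", "ACAC");
            # those repeats belong to the shorter motif class.
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            repeats = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, repeats))
    return sorted(hits)


# Hypothetical read containing an (ATGA)5 and an (AC)12 repeat.
read = "TTGC" + "ATGA" * 5 + "CCGT" + "AC" * 12 + "GGA"
print(find_ssrs(read))
```

Reads that contain a hit and have enough flanking sequence on both sides would then be passed to a primer design step; that step is not sketched here.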
Genomic DNA from 45 individuals was used to test 80 microsatellites. Polymerase chain reaction (PCR) amplifications were performed in a 10 μL volume (10 mM Tris pH 8.4, 50 mM KCl, 25 μg mL⁻¹ BSA, 0.4 μM unlabeled reverse primer, 0.04 μM forward primer fluorescently labelled with 6-FAM, NED, PET or VIC, 3 mM MgCl2, 0.8 mM dNTPs, 0.5 units Taq Polymerase (Invitrogen), and 20 ng DNA template) using an Applied Biosystems GeneAmp 9700 thermal cycler. A touchdown thermal cycling program (Don et al., 1991) encompassing a 10°C span of annealing temperatures, from 65°C down to 55°C, was used for all loci. Touchdown cycling parameters consisted of an initial denaturation step of 5 min at 95°C, followed by 20 cycles of 95°C for 30 s, annealing for 30 s starting at 65°C and decreasing by 0.5°C per cycle, and 72°C for 30 s; then 20 cycles of 95°C for 30 s, 55°C for 30 s, and 72°C for 30 s; and a final extension at 72°C for 5 min. PCR products were run on an ABI-3130xl sequencer using GS-500 (LIZ) as an internal size standard. Microsatellites were analyzed using Peak Scanner v1.0 (Applied Biosystems). Unambiguous scoring was possible for 32 polymorphic loci. We assessed the genetic diversity of the 32 polymorphic loci in 45 specimens collected from the Arauco Gulf, Chile.
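The annealing temperatures implied by this touchdown program can be listed with a short Python sketch. It assumes that the first touchdown cycle anneals at 65°C and that each subsequent cycle is 0.5°C cooler, so that the 20th cycle anneals at 55.5°C before the 20 fixed cycles at 55°C; the function name `touchdown_schedule` is hypothetical.

```python
def touchdown_schedule(start_temp=65.0, step=0.5, touchdown_cycles=20,
                       final_temp=55.0, final_cycles=20):
    """Return the annealing temperature (in °C) of every PCR cycle, in order."""
    # Touchdown phase: assumed to start at start_temp and drop by `step`
    # after each cycle.
    temps = [start_temp - step * cycle for cycle in range(touchdown_cycles)]
    # Fixed phase: constant annealing temperature.
    temps += [final_temp] * final_cycles
    return temps


schedule = touchdown_schedule()
for cycle, temp in enumerate(schedule, start=1):
    print(f"cycle {cycle:2d}: anneal at {temp:.1f} °C")
```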
Characteristics of the loci are provided in Table 1. We estimated the number of alleles per locus (NA), observed and expected heterozygosity (HO and HE), and the probability of identity (PI) using GenAlEx v6.5 (Peakall & Smouse, 2012). The presence of null alleles was evaluated using MicroChecker v.2.2.3 (Van Oosterhout et al., 2004). Tests for deviations from Hardy-Weinberg equilibrium (HWE) and for linkage disequilibrium were conducted using Genepop v4.0 (Rousset, 2008). Parameters for linkage disequilibrium were as follows: 1000 dememorization steps, 100 batches, and 1000 iterations per batch.
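As an illustration of how the per-locus statistics are defined, the sketch below computes NA, HO, HE and PI from a list of diploid genotypes in Python. It assumes HE = 1 − Σp² and PI = Σpᵢ⁴ + ΣΣ(2pᵢpⱼ)² for i < j, without a small-sample correction; the function name `locus_summary` and the example genotypes are hypothetical and are not data from this study.

```python
from collections import Counter
from itertools import combinations


def locus_summary(genotypes):
    """Summarize one locus from (allele1, allele2) tuples.

    Individuals with missing genotypes must be removed beforehand.
    """
    n = len(genotypes)
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [count / (2 * n) for count in allele_counts.values()]

    observed_het = sum(1 for a, b in genotypes if a != b) / n
    expected_het = 1.0 - sum(p ** 2 for p in freqs)
    # Probability that two unrelated individuals share the same genotype.
    prob_identity = (sum(p ** 4 for p in freqs)
                     + sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2)))

    return {"Ni": n, "NA": len(freqs), "HO": observed_het,
            "HE": expected_het, "PI": prob_identity}


# Hypothetical genotypes (allele sizes in bp) for a two-allele locus.
example = [(157, 163), (157, 163), (157, 157), (163, 163)]
print(locus_summary(example))
```

For a locus with two equally frequent alleles these formulas give HE = 0.500 and PI = 0.375, which matches the values listed for 50μER in Table 1.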
Table 1 Characterization of 32 polymorphic microsatellite loci developed for Engraulis ringens from central Chile.
Locus (accession no.) | Primer pair sequence (5′-3′) | Dye | Repeat motif | Size range (bp) | Ni | NA | HO | HE | PI | P-value (HWE) |
---|---|---|---|---|---|---|---|---|---|---|
6μER (KY073503) | F: TGGGTTGATAAATAGACTAGA R: AGTATTAACACTTGTAGGTGC | 6-FAM | (ATGA)5 | 175-218 | 35 | 6 | 0.771 | 0.677 | 0.150 | 0.913 |
7μER (KY073509) | F: GGATGATATTTCTCACTTTG R: GTTTTTCACACTCTAAATGTC | VIC | (CACT)5 | 226-255 | 37 | 4 | 0.297 | 0.408 | 0.414 | 0.004 |
9μER (KY073517) | F: GATAAAAGCACTGTCTGTATT R: ACTAATGAATGTTAAGCAGTC | NED | (GTCT)5 | 257-320 | 37 | 8 | 0.730 | 0.787 | 0.078 | 0.444 |
11μER (KY073523) | F: GTCAAGGAAAACAGTTTATT R: AGAGCACAATAGAAGTTGATA | PET | (TATT)5 | 168-273 | 38 | 6 | 0.474 | 0.667 | 0.144 | 0.009 |
17μER (KY073515) | F: TTAGTATATGGGTATGTGTCC R: CAAGATTCACACTATGTAAGC | 6-FAM | (TGA)7 | 155-218 | 35 | 14 | 0.200 | 0.875 | 0.027 | 0.000† |
18μER (KY073496) | F: AAACACTACACTCATGAACTG R: GAGTCTACATGTGTAAAGTCG | VIC | (AC)12 | 113-255 | 44 | 20 | 0.750 | 0.905 | 0.016 | 0.000† |
21μER (KY073495) | F: ATGTACAACTTCCAAAATCT R: ATTACTGGTATGAAATGAGTG | PET | (CTA)8 | 117-229 | 41 | 18 | 0.366 | 0.911 | 0.014 | 0.510 |
22μER (KY073505) | F: GTGTGTATGTTTCTTTTCAA R: TCTCTATGGGACTTTAACATA | 6-FAM | (CAGA)6 | 273-296 | 43 | 6 | 0.674 | 0.659 | 0.166 | 0.003 |
27μER (KY073519) | F: GCTTTCTGGATGTTTTAGAT R: AACACTATCTGACAACTGACA | VIC | (TGTC)6 | 149-257 | 44 | 4 | 0.455 | 0.513 | 0.347 | 0.833 |
28μER (KY073499) | F: CATGGTTTTAAATCTGTGAC R: TACAAATGAGAGCAAATACA | NED | (TTTA)6 | 239-297 | 41 | 12 | 0.854 | 0.857 | 0.036 | 0.972 |
25μER (KY073504) | F: CATCAAAGTTATTCACTTCAC R: GAATTCTTCTAACTGACACT | PET | (TCAT)6 | 145-287 | 35 | 7 | 0.886 | 0.646 | 0.180 | 0.310 |
44μER (KY073512) | F: TACACAAGACGTTTCAGAGT R: TACATACAACCTTGGAGACT | 6-FAM | (TGG)9 | 129-160 | 33 | 11 | 0.273 | 0.884 | 0.025 | 0.000† |
36μER (KY073518) | F: GTGTATATTTGATGGCACTT R: CAGAGTACTCTTTGAGTGTTG | VIC | (TGGTG)5 | 142-194 | 39 | 7 | 0.205 | 0.464 | 0.308 | 0.000† |
38μER (KY073510) | F: GACAAGAGATTAACATTACCA R: AATAACTGTAAGTCGCTTTG | NED | (TTACA)5 | 147-158 | 41 | 3 | 0.171 | 0.298 | 0.513 | 0.003 |
35μER (KY073521) | F: CTCAGTGGAAACAAGTCACT R: TAAATACATGCTTAAGAGTCC | PET | (GCAAG)5 | 147-174 | 40 | 4 | 0.425 | 0.457 | 0.332 | 0.060 |
75μER (KY073497) | F: CTGTAATATCCACTCAAAGAT R: TTTTTCACAGTATAATGCTG | 6-FAM | (ACACAG)9 | 172-358 | 39 | 16 | 0.359 | 0.923 | 0.014 | 0.000† |
45μER (KY073502) | F: ATAAAAAGTTGAGGCTGTTT R: GACTTTGAAGACAGCTGTAC | VIC | (ATTC)7 | 162-203 | 44 | 9 | 0.795 | 0.765 | 0.088 | 0.460 |
48μER (KY073520) | F: CCAATAGTTCAGTAGTACCAG R: ACAGTAAGCTAGAGTATCCAG | NED | (GCT)10 | 136-175 | 43 | 12 | 0.860 | 0.832 | 0.048 | 0.995 |
39μER (KY073522) | F: GTTGTCAGAAGCTTTAGTCA R: GATAAGAAATACACAGGAAAG | PET | (TTTCC)5 | 143-193 | 44 | 10 | 0.636 | 0.829 | 0.048 | 0.032 |
50μER (KY073514) | F: GTCTAGGGGTGTAAATAATAA R: TCCACTTCTATTATGTTATGG | 6-FAM | (ATCGT)6 | 157-163 | 42 | 2 | 0.976 | 0.500 | 0.375 | 0.000† |
53μER (KY073498) | F: TGTAGAAGAAAGTGACAGAGA R: GTTTATTGGTGTGAGTCATT | VIC | (GA)17 | 136-235 | 45 | 33 | 0.822 | 0.943 | 0.006 | 0.033 |
61μER (KY073501) | F: GAGATTTAGCAGACAGATGT R: TAGTATTCTCAACACGTAGCT | NED | (AG)20 | 190-226 | 45 | 11 | 0.756 | 0.880 | 0.027 | 0.258 |
49μER (KY073508) | F: ATCATTATGCTAATGTCTGC R: GGACTTTTTAGCATCAGTAT | PET | (ACAGC)6 | 126-171 | 44 | 10 | 0.773 | 0.830 | 0.050 | 0.427 |
65μER (KY073511) | F: TCTGTGACTTCTGTAACTCAG R: GTGTGTGAGGTTTAGATGTG | VIC | (GT)21 | 139-252 | 43 | 18 | 0.721 | 0.880 | 0.026 | 0.000† |
67μER (KY073525) | F: GTTCATTAATAAGCAGAAGAG R: TTCACCAAGATATTACTCACT | PET | (CACGCA)6 | 232-297 | 42 | 14 | 0.690 | 0.822 | 0.048 | 0.001 |
63μER (KY073524) | F: TTCATTATCACAGCTAGTAGC R: GAACTGATAAAGAGGAGAGAT | PET | (CTTCT)8 | 118-208 | 42 | 14 | 0.643 | 0.848 | 0.039 | 0.000† |
72μER (KY073506) | F: TTTTCTTTACATTAGCACAG R: AATTTGTAGTACAGCTGTGTC | NED | (ACACAT)5 | 187-308 | 45 | 11 | 0.444 | 0.660 | 0.174 | 0.001 |
73μER (KY073516) | F: ACTTTGAGTCTGGAATAAAGT R: CCATAGATTAGAGGACAATAA | NED | (TGA)17 | 106-185 | 44 | 20 | 0.932 | 0.915 | 0.014 | 0.678 |
69μER (KY073513) | F: GGCCTACATTAATAACATACT R: CTAATGTGGGAATATAGTGAG | PET | (TAA)15 | 122-174 | 45 | 18 | 0.756 | 0.915 | 0.013 | 0.000† |
26μER (KY073500) | F: ACTACAGTAACTTCATGATGG R: TGGAATACAGTAGAGTAGGTG | NED | (TGGA)6 | 133-229 | 39 | 15 | 0.333 | 0.842 | 0.040 | 0.000† |
75μER (KY073497) | F: CTGTAATATCCACTCAAAGAT R: TTTTTTCACAGTATAATGCTG | 6-FAM | (TTAGGG)5 | 223-323 | 39 | 16 | 0.359 | 0.911 | 0.014 | 0.000† |
24μER (KY073494) | F: ATAGTAGGCCACACTCACTC R: ATGACATCATTGTGAGAACT | VIC | (TCAC)6 | 201-250 | 44 | 10 | 0.500 | 0.819 | 0.056 | 0.000† |
Size range: range of observed allele sizes in base pairs, not including the primers; Ni: number of individuals genotyped; NA: number of alleles observed; HO and HE: observed and expected heterozygosity, respectively; PI: probability of identity for each locus; HWE: exact P-values of the Hardy-Weinberg equilibrium test.
†Significant deviation from Hardy-Weinberg expectations after Bonferroni correction (corrected alpha = 0.001).
Loci with the possible presence of null alleles are in bold.
Thirty-two of the 80 tested primer pairs amplified high-quality PCR products that were polymorphic. Amplification failure was low among the samples used in this study (maximum of 26% for 44μER; Table 1).
The number of alleles per locus ranged from 2 (50μER) to 33 (53μER), observed heterozygosity ranged from 0.171 (38μER) to 0.976 (50μER), and the probability of identity values ranged from 0.006 (53μER) to 0.513 (38μER). No linkage disequilibrium was found between any pair of loci, indicating that the markers were independent.
Twelve loci showed moderate polymorphism (2-9 alleles) and 20 loci were highly polymorphic (10-33 alleles). The number of alleles and observed heterozygosity obtained in E. ringens were higher than previously reported in marine fishes (NA = 20 and HO = 0.790) (DeWoody & Avise, 2000). High NA and HO are common in species that have a large population size and/or a high mutation rate (higher than 10⁻⁴ mutations/gene/generation) (Jarne & Lagoda, 1996). After Bonferroni correction for multiple comparisons, 12 loci showed significant deviations from expectations under HWE. There are several possible explanations for the heterozygote deficits at these 12 loci. Rico et al. (1997) reviewed the possible causes of excess homozygosity, evaluating hypotheses that included genotyping errors, the presence of null alleles, the Wahlund effect (Wahlund, 1927), inbreeding, assortative mating, and/or selection. Deviations from HWE associated with an excess of homozygotes seem to be common in fish species (O'Connell & Wright, 1997). Moreover, null alleles have commonly been reported during the characterization of microsatellite loci and in population genetics studies (Dakin & Avise, 2004).
Therefore, the low occurrence of amplification failure at these loci may indicate that the significant excess of homozygotes is a common characteristic of fish populations that results from biological phenomena rather than from the presence of null alleles. We recommend evaluating these hypotheses before using loci in disequilibrium in population studies.
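One simple way to express the direction and size of a departure from HWE at a locus is the fixation index F = 1 − HO/HE, which is positive when heterozygotes are deficient and negative when they are in excess. The Python sketch below applies it to HO and HE values taken from Table 1; this index was not part of the analysis reported here, and the function name `fixation_index` is hypothetical.

```python
def fixation_index(observed_het, expected_het):
    """Return F = 1 - HO/HE; F > 0 means fewer heterozygotes than expected."""
    if expected_het <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - observed_het / expected_het


# (HO, HE) pairs copied from Table 1 for three of the loci.
table1_values = {
    "17μER": (0.200, 0.875),
    "28μER": (0.854, 0.857),
    "50μER": (0.976, 0.500),
}

for locus, (ho, he) in table1_values.items():
    print(f"{locus}: F = {fixation_index(ho, he):+.3f}")
```

A strongly positive F, as for 17μER, is compatible with any of the causes of homozygote excess listed above, so the index describes the deviation but does not by itself discriminate between null alleles and biological explanations.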
These new loci will provide tools for examining population structure of this species and aid in better understanding of the ecology and conservation of E. ringens. This information will help determine appropriate fisheries management action along the Humboldt Current System coast for this highly exploited species.