Can insect data be used to infer areas of endemism? An example from the Yungas of Argentina

The main objective of this study is to assess whether areas of endemism can be characterized quantitatively using insects, which are generally much more poorly sampled than vertebrates and plants. The search for areas of endemism was carried out using an optimality criterion on approximately 1,100 georeferenced records of 288 species of holometabolous insects present in the study region, north-western Argentina, specifically the Yungas (a very humid montane rainforest). The software NDM/VNDM, which applies this optimality criterion, was used to search for areas of endemism (i.e. sets of cells defined by two or more species). Five grid sizes were used: three square (1°, 0.5° and 0.25°) and two rectangular (0.25° x 0.5° and 0.5° x 0.25°). The results indicate that the Yungas can be characterized as a biogeographic unit with its own identity, in agreement with previous biogeographic proposals. We obtained 26 areas of endemism with 23 endemic insect species (in 14 families) restricted to the Yungas and 46 endemic species (in 10 families) present in the Yungas and adjacent habitats. Our analysis suggests that insects can be a powerful tool for identifying areas of endemism, even considering how fragmentary the current knowledge of these groups is in South America. The use of different grid sizes was crucial; small and medium sizes are highly recommended for identifying different patterns. The quantitative method used made it possible to identify areas of endemism that are difficult to recognize with traditional biogeographic methods, such as disjunct or partially overlapping areas.


INTRODUCTION
Many biogeographic proposals that describe different regions, provinces, or domains in South America have been put forward (Cabrera 1971, Cabrera & Willink 1973, Hueck 1978, Morrone 2000, 2001, 2006, Willink 1991). Although based on the vast experience of one or more specialists, most of these compilations are qualitative and rest solely on the authors' common sense. As a result, the validity of many of the areas proposed in these studies is difficult to reformulate and/or assess. Such is the case of the Yungas, a territory that extends over 4,000 km from Venezuela to north-western Argentina and has been characterized almost exclusively by its flora (Cabrera 1971, Hueck 1978).
According to some authors (Cabrera & Willink 1973, Brown 1995, Graham 1995, Prado 1995), the Yungas are a heterogeneous unit, whose major differences regarding fauna and flora are the result of climatic and historic factors. Cabrera & Willink (1973) proposed that the Yungas can be characterized on the basis of floristic components (even when exclusive floral elements are scanty), and that the fauna is mostly composed of taxa from nearby areas (without unique elements); this is particularly clear in north-western Argentina, where the Yungas display a combination of plants from arid and semiarid Chaco and the Paranaense forests.
In contrast to Cabrera & Willink (1973), Morrone (2000, 2001, 2006) compiled a list of apparently endemic taxa, which includes some insect species. However, on closer examination, most of the taxa in his list cannot be used to characterize the Yungas as a unit, since they are present only in small sectors of the Yungas. For example, Nothocercus nigrocapillus (Tinamidae) is only present in Peru and part of Bolivia (Fjeldsa & Krabbe 1991).
The proposals made so far disagree both on the role given to fauna and flora in characterizing the Yungas and on the number and boundaries of regions and provinces.
However, biogeographers generally agree that these biogeographic units both show a characteristic landscape and have endemic species.
Areas of endemism are the study units in biogeographic and conservation research, but their recognition has been hindered by the lack of appropriate methodology. The most commonly used methods to determine areas of endemism are Parsimony Analysis of Endemicity (Morrone 1994) and UPGMA (Linder 2001). However, both methods were developed to identify patterns outside the field of biogeography. The method developed by Szumik et al. (2002) and Szumik & Goloboff (2004) attempts to remedy this situation by applying an explicit criterion of optimality to evaluate areas of endemism, that is, the distributional congruence of taxa (Platnick 1991). This includes the spatial component lacking in parsimony or UPGMA, and is implemented in the programs NDM/VNDM (available at http://www.zmuc.dk/public/phylogeny/endemism).
Our data set consists of about 1,100 records of 288 species for some important insect groups (Hymenoptera, Diptera, and Lepidoptera), which are clearly much more poorly sampled than vertebrates or plants (see Aagesen et al. 2009). Additionally, the information on insect distribution in South America clearly indicates that we are very far from a perfectly sampled zone (as in European or North American biodiversity studies).
This work attempts to evaluate whether applying formal methods leads to reasonable conclusions despite the low sampling density. The question posed is whether such a low sampling density is enough to reach conclusions qualitatively similar to those established in previous, informal analyses. Recovering patterns previously described by prestigious biogeographers, the Yungas in this case, is a strong indication of the potential of the method. It is important to remark that the sole assumption of this method is that concordance in the distribution of various species indicates the presence of endemism, which is the result of historical and ecological factors (Szumik et al. 2002). The endemic species should have significantly similar distributions, and to be considered endemic to an area, a species must be found throughout the area; that is to say, species with non-congruent distributions cannot be seen as part of the same phenomenon or the same area of endemism (Casagranda et al. 2009).

Study region
The Yungas, one of the most important biogeographic areas in South America, extend north to Venezuela (or only to Peru, according to some authors) and south to north-western Argentina. They are located on the eastern slopes of the Andes, between 300 and 3,500 m of altitude (Cabrera 1971, Cabrera & Willink 1973, Brown 1995, Morales et al. 1995, Morrone 2000, 2001, 2006). In Argentina they span over 600 km from north to south, with a surface of 4.5 million ha (in the provinces of Jujuy, Salta, Tucumán and Catamarca; see Fig. 1), and have an altitudinal range of 400 to 3,000 m (Cabrera 1976). Over 150,000 ha of the Argentinean Yungas lie within protected areas (Fig. 1) such as El Rey, Calilegua, Baritú and San Javier.

Taxa used in this study
Our data set consists of about 1,100 records of 288 species (with some species having more than 20 records, and others just two or three). This means that the number of records per 100 square kilometers in our data set is 2.4. However, the sampling density in biogeographic studies of this kind, which rely on taxonomic rather than ecological information, is usually very low; when reported, it is typically as low as or lower than the density in the present study (e.g. ~1.6 records per 100 km²).

Three orders of holometabolous insects have been included in this study: Lepidoptera, Diptera and Hymenoptera. Of the 288 species included here, 31 belong to two families of Lepidoptera, 140 to 24 families of Diptera, and 117 to Hymenoptera (Formicidae). See Tables 1 to 6 for details. The records used for those species come from specimens of the collection in the Instituto-Fundación Miguel Lillo, Argentina, as well as recent reviews and catalogs on these families (Papavero 1966-1984, Kempf 1972, Lizarralde de Grosso 1989, Poole 1989, Brandão 1991, Cuezzo 1998, Lizarralde de Grosso 1998, Scoble 1999). All the records were georeferenced (using plane coordinates), with information supplied by the Instituto Geográfico Militar (http://www.igm.gov.ar) and Biolink (http://www.biolink.csiro.au). We include records not only from the Yungas but also from surrounding areas (e.g. the Chacoan, Espinal and Paraná subregions). Cuezzo et al. (2007) provide preliminary information related to this paper.
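The density figure quoted above can be checked with a short calculation. The sketch below is one reading that reproduces it, assuming the reference surface is the 4.5 million ha given for the Argentinean Yungas under "Study region" and that the record count is the 1092 georeferences analyzed; the text does not state which surface was actually used, so both choices are assumptions, and the function name is hypothetical.

```python
# Sampling density expressed as records per 100 km^2.
# Assumption: the reference surface is the 4.5 million ha quoted for the
# Argentinean Yungas; the source does not say which surface was used.

HECTARES_PER_KM2 = 100  # 1 km^2 = 100 ha


def records_per_100_km2(n_records: int, area_ha: float) -> float:
    """Return the number of records per 100 square kilometers."""
    area_km2 = area_ha / HECTARES_PER_KM2
    return n_records / area_km2 * 100


density = records_per_100_km2(n_records=1092, area_ha=4.5e6)
print(round(density, 1))
```

Under these assumptions the result rounds to the 2.4 records per 100 km² reported in the text.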

Identification of areas of endemism
The data matrix of 288 species and 1092 georeferences was analyzed using the grid-based method to identify areas of endemism proposed by Szumik et al. (2002) and Szumik & Goloboff (2004). The general idea of areas of endemism is not associated with a specific type of causal factor, but only with the existence of a common one; if a single factor affects the distribution of numerous groups of organisms at the same time, the distributions of those organisms are expected to show similar patterns, regardless of whether the causal factor is historical or ecological (Szumik & Goloboff 2004). The method basically evaluates spatial concordance between two or more taxa for a given area (set of cells).
VNDM reads the records as coordinates and converts them easily into presence/absence data on grids of different sizes. For each species, the method assigns a score of endemicity (e) to sets of cells (= areas) according to how well the species distribution matches the area. For a given species, the score increases as fewer records exist outside the area and more records exist inside. The total endemicity score (E) is the sum of the values for each species (e).

$$E = \sum_{i} e_i$$
where e_i is the endemicity score (e) of individual species i. The value (e) for a given species varies between 0 (non-scoring) and 1 (maximum score: species found in all cells of an area, and in no cells outside). The method also allows distinguishing between actually observed, probable, and inferred records. See Szumik & Goloboff (2004) for more details.

[Table fragment; column layout lost in extraction]
. 44,116,119,127,155,161,184,199,201,212,243,247,248,Spp. 119;127;152,257,271 251,257,271
MFSB Spp. 94,119,127,136,155,160,161,177,199,201,207,212,221,Spp. 118 243,245,247,248,249,257,271,286
MFSM Spp. 84,94,118,119,136,155,160,207,221,249

The optimality criterion consists of selecting those areas with the maximum value of endemicity.
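The scoring logic described above can be illustrated with a deliberately simplified sketch. The formula used here, occupied cells inside the area divided by the area size plus the occupied cells outside, is an assumption chosen only because it reproduces the stated properties (range 0 to 1, a score of 1 when the species fills the area and is absent elsewhere); it is not the formula implemented in NDM/VNDM, which also weights probable and inferred records (see Szumik & Goloboff 2004). All names and cell sets are hypothetical.

```python
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]  # (column, row) index of a grid cell


def species_score(area: Set[Cell], occupied: Set[Cell]) -> float:
    """Simplified endemicity score e for one species (NOT the NDM formula).

    Rises with the number of area cells occupied and falls with the
    number of occupied cells outside the area; ranges from 0 to 1.
    """
    inside = len(area & occupied)
    outside = len(occupied - area)
    return inside / (len(area) + outside)


def area_score(area: Set[Cell], distributions: Dict[str, Set[Cell]]) -> float:
    """Total endemicity E: the sum of e over all species."""
    return sum(species_score(area, cells) for cells in distributions.values())


# Hypothetical area of four cells and three hypothetical species.
area = {(0, 0), (0, 1), (1, 0), (1, 1)}
distributions = {
    "sp_A": {(0, 0), (0, 1), (1, 0), (1, 1)},  # fills the area, none outside
    "sp_B": {(0, 0), (0, 1), (1, 0), (2, 2)},  # 3 cells inside, 1 outside
    "sp_C": {(5, 5), (6, 5)},                  # entirely outside the area
}

for name, cells in distributions.items():
    print(name, species_score(area, cells))
print("E =", area_score(area, distributions))
```

In this toy case sp_A scores 1, sp_B scores 3/5 and sp_C scores 0, so E is their sum. An optimality search would then compare E across candidate sets of cells and keep the highest-scoring ones.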
In many studies of this kind, a single grid size of 1° x 1° is used. However, there is no formal argument for using only one grid size, and no criterion for selecting an appropriate cell size for a given study. We therefore analyzed the data set under five different grid sizes, three of which are square (1°, 0.5°, and 0.25°) and two rectangular (0.25° x 0.5° and 0.5° x 0.25°). Areas that survive changes in grid size can be considered more strongly and clearly supported by the data (Aagesen et al. 2009).
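The conversion of point records into presence cells under several grid sizes can be sketched as follows. This is a minimal illustration, not the VNDM implementation: it assumes records are given in decimal degrees (the study reports plane coordinates), that the first number in a rectangular size such as 0.25° x 0.5° is the cell width, and that the grid origin is fixed at (-180, -90). The records and all names are hypothetical.

```python
import math
from typing import Dict, Iterable, Set, Tuple

Record = Tuple[str, float, float]  # (species, longitude, latitude), decimal degrees
Cell = Tuple[int, int]             # (column, row) index of a grid cell

# The five grid sizes used in the study, as (width, height) in degrees.
# Assumption: in "0.25° x 0.5°" the first number is the cell width.
GRID_SIZES = [(1.0, 1.0), (0.5, 0.5), (0.25, 0.25), (0.25, 0.5), (0.5, 0.25)]


def to_cells(records: Iterable[Record], width: float, height: float,
             origin: Tuple[float, float] = (-180.0, -90.0)) -> Dict[str, Set[Cell]]:
    """Convert point records into the set of occupied cells per species."""
    lon0, lat0 = origin
    presence: Dict[str, Set[Cell]] = {}
    for species, lon, lat in records:
        col = math.floor((lon - lon0) / width)
        row = math.floor((lat - lat0) / height)
        presence.setdefault(species, set()).add((col, row))
    return presence


# Hypothetical records, for illustration only (not from the study's data set).
records = [
    ("sp_A", -65.30, -26.80),
    ("sp_A", -65.10, -26.60),
    ("sp_B", -64.90, -24.20),
]

for width, height in GRID_SIZES:
    presence = to_cells(records, width, height)
    print(f"{width} x {height}: sp_A occupies {len(presence['sp_A'])} cell(s)")
```

With these made-up coordinates, the two sp_A records share a cell on the 1° and 0.5° grids but fall into separate cells on the three finer grids, which is why the same data can support different areas at different resolutions.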

The Yungas as a unit
The insect data allow full recovery of the Argentinean Yungas as a biogeographic unit, in two main areas. One of these (Fig. 2A; E = 15.30-16.61 and 29 endemic species) includes roughly the whole Yungas. The other (Fig. 2B), with 10 endemic species, is included within the first one (Fig. 2A) and shares four species with it. Fig. 2 shows the results for the 1° x 1° grid; the Yungas are clearly mostly beyond the level of resolution of such a coarse grid. While the areas in Fig. 2 roughly follow the contour of the region generally recognized as Yungas, some cells are clearly outside that region (e.g. the southeastern cell of Fig. 2A does not comprise any area of Yungas). The inclusion of those cells is not strongly supported by the data (i.e. removing them lowers the endemicity score by no more than 1.5 to 3 % of the score of the respective areas), but is supported nonetheless (i.e. the endemicity score does decrease when they are removed). Were the available data much more detailed, this would probably be remedied by using smaller grid sizes, but given the low density of the records known at present, the smaller grid sizes produce very low scores of endemicity and a more diffuse identification of areas of endemism (vide infra). Concomitantly with this imperfect recognition of the Yungas, a small fraction of the species identified in our analysis as «endemic» for these two areas are in fact associated with other environments [Arid and semi-arid Chaco (dry forest), Montane or Highland Chaco, Monte (extensive shrubland), Prepuna and Puna (montane grasslands)] (Table 1). Another reason these species are identified as «endemic» is that many cells are large enough to contain records from both Yungas and non-Yungas habitats. Thus, the area of Fig. 2A has 14 species (48 % of its endemic species) present only in the Yungas environment (Formicidae: three; Diptera: 10; Geometridae: one), 13 species that are also distributed in neighboring zones of other environments, and two species (7 %) never found in Yungas-type habitats (Table 1).
Unlike the northern sector, the southern sector appears under all grid sizes and is represented by seven areas. One area (1° grid), which is the most inclusive (Fig. 3B), has an endemicity value (E) of 9.89-10.57 (Table 3, third column) and includes 19 endemic species, of which nine species (Table 3; Formicidae: 1; Diptera: 5; Noctuidae: 2; Geometridae: 1) are restricted to Yungas environments, seven are present in Yungas and other environments (Grasslands, Arid and semi-arid Chaco, Monte and Puna), and three species occur only in environments outside Yungas (Grasslands, Arid and semi-arid Chaco, Montane or Highland Chaco and Monte).
In turn, another area (1° grid) covers the same surface as the previous one except for central Salta (Fig. 3C). It is defined by 10 [...]
The combination of the northern and central sectors as an area (Fig. 4A) appears only under a 1° grid, with an endemicity value (E) of 3.50-3.75. Five species are identified as endemic (Table 4): two (40 %) restricted to Upper Montane Forest and Montane Cloud Forest and three (60 %) present in Yungas as well as other environments. The combination of northern and southern sectors is recognized in three grid sizes (1°, 0.5° and 0.25° x 0.5°), with seven areas altogether (Table 5). This combination has already been proposed as an area of high density of endemism, on the basis of qualitative analyses (Brown et al. 2001). The species that give score to this combination do not define each sector separately. Furthermore, they are completely different from those that define the combination of central and southern sectors, with the exception of one shared species.
The two areas in Figs. 4B and 4C (1° grid) share four cells and seven endemic species: five belong exclusively to the Yungas and are associated with Premontane and Montane Forests (Selva Basal), and two belong to both the Yungas and surrounding environments. The first area has 14 species identified as endemic, 86 % of which are exclusively associated with the Yungas (Table 5, second column), whereas the second area has 8 species identified as endemic, 63 % of which are exclusively associated with the Yungas (Table 5, third column). With a few differences in endemic species composition and/or extension, it is clear that the same pattern is obtained with different grid sizes (Fig. 4D).

[Table fragment; column layout lost in extraction]
Spp. 127,199,Spp. 243 248,Spp. 199,243 Spp. 199,243 Spp. 127,201,155,161,199,199,201,201,247,251 257 247 201,212,243,212,243,247,248,251,257,276 257,276
MFSB Spp. 127,131,Spp. 131,Spp. 127,199,Spp. 243,248,Spp. 199,243 Spp. 199,243 Spp. 127,201,155,161,199,199,201,201,247,286 257,286 247 201,212,243,212,243,247,248,257,257,276 276,286
MFSM Spp. 155
The combination of the central and southern sectors appears in areas on three grids (Table 6). In fact, the association of Premontane and Montane Forests of both sectors was proposed by Morales et al. (1995). The 15 endemic species that give score to this combination also define other areas (e.g. southern sector, the Yungas). Two areas (0.5° grid) (Fig. 4E) are disjoint units and include the Montane Forest in the central sector and patches of Montane Forest in the southern sector. Regarding the smaller grids, although there is a reduction of the surface and the number of endemic species, the pattern is similar to that for the 0.5° grid (Fig. 4F).

Other biogeographic units
A combination of patches of Yungas and Oriental Chaco (Fig. 5A) can be seen in all grid sizes. Some tree species of Premontane Forest that occasionally appear within Chaco have been reported (Prado 1995); they are usually related to gallery forests and are considered of 'non-chacoan lineage' (Adámoli et al. 1972) or 'subtropical forest transchacoan elements' (Morello & Adámoli 1974). During the climatic fluctuations of the Pleistocene, according to Prado (1995), the Yungas covered the whole of northern Argentina (reaching Córdoba province to the south) and these patches are remnants of that formation. Here, the area seen in Fig. 5B (1° grid) shows this combination of patches of Yungas, Chaco and Espinal.
Only in the smaller grid sizes (0.25°, 0.25° x 0.5°, 0.5° x 0.25°) is a combination of Yungas and Paranaense (NE of Misiones province) forests recognized. This combination probably represents neotropical tails which are the southern section of the Neotropical region. In the 1° x 1° grid this combination appears together with Oriental Chaco cells.
Although the analysis of the formation of the Paranaense Forest is not within the scope of the present paper, it is identified in all grid sizes (Fig. 5C), except the 0.25° x 0.25° grid.Also, there are areas that include patches of Misiones Forest and Oriental Chaco, or Misiones Forest and Espinal.

Altitudinal levels
Sixty-nine species of insects are recorded in the Yungas, 23 of which are endemic to this environment (Table 7). From the analysis of these records it can be concluded that each altitudinal level has some exclusive species, while some others are shared by two altitudinal levels (Table 7). This qualitative assessment is partially supported by the endemicity analysis. The three altitudinal levels are present in one area (0.5° grid) (southern and northern sectors) sharing three endemic species. According to our results, the Premontane Forest is always associated with Selva Basal (Lower Montane Forest) or with the whole Montane Forest (Selva Basal and Selva de Mirtáceas). Finally, the Montane Forest is the only altitudinal level that stands on its own in this analysis (0.5° x 0.25° grid).

DISCUSSION
In agreement with the proposal of Morrone (2000, 2001, 2006), our results indicate that the Yungas can be characterized as a biogeographic unit with its own identity, and that insects could be an excellent tool to identify areas of endemism. The extensive spatial concordance between our results and previous proposals for the Yungas indicates that insect data, even if fragmentary, can be used as reliable indicators of areas of endemism. The degree of spatial concordance between the areas recognized by our quantitative analysis and previous qualitative hypotheses could hardly be the result of chance, which would be an astonishing coincidence.
The use of the quantitative method to identify areas of endemism developed by Szumik et al. (2002) and Szumik & Goloboff (2004) has many advantages. First, the discontinuous distributional pattern of the Yungas, due to habitat fragmentation, can be recognized as such, given that this method identifies disjoint areas. Second, this method also allows the identification of partially overlapping areas when they have different sets of endemic species. Third, the programs used offer facilities that considerably simplify summarizing the results. In some cases, many sets differ simply by one or two cells, and have their scores given by the same species; with these programs, similar areas that differ by one or a few cells could be considered as a unit.
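The idea of treating near-identical areas as a unit can be sketched as a simple grouping step. The rule below, which links two areas when the Jaccard similarity of their scoring species reaches a cutoff, is an assumption made for illustration and is not claimed to be the consensus criterion implemented in NDM/VNDM; the area names, species identifiers and the 0.5 cutoff are all hypothetical.

```python
from typing import Dict, List, Set


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Shared elements divided by all elements of the two sets."""
    return len(a & b) / len(a | b) if a | b else 1.0


def group_areas(species_by_area: Dict[str, Set[int]], cutoff: float) -> List[Set[str]]:
    """Single-linkage grouping of areas by similarity of their scoring species.

    An area joins every existing group that contains at least one
    sufficiently similar area; groups linked this way are merged.
    """
    groups: List[Set[str]] = []
    for name, species in species_by_area.items():
        linked = [g for g in groups
                  if any(jaccard(species, species_by_area[m]) >= cutoff for m in g)]
        merged = {name}.union(*linked)
        groups = [g for g in groups if g not in linked] + [merged]
    return groups


# Hypothetical areas and species identifiers (not taken from the study's tables).
species_by_area = {
    "area_1": {1, 2, 3},
    "area_2": {1, 2, 3, 4},  # differs from area_1 by one scoring species
    "area_3": {7, 8, 9},     # shares no scoring species with the others
}

groups = group_areas(species_by_area, cutoff=0.5)
print(groups)
```

Here area_1 and area_2 share three of four scoring species and end up in one group, while area_3 stays on its own. A stricter or looser cutoff changes how many near-duplicate areas are collapsed.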
Another aspect studied here is the effect of the grid cell size on the results. The present analysis suggests that the use of several grid sizes is crucial; medium and small sizes in particular are highly recommended, as both identify seemingly different patterns.
In total, the 26 areas related to the Yungas yielded 23 endemic species (in 14 families) restricted to the Yungas environment (Table 2), and 46 endemic species (in 10 families) present in the Yungas and surrounding habitats (Chacoan and Paraná subregions, sensu Morrone 2006).
According to our results, the use of insect data to identify areas of endemism has both strengths and weaknesses. It is evident that the knowledge of these groups is too superficial for analyses to resolve the finer details and boundaries of the areas in question. Until the knowledge of these groups in tropical and subtropical areas is as detailed as that of the butterflies of Europe (e.g. as in Kudrna 2002), little more can be discussed, but that might take decades or centuries. Considering that even the current fragmentary knowledge of these groups allows identifying, if perhaps somewhat imprecisely, the main areas of endemism recognized before, insects are clearly a promising line of evidence. As research and collecting on these groups accumulate, they will probably be one of the key factors in identifying the main biotic regions of the Neotropics.

Fig. 2: The Argentinean Yungas are fully recovered as a biogeographic unit by means of NDM in two areas: A and B (1° grid).
Additionally, different grid sizes will be able to identify different patterns if some of the taxonomic groups display congruence at different scales (see discussion and hypothetical examples in Casagranda et al. 2009, p. 272).

TABLE 2
Specific composition of areas of endemism considering only the northern sector [Abbreviations as in Table 1].

TABLE 3
Specific composition of areas of endemism considering only the southern sector [Abbreviations as in Table 1].

TABLE 5
Specific composition of areas of endemism considering the combination of northern and southern sectors [Abbreviations as in Table 1].

TABLE 6
Specific composition of areas of endemism considering the combination of central and southern sectors of the Yungas [Abbreviations as in Table 1].