Dudley, E. C. (Ed.). 1991. The Unity of Evolutionary Biology: Proceedings of the Fourth International Congress of Systematic and Evolutionary Biology, Dioscorides Press, Portland, OR. 2 vols. 1048 pp.

CRITICAL ISSUES IN BIODIVERSITY SYMPOSIUM

Designing and Testing Sampling Protocols to Estimate Biodiversity in Tropical Ecosystems

Jonathan A. Coddington, Charles E. Griswold, Diana Silva Davila, Efrain Peñaranda, and Scott F. Larcher

Abstract. Sampling methods to estimate total species richness of a defined area (conservation unit, national park, field station, "community") will play an important role in research on the global loss of biodiversity. Such methods should be fast, because time is of the essence. They should be reliable, because diverse workers will need to apply them in diverse areas to generate comparable data. They should also be simple and cheap, because the problem of extinction is most severe in developing tropical countries where the scientific and museum infrastructure is often still rudimentary. In the past, two scientific fields have been mainly responsible for providing such data: systematics and community ecology. Simplistically summarized, samples collected by systematists better represent the species richness in an area but are intractable statistically, whereas the analysis of samples collected to answer ecological questions is usually straightforward but often poorly represents the total fauna. The two approaches are complementary, and we propose a set of methods that seeks the optimal compromise. We applied these methods to sample and estimate species richness of Araneae in Bolivia, but we present criteria so that the methods can be modified for other broadly similar groups.
We describe the methods used, present preliminary analyses of the effect of four variables (site, collecting method, collector, and time of day) on total number of adult individuals taken, and discuss analytical approaches that employ such data to estimate total species richness. We also present data from Peruvian canopy fogging samples to show that estimates of species richness of diverse tropical arthropod taxocenes are obtainable in principle.

INTRODUCTION

The alarmingly rapid extinction of species currently underway is mainly a tropical phenomenon. While many authors have decried this loss and suggested sweeping policy initiatives to reverse the trend (Wilson & Peter, 1988), rather less attention has been given to the development of pragmatic inventory methodologies that will aid in addressing the problem. Obviously the cause of accelerated rates of extinction is complex and its amelioration requires diverse approaches. However, we presume that fast, reliable, and comparable estimates of total species richness in areas of interest will be important data on which to base conservation decisions, allocation of resources, and land use planning. Relative richness of endemics might be an even more useful datum, but it is far more difficult to obtain. Scientists traditionally involved in the description and inventory of communities and biotas can make significant contributions by designing and testing such sampling protocols.

Drs. Coddington and Griswold are with the Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560, USA. Dr. Davila is with the Museo de Historia Natural de la Universidad de San Marcos, Av. Arenales 1256, Apto. 14 0434, Lima 14, Peru. Dr. Peñaranda is with the Instituto de Ecologia, Casilla 10077, La Paz, Bolivia. Please address correspondence to Dr. Coddington.
In the past, estimating and describing total species richness of an area has been the purview of two scientific disciplines: systematics and community ecology. Systematists, particularly those associated with museums, are often expert in efficient inventory techniques, where "efficient" basically means maximizing the number of species discovered in a given sampling area per unit effort. Usually museum personnel sample as many habitats as possible, switching emphasis as soon as a good series of each species is obtained. Some museum personnel use quantitative techniques in which quantity of habitat sampled and sampling effort are monitored, but many do not. The result of such latter efforts, hereafter termed "museum collecting," is often no more elaborate than a list of species encountered. Not only relative abundance but the proportion of the fauna as yet undiscovered remains unknown and unknowable through museum collecting as it is often implemented (at least within any reasonable time frame for any reasonably diverse arthropod fauna). Moreover, museum collecting also often fails to take account of two variables fundamental to any quantitative sampling program: sampling effort and the possibility of biased representation of the true relative abundances of species in the area or community sampled. Museum collections doubtless underestimate the true relative abundance of the most common species. For example, the seminal paper on estimating species richness was partly inspired by a museum collection of Malaysian butterflies (Corbet in Fisher et al., 1943). R. A. Fisher, who invented the log series in order to cope with these data, evaded the question of biased relative abundance by assuming the collector was unbiased in his sampling behavior, but had stopped after collecting 20 individuals of each species. However, he could only ignore the problem of differential sampling effort (e.g., by date, taxon, or microhabitat) because no data were available.
As such, the collection of butterflies had to be regarded as a single large sample; the potential analytical insight and power that would have resulted from having a series of replicate samples representing equivalent sampling effort was lost.

Accurate estimates of total species richness in complex habitats have not received the attention they deserve from community ecologists. It must be admitted that the total number of species present is an abstract and perhaps not very interesting quantity, ecologically speaking. That situation will probably change, given that species richness is fast disappearing, that the trend can be slowed but not stopped, and that in the complex nexus of "what to save, when, how, at what cost," an important datum, species richness or biodiversity, is typically lacking. Although the work of Preston (1948 and later papers) and, especially, C. B. Williams (1964) showed that total species richness could be estimated, most workers of the time were more interested in what the results implied about ecological process than in the estimates of that bald number. Interest in this approach waned after May (1975) suggested that the ubiquitously observed lognormal distribution of species abundances was due not to ecological process or law, but mostly to the "law of large numbers" or the central limit theorem. Be that as it may, and regardless of later arguments that the lognormal distribution is not entirely artifactual (Sugihara, 1980), the generally good fit of the lognormal to empirical data remains a striking observation (Taylor, 1978; Magurran, 1988), although largely untested in tropical ecosystems. If anything, the explanation of the fit as an effect of large numbers is a prescription, not a proscription, of its use for estimating total species richness in tropical ecosystems.

From the point of view of practical work on biodiversity, however, this body of ecological theory and method has several limitations.
First, one ought to be as interested in the confidence interval associated with any estimate of total species richness as with the estimate itself. However, an analytical derivation of the confidence interval associated with the lognormal estimate of species richness is still lacking (Pielou, 1975). Bootstrapping techniques may help, although we know of no implementations of the bootstrap easily applicable to such species abundance distributions. Other theoretical distributions, such as the negative binomial, provide such confidence intervals, but generally do not fit the empirical data as well. Completely different methods, such as the non-parametric jackknife (Heltsche & Forrester, 1983), are also available, provide confidence intervals, and show promise. The jackknife thus far has been applied only to quadrat-based sampling methods. Its extension to plotless methods remains untested. Finally, one may simply plot the accumulation of species as a function of sample number or individuals taken. While straightforward and revealing, this method is ad hoc and it is unclear how to estimate the asymptote and confidence interval. However, fitting such curves to models that have an asymptote (e.g., hyperbolic functions) may be justified. If the assumption of applicability is made, the confidence interval could be estimated by a least squares procedure.

Because the set of analytical approaches is still fairly small and relatively untried on tropical data, we advise the use of all techniques applicable to a given data set. After all, convergence of estimates based on completely different theoretical approaches might be evidence that each (or all) is measuring the same quantity. Tractability to different analytical approaches should be a chief design criterion for the sampling protocol.
By this we mean, for example, that merely because the lognormal can be fit to data viewed as a single large sample, large samples should still be composed of many smaller replicate samples. Applicability of the lognormal is not thereby sacrificed, and other methods that do assume replicate samples, such as the jackknife or species accumulation curves, become applicable.

Despite the feasibility of analysis, ecologically-oriented sampling has some drawbacks. The emphasis on statistical tractability predisposes workers to prefer methods of rigorously controlled and known bias, such as automated traps, in rigorously defined sampling units, such as plots or quadrats (Southwood, 1978). Automated methods may minimize variability but they can also sample taxa differentially or be unsuitable for some taxa. In addition, setting up, using, and maintaining plots in remote tropical field sites that may be visited only once (typical for systematists) is time-consuming, difficult, and expensive. Therefore the need exists to develop plotless sampling techniques that are better suited to the salvage or "hit-and-run" style of inventory work that the biodiversity crisis is likely to necessitate.

In sum, therefore, several considerations suggest a blend of these two traditional approaches to estimating biodiversity. We perceive the following as important criteria for sampling protocols:

1. The usual complement of collecting methods used by museum personnel involved in inventories (whatever that may be for particular groups) should be modified as little as possible in order to yield analytically tractable data. We start with museum techniques because museum personnel have the most expertise in actually accessing or sampling the total fauna in a region. In the decade to come most of the inventory work will probably be initiated by museums or analogous institutions.

2.
The number of collecting methods used should be as few as possible, and the classification of microhabitats should be as simple as possible, in order to minimize the complexity of the sampling protocol. Methods should be chosen based on their efficiency and independence from (= low overlap with) other methods. For the purpose of estimating total species richness, microhabitats that are especially difficult to sample yet productive of relatively few species can probably be ignored except in exhaustive inventory projects.

3. The sampling protocol should work well in both plot-based and plotless sampling situations. Time spent sampling may be a good choice as the measure of the sampling unit because of its simplicity and universality. While time is not the most exact measure for all collecting methods, it is appropriate for many favored by systematists and is easy to measure in the field. In those methods for which another unit is better suited, such as discrete number of events or area, the sample size of these can at least be adjusted so that total sampling effort (e.g., hours spent doing it) is in some sense comparable over all methods.

4. The sample unit should be large enough to yield adequate numbers (individuals and/or species taken) within samples, but small enough so that the total number of samples taken is large enough for between-factor statistical comparison.

5. Data should be collected so that variation can be estimated and analyzed. Contributing factors at a minimum include: site, individual collector, time of day (and season), and sampling method. Although inventory work is not undertaken to test hypotheses about the effects of these factors on variation in number of individuals or species taken, these exploratory analyses do reveal the quality and structure of data used to achieve estimates of species richness, especially if the analytical method assumes replicate samples.

6.
Data on numbers of individuals and species taken should be able to be combined to produce species abundance distributions. These can be used to estimate species richness in three broad ways. They can be fitted to appropriate parametric distributions (negative binomial, lognormal); they can be subjected to the non-parametric jackknife estimate; and they can be used to produce species accumulation, or collector's, curves.

7. Finally, if possible, the analytical approach should yield confidence intervals on the estimates. Although the confidence interval on any empirical estimate is an important scientific consideration in its own right, it is especially important in the present context, in which a rough but fast and reliable method to estimate a complex parameter is sought.

In this report we present results from a preliminary trial of the above principles to estimate species richness in a variety of tropical evergreen forests. We focused on arthropods, because they comprise nearly all the species richness of any tropical evergreen site, and specifically on spiders (Araneae). Besides the obvious fact that we are all araneologists, spiders are a good choice for several reasons. First, they are poorly known taxonomically, and thus typical of many extremely diverse groups. Second, they are among the six or seven most diverse ordinal taxa in terms of species described or estimated to exist (Coddington, 1990; Coddington et al., 1990). As a rough guess, a hectare of typical tropical forest probably supports 300-800 spider species. More than 1000 genera occur in the Neotropics, about a third of the known world genera. Third, spiders must generally be caught by hand, in contrast with other groups such as mites, Lepidoptera, or parasitic Hymenoptera, for which efficient mass sampling techniques have been devised. Fourth, spiders are generally a good mix of apparent and cryptic species: not so apparent as butterflies, nor so cryptic as centipedes.
Fifth, as obligate and abundant arthropod predators near the top of the invertebrate food chain, spiders are an important component of any ecosystem. Insofar as much of the work on tropical biodiversity has concerned herbivorous arthropods or groups of mixed feeding strategy (Farrell & Erwin, 1988; Erwin & Scott, 1980; Erwin, 1989; Broadhead, 1983; Mound & Waloff, 1978), data on obligate carnivores may help to round out the emerging picture on the distribution of species richness across body size and trophic level (May, 1988).

METHODS

Each Bolivian sample collected was classified according to four main factors:

Site

We collected in three sites: the Estacion Biologica Beni (Biosphere Reserve) (elevation 100 m: ca. 14°47'S: 66°15'W); a site near the intersection of the Rio Tigre and the La Paz-Coronavi road (500 m: ca. 15°23'S: 66°59'W); and a forested site on the slope of Cerro Uchumachi, near Coroico (1900 m: 16°15'S: 67°21'W). These are all evergreen tropical forest sites. A priori we expected maximal species diversity at the intermediate elevation.

Collector

Four of us are experienced field araneologists, although our particular interests range from fossorial mygalomorphs through vagabond hunting spiders to minute web spinners. The fifth, although an experienced field worker, knew much less about the particular techniques and relevant aspects of spider natural history. A priori we expected a significant effect on numbers of individuals taken due to collector.

Time of Day

Naturalist tradition states that many spider species are nocturnal. On the other hand, spiders resting by day are not uncommon. Consequently, we sampled during the day (roughly 0700-1200) and during the evening (1900-2400). A priori we expected more individuals and more species during the evening.

Method

Our initial classification of methods and microhabitats was elaborate (Fig. 1, above).
We designed it to access all relatively diverse components of the fauna in all relatively accessible microhabitats. Method and microhabitat were matched so that the sampling protocol would yield a relatively complete picture.

The basic sampling unit was one hour of constant collecting timed with a stopwatch. Collectors agreed to take all putatively adult spiders encountered, without exception, during that hour. "Adult" meant any animal that was or might possibly be adult; only if the collector was certain that the animal was juvenile was it skipped. Spider taxonomists widely agree that con- or heterospecificity can only be judged reliably from sexually mature animals. Skipping juveniles makes collecting more efficient; also, they are practically impossible to identify reliably. The use of the stopwatch meant that a given sample need not be strictly continuous; one could turn off the watch to attend to personal details, converse, eat, move from one area to another, etc. Therefore, each sample represented one full hour of attentive application of a particular method.

Using stopwatches provided another benefit. A collector with two stopwatches can use at least three methods simultaneously: two hour-based samples, and a technique based on counts or area. Collectors were free to switch between methods, as long as the tally for each was independent. Being able to choose is important because the collector can thereby maximize the number of species sampled; it helps to maintain the efficiency inherent in museum collecting. However, it may introduce a bias if collectors quit when the hunting is poor.
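The four classification factors and the one-hour sampling unit described above suggest a simple per-sample record. The following is a minimal sketch in Python, not the authors' own workflow; the `Sample` class, the `summarize` helper, and the example counts are all hypothetical and serve only to show how adult counts could be tallied by any one factor.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass(frozen=True)
class Sample:
    """One sampling unit (hypothetical record layout)."""
    site: str          # e.g. "Rio Tigre"
    collector: int     # collector code, 1-5
    method: str        # "looking up", "looking down", "beating", ...
    time_of_day: str   # "day" or "night"
    adults: int        # number of putatively adult spiders taken


def summarize(samples, factor):
    """Group samples by one factor; return {level: (n, mean, sd)} of adults."""
    groups = defaultdict(list)
    for s in samples:
        groups[getattr(s, factor)].append(s.adults)
    return {
        level: (len(v), mean(v), stdev(v) if len(v) > 1 else 0.0)
        for level, v in groups.items()
    }


# Invented example values, for illustration only.
demo = [
    Sample("Rio Tigre", 1, "looking up", "night", 20),
    Sample("Rio Tigre", 1, "looking down", "day", 10),
    Sample("Rio Tigre", 3, "looking up", "night", 12),
    Sample("Rio Tigre", 3, "beating", "day", 6),
]

for level, (n, avg, sd) in summarize(demo, "collector").items():
    print(f"collector {level}: n={n}, mean={avg:.1f}, sd={sd:.1f}")
```

Keeping one record per sample, rather than one pooled tally, is what allows the same data to be regrouped by site, collector, method, or time of day.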
After two to three days in the field, however, it became clear that the initial habitat classification was inefficient and overly complex. It was inefficient because one might be condemned to spend several hours searching in and for small holes, which might happen to contain nothing of interest. It was too complex to maintain a fully factorial design, as a collector had to employ too many different methods in different places at different times, and get replicates of each. Since collectors rarely spend more than, say, three weeks at a given site, the entire protocol should be simple enough to be carried out within that time.

Traditional methods:
  Hand-searching
  Beating trays
  Sweep nets
  Pitfall traps
  Litter sifting/extraction
  Bark/log fragmentation

Microhabitats:
  1. Herb layer
  2. Shrub layer
  3. Tree bark/surface and beneath
  4. Leaf litter
  5. Big holes (burrows, hollows, streambanks)
  6. Little holes (tubes in soil)
  7. In/under logs, rocks
  8. Forest canopy

Looking up. Hand searching accesses 1-3. Unit = 1 hour.
Looking down. Hand searching accesses 4-7. Unit = 1 hour.
Beating foliage. Beating tray accesses 1, 2. Unit = 25 beats.
Litter sifting. Funnel/sheet accesses 4. Unit = 2 sq. meters.

Figure 1. (Above) Classification of collecting methods against classification of microhabitats for sampling of Araneae. (Below) Methods that turned out to be practical in the field, and their relation to the initial design.

Thereafter we restricted our methods to four (Fig. 1, below):

Looking up. This calls for hand-picking while the collector moves slowly along on his or her feet, searching any vegetation or structures above knee height. It roughly translates to collecting while walking about.

Looking down.
This calls for hand-picking while the collector is on his or her knees or stomach, intensively searching the soil, leaf litter, forest floor debris, and shortest vegetation. It roughly translates to collecting while crawling about.

Beating. A beating "event" was defined as placing a standard sized tray (ca. 0.5 m²) beneath a suitable unit of vegetation (branch, bush, sapling, vine, etc.) and tapping it until no more spiders fell down. Twenty-five such events constituted one sample. A beating sample required about an hour to complete, and so required about the same amount of collector effort as methods 1 and 2.

Litter sifting. Although we utilized this method infrequently, it is doubtless an essential way to access the spider fauna. We gathered up 2 m² of litter, and sifted and sorted it on a white sheet to find the spiders. The area sampled required on average about an hour's effort for one person. Results from this method are not compared here.

Because none of these methods is plot-based, we cannot give accurate figures for the area or volume of habitat sampled per sample unit (except of course in the case of sifting). Sampling Araneae is slow work, and in the course of an hour with methods 1-3 no one covered much more or much less than 50 linear meters of ground in the areas we sampled. Assuming that a person samples one meter on either side of that line, each sample represents perhaps 100 m². These figures are offered as a rough comparison only, and the precision of the method could be improved if one took the time to get a rough measure of the area covered. Certainly we did not collect every individual in the area sampled, so our results are lower bounds on the abundance or diversity of the area sampled by each sampling unit. For each site the total area sampled was probably about a hectare.

The Peruvian canopy samples were collected according to methods described by Erwin and Scott (1980) from the Tambopata Reserved Zone and Manu Reserved Zone, Madre de Dios, Peru.
Two samples are analyzed: one from an isolated Manu canopy including two individual trees of different species and several liana species, and a second series of five Tambopata canopies including a variety of tree and liana species. In the former, all animals that fell from the canopy were collected, but in the latter a series of 45 funnels (1 m²) was arranged to subsample the falling arthropods. Although these five canopies were from distinctly different forest types, in order to investigate analytical techniques we are treating them as a series of 225 replicate samples.

Analysis

For the Bolivian samples, two variables, numbers of adult individuals per sample and numbers of species per sample, can be analyzed in much the same way. Here we present preliminary analyses of the first variable only. We used SYSTAT version 4.2 (Wilkinson, 1988) for statistical analysis and graphs. We emphasize that we are not trying to test hypotheses about the effects of various factors on numbers of individuals per sample, but rather simply to explore the structure of the data that will be used to estimate richness. In particular, one can anticipate substantial variability between samples, given our imprecise definition of sampling methods and units.

The Peruvian data were fitted to a lognormal distribution according to the procedures outlined by Ludwig and Reynolds (1988). Jackknife estimates of species richness were computed using equations presented in Heltsche & Forrester (1983).

RESULTS

In the 35-day Bolivian expedition, we spent roughly ten full days collecting (Table 1). The remaining time was taken up with obtaining permits and visas, obtaining field equipment and transportation, getting to or moving between sites, maintaining vehicles and equipment, and dealing with various emergencies. We used eight days at the end of the trip to label and organize samples adequately, and to sort the samples to family.
Although one could have spent those days in further sampling, we felt it was more efficient to get the bulk of the sorting and labelling done, rather than leaving it until some indefinite future date. Experience bears this out: while we knew in eight days how many adults of each family had been collected and had collected the abundance data for the factorial design, seven months later we still have not completed allocating the specimens to species. There is a considerable delay in sorting once collections join a museum "backlog." Sorting at the site or before the end of the trip may be worth the extra cost.

The Manu canopy sample, which comprised roughly 95,000 arthropods (T. L. Erwin, pers. comm.), required roughly 23 person-days to sort but included only about 2% spiders (calculated from Erwin, 1989). The five Tambopata samples came into our hands already separated from the rest of the canopy fauna, so we have no data on time required to sort it to that level. However, for the five canopy collections, sorting the samples to morphospecies required about 14 working days, collating and relating those morphospecies (= developing final morphospecies concepts) required another 23 days, and data entry and proofing required an additional seven days, or about 44 days total on top of field time and the initial sort to order.

Numbers of samples and adults taken at each Bolivian site are summarized in Table 1. We averaged about three samples per person per day; the average number of adults per sample was 16.4 ± 10, including all sites visited. Some of the samples were taken at sites or by methods not discussed here, and so subsequent analyses treat at most 126 of the total 184 samples.

Figure 2 presents box and whisker plots of the number of adults taken by site, collector, method, and time of day (n = 126). Significant effects on abundance due to collector, method, and time of day are apparent.

Table 1. Summary of numbers of samples and number of individuals taken by site.
Site              Days   Samples   Samp./Pers./Day   Ind./Sample     Total Ind.
El Trapiche          4        67               3.4   17.42 ± 10.4          1167
Rio Tigre            4        72               3.8   15.47 ± 9.7           1114
Cerro Uchumachi      2        45               4.5   16.38 ± 10.1           737
Summary             13       191               3.9   16.4 ± 10.0           3018

[Box plots not reproduced. Four panels, each N = 126, abundance axis from 2 (minimum) to 49 (maximum): grouped by site (El Trapiche, Rio Tigre, Cerro Uchumachi), by collector (1-5), by method (look up, look down, beating), and by time of day (day, night).]

Figure 2. Box and whisker plots of number of adults collected, grouped by site, collector, method, and time of day. Vertical lines marked by "+" are medians, left and right sides of boxes are first and third quartile boundaries, "whiskers" on each side encompass the extreme quartile plus 1.5 times the middle interquartile spread. Parentheses define the 95% confidence interval about the median. Outlier points lying beyond the whiskers are indicated by a "*" and far outliers by "0."

Figure 3 presents box and whisker plots of the number of adults taken by each collector, grouped by site. Collectors 1, 2, and 4 did not appear to differ significantly. Collector 3 (the less-experienced person) appeared significantly different at sites 1 and 2 but not at site 3. Collector 5 appeared significantly different at site 3, but not at sites 1 or 2.

Figure 4 presents box and whisker plots of the number of adults taken by each method, grouped by site. Looking up was most productive at all sites.

Figure 5 presents box and whisker plots of the number of adults taken by time of day, grouped by site.
Collecting at night appeared more productive at all sites, although when grouped in this way the difference is not significant at any site.

[Box plots not reproduced. Panels: El Trapiche (N = 50), Rio Tigre (N = 41), Cerro Uchumachi (N = 35), each grouped by collector (1-5); abundance axis from 2 to 49.]

Figure 3. Box and whisker plots of number of adults collected by each collector, grouped by site. See Fig. 2 legend for explanation of plot.

[Box plots not reproduced. Panels: El Trapiche (N = 50), Rio Tigre (N = 41), Cerro Uchumachi (N = 35), each grouped by method (look up, look down, beating); abundance axis from 2 to 49.]

Figure 4. Box and whisker plots of number of adults collected by each method, grouped by site. See Fig. 2 legend for explanation of plot.

[Box plots not reproduced. Panels: El Trapiche (N = 50), Rio Tigre (N = 41), Cerro Uchumachi (N = 35), each grouped by time of day (day, night); abundance axis from 2 to 49.]

Figure 5. Box and whisker plots of number of adults collected by day and by night, grouped by site. See Fig. 2 legend for explanation of plot.

Inspection of the data presented in Figures 3-5 suggested that some of the data were heteroscedastic, especially those grouped by collector (Fig. 3).
Bartlett's test for homogeneity of variances on the data grouped by each main factor confirmed significant heterogeneity in data grouped by collector (p < 0.001) and time of day (p < 0.000). In addition, most plots of grouped data disclosed some outlier points (more than 1.5 times the interquartile distance from the median value). Seven samples comprising abundances of 2, 3, 40, 42, 43, and 49 adults per sample were responsible for these outliers. A histogram of the abundance data confirmed a longish left tail and a group of unusually productive samples at the right-hand tail. The highest abundance sample was taken by collector 4 from the lower surface of a wind-thrown tree trunk during a bout of "looking up" at night. This habitat differs sufficiently from that normally encountered during "looking up" collecting to be regarded as unusual or different. Perhaps the other unusual samples were also the result of collecting in "aberrant" microhabitats. Deletion of these data points eliminated most of the outliers, and the heteroscedasticity in the data grouped by collector (p < 0.156), but not that due to time of day (p < 0.048).

Subsequent box and whisker plots of this reduced data set (n = 119) did not change most of the results. Significant differences still appeared at El Trapiche and Rio Tigre between collectors 1 and 3, between looking up and looking down at El Trapiche, and between looking up and beating at Rio Tigre. Tukey HSD tests on the abundances grouped by collector, method, and time of day (as in Fig. 2) confirmed that the only significant difference among collectors was between collectors 1 and 3 (p < 0.013), that looking up was significantly more productive than either of the other two methods tested (p < 0.000 from looking down and p < 0.011 from beating), and that, overall, collecting at night was more productive than during the day (p < 0.017).

Figure 6 presents the observed species abundance distributions for the two Peruvian canopy samples.
Raw abundance distributions were transformed logarithmically (log base 2; Preston, 1948; Ludwig & Reynolds, 1988), and plotted as bar charts. Overlaid on each analysis is a curve connecting the expected values for each octave of abundance under the lognormal model.

The Manu single canopy sample (Fig. 6, above) included roughly 95,000 arthropods (Erwin, pers. comm.). From it 613 spiders (222 adults, 391 juveniles) were classified among 187 species. Because the sample arrived as a single collection rather than a series of equivalent samples, the jackknife could not be applied. The range of abundances spanned seven octaves, with an observed mode at 63.5 species. Expected numbers of species per octave were obtained in two ways: a Chi-square goodness of fit and a least-squares fit to the lognormal equation, S(R) = S_0 exp(-a^2 R^2). Because the Chi-square tests the fit to the lognormal model, we used a non-linear estimation program to minimize the Chi-square, (O - E)^2 / E summed over octaves, as a criterion for the best-fit estimates of the lognormal parameters. This yielded an expected mode at 54.9 species and a variance of 0.414 (χ² = 8.19, df = 5, p < 0.1). On this basis, we accept the null hypothesis that the observed distribution is lognormal. The area under the curve, interpreted as the total species richness (S*), including those species too rare to be included in the sample, is 235 species (calculated as S* = 1.77(S_0/a); Ludwig & Reynolds, 1988). As pointed out above, no rigorous way exists to compute the confidence interval on this estimate. The number of species observed is obviously a lower "bound." The upper "bound" is unknown, but probably 373 = 1.77(S_0max/a_min), where S_0max is the upper confidence limit on S_0 and a_min is the lower confidence limit on a, is an overestimate (Fig. 6, legend).

The second set of five fogging samples (Fig. 6, below), each from a different group of canopy tree species, is amenable to both the lognormal and the jackknife.
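The single-canopy arithmetic reported above can be checked directly from the fitted parameters. The sketch below applies S* = 1.77(S_0/a) as given in the text (1.77 approximates the square root of pi) and combines the extreme limits of S_0 and a to get rough bounds. The function names are hypothetical, and pairing the lower limit of S_0 with the upper limit of a for a lower bound is an assumption on our part: it happens to reproduce the bounds printed in the Fig. 6 legend, but the text itself describes only the upper bound.

```python
# Constant used in the text; approximately sqrt(pi) = 1.7725.
AREA_CONSTANT = 1.77


def lognormal_richness(s0, a):
    """Total richness S* = 1.77 * S0 / a for a fitted lognormal."""
    if a <= 0:
        raise ValueError("a must be positive")
    return AREA_CONSTANT * s0 / a


def rough_bounds(s0, s0_err, a, a_err):
    """Combine extreme parameter limits into rough (lower, upper) bounds.

    Upper bound: S0 at its upper limit, a at its lower limit (as in the text).
    Lower bound: the mirror-image pairing (an assumption, see lead-in).
    """
    lower = lognormal_richness(s0 - s0_err, a + a_err)
    upper = lognormal_richness(s0 + s0_err, a - a_err)
    return lower, upper


# Chi-square estimates for the single Manu canopy (Fig. 6 legend).
s_star = lognormal_richness(54.9, 0.414)
low, high = rough_bounds(54.9, 15.1, 0.414, 0.082)
print(round(s_star), round(low), round(high))  # 235 142 373
```

These bounds are not confidence intervals; they only show how strongly S* responds to uncertainty in the two fitted parameters.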
In this case, 1834 individuals (354 adults; 1480 juveniles) were classified among 426 species, as above. Nine octaves of abundance were observed. Judged by Chi-square goodness of fit, the fit to a lognormal distribution is poor ($\chi^2 = 22.36$, df = 7, p < 0.001). Nevertheless, using these estimated parameters, S* is 560 species. As above, combining the extreme confidence limits to get a rough idea of the upper bound on the total species richness estimate based on these data, one obtains 426–780 species.

The jackknife estimate, $\hat{S} = S_0 + k(n-1)/n$, works by augmenting the number of species actually observed ($S_0$) by the number of species ($k$) unique to any sample, weighted by the number of samples ($n$). Its variance, $\mathrm{var}(\hat{S}) = \frac{n-1}{n}\left[\sum_j j^2 f(j) - k^2/n\right]$, is a function of the number of samples $f(j)$ having $j$ unique species. The jackknife estimate for these data therefore yields 647 ± 438 species, which broadly overlaps with the lognormal estimate. It is evident that the estimate of S* from these data is not as precise as that for the single canopy data, which is to be expected given the poor Chi-square fit. If the data are treated instead as five samples (one from each forest type) rather than as 225 samples, the jackknife estimate is 400 ± 1035 species.

Figure 6. (Above) Estimation of total species richness for fauna of a single canopy. Chi-square estimate of the mean, $S_0$, is 54.9 ± 15.1, and of the variance, $a$, 0.414 ± 0.082 ($\chi^2$ = 8.19, p > 0.1). S* is 235 species (bounds ≈ 142–373 species). Least-squares estimates of the same parameters are $S_0$ = 57.2 ± 16.8; $a$ = 0.473 ± 0.234 ($r^2$ = 0.91); S* = 214 (bounds ≈ 101–548 species, see text for explanation). (Below) Estimation of total species richness for fauna of five canopies. Chi-square estimate of $S_0$ = 115.1 ± 25.5; $a$ = 0.366 ± 0.047 ($\chi^2$ = 22.36, p < 0.001); S* = 560 species (bounds ≈ 384–780). Least-squares estimates of the same parameters are $S_0$ = 128.3 ± 21.3, $a$ = 0.469 ± 0.127 ($r^2$ = 0.96); S* = 484 (bounds = 318–774 species).
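The arithmetic behind the two estimators can be illustrated with a short Python sketch. It is not the authors' procedure (the parameters were fitted with a nonlinear estimation program); it only reproduces the formula S* = 1.77(S0/a), the rough bounds obtained by combining the extreme confidence limits reported in the Figure 6 legend, and the first-order jackknife with its variance as written in the text. The function names and the toy species sets are hypothetical, introduced purely for illustration. Note that S0 means the modal height in the lognormal formula but the observed species count in the jackknife.

```python
from collections import Counter

AREA_CONSTANT = 1.77  # roughly sqrt(pi); the value quoted in the text


def lognormal_richness(s0, a):
    """Area under the lognormal curve: S* = 1.77 * (S0 / a)."""
    return AREA_CONSTANT * s0 / a


def lognormal_bounds(s0, s0_err, a, a_err):
    """Rough (lower, upper) bounds from the extreme limits on S0 and a."""
    lower = lognormal_richness(s0 - s0_err, a + a_err)
    upper = lognormal_richness(s0 + s0_err, a - a_err)
    return lower, upper


def jackknife_richness(samples):
    """First-order jackknife estimate and its variance.

    samples: list of sets of species labels, one set per replicate sample.
    Returns (estimate, variance).
    """
    n = len(samples)
    # In how many samples does each species occur?
    occurrences = Counter(sp for sample in samples for sp in set(sample))
    s_obs = len(occurrences)
    # Species unique to one sample, counted per sample.
    uniques = [
        sum(1 for sp in set(sample) if occurrences[sp] == 1)
        for sample in samples
    ]
    k = sum(uniques)
    estimate = s_obs + k * (n - 1) / n
    # sum_j j^2 f(j) is the same as summing squared unique counts over samples.
    sum_j2_fj = sum(u * u for u in uniques)
    variance = (n - 1) / n * (sum_j2_fj - k * k / n)
    return estimate, variance


if __name__ == "__main__":
    # Single canopy, Chi-square fit (values from the Figure 6 legend).
    print(lognormal_richness(54.9, 0.414))
    print(lognormal_bounds(54.9, 15.1, 0.414, 0.082))
    # Hypothetical toy data: three samples, four species.
    toy = [{"a", "b", "c"}, {"a", "d"}, {"a", "b"}]
    print(jackknife_richness(toy))
```

With the Chi-square parameters for the single canopy, the first function returns about 235 species and the bounds come out near 142 and 373, matching the values given in the Figure 6 legend. The jackknife function needs the species list of every replicate sample, which is why it cannot be applied to a single pooled collection.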
DISCUSSION

Despite long hours in the field, we averaged rather few samples per person per day. We attribute these surprisingly low numbers to our unit of measurement: the time literally spent searching for animals. Our results may prove typical for similar inventory projects; a surprising amount of time in the field is not actually spent searching for animals.

Fewer significant effects due to collector appeared than we expected. Those effects that proved durable apparently were due to an initial disparity in experience between one collector and the rest. As expected, collectors vary widely in numbers of adults per sample (Fig. 3), and some data points are outliers. These outliers present some analytical problems but, at least in this case, interpretation was not too difficult. Thus, despite our wide variety of taxonomic interests, the experienced collectors performed about equally if judged by total number of individuals taken. Collector 3, the less-experienced individual, got significantly better over the course of the trip (Fig. 3), as one would expect. If typical, this is good news for inventory projects. It suggests that inexperienced personnel in the company of experienced collectors can learn collecting techniques (i.e., become statistically indistinguishable) in a very short time. It also suggests that fairly diverse collectors (at least from the fine-grained point of view of taxonomic speciality), if given explicit instructions, perform about the same. As a cautionary note, the low numbers of individuals taken by Collector 5 at site 3 (Fig. 3) remain unexplained.

Method was significant at all sites and overall (Fig. 4), but not as we had expected. Simple searching in the herb/shrub/sapling zone seems the most efficient technique. Ground-searching (looking down) is less effective, but that habitat is less complex, less accessible to humans, and possibly less rich.
Beating the former zone was better than searching the ground, but not as productive as searching that zone. A priori, we expected that beating tray samples would be most productive, perhaps because each beating event typically produces multiple individuals. However, judged per hour of collector time, it is not the most productive. The most productive method seems to be looking up.

Our classification of methods obviously was coarse. We were satisfied with beating, looking up, and sifting as a "natural" classification of collecting techniques. They seem to represent consistent collecting behaviors that are repeatable across sites and other variables, given a certain amount of common sense on the part of the collector. "Looking down," on the other hand, is heterogeneous. Activities such as tearing apart rotted logs take a lot of time but are not as productive of individuals or species as is hand-searching. However, they yield portions of the total species richness not accessible in other ways. In the future, looking down should be restricted to activities exactly comparable to those in looking up, and some other category elaborated for different kinds of activities.

On the whole, spiders are more abundant at night (Fig. 2), although the difference is not always significant in reduced data sets (Fig. 5). This increased abundance is probably due both to the nocturnality of the fauna and to the simpler and less confusing illumination of a headlamp as compared to dappled sunlight and color during the day.

Surprisingly, numbers of individuals taken per sample at each site did not differ significantly (Fig. 2). Within evergreen tropical forest in Bolivia, spider abundance apparently does not vary much even at sites that vary widely in elevation and vegetation structure. The sampled sites covered an altitudinal range (100 m-2700 m) within which other workers have found significant differences in arthropod abundances.
Whether this result will be corroborated by similar studies remains to be seen.

With regard to the expense and efficiency of estimating species richness, our data suggest several interesting points. If one makes the reasonable assumption that the average tropical "site" (however one wants to define site) has about 300-800 species of spiders at any one time, it is possible to calculate best/worst scenarios for how long it would take to get a sample of such a fauna adequate to estimate the total species richness at the site. For the lognormal parameter a > 0.3 in communities of the above size, roughly 10 times as many individuals as species will suffice to estimate species richness reliably, whether by parametric distributions, non-parametric methods, or graphically. According to our results, the best (worst) scenario is that 119 (468) to 303 (1250) samples would have to be completed in order to obtain 3000–8000 identifiable individuals (calculated from Table 1). If the sustainable range of samples per worker per day is taken to be 3-5, one person would need 23 (156) field days in a low diversity site and 60 (416) days in a high diversity site. Five people might require only a fifth of those ranges. If the sorting time is triple the field time, the total study time for one person would be 92 (624) days for a low diversity site or 240 (1664) days for a high diversity site, not counting travel, city time, or other such factors.

These estimates ignore the obvious advisability of repeating the study in different seasons, although the generality of the importance of seasonality is still unknown. However, Erwin and Scott (1980) found that species turnover between wet and dry seasons was 97%. Given that the field time estimated above for a single person might span significant seasonal change, multi-person teams per taxon are advisable. Nevertheless, in our experience duplication of taxon expertise beyond a pair of people on inventory teams is rare.
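The effort figures quoted above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, under two assumptions inferred from the reported numbers rather than stated in the text: the best case pairs with 5 samples per worker per day and the worst case with 3, and field days are rounded down to whole days. The function name `study_days` and the scenario labels are hypothetical.

```python
def study_days(samples_needed, samples_per_day, sorting_multiplier=3):
    """Return (field_days, total_days) for one person.

    field_days is rounded down to whole days (assumption inferred from
    the reported figures); sorting takes sorting_multiplier times the
    field time, so the total is field time plus sorting time.
    """
    field_days = samples_needed // samples_per_day
    total_days = field_days * (1 + sorting_multiplier)
    return field_days, total_days


# (samples needed, samples per worker per day); sample counts from the text.
SCENARIOS = {
    "low diversity, best case": (119, 5),
    "low diversity, worst case": (468, 3),
    "high diversity, best case": (303, 5),
    "high diversity, worst case": (1250, 3),
}

if __name__ == "__main__":
    for label, (samples, rate) in SCENARIOS.items():
        field, total = study_days(samples, rate)
        print(f"{label}: {field} field days, {total} total days")
```

Dividing the field days by the number of collectors gives the team estimate; the sorting time is unlikely to shrink in the same way unless the extra people also sort.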
Insofar as the adequacy of a single sampling effort is a goal either met or not, and if not met, cannot be fixed by a repeat trip without adding yet another factor (season and/or year) to an already complex analysis, it seems well worth the money to ensure that the initial effort is sufficient unto itself.

Similar analyses should be performed on the distribution of numbers of species per sample, categorized by the same main factors as in the above analysis. Because the time-consuming job of sorting animals to species within site is not complete, we cannot yet perform that analysis. Because numbers of species per sample will be much lower than numbers of individuals, differences among factors and treatments are less likely to be significant. The incomplete sorting also means that we cannot yet construct species abundance distributions from these data, and therefore cannot yet estimate total species richness for any of the Bolivian sites, or assess the available analytical methods on these data.

The Peruvian canopy samples, kindly provided by Dr. T. L. Erwin, suggest that both the lognormal and the jackknife can be used on tropical faunas. These disparate methods roughly agree in the one case in which we have been able to compare them thus far. However, for both collections we included juveniles in our counts. Without juveniles, the single canopy sample would have provided only 222 classifiable individuals and the five canopy sample 354 individuals, both inadequate samples on which to estimate total species richness. Using juveniles increases sample size but the identifications are ambiguous, and studying them tremendously increases sorting time.

For the single canopy sample, the S* estimate was 235 species. This is not at all out of line with our hunch that tropical spider faunas from a single site range between 300 and 800 species, given that this sample was from a single canopy.
Estimates such as this (187 species observed, ranging to 273 species maximum as a likely upper "bound"), rough as they are, could be useful for conservation planning.

The set of five canopy samples is more problematic because in fact each was from a different forest type. Certainly they represent an increase in area sampled relative to the above, and therefore this is not more intense sampling of the same community, but rather sampling over a re-defined and larger area. Pooling the different forest types may have been responsible for the poor fit to the lognormal distribution.

We followed Ludwig and Reynolds' (1988) recommendations for fitting a lognormal because it meshes well with statistical packages. However, both of our examples may point out a weakness in their approach. No matter how large the number of singletons (species represented by one specimen), if there are doubletons and the octave boundaries are integers, the observed data will always have an apparent mode. Octave 0-1 will always contain fewer species than octave 1-2. Pielou (1975) and Magurran (1988) use logarithmic groupings that avoid this effect on the first two octaves. Hughes (1986) also commented on apparent modes in logged data as effects of the way data are grouped. Insofar as the lognormal requires estimation of the mode, it appears that tropical faunas exhibiting species abundance relations in which singletons are most numerous will always appear lognormal if grouped by Ludwig and Reynolds' method, and this may be a problem with our data. The asymmetry between the first and third octaves in Figure 6 may indicate that singletons are "piling up" in the first and second octaves. Observing two octaves to the left of the mode would give greater confidence that the mode had been obtained.

At any rate, a benefit of the lognormal is that it does not require replicate samples. Data collection is therefore easier.
On the other hand, the lack of a confidence interval on the estimate is a serious drawback. In the legend to Figure 6 we also report estimates of the lognormal parameters using a least-squares procedure. Although it tells us that a significant amount of the variation is explained, that is hardly surprising. The use of a Chi-square expression as the "loss function" minimized by the nonlinear estimation procedure seems preferable because it better tests the null hypothesis of a lognormal distribution.

The jackknife is not without problems either. Robert Edwards (pers. comm.) points out that the jackknife estimate has an upper limit of about twice the number of species observed, which would be obtained if all observed species were uniques. If fewer than half the total species in the community are represented in the sample, the jackknife will underestimate the real species richness. Treating the five canopy sample either as 225 replicate samples or as 5 replicate samples shows that the accuracy and precision of the jackknife depend greatly on the number of (replicate) samples available.

Based on evidence from other groups studied at Tambopata, it is one of the richest sites on earth (Erwin, 1985). The total number of spider species known to date from non-canopy habitats at Tambopata Forest Reserve is about 500-600 (Silva, in prep.). Without knowing to what extent the faunas of the canopy and non-canopy overlap, we cannot easily compare these numbers. However, based on the possible extremes of no or total overlap, the number of spider species at Tambopata ranges from the known fauna of slightly more than 500 to a high of about 1200. As no other site known to us is comparably diverse in spider species, Tambopata's "megadiversity" reputation is sustained by these data.

Finally, we would like to make three points about future work in this area. All methods that estimate the total species richness of a community make assumptions about the data used in the estimates.
We have shown that care in organization of sampling procedures can assess these assumptions at relatively little extra cost. We can say something about the magnitude of the effect of collector experience, collecting method, site, and time of day on relative abundance of individuals in samples. Number of species per sample can be analyzed in the same way as number of adult individuals. Both analyses seem worthwhile because they reveal the structure and quality of the data that underlie estimates of species richness. For example, egregious heterogeneity in the data may indicate that different communities have been inadvertently combined or considered as one.

In any study such as ours the number of factors involved will generally be at least four, each factor having several levels. Strictly speaking, this implies a four-way analysis of variance with 90 cells in this case. Adequate replication in each cell implies 270 samples, a number we did not come close to attaining at each site in this preliminary study. Other taxocene-oriented efforts will face the same serious problem. Addition of other factors (e.g., by microhabitat, trophic level, or season) will complicate the analysis. However, if each sample yields about 17 individuals, 270 samples implies some 4600 specimens. This may be about the right sampling intensity for a fauna of 500 species, coincidentally the median value in our guess as to the range of typical species richness of spiders in evergreen tropical sites. In other words, if one is going to need 270 samples anyway, one may as well collect them in a structured [...] test hypotheses about collector or method effects, there is no [...] requirements that a strict experimental design imposes [...]

[...] of the species abundance distribution at each site would have been an additional 1000 specimens (Table 1), most of which would have been known already from previous samples. Yet another octave implies an additional [...] species richness [...] depends on [...] (data points) observed. Benefits rise linearly but costs rise exponentially. An obvious solution is to truncate intentionally the collection of abundant species and therefore to accept veil lines at both tails rather than [...] automated traps, field collectors could collect in this [...] species can be recognized accurately. If conservation research is to have minimal impact on the fauna, this is a strong argument for the kind of sampling protocol done here as opposed to more automated techniques. On the other hand, the amount of work to [...] nine octaves is terrific in the tropics, and the rightmost ones are the easiest [...]

[...] in this work is the separation of morphospecies. While recognizing the number of species within any sample is easy, collating species across hundreds of samples is difficult. Collating them reliably across different sites becomes increasingly unrealistic without preparing detailed notes, drawings, and, especially, diagnoses of hundreds of morphospecies. There comes a time when the "synoptic collection" approach [...] conservation biology and ecology tend to see this as a problem in "data management" [...] and one which the response to the biodiversity crisis evidently hopes to sidestep by ignoring the issue of Latin names. However, our work shows that the need for alpha taxonomy does not grow out of concern with names, but rather out of [...] data management. We predict that simply to do a competent job of making a comparative inventory of tropical sites, researchers will have to generate, somewhere in their notes, computers, or minds, a functional alpha taxonomy of the group [...]

ACKNOWLEDGMENTS

[...] Sandoval, [...] Dr. Mario Baudoin and his staff of the Instituto de Ecologia in La Paz [...] Erwin very kindly [...] the manuscript considerably. This paper is contribution no. 10 from the Biological Diversity in Latin America (BIOLAT) Project.

LITERATURE CITED

Broadhead, E. 1983. The assessment of faunal diversity and guild size in tropical forests with particular reference to the Psocoptera. Pp. 25-41. In: S. L. Sutton, T. C. Whitmore & A. C. Chadwick (eds.), Tropical Rain Forest Ecology and Management. Blackwell Scientific Publishing, Oxford.
Coddington, J. A., S. F. Larcher & J. C. Cokendolpher. 1990. The systematic status of Arachnida, exclusive of Acarina, in North America north of Mexico (Arachnida: Amblypygi, Araneae, Opiliones, Palpigradi, Pseudoscorpiones, Ricinulei, Schizomida, Scorpiones, Solifugae, Uropygi). Pp. 5-20. In: M. Kosztarab & C. W. Schaefer (eds.), Systematics of the North American Insects and Arachnids: Status and Needs. Virginia Agricultural Experiment Station Information Series 90-1, Virginia Polytechnic Institute and State University: Blacksburg.
Coddington, J. A. 1990. Review of Advances in Spider Taxonomy 1981-1987: a supplement to Brignoli's A Catalog of the Araneae described between 1940 and 1981, by Norman I. Platnick, 1989. (Edited by P. Merrett). Manchester University Press. J.
Arachnology 18(2): 000-000.
Erwin, T. L. 1983. Beetles and other insects of tropical forest canopies at Manaus, Brazil, sampled by insecticidal fogging. Pp. 59-75. In: S. L. Sutton, T. C. Whitmore & A. C. Chadwick (eds.), Tropical Rain Forest Ecology and Management. Blackwell Scientific Publishing, Oxford.
Erwin, T. L. 1985. Tambopata Reserved Zone, Madre de Dios, Peru. History and description of the Reserve. Rev. Per. Ent. 27:1-8.
Erwin, T. L. 1989. Sorting tropical forest canopy samples (an experimental project for networking information). P. 8. In: R. J. McGinley (ed.), Insect Collection News, vol. 2(1). Department of Entomology, National Museum of Natural History, Smithsonian Institution: Washington, D.C.
Erwin, T. L. & J. C. Scott. 1980. Seasonal and size patterns, trophic structure, and richness of Coleoptera in the tropical arboreal ecosystem: the fauna of the tree Luehea seemannii Triana and Planch in the Canal Zone of Panama. The Coleopterists Bull. 34(3):305-322.
Farrell, B. D. & T. L. Erwin. 1988. Leaf-beetle community structure in an Amazonian rainforest community. Pp. 73-90. In: P. Jolivet, E. Petitpierre & T. H. Hsiao (eds.), Biology of Chrysomelidae. Kluwer Academic Publishers.
Fisher, R. A., A. S. Corbet & C. B. Williams. 1943. The relation between the number of species and the number of individuals in a random sample of an animal population. J. Anim. Ecol. 12(1):42-58.
Heltshe, J. F. & N. E. Forrester. 1983. Estimating species richness using the jackknife procedure. Biometrics 39:1-11.
Hubbell, S. P. & R. B. Foster. 1983. Diversity of canopy trees in a neotropical forest and implications for conservation. Pp. 5-41. In: S. L. Sutton, T. C. Whitmore & A. C. Chadwick (eds.), Tropical Rain Forest Ecology and Management. Blackwell Scientific Publishing, Oxford.
Hughes, R. G. 1986. Theories and models of species abundance. Amer. Nat. 128:879-899.
Ludwig, J. A. & J. F. Reynolds. 1988. Statistical Ecology: A Primer on Methods and Computing. Wiley Interscience, New York.
Magurran, A. E. 1988. Ecological Diversity and its Measurement. Princeton University Press: Princeton, N.J. 179 pp.
May, R. M. 1975. Patterns of species abundance and diversity. Pp. 81-120. In: M. L. Cody & J. M. Diamond (eds.), Ecology and Evolution of Communities. Belknap Press, Harvard: Cambridge, MA.
May, R. M. 1988. How many species are there on Earth? Science 241:1441-1443.
Mound, L. A. & N. Waloff (eds.). 1978. The Diversity of Insect Faunas (Symposium of the Royal Entomological Society No. 9). Blackwell Scientific Publishing: Oxford.
Pielou, E. C. 1975. Ecological Diversity. Wiley Interscience: New York. 165 pp.
Pielou, E. C. 1977. Mathematical Ecology. Wiley Interscience: New York. 385 pp.
Preston, F. W. 1948. The commonness and rarity of species. Ecology 29:254-283.
Southwood, T. R. E. 1978. Ecological Methods: With Particular Reference to the Study of Insect Populations, 2nd ed. Chapman and Hall: New York.
Sugihara, G. 1980. Minimal community structure: an explanation of species abundance patterns. Amer. Nat. 116:770-787.
Taylor, L. R. 1978. Bates, Williams, Hutchinson – a variety of diversities. Pp. 1-18. In: L. A. Mound & N. Waloff (eds.), The Diversity of Insect Faunas (Symposium of the Royal Entomological Society No. 9). Blackwell Scientific Publishing: Oxford.
Wilkinson, L. 1988. SYSTAT: the system for statistics. SYSTAT Inc.: Evanston, IL. 822 pp.
Williams, C. B. 1964. Patterns in the Balance of Nature. Academic Press: London.
Wilson, E. O. & F. M. Peter. 1988. Biodiversity. National Academy Press: Washington, D.C.