A BRIEF GUIDE TO QUANTITATIVE PARASITOLOGY 3.0

I. THEORY

1. No single measure can by itself characterise the parasitic infection of a sample of hosts. Thus, no single statistical test can compare the levels of infection between (among) two (or more) samples.
2. Rather, the parasitic infection of a sample of hosts can be characterised by several different measures. Some of these have markedly different biological interpretations; others have more or less overlapping interpretations, or no interpretation at all. We propose to prefer those measures that have a clear biological interpretation and that do not overlap with each other and thus do not predict each other.
3. Therefore, parasitic infections can be compared between (among) different samples only by means of several different statistical tests.

II. HOW TO DESCRIBE THE PARASITIC INFECTION OF A SAMPLE OF HOSTS

1. Always give the host sample size. In most cases, this is expressed as the number of host individuals examined. (Exceptionally, other units may be used in special cases.)
2. Describe prevalence. This is the proportion of infected hosts among all the hosts examined. Give the confidence interval of prevalence (you will find two alternatives for this) to indicate the accuracy of the estimate (the 95% confidence interval is advisable).
3. Describe mean intensity. This is the mean number of parasites found in the infected hosts (the zeros of uninfected hosts are excluded). Since sample size and prevalence are known, mean intensity defines the quantity of parasites found in the sample of hosts. Given the typically aggregated distribution of parasites, its actual value depends heavily on a few extremely infected hosts. Also give the confidence interval to indicate the accuracy of the estimate.
4. Describe median intensity. This is the median number of parasites found in infected hosts (the zeros of uninfected hosts are excluded).
Median intensity shows a typical level of infection among the infected hosts. Give the confidence interval to indicate the accuracy of the estimate.
5. In certain cases one may prefer to use mean abundance instead of mean intensity. This is the mean number of parasites found in all hosts (including the zero values of uninfected hosts). Should you prefer to use it, also provide its confidence interval. Keep in mind that this measure combines two of the former ones: prevalence and mean intensity. Do not use it unless you have a clearly specified reason to prefer it.
6. Describing mean crowding and its confidence interval is essential only for those who study density-dependent characters of parasites.
7. Finally, quantify the aggregation of the parasites' distribution among hosts. Three indices are widely used for this purpose, but their interpretations are quite similar. All three are available in QP3.0. They predict each other rather well, so it is not necessary to use all three.

III. HOW TO COMPARE THE PARASITE BURDENS ACROSS TWO OR MORE SAMPLES

1. Compare prevalences by Fisher's exact test. This will show whether the proportion of infected individuals differs significantly between the two samples.
2. If several samples are involved, the running time of the former test may increase dramatically. In this case, use the chi-square test for the same purpose.
3. Compare mean intensities by a bootstrap test. This will show whether parasite quantities differ significantly between the infected portions of the two samples.
4. Compare median intensities by Mood's median test. This will show whether the typical level of infection differs significantly between the infected portions of the two samples.
5. One can also compare the frequency distributions of intensities by a stochastic equality test.
It compares several random pairs of individual values taken from the two samples to test whether there is a significant tendency to get higher values from one sample than from the other.
6. In certain cases, one may also decide to compare mean abundances by a bootstrap test. This will show whether parasite quantities differ significantly between two samples. This comparison combines two of the former ones: the comparison of prevalences and the comparison of mean intensities. Do not use it unless you have a clearly specified reason to prefer this comparison.
7. Finally, you will find a tool to compare mean crowding values. This is a rather primitive (but the only existing) method: provided that the two 97.5% confidence intervals do not overlap, we conclude that the two values differ at the 95% level of significance.

IV. TECHNICAL NOTES

1. Increasing the number of samples involved may increase the running time of Fisher's exact test dramatically. Therefore, QP3.0 stops if the estimated running time exceeds a particular limit. The default limit is 1 minute, but you can use WordPad to modify it in "FISHERX3.PAR".
2. Increasing the number of replications will improve the accuracy of bootstrap confidence interval estimates, but the running time will increase too. The default value is 2000, but you can use WordPad to modify it in "BCACONF.PAR".
3. Increasing the number of replications will improve the accuracy of bootstrap t-tests, but the running time will increase too. The default value is 2000, but you can use WordPad to modify it in "2SAMBOOT.PAR".
4. The "print" command will refer to your Windows default printer.
5. Some extreme sample values may cause certain modules to fail. As far as we know, our chi-square test fails if you compare the prevalences of samples that all have 100% prevalence. Of course, such "tests" do not make sense, but they can cause a failure if you run them accidentally.
If the program does not respond, press "CTRL+ALT+DEL" and then close QP3.0 as usual under Windows. Some tests may also fail when the maximum value of intensity is very high (several thousand).

V. DO NOT MAKE THE FOLLOWING MISTAKES

6. Avoid the use of geometric means (not offered in QP3.0). This measure does not have a clear biological interpretation.
7. Avoid the format "mean ± SD" (not offered in QP3.0). This is meaningful for normal distributions, but not for the aggregated distributions characteristic of parasites. Rather, use confidence intervals to characterise the accuracy of estimates.
8. Do not make overstatements when interpreting the results of statistical comparisons. If Sample 1 had a significantly higher prevalence than Sample 2, you should not conclude that Sample 1 was "more infected" than Sample 2 (see I. 1-3).

VI. AUTHORSHIP AND COPYRIGHT NOTES

1. QP 3.0 is free for distribution and use in education and science.
2. Before using it for pharmaceutical, industrial, agricultural or any other economic purposes, please contact the authors.
3. When using it for academic purposes, please cite: Reiczigel, J. & Rózsa, L. 2001. Quantitative Parasitology 3.0. Budapest.
4. Alternatively, as a theoretical background you can cite: Rózsa, L., Reiczigel, J. & Majoros, G. 2000. Quantifying parasites in samples of hosts. Journal of Parasitology, 86, 228-232. However, some of the more recent methods are not described in the above paper.
5. Thus, when testing crowding values, please consult and cite: Reiczigel, J., Lang, Z., Rózsa, L. & Tóthmérész, B. 2005. Properties of crowding indices and statistical tools to analyze crowding data. Journal of Parasitology, 91, 245-252.
6. Similarly, when testing stochastic equality of intensity distributions, please consult and cite: Reiczigel, J., Zakariás, I. & Rózsa, L. 2005. A Bootstrap Test of Stochastic Equality of Two Populations. The American Statistician, 59(2), 156-161.
7. Finally, when using an unorthodox approach (Sterne's exact method or Wald's method) to obtain the confidence limits for prevalence, you can support this decision by citing: Reiczigel, J. 2003. Confidence intervals for the binomial parameter: some new considerations. Statistics in Medicine, 22, 611-621.
8. All statistical modules (except "Aggregation indices") were designed by J. Reiczigel.

Budapest, Hungary, 25th of July, 2005.
Jenő Reiczigel & Lajos Rózsa
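
As an illustration of the measures listed in Section II, the following Python sketch computes sample size, prevalence, mean intensity, median intensity and mean abundance for one host sample, and a bootstrap confidence interval for mean intensity. It is not QP3.0's code. The function names `describe_infection` and `percentile_bootstrap_ci` and the example counts are hypothetical, and the simple percentile bootstrap used here is an assumption that stands in for whatever interval method QP3.0 actually applies. Only the 2000 replications echo the default mentioned in Section IV.

```python
import random
import statistics


def describe_infection(counts):
    """Basic descriptive measures for one host sample.

    `counts` holds one parasite count per examined host (0 = uninfected).
    """
    n_hosts = len(counts)
    intensities = [c for c in counts if c > 0]  # infected hosts only
    has_infected = bool(intensities)
    return {
        "n_hosts": n_hosts,
        "prevalence": len(intensities) / n_hosts,
        # Intensity measures are undefined when no host is infected.
        "mean_intensity": statistics.mean(intensities) if has_infected else None,
        "median_intensity": statistics.median(intensities) if has_infected else None,
        # Abundance includes the zeros of uninfected hosts.
        "mean_abundance": statistics.mean(counts),
    }


def percentile_bootstrap_ci(values, stat=statistics.mean, reps=2000,
                            level=0.95, seed=0):
    """Percentile bootstrap interval (an assumed, simplified method)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(stat(rng.choices(values, k=n)) for _ in range(reps))
    tail = int((1 - level) / 2 * reps)  # replicates cut off in each tail
    return estimates[tail], estimates[reps - 1 - tail]


# Hypothetical sample: 10 hosts, 6 of them infected, one heavily.
counts = [0, 0, 0, 0, 1, 2, 2, 3, 5, 40]
summary = describe_infection(counts)
ci_low, ci_high = percentile_bootstrap_ci([c for c in counts if c > 0])
print(summary)
print("95% CI of mean intensity:", ci_low, ci_high)
```

In this example mean abundance (5.3) equals prevalence (0.6) times mean intensity (53/6), which is the sense in which mean abundance combines the two other measures. The single host with 40 parasites pulls mean intensity far above median intensity (2.5), as expected for an aggregated distribution.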
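
The first and third comparisons of Section III can be sketched in the same way. The code below is a hedged illustration, not the QP3.0 implementation. `fisher_exact_2x2` assumes the common two-sided convention of summing the probabilities of all tables that are no more likely than the observed one. `bootstrap_mean_test` assumes one textbook form of the two-sample bootstrap t-test, in which both samples are shifted to a common mean before resampling and a Welch-type t statistic is used. All function names and data are hypothetical.

```python
import math
import random
import statistics


def fisher_exact_2x2(infected_a, n_a, infected_b, n_b):
    """Two-sided Fisher's exact p-value for two prevalences."""
    total_infected = infected_a + infected_b
    denom = math.comb(n_a + n_b, total_infected)

    def prob(k):
        # Hypergeometric probability of k infected hosts in sample A.
        return math.comb(n_a, k) * math.comb(n_b, total_infected - k) / denom

    k_min = max(0, total_infected - n_b)
    k_max = min(n_a, total_infected)
    p_obs = prob(infected_a)
    # Sum every table at most as probable as the observed one.
    p = sum(prob(k) for k in range(k_min, k_max + 1)
            if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def welch_t(x, y):
    """Welch-type t statistic; each sample needs at least two values."""
    diff = statistics.mean(x) - statistics.mean(y)
    se = math.sqrt(statistics.variance(x) / len(x)
                   + statistics.variance(y) / len(y))
    if se == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def bootstrap_mean_test(x, y, reps=2000, seed=0):
    """Two-sided bootstrap p-value for a difference in mean intensity."""
    rng = random.Random(seed)
    t_obs = abs(welch_t(x, y))
    pooled = statistics.mean(list(x) + list(y))
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    # Shift both samples to the pooled mean so the null hypothesis holds.
    x0 = [v - mean_x + pooled for v in x]
    y0 = [v - mean_y + pooled for v in y]
    hits = 0
    for _ in range(reps):
        xb = rng.choices(x0, k=len(x0))
        yb = rng.choices(y0, k=len(y0))
        if abs(welch_t(xb, yb)) >= t_obs:
            hits += 1
    return hits / reps


# Hypothetical data: 3 of 4 hosts infected in sample A, 1 of 4 in sample B.
p_prevalence = fisher_exact_2x2(3, 4, 1, 4)  # 34/70, about 0.486
# Hypothetical intensities (infected hosts only) of two samples.
p_means = bootstrap_mean_test([1, 2, 2, 3, 5, 40], [1, 1, 2, 4, 6, 9, 12])
print(p_prevalence, p_means)
```

The two p-values answer different questions (whether prevalences differ, and whether mean intensities differ among infected hosts), which is why Section I recommends reporting several tests rather than one.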