Introduction to Biological Data Analysis and Statistics

Steps in the process of understanding data:

1. Collecting the data
2. Summarizing the data
3. Analyzing the data
4. Interpreting the results and reporting them

Note that before carrying out any of the above, you have presumably formulated some underlying question or hypothesis that you wish to use the data to address. There are a few key types of approaches by which we can address scientific questions: observation (natural history: see what occurs where and interpret the results based upon differences in the locations or history), experiment (vary aspects of the environment in order to tease apart how the biological components respond), and theory (make assumptions about the natural world and analyze the implications of those assumptions using verbal, graphical, and mathematical arguments). Each of these approaches involves quantitative methods, and an objective of this course is to provide you with an understanding of some of the methods needed.

Step 1 above involves the area of "design of experiments", in which the process by which the data are to be collected is determined based upon the objectives of the study and the limitations imposed (e.g. cost, time, available personnel, accessibility of the study area, etc.). Design implies that the scientist considers alternative methods to collect the data (e.g. spatial sampling for field observations, different experimental setups for laboratory experiments), as well as the manner in which the factors deemed to affect the data collection are manipulated (e.g. if you wish to determine the response of an organism's behavior, such as its respiration rate, to temperature: how many different temperature treatments are applied, in what order, and for how long).

Step 2 in the process is typically called "descriptive statistics", in which the objective is to abstract out certain properties of the data in order to better interpret them.
The assumption here is that the data are too complex to understand well by simply looking at them as lists or tables. The simplest example of this is the computation of an "average" value of the data. Many of us obtain a better grasp of a data set by having some summary of the data available, particularly in graphical form, rather than simply a tabular listing of the data. Note that whatever methods are used here, there is a loss of information associated with the description provided: the description (e.g. the average value of the data) does not include the full amount of information in the complete data set. An objective in descriptive statistics is to choose the appropriate level of description, between complete enumeration of the data and a coarse, simple summary (such as a mean), so as to be able to address the questions you posed in the first place.

Step 3 in the process typically involves the area of inferential statistics: parameter estimation and hypothesis testing. Parameter estimation refers to using the observations to estimate values of particular interest (respiration rate, photosynthetic rate, hemoglobin level, etc.). One might then use the data to evaluate hypotheses (respiration rate increases with temperature, the hemoglobin content of two species differs), in which one compares a "null hypothesis" (respiration rate is independent of temperature) to an "alternative hypothesis" (respiration rate increases with temperature).

Step 4 uses the results of the inferential statistics to evaluate the observations and provide an interpretation of them. For example: there is a significant effect of temperature on respiration, and this implies that the species has a limited latitudinal range due to the effects of temperature; or two species differ significantly in their photosynthetic rate, and you expect one species to outcompete the other under certain environmental conditions.
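As a rough illustration of the null versus alternative hypothesis comparison described in Step 3, the sketch below runs a simple permutation test on whether the mean hemoglobin content of two species differs. The permutation test is one possible method chosen here for illustration, not a method prescribed by this course, and the measurements, the function name permutation_test, and its parameters are all hypothetical.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Null hypothesis: the group labels are exchangeable (no difference).
    Returns the observed difference in means and an approximate p-value.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Reassign the group labels at random by shuffling the pooled data.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical hemoglobin measurements (g/dL) for two species; not real data.
species_a = [13.1, 13.8, 14.2, 13.5, 14.0, 13.7]
species_b = [15.0, 15.4, 14.9, 15.8, 15.2, 15.5]

observed_diff, p = permutation_test(species_a, species_b)
print(f"difference in means = {observed_diff:.2f}, p = {p:.4f}")
```

A small p-value means that a difference at least as large as the observed one rarely arises when the species labels are shuffled, which is evidence against the null hypothesis. Whether that difference matters biologically is the interpretation question of Step 4.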
This course will focus on the descriptive statistics aspects of the above process. All life scientists, however, are well served by a formal statistics course that includes aspects of experimental design and hypothesis testing, so you are encouraged to enhance your training in this area beyond the limited coverage included here. The emphasis on descriptive statistics arises from regular comments by life science faculty that an extremely important aspect of quantitative training for their students is the ability to interpret graphs and to use diverse graphical approaches to explain and interpret experiments.
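The point made earlier about the loss of information in a summary can be illustrated with a minimal sketch. It uses Python's standard statistics module on two made-up data sets that share the same mean and median but differ greatly in spread. The values and the helper name describe are assumptions for illustration only.

```python
from statistics import mean, median, stdev

# Hypothetical respiration rates (arbitrary units) for two groups; not real data.
group_1 = [9.8, 10.1, 10.0, 9.9, 10.2]
group_2 = [4.0, 16.0, 7.5, 12.5, 10.0]


def describe(values):
    """Return a few common descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


for name, data in [("group 1", group_1), ("group 2", group_2)]:
    summary = describe(data)
    print(name, {key: round(value, 2) for key, value in summary.items()})
```

Reporting only the mean would make the two groups look alike. Adding a measure of spread, or plotting the individual values, recovers part of the information that the single number discards.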