SPSS Guide: Descriptive Statistics for Biology Labs, Exams of Biostatistics

A detailed guide on using SPSS for descriptive statistics in a biology lab setting. It covers how to compute summary statistics, create histograms, and interpret data distributions. The lab focuses on analyzing height and arm span measurements of students, offering step-by-step instructions and examples for SPSS version 29/30. It emphasizes the importance of proper data formatting and graph editing for assignments, including removing color and adjusting line thicknesses. The document also includes interpretations of histograms for height distributions, both overall and separated by gender. It is a practical resource for students learning statistical analysis in biology, offering hands-on experience with SPSS and data interpretation for both lab work and assignment preparation.

Typology: Exams

2024/2025

Available from 05/29/2025

elam-dennis 🇨🇦

BIOL 361 LAB 1 – Intro to SPSS & Descriptive Statistics – Answer File
TA                Lab Section & Time
Amy White         LAB 104   M 12:30-2:20 pm in PHY 145
Arisha Imran      LAB 102   M 2:30-4:20 pm in PHY 145
Xuewen Geng       LAB 105   W 12:30-2:20 pm in PHY 145
Amy Lacey         LAB 106   Th 2:30-4:20 pm in STC 0050
Luke Schofield    LAB 103   F 12:30-2:20 pm in PHY 145
Ashley Ferns      LAB 101   F 2:30-4:20 pm in PHY 145
MAKE SURE TO WORK THROUGH LECTURE 4 BEFORE PARTICIPATING IN LAB 1.
Before attending the lab, you must secure access to IBM-SPSS version 29/30 software using one of the two
options detailed in the Course Outline. Make sure you have SPSS open and running when the lab session
begins. Note: for access to SPSS using Option 2, the university loaded version 30 of SPSS after Prof. Hall
wrote the Course Outline. Versions 29 and 30 should be virtually identical.
Objectives: In this lab, you’ll learn basic features of IBM-SPSS that will be useful for this lab, future labs and
assignments. You’ll learn to use SPSS to perform descriptive statistics by applying it to two data sets collected
in 2019.
Part A: Getting started in SPSS – no Answers beyond the information provided in the Instruction file. By the
end of Part A, students will have the data used in Part B loaded into an SPSS data file and saved as
‘Height_Armspan_2019.sav’. If you followed the instructions, the .sav file will have the correct ‘Type’, number
of decimal places, and an informative ‘Label’ including the unit of measurement identified for each variable.
Part B: Perform descriptive statistics on the data recorded on the gender, height and armspan of students in
BIOL 361, using the SPSS file you generated in Part A.
The file contains the following 4 variables: [Go over the structure of the data: the data were not organized
with male and female values in separate columns! Instead, all heights are in 1 column and the code for gender
is in another column.]
Year = Year of study (2019)
Gender = Gender of student (m = male, f = female) [enter right-hand text into ‘Variable View’ tab]
Height = Height (cm) [enter right-hand text into ‘Variable View’ tab]
Armspan = Armspan length (cm) [enter right-hand text into ‘Variable View’ tab]
1) Compute the summary statistics of central tendency and spread (i.e., mean, median, minimum, maximum,
variance, standard deviation, interquartile range, skewness, kurtosis) for each of the quantitative variables
in the dataset [hint: there are two quantitative variables → Height & Armspan].
Analyze → Descriptive statistics → Explore → Display = ‘Statistics’ → move variable(s) to dependent list
Or [better], to complete parts 1) and 2) in one step:
Analyze → Descriptive statistics → Explore → Display = ‘Both’ → move variable(s) to dependent list AND click
‘Plots’ button and choose ‘histogram’
Note that when you perform this step, a second window (or file) opens up called Output 1 (by default).
All computations you perform in SPSS get placed into this type of output file (*.spv). You can Save this
file as Height_Armspan_2019.spv
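
For students who want to cross-check the SPSS output by hand, the statistics listed in step 1 can be sketched in Python. This is only an illustration under stated assumptions: the heights below are hypothetical values (not the BIOL 361 data), and the quartile convention chosen here (`method="inclusive"`) may not match the one SPSS uses, so the interquartile range could differ slightly.

```python
import statistics

# Hypothetical heights in cm (illustrative values only, NOT the class data).
heights = [152.0, 160.5, 163.0, 165.0, 166.5, 169.0, 172.0, 175.5, 181.0]


def summarize(values):
    """Return basic measures of central tendency and spread."""
    # Quartile convention is an assumption; other software may differ slightly.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": statistics.stdev(values),      # sample standard deviation
        "iqr": q3 - q1,
    }


summary = summarize(heights)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Skewness and kurtosis are left out of this sketch because they are not in the Python standard library.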


Table 1. Tabular Output obtained (the numerical summary statistics):

[Note: the tables produced by SPSS are not in adequate format for your Assignment Reports. See the Course Handout and Lecture 4 file for examples of how to format a Table and to craft and place a Table Caption. The SPSS tables do not show the correct number of decimal places according to the Rounding-off rule – you have to do this! Measurements were made to the nearest 0.5 cm, so minimum and maximum should be presented to 1 decimal place, and calculated summary statistics (mean, standard deviation, variance, etc.) should be presented to 2 decimal places (1 more decimal place than the ‘raw’ data). Since there are 157 subjects (an odd number), the median is a value from a single subject, and should be presented to 1 decimal place. Note that if the sample size had been an even number (e.g., 156 or 158), the median value would be an average of the two middle values and should be presented to 2 decimal places.]

Information that can be gleaned from the Table of summary statistics: Table 1 presents the key summary statistics for central tendency (mean, median), spread (standard deviation, variance, interquartile range, range, minimum, maximum) and shape (skewness, kurtosis). For both variables, the mean is slightly larger than the median and the skewness is slightly greater than zero, suggesting a distribution that is close to symmetrical or slightly skewed to the right. The means for height and armspan are quite similar (<1.5 cm difference), considering that the values span ranges of 44.0 and 51.5 cm for height and armspan, respectively.
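
The Rounding-off rule in the note above can be sketched as a small Python helper. The function name `format_summary` and the dictionary layout are hypothetical, chosen only for this illustration; the height values are the ones quoted in this answer file (minimum 152.0, maximum 196.0, median 169.0, mean 170.23, n = 157).

```python
def format_summary(stats, n, raw_decimals=1):
    """Format summary statistics using the rounding-off rule.

    stats: dict with keys 'minimum', 'maximum', 'median', 'mean'
           (a hypothetical layout used only for this sketch).
    n: sample size; decides how many decimals the median gets.
    raw_decimals: decimal places of the raw measurements (1 for 0.5 cm steps).
    """
    calc_decimals = raw_decimals + 1  # calculated statistics get one more place
    # Odd n: the median is a single observed value; even n: an average of two.
    median_decimals = raw_decimals if n % 2 == 1 else calc_decimals
    decimals = {
        "minimum": raw_decimals,
        "maximum": raw_decimals,
        "median": median_decimals,
        "mean": calc_decimals,
    }
    return {name: f"{stats[name]:.{places}f}" for name, places in decimals.items()}


# Height values as reported in the text of this answer file.
height = {"minimum": 152, "maximum": 196, "median": 169, "mean": 170.23}
formatted = format_summary(height, n=157)
print(formatted)
```

With an even sample size, the same helper would print the median to 2 decimal places instead of 1.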

Figure 1. Histogram showing the distribution of height of students in the course BIOL 361 in winter term of 2019 (n = 157).

[Note: I edited this graph from the way SPSS provides it. I removed all colour & shading (from the histogram bars and graph background), removed extraneous information, removed grid lines, made the lines thicker (‘weight’ of histogram bars = 1.5; graph axes and tick marks = 1.5), and enlarged the text size (axis values = font size 11; axis labels = font size 12). Make sure to remove all colour from graphs in your assignment and to thicken lines and increase font sizes, so the graph looks good when printed in your final version. I set major ticks to every 5 cm and added 4 minor ticks per major tick on both axes (which creates a tick mark for each cm and student). I added lines around the top and right of the graph area and made the lines 1.5 thickness. To copy the graph into Word, I used Copy As [right-click on the graph to get this] → Image; then in Word, Paste Special → Picture (PNG).]

Interpretation of the histogram in Figure 1: Height of students in the course BIOL 361 in winter term of 2019 appears to have an approximately unimodal distribution, with the mode centered near 166.5 cm (Figure 1). The distribution appears to be slightly skewed to the right, which is consistent with the slightly larger mean (170.23 cm) than median (169.0 cm) and a skewness of 0.58 (Figure 1, Table 1). A small positive value for kurtosis (0.28) suggests the distribution is slightly leptokurtic. There is an apparent gap at the low end of the distribution and one possible outlier. The values range from 152.0 cm to 196.0 cm (Table 1).

Students might be tempted to describe this histogram as showing a bimodal distribution, which may not be wrong. It is possible that male and female students follow different unimodal distributions for height, which overlap to form two modes in this data set. I just don’t see enough separation between two possible modes to warrant a conclusion of a bimodal distribution. Students are welcome to gain practice on their own in interpreting a histogram showing the distribution of armspan length in the sample of data.
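
To make the skewness and kurtosis values in this interpretation more concrete, here is a minimal Python sketch of the bias-adjusted sample formulas that many statistical packages report. That SPSS uses exactly these formulas is an assumption here, and the data are small made-up lists, not the class measurements.

```python
import math


def _z_scores(values):
    """Standardize values using the sample standard deviation (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return [(x - mean) / s for x in values]


def sample_skewness(values):
    """Bias-adjusted skewness; > 0 suggests a right skew. Needs n >= 3."""
    n = len(values)
    z = _z_scores(values)
    return n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)


def sample_excess_kurtosis(values):
    """Bias-adjusted excess kurtosis; > 0 suggests leptokurtic. Needs n >= 4."""
    n = len(values)
    z = _z_scores(values)
    first = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
    return first - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))


symmetric = [1, 2, 3, 4, 5]       # made-up, perfectly symmetrical
right_skewed = [1, 2, 3, 4, 10]   # made-up, long tail to the right

print(sample_skewness(symmetric))
print(sample_skewness(right_skewed))
print(sample_excess_kurtosis(symmetric))
```

A symmetrical sample gives a skewness of zero, while a sample with a long right tail gives a positive value, which is the pattern described for height above.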

3) Create and interpret histograms to assess the distributions of the variable ‘height’ separately in males and females. Describe fully the distributions of this variable. What can you conclude? Can you identify any gaps or outliers? If so, what should you do with the outliers? [You are welcome to try this on your own for the other quantitative variable]. Here, Height is the dependent variable; what is the independent variable? What is the level of measurement of the dependent and independent variables?

Dependent variable = Height; Level of measurement = Ratio
Independent variable = Gender; Level of measurement = Nominal

Analyze → Descriptive statistics → Explore → in ‘Display’ box select ‘Plots’ and in the ‘Plots’ button select histogram (turn off stem-and-leaf). Move the variable (height) to the ‘Dependent List’ box AND move the variable ‘gender’ to the ‘Factor List’ box.
Or, Graph → Histogram → select the variable (height) AND move the variable ‘gender’ to the ‘Rows Panel’.

Figure 2. Histogram showing the distribution of height of female students in the course BIOL 361 in winter term of 2019 (n = 96).

Interpretation of the histogram in Figure 2: A histogram shows that height of female students in the course BIOL 361 in winter term of 2019 has a unimodal distribution that is approximately symmetrical to weakly right-skewed, with no obvious outliers (Figure 2). The mode lies near 165 cm, which is smaller than the mode for male students (see Figure 3).
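
Because the data file keeps all heights in one column and the gender code in another, any by-group analysis first has to split the values by the factor. The Python sketch below illustrates that idea, which is roughly what placing ‘gender’ in the Factor List does for you. The records are hypothetical and the helper name `split_by_factor` is made up for this example.

```python
from collections import defaultdict
import statistics

# Hypothetical (gender code, height in cm) records in "long" format,
# i.e. one row per student, as in the lab's data file layout.
records = [
    ("f", 160.0), ("m", 178.5), ("f", 165.5), ("m", 172.0),
    ("f", 158.0), ("m", 181.0), ("f", 167.0),
]


def split_by_factor(rows):
    """Group the measured values by their factor code."""
    groups = defaultdict(list)
    for code, value in rows:
        groups[code].append(value)
    return dict(groups)


groups = split_by_factor(records)
for code, values in sorted(groups.items()):
    print(code, "n =", len(values), "median =", statistics.median(values))
```

Each group can then be summarized or plotted on its own, which is what Figures 2 and 3 do for height.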

Figure 4. Boxplots comparing the distribution of height in female (f, n = 96) and male (m, n = 61) students in the course BIOL 361 in winter term of 2019.

Show students how to edit the Boxplots → Double-click 2X to open the Chart Editor and Properties box. Show how to make whisker lines thicker (to 1.5). Show how to add minor ticks onto the vertical axis. I modified the symbols for the outliers to size = 6 and line thickness = 1.5. Show that you can change the sequence of boxes on the horizontal axis, which is helpful for Assignment 1 (Select the x-axis (m and f are highlighted) → In the Properties box, select the ‘Categories’ tab → click on f or m in the ‘Order’ box and move up or down).

Interpretation of the boxplots in Figure 4: Boxplots identify three female students as mild outliers (Figure 4). The interquartile ranges do not overlap between male and female students, suggesting height differs markedly between the genders. Males are taller than females, with median height differing by 10.2 cm. Interestingly, the range of values is comparable for females and males (31.5 and 32.0 cm, respectively), but the interquartile range is discernibly wider for the males (Table 1). This suggests height is slightly more variable in males than in females, as confirmed by the slightly higher standard deviation for the males (7.28 vs. 5.88 cm). This occurs despite the larger number of females measured than males. [Note: the values in this paragraph were gleaned from the ‘Descriptives’ Table in the SPSS output]
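
The idea of a ‘mild’ outlier on a boxplot can be sketched in Python using the usual convention: a value more than 1.5 box lengths (IQR) beyond the edge of the box is a mild outlier, and a value more than 3 box lengths beyond it is an extreme one. The data below are hypothetical, and the quartile method is an assumption that may give slightly different box edges than SPSS.

```python
import statistics


def classify_outliers(values):
    """Return (mild, extreme) outliers using the 1.5 x IQR and 3 x IQR rules."""
    # Quartile convention is an assumption; software packages differ slightly.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    mild, extreme = [], []
    for x in sorted(values):
        # Distance outside the box (negative or zero when x is inside it).
        distance = max(q1 - x, x - q3)
        if distance > 3 * iqr:
            extreme.append(x)
        elif distance > 1.5 * iqr:
            mild.append(x)
    return mild, extreme


# Hypothetical heights in cm (illustrative only).
heights = [150.0, 160.0, 162.0, 164.0, 165.0, 166.0, 168.0, 170.0, 200.0]
mild, extreme = classify_outliers(heights)
print("mild:", mild)
print("extreme:", extreme)
```

Flagging a value this way only marks it for a closer look; it does not by itself justify removing the observation.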

5) Create a scatterplot of armspan length (vertical axis) versus height (horizontal axis) to explore the relationship between these two variables in all students in BIOL 361 in Winter 2019.

Graph → Scatter/Dot → Simple Scatter → Define → Move the variable ‘Height’ to the X-axis box and move the variable ‘Armspan length’ to the Y-axis box.

Figure 5. Scatterplot showing the relationship between height and armspan of students in the course BIOL 361 in winter term of 2019 (n = 157).

Interpretation: There is a strong positive linear relationship between height and armspan. Note, we can get SPSS to code the data points according to gender as a way to explore this relationship separately in males and females (when generating a new scatterplot, move the variable ‘Gender’ into the ‘Set Markers by’ box in the Scatterplot dialog), as shown in Figure 6.

Figure 6. Scatterplot showing the relationship between height and armspan of male (n = 61; solid circles) and female (n = 96; open circles) students in the course BIOL 361 in winter term of 2019.
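
One common way to put a number on a ‘strong positive linear relationship’ is Pearson’s correlation coefficient r. The lab itself only asks for the scatterplot, so treat the Python sketch below as optional background; the paired values are hypothetical, not the class data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements in cm (illustrative only).
heights = [155.0, 160.0, 165.0, 170.0, 175.0, 180.0]
armspans = [153.5, 161.0, 164.0, 171.5, 174.0, 182.5]

r = pearson_r(heights, armspans)
print(f"r = {r:.2f}")
```

Values of r close to +1 go with points that fall near a rising straight line, which is the pattern described for Figure 5.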

Answers: Numerical summary/Summary statistics for the variable Difference:

Description: The mean difference in reaction time of the writing and non-writing hands is very small (-0.0007 seconds), suggesting the writing and non-writing hands have similar reaction times (the small negative value means the writing hand is, if anything, slightly slower than the non-writing hand, but this may not be a statistically significant difference – we’ll assess this in a later lab). The mean and median are very similar and skewness is small (g1 = -0.0350), suggesting the distribution is symmetrical (Table 1).

Figure 7. Histogram showing the distribution of the difference in reaction time of the non-writing hand vs writing hand of students (in seconds) in the course BIOL 361 in winter term of 2019 (n = 165).

Interpretation: The distribution of the difference in reaction time is unimodal and approximately symmetrical, with the mode centered near zero. There are three possible outliers at the high end.
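
The variable Difference is a per-student (paired) quantity. The Python sketch below assumes it is computed as non-writing hand minus writing hand, which fits the wording of the Figure 7 caption and the reading of the negative mean above, but that sign convention is an assumption here. The reaction times are hypothetical values in seconds.

```python
import statistics

# Hypothetical reaction times in seconds, one pair per student (illustrative only).
writing = [0.21, 0.19, 0.25, 0.22]
non_writing = [0.20, 0.20, 0.24, 0.22]

# ASSUMPTION: Difference = non-writing hand minus writing hand.
differences = [nw - w for nw, w in zip(non_writing, writing)]

mean_difference = statistics.mean(differences)
print(f"mean difference = {mean_difference:.4f} s")

# Under this convention a negative mean indicates the writing hand took longer
# on average, i.e. it was slightly slower.
```

Whether such a small mean difference is statistically meaningful is a separate question that descriptive statistics alone cannot answer.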

Figure 8. Boxplot showing the distribution of the difference in reaction time of the non-writing hand vs writing hand of students in the course BIOL 361 in winter term of 2019 (n = 165).

Interpretation: The interquartile range (IQR) (i.e., the box) is in the middle of the range and the median is near the middle of the IQR, indicating a symmetrical shape. There are three mild outliers at the high end (the two highest values are overlapping) and four mild outliers at the low end (the two lowest values are overlapping).

Figure 9. Boxplots comparing the distribution of the difference in reaction time of the writing versus non-writing hands for female (f, n = 98) and male (m, n = 67) students in the course BIOL 361 in winter term of

Interpretation : Median difference in reaction time is near zero for both males and females, and there is considerable overlap of the range and interquartile ranges of values for both genders. This suggests little to no difference between genders in the relative reaction times of the two hands. This is not surprising, as there is no basis to expect that males and females differ in reaction time of their hands. For the males, there are two