STAT1000 Assignment 2, Assignments of Statistics

This is solved working for assignment 2 in stats 1000

Typology: Assignments

2021/2022

Uploaded on 02/01/2023

AnamRizvi 🇨🇦

STAT 1000 - Assignment 2
Anam Fatima (007909975)
2022-10-21
Instructions
To complete this assignment, add code as needed into the R code chunks given below, and, where applicable,
replace the “Delete me” and add in your own text response. Be sure when adding in text responses to never
copy-paste symbols from outside of the document. Only use the symbols on your keyboard. Do not delete
the question text, or modify any other part of the code except for the “author” in Line 3.
All calculation output must be visible in the final document, and all text responses should be in complete
English sentences. You may speak to your classmates about ideas and what functions/optional
arguments you may need to use but you may not directly show your code/output to your
classmates. To be clear, you may talk as much as you like but you cannot look directly at someone else’s
completed work, and you cannot share your code with any of your classmates. If you have an issue that you
can’t resolve without someone looking at your work (e.g. you get an error when knitting your document),
please see the Help Centre in 311 Machray Hall. To compile this document as a PDF to submit or to view
your intermediate work/output (I suggest you compile it after every question), click on the knit button above
this window.
To properly see the questions, complete the setup and then knit this .Rmd file to .PDF and view the output.
You will have a link in your email that takes you to the Crowdmark submission page. Once you have
completed the worksheet, knit it to .PDF and upload your output to Crowdmark. Also, upload your .Rmd
file to Crowdmark where prompted. To see where your .Rmd file is saved, click File > Save As in the top-left
of your screen. Make sure you set your Name and Student Number in the Author section of this document
(Line 3). Do not alter the title or the date. Please note that if you do not submit a knit .PDF file, you will
be given a grade of zero.
After you knit your assignment to PDF, check your code chunks. If your code at any point runs off the page,
find the nearest comma, click to the right of it, and press Enter (or Return if you are on a Mac). This will
force a break in the code so that it goes onto the next line. All of your code must be readable in the final
submission.
Your full submission is due by 11:59 p.m. on October 21st. Crowdmark may allow you to submit late, but
you will be given an automatic grade of zero. Be sure to change the author of this file to your own name and
student ID number. All numerical and graphical answers must be done using R, unless stated otherwise.
Setup [1 mark]
0. Import the KungSan dataset, available on the UMLearn page. Make sure you have “Heading” set to
“Yes” when you import the data, and make sure you name the object KungSan. [1 mark]
KungSan <- read.csv("C:/Users/USER/Downloads/KungSan.csv")
This dataset contains the ages (in years), heights (in centimeters), and weights (in kg) of a sample of children
from the Kung San people, a people located in northern Namibia and southern Angola.

The line of code below will shuffle your data, and make the code unique to you. After importing the data, replace 1111111 with your seven-digit student id number in the set.seed function below, and click the green arrow at the top-right hand side of the code chunk. This part is not worth marks, but you will receive a 0 on your assignment if it is not completed correctly. The “echo = FALSE” argument has been added to prevent this code chunk from appearing in your final document.

Make sure you import your data and shuffle it before beginning the assignment questions.
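
The shuffling chunk itself is hidden by “echo = FALSE”, so its exact contents are not shown here. A minimal sketch of how a seeded shuffle typically works in R is given below, assuming the rows are reordered with sample(). The data frame toy and its values are made up for illustration and are not the KungSan data.

```r
# Toy stand-in for the KungSan data frame (hypothetical values).
toy <- data.frame(Age = c(2, 5, 8, 11, 14),
                  height = c(88, 100, 110, 121, 131))

# Fixing the seed makes the random reordering reproducible.
set.seed(1111111)                      # placeholder seed from the instructions
shuffled <- toy[sample(nrow(toy)), ]   # reorder the rows at random

# Re-running with the same seed gives exactly the same order.
set.seed(1111111)
shuffled_again <- toy[sample(nrow(toy)), ]
identical(shuffled, shuffled_again)
```

Because the seed determines the order, two students with different student numbers would work with differently ordered rows.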

Questions [24 marks]

  1. Produce a scatterplot comparing the ages (X) to the heights (Y). Set appropriate labels for the x- and y-axes, and set a title as well. [3 marks]

plot(KungSan$Age,KungSan$height,xlab="Age",ylab="Heights",main="Kungsan Data")

[Figure: scatterplot titled "Kungsan Data"; x-axis: Age; y-axis: Heights]

  1. Calculate the least squares regression line for predicting height from age. [2 marks]

lm(KungSan$height~KungSan$Age)

Call:
lm(formula = KungSan$height ~ KungSan$Age)

Coefficients:
(Intercept)  KungSan$Age
     81.377        3.589
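
As a hedged illustration of where these two coefficients come from, the sketch below fits a line to a small made-up data frame (toy, not the KungSan data) and checks the result against the textbook formulas: slope = r * sy / sx and intercept = mean(y) - slope * mean(x). It also uses the data argument of lm(), which gives shorter coefficient names than the KungSan$ form.

```r
# Toy stand-in for the KungSan data frame (hypothetical values).
toy <- data.frame(Age = c(2, 5, 8, 11, 14),
                  height = c(88, 100, 110, 121, 131))

# Fit height on age; coef() returns the intercept and slope.
fit <- lm(height ~ Age, data = toy)
intercept <- unname(coef(fit)["(Intercept)"])
slope <- unname(coef(fit)["Age"])

# The same values from the least squares formulas.
slope_by_hand <- cor(toy$Age, toy$height) * sd(toy$height) / sd(toy$Age)
intercept_by_hand <- mean(toy$height) - slope_by_hand * mean(toy$Age)

c(intercept = intercept, slope = slope)
```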

  1. Give a fully worded interpretation of the value of r^2 from Question 5. [2 marks]

The value of r^2 is the proportion of the variation in height that is explained by the least squares regression of height on age.
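
To make the meaning of r^2 concrete, the sketch below computes it three ways on a made-up data frame (toy, not the KungSan data): by squaring the correlation, from summary() of the fitted model, and as one minus the ratio of residual variation to total variation. In simple linear regression all three agree.

```r
# Toy stand-in for the KungSan data frame (hypothetical values).
toy <- data.frame(Age = c(2, 5, 8, 11, 14),
                  height = c(88, 100, 110, 121, 131))

fit <- lm(height ~ Age, data = toy)

# 1. Square of the correlation coefficient.
r_squared_from_cor <- cor(toy$Age, toy$height)^2

# 2. Value reported by the model summary.
r_squared_from_fit <- summary(fit)$r.squared

# 3. Share of total variation in height not left in the residuals.
ss_residual <- sum(residuals(fit)^2)
ss_total <- sum((toy$height - mean(toy$height))^2)
r_squared_by_hand <- 1 - ss_residual / ss_total

c(r_squared_from_cor, r_squared_from_fit, r_squared_by_hand)
```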

  1. In Question 5, you should have found a high value for the correlation coefficient. From this alone, can you conclude that changes in age cause a change in height? Why / why not? [2 marks]

No. The strong positive correlation shows that height tends to increase as age increases, but a correlation on its own only describes an association. It cannot establish that changes in age cause changes in height, because a lurking variable could be responsible for the relationship.

  1. Give a fully worded interpretation of the slope of the least squares regression equation from Question 2. [2 marks]

For each additional year of age, the predicted height increases by 3.589 cm, on average.

  1. Suppose that a sample member has an age of 7 years and a height of 106cm. Use the regression equation from Question 2 to calculate the residual for this individual. (Do not use the $residuals vector, simply use R as a calculator here.) [2 marks]

y <- 81.377 + 3.589 * 7
y

## [1] 106.5

y1 <- 106
y1 - y

## [1] -0.5
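
The calculation above follows the rule residual = observed value - predicted value. A small sketch of the same arithmetic written as reusable functions is shown below. The intercept 81.377 and slope 3.589 are the values used in this document; the function names predict_height and residual_for are hypothetical helpers introduced only for illustration.

```r
# Coefficients of the fitted line as used in this document.
intercept <- 81.377
slope <- 3.589

# Predicted height (cm) for a given age (years).
predict_height <- function(age) {
  intercept + slope * age
}

# Residual = observed height minus predicted height.
residual_for <- function(age, observed_height) {
  observed_height - predict_height(age)
}

residual_for(7, 106)   # negative: the child is shorter than predicted
```

A negative residual means the observed point lies below the regression line.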

  1. Use the regression equation from Question 2 to estimate the height of a 1-year-old child. [1 mark]

y <- 81.377 + 3.589 * 1
y

## [1] 84.966

  1. Is the prediction from Question 10 reliable? Why / why not? [2 marks]

Yes it is reliable.
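
Whether a prediction is reliable depends largely on whether the x-value lies inside the range of ages that were actually observed. The sketch below shows one way to check this, using a made-up data frame (toy, not the KungSan data), so its result says nothing about the real dataset.

```r
# Toy stand-in for the KungSan data frame (hypothetical values).
toy <- data.frame(Age = c(2, 5, 8, 11, 14),
                  height = c(88, 100, 110, 121, 131))

fit <- lm(height ~ Age, data = toy)

# Predict the height for a new age with predict().
new_age <- 1
predicted <- predict(fit, newdata = data.frame(Age = new_age))

# A prediction outside the observed ages is an extrapolation.
age_range <- range(toy$Age)
is_extrapolation <- new_age < age_range[1] || new_age > age_range[2]
is_extrapolation
```

Running range(KungSan$Age) on the real data would show whether an age of 1 year falls inside the observed ages.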

  1. Two new observations are to be added to this dataset. One has an age of 40 years and a height of 160cm, and another has an age of 12 years and a height of 160cm. Which of these observations, if any, would be considered influential? Why? [2 marks]

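The observation with an age of 40 years and a height of 160 cm would be influential. The maximum age in the dataset is 16 years, so this point is an outlier in the x-direction and would pull the regression line strongly toward itself. The observation with an age of 12 years lies within the range of observed ages, so it would have much less effect on the line.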
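
The effect of each new observation can be illustrated by refitting the line with the point added and comparing slopes. The sketch below does this on a made-up data frame (toy, not the KungSan data); slope_of is a hypothetical helper, and the size of the changes will differ for the real dataset.

```r
# Toy stand-in for the KungSan data frame (hypothetical values).
toy <- data.frame(Age = c(2, 5, 8, 11, 14, 16),
                  height = c(88, 100, 110, 121, 131, 139))

# Helper: slope of the least squares line for a data frame.
slope_of <- function(d) {
  unname(coef(lm(height ~ Age, data = d))["Age"])
}

base_slope <- slope_of(toy)

# Refit with each candidate observation added on its own.
slope_with_age40 <- slope_of(rbind(toy, data.frame(Age = 40, height = 160)))
slope_with_age12 <- slope_of(rbind(toy, data.frame(Age = 12, height = 160)))

# How far each new point moves the slope.
change_age40 <- abs(slope_with_age40 - base_slope)
change_age12 <- abs(slope_with_age12 - base_slope)
c(age40 = change_age40, age12 = change_age12)
```

In this toy example the point at age 40 changes the slope far more than the point at age 12, which is what makes it influential.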

  1. Determine the correlation between height and weight in this dataset. [1 mark]

cor(KungSan$height,KungSan$weight)

## [1] 0.

  1. Suppose instead that heights were measured in inches, and weights were measured in pounds. Without doing any calculations, what would the value of the correlation be in this case? [1 mark]

The correlation coefficient would remain the same, because correlation has no units and is not affected by changes in the units of measurement.
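
This can be checked numerically. The sketch below uses a made-up data frame (toy, not the KungSan data) and converts centimeters to inches and kilograms to pounds; the conversion factors 2.54 cm per inch and 2.20462 lb per kg are standard values.

```r
# Toy stand-in for the KungSan data frame (hypothetical values).
toy <- data.frame(height = c(88, 100, 110, 121, 131),   # centimeters
                  weight = c(12, 15, 19, 24, 30))       # kilograms

# Correlation in the original units.
r_metric <- cor(toy$height, toy$weight)

# Correlation after converting to inches and pounds.
height_in <- toy$height / 2.54
weight_lb <- toy$weight * 2.20462
r_imperial <- cor(height_in, weight_lb)

all.equal(r_metric, r_imperial)
```

Multiplying either variable by a positive constant rescales it without changing how the points are arranged relative to one another, so r is unchanged.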