Using R and RStudio to analyse the sports data frame (Statistics assignment)

Typology: Assignments

2019/2020

Uploaded on 04/26/2020

my-one
Abdulmajeed Yahya Moqbel Saleh Alsabahi (19522410)
> sports <- read.csv(file.choose(), header=TRUE)
> head(sports)
Country X100m X200m X400m X800m X1500m X3000m marathon GDP
1 argentina 11.61 22.94 54.50 2.15 4.43 9.79 178.52 76961923741
2 australia 11.20 22.35 51.08 1.98 4.13 9.08 152.37 150000000000
3 austria 11.43 23.09 50.62 1.99 4.22 9.34 159.37 81861232823
4 belgium 11.41 23.04 52.00 2.00 4.14 8.88 157.85 128000000000
5 bermuda 11.46 23.05 53.30 2.16 4.58 9.81 169.98 613299968
6 brazil 11.31 23.17 52.80 2.10 4.49 9.77 168.75 235000000000
Population
1 28105.89
2 14692.00
3 7549.43
4 9859.24
5 54.67
6 121159.76
> sports <- read.csv(file="sports2.csv", row.names=1)
> head(sports)
X100m X200m X400m X800m X1500m X3000m marathon GDP
argentina 11.61 22.94 54.50 2.15 4.43 9.79 178.52 76961923741
australia 11.20 22.35 51.08 1.98 4.13 9.08 152.37 150000000000
austria 11.43 23.09 50.62 1.99 4.22 9.34 159.37 81861232823
belgium 11.41 23.04 52.00 2.00 4.14 8.88 157.85 128000000000
bermuda 11.46 23.05 53.30 2.16 4.58 9.81 169.98 613299968
brazil 11.31 23.17 52.80 2.10 4.49 9.77 168.75 235000000000
Population
argentina 28105.89
australia 14692.00
austria 7549.43
belgium 9859.24
bermuda 54.67
brazil 121159.76
1) How many data cells are there in the sports data frame?
> dim(sports)
[1] 53 9
> nrow(sports)
[1] 53
> ncol(sports)
[1] 9

> class(sports)
[1] "data.frame"
> colnames(sports)
[1] "X100m"      "X200m"      "X400m"      "X800m"      "X1500m"
[6] "X3000m"     "marathon"   "GDP"        "Population"
> rownames(sports)
 [1] "argentina"   "australia"   "austria"     "belgium"     "bermuda"
 [6] "brazil"      "burma"       "canada"      "chile"       "china"
[11] "colombia"    "costarica"   "czech"       "denmark"     "domrep"
[16] "finland"     "france"      "gdr"         "frg"         "gbni"
[21] "greece"      "guatemala"   "hungary"     "india"       "indonesia"
[26] "ireland"     "israel"      "italy"       "japan"       "kenya"
[31] "korea"       "dprkorea"    "luxembourg"  "malaysia"    "mauritius"
[36] "mexico"      "netherlands" "newzealand"  "norway"      "papua"
[41] "philippines" "poland"      "portugal"    "romania"     "singapore"
[46] "spain"       "sweden"      "switzerland" "taipei"      "thailand"
[51] "turkey"      "usa"         "ussr"
> x <- nrow(sports)
> y <- ncol(sports)
> x * y
[1] 477

The sports data frame contains 477 data cells (53 rows by 9 columns).
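The same count can be obtained in one step. The following is a minimal sketch, assuming a small stand-in data frame (`toy`, a hypothetical name) typed in from three of the rows shown above, since `sports2.csv` itself is not included here.

```r
# Stand-in for the sports data frame: three rows copied from the head() output.
toy <- data.frame(
  X100m = c(11.61, 11.20, 11.43),
  X200m = c(22.94, 22.35, 23.09),
  row.names = c("argentina", "australia", "austria")
)

# dim() returns c(rows, columns); their product is the number of cells.
n_cells <- prod(dim(toy))
n_cells
```

Applied to the full data frame, `prod(dim(sports))` would multiply 53 by 9 and give the same 477 as `x * y`.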

2) Extract the first three rows, and the first three columns from the sports data frame

First, extract the first three rows:

> sports[c(1,2,3),]
          X100m X200m X400m X800m X1500m X3000m marathon          GDP
argentina 11.61 22.94 54.50  2.15   4.43   9.79   178.52  76961923741
australia 11.20 22.35 51.08  1.98   4.13   9.08   152.37 150000000000
austria   11.43 23.09 50.62  1.99   4.22   9.34   159.37  81861232823
          Population
argentina   28105.89
australia   14692.00
austria      7549.43

Now extract the first three rows and the first three columns together:

> sports[c(1,2,3),c(1,2,3)]
          X100m X200m X400m
argentina 11.61 22.94 54.50
australia 11.20 22.35 51.08
austria   11.43 23.09 50.62
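Selecting the columns by name gives the same result and does not depend on column order. This is a sketch only, assuming a hypothetical stand-in data frame `toy` built from four of the rows shown above.

```r
# Stand-in data: four rows and four columns copied from the head() output.
toy <- data.frame(
  X100m = c(11.61, 11.20, 11.43, 11.41),
  X200m = c(22.94, 22.35, 23.09, 23.04),
  X400m = c(54.50, 51.08, 50.62, 52.00),
  X800m = c(2.15, 1.98, 1.99, 2.00),
  row.names = c("argentina", "australia", "austria", "belgium")
)

# Positional indexing: rows 1 to 3, columns 1 to 3.
top_left <- toy[1:3, 1:3]

# The same block, with the columns selected by name instead of position.
by_name <- toy[1:3, c("X100m", "X200m", "X400m")]

identical(top_left, by_name)
```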

3) Extract all rows except the last one, for the last two columns from the sports data frame.

> sports[c(1:52),(8:9)]

                     GDP Population
malaysia    2.448803e+10     13798.
mauritius   1.136544e+09       966.
mexico      1.940000e+11     69360.
netherlands 1.930000e+11     14149.
newzealand  2.324551e+10      3112.
norway      6.443938e+10      4085.
papua       2.545983e+09      3304.
philippines 3.245040e+10     47396.
poland                NA     35574.
portugal    3.289976e+10      9766.
romania               NA     22242.
singapore   1.189362e+10      2413.
spain       2.320000e+11     37491.
sweden      1.400000e+11      8310.
switzerland 1.190000e+11      6319.
taipei                NA     18030.
thailand    3.235351e+10     47385.
turkey      6.878929e+10     43975.
usa         2.860000e+12    227225.
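The hard-coded indices 52, 8 and 9 only work for a 53 by 9 data frame. As a hedged alternative, the sketch below computes them from the data, using a hypothetical stand-in `toy` built from values shown earlier.

```r
# Stand-in data: the last three columns for four countries from the head() output.
toy <- data.frame(
  marathon   = c(178.52, 152.37, 159.37, 157.85),
  GDP        = c(76961923741, 150000000000, 81861232823, 128000000000),
  Population = c(28105.89, 14692.00, 7549.43, 9859.24),
  row.names  = c("argentina", "australia", "austria", "belgium")
)

n <- nrow(toy)  # index of the last row
p <- ncol(toy)  # index of the last column

# A negative row index drops that row; (p - 1):p keeps the last two columns.
res <- toy[-n, (p - 1):p]
res
```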

4) Remove the first four columns from the sports data frame

> sports[,1:4] <- list(NULL)
> head(sports)
          X1500m X3000m marathon          GDP Population
argentina   4.43   9.79   178.52  76961923741   28105.89
australia   4.13   9.08   152.37 150000000000   14692.00
austria     4.22   9.34   159.37  81861232823    7549.43
belgium     4.14   8.88   157.85 128000000000    9859.24
bermuda     4.58   9.81   169.98    613299968      54.67
brazil      4.49   9.77   168.75 235000000000  121159.76
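Assigning `list(NULL)` modifies `sports` in place, so the removed columns are gone for any later step. A non-destructive alternative is sketched below with a hypothetical stand-in `toy`: a negative column index returns a copy without those columns and leaves the original untouched.

```r
# Stand-in data: the first five columns for three countries from the head() output.
toy <- data.frame(
  X100m  = c(11.61, 11.20, 11.43),
  X200m  = c(22.94, 22.35, 23.09),
  X400m  = c(54.50, 51.08, 50.62),
  X800m  = c(2.15, 1.98, 1.99),
  X1500m = c(4.43, 4.13, 4.22),
  row.names = c("argentina", "australia", "austria")
)

# Drop columns 1 to 4 in a copy; drop = FALSE keeps the result a data frame
# even when only one column remains.
trimmed <- toy[, -(1:4), drop = FALSE]

names(trimmed)  # remaining columns
ncol(toy)       # the original still has all five columns
```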

5) Subset of data consisting of the strength events variables (track events 800m and below) and “Population”, for countries with per capita GDP of 5000 or more. The actual population size is the value in the “Population” column multiplied by 1000. (20%)

> subset(sports, select=c(X800m,X1500m,X3000m,marathon,GDP,Population),
+ Population > 5000)+sports[,(10.1000)]
      X800m   X1500m   X3000m  marathon          GDP Population
1  28108.04  9770.74 17481.93   9821.02 7.696290e+10    255330.
2  14693.98 22246.78   373.23   7435.83 1.496550e+11    277128.
3   7551.42  2418.17 13807.47  10870.49 8.186124e+10     35655.
4   9861.24 37495.31   974.92 696941.37 1.275080e+11     24551.
6     56.77  8315.02 69370.64 147659.11 2.350250e+11    128709.

The command as recorded filters on Population > 5000 rather than on per capita GDP, and it selects the 800m and longer events rather than the events of 800m and below, so this output does not answer the question as posed.
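One way to answer the question as worded is sketched below. It rests on several assumptions: that "800m and below" means the columns `X100m`, `X200m`, `X400m` and `X800m`, that the data frame still contains those columns (that is, it has been re-read after question 4), and that per capita GDP is `GDP / (Population * 1000)`. The stand-in `toy` and the names `per_capita` and `rich` are hypothetical.

```r
# Stand-in data: the six rows shown by head(), restricted to the needed columns.
toy <- data.frame(
  X100m = c(11.61, 11.20, 11.43, 11.41, 11.46, 11.31),
  X200m = c(22.94, 22.35, 23.09, 23.04, 23.05, 23.17),
  X400m = c(54.50, 51.08, 50.62, 52.00, 53.30, 52.80),
  X800m = c(2.15, 1.98, 1.99, 2.00, 2.16, 2.10),
  GDP   = c(76961923741, 150000000000, 81861232823,
            128000000000, 613299968, 235000000000),
  Population = c(28105.89, 14692.00, 7549.43, 9859.24, 54.67, 121159.76),
  row.names = c("argentina", "australia", "austria",
                "belgium", "bermuda", "brazil")
)

# Population is recorded in thousands, so scale it before dividing.
per_capita <- toy$GDP / (toy$Population * 1000)

# which() drops rows whose per capita GDP is NA (missing GDP).
keep <- which(per_capita >= 5000)
rich <- toy[keep, c("X100m", "X200m", "X400m", "X800m", "Population")]
rich
```

On these six rows the filter keeps australia, austria, belgium and bermuda; argentina and brazil fall below 5000.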

> sum(is.na(sports$GDP))
[1] 9

Name the countries which have missing GDP data:

> is.na(sports$GDP)
 [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
[13]  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE
[49]  TRUE FALSE FALSE FALSE  TRUE
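The logical vector gives positions rather than names. Because the countries are stored as row names, they can be looked up directly; the sketch below assumes a hypothetical four-row stand-in `toy`.

```r
# Stand-in data: two countries with a GDP value and two without.
toy <- data.frame(
  GDP = c(76961923741, NA, 27572310000, NA),
  row.names = c("argentina", "burma", "chile", "czech")
)

# Index the row names with the logical vector to get the country names.
missing_gdp <- rownames(toy)[is.na(toy$GDP)]
missing_gdp
```

In the output above the TRUE values sit at positions 7, 13, 18, 23, 32, 42, 44, 49 and 53, which in the `rownames(sports)` listing correspond to burma, czech, gdr, hungary, dprkorea, poland, romania, taipei and ussr.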

8) Calculate the mean and standard deviation for each variable in the data set

> mean(sports$X100m) ; sd(sports$X100m)
[1] 11.
[1] 0.
> mean(sports$X200m) ; sd(sports$X200m)
[1] 23.
[1] 0.
> mean(sports$X400m) ; sd(sports$X400m)
[1] 53.
[1] 2.
> mean(sports$X800m) ; sd(sports$X800m)
[1] 2.
[1] 0.
> mean(sports$X1500m) ; sd(sports$X1500m)
[1] 4.
[1] 0.
> mean(sports$X3000m) ; sd(sports$X3000m)
[1] 9.
[1] 0.
> mean(sports$marathon) ; sd(sports$marathon)
[1] 169.
[1] 23.
> mean(sports$GDP) ; sd(sports$GDP)
[1] NA
[1] NA
> mean(sports$Population) ; sd(sports$Population)
[1] 65975.
[1] 165580.
> summary(sports)
     X100m           X200m           X400m           X800m
 Min.   :10.79   Min.   :21.71   Min.   :47.99   Min.   :1.
 1st Qu.:11.25   1st Qu.:22.82   1st Qu.:51.50   1st Qu.:2.
 Median :11.58   Median :23.52   Median :53.26   Median :2.
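The mean and standard deviation of GDP come out as NA because `mean()` and `sd()` return NA when any value is missing, unless `na.rm = TRUE` is passed. The sketch below, on a hypothetical stand-in `toy`, computes both statistics for every column in one call while skipping missing values.

```r
# Stand-in data: three countries, one of them with missing GDP.
toy <- data.frame(
  X100m = c(11.61, 12.14, 12.00),
  GDP   = c(76961923741, NA, 27572310000),
  row.names = c("argentina", "burma", "chile")
)

# sapply() applies the function to each column and binds the results
# into a matrix with one row per statistic and one column per variable.
stats <- sapply(toy, function(col) {
  c(mean = mean(col, na.rm = TRUE), sd = sd(col, na.rm = TRUE))
})
stats
```

Without `na.rm = TRUE`, `mean(toy$GDP)` is NA, which matches the behaviour seen for `sports$GDP` in the transcript.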

First three columns for all 53 countries:

            X100m X200m X400m
argentina   11.61 22.94 54.
australia   11.20 22.35 51.
austria     11.43 23.09 50.
belgium     11.41 23.04 52.
bermuda     11.46 23.05 53.
brazil      11.31 23.17 52.
burma       12.14 24.47 55.
canada      11.00 22.25 50.
chile       12.00 24.52 54.
china       11.95 24.41 54.
colombia    11.60 24.00 53.
costarica   11.96 24.60 58.
czech       11.09 21.97 47.
denmark     11.42 23.52 53.
domrep      11.79 24.05 56.
finland     11.13 22.39 50.
france      11.15 22.59 51.
gdr         10.81 21.71 48.
frg         11.01 22.39 49.
gbni        11.00 22.13 50.
greece      11.79 24.08 54.
guatemala   11.84 24.54 56.
hungary     11.45 23.06 51.
india       11.95 24.28 53.
indonesia   11.85 24.24 55.
ireland     11.43 23.51 53.
israel      11.45 23.57 54.
italy       11.29 23.00 52.
japan       11.73 24.00 53.
kenya       11.73 23.88 52.
korea       11.96 24.49 55.
dprkorea    12.25 25.78 51.
luxembourg  12.03 24.96 56.
malaysia    12.23 24.21 55.
mauritius   11.76 25.08 58.
mexico      11.89 23.62 53.
netherlands 11.25 22.81 52.
newzealand  11.55 23.13 51.
norway      11.58 23.31 53.
papua       12.25 25.07 56.
philippines 11.76 23.54 54.
poland      11.13 22.21 49.
portugal    11.81 24.22 54.
romania     11.44 23.46 51.
singapore   12.30 25.00 55.
spain       11.80 23.98 53.
sweden      11.16 22.82 51.
switzerland 11.45 23.31 53.
taipei      11.22 22.62 52.
thailand    11.75 24.46 55.
turkey      11.98 24.44 56.
usa         10.79 21.83 50.
ussr        11.06 22.19 49.
GDP and Population, rows argentina to luxembourg:

                    GDP Population
argentina  7.696192e+10     28105.
australia  1.500000e+11     14692.
austria    8.186123e+10      7549.
belgium    1.280000e+11      9859.
bermuda    6.133000e+08        54.
brazil     2.350000e+11    121159.
burma                NA     33370.
canada     2.740000e+11     24593.
chile      2.757231e+10     11266.
china      1.900000e+11    981235.
colombia   3.340071e+10     27737.
costarica  4.831447e+09      2389.
czech                NA     10304.
denmark    7.086747e+10      5123.
domrep     6.631000e+09      5809.
finland    5.368505e+10      4779.
france     7.040000e+11     55340.
gdr                  NA     16737.
frg        9.470000e+11     78288.
gbni       5.650000e+11     56314.
greece     5.682966e+10      9642.
guatemala  7.878700e+09      7283.
hungary              NA     10711.
india      1.900000e+11    696783.
indonesia  7.801321e+10    147490.
ireland    2.177229e+10      3412.
israel     2.178097e+10      3878.
italy      4.760000e+11     56433.
japan      1.090000e+12    116782.
kenya      7.265315e+09     16268.
korea      6.780238e+10     38123.
dprkorea             NA     17472.
luxembourg 6.294122e+09       364.
Further rows of the question 5 output:

       X800m    X1500m    X3000m  marathon          GDP Population
7  121161.94   6323.86  14159.31   3603.82           NA     43229.
8   33372.00  18034.06   3121.71   4027.45 2.738540e+11     24647.
9   24595.05  47389.55   4094.99  56605.26 2.757236e+10    132425.
10  11268.31  43980.25   3313.78 116950.48 1.896500e+11   1014605.
11 981237.11 227229.35  47406.43  16434.41 3.340079e+10     52330.
13  27739.79 262440.14  35583.07  38282.63           NA     21570.
14   2391.34  28110.07   9775.02  17623.89 7.086747e+10    986358.
15  10306.43  14696.74  22252.54    568.03 6.631007e+09     33547.
17   5125.00   7553.57   2422.93  13953.40 7.035250e+11     57730.
18   5811.20   9863.20  37499.92   1123.72           NA     27041.
19   4781.48     58.70   8319.12  69509.40 9.466951e+11     83411.
20  55342.76 121163.79   6328.03  14299.52 5.649480e+11     62123.
21  16739.07  33374.35  18039.87   3295.10 5.682967e+10     14422.
22  78290.86  24597.86  47395.86   4300.70 7.878756e+09     62624.
23  56316.23  11270.37  43984.90   3460.84           NA     27448.
24   9644.60 981239.32 227234.98  47585.00 1.895940e+11    775072.
25   7285.68  27742.51 262446.02  35775.43 7.801324e+10    203804.
28  10713.08   2393.29  28114.52   9918.13 4.756830e+11     66076.
29 696785.61  10308.54  14701.20  22393.15 1.086990e+12    124065.
30 147492.36   5127.15   7558.63   2595.00 7.265329e+09     26980.
31   3414.95   5813.69   9868.86  37655.82 6.780238e+10    734907.
32   3879.97   4783.78     64.02   8489.70           NA    164962.
34  56436.07  55345.47 121170.22   6501.58 2.448805e+10     17210.
36 116784.04  16741.25  33379.59  18188.53 1.943570e+11     73238.
37  16270.98  78292.64  24602.01  47537.80 1.926610e+11     70583.
41  38125.97  56318.82  11276.39  44176.29 3.245040e+10    164178.
42  17474.09   9646.49 981243.97 227385.82           NA     51843.
43    366.24   7287.62  27746.74 262587.20 3.289979e+10     47890.
44  13800.05  10715.08   2397.84  28271.34           NA     39714.
46    968.09 696787.66  10313.21  14854.60 2.321350e+11     37855.
47  69362.89 147494.48   5131.84   7703.91 1.400890e+11     22108.
48  14151.82   3416.87   5818.04  10012.66 1.187100e+11      7285.
49   3115.00   3882.38   4789.16    232.54           NA     87390.
50   4087.82  56438.60  55351.06 121328.21 3.235352e+10     61535.
51   3306.62 116786.37  16746.38  33571.08 6.878931e+10     47088.
52  47398.93  16272.94  78297.08  24735.72 2.862510e+12    231310.
53  35576.04  38127.65  56322.67  11417.45           NA    265740.
> which(is.na(sports$GDP) == TRUE)
[1]  7 13 18 23 32 42 44 49 53
Remaining summary(sports) output:

     X100m           X200m           X400m           X800m
 Mean   :11.57   Mean   :23.53   Mean   :53.17   Mean   :2.
 3rd Qu.:11.85   3rd Qu.:24.28   3rd Qu.:54.97   3rd Qu.:2.
 Max.   :12.30   Max.   :25.78   Max.   :58.25   Max.   :2.
     X1500m          X3000m          marathon          GDP
 Min.   :3.870   Min.   : 8.450   Min.   :142.7   Min.   :6.133e+
 1st Qu.:4.110   1st Qu.: 8.840   1st Qu.:152.5   1st Qu.:2.288e+
 Median :4.230   Median : 9.310   Median :162.6   Median :6.830e+
 Mean   :4.288   Mean   : 9.349   Mean   :169.6   Mean   :2.183e+
 3rd Qu.:4.430   3rd Qu.: 9.790   3rd Qu.:179.2   3rd Qu.:1.908e+
 Max.   :4.860   Max.   :10.900   Max.   :261.1   Max.   :2.860e+
                                                  NA's   :9
   Population
 Min.   :    54.
 1st Qu.:  6319.
 Median : 16269.
 Mean   : 65975.
 3rd Qu.: 47385.
 Max.   :981235.