Worksheet 11
Packages
library(ggbiplot)
library(tidyverse)
library(tmaptools)
library(leaflet)
library(conflicted)
conflicts_prefer(dplyr::summarize)
conflicts_prefer(dplyr::filter)
Rugby league teams by location
There are 37 teams that play professional or semi-professional rugby league in Europe.[1] These are listed in the file at http://ritsokiguess.site/datafiles/rugby-league-teams.csv, which gives the name of each team, its location, and the league in which it plays (from Super League, best, to League One, worst). Our aim is to make a map of the locations of the teams, to see what we can learn about where the teams tend to be from.
- Read in and display (some of) the data.
- Look up the latitudes and longitudes of the location where each team plays.
- Draw a map showing where these 37 teams play.
- Where are most of the teams found?
- What can you find out about why most of the teams are located where they are?
- Re-draw your map, but now colouring the points according to which league the team plays in.
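The geocoding and mapping steps above might be sketched as follows. This is only one possible approach, not a model answer. It assumes the columns of the data file are called `team`, `location` and `league` (check the names after reading the file in, and adjust), and the helper names `locate_places` and `draw_team_map` are made up for illustration. `geocode_OSM` queries OpenStreetMap over the internet, so the lookup needs a connection and can take a little while.

```r
library(tidyverse)
library(tmaptools)
library(leaflet)

# Hypothetical helper: look up latitude and longitude for each distinct place.
# Needs internet access; places that cannot be found are left out of the result.
locate_places <- function(places) {
  geocode_OSM(unique(places), as.data.frame = TRUE) %>%
    select(location = query, lat, lon)
}

# Hypothetical helper: draw the teams on a map.
# ASSUMPTION: teams_located has columns team, league, lat and lon.
draw_team_map <- function(teams_located, by_league = FALSE) {
  base_map <- leaflet(teams_located) %>% addTiles()
  if (!by_league) {
    # first version of the map: every team drawn the same way
    return(base_map %>%
      addCircleMarkers(lng = ~lon, lat = ~lat, label = ~team))
  }
  # second version: one colour per league, with a legend
  pal <- colorFactor("Set1", domain = teams_located$league)
  base_map %>%
    addCircleMarkers(lng = ~lon, lat = ~lat,
                     color = ~pal(league), label = ~team) %>%
    addLegend(pal = pal, values = ~league, title = "League")
}
```

With the data read into a data frame called `teams`, something like `teams %>% left_join(locate_places(teams$location), by = "location") %>% draw_team_map(by_league = TRUE)` would then draw the coloured map. It is worth checking the looked-up coordinates before trusting the picture, since a place name can match somewhere unintended.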
Intoxicant use according to gender and race
In a survey, 2,276 high-school students were classified according to whether or not they have ever used alcohol, cigarettes, or marijuana (responses). In the survey, each student’s race and gender (as they reported them) were also recorded (explanatory). The data are in http://ritsokiguess.site/datafiles/intoxicant.csv. The columns are labelled by the initial letter of each of these, with a column count that says how many students fell into that combination of categories.
- Read in and display some of the data. How do you know you have the correct total number of students?
- Fit a log-linear model with up to two-way associations to these data. To do this, use (a+c+m+r+g)^2 on the right side of your model formula (instead of the a*c*r*m*g that you were probably expecting). Run a suitable drop1 on this model.
- Build a better model. Why did you stop where you did?
- For each of your significant associations, draw a graph to explore them, and say what you conclude. Note that there is a logical distinction between associations that contain both a response variable and an explanatory one, and those that contain two variables of the same type.
- We can also use step to do the model-building (rather than removing terms one by one). Starting from all three-way interactions, run step on this model, saving the result, and then run drop1 on that result. Is everything remaining significant? (Hint: copy and paste your code for the two-way model from question 8, and change the 2 to a 3.)
- In your final model from the previous question, are there any significant terms that you did not see previously? If so, in each case draw a suitable graph and say what it means.
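As a rough sketch of the workflow these questions describe, the code below reads the data, checks the total, fits the two-way model as a Poisson `glm`, and then runs `step` from the three-way model. The column names `a`, `c`, `m`, `r`, `g` and `count` come from the description above. The term removed in the `update` line and the pair of variables graphed are placeholders chosen only so that the code runs; which terms to remove or plot depends on what your own `drop1` output shows. Note that `step` as called here selects by AIC, so its result may not agree exactly with removing terms by P-value.

```r
library(tidyverse)

my_url <- "http://ritsokiguess.site/datafiles/intoxicant.csv"
intoxicant <- read_csv(my_url)

# The counts should add up to the 2276 students in the survey
intoxicant %>% summarize(total = sum(count))

# Log-linear model: main effects plus all two-way associations
fit2 <- glm(count ~ (a + c + m + r + g)^2,
            data = intoxicant, family = "poisson")
drop1(fit2, test = "Chisq")

# Removing one term and re-testing.
# PLACEHOLDER: c:r is only an example; remove whichever term your own
# drop1 output shows to be least significant.
fit2a <- update(fit2, . ~ . - c:r)
drop1(fit2a, test = "Chisq")

# Start from all three-way interactions and let step simplify the model
fit3 <- glm(count ~ (a + c + m + r + g)^3,
            data = intoxicant, family = "poisson")
fit3_step <- step(fit3, trace = 0)
drop1(fit3_step, test = "Chisq")

# Example graph for one association (here alcohol use within each level of
# cigarette use); swap in the pairs that remain in your own final model.
ggplot(intoxicant, aes(x = c, y = count, fill = a)) +
  geom_col(position = "fill")
```

Using `position = "fill"` turns the counts into proportions within each bar, which makes it easier to compare, say, the proportion of alcohol users among cigarette users and non-users.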
Footnotes
[1] Rugby league is also played in Australia, traditionally around the city of Sydney, and in places in the Pacific like New Zealand and Papua New Guinea.