Artwork by @allison_horst


1 Creating a new Project in R

Tip: If on a uni computer, sign in to OneDrive

Before you start this challenge, decide where you will save your R project. If you are on a university computer, saving it in the cloud on your university OneDrive means you can access it later. Check that you are signed in to your university OneDrive: open the file explorer > C drive > Users > find your username > OneDrive - University of Leeds and sign in.

Challenge 1

  1. Click the “File” menu button, then “New Project”.
  2. Click “New Directory”.
  3. Click “New Project”.
  4. Type in the name of your project, e.g. “Learning_R”.
  5. Browse to choose where you want to save it.
  6. Click the “Create Project” button.

In future computer labs, to reopen this RStudio project, browse through your file system to the directory where it was saved and double-click the .Rproj file. This will open RStudio and start your R session in the same directory as the .Rproj file. Any data, plots or scripts can be saved to and retrieved from this directory.
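To confirm that your R session has started in the project directory, you can ask R where it is working. This is a minimal sketch using base R only; the names project_dir and rproj_files are just illustrative.

```r
# Print the folder R is currently working in.
project_dir <- getwd()
project_dir

# List any .Rproj files in that folder.
# An empty result means no project file was found there.
rproj_files <- list.files(project_dir, pattern = "\\.Rproj$")
rproj_files
```

If the project opened correctly, the second command should show the .Rproj file you created in Challenge 1.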

RStudio projects also let you open multiple projects at the same time, each in its own project directory, so they do not interfere with each other.


1.1 Best practices for project organization

In the bottom right pane in RStudio, click the “Files” tab. To organise your project, create folders called data, scripts, results, docs by clicking the folder icon with the green plus sign.
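If you prefer, the same folders can be created from the console instead of clicking. This is a sketch that assumes your working directory is the project folder; the variable names folders and created are only for illustration.

```r
# Folder names suggested in the text above.
folders <- c("data", "scripts", "results", "docs")

# Create each folder in the current working directory.
# showWarnings = FALSE keeps R quiet if a folder already exists.
for (folder in folders) {
  dir.create(folder, showWarnings = FALSE)
}

# Check which folders now exist (TRUE means the folder is there).
created <- dir.exists(folders)
names(created) <- folders
created
```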


1.1.1 Use a script

The most effective way to work with R is to write the code you want to run in an .R script. You then run the selected lines (by either using a keyboard shortcut or clicking the “Run” button) in the R console.

Open a new script (click the white square with the green cross below “File” at the top left). Save the script in your R project folder (click “File”, then “Save as” and browse to the scripts folder in your project). You could call it Lesson 1.


1.1.2 Save the data in the data folder


Challenge 2

View the Palmer penguins data here.

  1. From the window showing the data, save the data as a file (CTRL + S or right mouse click > “Save as”).
  2. Make sure it’s saved under the name penguins.csv.
  3. Save the file in the data folder within your project.


1.2 Reading in data

Remember that the .csv data file is now saved in our data folder (do the same when you work with your own datasets).

We can load this into R via the following:

penguins <- read.csv(file = "data/penguins.csv")

Here we used the function read.csv, but there are many other functions for reading in different types of files, for example read_excel in the readxl package.

Tip: Problems reading in data

If your spreadsheet won’t read in check:
1. You are using the correct function for the type of file, for example read.csv for csv files
2. You have included the suffix after the name of the file
3. The name of the file is spelled correctly
4. The file exists in the folder you are directing R to. For example, in our case we would click on the folder named data under the files tab at the bottom right of RStudio to check.
5. Look at an example on the internet to ensure you have included all the necessary arguments. read.csv only needs file = but other functions may need other arguments.
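Some of the checks above can be run from the console. The following is a minimal sketch using base R; it assumes the file path data/penguins.csv used earlier, and the names data_file and file_found are only illustrative.

```r
# The path we expect the file to be at, relative to the project folder.
data_file <- "data/penguins.csv"

# Where is R currently looking? This should be your project folder.
getwd()

# Does the file exist at that path? TRUE means R can see it.
file_found <- file.exists(data_file)
file_found

# What is actually inside the data folder? Check spelling and suffix here.
list.files("data")

# Only try to read the file if it was found.
if (file_found) {
  penguins <- read.csv(file = data_file)
  print(head(penguins))
} else {
  message("Could not find ", data_file, " from ", getwd())
}
```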


Once data is in R, you can view it by clicking on its name in the Environment window.

In R, datasets are called data frames (df) or sometimes tibbles.

We can begin exploring our data frame right away, pulling out columns by specifying them using the $ operator:

penguins$species
penguins$bill_length_mm


Passing the penguins data frame through the structure function str will show you the type of data for each variable.

str(penguins)

str shows us that species is chr, which is short for character. This means R stores species as text; it represents discrete (categorical) data, but R will only treat it as a factor if you convert it. bill_length_mm is num, which stands for numeric, and int means integer. Both num and int variables are treated as continuous (scale) data by R.
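To see these types on a small scale, here is a sketch using a made-up three-row data frame called toy (the values are illustrative and are not read from penguins.csv). It also shows one way to convert a character column to a factor.

```r
# A small illustrative data frame with one column of each type.
toy <- data.frame(
  species = c("Adelie", "Gentoo", "Adelie"),
  bill_length_mm = c(39.1, 46.1, 38.9),
  year = c(2007L, 2008L, 2009L),   # the L suffix makes these integers
  stringsAsFactors = FALSE
)

# str lists each column with its type: chr, num and int.
str(toy)

# species starts out as plain text.
class(toy$species)   # "character"

# Convert it to a factor so R treats it as categories.
toy$species <- factor(toy$species)
class(toy$species)   # "factor"
levels(toy$species)  # "Adelie" "Gentoo"
```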


We can calculate the mean of bill length.

mean(penguins$bill_length_mm, na.rm = TRUE)

But we can’t use the mean function on a character (or factor) variable: R returns NA with a warning.

mean(penguins$species, na.rm = TRUE)

Tip: When functions won’t work

Sometimes R errors are caused by R treating a variable as character or as a factor when you know it’s a number. Checking what R is “thinking” with str can help.
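As an illustration of this kind of problem, the sketch below uses a made-up vector in which a text entry ("n/a") has caused numbers to be stored as text. The vector names and values are hypothetical.

```r
# Numbers stored as text because one entry is not a number.
bill_depth_text <- c("18.7", "17.4", "n/a")
class(bill_depth_text)   # "character"

# mean() cannot work with text: it returns NA (with a warning,
# which is suppressed here to keep the output tidy).
text_mean <- suppressWarnings(mean(bill_depth_text))
text_mean

# Convert to numbers; entries that are not numbers become NA.
bill_depth_num <- suppressWarnings(as.numeric(bill_depth_text))
bill_depth_num

# na.rm = TRUE tells mean() to ignore the missing value.
numeric_mean <- mean(bill_depth_num, na.rm = TRUE)
numeric_mean   # 18.05
```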


1.3 Indexing (square brackets)

The $ operator will specify a variable in a data frame. You can also use indexing.

Challenge 3

There are several subtly different ways to call variables, observations and elements from data.frames using indexing:

Try out these examples and after a # in your script describe what data is returned by each one.

  • penguins[1]
  • penguins[1, 1]
  • penguins[, 1]
  • penguins[1, ]
Solution to Challenge 3
penguins[1]
##       species
## 1      Adelie
## 2      Adelie
## 3      Adelie
## 4      Adelie
## 5      Adelie
## 6      Adelie
## 7      Adelie
## 8      Adelie
## 9      Adelie
## 10     Adelie
## 11     Adelie
## 12     Adelie
## 13     Adelie
## 14     Adelie
## 15     Adelie
## 16     Adelie
## 17     Adelie
## 18     Adelie
## 19     Adelie
## 20     Adelie
## 21     Adelie
## 22     Adelie
## 23     Adelie
## 24     Adelie
## 25     Adelie
## 26     Adelie
## 27     Adelie
## 28     Adelie
## 29     Adelie
## 30     Adelie
## 31     Adelie
## 32     Adelie
## 33     Adelie
## 34     Adelie
## 35     Adelie
## 36     Adelie
## 37     Adelie
## 38     Adelie
## 39     Adelie
## 40     Adelie
## 41     Adelie
## 42     Adelie
## 43     Adelie
## 44     Adelie
## 45     Adelie
## 46     Adelie
## 47     Adelie
## 48     Adelie
## 49     Adelie
## 50     Adelie
## 51     Adelie
## 52     Adelie
## 53     Adelie
## 54     Adelie
## 55     Adelie
## 56     Adelie
## 57     Adelie
## 58     Adelie
## 59     Adelie
## 60     Adelie
## 61     Adelie
## 62     Adelie
## 63     Adelie
## 64     Adelie
## 65     Adelie
## 66     Adelie
## 67     Adelie
## 68     Adelie
## 69     Adelie
## 70     Adelie
## 71     Adelie
## 72     Adelie
## 73     Adelie
## 74     Adelie
## 75     Adelie
## 76     Adelie
## 77     Adelie
## 78     Adelie
## 79     Adelie
## 80     Adelie
## 81     Adelie
## 82     Adelie
## 83     Adelie
## 84     Adelie
## 85     Adelie
## 86     Adelie
## 87     Adelie
## 88     Adelie
## 89     Adelie
## 90     Adelie
## 91     Adelie
## 92     Adelie
## 93     Adelie
## 94     Adelie
## 95     Adelie
## 96     Adelie
## 97     Adelie
## 98     Adelie
## 99     Adelie
## 100    Adelie
## 101    Adelie
## 102    Adelie
## 103    Adelie
## 104    Adelie
## 105    Adelie
## 106    Adelie
## 107    Adelie
## 108    Adelie
## 109    Adelie
## 110    Adelie
## 111    Adelie
## 112    Adelie
## 113    Adelie
## 114    Adelie
## 115    Adelie
## 116    Adelie
## 117    Adelie
## 118    Adelie
## 119    Adelie
## 120    Adelie
## 121    Adelie
## 122    Adelie
## 123    Adelie
## 124    Adelie
## 125    Adelie
## 126    Adelie
## 127    Adelie
## 128    Adelie
## 129    Adelie
## 130    Adelie
## 131    Adelie
## 132    Adelie
## 133    Adelie
## 134    Adelie
## 135    Adelie
## 136    Adelie
## 137    Adelie
## 138    Adelie
## 139    Adelie
## 140    Adelie
## 141    Adelie
## 142    Adelie
## 143    Adelie
## 144    Adelie
## 145    Adelie
## 146    Adelie
## 147    Adelie
## 148    Adelie
## 149    Adelie
## 150    Adelie
## 151    Adelie
## 152    Adelie
## 153    Gentoo
## 154    Gentoo
## 155    Gentoo
## 156    Gentoo
## 157    Gentoo
## 158    Gentoo
## 159    Gentoo
## 160    Gentoo
## 161    Gentoo
## 162    Gentoo
## 163    Gentoo
## 164    Gentoo
## 165    Gentoo
## 166    Gentoo
## 167    Gentoo
## 168    Gentoo
## 169    Gentoo
## 170    Gentoo
## 171    Gentoo
## 172    Gentoo
## 173    Gentoo
## 174    Gentoo
## 175    Gentoo
## 176    Gentoo
## 177    Gentoo
## 178    Gentoo
## 179    Gentoo
## 180    Gentoo
## 181    Gentoo
## 182    Gentoo
## 183    Gentoo
## 184    Gentoo
## 185    Gentoo
## 186    Gentoo
## 187    Gentoo
## 188    Gentoo
## 189    Gentoo
## 190    Gentoo
## 191    Gentoo
## 192    Gentoo
## 193    Gentoo
## 194    Gentoo
## 195    Gentoo
## 196    Gentoo
## 197    Gentoo
## 198    Gentoo
## 199    Gentoo
## 200    Gentoo
## 201    Gentoo
## 202    Gentoo
## 203    Gentoo
## 204    Gentoo
## 205    Gentoo
## 206    Gentoo
## 207    Gentoo
## 208    Gentoo
## 209    Gentoo
## 210    Gentoo
## 211    Gentoo
## 212    Gentoo
## 213    Gentoo
## 214    Gentoo
## 215    Gentoo
## 216    Gentoo
## 217    Gentoo
## 218    Gentoo
## 219    Gentoo
## 220    Gentoo
## 221    Gentoo
## 222    Gentoo
## 223    Gentoo
## 224    Gentoo
## 225    Gentoo
## 226    Gentoo
## 227    Gentoo
## 228    Gentoo
## 229    Gentoo
## 230    Gentoo
## 231    Gentoo
## 232    Gentoo
## 233    Gentoo
## 234    Gentoo
## 235    Gentoo
## 236    Gentoo
## 237    Gentoo
## 238    Gentoo
## 239    Gentoo
## 240    Gentoo
## 241    Gentoo
## 242    Gentoo
## 243    Gentoo
## 244    Gentoo
## 245    Gentoo
## 246    Gentoo
## 247    Gentoo
## 248    Gentoo
## 249    Gentoo
## 250    Gentoo
## 251    Gentoo
## 252    Gentoo
## 253    Gentoo
## 254    Gentoo
## 255    Gentoo
## 256    Gentoo
## 257    Gentoo
## 258    Gentoo
## 259    Gentoo
## 260    Gentoo
## 261    Gentoo
## 262    Gentoo
## 263    Gentoo
## 264    Gentoo
## 265    Gentoo
## 266    Gentoo
## 267    Gentoo
## 268    Gentoo
## 269    Gentoo
## 270    Gentoo
## 271    Gentoo
## 272    Gentoo
## 273    Gentoo
## 274    Gentoo
## 275    Gentoo
## 276    Gentoo
## 277 Chinstrap
## 278 Chinstrap
## 279 Chinstrap
## 280 Chinstrap
## 281 Chinstrap
## 282 Chinstrap
## 283 Chinstrap
## 284 Chinstrap
## 285 Chinstrap
## 286 Chinstrap
## 287 Chinstrap
## 288 Chinstrap
## 289 Chinstrap
## 290 Chinstrap
## 291 Chinstrap
## 292 Chinstrap
## 293 Chinstrap
## 294 Chinstrap
## 295 Chinstrap
## 296 Chinstrap
## 297 Chinstrap
## 298 Chinstrap
## 299 Chinstrap
## 300 Chinstrap
## 301 Chinstrap
## 302 Chinstrap
## 303 Chinstrap
## 304 Chinstrap
## 305 Chinstrap
## 306 Chinstrap
## 307 Chinstrap
## 308 Chinstrap
## 309 Chinstrap
## 310 Chinstrap
## 311 Chinstrap
## 312 Chinstrap
## 313 Chinstrap
## 314 Chinstrap
## 315 Chinstrap
## 316 Chinstrap
## 317 Chinstrap
## 318 Chinstrap
## 319 Chinstrap
## 320 Chinstrap
## 321 Chinstrap
## 322 Chinstrap
## 323 Chinstrap
## 324 Chinstrap
## 325 Chinstrap
## 326 Chinstrap
## 327 Chinstrap
## 328 Chinstrap
## 329 Chinstrap
## 330 Chinstrap
## 331 Chinstrap
## 332 Chinstrap
## 333 Chinstrap
## 334 Chinstrap
## 335 Chinstrap
## 336 Chinstrap
## 337 Chinstrap
## 338 Chinstrap
## 339 Chinstrap
## 340 Chinstrap
## 341 Chinstrap
## 342 Chinstrap
## 343 Chinstrap
## 344 Chinstrap

Returns column 1 as a data frame with a single column.


penguins[1, 1]
## [1] "Adelie"

Returns the single value in the first row, first column.


penguins[, 1]
##   [1] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
##   [8] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
##  [15] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
##  [22] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
##  [29] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
##  [36] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
##  [43] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
##  [50] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
##  [57] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
##  [64] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
##  [71] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
##  [78] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
##  [85] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
##  [92] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
##  [99] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
## [106] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
## [113] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
## [120] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
## [127] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
## [134] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
## [141] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"   
## [148] "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Adelie"    "Gentoo"    "Gentoo"   
## [155] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [162] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [169] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [176] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [183] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [190] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [197] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [204] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [211] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [218] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [225] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [232] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [239] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [246] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [253] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [260] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [267] "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"    "Gentoo"   
## [274] "Gentoo"    "Gentoo"    "Gentoo"    "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap"
## [281] "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap"
## [288] "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap"
## [295] "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap"
## [302] "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap"
## [309] "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap"
## [316] "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap"
## [323] "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap"
## [330] "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap"
## [337] "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap" "Chinstrap"
## [344] "Chinstrap"

Also returns the first column, but as a vector rather than a data frame.


penguins[1, ]
##   species    island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g  sex year
## 1  Adelie Torgersen           39.1          18.7               181        3750 male 2007

Returns the first row as a data frame with one row.
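The difference between these forms is easier to see by asking R what kind of object each one returns. This is a sketch on a made-up three-row data frame called toy (illustrative values, not the real dataset), so the output stays short.

```r
# A small illustrative data frame.
toy <- data.frame(
  species = c("Adelie", "Gentoo", "Chinstrap"),
  body_mass_g = c(3750L, 5000L, 3500L),
  stringsAsFactors = FALSE
)

class(toy[1])      # "data.frame": a one-column data frame
class(toy[, 1])    # "character": the column as a plain vector
toy[1, 1]          # "Adelie": a single value
class(toy[1, ])    # "data.frame": a one-row data frame

# The $ operator gives the same vector as toy[, 1].
identical(toy$species, toy[, 1])   # TRUE

# drop = FALSE keeps the result as a data frame.
class(toy[, 1, drop = FALSE])      # "data.frame"
```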


1.4 Getting help with coding

Tip: Help files

One of the most daunting aspects of R is the large number of functions available. It would be prohibitive, if not impossible, to remember the correct usage for every function you use. Luckily, there are many sources of help!


1.4.1 Using Gen AI to get help with coding

Gen AI can give you the specific help you need when your code won’t work or when you are trying to do something new in R. It is recommended that you use Copilot, which the university provides for all staff and students. The data you enter into this version of Copilot is not shared, so it is more secure.

If you want to use GenAI go to https://uoldigital.net/copilot and use your university username and password to sign in.


Don’t rely solely on AI

You have to know the basics of R so that you can understand the code AI suggests. This means you can identify when GenAI hasn’t quite understood your prompt and why the code it is giving you isn’t working.

Try to avoid using code written by Gen AI and then getting into a spiral by repeatedly asking it to debug that code.


GenAI is only as good as the prompt you provide. Here are some tips:

  • Make sure your prompt includes the names of your variables and dataframe.
  • You can paste the code or error message in as part of your prompt if you are trying to debug.
  • Ask it to use specific functions if you have a rough idea.
  • Be specific: the prompt “use indexing on a dataframe called penguins” is less likely to help you than “select the first row of data from a dataframe called penguins using indexing”.
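One way to collect the details for a prompt is to print the data frame's variable names and types, then copy them in. The sketch below uses a made-up data frame called toy as a stand-in for your own; the names column_names and object_class are only illustrative.

```r
# A small stand-in for your own data frame.
toy <- data.frame(
  species = c("Adelie", "Gentoo"),
  bill_length_mm = c(39.1, 46.1),
  stringsAsFactors = FALSE
)

# The variable names to mention in your prompt.
column_names <- names(toy)
column_names

# The kind of object you are working with.
object_class <- class(toy)
object_class

# The type of each variable, useful when describing an error.
str(toy)
```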


The university has a guide for students on the use of Gen AI in assessments. Your assessment must include a section similar to the one below.

Acknowledgement

I acknowledge the use of Copilot (https://m365.cloud.microsoft/chat) to debug R code when cleaning and analysing my data.


1.4.2 Getting help without Gen AI

Every package author writes help files for their functions. Each help page is broken down into sections (Description, Usage, Arguments etc). The last section, Examples, is very useful.

Either of the commands below will bring up the help for the function log.

?log
help(log)
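The same approach works for any function. As a sketch using base R only, the commands below look up read.csv, list its arguments, and run the Examples section for mean; the name read_csv_args is just illustrative.

```r
# Open the help page for read.csv (quotes are optional here).
help("read.csv")

# Show the arguments a function accepts without opening the help page.
args(read.csv)

# Store just the argument names, for example to check that "file" is one.
read_csv_args <- names(formals(read.csv))
read_csv_args

# Run the code from the Examples section of the help page for mean.
example("mean")
```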


When your code doesn’t work, seek help from the R community. Nine times out of ten, the question you are asking has already been answered on Stack Overflow.


Other internet resources to look out for while googling are:



Source

Adapted from R for Reproducible Scientific Analysis, licensed CC BY 4.0 by The Carpentries.