Introduction to R





On Github nlk124 / Rcourse

Introduction to R

  • Statistical programming language and environment
  • Available as free, open-source software (released under the GNU General Public License)
  • Free!
  • Competitive with, and often preferred over, commercial alternatives
  • Available on all major platforms (Windows, macOS, Linux)
  • Not just for statistics, but also general purpose programming

Why use R?

  • Flexible
  • Transparent
  • Large user base
  • Valued skill (Remember, R is a programming language)

RStudio

  • Integrated development environment (IDE) for the R language
  • AKA use R with a nice, shiny interface
  • Also free!
  • Using R without RStudio feels a bit like using only the Command Prompt (Windows) or Terminal (Mac) on your computer

Running R with RStudio

  • The screen is split into four panes:
    • Source: (Top left) Where scripts are written & saved
    • Console: (Bottom left) Where commands are run
    • Environment, History: (Top right) Where stored objects and past commands are listed
    • Files, Plots, Packages, and Help: (Bottom right) Where you browse files, view plots, manage packages, and read help

Running R with RStudio

  • Type directly into the console (best when you don't want to save the code), or type into a script and then run the current line or selection with Ctrl + Enter (Windows) or Cmd + Enter (Mac).

  • A script is a plain text file with R commands in it. This is where you save the code you are writing; the file name ends in the extension .R.

R as a calculator

  • R has many arithmetic operators

    • + Addition
    • - Subtraction
    • * Multiplication
    • / Division
    • ^ Exponentiation
    • %% Modulus (finds remainder)
    • %/% Integer division (leaves off remainder)
  • R obeys the standard order of operations

R as a calculator

Examples

7 + 4
## [1] 11
3^2
## [1] 9
10 %% 7
## [1] 3
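
A few more examples, as a quick sketch of the order of operations and of how integer division relates to the modulus:

```r
2 + 3 * 4      # multiplication happens before addition
## [1] 14
(2 + 3) * 4    # parentheses change the order
## [1] 20
10 %/% 7       # integer division drops the remainder
## [1] 1
10 %% 7        # the remainder that %/% dropped
## [1] 3
```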

R is logical

  • R also has many logical operators
    • < Less than
    • <= Less than or equal to
    • > Greater than
    • >= Greater than or equal to
    • == Exactly equal to
    • != Not equal to
    • ! NOT
    • | OR
    • & AND

Examples

7 == 4
## [1] FALSE
3 > 2
## [1] TRUE
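
The NOT, AND, and OR operators combine or flip the results of comparisons. A short sketch reusing the two comparisons above:

```r
!(7 == 4)          # NOT flips FALSE to TRUE
## [1] TRUE
3 > 2 & 7 == 4     # AND: both sides must be TRUE
## [1] FALSE
3 > 2 | 7 == 4     # OR: at least one side must be TRUE
## [1] TRUE
```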

Try It!

What is 17 multiplied by 365?

What is 13 cubed?

Is 9 to the fourth power equal to 3 times the sum of 2000 and 187?

Creating Objects

An object is the fundamental unit in R. The value of any expression can be saved as an object.

To create an object from an expression we use the assignment operator (<-). The assignment operator assigns values on the right to objects on the left.

a <- (12 + 180) * 3
a
## [1] 576

The object a now holds the value of the expression (12 + 180) * 3, which is 576. Check the Environment pane (upper right) to see it.

Assignment operator

  • Avoid using = for assignment (e.g. a = 12 + 180). It works, but best practice is to use <-, and it is a good habit to build while you're learning the language.
  • This may seem arbitrary, but it helps when you are reading someone else's code: = is also used to name function arguments, so <- makes assignments easy to spot.

Google's R Style Guide

Advanced R: Style Guide

R Tip: Comment on your code

Use # signs to comment on your script. Anything to the right of a # on the same line is ignored by R. Good scripts (and homework) have comments before every major block of code. It's surprisingly hard to remember what you did when reviewing older code without comments, and comments are particularly important when other people are reading your code.

5 + 5 # This adds five and five 
## [1] 10
# 10 + 10 this does not add ten and ten 

Expressions using objects

Objects can be combined into other, larger, and more complex objects.

a <- 8 * 10
b <- 2 * 10
d <- a * b
d
## [1] 1600
# This is equivalent to: 
d <- 8 * 10 * 2 * 10
d
## [1] 1600

Try It!

Create an object that is equal to your age. Create another object that is equal to the age of the person to your right. Find the difference between these objects.

Data structures: vectors

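  • R has about five common data structures (vectors, matrices, arrays, lists, and data frames). We will start with the simplest: vectors.
  • Vectors are one-dimensional sequences of numbers, character strings, or values stored in other objects. All elements of a vector share a single type. A vector is made using the combine function, c().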

Combining numbers into vectors

a <- c(3, 4, 5)
a
## [1] 3 4 5

Combining characters into vectors

Character strings in R need to be enclosed in quotation marks.

pets <- c("dog", "cat", "bird")
pets
## [1] "dog"  "cat"  "bird"

Combining objects into vectors

# Make objects 
a <- sqrt(4 * 7)
b <- 6 * 5
g <- 9 * 2

# Combine
d <- c(a, b, g)
d
## [1]  5.291503 30.000000 18.000000
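
One thing to keep in mind when combining values: a vector holds a single type, so mixing types makes R convert everything to the most general one. A minimal sketch (the object name mixed is just for illustration):

```r
# Mixing a number, a string, and a logical converts everything to character
mixed <- c(1, "a", TRUE)
mixed
## [1] "1"    "a"    "TRUE"
class(mixed)
## [1] "character"
```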

Vectors of regular sequences

You can use a colon (:) to create a vector of all integers from the number on the left through the number on the right, inclusive.

x <- 1:10
x
##  [1]  1  2  3  4  5  6  7  8  9 10

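You can use seq(from = , to = , by = ) to create a vector that starts at from, increases by the increment by, and stops at or before to. In the example below, the sequence ends at 19 because the next value, 21, would exceed 20.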

x <- seq(from = 1, to = 20, by = 2)
x
##  [1]  1  3  5  7  9 11 13 15 17 19
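
Two related variations, shown as a brief sketch: the colon can count down, and seq() also accepts a length.out argument when you know how many values you want rather than the increment.

```r
10:1                                    # a colon sequence can count down
##  [1] 10  9  8  7  6  5  4  3  2  1
seq(from = 0, to = 1, length.out = 5)   # ask for 5 evenly spaced values
## [1] 0.00 0.25 0.50 0.75 1.00
```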

Vector indexing: positional

You can access any element in a vector by putting its position in square brackets [ ]. Positions in R start at 1.

# Create a vector 
height <- c(76, 72, 74, 74, 78)
height
## [1] 76 72 74 74 78
height[1] # extract the 1st element in the vector 
## [1] 76
height[5] # extract the 5th element
## [1] 78

Vector indexing: positional

You can also put a - symbol in front of a position to return the vector with that element dropped. The original vector is not changed.

height <- c(76, 72, 74, 74, 78)

height[-1]
## [1] 72 74 74 78
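
The index inside the brackets can itself be a vector, which selects or drops several elements in one step. A short sketch using the same height vector:

```r
height <- c(76, 72, 74, 74, 78)
height[2:4]        # elements 2 through 4
## [1] 72 74 74
height[-c(1, 5)]   # drop the first and last elements
## [1] 72 74 74
height             # the original vector is unchanged
## [1] 76 72 74 74 78
```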

Vector indexing: named

You can assign names to each element of the vector, and then extract the element by indexing based on the name.

# Create a vector with named elements 
temp <- c(monday = 28.1, tuesday = 28.5, wednesday = 29.0, thursday = 30.1, friday = 30.2)
temp
##    monday   tuesday wednesday  thursday    friday 
##      28.1      28.5      29.0      30.1      30.2
temp["wednesday"]
## wednesday 
##        29
temp[3]
## wednesday 
##        29
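
As a small sketch building on the example above, names() lists the element names, and a character vector of names extracts several elements together:

```r
temp <- c(monday = 28.1, tuesday = 28.5, wednesday = 29.0, thursday = 30.1, friday = 30.2)
names(temp)                    # the names attached to each element
## [1] "monday"    "tuesday"   "wednesday" "thursday"  "friday"
temp[c("monday", "friday")]    # extract two elements by name
## monday friday
##   28.1   30.2
```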

Vector indexing: logical

You can extract elements in a vector that meet specific criteria based on a logical expression.

y <- 5:50
y
##  [1]  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
## [24] 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
y[y <= 10]  # extract all elements less than or equal to 10
## [1]  5  6  7  8  9 10
y[y < 10 & y != 5]  # extract all elements less than 10 that are not equal to 5
## [1] 6 7 8 9
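
It can help to look at what the logical expression produces on its own: a vector of TRUE and FALSE values, one per element, which the brackets then use to keep only the TRUE positions. A minimal sketch with a made-up vector called scores:

```r
scores <- c(2, 9, 14)   # hypothetical example values
scores > 5              # one TRUE or FALSE per element
## [1] FALSE  TRUE  TRUE
scores[scores > 5]      # keep only the elements where the test is TRUE
## [1]  9 14
which(scores > 5)       # the positions where the test is TRUE
## [1] 2 3
```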

Try It!

What are the elements at the 9th and 12th positions of the vector seq(1, 27, 0.5)?

Bonus! Can you extract both elements at once?

Create the vector 3:33. How many elements are greater than or equal to 17?

Functions

  • A function is a stored object that performs a task given some inputs (called arguments). R has many functions already available, but you can also write your own functions.

  • Try using the tab key while entering arguments in functions to discover an important feature of RStudio.

  • Functions are used in the format: name_of_function(inputs)

  • The output of a function can be saved to an object: output <- name_of_function(inputs)
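
To make the idea concrete, here is a minimal sketch of writing and calling your own function. The function name fahrenheit_to_celsius and its argument temp_f are made up for this illustration; they are not part of R.

```r
# Hypothetical function: converts a temperature from Fahrenheit to Celsius
fahrenheit_to_celsius <- function(temp_f) {
  (temp_f - 32) * 5 / 9
}

# Save the output of the function to an object
boiling <- fahrenheit_to_celsius(212)
boiling
## [1] 100
```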

Functions

It is not necessary to name arguments explicitly, but it is often helpful. Unnamed arguments are matched by position, while named arguments can be given in any order.

seq(1, 10, 1)
##  [1]  1  2  3  4  5  6  7  8  9 10
seq(from = 1, to = 10, by = 1)
##  [1]  1  2  3  4  5  6  7  8  9 10
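
A quick sketch of why naming helps: once the arguments are named, their order no longer matters.

```r
seq(by = 2, to = 9, from = 1)   # named arguments in a scrambled order
## [1] 1 3 5 7 9
identical(seq(1, 9, 2), seq(by = 2, to = 9, from = 1))
## [1] TRUE
```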

Functions

Use sum() to take the sum of all elements in a vector:

sum(c(3, 4, 5))
## [1] 12

Use mean() to take the mean of a vector:

mean(seq(5, 100, 5))
## [1] 52.5

Functions

Functions can also take a saved object as their input:

x <- seq(5, 100, 5)

# use the vector x as the input to the function
mean(x)
## [1] 52.5

R Tip: The help system

  • All functions come with a help file.

  • Help files describe what the function does and how it works, and they provide examples at the very bottom.

help(mean)

  • You can also use ? before a function name to view the help screen.

?mean  # Same as help(mean)
?sort  # Same as help(sort)

R Tip: The Help Screen

  • Many function names are easy to guess, although they are often abbreviated to save time and space (e.g. sd() for standard deviation).

  • Use ?? to search for functions; e.g. search for any function whose help page contains the word "robust"

??robust

Note: ?? only searches packages that are already installed.

Try It!

What is the median of 34, 16, 105, 27? Remember: functions are often named intuitively.

What does the function range() do? What example does its help file give?

Bonus! Is mean(4, 5) different from mean(c(4, 5))?

Packages

  • We will be exploring functions in much greater detail throughout this course. (Including writing your own functions!)

  • Functions are kept inside packages, some of which come pre-installed with R. Others must be downloaded.

Packages

There are tons of R packages - 7742 at the time of writing, and the number keeps growing!

Check the List of R Packages and search with your favorite keyword, for example:

Ecology, paleo, dispersal, population, time series, phylogenetic, community, Bayes

Installing Packages

Often you will need to install a package to access a particular set of functions.

# Install a new package
install.packages("picante") 

Remember to surround the package name in quotation marks.

Loading Packages

Installing a package just downloads it to your computer.

To actually use a function from an installed package, you have to load the package first. This lets R know which packages you need, so it does not waste time and memory loading every function you have installed.

# Two ways to load packages:
library(ggplot2)
require(ggplot2)

Note: no quotation marks needed
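
The two loading functions behave differently when a package is missing: library() stops with an error, while require() returns FALSE with a warning. The sketch below checks availability without stopping the script. It uses the stats package, which ships with R, so that it runs offline; the name thisPackageDoesNotExist is made up for illustration and is assumed not to be installed.

```r
# requireNamespace() reports whether a package is installed, without attaching it
has_stats <- requireNamespace("stats", quietly = TRUE)
has_stats
## [1] TRUE

# require() returns FALSE instead of stopping when a package is missing.
# "thisPackageDoesNotExist" is a made-up name, assumed not to be installed.
found <- suppressWarnings(
  require("thisPackageDoesNotExist", character.only = TRUE, quietly = TRUE)
)
found
## [1] FALSE
```

At the top of a script, library() is usually the safer choice, because a missing package stops the script immediately rather than causing a confusing failure later.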

R Tip: Loading Packages

Good scripts (and homework) have a series of library() or require() statements at the top of the script.

Try It!

Search & find an interesting package. What is it? What is one function included in the package?

Install the package to your computer.

List of R Packages

The R User Community

Stack Overflow

R-Bloggers

R Mailing Lists

R for cats

Questions?

Worksheet

Answers