User Defined Functions in R Part I

In this tutorial, you will learn what a user-defined function in R is and how to create one. You will also learn how a function is evaluated in R.

What are functions in R?

Functions are the building blocks of every programming language. A function packages a set of instructions that you want to use repeatedly to perform a specific task. There are two types of functions in R: built-in functions and user-defined functions.

User Defined Functions in R

User-defined functions are functions written by the user to meet a specific requirement. Once created, they can be used like any built-in function in R.

Structure of a user-defined function in R

In R, a function is created with the keyword function. Functions are stored as R objects, and their class is "function".

The syntax of function is as follows:

function_name <- function(arg1, arg2, ...) {
  expression_1
  expression_2
  ...
  expression_n
}

where

  • function_name: name given to the function.
  • arg1, arg2: arguments of the function.
  • {}: braces that enclose the body of the function.
  • expression_i: any valid R code.
  • Function body: everything between {} is the body of the function.
  • Return value: the value of the last evaluated expression in the body, or the value passed to return().

The function is stored in the R environment as an object named function_name. A function takes zero or more inputs, known as arguments, and arguments can have default values.
All the expressions that carry out the function's operations make up the body of the function.

Note that a body consisting of a single expression does not require braces {}, whereas a compound body (more than one expression) must be surrounded by braces {}.
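
As a minimal sketch of the two forms, the functions below (cube() and cube_and_shift(), both hypothetical names chosen for illustration) show a single-expression body without braces and a compound body with braces.

```r
# Single-expression body: braces are optional
cube <- function(v) v^3

# Compound body: braces group several expressions;
# the last evaluated expression is the return value
cube_and_shift <- function(v, shift = 1) {
  cubed <- v^3
  cubed + shift
}

cube(2)            # 8
cube_and_shift(2)  # 9
class(cube)        # "function"
```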

Evaluation of Functions in R

The following steps are involved in the evaluation of a function in R:

  • A set of variables corresponding to the arguments of the function is temporarily created.
  • These variables are used to evaluate the body of the function.
  • All the temporary variables are removed at the end.
  • The computed value is returned.
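
The steps above can be sketched with a small example. The function double_then_add() and its variable names are hypothetical, used only for illustration; the point is that variables created during the call are not left behind in the calling environment.

```r
# Hypothetical function used only to illustrate evaluation
double_then_add <- function(val, inc = 1) {
  tmp_doubled <- 2 * val   # temporary variable, local to this call
  tmp_doubled + inc        # last expression: the returned value
}

out <- double_then_add(5)
out                                      # 11
# The local variable is gone once the call has finished
exists("tmp_doubled", inherits = FALSE)  # FALSE
```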

Let us understand user-defined functions with the help of a very simple function that computes the area of a circle given its radius.

Example 1: Simple Function in R

Suppose we wish to write a function to compute the area of a circle with radius $r$, which is given by $A=\pi r^2$.

# one line function
area.circle <- function(r)  pi * r ^ 2
## calling function with r=4
area.circle(4)  
[1] 50.26548

The evaluation of area.circle(4) takes place as follows:

  • Create a temporary variable r with value 4.
  • Use this value to compute $\pi r^2$.
  • Remove the temporary variable.
  • Return the result 50.2654825.

Because arithmetic in R is vectorized, the same function also accepts a vector of radii:

R <- 1:4
area.circle(R)
[1]  3.141593 12.566371 28.274334 50.265482

If you do not supply the argument when calling the function, you will get an error message.

#area.circle()
## Error in area.circle() : argument "r" is missing, with no default

Let us define the same function with a default value of r=1.

area.circle <- function(r = 1)  pi * r ^ 2

The default value of r is 1. If you call the function without the argument, R evaluates it with the default value r=1 and returns the result.

area.circle()  
[1] 3.141593

If you specify a value for the argument, R uses that value; otherwise R uses the default value.

area.circle(4) 
[1] 50.26548
area.circle(r=4)
[1] 50.26548

Components of function in R

All R functions have three components:

  • the body(), the R code inside the function.
  • the formals(), the list of arguments that controls how you can call the function.
  • the environment(), the location of the function's variables.

Let us check the components of the user-defined function area.circle() one by one.

# display body of the function
body(area.circle)
pi * r^2
# display formals of the function
formals(area.circle)
$r
[1] 1
# display environment of the function
environment(area.circle)
<environment: R_GlobalEnv>

The return() function in R

By default, a function in R returns the value of the last evaluated expression in its body. You can also use the return() function to return a value explicitly. Note that return() accepts a single object; to return several values, combine them in a vector or a list.

Example 2: Using return() in a function

Suppose we want to modify the function area.circle() so that it returns the message "radius must be positive" if the argument r is less than or equal to zero, and the area of the circle otherwise.

Let us modify our function area.circle() as follows:

area.circle <- function(r = 1) {
  # return early with a message if any radius is not positive
  if (any(r <= 0)) {
    return("radius must be positive")
  }
  result <- pi * r^2
  return(result)
}
area.circle(-3)
[1] "radius must be positive"
area.circle(4)
[1] 50.26548
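
The difference between an explicit return() and the implicit return of the last expression can be sketched as follows. The helpers safe_ratio() and min_max() are hypothetical names introduced only for this illustration.

```r
# Hypothetical helper: explicit early return() for invalid input,
# implicit return of the last expression otherwise
safe_ratio <- function(num, den) {
  if (den == 0) {
    return(NA_real_)   # leave the function immediately
  }
  num / den            # last evaluated expression is returned
}

# return() takes one object, so several values are wrapped in a list
min_max <- function(v) {
  return(list(Min = min(v), Max = max(v)))
}

safe_ratio(6, 3)          # 2
safe_ratio(1, 0)          # NA
min_max(c(4, 9, 1))$Max   # 9
```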

Argument Matching in R

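When a function is called, the supplied arguments can be matched to the function's formal arguments by name or by position. An argument supplied as name = value is a named argument; an argument supplied without a name is a positional argument. Whether a formal argument has a default value plays no role in matching.

The order of matching is

  • named arguments (exact names first, then unique partial names)
  • positional arguments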

Example 3: Argument matching in R

Sometimes we do not want to give an argument a default value, but we also do not want a missing argument to cause an error. In that case we can use the missing() function to test whether the argument was supplied.

myfun.0 <- function(x, y, z = 2) {
  if (missing(y)) {
    result <- x + z
  } else{
    result <- x + y + 2 * z
  }
  return(result)
}

Let us evaluate the user-defined function myfun.0 using different arguments.

# x=3, y is missing, default z is 2
myfun.0(3) 
[1] 5
# x=3, y=4,default z is 2
myfun.0(3, 4) 
[1] 11

Note that the user can override the default value.

# value of z is taken as 5
myfun.0(3, 4, z = 5) 
[1] 17

In the above call, the arguments are taken as x=3, y=4 and z=5.

# x and y are matched by name; the remaining 4 is matched by position to z
myfun.0(y = 4, 4, x = 3) 
[1] 15
# x is matched by name; 5 is matched by position to y; z takes its default 2
myfun.0(5, x = 3) 
[1] 12

Note that the order of matching for function arguments is (a) named arguments, then (b) positional arguments. That is, the named arguments are matched first, and the remaining unnamed arguments are then assigned, in order, to the formal arguments that are still unmatched.
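
The matching rules can also be tried on a small function of their own. This is only a sketch: power_sum() and its argument names are hypothetical and are not part of the examples above.

```r
# Hypothetical function with three formal arguments
power_sum <- function(base, exponent = 2, offset = 0) {
  base^exponent + offset
}

power_sum(2, 3)                    # by position: base = 2, exponent = 3 -> 8
power_sum(exponent = 3, base = 2)  # by name, in any order -> 8
power_sum(off = 1, 2)              # offset by partial name; 2 goes to base -> 5
power_sum(3, offset = 1)           # base by position, exponent defaults to 2 -> 10
```

Partial names work here because "off" is a unique prefix of offset; spelling names out in full is usually the safer habit.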

Example 4: Function with named results

The functions below illustrate the use of named results. You can give names to the values returned by a function.

Function with named results as a list

Let us define a simple function to calculate area and perimeter of a circle. The area of a circle with radius $r$ is $A=\pi r^2$ and the perimeter of a circle is $P=2\pi r$.

area_perimeter_circle <- function(r=1) {
  area <- pi*r^2
  perimeter <- 2*pi*r
  result <- list(Area = area,Perimeter =perimeter)
  return(result)
}

Let us evaluate the function area_perimeter_circle() for radius $r=5$.

MyResult<-area_perimeter_circle(5)
MyResult
$Area
[1] 78.53982

$Perimeter
[1] 31.41593
class(MyResult)
[1] "list"
names(MyResult)
[1] "Area"      "Perimeter"
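
Individual elements of a named list can be extracted with $ or [[ ]]. The sketch below uses a separate, hypothetical function rect_summary() so that it runs on its own.

```r
# Hypothetical function returning a named list
rect_summary <- function(width = 1, height = 1) {
  list(Area = width * height, Perimeter = 2 * (width + height))
}

res <- rect_summary(3, 4)
res$Area            # 12
res[["Perimeter"]]  # 14
res["Area"]         # single brackets keep the list structure (a list of length 1)
```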

Function with named results as a vector

Let us modify the function area_perimeter_circle() defined earlier to return the named result as a vector.

area_perimeter_circle2 <- function(r=1) {
  area <- pi*r^2
  perimeter <- 2*pi*r
  result <- c(Area = area,Perimeter =perimeter)
  return(result)
}

Let us evaluate the function area_perimeter_circle2() for radius $r=5$.

MyResult2<-area_perimeter_circle2(5)
MyResult2
     Area Perimeter 
 78.53982  31.41593 
class(MyResult2)
[1] "numeric"
names(MyResult2)
[1] "Area"      "Perimeter"
# display the Area from MyResult2
MyResult2["Area"]
    Area 
78.53982 
# display the Perimeter from MyResult2
MyResult2[2]
Perimeter 
 31.41593 

Some examples of user-defined functions in R

Example 5: Harmonic mean user-defined function

Let $x_i, i=1,2, \cdots , n$ be $n$ positive observations. The harmonic mean of these observations is denoted by $HM$ and is defined as

$$HM = \dfrac{n}{\sum_{i=1}^{n} \dfrac{1}{x_i}}$$

Thus, the harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the observations.
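
As a quick numerical check of this definition, the two forms can be compared directly. The values in obs are made up for illustration.

```r
obs <- c(1, 2, 4)                         # illustrative positive observations

hm_formula    <- length(obs) / sum(1 / obs)  # n / sum(1 / x_i)
hm_reciprocal <- 1 / mean(1 / obs)           # reciprocal of the mean of reciprocals

hm_formula                               # 3 / 1.75 = 1.714286
all.equal(hm_formula, hm_reciprocal)     # TRUE
```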

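Harmonic.Mean <- function(x, na.rm = TRUE) {
  # with na.rm = FALSE, any missing value makes the result NA
  if (!na.rm && anyNA(x)) {
    avg <- NA
  } else {
    # reciprocal of the mean of the reciprocals, ignoring NAs
    avg <- 1 / mean(1 / x, na.rm = TRUE)
  }
  return(avg)
}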
x <- c(10,25,36,23,NA,17)
Harmonic.Mean(x)
[1] 18.51306

Example 6: User-defined function to calculate median

Write a function to compute the median of sample observations.

The sample median $M$ is the middle observation of the sorted values. If $n$ is odd, the median is the $\bigg(\frac{n+1}{2}\bigg)^{th}$ observation of the sorted values; if $n$ is even, the median is the average of the $\bigg(\frac{n}{2}\bigg)^{th}$ and $\bigg(\frac{n}{2}+1\bigg)^{th}$ observations of the sorted values.
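
The two cases of this definition can be worked through by hand on small samples. The data below are made up for illustration, and the results are compared with the built-in stats::median().

```r
odd_obs <- c(7, 3, 9, 1, 5)          # n = 5 (odd)
sorted_odd <- sort(odd_obs)          # 1 3 5 7 9
n_odd <- length(odd_obs)
med_odd <- sorted_odd[(n_odd + 1) / 2]   # 3rd sorted value: 5

even_obs <- c(7, 3, 9, 1)            # n = 4 (even)
sorted_even <- sort(even_obs)        # 1 3 7 9
n_even <- length(even_obs)
# average of the 2nd and 3rd sorted values: (3 + 7) / 2 = 5
med_even <- (sorted_even[n_even / 2] + sorted_even[n_even / 2 + 1]) / 2

# Both agree with the built-in median
med_odd == stats::median(odd_obs)    # TRUE
med_even == stats::median(even_obs)  # TRUE
```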

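# note: this definition masks the built-in median() in the global environment
median <- function(x, na.rm = FALSE) {
  if (na.rm) {
    # remove missing values
    x <- x[!is.na(x)]
  } else if (any(is.na(x))) {
    return(NA)
  }
  n <- length(x)
  # integer index of the middle (odd n) or lower-middle (even n) value
  half <- (n + 1) %/% 2
  if (n %% 2 == 1) {
    # if n is odd
    sort(x)[half]
  } else {
    # if n is even
    sum(sort(x)[c(half, half + 1)]) / 2
  }
}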
x<-c(175, 176, 173, 175, 174, 173, 173, 176, 173, 179)
median(x)
[1] 174.5

Example 7: Function to simulate an experiment of rolling a die

Suppose you want to simulate an experiment of rolling a die 10000 times. Write a function to simulate a single roll of a die, and use the simulation to calculate the frequency and relative frequency of each face.

roll.die<-function(){
  die<-1:6
  sample(die,size=1)
}
roll.die()
[1] 5
roll.die()
[1] 3

Every time you execute the function roll.die(), it returns the number appearing on the top face of the die.

Note that the sample() function draws a simple random sample with or without replacement. Each call to roll.die() therefore returns a new value between 1 and 6 (inclusive).

Use the function to simulate the experiment 10000 times. Get the frequency and relative frequency table.

n <- 10000
# Initialize the sim.result
sim.result <- numeric(n)
for (i in seq_len(n)) {
  sim.result[i] <- roll.die()
}
# use table() function to get frequencies
table(sim.result) 
sim.result
   1    2    3    4    5    6 
1674 1678 1651 1692 1679 1626 

The relative frequency of each face is its frequency divided by the total number of rolls.

# to get relative frequencies
table(sim.result)/n 
sim.result
     1      2      3      4      5      6 
0.1674 0.1678 0.1651 0.1692 0.1679 0.1626 
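
The same experiment can also be sketched without an explicit loop, since sample() can draw all the rolls in one call. The seed and the names n_rolls, rolls, freq and rel_freq are assumptions made for this illustration, so the counts will differ from the table above.

```r
set.seed(123)   # assumed seed, only so the run is reproducible
n_rolls <- 10000
# replace = TRUE makes each roll independent of the others
rolls <- sample(1:6, size = n_rolls, replace = TRUE)

freq <- table(rolls)          # frequency of each face
rel_freq <- prop.table(freq)  # relative frequency (freq / total)

freq
round(rel_freq, 4)
```

Drawing all rolls at once is typically faster than calling a function 10000 times inside a for loop, and prop.table() avoids dividing by n by hand.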

Let us plot the relative frequencies obtained from the 10000 simulated rolls.

plot(table(sim.result)/n, col="blue", lwd=3, xlab="Face of the die", ylab="Relative frequency")
[Figure: relative frequency of rolling a die]

Endnote

In this tutorial, you learned about functions in R and how to define and use user-defined functions.

Hopefully you enjoyed this tutorial on user-defined functions in R.

Raju is a VRCBuzz co-founder who is passionate about making every day the greatest day of life. A nerd at heart with a background in Statistics, he oversees day-to-day operations and focuses on strategic planning and growth of VRCBuzz products and services. He has more than 25 years of teaching experience and gains energy by helping people reach their goals and align with their passions. Raju holds a Ph.D. in Statistics and spends his leisure time reading about and implementing AI and machine learning concepts using statistical models.
