Normal Distribution Probabilities Using R

In this tutorial, you will learn how to use the dnorm(), pnorm(), qnorm() and rnorm() functions in R to compute densities, cumulative probabilities and quantiles of the Normal distribution, and to generate random samples from it.

Before discussing the R functions for the Normal distribution, let us review what the Normal distribution is.

Normal Distribution

The Normal distribution is a continuous probability distribution. It has found applications in many fields.

A continuous random variable $X$ is said to have a normal distribution with parameters $\mu$ and $\sigma^2$ if its probability density function is given by

$$ f(x;\mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2\sigma^2}(x-\mu)^2}, \quad -\infty< x<\infty,\; -\infty<\mu<\infty,\; \sigma^2>0 $$

where $e= 2.71828...$ and $\pi = 3.1415926...$.

The parameter $\mu$ is the mean and acts as the location parameter (it shifts the density curve along the horizontal axis). The standard deviation $\sigma$ acts as the scale parameter (it stretches or compresses the density curve), and $\sigma^2$ is the variance.

In notation it can be written as $X\sim N(\mu,\sigma^2)$.
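
As a quick check of the density formula above, the following minimal sketch evaluates it directly and compares the result with R's built-in dnorm(). The helper name normal_pdf and the test values are illustrative assumptions, not part of the original example.

```r
# Hypothetical helper: the Normal density written out from the formula
normal_pdf <- function(x, mu, sigma) {
  1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
}

# Arbitrary illustrative values
x_vals  <- c(-1, 0, 2.5)
manual  <- normal_pdf(x_vals, mu = 1, sigma = 2)
builtin <- dnorm(x_vals, mean = 1, sd = 2)

# TRUE when the two agree up to floating-point tolerance
isTRUE(all.equal(manual, builtin))
```

Note that dnorm() takes the standard deviation (sd), not the variance, as its second parameter.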

Normal probabilities using dnorm() function in R

For a continuous probability distribution, the density is the value of the probability density function at $x$ (i.e., $f(x)$). The density is not itself a probability; probabilities are areas under the density curve.

The syntax to compute the probability density function for Normal distribution using R is

dnorm(x,mean=0, sd = 1)

where

  • x : the value(s) of the variable,
  • mean : mean of Normal distribution (location parameter),
  • sd : standard deviation of Normal distribution (scale parameter).

The dnorm() function gives the density for given value(s) x, mean and sd.
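
Since dnorm() accepts a vector for x, several densities can be computed in one call. The sketch below also illustrates, under the assumption of arbitrary example values, that a general Normal density equals the standard Normal density of the standardized value divided by sd.

```r
# dnorm() accepts a vector of x values
x_vals <- c(400, 544, 700)
dens   <- dnorm(x_vals, mean = 544, sd = 103)

# Same densities via the standard Normal: f(x) = phi((x - mean) / sd) / sd
z        <- (x_vals - 544) / 103
dens_std <- dnorm(z) / 103

# TRUE when both approaches agree up to floating-point tolerance
isTRUE(all.equal(dens, dens_std))
```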

Numerical Problem for Normal Distribution

To understand the four functions dnorm(), pnorm(), qnorm() and rnorm(), let us take the following numerical problem.

Normal Distribution Example

The GRE is widely used to help predict the performance of applicants to graduate schools. The range of possible scores on a GRE is 200 to 900. The psychology department at a university finds that the students in their department have scores with a mean of 544 and standard deviation of 103.

(a) Find the value of the density function at $x=550$.
(b) Plot the graph of Normal probability distribution.
(c) Find the probability that a student in psychology department has a score less than 480.
(d) Find the probability that a student in psychology department has a score at least 460.
(e) Find the probability that a student in the psychology department has a score between 480 and 730.
(f) Plot the graph of cumulative Normal probabilities.
(g) What is the value of $c$, if $P(X\leq c) \geq 0.80$?
(h) Simulate 1000 Normally distributed random values with $\mu= 544$ and $\sigma = 103$.

Let $X$ denote the GRE score. We are given that $X\sim N(544, 103^2)$.

Example 1: How to use dnorm() function in R?

To find the value of the density function at $x=550$ we need to use dnorm() function.

First let us define the given parameters as

# mean of distribution
mu <- 544
# standard deviation of distribution
sigma <- 103

The probability density function of $X$ is

$$ \begin{aligned} f(x)&= \frac{1}{103\sqrt{2\pi}}e^{-\frac{1}{2}\big(\frac{x-544}{103}\big)^2},\\ &\quad\text{for } -\infty < x < \infty. \end{aligned} $$

For part (a), we need to find the density function at $x=550$. That is $f(550)$.

(a) The value of the density function at $x=550$ is

$$ \begin{aligned} f(550)&=\frac{1}{103\sqrt{2\pi}}e^{-\frac{1}{2}\big(\frac{550-544}{103}\big)^2}\\ &= 0.0038667 \end{aligned} $$

The above density can be calculated in R using dnorm(550,mean=544,sd=103).

# Compute the Normal density at x = 550
result1 <- dnorm(550,mean=mu,sd=sigma)
result1
[1] 0.00386666
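
The value above is the height of the density curve, not a probability. As a rough illustration, the sketch below compares it with the probability of a score in the unit-width interval [549.5, 550.5], computed once with integrate() and once with pnorm(). The interval is an assumption chosen only for illustration.

```r
mu    <- 544
sigma <- 103

# Density at x = 550 (a height of the curve, not a probability)
dens_550 <- dnorm(550, mean = mu, sd = sigma)

# Probability of a score in [549.5, 550.5]: area under the density
area <- integrate(dnorm, lower = 549.5, upper = 550.5,
                  mean = mu, sd = sigma)$value

# The same probability from the CDF
area_cdf <- pnorm(550.5, mean = mu, sd = sigma) -
  pnorm(549.5, mean = mu, sd = sigma)

c(density = dens_550, area = area, area_cdf = area_cdf)
```

Because the interval has width 1 and the curve is nearly flat there, the area is numerically close to the density at 550. For wider intervals the two would differ.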

Example 2: Visualize Normal probability distribution

Using the dnorm() function we can compute Normal densities for given x, mean and sd. To plot the probability density function of the Normal distribution, we need to create a sequence of x values and compute the corresponding densities.

# create a sequence of x values
x <- seq(200,900, by=10)
## Compute the Normal pdf for each x
px <- dnorm(x,mean=mu,sd=sigma)

(b) Visualizing Normal Distribution with dnorm() function and plot() function in R:

The probability density function of the Normal distribution with mean 544 and standard deviation 103 can be visualized using the plot() function as follows:

## Plot the Normal probability dist
plot(x,px,type="l",xlim=c(200,900),ylim=c(0,max(px)),
     lwd=3, col="darkred",ylab="f(x)",
     main=expression(paste("PDF of Normal with ",
        mu,"=544 and ",sigma,"=103")))
[Figure: PDF of Normal Dist]

Normal cumulative probability using pnorm() function in R

The syntax to compute the cumulative distribution function (CDF) for Normal distribution using R is

pnorm(q,mean=0, sd=1)

where

  • q : the value(s) of the variable,
  • mean : mean of Normal distribution (location parameter),
  • sd : standard deviation of Normal distribution (scale parameter).

The pnorm() function returns the cumulative probability $P(X\leq q)$ of the Normal distribution for the given value(s) of q, mean and sd.
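
The cumulative probability is the area under the density curve to the left of q. A minimal sketch of this relationship, using the standard Normal and an arbitrarily chosen q = 1, compares a numerical integral of dnorm() with pnorm():

```r
# Area under the standard Normal density from -Inf up to q = 1
q    <- 1
area <- integrate(dnorm, lower = -Inf, upper = q)$value

# The CDF value reported by pnorm()
cdf <- pnorm(q)

# The two values agree up to numerical integration error
c(area = area, cdf = cdf)
```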

Example 3: How to use pnorm() function in R?

In the above example, for part (c), we need to find the probability $P(X < 480)$. Because $X$ is continuous, $P(X < 480) = P(X\leq 480)$, which is what pnorm() returns.

(c) The probability that a student in the psychology department has a score less than 480 is

$$ \begin{aligned} P(X < 480)&=P\bigg(\frac{X-\mu}{\sigma} < \frac{480-544}{103}\bigg)\\ &= P(Z < -0.621)\\ &=0.2671816 \end{aligned} $$

## Compute cumulative Normal probability
result2 <- pnorm(480,mean=mu,sd=sigma)
result2
[1] 0.2671816
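
The standardization step in the hand calculation can be reproduced in R. The sketch below computes the z-score and checks that the standard Normal CDF gives the same probability as the direct call:

```r
mu    <- 544
sigma <- 103

# z-score of 480
z <- (480 - mu) / sigma

# P(X < 480) computed two ways
p_direct <- pnorm(480, mean = mu, sd = sigma)
p_std    <- pnorm(z)

c(z = z, p_direct = p_direct, p_std = p_std)
```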

Example 4: How to use pnorm() function in R?

In the above example, for part (d), we need to find the probability $P(X \geq 460)$.

(d) The probability that a student in psychology department has a score at least 460 is

$$ \begin{aligned} P(X \geq 460) &= 1- P(X < 460)\\ &=1-P\bigg(\frac{X-\mu}{\sigma} < \frac{460-544}{103}\bigg)\\ &= 1- P(Z < -0.816)\\ &= 1- 0.2073834\\ &=0.7926166. \end{aligned} $$

To calculate the probability that a random variable $X$ is greater than a given number, one can use the argument lower.tail=FALSE in the pnorm() function.

The above probability can therefore be calculated using pnorm() with lower.tail=FALSE as

$P(X \geq 460)$= pnorm(460,mean=544,sd=103,lower.tail=FALSE)

or by using complementary event as

$P(X \geq 460) = 1- P(X\leq 460)$= 1- pnorm(460,mean=544,sd=103)

# compute cumulative Normal probabilities
# with lower.tail False
pnorm(460,mean=mu,sd=sigma,lower.tail=FALSE)
[1] 0.7926166
# Using complementary event
1-pnorm(460,mean=mu,sd=sigma)
[1] 0.7926166
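
For this example both approaches give the same answer. Far out in the tail, however, the complement form can lose all precision, because pnorm() rounds to exactly 1 in double precision. The sketch below uses the standard Normal with z = 10, a value chosen only to make the effect visible:

```r
# Upper-tail probability far out in the tail of the standard Normal
z <- 10

# pnorm(10) rounds to 1 in double precision, so the difference is 0
naive <- 1 - pnorm(z)

# Computed directly in the upper tail, so the tiny value survives
stable <- pnorm(z, lower.tail = FALSE)

c(naive = naive, stable = stable)
```

This suggests preferring lower.tail=FALSE whenever very small upper-tail probabilities matter.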

Example 5: How to use pnorm() function in R?

One can also use pnorm() function to calculate the probability that the random variable $X$ is between two values.

(e) The probability that a student in the psychology department has a score between $480$ and $730$ is

$$ \begin{aligned} P(480 \leq X\leq 730) &=P\bigg(\frac{480-544}{103}\leq \frac{X-\mu}{\sigma} \leq \frac{730-544}{103}\bigg)\\ &=P\bigg(-0.621 \leq Z \leq 1.806\bigg)\\ &= P(Z < 1.806) -P(Z < -0.621)\\ &=0.9645-0.2672\\ &= 0.6973 \end{aligned} $$

The above probability can be calculated using pnorm() function as follows:

# P(X <= 730)
a <- pnorm(730,mean=mu,sd=sigma)
# P(X <= 480)
b <- pnorm(480,mean=mu,sd=sigma)
# P(480 <= X <= 730)
result3 <- a - b
result3
[1] 0.6973455
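
Because pnorm() is vectorized, the same interval probability can be obtained in a single call. The helper prob_between below is a hypothetical convenience function, not part of base R:

```r
mu    <- 544
sigma <- 103

# Hypothetical helper: P(lower <= X <= upper) for a Normal variable
prob_between <- function(lower, upper, mean, sd) {
  # pnorm() returns both CDF values; diff() subtracts the first from the second
  diff(pnorm(c(lower, upper), mean = mean, sd = sd))
}

result <- prob_between(480, 730, mean = mu, sd = sigma)
result
```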

Example 6: Visualize the cumulative Normal probability distribution

Using pnorm() function we can compute Normal cumulative probabilities (CDF) for given x, mean and sd. To plot the CDF of Normal distribution, we need to create a sequence of x values and compute the corresponding cumulative probabilities.

# create a sequence of x values
x <- seq(200,900, by=10)
## Compute the Normal CDF for each x
Fx <- pnorm(x,mean=mu,sd=sigma)

(f) Visualizing Normal Distribution with pnorm() function and plot() function in R:

The cumulative probability distribution of Normal distribution with given x, mean and sd can be visualized using plot() function as follows:

## Plot the Normal CDF
plot(x,Fx,type="l",xlim=c(200,900),ylim=c(0,1),
     lwd=3, col="darkred",ylab="F(x)",
     main=expression(paste("CDF of Normal with ",
        mu,"=544 and ",sigma,"=103")))
[Figure: CDF of Normal Dist]

Normal Distribution Quantiles using qnorm() in R

The syntax to compute the quantiles of Normal distribution using R is

qnorm(p,mean=0,sd=1)

where

  • p : the value(s) of the probabilities,
  • mean : mean of Normal distribution (location parameter),
  • sd : standard deviation of Normal distribution (scale parameter).

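The function qnorm(p,mean,sd) gives the $100p^{th}$ percentile (the quantile of order $p$) of the Normal distribution for the given values of p, mean and sd.

The quantile of order $p$ is the smallest value $x$ of the Normal random variable $X$ such that $P(X\leq x) \geq p$. Because the Normal CDF is continuous and strictly increasing, this is the unique $x$ with $P(X\leq x) = p$.

The qnorm() function is the inverse of the pnorm() function; that is, it is the inverse cumulative distribution function of the Normal distribution.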
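
The inverse relationship can be checked with a round trip: convert probabilities to quantiles with qnorm() and back with pnorm(). The probabilities below are arbitrary illustrative values for the standard Normal:

```r
# Probabilities strictly between 0 and 1
p <- c(0.05, 0.25, 0.50, 0.80, 0.975)

# Quantiles of the standard Normal, then back to probabilities
q      <- qnorm(p)
p_back <- pnorm(q)

# TRUE when the round trip recovers the original probabilities
isTRUE(all.equal(p, p_back))
```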

Example 7: How to use qnorm() function in R?

In part (g), we need to find the value of $c$ such that $P(X\leq c) \geq 0.80$. That is, we need to find the $80^{th}$ percentile of the given Normal distribution.

mu <- 544
sigma <- 103
prob <- 0.80
# compute the quantile for Normal dist
qnorm(prob,mean=mu, sd=sigma)
[1] 630.687

The $80^{th}$ percentile of given Normal distribution is 630.6869871.
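
The same percentile can be obtained by hand from the standard Normal quantile via $c = \mu + \sigma z$. The sketch below does this and confirms that plugging $c$ back into pnorm() returns 0.80; the variable names are illustrative.

```r
mu    <- 544
sigma <- 103

# 80th percentile via the standard Normal quantile: c = mu + sigma * z
z     <- qnorm(0.80)
c_val <- mu + sigma * z

# Should match qnorm() called with mean and sd directly
c_direct <- qnorm(0.80, mean = mu, sd = sigma)

# Plugging c back into the CDF should return 0.80
check <- pnorm(c_val, mean = mu, sd = sigma)

c(c_val = c_val, c_direct = c_direct, check = check)
```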

Visualize the quantiles of Normal Distribution

The quantiles of Normal distribution with given p, mean=mu and sd=sigma can be visualized using plot() function as follows:

p <- seq(0,1,by=0.01)
qx <- qnorm(p,mean=mu,sd=sigma)
# Plot the Quantiles of Normal  dist
plot(p,qx,type="l",lwd=2,col="darkred",
     ylab="quantiles",
     main=expression(paste("Quantiles of Normal with ",
        mu,"=544 and ",sigma,"=103")))
[Figure: Quantiles of Normal Dist]

Simulating Normal random variable using rnorm() function in R

The general R function to generate random numbers from Normal distribution is

rnorm(n,mean=0,sd=1)

where,

  • n : the number of observations to generate,
  • mean : mean of Normal distribution (location parameter),
  • sd : standard deviation of Normal distribution (scale parameter).

The function rnorm(n,mean,sd) generates n random numbers from Normal distribution with given mean and sd.

Example 8: How to use rnorm() function in R?

In part (h), we need to generate 1000 random numbers from Normal distribution with given $mean = 544$ and $sd=103$.

(h) We can use rnorm(1000,mean,sd) function to generate random numbers from Normal distribution.

## initialize sample size to generate
n <- 1000
# Simulate 1000 values From Normal  dist
x_sim <- rnorm(n,mean=mu,sd=sigma)

The graph below shows the estimated density of the simulated values from the Normal distribution.

## Plot the simulated data
plot(density(x_sim),xlab="Simulated x",ylab="density",
     lwd=5,col="darkred",
     main=expression(paste("Simulated data Normal with ",
        mu,"=544 and ",sigma,"=103")))
[Figure: Random Sample Normal Dist]
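
Besides plotting, a simulated sample can be checked numerically against the distribution it came from. The sketch below uses a larger sample than the example (an assumption made so the estimates are tighter) and compares the sample mean, sample standard deviation and an empirical probability with their theoretical values:

```r
mu    <- 544
sigma <- 103

# A larger sample than in the example, so the estimates are tighter
x_big <- rnorm(100000, mean = mu, sd = sigma)

# Sample mean and standard deviation should be close to 544 and 103
c(sample_mean = mean(x_big), sample_sd = sd(x_big))

# Empirical share of simulated scores below 480 vs. the exact probability
c(empirical = mean(x_big < 480),
  exact     = pnorm(480, mean = mu, sd = sigma))
```

The printed numbers change from run to run because the sample is random.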

If you run the same function again, R will generate a different set of random numbers from $N(544,103^2)$.

# Simulate 1000 values From Normal  dist
x_sim_2 <- rnorm(n,mean=mu,sd=sigma)
## Plot the simulated data
plot(density(x_sim_2),xlab="Simulated x",ylab="density",
     lwd=5,col="blue",
    main=expression(paste("Simulated data Normal with ",
        mu,"=544 and ",sigma,"=103")))
[Figure: Random Sample Normal Dist 2]

To reproduce the same set of random numbers in a simulation, call the set.seed() function before generating them.

# set seed for reproducibility
set.seed(1457)
# Simulate 1000 values From Normal  dist
x_sim_3 <- rnorm(n,mean=mu,sd=sigma)
## Plot the simulated data
plot(density(x_sim_3),xlab="Simulated x",ylab="density",
     lwd=5,col="darkred",
     main=expression(paste("Simulated data Normal with ",
        mu,"=544 and ",sigma,"=103")))
[Figure: Random Sample Normal Dist 3]

# reset the same seed to reproduce the sample
set.seed(1457)
# Simulate 1000 values From Normal  dist
x_sim_4 <- rnorm(n,mean=mu,sd=sigma)
## Plot the simulated data
plot(density(x_sim_4),xlab="Simulated x",ylab="density",
     lwd=5,col="darkred",
     main=expression(paste("Simulated data Normal with ",
        mu,"=544 and ",sigma,"=103")))
[Figure: Random Sample Normal Dist 4]

Because set.seed(1457) was called before each call to rnorm(), x_sim_3 and x_sim_4 contain the same set of Normally distributed random numbers.
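
The effect of the seed can also be verified without a plot. In the minimal sketch below (variable names are illustrative), two draws made after the same seed are identical, while a further draw made without resetting the seed is not:

```r
mu    <- 544
sigma <- 103

set.seed(1457)
draw_a <- rnorm(5, mean = mu, sd = sigma)

set.seed(1457)
draw_b <- rnorm(5, mean = mu, sd = sigma)

# Without resetting the seed, the next draw continues the random stream
draw_c <- rnorm(5, mean = mu, sd = sigma)

identical(draw_a, draw_b)  # same seed, same values
identical(draw_a, draw_c)  # the stream has moved on
```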

hist(x_sim_4,breaks = 30,col="red4",
     main=expression(paste("Histogram Normal with ",
        mu,"=544 and ",sigma,"=103")))
[Figure: Histogram of Simulated data Normal Dist]

To learn more about other discrete and continuous probability distributions using R, go through the following tutorials:

Discrete Distributions Using R

Binomial distribution in R
Poisson distribution in R
Geometric distribution in R
Negative Binomial distribution in R
Hypergeometric distribution in R

Continuous Distributions Using R

Uniform distribution in R
Exponential distribution in R
Log-Normal distribution in R
Beta distribution in R
Gamma distribution in R
Cauchy distribution in R
Laplace distribution in R
Logistic distribution in R
Weibull distribution in R

Endnote

In this tutorial, you learned how to compute densities, cumulative probabilities and quantiles of the Normal distribution in R. You also learned how to simulate random samples from a Normal distribution in R.
