Negative Binomial distribution probabilities using R

In this tutorial, you will learn how to use the dnbinom(), pnbinom(), qnbinom() and rnbinom() functions in R to compute individual probabilities, cumulative probabilities and quantiles, and to generate random samples, for the Negative Binomial distribution.

Before discussing the R functions for the Negative Binomial distribution, let us see what the Negative Binomial distribution is.

Negative Binomial Distribution

The Negative Binomial distribution describes the number of failures observed before the $r^{th}$ success in a sequence of independent trials, each of which results in a success with the same probability $p$. For example, it models the number of defective items inspected before a fixed number of good items is found.

Let $X\sim NB(r,p)$. Then the probability distribution of $X$ is

$$ \begin{aligned} P(X=x)&= \binom{x+r-1}{x} p^{r} q^{x},\\ & \quad x=0,1,2,\ldots; r=1,2,\ldots\\ & \quad 0 < p, q < 1, p+q=1. \end{aligned} $$

where $X$ is the number of failures before the $r^{th}$ success. The parameters of the Negative Binomial distribution are $r$, the target number of successes, and $p$, the probability of success in each trial.
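The formula above can be checked directly in R. The following is a minimal sketch: the helper name nb_pmf and the values r = 3, p = 0.4 are illustrative assumptions, not part of the tutorial's example. It evaluates the formula with choose() and compares the result with R's built-in dnbinom().

```r
# Hypothetical helper: the Negative Binomial PMF written directly from the formula.
# x = number of failures before the r-th success, p = probability of success.
nb_pmf <- function(x, r, p) {
  choose(x + r - 1, x) * p^r * (1 - p)^x
}

# Illustrative values only (assumed for this sketch)
x_vals  <- 0:5
manual  <- nb_pmf(x_vals, r = 3, p = 0.4)
builtin <- dnbinom(x_vals, size = 3, prob = 0.4)

# The two columns should agree up to floating-point error
print(cbind(x = x_vals, manual = manual, builtin = builtin))
print(all.equal(manual, builtin))
```

Both columns should match, which confirms that dnbinom() counts failures before the target number of successes, as in the formula.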

Read more about the theory and results of Negative Binomial distribution here.

Negative Binomial probabilities using dnbinom() function in R

For a discrete probability distribution, the density is the probability of getting exactly the value $x$ (i.e., $P(X=x)$).

The syntax to compute the probability at $x$ for Negative Binomial distribution using R is

dnbinom(x,size,prob)

where

  • x : the value(s) of the variable (number of failures),
  • size : target number of successes,
  • prob : probability of success in each trial.

The dnbinom() function gives the probability $P(X=x)$ for the given value(s) of x, size and prob.
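Besides prob, dnbinom() also accepts the mean through the mu argument and can return log-probabilities with log = TRUE. The sketch below uses illustrative values (size_demo = 4 and prob_demo = 0.95 are assumptions chosen for this snippet) and relies on the relation mu = size * (1 - prob) / prob between the two parameterisations.

```r
# Illustrative parameter values (assumed for this sketch)
size_demo <- 4
prob_demo <- 0.95

# Mean of the distribution under the (size, prob) parameterisation
mu_demo <- size_demo * (1 - prob_demo) / prob_demo

# dnbinom() is vectorised over x; both parameterisations give the same probabilities
p_prob <- dnbinom(0:3, size = size_demo, prob = prob_demo)
p_mu   <- dnbinom(0:3, size = size_demo, mu = mu_demo)
print(all.equal(p_prob, p_mu))

# log = TRUE returns log-probabilities, useful when probabilities are very small
log_p <- dnbinom(0:3, size = size_demo, prob = prob_demo, log = TRUE)
print(all.equal(exp(log_p), p_prob))
```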

Numerical Problem for Negative Binomial Distribution

To understand the four functions dnbinom(), pnbinom(), qnbinom() and rnbinom(), let us take the following numerical problem.

Negative Binomial Distribution Example

A large lot of tires contains 5% defectives. 4 tires are to be chosen for a car.

(a) Find the probability that you find exactly 2 defective tires before 4 good ones.
(b) Plot the graph of the Negative Binomial probability distribution.
(c) Find the probability that you find at most 1 defective tire before 4 good ones.
(d) Find the probability that you find at least 2 defective tires before 4 good ones.
(e) What is the probability that you find 1 to 3 (inclusive) defective tires before 4 good ones?
(f) Plot the graph of the cumulative Negative Binomial probabilities.
(g) What is the smallest value of $c$ such that $P(X\leq c) \geq 0.99$?
(h) Simulate 1000 Negative Binomial distributed random variables with $size = 4$ and $prob = 0.95$.

Let $X$ denote the number of defective tires selected before 4 good tires. Since the lot contains 5% defectives, the probability that a selected tire is good (i.e., non-defective) is $p= 1-0.05 = 0.95$.

The random variable $X$ follows a negative binomial distribution with $size=4$ and $prob=0.95$. That is $X\sim NB(4,0.95)$.
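As a quick sanity check on this model, the mean and variance of $X$ can be computed in R. This sketch assumes the standard results $E(X) = r(1-p)/p$ and $Var(X) = r(1-p)/p^2$ for this parameterisation; the variable names are illustrative. It compares the formulas with a direct numerical summation over the PMF.

```r
size <- 4
prob <- 0.95

# Closed-form mean and variance (standard results, stated here as an assumption)
mean_formula <- size * (1 - prob) / prob
var_formula  <- size * (1 - prob) / prob^2

# Numerical check: sum over a grid wide enough that the remaining tail is negligible
x_grid <- 0:200
p_grid <- dnbinom(x_grid, size, prob)
mean_numeric <- sum(x_grid * p_grid)
var_numeric  <- sum((x_grid - mean_numeric)^2 * p_grid)

print(c(mean_formula = mean_formula, mean_numeric = mean_numeric))
print(c(var_formula = var_formula, var_numeric = var_numeric))
```

On average, then, far fewer than one defective tire is expected before 4 good ones are found.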

Example 1: How to use dnbinom() function in R?

To find the probability of exactly 2 defective tires before 4 good ones, i.e., $P(X=2)$, we need to use dnbinom() function.

Recall that $X$ denotes the number of defective tires selected before 4 good tires, and that $X\sim NB(4,0.95)$.

First let us define the given terms as

# no. of successes
size <- 4
# Probability of success
prob <- 0.95

The probability mass function of $X$ is

$$ \begin{aligned} P(X=x)&= \binom{x+4-1}{x} (0.95)^{4} (0.05)^{x},\\ &\quad \quad x=0,1,2,\ldots \end{aligned} $$

For part (a), we need to find the probability $P(X = 2)$.

First I will show you how to calculate this probability using manual calculation, then I will show you how to compute the same probability using dnbinom() function in R.

(a) The probability of finding exactly 2 defective tires before 4 good ones is

$$ \begin{aligned} P(X=2)&= \binom{2+3}{2} (0.95)^{4} (0.05)^{2}\\ &= \binom{5}{2} (0.95)^{4} (0.05)^{2}\\ &= 10*(0.8145062)*(0.0025)\\ &= 0.0203627 \end{aligned} $$

The above probability can be calculated using dnbinom(2,4,0.95) function in R.

# Compute Negative Binomial probability
result1 <- dnbinom(2,size,prob)
result1
[1] 0.02036266

Example 2 Visualize Negative Binomial probability distribution

Using dnbinom() function we can compute Negative Binomial distribution probabilities and make a table of it.

# x is the possible values of random variable x
x <- 0:7
## Compute the Negative Binomial probabilities 
px<-dnbinom(x,size,prob)
# make a table 
nb_table <- cbind(x,px)
# specify the column names
colnames(nb_table) <- c("x", "P(X=x)")
nb_table
     x       P(X=x)
[1,] 0 8.145062e-01
[2,] 1 1.629013e-01
[3,] 2 2.036266e-02
[4,] 3 2.036266e-03
[5,] 4 1.781732e-04
[6,] 5 1.425386e-05
[7,] 6 1.069039e-06
[8,] 7 7.635996e-08

Using kable() function from knitr package, we can create tables in LaTeX, HTML, Markdown and reStructuredText.

# to make table
library(knitr)
kable(nb_table)
x P(X=x)
0 0.8145062
1 0.1629013
2 0.0203627
3 0.0020363
4 0.0001782
5 0.0000143
6 0.0000011
7 0.0000001

(b) Visualizing Negative Binomial Distribution with dnbinom() function and plot() function in R:

The probability mass function of Negative Binomial distribution with given size and prob can be visualized using dnbinom() function in plot() function as follows:

## Plot the Negative Binomial probability dist
plot(x,px,type="h",xlim=c(0,8),ylim=c(0,max(px)),
     lwd=10, col="blue",ylab="P(X=x)")
title("PMF of Negative Binomial (size = 4, prob= 0.95)")
PMF of Negative Binomial

Negative Binomial cumulative probability using pnbinom() function in R

The syntax to compute the cumulative distribution function (CDF) for Negative Binomial distribution using R is

pnbinom(q,size, prob)

where

  • q : the value(s) of the variable,
  • size : target number of successes,
  • prob : probability of success in each trial.

This function is very useful for calculating the cumulative Negative Binomial probabilities for given value(s) of q (value of the variable x), size and prob.
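The relation between pnbinom() and dnbinom() can be illustrated with a short sketch, using the size and prob of the tire example. The cumulative probability at q is the running sum of the individual probabilities from 0 to q, so cumsum() applied to dnbinom() should reproduce pnbinom(). The variable names are illustrative.

```r
size <- 4
prob <- 0.95
q_vals <- 0:5

# Cumulative probabilities from the built-in CDF
cdf_builtin <- pnbinom(q_vals, size, prob)
# The same values as running sums of the individual probabilities
cdf_manual <- cumsum(dnbinom(q_vals, size, prob))

print(cbind(q = q_vals, cdf_builtin, cdf_manual))
print(all.equal(cdf_builtin, cdf_manual))
```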

Example 3: How to use pnbinom() function in R?

In the above example, for part (c), we need to find the probability $P(X\leq 1)$.

First I will show you how to calculate this probability using manual calculation, then I will show you how to compute the same probability using pnbinom() and dnbinom() function in R.

(c) The probability of finding at most 1 defective tire before 4 good ones is

$$ \begin{aligned} P(X\leq 1)&=\sum_{x=0}^{1}P(X=x)\\ &= P(X=0)+P(X=1)\\ &= \binom{0+3}{0} (0.95)^{4} (0.05)^{0}+\binom{1+3}{1} (0.95)^{4} (0.05)^{1}\\ &= 1*(0.95)^{4}*(0.05)^{0}+4*(0.95)^{4}*(0.05)^{1}\\ &= 0.8145062+0.1629013\\ &= 0.9774075 \end{aligned} $$

## Compute cumulative Negative Binomial probability
result2 <- pnbinom(1,size,prob)
result2
[1] 0.9774075

Above probability can also be calculated using dnbinom() function and the sum() function as follows:

sum(dnbinom(0:1,size,prob))
[1] 0.9774075

Example 4: How to use pnbinom() function in R?

In the above example, for part (d), we need to find the probability $P(X \geq 2)$.

Numerically, the probability of finding at least 2 defective tires before 4 good ones can be calculated as

$$ \begin{aligned} P(X \geq 2) & =1-P(X\leq 1)\\ &=1-\sum_{x=0}^{1} P(X=x)\\ & = 1- \big(P(X=0)+P(X=1)\big)\\ &= 1- \big(0.8145062+0.1629013\big)\\ & = 0.0225925 \end{aligned} $$

To calculate the probability that a random variable $X$ is strictly greater than a given number $q$, i.e., $P(X > q)$, you can use the option lower.tail=FALSE in pnbinom() function. Since $X$ takes integer values, $P(X \geq 2) = P(X > 1)$.

Above probability can be calculated easily using pnbinom() function with argument lower.tail=FALSE as

$P(X \geq 2) =$ pnbinom(1,size,prob,lower.tail=FALSE)

or by using complementary event as

$P(X \geq 2) = 1- P(X\leq 1) =$ 1-pnbinom(1,size,prob)

# compute cumulative Negative Binomial probabilities
# with lower.tail False
pnbinom(1,size,prob,lower.tail=FALSE)
[1] 0.0225925
1-pnbinom(1,size,prob)
[1] 0.0225925
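A common slip with lower.tail=FALSE is an off-by-one error in q. The short sketch below (variable names are illustrative) contrasts $P(X > 2)$ with $P(X \geq 2)$ for the same distribution and shows that the two differ by exactly $P(X = 2)$.

```r
size <- 4
prob <- 0.95

# lower.tail = FALSE gives P(X > q), so the value q itself is excluded
p_greater_2 <- pnbinom(2, size, prob, lower.tail = FALSE)   # P(X > 2)

# P(X >= 2) is the same event as P(X > 1), so q is shifted down by one
p_at_least_2 <- pnbinom(1, size, prob, lower.tail = FALSE)  # P(X >= 2)

# The difference between the two is the probability of exactly 2
print(p_at_least_2 - p_greater_2)
print(dnbinom(2, size, prob))
```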

Example 5: How to use pnbinom() function in R?

One can also use pnbinom() function to calculate the probability that the random variable $X$ is between two values.

(e) The probability of finding 1 to 3 (inclusive) defective tires before 4 good ones is

$$ \begin{aligned} P(1 \leq X \leq 3) &= P(X=1)+P(X=2)+P(X=3)\\ &=\binom{1+3}{1} (0.95)^{4} (0.05)^{1}+\binom{2+3}{2} (0.95)^{4} (0.05)^{2}\\ &\quad +\binom{3+3}{3} (0.95)^{4} (0.05)^{3}\\ &= 0.1629013+0.0203627+0.0020363\\ &= 0.1853002 \end{aligned} $$

The same probability can also be written as

$$ \begin{aligned} P(1 \leq X \leq 3) &= P(X\leq 3) -P(X\leq 0)\\ &= 0.9998064 - 0.8145062\\ &= 0.1853002 \end{aligned} $$

The above probability can be calculated using pnbinom() function as follows:

result3 <- pnbinom(3,size,prob)-pnbinom(0,size,prob)
result3
[1] 0.1853002

The above probability can also be calculated using dnbinom() function along with sum() function.

result4 <- sum(dnbinom(1:3,size,prob))
result4
[1] 0.1853002

The command computes the Negative Binomial probabilities for $x=1$, $x=2$ and $x=3$, adds them using sum() function and stores the result in result4.
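The interval calculation generalises to any pair of integer limits through $P(a \leq X \leq b) = P(X \leq b) - P(X \leq a-1)$. The following is a minimal sketch; the helper name nb_between is hypothetical and is not an R built-in.

```r
# Hypothetical helper: P(a <= X <= b) for integer limits a <= b
nb_between <- function(a, b, size, prob) {
  stopifnot(a <= b)
  # pnbinom() returns 0 for negative q, so a = 0 is handled correctly
  pnbinom(b, size, prob) - pnbinom(a - 1, size, prob)
}

# Part (e) of the example: P(1 <= X <= 3)
print(nb_between(1, 3, size = 4, prob = 0.95))
# Cross-check by summing the individual probabilities
print(sum(dnbinom(1:3, size = 4, prob = 0.95)))
```

Note that the lower limit is shifted to a - 1, because pnbinom() includes the value q itself.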

Example 6: Visualize the cumulative Negative Binomial probability distribution

# the value of x
x <- 0:7
## Compute the Negative Binomial probabilities 
px <- dnbinom(x,size,prob)
## Compute the cumulative Negative Binomial probabilities 
Fx <- pnbinom(x,size,prob)
## make a table 
nb_table2 <- cbind(x,px,Fx)
## assign column names
colnames(nb_table2) <- c("x", "P(X=x)","P(X<=x)")
# display result
nb_table2
     x       P(X=x)   P(X<=x)
[1,] 0 8.145062e-01 0.8145062
[2,] 1 1.629013e-01 0.9774075
[3,] 2 2.036266e-02 0.9977702
[4,] 3 2.036266e-03 0.9998064
[5,] 4 1.781732e-04 0.9999846
[6,] 5 1.425386e-05 0.9999988
[7,] 6 1.069039e-06 0.9999999
[8,] 7 7.635996e-08 1.0000000
kable(nb_table2)
x P(X=x) P(X<=x)
0 0.8145062 0.8145062
1 0.1629013 0.9774075
2 0.0203627 0.9977702
3 0.0020363 0.9998064
4 0.0001782 0.9999846
5 0.0000143 0.9999988
6 0.0000011 0.9999999
7 0.0000001 1.0000000

The cumulative probability distribution of Negative Binomial distribution with given x, size and prob can be visualized using plot() function with argument type="s" (step function) as follows:

# Plot the cumulative Negative Binomial dist
plot(x,Fx,type="s",lwd=2,col="blue",
     ylab=expression(P(X<=x)),
main="Distribution Function of NB(size= 4,prob = 0.95)")
CDF of Negative Binomial

Negative Binomial Distribution Quantiles using qnbinom() in R

The syntax to compute the quantiles of Negative Binomial distribution using R is

qnbinom(p,size,prob)

where

  • p : the value(s) of the probabilities,
  • size : target number of successes,
  • prob : probability of success in each trial.

The function qnbinom(p,size,prob) gives the $100p^{th}$ percentile (the quantile of order $p$) of Negative Binomial distribution for given value of p, size and prob.

The quantile of order $p$ is the smallest value $x$ of the Negative Binomial random variable $X$ such that $P(X\leq x) \geq p$.

It is the inverse of pnbinom() function, that is, the inverse cumulative distribution function for Negative Binomial distribution.
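The definition of the quantile can be verified with a direct search. In this sketch the helper nb_quantile_search and its max_x argument are hypothetical, introduced only for illustration. It looks for the smallest x whose cumulative probability reaches p and compares the result with qnbinom().

```r
size <- 4
prob <- 0.95

# Hypothetical helper: smallest x with P(X <= x) >= p, found by direct search
nb_quantile_search <- function(p, size, prob, max_x = 1000) {
  x <- 0:max_x
  x[which(pnbinom(x, size, prob) >= p)[1]]
}

p_vals <- c(0.5, 0.9, 0.99, 0.999)
by_search  <- sapply(p_vals, nb_quantile_search, size = size, prob = prob)
by_qnbinom <- qnbinom(p_vals, size, prob)

# Both rows should contain the same quantiles
print(rbind(p = p_vals, by_search, by_qnbinom))
```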

Example 7: How to use qnbinom() function in R?

In part (g), we need to find the smallest value of $c$ such that $P(X\leq c) \geq 0.99$. That is, we need to find the $99^{th}$ percentile of the given Negative Binomial distribution.

size <- 4
prob <- 0.95
# compute the quantile for Negative Binomial dist
qnbinom(0.99,size, prob)
[1] 2

From the above table of Negative Binomial probabilities and cumulative probabilities, $P(X\leq 1) = 0.9774075 < 0.99$ while $P(X\leq 2) = 0.9977702 \geq 0.99$, so the $99^{th}$ percentile is 2.

Visualize the quantiles of Negative Binomial Distribution

The quantiles of Negative Binomial distribution with given p, size and prob can be visualized using plot() function as follows:

p <- seq(0,1,by=0.02)
qx <- qnbinom(p,size=size,prob=prob)
# Plot the quantiles of Negative Binomial dist
plot(p,qx,type="s",lwd=2,col="darkred",
     ylab="quantiles",
main="Quantiles of NB(size=4,prob=0.95)")
Quantiles of Negative Binomial

Simulating Negative Binomial random variable using rnbinom() function in R

The general R function to generate random numbers from Negative Binomial distribution is

rnbinom(n,size,prob)

where,

  • n : the sample size,
  • size : target number of successes,
  • prob : probability of success in each trial.

The function rnbinom(n,size,prob) generates n random numbers from Negative Binomial distribution with given size and prob.
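One way to check that rnbinom() samples from the intended distribution is to compare the relative frequencies of a simulated sample with the theoretical probabilities from dnbinom(). The sketch below uses an arbitrary seed and sample size; the names n_demo and x_demo and both values are assumptions made for illustration.

```r
set.seed(123)        # arbitrary seed, used only to make this sketch reproducible
size <- 4
prob <- 0.95
n_demo <- 10000      # illustrative sample size

x_demo <- rnbinom(n_demo, size, prob)

# Relative frequency of each observed value, including values with zero count
x_vals <- 0:max(x_demo)
empirical   <- as.numeric(table(factor(x_demo, levels = x_vals))) / n_demo
theoretical <- dnbinom(x_vals, size, prob)

print(round(cbind(x = x_vals, empirical, theoretical), 4))
```

With a large sample the two columns should be close, although they will not match exactly because of sampling variation.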

Example 8: How to use rnbinom() function in R?

In part (h), we need to generate 1000 random numbers from Negative Binomial distribution with given $size = 4$ and $prob=0.95$.

We can use rnbinom() function to generate random numbers from Negative Binomial distribution.

## initialize sample size to generate
n <- 1000
# Simulate 1000 values From Negative Binomial dist
x_sim <- rnbinom(n,size,prob)

To get the frequency table of simulated negative binomial random variables, we can use table() function in R.

## Print the frequency table
table(x_sim)
x_sim
  0   1   2   3   4 
799 174  24   2   1 
## Plot the simulated data
plot(table(x_sim),xlab="x",ylab="frequency",
     lwd=10,col="red",
     main="Simulated data from NB(4,0.95) dist")
Random Sample from Negative Binomial

If you call the same function again, R will generate another set of random numbers from $NB(4,0.95)$.

# Simulate 1000 values From Negative Binomial dist
x_sim_2 <- rnbinom(n,size,prob)

The frequency table of simulated data from Negative Binomial distribution is as follows:

## Print the frequency table
table(x_sim_2)
x_sim_2
  0   1   2   3 
816 162  20   2 
## Plot the simulated data
plot(table(x_sim_2),xlab="x",ylab="frequency",
     lwd=10,col="red",
     main="Simulated data from NB(4,0.95) dist")
Random Sample from Negative Binomial 2

To reproduce the same set of random numbers in a simulation, one can use the set.seed() function.

# set seed for reproducibility
set.seed(1457)
# Simulate 1000 values From Negative Binomial dist
x_sim_3 <- rnbinom(n,size,prob)

The frequency table of x_sim_3 is as follows:

## Print the frequency table
table(x_sim_3)
x_sim_3
  0   1   2   3   4 
805 177  15   2   1 
plot(table(x_sim_3),xlab="x",ylab="frequency",
     lwd=10,col="magenta",
     main="Simulated data from NB(4,0.95) dist")
Random Sample from Negative Binomial 3

# set the same seed again to reproduce the sample
set.seed(1457)
# Simulate 1000 values From Negative Binomial dist
x_sim_4 <- rnbinom(n,size,prob)

The frequency table of x_sim_4 is as follows:

## Print the frequency table
table(x_sim_4)
x_sim_4
  0   1   2   3   4 
805 177  15   2   1 
plot(table(x_sim_4),xlab="x",ylab="frequency",
     lwd=10,col="magenta",
     main="Simulated data from NB(4,0.95) dist")
Random Sample from Negative Binomial 3

Since we used set.seed(1457) before both simulations, x_sim_3 and x_sim_4 are the same.
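The effect of the seed can also be confirmed programmatically rather than by comparing frequency tables by eye. This is a small sketch with illustrative variable names; it uses identical() to test whether two samples contain exactly the same values in the same order.

```r
size <- 4
prob <- 0.95
n <- 1000

set.seed(1457)
sample_a <- rnbinom(n, size, prob)

# Resetting the same seed restarts the random number stream
set.seed(1457)
sample_b <- rnbinom(n, size, prob)

# Drawn without resetting the seed, so the stream simply continues
sample_c <- rnbinom(n, size, prob)

print(identical(sample_a, sample_b))  # same seed, same draws
print(identical(sample_a, sample_c))  # different draws are expected here
```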

To learn more about other discrete and continuous probability distributions using R, go through the following tutorials:

Discrete Distributions Using R

Binomial distribution in R
Poisson distribution in R
Geometric distribution in R
Hypergeometric distribution in R

Continuous Distributions Using R

Uniform distribution in R
Exponential distribution in R
Normal distribution in R
Log-Normal distribution in R
Beta distribution in R
Gamma distribution in R
Cauchy distribution in R
Laplace distribution in R
Logistic distribution in R
Weibull distribution in R

Endnote

In this tutorial, you learned how to compute the probabilities, cumulative probabilities and quantiles of Negative Binomial distribution in R programming. You also learned how to simulate a Negative Binomial distribution using R programming.

To learn more about R code for discrete and continuous probability distributions, please refer to the following tutorials:

Probability Distributions using R

Let me know in the comments below if you have any questions on Negative Binomial Distribution using R, and share your thoughts on this article.

VRCBuzz co-founder and passionate about making every day the greatest day of life. Raju is a nerd at heart with a background in Statistics. Raju oversees day-to-day operations and focuses on strategic planning and growth of VRCBuzz products and services. Raju has more than 25 years of teaching experience. He gains energy by helping people reach their goals and motivating them to align with their passion. Raju holds a Ph.D. degree in Statistics. Raju loves to spend his leisure time reading about and implementing AI and machine learning concepts using statistical models.
