## Binomial distribution probabilities using R

In this tutorial, you will learn how to use the `dbinom()`, `pbinom()`, `qbinom()` and `rbinom()` functions in the R programming language to compute individual probabilities, cumulative probabilities and quantiles, and to generate random samples from a binomial distribution.

Before we discuss the R functions for the binomial distribution, let us see what the binomial distribution is.

## Binomial Distribution

The binomial distribution is typically used when a random experiment consists of a fixed number of trials, each trial has only two possible outcomes, such as *success* or *failure*, *head* or *tail*, *profit* or *loss*, and the probability of success is constant from one trial to the next. All the trials must be independent of each other.

Let $X\sim B(n,p)$. Then the probability mass function of the binomial random variable $X$ is

` $$ \begin{aligned} P(X=x) & = \binom{n}{x} p^x q^{n-x},\\ & \quad x = 0,1,2, \cdots, n; \\ & \quad 0 \leq p \leq 1, q = 1-p \end{aligned} $$ `

where there are two parameters, namely, $n$ : number of trials (`size`) and $p$ : probability of success (`prob`).


## Binomial probabilities using dbinom() function in R

For a discrete probability distribution, the density is the probability of getting exactly the value $x$ (i.e., $P(X=x)$).

The syntax to compute the probability at $x$ for the binomial distribution using R is

`dbinom(x,size,prob)`

where `x` is the value(s) of the variable, `size` is the number of trials, and `prob` is the probability of success.

The `dbinom()` function gives the probability for the given value(s) of `x` (number of successes), `size` (number of trials) and `prob` (probability of success).
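As a quick illustration of the syntax, the sketch below calls `dbinom()` with named arguments. The values `n_trials = 5` and `p_success = 0.5` (five tosses of a fair coin) and the variable names are assumptions chosen for this sketch only.

```r
# Illustrative values (assumed): five tosses of a fair coin
n_trials  <- 5
p_success <- 0.5

# Single value: P(X = 2) = choose(5, 2) / 2^5 = 10/32
p_two <- dbinom(x = 2, size = n_trials, prob = p_success)
p_two

# dbinom() is vectorised over x, so the whole PMF comes from one call
pmf <- dbinom(0:n_trials, size = n_trials, prob = p_success)
pmf
sum(pmf)   # the probabilities over 0, 1, ..., size add up to 1

# Values outside 0, 1, ..., size have probability zero
p_outside <- dbinom(6, size = n_trials, prob = p_success)
p_outside
```

Passing a vector for `x` is what makes it easy to tabulate or plot the whole distribution in one step.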

## Numerical Problem for Binomial Distribution

To understand the four functions `dbinom()`, `pbinom()`, `qbinom()` and `rbinom()`, let us take the following numerical problem.

### Binomial Distribution Example

In a university, 45% of the students are female. A random sample of ten students is selected.

(a) Find the probability that exactly four female students are selected.

(b) Plot the graph of binomial probability distribution.

(c) What is the probability that 2 or fewer female students are selected?

(d) What is the probability that more than 8 female students are selected?

(e) What is the probability that 4 to 6 (inclusive) female students are selected?

(f) Plot the graph of cumulative binomial probabilities.

(g) What is the smallest value of $c$ such that $P(X\leq c) \geq 0.70$?

(h) Simulate 100 binomially distributed random values with $n=10$ and $p=0.45$.

### Example 1: How to use `dbinom()` function in R?

To find the probability that exactly four female students are selected, we need to use the `dbinom()` function.

Let $X$ denote the number of female students among the $10$ randomly selected students, where the probability that a selected student is female (success) is $0.45$. Then $X\sim B(10, 0.45)$.

First let us define the given terms as

```
# assign no. of trials or size
size <- 10
# assign probability of success or prob
prob <- 0.45
# x holds the possible values of the random variable X
x <- 0:size
```

The probability mass function of $X$ is

` $$ \begin{aligned} P(X=x) &= \binom{10}{x} (0.45)^x (1-0.45)^{10-x},\\ &\quad x=0,1,\cdots, 10 \end{aligned} $$ `

For part (a), we need to find the probability $P(X = 4)$.

First I will show you how to calculate this probability manually, then how to compute the same probability using the `dbinom()` function in R.

(a) The probability that the sample contains exactly four female students is

` $$ \begin{aligned} P(X= 4) & =\binom{10}{4} (0.45)^{4} (1-0.45)^{10-4}\\ & = 0.2383666\\ \end{aligned} $$ `

The above probability can be calculated in R as `dbinom(4,10,0.45)`.

```
# Compute binomial probability
result1 <- dbinom(4,size,prob)
result1
```

`[1] 0.2383666`
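The manual calculation in part (a) can also be reproduced in R. The sketch below writes the probability mass function out with `choose()` and compares it with `dbinom()`; the variable name `manual` is an assumption for illustration.

```r
# Values from the example: 10 students, P(female) = 0.45
size <- 10
prob <- 0.45

# Manual calculation of P(X = 4) from the probability mass function:
# choose(n, x) * p^x * (1 - p)^(n - x)
manual <- choose(size, 4) * prob^4 * (1 - prob)^(size - 4)
manual

# Agrees with dbinom() up to floating-point tolerance
all.equal(manual, dbinom(4, size, prob))
```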

### Example 2: Visualize Binomial probability distribution

Using the `dbinom()` function, we can compute the binomial probabilities and make a table of them.

```
## Compute the binomial probabilities
px<-dbinom(x,size,prob)
# make a table
b_table<-cbind(x,px)
# specify the column names
colnames(b_table)<-c("x", "P(X=x)")
b_table
```

```
x P(X=x)
[1,] 0 0.0025329516
[2,] 1 0.0207241496
[3,] 2 0.0763025509
[4,] 3 0.1664782929
[5,] 4 0.2383666466
[6,] 5 0.2340327076
[7,] 6 0.1595677552
[8,] 7 0.0746031063
[9,] 8 0.0228895894
[10,] 9 0.0041617435
[11,] 10 0.0003405063
```

Using the `kable()` function from the knitr package, we can create the table in LaTeX, HTML, Markdown and reStructuredText.

```
# to make table
library(knitr)
kable(b_table,align="c")
```

x | P(X=x) |
---|---|
0 | 0.0025330 |
1 | 0.0207241 |
2 | 0.0763026 |
3 | 0.1664783 |
4 | 0.2383666 |
5 | 0.2340327 |
6 | 0.1595678 |
7 | 0.0746031 |
8 | 0.0228896 |
9 | 0.0041617 |
10 | 0.0003405 |

(b) Visualizing the binomial distribution with the `dbinom()` and `plot()` functions in R:

The probability mass function of a binomial distribution with given `size` and `prob` can be visualized by passing the `dbinom()` probabilities to the `plot()` function with `type="h"` (vertical lines) as follows:

```
## Plot the binomial probability dist
plot(x,px,type="h",xlim=c(0,11),ylim=c(0,max(px)),
lwd=10, col="blue",ylab="P(X=x)")
title("PMF of Binomial (size=10,prob=0.45)")
```

## Binomial cumulative probability using `pbinom()` function in R

The syntax to compute the cumulative distribution function (CDF) for the binomial distribution using R is

`pbinom(q,size,prob)`

where `q` is the value(s) of the variable, `size` is the number of trials, and `prob` is the probability of success.

This function calculates the cumulative binomial probabilities $P(X\leq q)$ for the given value(s) of `q` (value of the variable `x`), `size` (number of trials) and `prob` (probability of success).
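Because the cumulative probability is a running total of the individual probabilities, `pbinom()` should agree with `cumsum()` applied to `dbinom()`. The sketch below checks this for illustrative values (`n_trials = 5`, `p_success = 0.5`), which are assumptions chosen only for this check.

```r
# Illustrative values (assumed): five tosses of a fair coin
n_trials  <- 5
p_success <- 0.5
q <- 0:n_trials

# CDF built by accumulating the individual probabilities
cdf_from_pmf <- cumsum(dbinom(q, size = n_trials, prob = p_success))

# CDF returned directly by pbinom()
cdf_pbinom <- pbinom(q, size = n_trials, prob = p_success)

all.equal(cdf_from_pmf, cdf_pbinom)   # TRUE: the two agree

# P(X <= 2) = (1 + 5 + 10) / 32 = 0.5
pbinom(2, size = n_trials, prob = p_success)
```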

### Example 3: How to use `pbinom()` function in R?

In the above example, for part (c), we need to find the probability $P(X\leq 2)$.

First I will show you how to calculate this probability manually, then how to compute the same probability using the `pbinom()` and `dbinom()` functions in R.

(c) The probability that 2 or fewer female students are selected equals

` $$ \begin{aligned} P(X \leq 2) &= P(X=0)+P(X=1)+P(X=2)\\ &= \binom{10}{0} (0.45)^{0} (1-0.45)^{10-0} +\binom{10}{1} (0.45)^{1} (1-0.45)^{10-1} \\ &\quad +\binom{10}{2} (0.45)^{2} (1-0.45)^{10-2}\\ &= 0.002533+0.0207241+0.0763026\\ &= 0.0995597 \end{aligned} $$ `

```
## Compute cumulative binomial probability
result2 <- pbinom(2,size,prob)
result2
```

`[1] 0.09955965`

The above probability can also be calculated using the `dbinom()` function and the `sum()` function as follows:

`sum(dbinom(0:2,size,prob))`

`[1] 0.09955965`

### Example 4: How to use `pbinom()` function in R?

In the above example, for part (d), we need to find the probability $P(X > 8)$.

Numerically the probability that the selected sample contains more than 8 females can be calculated as

` $$ \begin{aligned} P(X > 8) & =P(X\geq 9)\\ &=\sum_{x=9}^{10} P(X=x)\\ & = P(X=9)+P(X=10)\\ &= 0.0041617+3.4050629\times 10^{-4}\\ & = 0.0045022\\ \end{aligned} $$ `

To calculate the probability that a random variable $X$ is strictly greater than a given number, you can use the argument `lower.tail=FALSE` in the `pbinom()` function.

The above probability can therefore be calculated as

$P(X > 8) =$ `pbinom(8,size,prob,lower.tail=FALSE)`

or by using the complementary event as

$P(X > 8) = 1- P(X\leq 8) =$ `1-pbinom(8,size,prob)`

```
# compute cumulative binomial probabilities
# with lower.tail False
pbinom(8,size,prob,lower.tail=FALSE)
```

`[1] 0.00450225`

`1-pbinom(8,size,prob)`

`[1] 0.00450225`
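One detail worth checking is that `lower.tail=FALSE` returns the strict upper tail $P(X > q)$, so an "at least" probability needs `q` shifted down by one. The sketch below illustrates this with the example's `size` and `prob`; the names `p_gt_8` and `p_ge_8` are assumptions for illustration.

```r
# Values from the example
size <- 10
prob <- 0.45

# Strictly greater than 8: P(X > 8) = P(X = 9) + P(X = 10)
p_gt_8 <- pbinom(8, size, prob, lower.tail = FALSE)

# At least 8: P(X >= 8) = P(X > 7), so q is shifted down by one
p_ge_8 <- pbinom(7, size, prob, lower.tail = FALSE)

# Both agree with summing the individual probabilities
all.equal(p_gt_8, sum(dbinom(9:10, size, prob)))
all.equal(p_ge_8, sum(dbinom(8:10, size, prob)))
```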

### Example 5: How to use `pbinom()` function in R?

One can also use the `pbinom()` function to calculate the probability that the random variable $X$ lies between two values.

(e) The probability that the sample contains 4 to 6 (inclusive) female students is

` $$ \begin{aligned} P(4 \leq X \leq 6) &= P(X=4)+P(X=5)+P(X=6)\\ &= \binom{10}{4} (0.45)^{4} (1-0.45)^{10-4} +\binom{10}{5} (0.45)^{5} (1-0.45)^{10-5} \\ &\quad +\binom{10}{6} (0.45)^{6} (1-0.45)^{10-6}\\ &= 0.2383666+0.2340327+0.1595678\\ &= 0.6319671 \end{aligned} $$ `

Because $X$ takes only integer values, the same probability can also be written as

` $$ \begin{aligned} P(4 \leq X \leq 6) &= P(X\leq 6) -P(X\leq 3)\\ &= 0.8980051 - 0.2660379 \end{aligned} $$ `

The above probability can be calculated using the `pbinom()` function as follows:

```
result3 <- pbinom(6,size,prob)-pbinom(3,size,prob)
result3
```

`[1] 0.6319671`

The above probability can also be calculated using the `dbinom()` function along with the `sum()` function.

```
result4 <- sum(dbinom(4:6,size,prob))
result4
```

`[1] 0.6319671`

The call `dbinom(4:6,size,prob)` computes the binomial probabilities for $x=4$, $x=5$ and $x=6$. The `sum()` function then adds these probabilities, and the result is stored in `result4`.
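The two approaches can be wrapped in a small helper. The function `binom_between()` below is a hypothetical name introduced for this sketch, not a base R function; it computes $P(lo \leq X \leq hi)$ as a difference of two cumulative probabilities, subtracting the CDF at `lo - 1` so that `lo` itself is included.

```r
# Hypothetical helper (not part of base R): P(lo <= X <= hi) for X ~ B(size, prob)
binom_between <- function(lo, hi, size, prob) {
  # Subtract the CDF at lo - 1, not at lo, so that X = lo is kept in the interval
  pbinom(hi, size, prob) - pbinom(lo - 1, size, prob)
}

# Part (e): P(4 <= X <= 6) with the example's parameters
p_4_to_6 <- binom_between(4, 6, size = 10, prob = 0.45)
p_4_to_6

# Same value as summing the individual probabilities
all.equal(p_4_to_6, sum(dbinom(4:6, size = 10, prob = 0.45)))
```

Using `lo` instead of `lo - 1` in the subtraction would drop $P(X = 4)$ from the result, which is a common off-by-one mistake with discrete distributions.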

### Example 6: Visualize the cumulative binomial probability distribution

```
# the value of x
x <- 0:size
# Compute cumulative binomial probabilities
Fx <- pbinom(x,size,prob)
```

```
## Compute the binomial probabilities
px <- dbinom(x,size,prob)
## Compute the cumulative binomial probabilities
Fx <- pbinom(x,size,prob)
## make a table
b_table2 <- cbind(x,px,Fx)
## assign column names
colnames(b_table2) <- c("x", "P(X=x)","P(X<=x)")
# display result
b_table2
```

```
x P(X=x) P(X<=x)
[1,] 0 0.0025329516 0.002532952
[2,] 1 0.0207241496 0.023257101
[3,] 2 0.0763025509 0.099559652
[4,] 3 0.1664782929 0.266037945
[5,] 4 0.2383666466 0.504404592
[6,] 5 0.2340327076 0.738437299
[7,] 6 0.1595677552 0.898005054
[8,] 7 0.0746031063 0.972608161
[9,] 8 0.0228895894 0.995497750
[10,] 9 0.0041617435 0.999659494
[11,] 10 0.0003405063 1.000000000
```

`kable(b_table2,align="c")`

x | P(X=x) | P(X<=x) |
---|---|---|
0 | 0.0025330 | 0.0025330 |
1 | 0.0207241 | 0.0232571 |
2 | 0.0763026 | 0.0995597 |
3 | 0.1664783 | 0.2660379 |
4 | 0.2383666 | 0.5044046 |
5 | 0.2340327 | 0.7384373 |
6 | 0.1595678 | 0.8980051 |
7 | 0.0746031 | 0.9726082 |
8 | 0.0228896 | 0.9954978 |
9 | 0.0041617 | 0.9996595 |
10 | 0.0003405 | 1.0000000 |

The cumulative distribution function of a binomial distribution with given `size` and `prob` can be visualized using the `plot()` function with argument `type="s"` (step function) as follows:

```
# Plot the cumulative binomial dist
plot(x,Fx,type="s",lwd=2,col="blue",
ylab=expression(P(X<=x)),
main="Distribution Function of B(n=10,p=0.45)")
```

## Binomial Distribution Quantiles using `qbinom()` in R

The syntax to compute the quantiles of binomial distribution using R is

`qbinom(p,size,prob)`

where `p` is the value(s) of the probabilities, `size` is the number of trials, and `prob` is the probability of success.

The function `qbinom(p,size,prob)` gives the $100p^{th}$ percentile (the quantile of order `p`) of the binomial distribution for the given values of `p`, `size` and `prob`.

The quantile of order $p$ is the smallest value $x$ of the binomial random variable $X$ such that $P(X\leq x) \geq p$.

It is the inverse of the `pbinom()` function, that is, the inverse cumulative distribution function of the binomial distribution.

### Example 7: How to use `qbinom()` function in R?

In part (g), we need to find the smallest value of $c$ such that $P(X\leq c) \geq 0.70$. That is, we need to find the $70^{th}$ percentile of the given binomial distribution.

```
size <- 10
prob <- 0.45
```

```
# compute the quantile for binomial dist
qbinom(0.70,size,prob)
```

`[1] 5`

From the above table of binomial probabilities and cumulative probabilities, $P(X\leq 4) = 0.5044 < 0.70$ while $P(X\leq 5) = 0.7384 \geq 0.70$, so the $70^{th}$ percentile is 5.
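The definition of the quantile as the smallest $x$ with $P(X\leq x) \geq p$ can be checked directly by searching the cumulative probabilities. The sketch below does this for part (g); the names `p_target` and `c_manual` are assumptions for illustration.

```r
# Values from the example
size <- 10
prob <- 0.45
p_target <- 0.70

# Cumulative probabilities P(X <= x) for x = 0, 1, ..., 10
x  <- 0:size
Fx <- pbinom(x, size, prob)

# Smallest x whose cumulative probability reaches the target
c_manual <- x[min(which(Fx >= p_target))]
c_manual

# Matches the value returned by qbinom()
c_manual == qbinom(p_target, size, prob)
```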

### Visualize the quantiles of Binomial Distribution

The quantiles of the binomial distribution with given `p`, `size` and `prob` can be visualized using the `plot()` function as follows:

```
p <- seq(0,1,by=0.02)
qx <- qbinom(p,size=size,prob=prob)
# Plot the quantiles of Binomial dist
plot(p,qx,type="s",lwd=2,col="darkred",
ylab="quantiles",
main="Quantiles of B(size=10,prob=0.45)")
```

## Simulating Binomial random variable using `rbinom()` function in R

The general R function to generate random numbers from the binomial distribution is `rbinom(n,size,prob)`, where `n` is the sample size (the number of random values to generate), `size` is the number of trials, and `prob` is the probability of success.

The function `rbinom(n,size,prob)` generates `n` random numbers from the binomial distribution with the number of trials `size` and the probability of success `prob`.

### Example 8: How to use `rbinom()` function in R?

In part (h), we need to generate 100 random numbers from the binomial distribution with number of trials (`size`) = 10 and probability of success (`prob`) = 0.45.

We can use the `rbinom()` function to generate random numbers from the binomial distribution.

```
## initialize sample size to generate
n <- 100
# Simulate 100 values From Binomial dist
x_sim <- rbinom(n,size,prob)
# print values at console
x_sim
```

```
[1] 4 6 4 6 7 2 5 6 5 4 7 4 5 5 3 7 3 2 4 7 6 5 5 8 5 5 5 5 4 3 7 7 5 6 2 4 6
[38] 3 4 3 3 4 4 4 3 3 3 4 3 6 2 4 6 3 5 3 3 6 6 4 5 2 4 4 6 4 6 6 6 4 6 5 5 0
[75] 4 3 4 5 4 3 3 5 4 6 3 4 8 6 6 3 3 5 4 5 4 3 6 2 4 5
```

The frequency table for the simulated binomial data `x_sim` can be obtained using the `table()` command.

```
## Print the frequency table
table(x_sim)
```

```
x_sim
0 2 3 4 5 6 7 8
1 6 20 26 20 19 6 2
```

```
plot(table(x_sim),xlab="x",ylab="frequency",
lwd=10,col="magenta",
main="Simulated data from B(10,0.45) dist")
```
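To see how closely simulated frequencies track the theoretical probabilities, the sketch below compares relative frequencies from a larger simulation with the values from `dbinom()`. The seed `42`, the number of simulated values (10000) and the variable names are assumptions chosen for illustration, not part of the original problem.

```r
# Assumed settings for this sketch: a fixed seed and a larger sample
set.seed(42)
size <- 10
prob <- 0.45
sims <- rbinom(10000, size, prob)

# Relative frequency of each value 0, 1, ..., 10 (zero counts are kept)
empirical <- as.numeric(table(factor(sims, levels = 0:size))) / length(sims)

# Theoretical probabilities from the PMF
theoretical <- dbinom(0:size, size, prob)

# Side-by-side comparison
round(cbind(x = 0:size, empirical, theoretical), 4)

# The sample mean should be close to size * prob = 4.5
mean(sims)
```

Wrapping the simulated values in `factor(..., levels = 0:size)` keeps values that never occurred in the table with a count of zero, so the two columns line up.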

If you call the same function again, R will generate another set of random numbers from $B(10,0.45)$.

```
# Simulate 100 values From Binomial dist
x_sim_2 <- rbinom(n,size,prob)
# print values at console
x_sim_2
```

```
[1] 5 6 2 6 2 4 3 4 4 4 3 5 5 4 6 7 4 3 5 5 4 3 3 4 6 4 4 7 2 4 4 5 2 6 3 1 6
[38] 4 2 5 4 4 3 6 3 1 5 4 3 6 4 5 2 5 7 3 6 2 5 8 6 2 6 4 4 4 6 8 3 2 5 6 5 6
[75] 7 5 5 4 3 3 4 3 5 4 4 6 3 5 3 3 5 5 5 6 4 7 7 4 3 3
```

The frequency table of the second set of simulated data is as follows:

```
## Print the frequency table
table(x_sim_2)
```

```
x_sim_2
1 2 3 4 5 6 7 8
2 9 19 26 20 16 6 2
```

```
plot(table(x_sim_2),xlab="x",ylab="frequency",
lwd=10,col="magenta",
main="Simulated data from B(10,0.45) dist")
```

To reproduce the same set of random numbers in a simulation, call the `set.seed()` function before `rbinom()`.

```
# set seed for reproducibility
set.seed(1457)
# Simulate 100 values From Binomial dist
x_sim_3 <- rbinom(n,size,prob)
# print values at console
x_sim_3
```

```
[1] 5 5 5 3 5 3 5 4 6 3 5 3 3 1 4 4 4 5 2 5 2 4 4 5 4 3 5 7 3 4 7 3 3 4 3 5 6
[38] 3 5 4 6 5 3 3 3 3 5 7 3 4 3 4 7 4 7 3 4 1 7 5 4 4 4 4 4 6 7 6 5 4 6 2 3 5
[75] 3 6 3 8 4 2 4 1 4 3 5 5 5 6 2 5 5 5 5 3 3 5 3 6 4 5
```

The frequency table of `x_sim_3` is as follows:

```
# frequency table using table
table(x_sim_3)
```

```
x_sim_3
1 2 3 4 5 6 7 8
3 5 25 24 26 9 7 1
```

```
plot(table(x_sim_3),xlab="x",ylab="frequency",
lwd=10,col="darkred",
main="Simulated data from B(10,0.45) dist")
```

```
set.seed(1457)
# Simulate 100 values From Binomial dist
x_sim_4 <- rbinom(n,size,prob)
# print values at console
x_sim_4
```

```
[1] 5 5 5 3 5 3 5 4 6 3 5 3 3 1 4 4 4 5 2 5 2 4 4 5 4 3 5 7 3 4 7 3 3 4 3 5 6
[38] 3 5 4 6 5 3 3 3 3 5 7 3 4 3 4 7 4 7 3 4 1 7 5 4 4 4 4 4 6 7 6 5 4 6 2 3 5
[75] 3 6 3 8 4 2 4 1 4 3 5 5 5 6 2 5 5 5 5 3 3 5 3 6 4 5
```

The frequency table of `x_sim_4` is as follows:

```
# frequency table using table
table(x_sim_4)
```

```
x_sim_4
1 2 3 4 5 6 7 8
3 5 25 24 26 9 7 1
```

```
plot(table(x_sim_4),xlab="x",ylab="frequency",
lwd=10,col="darkred",
main="Simulated data from B(10,0.45) dist")
```

Since we called `set.seed(1457)` before both simulations, `x_sim_3` and `x_sim_4` are the same.
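Rather than comparing the printed values by eye, the equality can be checked with `identical()`. The sketch below repeats the two seeded simulations under the assumed names `sim_a` and `sim_b`, then draws a third sample without resetting the seed.

```r
# Same seed before each call gives the same draws
set.seed(1457)
sim_a <- rbinom(100, size = 10, prob = 0.45)

set.seed(1457)
sim_b <- rbinom(100, size = 10, prob = 0.45)

identical(sim_a, sim_b)   # TRUE

# Without resetting the seed, the next call continues the random stream,
# so the new sample will generally differ from the previous one
sim_c <- rbinom(100, size = 10, prob = 0.45)
identical(sim_b, sim_c)
```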


## Endnote

In this tutorial, you learned how to compute the probabilities, cumulative probabilities and quantiles of the binomial distribution in R. You also learned how to simulate values from a binomial distribution using R.
