Contents

- 1 Geometric distribution probabilities using R
- 2 Geometric Distribution
- 3 Geometric probabilities using dgeom() function in R
- 4 Numerical Problem for Geometric Distribution
- 5 Geometric cumulative probability using pgeom() function in R
- 6 Geometric Distribution Quantiles using qgeom() in R
- 7 Simulating Geometric random variable using rgeom() function in R
- 8 Endnote

## Geometric distribution probabilities using R

In this tutorial, you will learn how to use the `dgeom()`, `pgeom()`, `qgeom()` and `rgeom()` functions in the R programming language to compute individual probabilities, cumulative probabilities and quantiles, and to generate random samples, for the Geometric distribution.

Before we discuss the R functions, let us see what the Geometric distribution is.

## Geometric Distribution

The Geometric distribution models the number of failures before the first success, or equivalently the number of trials (attempts) needed to get the first success, in a sequence of mutually independent Bernoulli trials, each with probability of success $p$. The form used in this tutorial, and by R, counts the failures before the first success.

Let $X\sim G(p)$. Then the probability distribution of $X$ is

` $$ \begin{aligned} P(X=x)= \left\{ \begin{array}{ll} pq^x , & \hbox{$x=0,1,2,\cdots$;} \\ & \hbox{$0 < p < 1,\; q=1-p$;} \\ 0, & \hbox{Otherwise.} \end{array} \right. \end{aligned} $$ `

where $p$ is the parameter of Geometric distribution.
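As a quick check of the formula above, the following sketch evaluates $pq^x$ directly and compares it with R's built-in `dgeom()`. The value `p <- 0.2` and the variable names are illustrative assumptions, not part of the tutorial's example.

```r
# Illustrative parameter (an assumption chosen only for this sketch)
p <- 0.2
q <- 1 - p

# Number of failures before the first success
x <- 0:5

# PMF computed directly from the formula P(X = x) = p * q^x
pmf_manual <- p * q^x

# PMF computed with R's built-in function
pmf_builtin <- dgeom(x, prob = p)

# The two vectors agree up to floating-point error
all.equal(pmf_manual, pmf_builtin)
```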

Read more about the theory and results of Geometric distribution here.

## Geometric probabilities using `dgeom()` function in R

For a discrete probability distribution, the density is the probability of getting exactly the value $x$ (i.e., $P(X=x)$).

The syntax to compute the probability at $x$ for the Geometric distribution using R is

`dgeom(x,prob)`

where

- `x`: the value(s) of the variable (the number of failures before the first success),
- `prob`: the probability of success in each trial.

The `dgeom()` function gives the probability for the given value(s) `x` and `prob`.
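A short sketch of how `dgeom()` can be called, assuming an illustrative success probability of 0.5. It shows that `x` may be a vector, and how the "number of trials to get the first success" form can be obtained by shifting the argument by one. The names `prob_demo`, `trials`, `p_trials` and `log_p` are hypothetical.

```r
# Illustrative success probability (an assumption for this sketch)
prob_demo <- 0.5

# dgeom() is vectorized over x: P(X = 0), P(X = 1), P(X = 2)
p_failures <- dgeom(0:2, prob = prob_demo)
p_failures

# If Y = X + 1 counts the trials up to and including the first success,
# then P(Y = y) = P(X = y - 1)
trials <- 1:3
p_trials <- dgeom(trials - 1, prob = prob_demo)
p_trials

# log = TRUE returns the probabilities on the log scale
log_p <- dgeom(0:2, prob = prob_demo, log = TRUE)
```

Working on the log scale can help when many small probabilities are multiplied, for example in a likelihood.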

## Numerical Problem for Geometric Distribution

To understand the four functions `dgeom()`, `pgeom()`, `qgeom()` and `rgeom()`, let us take the following numerical problem.

### Geometric Distribution Example

Suppose a production line has a 35% defective rate. Let $X$ denote the number of non-defective products before the first defective product.

(a) Find the probability that there will be 3 non-defective products before the first defective.

(b) Plot the graph of the Geometric probability distribution.

(c) Find the probability that there will be at most 3 non-defective products before the first defective.

(d) Find the probability that there will be at least 3 non-defective products before the first defective.

(e) What is the probability that there will be 3 to 5 (inclusive) non-defective products before the first defective product?

(f) Plot the graph of the cumulative Geometric probabilities.

(g) What is the smallest value of $c$ such that $P(X\leq c) \geq 0.60$?

(h) Simulate 100 Geometric distributed random variables with $prob = 0.35$.

Let $X$ denote the number of non-defective products before the first defective product. Here finding a defective product is the "success" that ends the sequence, and each non-defective product is a "failure". Then $p=P(\text{success})=0.35$ and $X\sim G(0.35)$.

### Example 1: How to use `dgeom()` function in R?

To find the probability of exactly three non-defective products before the first defective product, we need to use the `dgeom()` function.

First let us define the given terms as

```
# probability of success (here, a defective product)
prob <- 0.35
```

The probability mass function of $X$ is

` $$ \begin{aligned} P(X=x) &= 0.35(0.65)^x,\\ & \quad x=0,1,2,\cdots \end{aligned} $$ `

For part (a), we need to find the probability $P(X = 3)$.

First I will show you how to calculate this probability manually, then I will show you how to compute the same probability using the `dgeom()` function in R.

(a) The probability that there will be 3 non-defective products before the first defective is

` $$ \begin{aligned} P(X = 3) & =0.35(0.65)^3\\ & = 0.0961188\\ \end{aligned} $$ `

The above probability can be calculated using `dgeom(3,0.35)` in R.

```
# Compute Geometric probability
result1 <- dgeom(3,prob)
result1
```

`[1] 0.09611875`

### Example 2: Visualize Geometric probability distribution

Using the `dgeom()` function, we can compute the Geometric distribution probabilities and make a table of them.

```
# x is the possible values of random variable x
x <- 0:12
## Compute the Geometric probabilities
px<-dgeom(x,prob)
# make a table
b_table <- cbind(x,px)
# specify the column names
colnames(b_table) <- c("x", "P(X=x)")
b_table
```

```
x P(X=x)
[1,] 0 0.350000000
[2,] 1 0.227500000
[3,] 2 0.147875000
[4,] 3 0.096118750
[5,] 4 0.062477188
[6,] 5 0.040610172
[7,] 6 0.026396612
[8,] 7 0.017157798
[9,] 8 0.011152568
[10,] 9 0.007249169
[11,] 10 0.004711960
[12,] 11 0.003062774
[13,] 12 0.001990803
```

Using the `kable()` function from the knitr package, we can render the table in LaTeX, HTML, Markdown or reStructuredText.

```
# to make table
library(knitr)
kable(b_table)
```

x | P(X=x) |
---|---|
0 | 0.3500000 |
1 | 0.2275000 |
2 | 0.1478750 |
3 | 0.0961188 |
4 | 0.0624772 |
5 | 0.0406102 |
6 | 0.0263966 |
7 | 0.0171578 |
8 | 0.0111526 |
9 | 0.0072492 |
10 | 0.0047120 |
11 | 0.0030628 |
12 | 0.0019908 |

(b) Visualizing the Geometric distribution with the `dgeom()` and `plot()` functions in R:

The probability mass function of the Geometric distribution with given `prob` can be visualized by passing the `dgeom()` values to the `plot()` function as follows:

```
## Plot the Geometric probability dist
plot(x,px,type="h",xlim=c(0,12),ylim=c(0,max(px)),
lwd=10, col="blue",ylab="P(X=x)")
title("PMF of Geometric (prob= 0.35)")
```

## Geometric cumulative probability using `pgeom()` function in R

The syntax to compute the cumulative probability distribution function (CDF) for Geometric distribution using R is

`pgeom(q,prob)`

where

- `q`: the value(s) of the variable,
- `prob`: the probability of success in each trial.

This function is very useful for calculating the cumulative Geometric probabilities $P(X\leq q)$ for given value(s) of `q` (values of the variable `x`) and `prob`.
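For the Geometric distribution, summing $pq^k$ over $k=0,\dots,x$ gives the closed form $P(X\leq x) = 1-q^{x+1}$. The sketch below compares that closed form with `pgeom()`. The variable names `p_success`, `q_values`, `cdf_builtin` and `cdf_closed` are assumptions introduced only for this illustration.

```r
# Success probability used for the comparison
p_success <- 0.35

# Values at which to evaluate the CDF
q_values <- 0:5

# Cumulative probabilities from R's built-in function
cdf_builtin <- pgeom(q_values, prob = p_success)

# Closed form: P(X <= x) = 1 - (1 - p)^(x + 1)
cdf_closed <- 1 - (1 - p_success)^(q_values + 1)

# Both approaches agree up to floating-point error
all.equal(cdf_builtin, cdf_closed)
```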

### Example 3: How to use `pgeom()` function in R?

In the above example, for part (c), we need to find the probability $P(X\leq 3)$.

First I will show you how to calculate this probability manually, then I will show you how to compute the same probability using the `pgeom()` and `dgeom()` functions in R.

(c) The probability that there will be at most 3 non-defective products before the first defective is

` $$ \begin{aligned} P(X\leq 3) &= P(X=0)+ P(X=1)+P(X=2)+P(X=3)\\ &= 0.35+ 0.35(0.65)^1\\ & \quad +0.35(0.65)^2+0.35(0.65)^3\\ &= 0.35+0.2275\\ & \quad +0.147875+0.0961188\\ &= 0.8214937 \end{aligned} $$ `

```
## Compute cumulative Geometric probability
result2 <- pgeom(3,prob)
result2
```

`[1] 0.8214937`

The above probability can also be calculated using the `dgeom()` function and the `sum()` function as follows:

`sum(dgeom(0:3,prob))`

`[1] 0.8214937`

### Example 4: How to use `pgeom()` function in R?

In the above example, for part (d), we need to find the probability $P(X \geq 3)$.

Numerically the probability that there will be at least 3 non-defective products before first defective can be calculated as

` $$ \begin{aligned} P(X \geq 3) & =1-P(X\leq 2)\\ &=1-\sum_{x=0}^{2} P(X=x)\\ & = 1- \big(P(X=0)+P(X=1)+P(X=2)\big)\\ &= 1- \big(0.35+0.2275\\ &\quad +0.147875\big)\\ & = 0.274625\\ \end{aligned} $$ `

To calculate the probability that a random variable $X$ is strictly greater than a given number, use the option `lower.tail=FALSE` in the `pgeom()` function, which returns $P(X > q)$. Since $X$ takes only integer values, $P(X \geq 3) = P(X > 2)$.

The above probability can therefore be calculated using the `pgeom()` function with argument `lower.tail=FALSE` as

$P(X \geq 3) =$ `pgeom(2,prob,lower.tail=FALSE)`

or by using the complementary event as

$P(X \geq 3) = 1- P(X\leq 2) =$ `1-pgeom(2,prob)`

```
# compute cumulative Geometric probabilities
# with lower.tail False
pgeom(2,prob,lower.tail=FALSE)
```

`[1] 0.274625`

`1-pgeom(2,prob)`

`[1] 0.274625`
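The same value can be cross-checked with a closed form: $X \geq 3$ happens exactly when the first three trials are all failures, so $P(X \geq 3) = q^3$. The sketch below also shows what happens if 3 rather than 2 is passed with `lower.tail=FALSE`. The names `tail_closed`, `tail_pgeom` and `tail_strict` are hypothetical.

```r
# Success probability from the example
prob <- 0.35

# Closed form: the first three trials are all failures
tail_closed <- (1 - prob)^3

# P(X >= 3) = P(X > 2) via the upper tail of pgeom()
tail_pgeom <- pgeom(2, prob, lower.tail = FALSE)

# Passing 3 instead gives P(X > 3) = P(X >= 4), a different probability
tail_strict <- pgeom(3, prob, lower.tail = FALSE)

c(closed_form = tail_closed, upper_tail = tail_pgeom, strict = tail_strict)
```

Keeping the "greater than" versus "at least" distinction in mind avoids an off-by-one error when working with discrete distributions.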

### Example 5: How to use `pgeom()` function in R?

One can also use the `pgeom()` function to calculate the probability that the random variable $X$ is between two values.

(e) The probability that there will be 3 to 5 (inclusive) non-defective products before the first defective product is

` $$ \begin{aligned} P(3 \leq X \leq 5) &= P(X=3)+P(X=4)+P(X=5)\\ &= 0.35(0.65)^3+0.35(0.65)^4 + 0.35(0.65)^5\\ &= 0.0961188+0.0624772+0.0406102\\ &= 0.1992061 \end{aligned} $$ `

The above probability can also be written as

` $$ \begin{aligned} P(3 \leq X \leq 5) &= P(X\leq 5) -P(X\leq 2)\\ &= 0.9245811 - 0.725375\\ &=0.1992061 \end{aligned} $$ `

The above probability can be calculated using the `pgeom()` function as follows:

```
result3 <- pgeom(5,prob)-pgeom(2,prob)
result3
```

`[1] 0.1992061`

The above probability can also be calculated using the `dgeom()` function along with the `sum()` function.

```
result4 <- sum(dgeom(3:5,prob))
result4
```

`[1] 0.1992061`

The first command computes the Geometric probabilities for $x=3$, $x=4$ and $x=5$, adds them using the `sum()` function and stores the result in `result4`.
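The two approaches above can be wrapped in a small helper for any inclusive range. This is only a sketch: `geom_between()` is a hypothetical function written for this illustration, not part of base R.

```r
# Hypothetical helper: P(a <= X <= b) for a Geometric random variable
geom_between <- function(a, b, prob) {
  # a and b are integer counts of failures with 0 <= a <= b
  stopifnot(a >= 0, b >= a)
  # pgeom() returns 0 for negative arguments, so a = 0 also works
  pgeom(b, prob) - pgeom(a - 1, prob)
}

# Part (e): 3 to 5 non-defective products before the first defective
interval_cdf <- geom_between(3, 5, prob = 0.35)

# Same probability by summing the individual probabilities
interval_sum <- sum(dgeom(3:5, prob = 0.35))

c(interval_cdf, interval_sum)
```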

### Example 6: Visualize the cumulative Geometric probability distribution

```
# the values of x
x <- 0:12
## Compute the Geometric probabilities
px <- dgeom(x,prob)
## Compute the cumulative Geometric probabilities
Fx <- pgeom(x,prob)
## make a table
b_table2 <- cbind(x,px,Fx)
## assign column names
colnames(b_table2) <- c("x", "P(X=x)","P(X<=x)")
# display result
b_table2
```

```
x P(X=x) P(X<=x)
[1,] 0 0.350000000 0.3500000
[2,] 1 0.227500000 0.5775000
[3,] 2 0.147875000 0.7253750
[4,] 3 0.096118750 0.8214937
[5,] 4 0.062477188 0.8839709
[6,] 5 0.040610172 0.9245811
[7,] 6 0.026396612 0.9509777
[8,] 7 0.017157798 0.9681355
[9,] 8 0.011152568 0.9792881
[10,] 9 0.007249169 0.9865373
[11,] 10 0.004711960 0.9912492
[12,] 11 0.003062774 0.9943120
[13,] 12 0.001990803 0.9963028
```

`kable(b_table2)`

x | P(X=x) | P(X<=x) |
---|---|---|
0 | 0.3500000 | 0.3500000 |
1 | 0.2275000 | 0.5775000 |
2 | 0.1478750 | 0.7253750 |
3 | 0.0961188 | 0.8214937 |
4 | 0.0624772 | 0.8839709 |
5 | 0.0406102 | 0.9245811 |
6 | 0.0263966 | 0.9509777 |
7 | 0.0171578 | 0.9681355 |
8 | 0.0111526 | 0.9792881 |
9 | 0.0072492 | 0.9865373 |
10 | 0.0047120 | 0.9912492 |
11 | 0.0030628 | 0.9943120 |
12 | 0.0019908 | 0.9963028 |

The cumulative probability distribution of the Geometric distribution with given `prob` can be visualized using the `plot()` function with argument `type="s"` (step function) as follows:

```
# Plot the cumulative Geometric dist
plot(x,Fx,type="s",lwd=2,col="blue",
ylab=expression(P(X<=x)),
main="Distribution Function of G(0.35)")
```
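The cumulative probabilities tabulated above are the running sums of the individual probabilities. A minimal sketch of that relationship, using base R's `cumsum()`; the name `running_sum` is an assumption introduced for this check.

```r
# Same setup as in the example
prob <- 0.35
x <- 0:12

# Individual and cumulative Geometric probabilities
px <- dgeom(x, prob)
Fx <- pgeom(x, prob)

# Running sum of the individual probabilities, starting at x = 0
running_sum <- cumsum(px)

# The running sum reproduces the CDF up to floating-point error
all.equal(running_sum, Fx)
```

This only works because `x` starts at 0, the smallest value the variable can take.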

## Geometric Distribution Quantiles using `qgeom()` in R

The syntax to compute the quantiles of Geometric distribution using R is

`qgeom(p,prob)`

where

- `p`: the value(s) of the probabilities,
- `prob`: the probability of success in each trial.

The function `qgeom(p,prob)` gives the $100p^{th}$ percentile of the Geometric distribution for given values of `p` and `prob`.

The $100p^{th}$ percentile is the smallest value $x$ of the Geometric random variable $X$ such that $P(X\leq x) \geq p$.

It is the inverse of the `pgeom()` function, that is, the inverse cumulative distribution function of the Geometric distribution.
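To make the "smallest value" definition concrete, the sketch below finds a quantile by searching the cumulative probabilities and compares it with `qgeom()`. The target probability 0.90, the search range `0:100` and the variable names are illustrative assumptions.

```r
# Illustrative inputs (assumptions for this sketch)
prob_demo <- 0.35
p_target <- 0.90

# Candidate values of x and their cumulative probabilities
candidates <- 0:100
cdf_values <- pgeom(candidates, prob_demo)

# Smallest x with P(X <= x) >= p_target
q_search <- candidates[which(cdf_values >= p_target)[1]]

# The built-in quantile function gives the same answer
q_builtin <- qgeom(p_target, prob_demo)

c(search = q_search, builtin = q_builtin)
```

Here $P(X\leq 4) \approx 0.884$ is below 0.90 and $P(X\leq 5) \approx 0.925$ is above it, so both approaches return 5.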

### Example 7: How to use `qgeom()` function in R?

In part (g), we need to find the smallest value of $c$ such that $P(X\leq c) \geq 0.60$. That is, we need to find the $60^{th}$ percentile of the given Geometric distribution.

`prob <- 0.35`

```
# compute the quantile for Geometric dist
qgeom(0.60,prob)
```

`[1] 2`

From the above table of Geometric probabilities and cumulative probabilities, $P(X\leq 1)=0.5775 < 0.60$ while $P(X\leq 2)=0.725375 \geq 0.60$, so the $60^{th}$ percentile is 2.

### Visualize the quantiles of Geometric Distribution

The quantiles of the Geometric distribution with given `p` and `prob` can be visualized using the `plot()` function as follows:

```
p <- seq(0,1,by=0.02)
qx <- qgeom(p,prob=prob)
# Plot the quantiles of Geometric dist
plot(p,qx,type="s",lwd=2,col="darkred",
ylab="quantiles",
main="Quantiles of Geo(0.35)")
```

## Simulating Geometric random variable using `rgeom()` function in R

The general R function to generate random numbers from Geometric distribution is

`rgeom(n,prob)`

where

- `n`: the sample size,
- `prob`: the probability of success in each trial.

The function `rgeom(n,prob)` generates `n` random numbers from the Geometric distribution with probability of success `prob`.
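One way to sanity-check simulated values is to compare their average with the theoretical mean of this form of the Geometric distribution, $q/p$. The sketch below does that under stated assumptions: the seed `2024`, the sample size `100000` and the variable names are arbitrary choices made for this illustration.

```r
# Arbitrary seed so the sketch is reproducible (an assumption)
set.seed(2024)

# Success probability and a large illustrative sample size
prob_demo <- 0.35
n_draws <- 100000

# Draw the random sample
draws <- rgeom(n_draws, prob_demo)

# Theoretical mean of the number of failures before the first success
theoretical_mean <- (1 - prob_demo) / prob_demo

# Sample mean of the simulated values
sample_mean <- mean(draws)

c(theoretical = theoretical_mean, simulated = sample_mean)
```

With a sample this large the two values should be close, though they will not match exactly.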

### Example 8: How to use `rgeom()` function in R?

In part (h), we need to generate 100 random numbers from the Geometric distribution with probability of success $0.35$.

We can use the `rgeom()` function to generate random numbers from the Geometric distribution.

```
## initialize sample size to generate
n <- 100
# Simulate 100 values From Geometric dist
x_sim <- rgeom(n,prob)
# print values at console
x_sim
```

```
[1] 2 4 0 0 1 8 5 1 1 0 2 7 0 1 2 2 1 5 5 0 0 4 0 1 1 2 1 1 0 2 0 1 4 1 0 0 5
[38] 4 0 2 2 0 6 0 0 1 1 1 2 1 1 1 4 0 5 1 2 1 3 2 2 0 3 5 1 0 0 1 2 2 3 2 2 0
[75] 0 0 0 3 0 0 0 4 1 0 2 0 0 3 0 1 0 0 0 0 0 0 3 4 2 0
```

The frequency table for the simulated Geometric data `x_sim` can be obtained using the `table()` command.

```
## Print the frequency table
table(x_sim)
```

```
x_sim
0 1 2 3 4 5 6 7 8
37 23 18 6 7 6 1 1 1
```

```
## Plot the simulated data
plot(table(x_sim),xlab="x",ylab="frequency",
lwd=10,col="gray",
main="Simulated data from Geo(0.35) dist")
```
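A frequency table of simulated values can be set side by side with the counts expected under the Geometric model, $n \cdot P(X=x)$. The following is a sketch only; the seed `42` and the names `sim_demo`, `observed`, `expected` and `comparison` are assumptions, and the sample differs from `x_sim` above.

```r
# Arbitrary seed and parameters for this illustration (assumptions)
set.seed(42)
prob_demo <- 0.35
n_demo <- 100

# Simulate the sample
sim_demo <- rgeom(n_demo, prob_demo)

# All values from 0 up to the largest simulated value
values <- 0:max(sim_demo)

# Observed counts, including zero counts for values that did not occur
observed <- as.vector(table(factor(sim_demo, levels = values)))

# Expected counts under the Geometric model
expected <- n_demo * dgeom(values, prob_demo)

# Side-by-side comparison
comparison <- cbind(x = values, observed = observed,
                    expected = round(expected, 1))
comparison
```

Using `factor()` with explicit levels keeps a row for every value in the range, which a plain `table()` call would drop when a value never appears.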

If you run the same function again, R will generate another set of random numbers from $Geo(0.35)$.

```
# Simulate 100 values From Geometric dist
x_sim_2 <- rgeom(n,prob)
# print values at console
x_sim_2
```

```
[1] 0 1 0 1 5 0 5 1 3 0 0 0 0 0 1 5 0 0 1 2 1 5 0 0 0 0 0 1 0 2 1 0 1 0 0 0 0
[38] 4 1 2 3 5 2 0 0 0 0 0 2 0 1 2 0 3 5 4 0 0 0 3 0 6 0 6 2 2 2 4 1 1 2 0 5 0
[75] 4 3 0 0 1 7 3 4 1 0 3 4 1 0 1 1 0 1 2 2 0 1 2 0 3 0
```

The frequency table of the second set of simulated data is as follows:

```
# frequency table of simulated data
table(x_sim_2)
```

```
x_sim_2
0 1 2 3 4 5 6 7
43 20 13 8 6 7 2 1
```

```
plot(table(x_sim_2),xlab="x",ylab="frequency",
lwd=10,col="pink",
main="Simulated data from Geo(0.35) dist")
```

To reproduce the same set of random numbers in a simulation, one can use the `set.seed()` function.

```
# set seed for reproducibility
set.seed(1457)
# Simulate 100 values From Geometric dist
x_sim_3 <- rgeom(n,prob)
# print values at console
x_sim_3
```

```
[1] 1 0 0 0 0 0 0 2 3 5 4 3 1 5 0 0 1 1 2 1 4 3 0 0 0 1 4 0 0 4 2 0 0 0 1 1 6
[38] 0 0 4 1 1 1 0 0 1 2 2 5 0 0 1 1 2 5 2 2 0 2 3 1 1 0 0 0 0 0 0 4 0 1 1 0 1
[75] 1 0 1 4 0 0 0 0 0 0 0 1 0 3 5 2 1 2 1 4 5 1 6 0 0 2
```

The frequency table of `x_sim_3` is as follows:

`table(x_sim_3)`

```
x_sim_3
0 1 2 3 4 5 6
42 25 12 5 8 6 2
```

```
plot(table(x_sim_3),xlab="x",ylab="frequency",
lwd=10,col="purple",
main="Simulated data from Geo(0.35) dist")
```

```
set.seed(1457)
# Simulate 100 values From Geometric dist
x_sim_4 <- rgeom(n,prob)
# print values at console
x_sim_4
```

```
[1] 1 0 0 0 0 0 0 2 3 5 4 3 1 5 0 0 1 1 2 1 4 3 0 0 0 1 4 0 0 4 2 0 0 0 1 1 6
[38] 0 0 4 1 1 1 0 0 1 2 2 5 0 0 1 1 2 5 2 2 0 2 3 1 1 0 0 0 0 0 0 4 0 1 1 0 1
[75] 1 0 1 4 0 0 0 0 0 0 0 1 0 3 5 2 1 2 1 4 5 1 6 0 0 2
```

The frequency table of `x_sim_4` is as follows:

`table(x_sim_4)`

```
x_sim_4
0 1 2 3 4 5 6
42 25 12 5 8 6 2
```

```
plot(table(x_sim_4),xlab="x",ylab="frequency",
lwd=10,col="purple",
main="Simulated data from Geo(0.35) dist")
```

Since we called `set.seed(1457)` before both simulations, `x_sim_3` and `x_sim_4` are the same.
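Rather than comparing the printed vectors by eye, the equality can be checked programmatically. A minimal sketch using base R's `identical()`; the names `first_run`, `second_run` and `third_run` are hypothetical.

```r
# Success probability from the example
prob_demo <- 0.35

# Two simulations, each preceded by the same seed
set.seed(1457)
first_run <- rgeom(100, prob_demo)

set.seed(1457)
second_run <- rgeom(100, prob_demo)

# TRUE: resetting the seed reproduces the same sequence
identical(first_run, second_run)

# Without resetting the seed, the generator continues from its current
# state, so this sample will generally differ from the first two
third_run <- rgeom(100, prob_demo)
```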

To learn more about other discrete and continuous probability distributions using R, go through the following tutorials:

**Discrete Distributions Using R**

Binomial distribution in R

Poisson distribution in R

Negative Binomial distribution in R

Hypergeometric distribution in R

**Continuous Distributions Using R**

Uniform distribution in R

Exponential distribution in R

Normal distribution in R

Log-Normal distribution in R

Beta distribution in R

Gamma distribution in R

Cauchy distribution in R

Laplace distribution in R

Logistic distribution in R

Weibull distribution in R

## Endnote

In this tutorial, you learned how to compute the probabilities, cumulative probabilities and quantiles of the Geometric distribution in R. You also learned how to simulate random values from a Geometric distribution in R.

To learn more about R code for discrete and continuous probability distributions, please refer to the following tutorials:

Probability Distributions using R
