Determine Amount of Missing Values in Data

Computes the proportion or count of missing values (NA) in a given data set: overall, by variable (column), or by observation (row).

Usage

na_prop_overall(x)

na_prop_by_variable(x)

na_prop_by_observation(x)

na_count_overall(x)

na_count_by_variable(x)

na_count_by_observation(x)

Arguments

x: A vector of length \(N\), or a matrix or data frame with dimensions \(N \times P\).

Value

Overall: a single proportion in [0, 1], or a single count in [0, N] for a vector and [0, \(N \times P\)] for a matrix or data frame.
Variable: \(P\) values, one per variable (column), each a proportion in [0, 1] or a count in [0, N].
Observation: \(N\) values, one per observation (row), each a proportion in [0, 1] or a count in [0, P].
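
The ranges above follow from how missingness can be tallied over an \(N \times P\) object. The following is a minimal sketch in base R of equivalent computations. It is not necessarily how these functions are implemented, and the variable names (miss, overall_count, and so on) are chosen here for illustration only.

```r
# Illustrative 4 x 2 matrix (N = 4 observations, P = 2 variables).
x <- cbind(a = c(1, 2, NA, 4), b = c(3, NA, 2, NA))

# Logical mask with the same shape as x: TRUE where a value is missing.
miss <- is.na(x)

# Overall: one count in [0, N * P] and one proportion in [0, 1].
overall_count <- sum(miss)
overall_prop  <- mean(miss)

# By variable: P values, one per column; counts lie in [0, N].
variable_count <- colSums(miss)
variable_prop  <- colMeans(miss)

# By observation: N values, one per row; counts lie in [0, P].
observation_count <- rowSums(miss)
observation_prop  <- rowMeans(miss)
```

Because is.na() also returns a logical matrix for a data frame, the same base R calls should apply to data frame input. A plain vector has no dimensions, so only the overall tallies (sum() and mean()) apply to it directly.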

Examples

# By vector
x = c(1, 2, NA, 4)
na_prop_overall(x)
#> [1] 0.25
na_count_overall(x)
#> [1] 1

# By Data Frame
missing_df = data.frame(
 a = c(1, 2, NA, 4),
 b = c(3, NA, 2, NA)
)

# Proportion
na_prop_overall(missing_df)
#> [1] 0.375
na_prop_by_variable(missing_df)
#>    a    b 
#> 0.25 0.50 
na_prop_by_observation(missing_df)
#> [1] 0.0 0.5 0.5 0.5

# Counts
na_count_overall(missing_df)
#> [1] 3
na_count_by_variable(missing_df)
#> a b 
#> 1 2 
na_count_by_observation(missing_df)
#> [1] 0 1 1 1