
cut_ages() categorises ages based on specified breaks, which represent the left-hand interval limits. The resulting intervals span from the minimum break through to a specified max_upper and are always closed on the left and open on the right. Ages below the minimum break, or at or above max_upper, are returned as NA.
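The interval semantics described above can be sketched in base R. This is an illustration only, not the package's implementation: sketch_cut_ages() is a hypothetical helper, it assumes max_upper is strictly greater than the largest break, and it skips the integer coercion and input checks that cut_ages() performs.

```r
# Hypothetical helper, base R only; NOT the implementation of cut_ages().
# Assumes max_upper > max(breaks) and performs no input validation.
sketch_cut_ages <- function(ages, breaks, max_upper = Inf) {
  # The final interval runs from the last break up to max_upper.
  limits <- c(breaks, max_upper)
  # right = FALSE gives intervals closed on the left and open on the right;
  # values outside [min(breaks), max_upper) come back as NA.
  idx <- as.integer(cut(ages, breaks = limits, right = FALSE))
  data.frame(
    age = ages,
    lower_bound = limits[idx],
    upper_bound = limits[idx + 1L]
  )
}

out <- sketch_cut_ages(ages = c(0, 4, 5, 6, 7, 10), breaks = c(0, 5), max_upper = 7)
```

In this sketch the age 7 gets NA bounds because it equals max_upper and so falls outside the half-open interval [5, 7).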

Usage

cut_ages(ages, breaks, max_upper = Inf)

Arguments

ages

[numeric].

Vector of age values.

Double values are coerced to integer prior to categorisation / aggregation.

Must not be NA.

breaks

[integerish].

One or more non-negative cut points in strictly increasing order.

These correspond to the left-hand side of the desired intervals (i.e. the closed side of [x, y)).

Double values are coerced to integer prior to categorisation.

max_upper

[numeric].

Represents the maximum upper bound for the resulting intervals.

Double values are rounded up to the nearest (numeric) integer.

Defaults to Inf.
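As a rough illustration of what rounding up to a "(numeric) integer" could look like, the base R function ceiling() returns a whole number that is still stored as double and leaves Inf unchanged. Whether cut_ages() uses ceiling() internally is an assumption; the variable names below are illustrative only.

```r
# Assumption: the documented rounding behaves like base R's ceiling().
rounded <- ceiling(7.2)        # whole number, still of type double
rounded_inf <- ceiling(Inf)    # Inf is left unchanged
rounded_type <- typeof(rounded)
```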

Value

A data frame with an ordered factor column (interval) and two columns giving the explicit bounds (lower_bound and upper_bound). Both bound columns are stored as double. As part of the function API, lower_bound can always be coerced to integer without producing NA_integer_. The same holds for every value of upper_bound except those equal to max_upper, which may or may not be coercible depending on the argument supplied.
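The coercion guarantee can be illustrated with a small sketch. The data frame below is a hand-built stand-in with the documented column names, not output produced by cut_ages(), and the handling of infinite upper bounds is one possible approach rather than a recommendation from the package.

```r
# Hand-built stand-in for the documented bound columns (assumption: not
# generated by cut_ages(); values chosen to include an infinite upper bound).
bounds <- data.frame(
  lower_bound = c(0, 0, 5),
  upper_bound = c(5, 5, Inf)
)

# lower_bound is documented as always coercible to integer.
lower_int <- as.integer(bounds$lower_bound)

# upper_bound may hold a non-coercible max_upper such as Inf, so coerce only
# the finite values and leave the rest as NA_integer_.
finite <- is.finite(bounds$upper_bound)
upper_int <- rep(NA_integer_, nrow(bounds))
upper_int[finite] <- as.integer(bounds$upper_bound[finite])
```

Coercing only the finite entries avoids the warning that as.integer(Inf) would otherwise raise.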

Examples


cut_ages(ages = 0:9, breaks = c(0L, 3L, 5L, 10L))
#> # A tibble: 10 × 3
#>    interval lower_bound upper_bound
#>    <ord>          <dbl>       <dbl>
#>  1 [0, 3)             0           3
#>  2 [0, 3)             0           3
#>  3 [0, 3)             0           3
#>  4 [3, 5)             3           5
#>  5 [3, 5)             3           5
#>  6 [5, 10)            5          10
#>  7 [5, 10)            5          10
#>  8 [5, 10)            5          10
#>  9 [5, 10)            5          10
#> 10 [5, 10)            5          10

cut_ages(ages = 0:9, breaks = c(0L, 5L))
#> # A tibble: 10 × 3
#>    interval lower_bound upper_bound
#>    <ord>          <dbl>       <dbl>
#>  1 [0, 5)             0           5
#>  2 [0, 5)             0           5
#>  3 [0, 5)             0           5
#>  4 [0, 5)             0           5
#>  5 [0, 5)             0           5
#>  6 [5, Inf)           5         Inf
#>  7 [5, Inf)           5         Inf
#>  8 [5, Inf)           5         Inf
#>  9 [5, Inf)           5         Inf
#> 10 [5, Inf)           5         Inf

# Note the following is comparable to a call to
# cut(ages, right = FALSE, breaks = c(breaks, Inf))
ages <- seq.int(from = 0, by = 10, length.out = 10)
breaks <- c(0, 1, 10, 30)
cut_ages(ages, breaks)
#> # A tibble: 10 × 3
#>    interval  lower_bound upper_bound
#>    <ord>           <dbl>       <dbl>
#>  1 [0, 1)              0           1
#>  2 [10, 30)           10          30
#>  3 [10, 30)           10          30
#>  4 [30, Inf)          30         Inf
#>  5 [30, Inf)          30         Inf
#>  6 [30, Inf)          30         Inf
#>  7 [30, Inf)          30         Inf
#>  8 [30, Inf)          30         Inf
#>  9 [30, Inf)          30         Inf
#> 10 [30, Inf)          30         Inf

# values at or above max_upper are treated as NA
cut_ages(ages = 0:10, breaks = c(0,5), max_upper = 7)
#> # A tibble: 11 × 3
#>    interval lower_bound upper_bound
#>    <ord>          <dbl>       <dbl>
#>  1 [0, 5)             0           5
#>  2 [0, 5)             0           5
#>  3 [0, 5)             0           5
#>  4 [0, 5)             0           5
#>  5 [0, 5)             0           5
#>  6 [5, 7)             5           7
#>  7 [5, 7)             5           7
#>  8 NA                NA          NA
#>  9 NA                NA          NA
#> 10 NA                NA          NA
#> 11 NA                NA          NA