iv_diff() is a convenient way to generate an iv from a preexisting vector, as long as that vector is in strictly increasing order. It returns an iv that is 1 element shorter than x (unless x is already empty).

It is particularly useful for creating an iv column from an existing column inside of dplyr::mutate(), but requires you to explicitly handle padding in that case; see the examples.

Missing values are allowed, and will be propagated to each side of the resulting interval after applying the diff.

Usage

iv_diff(x)

Arguments

x

[vector]

A vector in strictly increasing order.

Value

An iv using x as the inner type, with size equal to max(0L, vec_size(x) - 1L).

Details

iv_diff() is inspired by diff().
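
The relationship to diff() can be sketched in base R. This is an illustration only: it assumes, as the examples suggest, that each interval pairs an element of x with its successor. The names starts, ends, and widths are chosen here for illustration, and the snippet does not call ivs.

```r
x <- c(1, 3, 6, 10)
n <- length(x)

# Pair each element with its successor (illustrative names, not ivs API)
starts <- x[-n]
ends <- x[-1]

# `diff()` returns only the distance between neighboring elements
widths <- diff(x)

# Those distances are exactly the gaps between the paired endpoints
identical(ends - starts, widths)
#> [1] TRUE

data.frame(start = starts, end = ends, width = widths)
```

Where diff() keeps only the widths, iv_diff() keeps both endpoints of each pair as an interval.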

Examples

x <- as.Date("2019-01-01") + c(0, 5, 7, 10, 19)
x
#> [1] "2019-01-01" "2019-01-06" "2019-01-08" "2019-01-11" "2019-01-20"

# Notice how the boundaries don't overlap, because the closing `)` aligns
# with an opening `[`.
iv_diff(x)
#> <iv<date>[4]>
#> [1] [2019-01-01, 2019-01-06) [2019-01-06, 2019-01-08) [2019-01-08, 2019-01-11)
#> [4] [2019-01-11, 2019-01-20)

# As with `iv()`, missing values propagate to both boundaries of the interval.
# Before missing value propagation is applied, the intervals would look like
# this: [1, NA), [NA, 2), [2, 3)
x <- c(1, NA, 2, 3)
iv_diff(x)
#> <iv<double>[3]>
#> [1] [NA, NA) [NA, NA) [2, 3)  

# Values in `x` must be in strictly increasing order to generate a valid
# interval vector
x <- c(1, 0, 2, 2)
try(iv_diff(x))
#> Error in iv_diff(x) : `x` must be in strictly increasing order.
#>  `x` is not in strictly increasing order at locations: `c(2, 4)`.

x <- c(1, NA, 0)
try(iv_diff(x))
#> Error in iv_diff(x) : `x` must be in strictly increasing order.
#>  `x` is not in strictly increasing order at locations: `3`.
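
# A base R pre-check you could run before calling `iv_diff()`. This is only a
# sketch: `is_strictly_increasing()` is a hypothetical helper, not part of
# ivs, and it assumes (based on the errors above) that missing values are
# skipped when the ordering is checked.
is_strictly_increasing <- function(x) {
  x <- x[!is.na(x)]
  all(diff(x) > 0)
}
is_strictly_increasing(c(1, NA, 2, 3))
#> [1] TRUE
is_strictly_increasing(c(1, 0, 2, 2))
#> [1] FALSE
is_strictly_increasing(c(1, NA, 0))
#> [1] FALSE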

# ---------------------------------------------------------------------------
# Use with `mutate()`

library(dplyr)

# `iv_diff()` is useful for converting a pre-existing column into an interval
# vector, but you'll need to apply padding to ensure that the size of the
# diff-ed result is the same as the number of rows in your data frame. There
# are two main ways to pad, which are explored below.
df <- tibble(x = c(1, 3, 6))

# Pad with a known lower/upper bound
df %>% mutate(iv = iv_diff(c(0, x)))
#> # A tibble: 3 × 2
#>       x        iv
#>   <dbl> <iv<dbl>>
#> 1     1    [0, 1)
#> 2     3    [1, 3)
#> 3     6    [3, 6)
df %>% mutate(iv = iv_diff(c(x, Inf)))
#> # A tibble: 3 × 2
#>       x        iv
#>   <dbl> <iv<dbl>>
#> 1     1    [1, 3)
#> 2     3    [3, 6)
#> 3     6  [6, Inf)

# Pad with a missing value, which results in a fully missing interval
df %>% mutate(iv = iv_diff(c(NA, x)))
#> # A tibble: 3 × 2
#>       x        iv
#>   <dbl> <iv<dbl>>
#> 1     1  [NA, NA)
#> 2     3    [1, 3)
#> 3     6    [3, 6)
df %>% mutate(iv = iv_diff(c(x, NA)))
#> # A tibble: 3 × 2
#>       x        iv
#>   <dbl> <iv<dbl>>
#> 1     1    [1, 3)
#> 2     3    [3, 6)
#> 3     6  [NA, NA)