This family of functions revolves around computing interval containers. A container is defined as the widest interval that isn't contained by any other interval.
`iv_containers()` returns all of the containers found within `x`.
`iv_identify_containers()` identifies the containers that each interval in `x` falls in. It replaces `x` with a list of the same size where each element of the list contains the containers that the corresponding interval in `x` falls in. This is particularly useful alongside `tidyr::unnest()`.
`iv_identify_container()` is similar in spirit to `iv_identify_containers()`, but is useful when you suspect that each interval in `x` is contained within exactly 1 container. It replaces `x` with an iv of the same size where each interval is the container that the corresponding interval in `x` falls in. If any interval falls in more than one container, an error is thrown.
`iv_locate_containers()` returns a two column data frame with a `key` column containing the result of `iv_containers()` and a `loc` list-column containing integer vectors that map each interval in `x` to the container that it falls in.
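To make the definition of a container concrete, here is a minimal base R sketch of the containment test on plain start and end vectors. The helper `find_containers()` is hypothetical and is not part of ivs. It assumes half-open `[start, end)` intervals with no missing values, and it is a quadratic-time illustration of the definition, not a description of how the package computes containers.

```r
# Hypothetical helper (not part of ivs): keep the intervals that are not
# contained in any other interval. Assumes half-open [start, end) intervals
# and no missing values.
find_containers <- function(start, end) {
  # Drop exact duplicates so that identical intervals do not hide each other
  intervals <- unique(data.frame(start = start, end = end))
  # Sort ascending by start, then end
  intervals <- intervals[order(intervals$start, intervals$end), ]

  n <- nrow(intervals)
  keep <- logical(n)

  for (i in seq_len(n)) {
    others <- seq_len(n) != i
    # Interval i is contained if some other interval starts at or before it
    # and ends at or after it
    contained <- any(
      intervals$start[others] <= intervals$start[i] &
        intervals$end[others] >= intervals$end[i]
    )
    keep[i] <- !contained
  }

  out <- intervals[keep, ]
  rownames(out) <- NULL
  out
}

# The non-missing intervals used in the Examples section
res <- find_containers(
  start = c(4, 1, 2, 9, 9),
  end = c(6, 5, 3, 12, 14)
)
res
```

On these inputs the sketch keeps `[1, 5)`, `[4, 6)`, and `[9, 14)`, which are the non-missing containers reported by `iv_containers()` in the Examples section.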
Value
For `iv_containers()`, an iv with the same type as `x`.
For `iv_identify_containers()`, a list-of containing ivs with the same size as `x`.
For `iv_identify_container()`, an iv with the same type as `x`.
For `iv_locate_containers()`, a two column data frame with a `key` column containing the result of `iv_containers()` and a `loc` list-column containing integer vectors.
Examples
library(ivs)
library(dplyr, warn.conflicts = FALSE)
library(tidyr)
x <- iv_pairs(
  c(4, 6),
  c(1, 5),
  c(2, 3),
  c(NA, NA),
  c(NA, NA),
  c(9, 12),
  c(9, 14)
)
x
#> <iv<double>[7]>
#> [1] [4, 6) [1, 5) [2, 3) [NA, NA) [NA, NA) [9, 12) [9, 14)
# Containers are intervals which aren't contained in any other interval.
# They are always returned in ascending order.
# If any missing intervals are present, a single one is retained.
iv_containers(x)
#> <iv<double>[4]>
#> [1] [1, 5) [4, 6) [9, 14) [NA, NA)
# `iv_identify_container()` is useful alongside `group_by()` and
# `summarize()` if you know that each interval is contained within exactly
# 1 container
df <- tibble(x = x)
df <- mutate(df, container = iv_identify_container(x))
df
#> # A tibble: 7 × 2
#>           x container
#>   <iv<dbl>> <iv<dbl>>
#> 1    [4, 6)    [4, 6)
#> 2    [1, 5)    [1, 5)
#> 3    [2, 3)    [1, 5)
#> 4  [NA, NA)  [NA, NA)
#> 5  [NA, NA)  [NA, NA)
#> 6   [9, 12)   [9, 14)
#> 7   [9, 14)   [9, 14)
df %>%
  group_by(container) %>%
  summarize(n = n())
#> # A tibble: 4 × 2
#>   container     n
#>   <iv<dbl>> <int>
#> 1    [1, 5)     2
#> 2    [4, 6)     1
#> 3   [9, 14)     2
#> 4  [NA, NA)     2
# If any interval is contained within multiple containers,
# then you can't use `iv_identify_container()`
y <- c(x, iv_pairs(c(0, 3), c(8, 13)))
y
#> <iv<double>[9]>
#> [1] [4, 6) [1, 5) [2, 3) [NA, NA) [NA, NA) [9, 12) [9, 14) [0, 3)
#> [9] [8, 13)
try(iv_identify_container(y))
#> Error in iv_identify_container(y) :
#> Intervals in `x` can't fall within multiple containers.
#> ℹ Location 3 falls within multiple containers.
#> ℹ Use `iv_identify_containers()` to identify all of the containers that a particular interval is contained by.
# Instead, use `iv_identify_containers()` to identify every container
# that each interval falls in
df <- tibble(y = y, container = iv_identify_containers(y))
df
#> # A tibble: 9 × 2
#>           y       container
#>   <iv<dbl>> <list<iv<dbl>>>
#> 1    [4, 6)             [1]
#> 2    [1, 5)             [1]
#> 3    [2, 3)             [2]
#> 4  [NA, NA)             [1]
#> 5  [NA, NA)             [1]
#> 6   [9, 12)             [2]
#> 7   [9, 14)             [1]
#> 8    [0, 3)             [1]
#> 9   [8, 13)             [1]
# You can use `tidyr::unchop()` to see the containers that each interval
# falls in
df %>%
  mutate(row = row_number(), .before = 1) %>%
  unchop(container)
#> # A tibble: 11 × 3
#>      row         y container
#>    <int> <iv<dbl>> <iv<dbl>>
#>  1     1    [4, 6)    [4, 6)
#>  2     2    [1, 5)    [1, 5)
#>  3     3    [2, 3)    [0, 3)
#>  4     3    [2, 3)    [1, 5)
#>  5     4  [NA, NA)  [NA, NA)
#>  6     5  [NA, NA)  [NA, NA)
#>  7     6   [9, 12)   [8, 13)
#>  8     6   [9, 12)   [9, 14)
#>  9     7   [9, 14)   [9, 14)
#> 10     8    [0, 3)    [0, 3)
#> 11     9   [8, 13)   [8, 13)
# A more programmatic interface to `iv_identify_containers()` is
# `iv_locate_containers()`, which returns the containers you get from
# `iv_containers()` alongside the locations in the input that they contain.
iv_locate_containers(y)
#>        key  loc
#> 1   [0, 3) 3, 8
#> 2   [1, 5) 2, 3
#> 3   [4, 6)    1
#> 4  [8, 13) 6, 9
#> 5  [9, 14) 6, 7
#> 6 [NA, NA) 4, 5
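The `key` and `loc` structure shown above can be reproduced by hand, which may help when reading the `loc` column. The following is a base R sketch only: it represents the intervals in `y` as plain start and end vectors, takes the non-missing keys directly from the printed output rather than computing them, and skips the missing-interval row. The variable names are illustrative and are not part of ivs.

```r
# Intervals from `y` as plain vectors; the two missing intervals are NA
start <- c(4, 1, 2, NA, NA, 9, 9, 0, 8)
end <- c(6, 5, 3, NA, NA, 12, 14, 3, 13)

# Non-missing keys, copied from the printed `key` column
key_start <- c(0, 1, 4, 8, 9)
key_end <- c(3, 5, 6, 13, 14)

# For each key, find the positions in `y` whose interval lies inside it.
# `which()` drops the NA comparisons produced by the missing intervals.
loc <- lapply(seq_along(key_start), function(i) {
  which(key_start[i] <= start & end <= key_end[i])
})

loc
```

Each element of `loc` lines up with the corresponding non-missing row of the `loc` column printed above. Positions 3 and 6 each appear under two keys, which is why `iv_identify_container()` cannot be used on `y`.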