iv_locate_between() locates where needles, a vector, falls between the
bounds of haystack, an iv. It works similarly to base::match(), where
needles[i] checks for a match in all of haystack. Unlike match(), all
matches are returned, rather than just the first.
This function returns a two column data frame. The needles column is an
integer vector pointing to locations in needles. The haystack column is
an integer vector pointing to locations in haystack with a match.
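The shape of the returned data frame can be sketched as follows. This is a minimal illustration that assumes the ivs package is installed; the numeric `needles` and `haystack` values are made up for the sketch and do not come from the package documentation.

```r
library(ivs)

# Illustrative values (assumed for this sketch)
needles <- c(5, 10, 7, 20)
haystack <- iv_pairs(c(1, 3), c(4, 8), c(7, 9), c(10, 20), c(15, 20))

loc <- iv_locate_between(needles, haystack)

# Both columns hold integer locations, one row per (needle, interval) match.
# A needle that falls in two intervals appears twice; a needle with no
# match gets NA in the haystack column by default.
loc$needles
loc$haystack

# The locations can be used to slice the original inputs
matched_needles <- needles[loc$needles]
matched_intervals <- haystack[loc$haystack]
```

Here the third needle, 7, falls in both [4, 8) and [7, 9), so it occupies two rows, while 20 matches nothing because the intervals are right-open.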
Usage
iv_locate_between(
  needles,
  haystack,
  ...,
  missing = "equals",
  no_match = NA_integer_,
  remaining = "drop",
  multiple = "all"
)
Arguments
- needles, haystack
[vector, iv]
needles should be a vector and haystack should be an iv. needles should have the same type as the start/end components of haystack.
Each element of needles represents the value to search for. haystack represents the intervals to search in.
- ...
These dots are for future extensions and must be empty.
- missing
[integer(1) / "equals" / "drop" / "error"]
Handling of missing values in needles.
  - "equals" considers missing values in needles as exactly equal to missing intervals in haystack when determining if there is a matching relationship between them.
  - "drop" drops missing values in needles from the result.
  - "error" throws an error if any values in needles are missing.
  - If a single integer is provided, this represents the value returned in the haystack column for values in needles that are missing.
- no_match
Handling of needles without a match.
  - "drop" drops needles with zero matches from the result.
  - "error" throws an error if any needles have zero matches.
  - If a single integer is provided, this represents the value returned in the haystack column for observations of needles that have zero matches. The default represents an unmatched needle with NA.
- remaining
Handling of haystack values that needles never matched.
  - "drop" drops remaining haystack values from the result. Typically, this is the desired behavior if you only care when needles has a match.
  - "error" throws an error if there are any remaining haystack values.
  - If a single integer is provided (often NA), this represents the value returned in the needles column for the remaining haystack values that needles never matched. Remaining haystack values are always returned at the end of the result.
- multiple
Handling of needles with multiple matches. For each needle:
  - "all" returns all matches detected in haystack.
  - "any" returns any match detected in haystack with no guarantees on which match will be returned. It is often faster than "first" and "last" if you just need to detect if there is at least one match.
  - "first" returns the first match detected in haystack.
  - "last" returns the last match detected in haystack.
  - "warning" throws a warning if multiple matches are detected, but otherwise falls back to "all".
  - "error" throws an error if multiple matches are detected.
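The effect of the multiple, remaining, and no_match arguments described above can be sketched as follows. This is a hedged illustration that assumes the ivs package is installed; the object names `first`, `last`, `rem`, and `res` are chosen for the sketch only.

```r
library(ivs)

x <- as.Date(c("2019-01-05", "2019-01-10", "2019-01-07", "2019-01-20"))
y <- iv_pairs(
  as.Date(c("2019-01-01", "2019-01-03")),
  as.Date(c("2019-01-04", "2019-01-08")),
  as.Date(c("2019-01-07", "2019-01-09")),
  as.Date(c("2019-01-10", "2019-01-20")),
  as.Date(c("2019-01-15", "2019-01-20"))
)

# "2019-01-07" falls in both the 2nd and 3rd intervals.
# Keep only one match per needle, either the first or the last one.
first <- iv_locate_between(x, y, multiple = "first")
last <- iv_locate_between(x, y, multiple = "last")

# Also report intervals that no needle fell into. These rows are appended
# at the end, with NA in the needles column.
rem <- iv_locate_between(x, y, remaining = NA_integer_)

# Turn an unmatched needle into an error, captured here for inspection.
# "2019-01-20" has no match because the intervals are right-open.
res <- tryCatch(
  iv_locate_between(x, y, no_match = "error"),
  error = function(cnd) cnd
)
inherits(res, "error")
```

Choosing "first" or "last" guarantees exactly one row per needle, which is convenient when the result feeds a join that must not duplicate rows.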
Examples
x <- as.Date(c("2019-01-05", "2019-01-10", "2019-01-07", "2019-01-20"))
y <- iv_pairs(
  as.Date(c("2019-01-01", "2019-01-03")),
  as.Date(c("2019-01-04", "2019-01-08")),
  as.Date(c("2019-01-07", "2019-01-09")),
  as.Date(c("2019-01-10", "2019-01-20")),
  as.Date(c("2019-01-15", "2019-01-20"))
)
x
#> [1] "2019-01-05" "2019-01-10" "2019-01-07" "2019-01-20"
y
#> <iv<date>[5]>
#> [1] [2019-01-01, 2019-01-03) [2019-01-04, 2019-01-08) [2019-01-07, 2019-01-09)
#> [4] [2019-01-10, 2019-01-20) [2019-01-15, 2019-01-20)
# Find any location where `x` is between the intervals in `y`
loc <- iv_locate_between(x, y)
loc
#> needles haystack
#> 1 1 2
#> 2 2 4
#> 3 3 2
#> 4 3 3
#> 5 4 NA
iv_align(x, y, locations = loc)
#> needles haystack
#> 1 2019-01-05 [2019-01-04, 2019-01-08)
#> 2 2019-01-10 [2019-01-10, 2019-01-20)
#> 3 2019-01-07 [2019-01-04, 2019-01-08)
#> 4 2019-01-07 [2019-01-07, 2019-01-09)
#> 5 2019-01-20 [NA, NA)
# Drop values in `x` without a match
loc <- iv_locate_between(x, y, no_match = "drop")
loc
#> needles haystack
#> 1 1 2
#> 2 2 4
#> 3 3 2
#> 4 3 3
iv_align(x, y, locations = loc)
#> needles haystack
#> 1 2019-01-05 [2019-01-04, 2019-01-08)
#> 2 2019-01-10 [2019-01-10, 2019-01-20)
#> 3 2019-01-07 [2019-01-04, 2019-01-08)
#> 4 2019-01-07 [2019-01-07, 2019-01-09)
# ---------------------------------------------------------------------------
a <- c(1, NA)
b <- iv(c(NA, NA), c(NA, NA))
# By default, missing values in `needles` are treated as being exactly
# equal to missing intervals in `haystack`, so the missing value in `a` is
# considered between the missing interval in `b`.
iv_locate_between(a, b)
#> needles haystack
#> 1 1 NA
#> 2 2 1
#> 3 2 2
# If you'd like missing values in `needles` to always be considered
# unmatched, set `missing = NA`
iv_locate_between(a, b, missing = NA)
#> needles haystack
#> 1 1 NA
#> 2 2 NA
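The remaining choices for missing, "drop" and "error", can be sketched in the same way. This is a minimal illustration that assumes the ivs package is installed; the inputs `needles` and `haystack` below are made up for the sketch, with one ordinary interval and one missing interval.

```r
library(ivs)

# Illustrative values (assumed for this sketch)
needles <- c(1, NA)
haystack <- iv(c(0, NA), c(5, NA))

# Default: the missing needle matches the missing interval
equals <- iv_locate_between(needles, haystack)

# Drop the missing needle from the result entirely
dropped <- iv_locate_between(needles, haystack, missing = "drop")

# Treat a missing needle as an error, captured here for inspection
res <- tryCatch(
  iv_locate_between(needles, haystack, missing = "error"),
  error = function(cnd) cnd
)
inherits(res, "error")
```

With "drop", the result has no row at all for the missing needle, whereas missing = NA keeps the row and marks it as unmatched.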