
This family of functions counts different types of relationships between two ivs. It works similarly to base::match(): each needles[i] is checked for a relationship against every interval in haystack.

  • iv_count_overlaps() counts instances of a specific type of overlap between the two ivs.

  • iv_count_precedes() counts the intervals in haystack that needles[i] precedes (i.e. comes before).

  • iv_count_follows() counts the intervals in haystack that needles[i] follows (i.e. comes after).

These functions return an integer vector the same size as needles containing a count of the times a particular relationship between the i-th interval of needles and any interval of haystack occurred.
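As a rough illustration of what a count means, the sketch below recomputes the default "any" overlap count by hand and compares it with iv_count_overlaps(). The interval values and the name manual are made up for this example, and the hand-written comparison assumes the half-open [start, end) intervals shown in the Examples; it is not a description of how the package computes the result internally.

```r
library(ivs)

needles <- iv_pairs(c(1, 5), c(6, 7))
haystack <- iv_pairs(c(0, 2), c(4, 8), c(7, 9))

needle_start <- iv_start(needles)
needle_end <- iv_end(needles)

# For each needle, count the haystack intervals it overlaps.
# Two half-open intervals overlap when each one starts before the other ends.
manual <- vapply(seq_along(needle_start), function(i) {
  sum(needle_start[i] < iv_end(haystack) & needle_end[i] > iv_start(haystack))
}, integer(1))

manual
#> [1] 2 1

iv_count_overlaps(needles, haystack)
#> [1] 2 1
```

The second needle, [6, 7), does not overlap [7, 9) because the right endpoint is exclusive.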

Usage

iv_count_overlaps(
  needles,
  haystack,
  ...,
  type = "any",
  missing = "equals",
  no_match = 0L
)

iv_count_precedes(
  needles,
  haystack,
  ...,
  closest = FALSE,
  missing = "equals",
  no_match = 0L
)

iv_count_follows(
  needles,
  haystack,
  ...,
  closest = FALSE,
  missing = "equals",
  no_match = 0L
)

Arguments

needles

[iv]

Interval vectors used for relation matching.

  • Each element of needles represents the interval to search for.

  • haystack represents the intervals to search in.

Prior to comparison, needles and haystack are coerced to the same type.

haystack

[iv]

Interval vectors used for relation matching.

  • Each element of needles represents the interval to search for.

  • haystack represents the intervals to search in.

Prior to comparison, needles and haystack are coerced to the same type.

...

These dots are for future extensions and must be empty.

type

[character(1)]

The type of relationship to find. One of:

  • "any": Finds any overlap whatsoever between an interval in needles and an interval in haystack.

  • "within": Finds when an interval in needles is completely within (or equal to) an interval in haystack.

  • "contains": Finds when an interval in needles completely contains (or equals) an interval in haystack.

  • "equals": Finds when an interval in needles is exactly equal to an interval in haystack.

  • "starts": Finds when the start of an interval in needles matches the start of an interval in haystack.

  • "ends": Finds when the end of an interval in needles matches the end of an interval in haystack.
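The following sketch runs a single needle against a small haystack once per type, which makes the differences between the options easier to see. The interval values and the names types and counts are illustrative assumptions, not part of the package.

```r
library(ivs)

needles <- iv_pairs(c(2, 5))
haystack <- iv_pairs(c(1, 6), c(2, 5), c(2, 3), c(4, 5), c(5, 8))

types <- c("any", "within", "contains", "equals", "starts", "ends")

# One count per `type` for the single needle [2, 5)
counts <- vapply(
  types,
  function(type) iv_count_overlaps(needles, haystack, type = type),
  integer(1)
)

counts
#>      any   within contains   equals   starts     ends 
#>        4        2        3        1        2        2 
```

Here [5, 8) is never counted under "any" because it only touches the exclusive right endpoint of the needle, and [2, 5) is counted under every type because it equals the needle.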

missing

[integer(1) / "equals" / "error"]

Handling of missing intervals in needles.

  • "equals" considers missing intervals in needles as exactly equal to missing intervals in haystack when determining if there is a matching relationship between them.

  • "error" throws an error if any intervals in needles are missing.

  • If a single integer value is provided, this represents the count returned for a missing interval in needles. Use 0L to force missing intervals to never match.

no_match

[integer(1) / "error"]

Handling of needles without a match.

  • "error" throws an error if any needles have zero matches.

  • If a single integer is provided, this represents the count returned for a needle with zero matches. The default value gives unmatched needles a count of 0L.
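A short sketch of the two no_match modes, using made-up intervals. The choice of NA_integer_ as the replacement count is only an example; any single integer should behave the same way. The tryCatch() wrapper and the name result are there purely so the snippet runs to completion.

```r
library(ivs)

needles <- iv_pairs(c(1, 3), c(10, 12))
haystack <- iv_pairs(c(2, 4))

# Default: the unmatched needle [10, 12) gets a count of 0
iv_count_overlaps(needles, haystack)
#> [1] 1 0

# Report unmatched needles as NA rather than 0
iv_count_overlaps(needles, haystack, no_match = NA_integer_)
#> [1]  1 NA

# Treat an unmatched needle as an error; caught here so the script continues
result <- tryCatch(
  iv_count_overlaps(needles, haystack, no_match = "error"),
  error = function(cnd) "unmatched needle"
)
result
#> [1] "unmatched needle"
```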

closest

[TRUE / FALSE]

Should only the closest relationship be returned?

If TRUE, only the closest interval(s) in haystack that the current value of needles either precedes or follows are counted. Note that more than one interval can still be counted if there are ties.
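The effect of closest, including a tie, can be sketched as follows. The intervals are invented for illustration.

```r
library(ivs)

needles <- iv_pairs(c(1, 2))
haystack <- iv_pairs(c(3, 5), c(3, 9), c(6, 7))

# The needle [1, 2) comes before all three haystack intervals
iv_count_precedes(needles, haystack)
#> [1] 3

# Only the closest ones are counted; two intervals start at 3, so they tie
iv_count_precedes(needles, haystack, closest = TRUE)
#> [1] 2

# The same idea for "follows": [10, 12) comes after all of `earlier`,
# but only [5, 8) ends closest to its start
later <- iv_pairs(c(10, 12))
earlier <- iv_pairs(c(1, 4), c(2, 4), c(5, 8))

iv_count_follows(later, earlier)
#> [1] 3

iv_count_follows(later, earlier, closest = TRUE)
#> [1] 1
```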

Value

An integer vector the same size as needles.

Examples

library(ivs)
library(vctrs)
#> 
#> Attaching package: ‘vctrs’
#> The following object is masked from ‘package:dplyr’:
#> 
#>     data_frame

x <- iv_pairs(
  as.Date(c("2019-01-05", "2019-01-10")),
  as.Date(c("2019-01-07", "2019-01-15")),
  as.Date(c("2019-01-20", "2019-01-31"))
)

y <- iv_pairs(
  as.Date(c("2019-01-01", "2019-01-03")),
  as.Date(c("2019-01-04", "2019-01-08")),
  as.Date(c("2019-01-07", "2019-01-09")),
  as.Date(c("2019-01-10", "2019-01-20")),
  as.Date(c("2019-01-15", "2019-01-20"))
)

x
#> <iv<date>[3]>
#> [1] [2019-01-05, 2019-01-10) [2019-01-07, 2019-01-15) [2019-01-20, 2019-01-31)
y
#> <iv<date>[5]>
#> [1] [2019-01-01, 2019-01-03) [2019-01-04, 2019-01-08) [2019-01-07, 2019-01-09)
#> [4] [2019-01-10, 2019-01-20) [2019-01-15, 2019-01-20)

# Count the number of times `x` overlaps `y` at all
iv_count_overlaps(x, y)
#> [1] 2 3 0

# Count the number of times `y` is within an interval in `x`
iv_count_overlaps(y, x, type = "within")
#> [1] 0 0 2 0 0

# Count the number of times `x` precedes `y`
iv_count_precedes(x, y)
#> [1] 2 1 0

# ---------------------------------------------------------------------------

a <- iv(c(1, NA), c(2, NA))
b <- iv(c(NA, NA), c(NA, NA))

# Missing intervals are seen as exactly equal by default, so they are
# considered to overlap
iv_count_overlaps(a, b)
#> [1] 0 2

# If you'd like missing intervals to be treated as unmatched, set
# `missing = 0L`
iv_count_overlaps(a, b, missing = 0L)
#> [1] 0 0

# If you'd like to propagate missing intervals, set `missing = NA`
iv_count_overlaps(a, b, missing = NA)
#> [1]  0 NA