
iv_locate_relates() is similar to iv_locate_overlaps(), but it locates one of a specific set of relations developed by James Allen in the paper "Maintaining Knowledge about Temporal Intervals". The relation to locate is chosen with type.

Usage

iv_locate_relates(
needles,
haystack,
...,
type,
missing = "equals",
no_match = NA_integer_,
remaining = "drop",
multiple = "all",
relationship = "none"
)

Arguments

needles, haystack

[iv]

Interval vectors used for relation matching.

• Each element of needles represents the interval to search for.

• haystack represents the intervals to search in.

Prior to comparison, needles and haystack are coerced to the same type.

...

These dots are for future extensions and must be empty.

type

[character(1)]

The type of relationship to find. See the Allen's Interval Algebra section for a complete description of each type. One of:

• "precedes"

• "preceded-by"

• "meets"

• "met-by"

• "overlaps"

• "overlapped-by"

• "starts"

• "started-by"

• "during"

• "contains"

• "finishes"

• "finished-by"

• "equals"

missing

[integer(1) / "equals" / "drop" / "error"]

Handling of missing intervals in needles.

• "equals" considers missing intervals in needles as exactly equal to missing intervals in haystack when determining if there is a matching relationship between them.

• "drop" drops missing intervals in needles from the result.

• "error" throws an error if any intervals in needles are missing.

• If a single integer is provided, this represents the value returned in the haystack column for intervals in needles that are missing.
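
As a rough illustration of the missing options above, the following sketch builds needles and a haystack that each contain one missing interval and runs the same "equals" search three ways. The variable names are illustrative only, and the sketch assumes that iv() yields a missing interval when its endpoints are NA.

```r
library(ivs)

# The second needle and the first haystack interval are missing
needles <- iv(c(1, NA), c(3, NA))
haystack <- iv(c(NA, 1), c(NA, 3))

# Default: a missing needle is treated as equal to a missing haystack interval
missing_equals <- iv_locate_relates(needles, haystack, type = "equals")

# Drop missing needles from the result entirely
missing_dropped <- iv_locate_relates(
  needles, haystack,
  type = "equals",
  missing = "drop"
)

# Keep missing needles, but report a fixed value in the haystack column
missing_flagged <- iv_locate_relates(
  needles, haystack,
  type = "equals",
  missing = NA_integer_
)
```

Comparing the three data frames side by side shows how the row for the missing needle is matched, dropped, or flagged.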

no_match

Handling of needles without a match.

• "drop" drops needles with zero matches from the result.

• "error" throws an error if any needles have zero matches.

• If a single integer is provided, this represents the value returned in the haystack column for values of needles that have zero matches. The default represents an unmatched needle with NA.

remaining

Handling of haystack values that needles never matched.

• "drop" drops remaining haystack values from the result. Typically, this is the desired behavior if you only care when needles has a match.

• "error" throws an error if there are any remaining haystack values.

• If a single integer is provided (often NA), this represents the value returned in the needles column for the remaining haystack values that needles never matched. Remaining haystack values are always returned at the end of the result.
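
The interaction between no_match and remaining can be sketched as follows. This is a minimal example with made-up intervals: only the first needle meets anything, and the second haystack interval is never matched.

```r
library(ivs)

needles <- iv_pairs(c(1, 3), c(10, 12))
haystack <- iv_pairs(c(3, 4), c(20, 25))

# Default: the unmatched needle is kept, with NA in the haystack column
kept <- iv_locate_relates(needles, haystack, type = "meets")

# Only keep needles that have a match
matched_only <- iv_locate_relates(
  needles, haystack,
  type = "meets",
  no_match = "drop"
)

# Additionally report haystack intervals that no needle matched;
# these rows are appended at the end with NA in the needles column
with_remaining <- iv_locate_relates(
  needles, haystack,
  type = "meets",
  remaining = NA_integer_
)
```

With remaining = NA_integer_, the result behaves somewhat like a full join on locations: every needle and every haystack interval appears at least once.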

multiple

Handling of needles with multiple matches. For each needle:

• "all" returns all matches detected in haystack.

• "any" returns any match detected in haystack with no guarantees on which match will be returned. It is often faster than "first" and "last" if you just need to detect if there is at least one match.

• "first" returns the first match detected in haystack.

• "last" returns the last match detected in haystack.
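
To make the multiple options concrete, here is a small sketch in which a single needle precedes all three haystack intervals. The data is invented for illustration.

```r
library(ivs)

needles <- iv(1, 3)
haystack <- iv_pairs(c(5, 6), c(4, 8), c(7, 9))

# One row per match
all_matches <- iv_locate_relates(needles, haystack, type = "precedes")

# Only the first or last match, by position in haystack
first_match <- iv_locate_relates(
  needles, haystack,
  type = "precedes",
  multiple = "first"
)
last_match <- iv_locate_relates(
  needles, haystack,
  type = "precedes",
  multiple = "last"
)

# "any" returns exactly one of the matches, with no guarantee on which one
any_match <- iv_locate_relates(
  needles, haystack,
  type = "precedes",
  multiple = "any"
)
```

Note that "first" and "last" refer to positions in haystack, not to the earliest or latest interval in time.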

relationship

Handling of the expected relationship between needles and haystack. If the expectations chosen from the list below are invalidated, an error is thrown.

• "none" doesn't perform any relationship checks.

• "one-to-one" expects:

• Each value in needles matches at most 1 value in haystack.

• Each value in haystack matches at most 1 value in needles.

• "one-to-many" expects:

• Each value in needles matches any number of values in haystack.

• Each value in haystack matches at most 1 value in needles.

• "many-to-one" expects:

• Each value in needles matches at most 1 value in haystack.

• Each value in haystack matches any number of values in needles.

• "many-to-many" expects:

• Each value in needles matches any number of values in haystack.

• Each value in haystack matches any number of values in needles.

This performs no checks, and is identical to "none", but is provided to allow you to be explicit about this relationship if you know it exists.

• "warn-many-to-many" doesn't assume there is any known relationship, but will warn if needles and haystack have a many-to-many relationship (which is typically unexpected), encouraging you to either take a closer look at your inputs or make this relationship explicit by specifying "many-to-many".

relationship is applied after multiple to allow potential multiple matches to be filtered out first.

relationship doesn't handle cases where there are zero matches. For that, see no_match and remaining.
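
The sketch below shows one way the relationship check might be used, under the assumption that a violated expectation surfaces as an ordinary R error that tryCatch() can intercept. Both needles meet both haystack intervals, so the raw matches are many-to-many.

```r
library(ivs)

needles <- iv_pairs(c(1, 3), c(2, 3))
haystack <- iv_pairs(c(3, 5), c(3, 8))

# No checks: each needle meets both haystack intervals
all_matches <- iv_locate_relates(needles, haystack, type = "meets")

# "many-to-one" requires each needle to match at most one haystack value,
# so this call is expected to fail; the error is caught and replaced
strict <- tryCatch(
  iv_locate_relates(
    needles, haystack,
    type = "meets",
    relationship = "many-to-one"
  ),
  error = function(cnd) "error"
)

# multiple is applied first, so reducing to the first match per needle
# leaves a valid many-to-one relationship
first_only <- iv_locate_relates(
  needles, haystack,
  type = "meets",
  multiple = "first",
  relationship = "many-to-one"
)
```

The last call illustrates the ordering: because multiple = "first" leaves one match per needle, the check sees a relationship that satisfies "many-to-one".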

Value

A data frame containing two integer columns named needles and haystack.
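
The two columns hold integer locations into needles and haystack, so they can be used to pull out the matching intervals. A minimal sketch, using iv_align() from the same package and made-up data:

```r
library(ivs)

needles <- iv_pairs(c(1, 3), c(5, 6))
haystack <- iv_pairs(c(3, 4), c(6, 9))

# Integer locations of each needle and the haystack interval it meets
locations <- iv_locate_relates(
  needles, haystack,
  type = "meets",
  no_match = "drop"
)

# Turn the locations into a data frame of the matched intervals themselves
aligned <- iv_align(needles, haystack, locations = locations)

# The same intervals can also be extracted by direct subsetting
matched_needles <- needles[locations$needles]
matched_haystack <- haystack[locations$haystack]
```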

Allen's Interval Algebra

The interval algebra developed by James Allen serves as a basis and inspiration for iv_locate_overlaps(), iv_locate_precedes(), and iv_locate_follows(). The original algebra is composed of 13 relations which have the following properties:

• Distinct: No pair of intervals can be related by more than one type.

• Exhaustive: All pairs of intervals are described by one of the types.

• Qualitative: No numeric intervals are considered. The relationships are computed by purely qualitative means.

Let x and y denote two intervals. Write x as [x_s, x_e), where x_s is the start of the interval and x_e is the end of it, and assume that x_s < x_e. The same notation and assumption apply to y. With this notation, the 13 relations are as follows:

• Precedes:

x_e < y_s

• Preceded-by:

x_s > y_e

• Meets:

x_e == y_s

• Met-by:

x_s == y_e

• Overlaps:

(x_s < y_s) & (x_e > y_s) & (x_e < y_e)

• Overlapped-by:

(x_e > y_e) & (x_s < y_e) & (x_s > y_s)

• Starts:

(x_s == y_s) & (x_e < y_e)

• Started-by:

(x_s == y_s) & (x_e > y_e)

• Finishes:

(x_s > y_s) & (x_e == y_e)

• Finished-by:

(x_s < y_s) & (x_e == y_e)

• During:

(x_s > y_s) & (x_e < y_e)

• Contains:

(x_s < y_s) & (x_e > y_e)

• Equals:

(x_s == y_s) & (x_e == y_e)

Note that when missing = "equals", missing intervals will only match the type = "equals" relation. This ensures that the distinct property of the algebra is maintained.
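
The distinct and exhaustive properties can be checked directly from the definitions above. The following base R sketch is not part of the package; allen_relation() is a hypothetical helper that evaluates all 13 conditions for one pair of intervals and returns the names of those that hold. Enumerating every valid pair of intervals with integer endpoints between 0 and 4 then confirms that exactly one relation holds for each pair.

```r
# Hypothetical helper (not an ivs function): classify one pair of
# half-open intervals [x_s, x_e) and [y_s, y_e)
allen_relation <- function(x_s, x_e, y_s, y_e) {
  stopifnot(x_s < x_e, y_s < y_e)

  checks <- c(
    "precedes"      = x_e < y_s,
    "preceded-by"   = x_s > y_e,
    "meets"         = x_e == y_s,
    "met-by"        = x_s == y_e,
    "overlaps"      = (x_s < y_s) & (x_e > y_s) & (x_e < y_e),
    "overlapped-by" = (x_e > y_e) & (x_s < y_e) & (x_s > y_s),
    "starts"        = (x_s == y_s) & (x_e < y_e),
    "started-by"    = (x_s == y_s) & (x_e > y_e),
    "finishes"      = (x_s > y_s) & (x_e == y_e),
    "finished-by"   = (x_s < y_s) & (x_e == y_e),
    "during"        = (x_s > y_s) & (x_e < y_e),
    "contains"      = (x_s < y_s) & (x_e > y_e),
    "equals"        = (x_s == y_s) & (x_e == y_e)
  )

  # Names of every relation whose condition is TRUE
  names(checks)[checks]
}

# All pairs of valid intervals with integer endpoints in 0..4
grid <- expand.grid(x_s = 0:4, x_e = 0:4, y_s = 0:4, y_e = 0:4)
grid <- grid[grid$x_s < grid$x_e & grid$y_s < grid$y_e, ]

# Count how many relations hold for each pair
n_matches <- mapply(
  function(x_s, x_e, y_s, y_e) length(allen_relation(x_s, x_e, y_s, y_e)),
  grid$x_s, grid$x_e, grid$y_s, grid$y_e
)

# Exactly one relation per pair: distinct and exhaustive on this grid
all(n_matches == 1L)
#> [1] TRUE

allen_relation(1, 3, 3, 4)
#> [1] "meets"
```

A finite grid is only a spot check, but the result follows in general: the three possible orderings of the starts combined with the three possible orderings of the ends give nine cases, two of which split further depending on how x_e compares to y_s or x_s compares to y_e.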

Connection to other functions

Note that some of the above relations are fairly restrictive. For example, "overlaps" only detects cases where x straddles y_s. It does not consider the case where x and y are equal to be an overlap (as this is "equals") nor does it consider when x straddles y_e to be an overlap (as this is "overlapped-by"). This makes the relations extremely useful from a theoretical perspective, because they can be combined without fear of duplicating relations, but they don't match our typical expectations for what an "overlap" is.

iv_locate_overlaps(), iv_locate_precedes(), and iv_locate_follows() use more intuitive types that aren't distinct, but typically match your expectations better. They can each be expressed in terms of Allen's relations:

• iv_locate_overlaps():

• "any":

overlaps | overlapped-by | starts | started-by | finishes | finished-by | during | contains | equals

• "contains":

contains | started-by | finished-by | equals

• "within":

during | starts | finishes | equals

• "starts":

starts | started-by | equals

• "ends":

finishes | finished-by | equals

• "equals":

equals

• iv_locate_precedes():

precedes | meets

• iv_locate_follows():

preceded-by | met-by
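
One of these decompositions can be checked empirically. The sketch below compares iv_locate_overlaps(type = "within") against the union of the four Allen relations listed for it, on a small invented data set. sort_matches() is a hypothetical helper used only to put both results in the same row order.

```r
library(ivs)

needles <- iv_pairs(c(1, 4), c(1, 3), c(2, 3), c(2, 4), c(0, 5))
haystack <- iv_pairs(c(1, 4), c(0, 2))

# The more intuitive "within" type from iv_locate_overlaps()
within <- iv_locate_overlaps(
  needles, haystack,
  type = "within",
  no_match = "drop"
)

# The same matches, built from the individual Allen relations
allen_types <- c("during", "starts", "finishes", "equals")
pieces <- lapply(allen_types, function(type) {
  iv_locate_relates(needles, haystack, type = type, no_match = "drop")
})
combined <- do.call(rbind, pieces)

# Hypothetical helper: order rows by needle, then haystack location
sort_matches <- function(df) {
  df[order(df$needles, df$haystack), ]
}

within <- sort_matches(within)
combined <- sort_matches(combined)

# Expected to be TRUE if the decomposition listed above holds
same_matches <- identical(within$needles, combined$needles) &&
  identical(within$haystack, combined$haystack)
```

Because the Allen relations are distinct, the combined result contains no duplicate rows, so no de-duplication step is needed before comparing.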

References

Allen, James F. (26 November 1983). "Maintaining knowledge about temporal intervals". Communications of the ACM. 26 (11): 832–843.

Examples

x <- iv(1, 3)
y <- iv(3, 4)

# "precedes" is strict, and doesn't let the endpoints match
iv_locate_relates(x, y, type = "precedes")
#>   needles haystack
#> 1       1       NA

# Since that is what "meets" represents
iv_locate_relates(x, y, type = "meets")
#>   needles haystack
#> 1       1        1

# "overlaps" is a very specific type of overlap where an interval in
# needles straddles the start of an interval in haystack
x <- iv_pairs(c(1, 4), c(1, 3), c(0, 3), c(2, 5))
y <- iv(1, 4)

# It doesn't match equality, or when the starts match, or when the end
# of the interval in haystack is straddled instead
iv_locate_relates(x, y, type = "overlaps")
#>   needles haystack
#> 1       1       NA
#> 2       2       NA
#> 3       3        1
#> 4       4       NA