These two functions execute a query on a given node, and return the captures of the query for further use. Both functions return the same information, just structured differently depending on your use case.

  • query_matches() returns the captures first grouped by pattern, and further grouped by match within each pattern. This is useful if you include multiple patterns in your query.

  • query_captures() returns a flat list of captures ordered by their node location in the original text. This is normally the easiest structure to use if you have a single pattern without any alternations that would benefit from having individual captures split by match.

Both also return the capture name, i.e. the @name you specified in your query.
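The difference between the two return shapes can be sketched with a tiny input. This is a minimal illustration, not the package's official example: it assumes the treesitter and treesitter.r packages are installed, that node_text() is available to extract a node's source text, and that each match returned by query_matches() carries name and node elements like the flat result does.

```r
library(treesitter)

language <- treesitter.r::language()
parser <- parser(language)
tree <- parser_parse(parser, "x + y")
node <- tree_root_node(tree)

# A single pattern with a single capture named `id`
query <- query(language, "(identifier) @id")

# Flat structure: parallel `name` and `node` elements,
# ordered by where each node appears in the text
captures <- query_captures(query, node)
unlist(captures$name)
vapply(captures$node, node_text, character(1))

# Nested structure (assumed shape): pattern -> match -> captures
matches <- query_matches(query, node)
first_match <- matches[[1]][[1]]
unlist(first_match$name)
node_text(first_match$node[[1]])
```

With only one pattern, the extra nesting from query_matches() adds little, which is why the flat form is usually the more convenient one in this situation.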

Usage

query_matches(x, node, ..., range = NULL)

query_captures(x, node, ..., range = NULL)

Arguments

x

[tree_sitter_query]

A query.

node

[tree_sitter_node]

A node to run the query over.

...

These dots are for future extensions and must be empty.

range

[tree_sitter_range / NULL]

An optional range to restrict the query to.

Predicates

There are 3 core types of predicates supported:

  • #eq? @capture "string"

  • #eq? @capture1 @capture2

  • #match? @capture "regex"

Each of these predicates can also be inverted with a not- prefix, i.e. #not-eq? and #not-match?.
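As a rough sketch of how these predicates filter captures, the snippet below runs three queries over the same expression. It assumes the treesitter and treesitter.r packages are installed and that node_text() returns a node's source text; captured_text() is a hypothetical helper defined here purely for illustration and is not part of the package.

```r
library(treesitter)

language <- treesitter.r::language()
parser <- parser(language)
tree <- parser_parse(parser, "foo + b + a + ab")
node <- tree_root_node(tree)

# Hypothetical helper: run a query and return the text of each captured node
captured_text <- function(source) {
  captures <- query_captures(query(language, source), node)
  vapply(captures$node, node_text, character(1))
}

# Keep only identifiers whose text is exactly "a"
only_a <- captured_text('((identifier) @id (#eq? @id "a"))')

# Inverted form: keep every identifier except "a"
not_a <- captured_text('((identifier) @id (#not-eq? @id "a"))')

# Regex form: keep identifiers that start with "a"
starts_with_a <- captured_text('((identifier) @id (#match? @id "^a"))')
```

Writing the R strings with single quotes keeps the double quotes inside the query unescaped.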

String double quotes

The underlying tree-sitter predicate parser requires that strings supplied in a query use double quotes, i.e. "string" not 'string'. If you try to use single quotes, you will get a query error.

#match? regex

The regex support provided by #match? is powered by grepl().

Escapes are a little tricky to get right within these match regex strings. To use something like \s in the regex, the literal text \\s must appear in the query string: the first backslash escapes the second when tree-sitter parses the string, so just \s reaches the regex engine. This means the R string itself must contain two literal backslash characters, which you can get with either "\\\\s" or a raw string like r'["\\s"]', which is typically a little easier. You can also write your queries in a separate file (typically called queries.scm) and read them into R, which is more straightforward still because you can just write something like (#match? @id "^\\s$") and it will be read in correctly.
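The escaping rules above can be checked with base R alone. The sketch below assumes R 4.0.0 or later (for raw strings) and uses a temporary file as a stand-in for queries.scm; the variable names are illustrative only.

```r
# Both spellings produce the same three characters: backslash, backslash, s
regular <- "\\\\s"
raw <- r"(\\s)"
identical(regular, raw)
nchar(regular)

# The same idea inside a full query. The raw string lets the double quotes
# and backslashes appear exactly as tree-sitter should see them.
source_regular <- "((identifier) @id (#match? @id \"^\\\\s$\"))"
source_raw <- r"[((identifier) @id (#match? @id "^\\s$"))]"
identical(source_regular, source_raw)

# Show the literal text that will be handed to tree-sitter
writeLines(source_raw)

# Reading the query from a file needs no extra escaping at all:
# what is written in the file is what ends up in the string
path <- tempfile(fileext = ".scm")
writeLines(source_raw, path)
from_file <- paste(readLines(path), collapse = "\n")
identical(from_file, source_raw)
```

Printing the string with writeLines() rather than print() is a quick way to see the literal backslashes without R's own escaping layered on top.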

Examples

text <- "
foo + b + a + ab
and(a)
"

source <- "(identifier) @id"

language <- treesitter.r::language()

query <- query(language, source)
parser <- parser(language)
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# A flat ordered list of captures, that's most useful here since
# we only have 1 pattern!
captures <- query_captures(query, node)
captures$node
#> [[1]]
#> <tree_sitter_node>
#> 
#> ── Text ──────────────────────────────────────────────────────────────────
#> foo
#> 
#> ── S-Expression ──────────────────────────────────────────────────────────
#> (identifier [(1, 0), (1, 3)])
#> 
#> [[2]]
#> <tree_sitter_node>
#> 
#> ── Text ──────────────────────────────────────────────────────────────────
#> b
#> 
#> ── S-Expression ──────────────────────────────────────────────────────────
#> (identifier [(1, 6), (1, 7)])
#> 
#> [[3]]
#> <tree_sitter_node>
#> 
#> ── Text ──────────────────────────────────────────────────────────────────
#> a
#> 
#> ── S-Expression ──────────────────────────────────────────────────────────
#> (identifier [(1, 10), (1, 11)])
#> 
#> [[4]]
#> <tree_sitter_node>
#> 
#> ── Text ──────────────────────────────────────────────────────────────────
#> ab
#> 
#> ── S-Expression ──────────────────────────────────────────────────────────
#> (identifier [(1, 14), (1, 16)])
#> 
#> [[5]]
#> <tree_sitter_node>
#> 
#> ── Text ──────────────────────────────────────────────────────────────────
#> and
#> 
#> ── S-Expression ──────────────────────────────────────────────────────────
#> (identifier [(2, 0), (2, 3)])
#> 
#> [[6]]
#> <tree_sitter_node>
#> 
#> ── Text ──────────────────────────────────────────────────────────────────
#> a
#> 
#> ── S-Expression ──────────────────────────────────────────────────────────
#> (identifier [(2, 4), (2, 5)])
#>