These two functions execute a query on a given node, and return the
captures of the query for further use. Both functions return the same
information, just structured differently depending on your use case.
- query_matches() returns the captures first grouped by pattern, and further grouped by match within each pattern. This is useful if you include multiple patterns in your query.
- query_captures() returns a flat list of captures ordered by their node location in the original text. This is normally the easiest structure to use if you have a single pattern without any alternations that would benefit from having individual captures split by match.
Both also return the capture name, i.e. the @name you specified in your
query.
Arguments
- x
[tree_sitter_query]
A query.
- node
[tree_sitter_node]
A node to run the query over.
- ...
These dots are for future extensions and must be empty.
- range
[tree_sitter_range / NULL]
An optional range to restrict the query to.
Predicates
There are 3 core types of predicates supported:
- #eq? @capture "string"
- #eq? @capture1 @capture2
- #match? @capture "regex"
Each of these predicates can also be inverted with a not- prefix, i.e.
#not-eq? and #not-match?.
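The predicates above can be sketched with a small example. This is a minimal illustration, assuming the treesitter and treesitter.r packages are installed and that node_text() is used to extract each captured node's text; capture_text() is a hypothetical helper written for this sketch, not part of the package.

```r
library(treesitter)

text <- "
foo + b + a + ab
and(a)
"

language <- treesitter.r::language()
parser <- parser(language)
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Hypothetical helper (not part of treesitter): run a query source over
# `node` and return the text of each captured node.
capture_text <- function(source) {
  captures <- query_captures(query(language, source), node)
  vapply(captures$node, node_text, character(1))
}

# Single-quoted R strings let the predicate strings use double quotes.

# Keep only identifiers whose text is exactly `a`
eq_texts <- capture_text('((identifier) @id (#eq? @id "a"))')

# The inverted form keeps every identifier except `a`
not_eq_texts <- capture_text('((identifier) @id (#not-eq? @id "a"))')

# Keep only identifiers whose text matches the regex `^a`
match_texts <- capture_text('((identifier) @id (#match? @id "^a"))')
```

Each pattern wraps the node and its predicate in one outer pair of parentheses, which is the usual tree-sitter way of attaching a predicate to a capture.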
String double quotes
The underlying tree-sitter predicate parser requires that strings supplied
in a query use double quotes, i.e. "string" not 'string'. If you try to
use single quotes, you will get a query error.
#match? regex
The regex support provided by #match? is powered by grepl().
Escapes are a little tricky to get right within these match regex strings.
To use something like \s in the regex, the literal text \\s must appear in
the query string. Tree-sitter treats the first backslash as escaping the
second, so the regex that is ultimately used contains just \s. This
requires putting two literal backslash characters in the R string itself,
which can be accomplished with either "\\\\s" or a raw string like
r'["\\s"]', which is typically a little easier. You can also write your
queries in a separate file (typically called queries.scm) and read them
into R, which is also a little more straightforward because you can just
write something like (#match? @id "^\\s$") and that will be read in
correctly.
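The three ways of writing the same query source can be compared directly in base R. This is a minimal sketch: the query text reuses the (#match? @id "^\\s$") example from above, raw strings assume R 4.0.0 or later, and the temporary file stands in for a hypothetical queries.scm.

```r
# Escaped R string: four backslashes in the R source produce two literal
# backslashes in the string, and inner double quotes must also be escaped.
escaped <- "((identifier) @id (#match? @id \"^\\\\s$\"))"

# Raw string: what you type is what ends up in the string, so two
# backslashes in the source are two literal backslashes.
raw <- r'[((identifier) @id (#match? @id "^\\s$"))]'

# Reading from a file: the file holds the literal query text, so no extra
# R-level escaping is involved. A temporary file stands in for queries.scm.
path <- tempfile(fileext = ".scm")
writeLines(raw, path)
from_file <- paste(readLines(path), collapse = "\n")

# All three should hold the same literal text
same_escaped_raw <- identical(escaped, raw)
same_raw_file <- identical(raw, from_file)

# cat() shows the literal text that is handed to tree-sitter
cat(raw, "\n")
```

Printing with cat() rather than print() is a convenient check here, since print() re-escapes backslashes and quotes and so hides how many literal backslashes the string actually holds.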
Examples
text <- "
foo + b + a + ab
and(a)
"
source <- "(identifier) @id"
language <- treesitter.r::language()
query <- query(language, source)
parser <- parser(language)
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)
# A flat, ordered list of captures. This is the most useful structure here
# since we only have 1 pattern!
captures <- query_captures(query, node)
captures$node
#> [[1]]
#> <tree_sitter_node>
#>
#> ── Text ──────────────────────────────────────────────────────────────────
#> foo
#>
#> ── S-Expression ──────────────────────────────────────────────────────────
#> (identifier [(1, 0), (1, 3)])
#>
#> [[2]]
#> <tree_sitter_node>
#>
#> ── Text ──────────────────────────────────────────────────────────────────
#> b
#>
#> ── S-Expression ──────────────────────────────────────────────────────────
#> (identifier [(1, 6), (1, 7)])
#>
#> [[3]]
#> <tree_sitter_node>
#>
#> ── Text ──────────────────────────────────────────────────────────────────
#> a
#>
#> ── S-Expression ──────────────────────────────────────────────────────────
#> (identifier [(1, 10), (1, 11)])
#>
#> [[4]]
#> <tree_sitter_node>
#>
#> ── Text ──────────────────────────────────────────────────────────────────
#> ab
#>
#> ── S-Expression ──────────────────────────────────────────────────────────
#> (identifier [(1, 14), (1, 16)])
#>
#> [[5]]
#> <tree_sitter_node>
#>
#> ── Text ──────────────────────────────────────────────────────────────────
#> and
#>
#> ── S-Expression ──────────────────────────────────────────────────────────
#> (identifier [(2, 0), (2, 3)])
#>
#> [[6]]
#> <tree_sitter_node>
#>
#> ── Text ──────────────────────────────────────────────────────────────────
#> a
#>
#> ── S-Expression ──────────────────────────────────────────────────────────
#> (identifier [(2, 4), (2, 5)])
#>