github-actions[bot] edited this page Jul 21, 2025 · 1 revision
This page collects the tokens that we know R can emit. It is almost a verbatim copy of the corresponding section of the appendix in the original master's thesis.
Every R program is an expression list, identified by the exprlist token type, which consists of several expressions (identified by expr).
Consider the following example:
x <- 1 + 2
if(x > 0) {
   print("Hello World!")
}
y <- 3
The corresponding expression list consists of three expressions, namely the assignment of 1 + 2 to x, the if construct, and the assignment of 3 to y.
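As a rough illustration, the example can be handed to R's parser and inspected with getParseData from the utils package. The sketch below assumes that the top-level expressions are exactly the rows whose parent is 0, which should hold for this comment-free snippet. The variable names (code, exprs, pd, top) are ours.

```r
# Parse the example without evaluating it. keep.source = TRUE is required
# so that the parse data is retained (it is off by default under Rscript).
code <- 'x <- 1 + 2
if(x > 0) {
   print("Hello World!")
}
y <- 3'
exprs <- parse(text = code, keep.source = TRUE)
pd <- utils::getParseData(exprs)

# Rows with parent 0 are the top-level nodes of the expression list.
top <- pd[pd$parent == 0, c("line1", "line2", "token")]
print(top)
```

For this snippet, the result should contain one row per top-level expression, each carrying the expr token type.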
With the following tables, we provide what is to our knowledge a full list of token types that R produces.
For more information, the "Syntax" topic of the R documentation offers a great starting point.
Except for PIPEBIND, which at the moment must be enabled explicitly by setting an environment variable, all tokens shown in the tables are supported by the normalization of flowR, although this is different from supporting all of their uses.
It should be noted that there are many tokens that appear in the source code of the R interpreter but are not listed within the tables.
While some of these tokens, like COLON_ASSIGN, are explicitly marked as deprecated, several of them seem to be for internal use only and are - to the best of our knowledge - never emitted by the parser with getParseData. For example:
newlines are directly consumed to split expressions,
error tokens produce an explicit error message, and
the individual tokens for unary operators (like UPLUS) are transformed to the same token as their binary counterparts (like +).
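The last point can be checked with a small experiment. This is a minimal sketch, assuming the parse data is obtained via getParseData; it compares the token reported for a unary and a binary minus. The variable names are ours.

```r
# A unary and a binary minus in one snippet; nothing is evaluated.
pd <- utils::getParseData(parse(text = "-x; a - b", keep.source = TRUE))

# Keep only the terminal rows whose lexeme is the minus sign.
minus <- pd[pd$terminal & pd$text == "-", c("col1", "token", "text")]
print(minus)

# TRUE if both occurrences share a single token type.
same_token <- length(unique(minus$token)) == 1
print(same_token)
```

Both rows should report the same token type, and no dedicated unary token such as UMINUS should show up in the parse data.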
Tokens Representing Constants
| # | ✓ | Token | Description |
|---|---|---|---|
| T1 | ✓ | NULL_CONST | Represents NULL. |
| T2 | ✓ | NUM_CONST | Identifies a number (including NA) or a logical, depending on the lexeme. |
| T3 | ✓ | STR_CONST | A string, independent of the quotation mark. |
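To see the constant tokens in practice, the following sketch parses a handful of literals and lists the token type assigned to each. The chosen literals and the variable names are our own. The filter on the _CONST suffix is an assumption that simply matches the three token names in the table.

```r
# One literal per statement: NULL, a number, NA, a logical, and two strings
# with different quotation marks. Nothing is evaluated.
code <- "NULL; 42; NA; TRUE; 'single'; \"double\""
pd <- utils::getParseData(parse(text = code, keep.source = TRUE))

# Terminal rows whose token type ends in _CONST, in source order.
consts <- pd[pd$terminal & grepl("_CONST$", pd$token), c("token", "text")]
print(consts)
```

In line with the table, NA and TRUE should both be reported as NUM_CONST, and both strings as STR_CONST regardless of the quotation mark.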
Tokens Representing Assignments
| # | ✓ | Token | Description |
|---|---|---|---|
| T4 | ✓ | EQ_ASSIGN | A local equal assignment. Differentiate this from EQ_SUB, which has slightly different semantics. |
| T5 | ✓ | EQ_FORMALS | Essentially EQ_ASSIGN, but when used within formals. |
| T6 | ✓ | EQ_SUB | Essentially EQ_ASSIGN, but when used to name arguments in a function call or in an access. |
| T7 | ✓ | LEFT_ASSIGN | A local left assignment or global left assignment. Includes :=, originally bound to COLON_ASSIGN. |
| T8 | ✓ | RIGHT_ASSIGN | A local right assignment or global right assignment. |
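The distinction between the assignment tokens depends on where the operator appears. The sketch below is an illustration under our own choice of snippet and variable names. It parses one use of each operator and lists the token type reported by getParseData. The code is only parsed, never evaluated, so := does not need to be defined.

```r
# One use of each assignment form; the snippet is parsed but not run.
code <- '
f <- function(a = 1) a
x = f(a = 2)
y <<- 3
4 -> z
5 ->> w
v := 6
'
pd <- utils::getParseData(parse(text = code, keep.source = TRUE))

# Terminal rows whose token type is one of the assignment tokens.
ops <- pd[pd$terminal & grepl("ASSIGN|EQ_", pd$token), c("line1", "token", "text")]
print(ops)
```

The three occurrences of = should be reported as EQ_FORMALS (in the formals of f), EQ_ASSIGN (at the top level), and EQ_SUB (naming the call argument), while <-, <<-, and := should share LEFT_ASSIGN and -> and ->> should share RIGHT_ASSIGN.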