antlr4: grammar ambiguity, left recursion, or both?


Kode Charlie

My grammar is shown below, and it doesn't compile. The error returned (from the antlr4 maven plugin) is:

[INFO] --- antlr4-maven-plugin:4.3:antlr4 (default-cli) @ beebell ---
[INFO] ANTLR 4: Processing source directory /Users/kodecharlie/workspace/beebell/src/main/antlr4
[INFO] Processing grammar: DateRange.g4
org\antlr\v4\parse\GrammarTreeVisitor.g: node from line 13:87 mismatched tree node: startTime expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from after line 13:87 mismatched tree node: RULE expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from line 13:87 mismatched tree node: startTime expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from after line 13:87 mismatched tree node: RULE expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from line 13:87 mismatched tree node: startTime expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from after line 13:87 mismatched tree node: RULE expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from line 13:87 mismatched tree node: startTime expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from after line 13:87 mismatched tree node: RULE expecting <UP>
[ERROR] error(20):  internal error: Rule HOUR undefined 
[ERROR] error(20):  internal error: Rule MINUTE undefined 
[ERROR] error(20):  internal error: Rule SECOND undefined 
[ERROR] error(20):  internal error: Rule HOUR undefined 
[ERROR] error(20):  internal error: Rule MINUTE undefined 

I can see that the grammar is ambiguous: for example, is a two-digit number a MINUTE, a SECOND, or an HOUR (or perhaps the start of a year)? But some articles suggest that this kind of error is caused by left recursion.
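
To make the ambiguity concrete, here is a minimal Python sketch of how an ANTLR 4 lexer chooses between competing rules: it takes the longest match, and when several rules match the same length, the rule defined first wins. The names `RULES` and `tokenize` are hypothetical, and the regular expressions are simplified stand-ins for some of the numeric rules in the grammar, listed in the same order; this is not ANTLR's actual implementation.

```python
import re

# Simplified stand-ins for some numeric lexer rules from the grammar,
# in the order they are defined there (an assumption for illustration).
RULES = [
    ("YEAR",   re.compile(r"[0-9]{4}")),
    ("MOY",    re.compile(r"[0-9]{1,2}")),
    ("DOM",    re.compile(r"[0-9]{1,2}")),
    ("HOUR2",  re.compile(r"[0-9]{1,2}")),
    ("HOUR",   re.compile(r"[0-9]{2}")),
    ("MINUTE", re.compile(r"[0-9]{2}")),
    ("SECOND", re.compile(r"[0-9]{2}")),
    ("COLON",  re.compile(r":")),
]


def tokenize(text):
    """Longest match wins; ties go to the rule defined first."""
    tokens = []
    pos = 0
    while pos < len(text):
        best = None
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so an earlier rule keeps priority on a tie.
            if m and (best is None or len(m.group()) > len(best[1])):
                best = (name, m.group())
        if best is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append(best)
        pos += len(best[1])
    return tokens


print(tokenize("12:30"))
# [('MOY', '12'), ('COLON', ':'), ('MOY', '30')]
```

Under these assumptions, a two-digit number always comes out as MOY, so DOM, HOUR2, HOUR, MINUTE, and SECOND would never be produced by the lexer, whatever the parser expects at that position.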

Can you tell me what is going on?

Thanks. Here is the grammar:

grammar DateRange;

range     : startDate (THRU endDate)? | 'Every' LONG_DAY 'from' startDate THRU endDate ;

startDate : dateTime ;
endDate   : dateTime ;
dateTime  : GMTOFF | SHRT_MDY | YYYYMMDD | (WEEK_DAY)? LONG_MDY ;

// Dates.
GMTOFF    : YYYYMMDD 'T' HOUR ':' MINUTE ':' SECOND ('-'|'+') HOUR ':' MINUTE ;
YYYYMMDD  : YEAR '-' MOY '-' DOM ;
SHRT_MDY  : MOY ('/' | '-') DOM ('/' | '-') YEAR ;
LONG_MDY  : (SHRT_MNTH '.'? | LONG_MNTH) WS DOM ','? (WS YEAR (','? WS TIMESPAN)? | WS startTime)? ;

YEAR      : DIGIT DIGIT DIGIT DIGIT ;   // year
MOY       : (DIGIT | DIGIT DIGIT) ;     // month of year.
DOM       : (DIGIT | DIGIT DIGIT) ;     // day of month.
TIMESPAN  : startTime (WS THRU WS endTime)? ;

// Time-of-day.
startTime : TOD ;
endTime   : TOD ;
TOD       : NOON | HOUR2 (':' MINUTE)? WS? MERIDIAN ;
NOON      : 'noon' ;
HOUR2     : (DIGIT | DIGIT DIGIT) ;
MERIDIAN  : 'AM' | 'am' | 'PM' | 'pm' ;

// 24-hour clock.  Sanity-check range in listener.
HOUR      : DIGIT DIGIT ;
MINUTE    : DIGIT DIGIT ;
SECOND    : DIGIT DIGIT ;

// Range verb.
THRU      : WS ('-'|'to') WS -> skip ;

// Weekdays.
WEEK_DAY  : (SHRT_DAY | LONG_DAY) ','? WS ;
SHRT_DAY  : 'Sun' | 'Mon' | 'Tue' | 'Wed' | 'Thu' | 'Fri' | 'Sat' -> skip ;
LONG_DAY  : 'Sunday' | 'Monday' | 'Tuesday' | 'Wednesday' | 'Thursday' | 'Friday' | 'Saturday' -> skip ;

// Months.
SHRT_MNTH : 'Jan' | 'Feb' | 'Mar' | 'Apr' | 'May' | 'Jun' | 'Jul' | 'Aug' | 'Sep' | 'Oct' | 'Nov' | 'Dec' ;
LONG_MNTH : 'January' | 'February' | 'March' | 'April' | 'May' | 'June' | 'July' | 'August' | 'September' | 'October' | 'November' | 'December' ;

DIGIT     : [0-9] ;
WS        : [ \t\r\n]+ -> skip ;

Kode Charlie

I solved this problem by defining a distinct rule for each length of digit sequence (1, 2, 3, or 4 digits). I also simplified several rules, mainly to make the alternatives within each production rule more straightforward. Anyway, here is the final result, which does compile:

grammar DateRange;

range     : 'Every' WS longDay WS 'from' WS startDate THRU endDate
          | startDate THRU endDate
          | startDate
          ;

startDate : dateTime ;
endDate   : dateTime ;
dateTime  : utc
          | shrtMdy
          | yyyymmdd
          | longMdy
          | weekDay ','? WS longMdy
          ;

// Dates.
utc       : yyyymmdd 'T' hour ':' minute ':' second ('-'|'+') hour ':' minute ;
yyyymmdd  : year '-' moy '-' dom ;
shrtMdy   : moy ('/' | '-') dom ('/' | '-') year ;
longMdy   : longMonth WS dom ','? optYearAndOrTime?
          | shrtMonth '.'? WS dom ','? optYearAndOrTime?
          ;

optYearAndOrTime : WS year ','? WS timespan
                 | WS year
                 | WS timespan
                 ;

fragment DIGIT : [0-9] ;
ONE_DIGIT    : DIGIT ;
TWO_DIGITS   : DIGIT ONE_DIGIT ;
THREE_DIGITS : DIGIT TWO_DIGITS ;
FOUR_DIGITS  : DIGIT THREE_DIGITS ;

year      : FOUR_DIGITS ;                   // year
moy       : ONE_DIGIT | TWO_DIGITS ;        // month of year.
dom       : ONE_DIGIT | TWO_DIGITS ;        // day of month.
timespan  : (tod THRU tod) | tod ;

// Time-of-day.
tod       : noon | (hour2 (':' minute)? WS? meridian?) ;
noon      : 'noon' ;
hour2     : ONE_DIGIT | TWO_DIGITS ;
meridian  : ('AM' | 'am' | 'PM' | 'pm' | 'a.m.' | 'p.m.') ;

// 24-hour clock.  Sanity-check range in listener.
hour      : TWO_DIGITS ;
minute    : TWO_DIGITS ;
second    : TWO_DIGITS ;   // we do not use seconds.

// Range verb.
THRU      : WS? ('-'|'–'|'to') WS? ;

// Weekdays.
weekDay   : shrtDay | longDay ;
shrtDay   : 'Sun' | 'Mon' | 'Tue' | 'Wed' | 'Thu' | 'Fri' | 'Sat' ;
longDay   : 'Sunday' | 'Monday' | 'Tuesday' | 'Wednesday' | 'Thursday' | 'Friday' | 'Saturday' ;

// Months.
shrtMonth : 'Jan' | 'Feb' | 'Mar' | 'Apr' | 'May' | 'Jun' | 'Jul' | 'Aug' | 'Sep' | 'Oct' | 'Nov' | 'Dec' ;
longMonth : 'January' | 'February' | 'March' | 'April' | 'May' | 'June' | 'July' | 'August' | 'September' | 'October' | 'November' | 'December' ;

WS        : ~[a-zA-Z0-9,.:]+ ;
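
The idea behind the rewritten grammar can be sketched in Python: the lexer only classifies a digit run by its length, and the parser rule decides from position whether that token is a year, a month, or a day. The names `DIGIT_TOKEN_NAMES`, `lex`, and `parse_yyyymmdd` are hypothetical, and the sketch covers only the `yyyymmdd` rule; it is an illustration of the approach, not what ANTLR generates. It may also be worth noting that the rewritten grammar no longer refers to parser rules such as `startTime` from lexer rules such as `LONG_MDY`, which ANTLR 4 does not allow, and the `line 13:87 ... startTime` messages in the question appear to point at exactly that reference.

```python
import re

# Token names mirror the ONE_DIGIT .. FOUR_DIGITS lexer rules above.
DIGIT_TOKEN_NAMES = {
    1: "ONE_DIGIT",
    2: "TWO_DIGITS",
    3: "THREE_DIGITS",
    4: "FOUR_DIGITS",
}

TOKEN_RE = re.compile(r"(?P<digits>[0-9]{1,4})|(?P<punct>[-/:])")


def lex(text):
    """Classify digit runs purely by length; meaning is left to the parser."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        value = m.group()
        if m.lastgroup == "digits":
            tokens.append((DIGIT_TOKEN_NAMES[len(value)], value))
        else:
            tokens.append((value, value))
        pos = m.end()
    return tokens


def parse_yyyymmdd(tokens):
    """Sketch of: yyyymmdd : year '-' moy '-' dom ;"""
    expected = [
        ("year", {"FOUR_DIGITS"}),
        (None, {"-"}),
        ("moy", {"ONE_DIGIT", "TWO_DIGITS"}),
        (None, {"-"}),
        ("dom", {"ONE_DIGIT", "TWO_DIGITS"}),
    ]
    if len(tokens) != len(expected):
        raise ValueError("wrong number of tokens for yyyymmdd")
    result = {}
    for (label, allowed), (kind, value) in zip(expected, tokens):
        if kind not in allowed:
            raise ValueError(f"unexpected token {kind} ({value!r})")
        if label is not None:
            result[label] = int(value)
    return result


print(parse_yyyymmdd(lex("2014-07-04")))
# {'year': 2014, 'moy': 7, 'dom': 4}
```

Because the token type depends only on the number of digits, the lexer never has to guess whether `07` is a month, a day, or a minute; that decision is made by whichever parser rule consumes the token.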
