The following is the set of rules:
expr := sum
sum := term ( ( PLUS | MINUS ) term )*
term := element ( ( MULTIPLY | DIVIDE ) element )*
element := INTEGER | LPAREN sum RPAREN
The following are the terminals:
PLUS := "+"
MINUS := "-"
MULTIPLY := "*"
DIVIDE := "/"
LPAREN := "("
RPAREN := ")"
INTEGER := ( DIGIT ) +
DIGIT := "0" – "9"
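Before turning to the parser generators, it may help to see what the rules above amount to when written by hand. The following is a minimal sketch in plain Java, not the output of ANTLR or JavaCC: each grammar rule becomes one method, and each ( ... )* becomes a loop. The class name GrammarSketch and its helper peek() are made up for this illustration, and skipping white space inside peek() is a simplification chosen here.

```java
/** Hand-written recursive-descent evaluator mirroring the rules above (illustrative only). */
public class GrammarSketch {
    private static final char END = '\0';   // sentinel returned when the input is exhausted
    private final String src;
    private int pos = 0;

    public GrammarSketch(String src) {
        this.src = src;
    }

    /** expr := sum */
    public float expr() {
        float f = sum();
        if (peek() != END) {
            throw new IllegalArgumentException("Unexpected input at position " + pos);
        }
        return f;
    }

    /** sum := term ( ( PLUS | MINUS ) term )* */
    private float sum() {
        float f = term();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            float f2 = term();
            f = (op == '+') ? f + f2 : f - f2;
        }
        return f;
    }

    /** term := element ( ( MULTIPLY | DIVIDE ) element )* */
    private float term() {
        float f = element();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            float f2 = element();
            f = (op == '*') ? f * f2 : f / f2;
        }
        return f;
    }

    /** element := INTEGER | LPAREN sum RPAREN */
    private float element() {
        if (peek() == '(') {
            pos++;                              // consume LPAREN
            float f = sum();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            pos++;                              // consume RPAREN
            return f;
        }
        int start = pos;                        // INTEGER := ( DIGIT )+
        while (pos < src.length() && src.charAt(pos) >= '0' && src.charAt(pos) <= '9') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected INTEGER or '(' at position " + pos);
        }
        return Float.parseFloat(src.substring(start, pos));
    }

    /** Skips white space, then returns the next character without consuming it. */
    private char peek() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
        return pos < src.length() ? src.charAt(pos) : END;
    }

    public static void main(String[] args) {
        System.out.println(new GrammarSketch("2 + 3 * (4 - 1)").expr()); // prints 11.0
    }
}
```

Because term is called from inside sum, multiplication and division bind tighter than addition and subtraction, which is exactly the precedence the layered rules encode.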
ANTLR Installation:
CLASSPATH=H:\antlr\antlr-2.5.0;.
JavaCC Installation:
Building the Grammar file for Antlr:
header
{
// A header section contains source code that must be
// placed before any ANTLR-generated code. Usually you will
// have the imported classes here.
}
class MyParser extends Parser;
options
{
}
{
// you can enter code for additional methods and
// variables here ...
}
parser_rules:
.
.
.
class MyLexer extends Lexer;
options
{
}
{
// you can enter code for additional methods and
// variables here ...
}
lexer_rules:
.
.
EOF
rule_name
[int a, String s] // input arguments
returns [int x] // return values
:
...
;
ANTLR supports extended BNF (EBNF) notation through the following four kinds of subrule:
( P1 | P2 | ... | Pn ) -> match exactly one of P1, P2, ..., Pn
( P1 | P2 | ... | Pn )? -> match zero or one of P1, P2, ..., Pn
( P1 | P2 | ... | Pn )* -> match zero or more of P1, P2, ..., Pn
( P1 | P2 | ... | Pn )+ -> match one or more of P1, P2, ..., Pn
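One way to read the four subrules is as ordinary control flow. The sketch below is a hand-written illustration in plain Java, not the code ANTLR actually generates: the alternatives are fixed to 'a' | 'b' purely for the example, each method returns the input position after matching, and the class and method names are made up here.

```java
/** Maps the four EBNF subrules onto if / while / do-while (illustrative only). */
public class SubruleSketch {

    /** True if one of the alternatives ('a' | 'b' in this example) matches at pos. */
    static boolean isAlt(String s, int pos) {
        return pos < s.length() && (s.charAt(pos) == 'a' || s.charAt(pos) == 'b');
    }

    /** ( P1 | P2 ): exactly one alternative must match. */
    static int one(String s, int pos) {
        if (!isAlt(s, pos)) {
            throw new IllegalArgumentException("No alternative matches at position " + pos);
        }
        return pos + 1;
    }

    /** ( P1 | P2 )? : an if statement. */
    static int optional(String s, int pos) {
        if (isAlt(s, pos)) {
            pos = one(s, pos);
        }
        return pos;
    }

    /** ( P1 | P2 )* : a while loop that may run zero times. */
    static int zeroOrMore(String s, int pos) {
        while (isAlt(s, pos)) {
            pos = one(s, pos);
        }
        return pos;
    }

    /** ( P1 | P2 )+ : a do-while loop, so the body runs at least once. */
    static int oneOrMore(String s, int pos) {
        do {
            pos = one(s, pos);
        } while (isAlt(s, pos));
        return pos;
    }

    public static void main(String[] args) {
        System.out.println(optional("abc", 0));     // prints 1
        System.out.println(zeroOrMore("abba!", 0)); // prints 4
        System.out.println(zeroOrMore("xyz", 0));   // prints 0
        System.out.println(oneOrMore("ab", 0));     // prints 2
    }
}
```

The only difference between the * and + forms is whether the first match is mandatory, which is why oneOrMore("x", 0) throws while zeroOrMore("x", 0) simply returns 0.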
header
{
import java.lang.*;
}
class CalcLexer extends Lexer;
options
{
// a lookahead of 2
k=2;
}
// white space, these will be skipped by the lexer:
WS: ( ' ' | '\t' | '\r') { _ttype = Token.SKIP; } ;
// terminals:
EOL : '\n'
;
LPAREN : '(' ;
RPAREN : ')' ;
PLUS : '+' ;
MINUS : '-' ;
MULTIPLY : '*' ;
DIVIDE : '/' ;
INT :
(DIGIT)+ ;
// protected rules cannot be accessed from the parser; they can be used
// only from other lexer rules:
protected
DIGIT : '0'..'9' ;
Unless changed manually, each lexer rule returns a token.
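To make the token stream concrete, here is a minimal hand-written tokenizer in plain Java that follows the same lexer rules: blanks, tabs and carriage returns are skipped, a newline becomes EOL, and a run of digits becomes a single INT. It is only a sketch of the idea and does not use the ANTLR runtime; the class name LexerSketch and the string representation of the tokens are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written tokenizer following the CalcLexer rules (illustrative only). */
public class LexerSketch {

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (c == ' ' || c == '\t' || c == '\r') {
                pos++;                      // WS: skipped, no token is produced
                continue;
            }
            if (c >= '0' && c <= '9') {     // INT : (DIGIT)+
                int start = pos;
                while (pos < input.length()
                        && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
                    pos++;
                }
                tokens.add("INT(" + input.substring(start, pos) + ")");
                continue;
            }
            switch (c) {                    // single-character terminals
                case '\n': tokens.add("EOL"); break;
                case '(':  tokens.add("LPAREN"); break;
                case ')':  tokens.add("RPAREN"); break;
                case '+':  tokens.add("PLUS"); break;
                case '-':  tokens.add("MINUS"); break;
                case '*':  tokens.add("MULTIPLY"); break;
                case '/':  tokens.add("DIVIDE"); break;
                default:
                    throw new IllegalArgumentException(
                            "Unexpected character '" + c + "' at position " + pos);
            }
            pos++;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [INT(12), PLUS, LPAREN, INT(3), MULTIPLY, INT(4), RPAREN, EOL]
        System.out.println(tokenize("12 + (3*4)\n"));
    }
}
```

Note that DIGIT never shows up as a token of its own; it only contributes to INT, which mirrors the role of the protected rule.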
class CalcParser extends Parser;
expr returns [float f]
{
f=0;
}
:
f=sum
( EOL )*
EOF
;
sum returns [float f]
{
f=0;
}
:
term
(
( PLUS term
| MINUS term )
)*
;
term returns [float f]
{
f=0;
}
:
element
(
( MULTIPLY element
| DIVIDE element )
)*
;
element returns [float f]
{
f=0;
}
:
INT | LPAREN sum RPAREN
;
For example:
sum returns [float f]
{
f=0; float f2=0;
}
:
f=term // assign to f the value returned by the rule term
(
( PLUS f2=term
{ f += f2; }
| MINUS f2=term
{ f -= f2; }
)
)*
;
Notice that you do not need to explicitly return the value f at the end of the rule; ANTLR generates the return statement for you.
class CalcParser extends Parser;
expr returns [float f]
{
f=0;
}
:
f=sum
( EOL )*
EOF
;
sum returns [float f]
{
f=0;
float f2=0;
}
:
f=term
(
(
PLUS f2=term { f += f2; }
| MINUS f2=term { f -= f2; }
)
)*
;
term returns [float f]
{
f=0;
float f2=0;
}
:
f=element
(
( MULTIPLY f2=element { f *= f2; }
| DIVIDE f2=element { f /= f2; }
)
)*
;
element returns [float f]
{
f=0;
Float x;
}
:
d:INT { x = new Float(d.getText());
f = x.floatValue();
}
|
LPAREN f=sum
RPAREN
;
class CalcLexer extends Lexer;
options
{
k=1;
}
WS: ( ' ' |
'\t' | '\r')
{
_ttype = Token.SKIP;
}
;
EOL : '\n' ;
LPAREN : '(' ;
RPAREN : ')' ;
PLUS : '+' ;
MINUS : '-' ;
MULTIPLY : '*' ;
DIVIDE : '/' ;
INT : (DIGIT)+
;
protected
DIGIT : '0'..'9' ;
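Because the element rule converts each INT token to a float and every action works on float values, the calculator does not behave like integer arithmetic. The short Java sketch below illustrates the consequences. It uses Float.parseFloat, which yields the same value as the new Float(text).floatValue() call in the grammar (the Float(String) constructor has been deprecated since Java 9). The class name FloatSketch and the helper toFloat are made up for this example.

```java
/** Shows how token text turns into float values and what float arithmetic implies. */
public class FloatSketch {

    /** Same conversion as the element rule, written with Float.parseFloat. */
    static float toFloat(String tokenText) {
        return Float.parseFloat(tokenText);
    }

    public static void main(String[] args) {
        // Division keeps the fractional part; this is not integer division.
        System.out.println(toFloat("7") / toFloat("2"));   // prints 3.5

        // Dividing by zero does not throw; float arithmetic yields infinity.
        System.out.println(toFloat("1") / toFloat("0"));   // prints Infinity

        // A float has a 24-bit significand, so large integers are rounded.
        System.out.println(toFloat("16777217") == toFloat("16777216")); // prints true
    }
}
```

If exact results for large integers matter, one option would be to declare the rule return values as double or long instead of float; that is a design choice outside the scope of the grammar shown here.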
On the command line, type the following command:
java antlr.Tool calc.g
This assumes that your PATH includes the directory where Java is installed. If it does not, add the following directory to your PATH environment variable:
Q:\WINDOWS\jdk-w32\bin
Next, compile the generated Java files:
javac *.java
This will produce the class files.
import java.io.*;   // DataInputStream, IOException
import antlr.*;     // ANTLR runtime classes such as ParserException

class Calc
{
    public static void main(String[] args)
    {
        try
        {
            CalcLexer lexer =
                new CalcLexer(new DataInputStream(System.in));
            CalcParser parser =
                new CalcParser(lexer);
            System.out.print("Enter Expression: ");
            System.out.flush();
            // Parse the input expression
            float res = parser.expr();
            System.out.println("Result: " + res);
        }
        catch(IOException e)
        {
            System.err.println("IOException: " + e);
        }
        catch(ParserException e)
        {
            System.err.println("ParserException: " + e);
        }
    }
}
Put the above code into a file called "Calc.java", compile it, and run it using the java command:
java Calc
A JavaCC grammar file has the following structure:
options {
// options for the parser generator
}
PARSER_BEGIN(my_parser_name)
public class my_parser_name
{
// code for methods and variables for the class.
}
PARSER_END(my_parser_name)
// the SKIP section lists the white space that separates the tokens:
SKIP :
{
}
// The TOKEN section defines the tokens for the lexer:
TOKEN :
{
}
production_rules
.
.
.
EOF
Production rules have the following syntax:
java_return_type
identifier ( parameter_list )
:
{
//java code
}
{
expansion_choices
}
Writing the Grammar for the Calc Example:
options
{
LOOKAHEAD=1;
}
PARSER_BEGIN(Calc)
public class Calc
{
public static void main(String args[])
{
try
{
Calc parser = new Calc(System.in);
System.out.print("Enter Expression: ");
System.out.flush();
float res = parser.expr();
System.out.println("Result: " + res);
}
catch (ParseException e)
{
System.out.println("Exception: " + e);
}
}
}
PARSER_END(Calc)
// we want to skip these characters; notice that here characters are enclosed in "",
// unlike the ANTLR syntax.
SKIP :
{
" "
| "\r"
| "\t"
}
// the end of line is a token
TOKEN :
{
< EOL: "\n" >
}
// the terminals in the grammar:
TOKEN :
{
< PLUS: "+" >
| < MINUS: "-" >
| < MULTIPLY: "*" >
| < DIVIDE: "/" >
| < LPAREN: "(" >
| < RPAREN: ")" >
}
// here the definition of an INT depends on the definition of DIGIT.
// DIGIT is prefixed with "#" to mean that it cannot be referenced
// from the parser and can only be part of another token (similar to
// the protected lexer rules in ANTLR).
TOKEN :
{
< INT: ( <DIGIT> )+ >
| < #DIGIT: ["0" - "9"] >
}
float expr() :
{
float f = 0;
}
{
sum()
( <EOL> )*
<EOF>
}
float sum() :
{
float f = 0;
float f2 = 0;
}
{
term()
(
(
<PLUS> term() | <MINUS> term())
)*
}
float term() :
{
float f = 0;
float f2 = 0;
}
{
element()
(
( <MULTIPLY> element()
| <DIVIDE> element() )
)*
}
float element() :
{
float f = 0;
Token t;
Float x;
}
{
( <INT>
| <LPAREN> f=sum() <RPAREN> )
}
Notice that references to other rules look like function calls, and references to tokens are enclosed in <>.
PARSER_BEGIN(Calc)
public class Calc
{
public static void main(String args[])
{
try
{
Calc parser = new Calc(System.in);
System.out.print("Enter Expression: ");
System.out.flush();
float res = parser.expr();
System.out.println("Result: " + res);
}
catch (ParseException e)
{
System.out.println("Exception: " + e);
}
}
}
PARSER_END(Calc)
SKIP :
{
" "
| "\r"
| "\t"
}
TOKEN :
{
< EOL: "\n" >
}
TOKEN :
{
< PLUS: "+" >
| < MINUS: "-" >
| < MULTIPLY: "*" >
| < DIVIDE: "/" >
| < LPAREN: "(" >
| < RPAREN: ")" >
}
TOKEN :
{
< INT: ( <DIGIT> )+ >
| < #DIGIT: ["0" - "9"] >
}
float expr() :
{
float f = 0;
}
{
(
f=sum()
( <EOL> )*
<EOF>
)
{
return f;
}
}
float sum() :
{
float f = 0;
float f2 = 0;
}
{
f=term()
(
(
<PLUS> f2=term() { f += f2; }
| <MINUS> f2=term()
{ f -= f2; }
)
)*
{
return f;
}
}
float term() :
{
float f = 0;
float f2 = 0;
}
{
f=element()
(
(
<MULTIPLY> f2=element() { f *= f2; }
|
<DIVIDE> f2=element() { f /= f2; }
)
)*
{
return f;
}
}
float element() :
{
float f = 0;
Token t;
Float x;
}
{
(
t=<INT> {
x= new Float(t.image);
f= x.floatValue();
}
|
<LPAREN>
f=sum() <RPAREN>
)
{
return f;
}
}