Here is an ANTLR grammar for Movimentum. First, we need some ANTLR and .Net noise:
grammar Movimentum;
options {
language=CSharp2;
}
@parser::header {
using System.Collections.Generic;
}
@parser::namespace { Movimentum.Parser }
@lexer::namespace { Movimentum.Lexer }
Then, I like to start grammars top down:
script
: config
objectdefinition*
( time
constraint*
)*
EOF
;
config
: CONFIG '('
NUMBER // frames per time unit
',' unit // angular unit
')'
;
At this point, we also have to start on the lexer definitions:
CONFIG : '.config';
NUMBER
: ('0'..'9')+
( '.'
('0'..'9')*
)?
( ('E'|'e')
('-')?
('0'..'9')+
)?
;
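To get a feel for what the number rule above accepts, here is a rough Python regular expression that mirrors it. This is only an illustrative sketch and not part of Movimentum; the names `NUMBER_RE` and `is_number` are made up for this example.

```python
import re

# Hypothetical regex mirroring the number rule of the grammar:
# one or more digits, an optional fraction ('.' followed by zero or
# more digits), and an optional exponent with an optional minus sign.
NUMBER_RE = re.compile(r"[0-9]+(\.[0-9]*)?([Ee]-?[0-9]+)?")


def is_number(text: str) -> bool:
    """Return True if the whole text forms a single number token."""
    return NUMBER_RE.fullmatch(text) is not None
```

Under this reading, `3.` and `1.5e-3` are numbers, while `.5`, `-1` and `1e+3` are not: a leading dot, a sign, or a plus in the exponent is not part of the token.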
And before we forget it, we add rules for whitespace and comments:
WHITESPACE
: ( '\t' | ' ' | '\r' | '\n' )+ { $channel = HIDDEN; }
;
COMMENT
: '/' '/' .* ( '\r' | '\n' ) { $channel = HIDDEN; }
;
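The effect of sending whitespace and comments to the hidden channel can be sketched in a few lines of Python: the lexer still recognizes these tokens, but the parser never sees them. This is a simplified illustration, not ANTLR's actual machinery; `TOKEN_SPEC`, `HIDDEN`, `visible_tokens` and the `PUNCT` token kind are assumptions made up for the example.

```python
import re

# Hypothetical token patterns, loosely mirroring the lexer rules so far.
# As in the grammar, a comment must be terminated by '\r' or '\n'.
TOKEN_SPEC = [
    ("COMMENT", r"//[^\r\n]*(?:\r|\n)"),
    ("WHITESPACE", r"[\t \r\n]+"),
    ("CONFIG", r"\.config"),
    ("NUMBER", r"[0-9]+(?:\.[0-9]*)?(?:[Ee]-?[0-9]+)?"),
    ("PUNCT", r"[(),;]"),
]

# Token kinds that are recognized but not handed to the parser.
HIDDEN = {"COMMENT", "WHITESPACE"}

MASTER_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def visible_tokens(text):
    """Return (kind, lexeme) pairs, dropping tokens on the hidden channel."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup not in HIDDEN:
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For example, `visible_tokens(".config ( 25 ) // frames\n;")` yields only the `CONFIG`, `NUMBER` and punctuation tokens. One consequence of the comment rule as written is that a comment on the very last line needs a trailing line break.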
The next important things are the objectdefinitions:
objectdefinition
: IDENT
':'
source
anchordefinition+
';'
;
We now need an IDENT in the lexer. For the moment, we restrict ourselves to ASCII letters in its definition:
IDENT
: ('a'..'z'|'A'..'Z')('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
For the source, we currently define only a "filename" (I suspect I will want plain text later, and maybe even some standard shapes like arrows and lines):
source
: FILENAME
;
and
FILENAME
: '\''
(~('\''))*
'\''
;
Finally, we need anchordefinitions. We allow only constant vectors in them:
anchordefinition
: IDENT
'='
( constvector
| IDENT '+' constvector
| IDENT '-' constvector
)
;
where constvector and constscalar are defined as
constvector
: '['
('-')? constscalar
','
('-')? constscalar
']'
;
constscalar
: NUMBER
| constvector X
| constvector Y
;
This requires our first operator definitions in the lexer:
X : '.x';
Y : '.y';
This should suffice to define objects and their anchors for the moment. Time to set up a .Net project, write the setup part of our crank-slider mechanism into a file and check whether we can read it (after we comment out the time and constraint references in script)!
UPDATE: Setting up the project and writing the first test case revealed two errors in the grammar above. First, a semicolon is missing at the end of the config rule. It must be:
config
: CONFIG '('
NUMBER // frames per time unit
',' unit // angular unit
')'
';'
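;
The second is an LL problem: The second and the third alternative of the constscalar rule both start with constvector, which can be arbitrarily long, so the parser cannot decide between them with a fixed amount of lookahead. Therefore, the common prefix has to be factored out: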
constscalar
: NUMBER
| constvector
( X
| Y
)
;
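Why the factored rule is easy to parse shows up nicely in a hand-written recursive-descent sketch: the constvector prefix is parsed once, and then a single token (`.x` or `.y`) decides. The Python below is only an illustration under my own assumptions, not the code ANTLR generates; `tokenize`, `Parser` and `parse_constvector` are made-up names, and evaluating `.x` and `.y` as the first and second component is my reading of the operators.

```python
import re

# Hypothetical mini-lexer: just enough tokens for constvector/constscalar.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<NUMBER>[0-9]+(?:\.[0-9]*)?(?:[Ee]-?[0-9]+)?)"
    r"|(?P<X>\.x)|(?P<Y>\.y)|(?P<PUNCT>[\[\],-]))"
)


def tokenize(text):
    """Split text into (kind, lexeme) pairs; punctuation is its own kind."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at position {pos}")
        kind = match.lastgroup
        lexeme = match.group(kind)
        tokens.append((lexeme if kind == "PUNCT" else kind, lexeme))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text) + [("EOF", "")]
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0]

    def expect(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()}")
        lexeme = self.tokens[self.pos][1]
        self.pos += 1
        return lexeme

    def signed_scalar(self):
        # ('-')? constscalar
        sign = 1.0
        if self.peek() == "-":
            self.expect("-")
            sign = -1.0
        return sign * self.constscalar()

    def constvector(self):
        # '[' ('-')? constscalar ',' ('-')? constscalar ']'
        self.expect("[")
        x = self.signed_scalar()
        self.expect(",")
        y = self.signed_scalar()
        self.expect("]")
        return (x, y)

    def constscalar(self):
        # NUMBER | constvector ( X | Y )
        if self.peek() == "NUMBER":
            return float(self.expect("NUMBER"))
        vector = self.constvector()  # the common prefix, parsed only once
        if self.peek() == "X":       # now one token of lookahead suffices
            self.expect("X")
            return vector[0]
        self.expect("Y")
        return vector[1]


def parse_constvector(text):
    """Parse a complete constvector and return it as an (x, y) tuple."""
    parser = Parser(text)
    vector = parser.constvector()
    parser.expect("EOF")
    return vector
```

With this sketch, `parse_constvector("[[3, 4].y, -[5, 6].x]")` evaluates to `(4.0, -5.0)`. With the unfactored rule, a parser of this kind would have to guess between the two constvector alternatives before having seen the `.x` or `.y` that tells them apart.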
In a test case, I can now parse the object definitions for the slider-crank mechanism flawlessly!