Sunday, April 22, 2012

Movimentum - Building the model

Now that we have model classes and a grammar, we must direct our parser to create the model while parsing. In ANTLR, one writes the corresponding statements (actions) directly into the grammar. Here is the original rule for script, followed by the "code-enhanced" version:

    script
      : config
        thingdefinition*
        ( time
          constraint*
        )*
        EOF
      ;

    script returns [Script script]
      :                      {{ var ths = new List<Thing>();
                                var sts = new List<Step>();
                             }}
        cfg=config
        ( th=thingdefinition { ths.Add(th); }
        )*
        ( t=time             {{ var cs = new List<Constraint>(); }}
          ( c=constraint     { cs.Add(c); }
          )*                 { sts.Add(new Step(t, cs)); }
        )*                   { script = new Script(cfg,ths,sts); }
        EOF
      ;

Some bits and pieces:
  • Each call to a subrule now assigns its result to a local variable (which ANTLR declares for us).
  • These variables can then be used in the actions.
  • I prefer the "constructor style" of object creation: first create all sub-objects, then call the constructor of the result (new Script(...) in this case). The alternative is to create the object at the beginning and then fill it via construction methods (e.g. AddStep(...)). But that requires modification methods on the objects, and as I said, I like to keep immutability as intact as possible.
  • ANTLR usually wraps actions in if (!backtracking) { ... } guards. For some declarations, however, this leads to code that does not compile. For these cases, ANTLR provides {{...}} blocks, which are emitted without the guard; one must take care that they contain no code with side effects. As seen above, I use these blocks for the declarations and empty initializations of local collections. This never does any harm (and for this rule, I even know that no backtracking will ever happen).
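
To make the "constructor style" a bit more concrete, here is a minimal sketch in plain C#. The classes are hypothetical, much simplified stand-ins (constraints are just strings, Script holds only steps); the real model classes will look different. The point is only that all parts are collected in local lists first and that the objects offer no modification methods afterwards:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical, simplified stand-ins for the model classes.
    public sealed class Step {
        private readonly double _time;
        private readonly string[] _constraints;

        public Step(double time, IEnumerable<string> constraints) {
            _time = time;
            // Copy, so that later changes to the caller's list
            // are not visible in this Step.
            _constraints = constraints.ToArray();
        }

        public double Time { get { return _time; } }
        public IEnumerable<string> Constraints { get { return _constraints; } }
    }

    public sealed class Script {
        private readonly Step[] _steps;
        public Script(IEnumerable<Step> steps) { _steps = steps.ToArray(); }
        public IEnumerable<Step> Steps { get { return _steps; } }
    }

    public static class ConstructorStyleDemo {
        // Mirrors the actions of the script rule: fill local lists
        // first, call the constructors last.
        public static Script Build() {
            var sts = new List<Step>();
            var cs = new List<string> { "a = b", "c = d" };
            sts.Add(new Step(0.0, cs));
            cs.Add("added later");  // does not affect the Step created above
            return new Script(sts);
        }
    }

With the "AddStep style", Step and Script would need public Add methods, and every holder of a reference could change them after parsing.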

In the next production rule, we need the actual value of a number. Therefore, I replace NUMBER everywhere with a new rule number, which is defined as follows:

   number returns [double value]
     : NUMBER            { value = double.Parse($NUMBER.Text,
                                   CultureInfo.InvariantCulture); }
     ;

(This requires an additional "using System.Globalization;" in the @parser::header section.) With this, I can now complete the config rule:

    config returns [Config result]
      : CONFIG '('
              fptu=number  // frames per time unit
        ')'                  { result = new Config(fptu); }
        ';'
      ;
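
A side note on the CultureInfo.InvariantCulture argument in the number rule: without it, double.Parse uses the current culture of the machine, so the same script could be read differently on differently configured systems. The following small sketch shows the two variants side by side; the class and method names are made up for this illustration, and "de-DE" is just one example of a culture with ',' as decimal separator:

    using System;
    using System.Globalization;

    public static class NumberParsingDemo {
        // Same call as in the action of the number rule.
        public static double ParseInvariant(string text) {
            return double.Parse(text, CultureInfo.InvariantCulture);
        }

        // For comparison only: parse with an explicit German culture,
        // where ',' is the decimal separator and '.' the group separator.
        public static double ParseGerman(string text) {
            return double.Parse(text, CultureInfo.GetCultureInfo("de-DE"));
        }
    }

ParseInvariant("1.5") yields 1.5 on every machine. ParseGerman("1.5") is not guaranteed to, because the '.' is then not treated as a decimal point.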

The rule for a thingdefinition works a little bit differently from the rest, because it does that small computation for relative anchor definitions:

    thingdefinition returns [Thing result]
      : IDENT
        ':'
        s=source    {{ var defs = new Dictionary<string,
                                         ConstVector>();
                    }}
        (anchordefinition[defs]
        )+
        ';'         { result = new Thing($IDENT.Text, s, defs); }
      ;
  
    anchordefinition [Dictionary<string, ConstVector> defs]
      : n=IDENT
        '='    {{ ConstVector v = null; }}
        ( c=constvector 
               { v = c; }
        | i=IDENT '+' c=constvector 
               { v = ConstAdd($i.Line, defs, $i.Text, c, true); }
        | i=IDENT '-' c=constvector 
               { v = ConstAdd($i.Line, defs, $i.Text, c, false); }
        )      { defs[$n.Text] = v; }
      ;

ConstAdd is a helper function which we define in the non-generated part of the partial class MovimentumParser:

        private ConstVector ConstAdd(
                int lineNo,
                Dictionary<string, ConstVector> defs,
                string lhsName,
                ConstVector rhs,
                bool plus) {
            ConstVector lhs;
            if (!defs.TryGetValue(lhsName, out lhs)) {
                throw new Exception(string.Format(
                    "Line {0}: Anchor {1} not yet defined",
                    lineNo, lhsName));
            }
            return plus
                ? new ConstVector(lhs.X + rhs.X, lhs.Y + rhs.Y)
                : new ConstVector(lhs.X - rhs.X, lhs.Y - rhs.Y);
        }
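
To see what this computes, here is a self-contained sketch that can be run outside the parser. It assumes a minimal, hypothetical ConstVector with just X and Y (the real model class may have more), and it repeats the helper as a static method only so that the example compiles on its own:

    using System;
    using System.Collections.Generic;

    // Hypothetical minimal ConstVector; the real model class may differ.
    public sealed class ConstVector {
        public readonly double X, Y;
        public ConstVector(double x, double y) { X = x; Y = y; }
    }

    public static class ConstAddDemo {
        // Same logic as the ConstAdd helper, made static for the demo.
        public static ConstVector ConstAdd(int lineNo,
                Dictionary<string, ConstVector> defs,
                string lhsName, ConstVector rhs, bool plus) {
            ConstVector lhs;
            if (!defs.TryGetValue(lhsName, out lhs)) {
                throw new Exception(string.Format(
                    "Line {0}: Anchor {1} not yet defined", lineNo, lhsName));
            }
            return plus
                ? new ConstVector(lhs.X + rhs.X, lhs.Y + rhs.Y)
                : new ConstVector(lhs.X - rhs.X, lhs.Y - rhs.Y);
        }

        // Resolves two anchors the way anchordefinition would, roughly for
        //   P = [10,20]   and   Q = P + [5,-5]
        public static Dictionary<string, ConstVector> Resolve() {
            var defs = new Dictionary<string, ConstVector>();
            defs["P"] = new ConstVector(10, 20);
            defs["Q"] = ConstAdd(2, defs, "P", new ConstVector(5, -5), true);
            return defs;
        }
    }

After Resolve(), Q is [15,15]. Because the dictionary is filled in definition order, an anchor can only refer to anchors defined before it; a forward reference ends in the "not yet defined" exception.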

The ConstVector itself is created by the rules constvector and constscalar:

    constvector returns [ConstVector result]
      :                      {{ double x = double.NaN, 
                                       y = double.NaN; }}
        '['
        ('-' c=constscalar   { x = -c; }
        |    c=constscalar   { x = c; }
        )
        ','
        ('-' c=constscalar   { y = -c; }
        |    c=constscalar   { y = c; }
        )
        ']'                  { result = new ConstVector(x,y); }
      ;

    constscalar returns [double result]
      : n=number             { result = n; }
      | c=constvector
        ( X                  { result = c.X; }
        | Y                  { result = c.Y; }
        )
      ;

We also need to read in an image. A straightforward implementation would be:

    source returns [Image result]
      : FILENAME     { result = Image.FromFile($FILENAME.Text); }
      ;

However, when you try this (or think about it), it kills testability: Suddenly, we must really have all those image files at the correct place in the file system! So, we replace this with a call to a virtual method ImageFromFile in our parser, and then use an overridden (or mocked) version in the unit tests.
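
The idea can be sketched as follows. ParserSketch and TestParserSketch are made-up names that stand in for the hand-written part of MovimentumParser and for a test subclass; the Source method merely plays the role of the action in the source rule. The sketch assumes that System.Drawing is referenced:

    using System.Drawing;

    // Hypothetical stand-in for the hand-written part of the parser.
    public class ParserSketch {
        // Production behavior: really load the image from disk.
        protected virtual Image ImageFromFile(string fileName) {
            return Image.FromFile(fileName);
        }

        // Stands in for the action of the source rule.
        public Image Source(string fileName) {
            return ImageFromFile(fileName);
        }
    }

    // Used in unit tests: no file system access at all.
    public class TestParserSketch : ParserSketch {
        protected override Image ImageFromFile(string fileName) {
            return new Bitmap(1, 1);
        }
    }

A test can now call new TestParserSketch().Source("no-such-file.png") and gets a 1x1 dummy image, whether or not the file exists.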

What about the constraints? Well, currently the rule for a constraint simply declares a result - ANTLR will initialize it with default(Constraint), which is null - and this is what we get right now for each constraint:

    constraint returns [Constraint result]
      : vectorexpr
        '='
        vectorexpr
        ';'
      | ...
      ;

I'll of course add more code here.

Here is the grammar file in its current state, here is the partial MovimentumParser class with the methods mentioned above, and here are a few of the current test cases.
