Sunday, April 22, 2012

Movimentum - The script model

Now that we can parse scripts, we ... should test the parser. However, that would give us no more than a yes/no answer on whether the parse succeeded. So we continue straight to building a model from the input; we will then also use these models in a few unit tests for the parser.

ANTLR (and probably other parser generators) can create standard parse trees with little code. These trees, however, consist of nodes of a single type that hold their children in generic collections, similar to the DOM and LINQ to XML structures for XML. I want to program against "full-featured" classes instead, so I write my own model classes. Using ReSharper, this is not much work. Let's do it for the top-level element script: First, I define a raw class like this:

    class Script {
        public Config Config { get { return _config; } }
        public IEnumerable<Thing> Things { get { return _things; } }
        public IEnumerable<Step> Steps { get { return _steps; } }
    }

R# complains in red about all those missing things. With Alt+Enter on _config, _things, and _steps, we do "Create field ...":

    class Script {
        private Config _config;
        private IEnumerable<Thing> _things;
        private IEnumerable<Step> _steps;
        public Config Config { get { return _config; } }
        public IEnumerable<Thing> Things { get { return _things; } }
        public IEnumerable<Step> Steps { get { return _steps; } }
    }

R# now complains (with a wavy line) that each "Field is never assigned" - Alt+Enter on each field allows us to "Initialize Field from constructor(s) parameter":

    class Script {
        private Config _config;
        private IEnumerable<Thing> _things;
        private IEnumerable<Step> _steps;
        public Script(Config config, IEnumerable<Thing> things,
                      IEnumerable<Step> steps) {
            _config = config;
            _things = things;
            _steps = steps;
        }

        public Config Config { get { return _config; } }
        public IEnumerable<Thing> Things { get { return _things; } }
        public IEnumerable<Step> Steps { get { return _steps; } }
    }

(My) R# now complains that the fields could be made readonly - Alt+Enter on each field corrects that, and we are done:

    class Script {
        private readonly Config _config;
        private readonly IEnumerable<Thing> _things;
        private readonly IEnumerable<Step> _steps;
        public Script(Config config, IEnumerable<Thing> things,
                      IEnumerable<Step> steps) {
            _config = config;
            _things = things;
            _steps = steps;
        }

        public Config Config { get { return _config; } }
        public IEnumerable<Thing> Things { get { return _things; } }
        public IEnumerable<Step> Steps { get { return _steps; } }
    }

Question: Why don't I use auto properties - which would reduce the number of lines by one-third? Or auto properties with setters, which would reduce the number of lines to one-third?
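Answer: Because I always enforce immutability in my programs, even against private methods: Things that should not change should not change. An auto property needs at least a private setter, so code inside the class could still reassign it; a readonly field cannot be reassigned once the constructor has run. And if that boiler-plate code gets too "noisy", I separate the classes into two partial classes, one with the boiler-plate, the other one with the "useful" code.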
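
A minimal sketch of that partial-class split, applied to the Script class from above. The stub classes, the file names in the comments, and the StepCount property are assumptions made only for this illustration; they are not part of the actual model.

```csharp
using System.Collections.Generic;
using System.Linq;

// Empty stand-ins so that the sketch compiles on its own.
class Config { }
class Thing { }
class Step { }

// Part 1, e.g. in Script.Boilerplate.cs (hypothetical file name):
// fields, constructor, and read-only properties.
partial class Script {
    private readonly Config _config;
    private readonly IEnumerable<Thing> _things;
    private readonly IEnumerable<Step> _steps;

    public Script(Config config, IEnumerable<Thing> things,
                  IEnumerable<Step> steps) {
        _config = config;
        _things = things;
        _steps = steps;
    }

    public Config Config { get { return _config; } }
    public IEnumerable<Thing> Things { get { return _things; } }
    public IEnumerable<Step> Steps { get { return _steps; } }
}

// Part 2, e.g. in Script.cs (hypothetical file name): the "useful" code.
// StepCount is a hypothetical member, invented for this illustration.
partial class Script {
    public int StepCount { get { return _steps.Count(); } }
}
```

Both parts compile into one class, so the second part can read the private readonly fields of the first, but it cannot assign them.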

I do this same process now for all constructs from the grammar. Along the way, I think a little bit about abstraction, but not too much. Here is an example: Class Thing needs a source and the anchors.
  • For the source, I use type System.Drawing.Image - i.e., we read the images already during parsing. This will lead to "file not found" errors during parsing ... I live with that.
  • For the list of anchors, I want a simple Dictionary: The key is the name of the anchor, the value is the computed vector. This has a few semantic implications: (a) Anchors that use other anchors in their definition must come after the definition of the used ones. (b) We must be able to add and subtract ConstVectors already during parsing. I'll show the corresponding code when I write it.
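
The two implications for the anchor dictionary can be sketched as follows. This is only an illustration under stated assumptions: the ConstVector shown here is a stand-in with double components (the real class may look different), and AnchorTable with its Define and Lookup methods is a hypothetical helper, not code from the project.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for ConstVector: an immutable 2D vector
// that can be added and subtracted while parsing (implication b).
class ConstVector {
    private readonly double _x;
    private readonly double _y;
    public ConstVector(double x, double y) { _x = x; _y = y; }
    public double X { get { return _x; } }
    public double Y { get { return _y; } }

    public static ConstVector operator +(ConstVector a, ConstVector b) {
        return new ConstVector(a._x + b._x, a._y + b._y);
    }
    public static ConstVector operator -(ConstVector a, ConstVector b) {
        return new ConstVector(a._x - b._x, a._y - b._y);
    }
}

// Hypothetical helper that collects anchors in definition order.
class AnchorTable {
    private readonly Dictionary<string, ConstVector> _anchors =
        new Dictionary<string, ConstVector>();

    // Stores an anchor whose vector has already been computed.
    // Dictionary.Add throws an ArgumentException for a duplicate name.
    public void Define(string name, ConstVector value) {
        _anchors.Add(name, value);
    }

    // Resolves an anchor that is used inside another anchor's definition.
    // Fails if the used anchor has not been defined yet (implication a).
    public ConstVector Lookup(string name) {
        ConstVector value;
        if (!_anchors.TryGetValue(name, out value)) {
            throw new InvalidOperationException(
                "Anchor '" + name + "' is used before it is defined.");
        }
        return value;
    }
}
```

With this sketch, `anchors.Define("Q", anchors.Lookup("P") + new ConstVector(5, 0))` works only if "P" was defined earlier, which is exactly the ordering restriction described in (a).
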
Here is a piece of code that shows the "raw" model for steps and constraints. Instead of the operator in the ScalarInequalityConstraint, I could also have
  • provided four subclasses; or
  • reduced the number of operators to two by reversing the left-hand-side and right-hand-side e.g. for GT and GE.
If I have to write more than one or two switches later, I might return to one of these ideas.

    class Step {
        public decimal Time { get { return _time; } }
        public IEnumerable<Constraint> Constraints {
            get { return _constraints; }
        }
    }

    abstract class Constraint {}

    class VectorEqualityConstraint : Constraint {
        public VectorExpr Lhs { get { return _lhs; } }
        public VectorExpr Rhs { get { return _rhs; } }
    }

    class ScalarEqualityConstraint : Constraint {
        public string Variable { get { return _variable; } }
        public ScalarExpr Rhs { get { return _rhs; } }
    }

    class ScalarInequalityConstraint : Constraint {
        public string Variable { get { return _variable; } }
        public ScalarInequalityOperator Operator { get { return _operator; } }
        public ScalarExpr Rhs { get { return _rhs; } }
    }

    enum ScalarInequalityOperator { LT, LE, GT, GE }
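
To make the trade-off around the operator enum more concrete, here is a sketch of the kind of switch it leads to, together with the second alternative (GT and GE expressed as LT and LE with exchanged sides). The ScalarInequality class and both method names are hypothetical, and the sketch compares plain decimal values rather than the model's Variable and ScalarExpr.

```csharp
using System;

enum ScalarInequalityOperator { LT, LE, GT, GE }

// Hypothetical helper, not part of the model.
static class ScalarInequality {
    // One switch over all four operators.
    public static bool Holds(decimal lhs, ScalarInequalityOperator op,
                             decimal rhs) {
        switch (op) {
            case ScalarInequalityOperator.LT: return lhs < rhs;
            case ScalarInequalityOperator.LE: return lhs <= rhs;
            case ScalarInequalityOperator.GT: return lhs > rhs;
            case ScalarInequalityOperator.GE: return lhs >= rhs;
            default: throw new ArgumentOutOfRangeException("op");
        }
    }

    // The same check with only LT and LE doing the real work:
    // GT and GE are rewritten by exchanging the two sides.
    public static bool HoldsWithTwoOperators(decimal lhs,
                                             ScalarInequalityOperator op,
                                             decimal rhs) {
        switch (op) {
            case ScalarInequalityOperator.GT: // a > b  is  b < a
                return Holds(rhs, ScalarInequalityOperator.LT, lhs);
            case ScalarInequalityOperator.GE: // a >= b  is  b <= a
                return Holds(rhs, ScalarInequalityOperator.LE, lhs);
            default:
                return Holds(lhs, op, rhs);
        }
    }
}
```

Every further operation on constraints would need another switch of the first kind, which is why the number of such switches is a reasonable signal for revisiting the design.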

Here is the raw structure of the vector expressions. Again, I use operators instead of too many subclasses - but I might reconsider this later, when we get methods in these classes. The Vector class is already fully implemented, as I copied and adapted it from the ConstVector:

    abstract class VectorExpr {}

    class BinaryVectorExpr : VectorExpr {
        public VectorExpr Lhs { get { return _lhs; } }
        public BinaryVectorOperator Operator { get { return _operator; } }
        public VectorExpr Rhs { get { return _rhs; } }
    }

    enum BinaryVectorOperator { PLUS, MINUS, TIMES }

    class VectorScalarExpr : VectorExpr {
        public VectorExpr Lhs { get { return _lhs; } }
        public VectorScalarOperator Operator { get { return _operator; } }
        public ScalarExpr Rhs { get { return _rhs; } }
    }

    enum VectorScalarOperator { ROTATE }

    class UnaryVectorExpr : VectorExpr {
        public UnaryVectorOperator Operator { get { return _operator; } }
        public VectorExpr Inner { get { return _inner; } }
    }

    enum UnaryVectorOperator { MINUS, INTEGRAL, DIFFERENTIAL }

    class Vector : VectorExpr {
        private readonly ScalarExpr _x;
        private readonly ScalarExpr _y;
        public Vector(ScalarExpr x, ScalarExpr y) {
            _x = x;
            _y = y;
        }

        public ScalarExpr X { get { return _x; } }
        public ScalarExpr Y { get { return _y; } }
    }

    class Anchor : VectorExpr {
        public Thing Thing { get { return _thing; } }
        public string Name { get { return _name; } }
    }

Finally, here are the definitions for scalar expressions:

    abstract class ScalarExpr { }

    class BinaryScalarExpr : ScalarExpr {
        public ScalarExpr Lhs { get { return _lhs; } }
        public BinaryScalarOperator Operator { get { return _operator; } }
        public ScalarExpr Rhs { get { return _rhs; } }
    }

    enum BinaryScalarOperator { PLUS, MINUS, TIMES, DIVIDE }

    class UnaryScalarExpr : ScalarExpr {
        public UnaryScalarOperator Operator { get { return _operator; } }
        public ScalarExpr Inner { get { return _inner; } }
    }

    enum UnaryScalarOperator { MINUS, INTEGRAL, DIFFERENTIAL }

    class BinaryScalarVectorExpr : ScalarExpr {
        public VectorExpr Lhs { get { return _lhs; } }
        public BinaryScalarVectorOperator Operator { get { return _operator; } }
        public VectorExpr Rhs { get { return _rhs; } }
    }

    enum BinaryScalarVectorOperator { ANGLE }

    class UnaryScalarVectorExpr : ScalarExpr {
        public VectorExpr Inner { get { return _inner; } }
        public UnaryScalarVectorOperator Operator { get { return _operator; } }
    }

    enum UnaryScalarVectorOperator { LENGTH, X, Y }

    class Constant : ScalarExpr {
        public decimal Value { get { return _value; } }
    }

    class ScalarVariable : ScalarExpr { // Also _
        public string Name { get { return _name; } }
    }

    class T : ScalarExpr {
    }

    class IV : ScalarExpr {
    }

Now, I have to add the boiler-plate code. Bear with me ...

... two minutes later, everything is done (ReSharper is a fantastic tool!). Here is the created model file.
Keeping all classes in one file is not really OK. I have to decide later which code stays in there (the boiler-plate code), and which I move out to separate files.
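
For illustration, this is roughly what two of the scalar expression classes look like after the same ReSharper steps that were shown for Script, plus a small tree built from them. It is a sketch under assumptions: the constructor parameter order and the ModelExample class are made up here, and the actual model file may differ.

```csharp
abstract class ScalarExpr { }

enum BinaryScalarOperator { PLUS, MINUS, TIMES, DIVIDE }

class Constant : ScalarExpr {
    private readonly decimal _value;
    public Constant(decimal value) {
        _value = value;
    }

    public decimal Value { get { return _value; } }
}

class BinaryScalarExpr : ScalarExpr {
    private readonly ScalarExpr _lhs;
    private readonly BinaryScalarOperator _operator;
    private readonly ScalarExpr _rhs;
    // "operator" is a C# keyword, so the parameter needs the @ prefix.
    public BinaryScalarExpr(ScalarExpr lhs, BinaryScalarOperator @operator,
                            ScalarExpr rhs) {
        _lhs = lhs;
        _operator = @operator;
        _rhs = rhs;
    }

    public ScalarExpr Lhs { get { return _lhs; } }
    public BinaryScalarOperator Operator { get { return _operator; } }
    public ScalarExpr Rhs { get { return _rhs; } }
}

// Hypothetical helper that builds the tree for 2 * 3 + 1.
static class ModelExample {
    public static ScalarExpr TwoTimesThreePlusOne() {
        ScalarExpr product = new BinaryScalarExpr(
            new Constant(2m), BinaryScalarOperator.TIMES, new Constant(3m));
        return new BinaryScalarExpr(
            product, BinaryScalarOperator.PLUS, new Constant(1m));
    }
}
```

Because all fields are readonly and there are no setters, such a tree cannot be modified after it has been built.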
