Wednesday, July 25, 2012

Movimentum - Polynomial folding 1

As explained in two previous postings, we want to standardize expressions as much as possible so that rules can easily match them for rewriting. One case in point are expressions that are a sum of some powers of a single variable, and an additional square root term. We would like to see to it that such expressions are either of the form
P[x] + sqrt(E[x])
or of the form
P[x] + –sqrt(E[x])

where P is a polynomial in x, and E is some arbitrary expression. Is it necessary that E is also of that form? Is seems not, because we will need to match E only after rewriting the whole expression, and then, additional standardization possibilities might kick in. However, our folding visitor will eagerly rewrite also subexpressions—it is just easier to program it like that.

The core part of the PolynomialFoldingVisitor is very simple. It follows the lines of the ConstantFoldingVisitor, but does not even bother to optimize for non-rewritten expressions (I hope very much that we can do optimization later). We start with the standard visits for constraints ...

    public AbstractConstraint Visit(EqualsZeroConstraint equalsZero, Ignore p) {
        return new EqualsZeroConstraint(equalsZero.Expr.Accept(this, p));
    }

    public AbstractConstraint Visit(MoreThanZeroConstraint moreThanZero, Ignore p) {
        return new MoreThanZeroConstraint(moreThanZero.Expr.Accept(this, p));
    }

    public AbstractConstraint Visit(AtLeastZeroConstraint atLeastZero, Ignore p) {
        return new AtLeastZeroConstraint(atLeastZero.Expr.Accept(this, p));
    }

... as well as all the expressions: Constants, variables, and polynomials are already standardized to the P + Ei form (there is no Ei in all these cases!), therefore the code is trivial; unary and binary expressions delegate to the respective operators:

    public IAbstractExpr Visit(IConstant constant, Ignore p) {
        return constant;
    }

    public IAbstractExpr Visit(INamedVariable namedVar, Ignore p) {
        return namedVar;
    }

    public IAbstractExpr Visit(IAnchorVariable anchorVar, Ignore p) {
        return anchorVar;
    }

    public IAbstractExpr Visit(UnaryExpression unaryExpr, Ignore p) {
        IAbstractExpr newInner = unaryExpr.Inner.Accept(this, p);
        return unaryExpr.Op.Accept(this, newInner, p);
    }

    public IAbstractExpr Visit(BinaryExpression binaryExpr, Ignore p) {
        IAbstractExpr newLhs = binaryExpr.Lhs.Accept(this, p);
        IAbstractExpr newRhs = binaryExpr.Rhs.Accept(this, p);

        return binaryExpr.Op.Accept(this, newLhs, newRhs, p);
    }

    public IAbstractExpr Visit(IGeneralPolynomial polynomial, Ignore p) {
        return polynomial;
    }

We "only" have to complete the code for the operators, and then we are done. Of course, the real meat is to be found there—therefore, I'll explain that code in increasing order of complexity:
  1. Formal square root is easiest, because it cannot even be evaluated.
  2. Unary operators that cannot be rewritten to polynomials are als easy: sine, cosine, and positive square root. We only rewrite them if their argument is a constant, so that we do also constant folding.
  3. Next, we look at square (which rewrites polynomials and square roots to new polynomials).
  4. Unary minus (which rewrites a polynomial to another, quite similar polynomial) comes next. Why do we look at unary minus only after square? With the square operator, is is possible to see all the problems of polynomial rewriting "in a nutshell"; unary minus is actually too simple and would lead us to a design that we'd have to refactor later.
  5. Then, we tackle the plus operator.
  6. The times operator is the most challenging of all.
  7. As an aftershot, we handle division.

1. Formal square root rewriting


This rewriting "does nothing"—it recreates the input expression:
    public IAbstractExpr Visit(FormalSquareroot op, IAbstractExpr inner, Ignore p) {
         return new UnaryExpression(inner, new FormalSquareroot());
    }

2. Sine, cosine, and positive square root rewriting


The code for these rewritings essentially does the following:
  • If the inner expression is a constant, evaluate the operator on the constant and return the new constant.
  • Otherwise, wrap the inner expression in the operator.
The FoldIfConstant method gets inner as a parameter, but also a delegate that can do the evaluation if inner is a constant:

    public IAbstractExpr Visit(PositiveSquareroot op, IAbstractExpr inner, Ignore p) {
        return FoldIfConstant(op, inner, x => x.Near(0) ? 0 : Math.Sqrt(x));
    }
    public IAbstractExpr Visit(Sin op, IAbstractExpr inner, Ignore p) {
        return FoldIfConstant(op, inner, x => Math.Sin(x / 180 * Math.PI));
    }
    public IAbstractExpr Visit(Cos op, IAbstractExpr inner, Ignore p) {
        return FoldIfConstant(op, inner, x => Math.Cos(x / 180 * Math.PI));
    }
    private static IAbstractExpr FoldIfConstant(UnaryOperator op,
             IAbstractExpr inner, Func<double, double> eval) {
        IConstant innerAsConst = inner as IConstant;
        return innerAsConst == null
                    ? (IAbstractExpr)new UnaryExpression(inner, op)
                    : Polynomial.CreateConstant(eval(innerAsConst.Value));
    }

3. Square rewriting, part 1


Now we come to the crucial code: "Hard rewriting." We look at the square operator first, because its interaction with square roots will show more clearly what rewritings we want to do. We know that visitors of subexpressions will already have reformulated the inner expression to one of the following three forms:
  • P[V] + E
  • P[V]
  • E
Here, P[V] is a polynomial in the single variable V, and E is not—i.e., it is either not a polynomial at all, or a polynomial in some other variable. Two remarks are probably necessary:
  • P[V] might also be a constant, in which case we assume its variable is "some arbitrary or dummy variable for constants."
  • Because E may be a polynomial in some other variable, i.e. Q[W], it turns out that our standardization is not a normalization, as the following example shows: The expression 2x + 3y may be represented in two different ways—either the leading polynomial is 2x, and the "rest" E is 3y, or vice versa. One could probably repair this by requiring that the alphabetically lowest variable must be used to create the polynomial—but right now, I am not interested in this problem.
Now, we want to apply a square operator to such an expression, and again get an expression of on of these forms in as many cases as possible. Let us start with the simplest case, namely the second form P[V]: Of course, the square of a polynomial is again a polynomial in the same variable! We can sketch this rewriting as follows:
(P[V])2 → {P[V]2}
The curly braces mean that the expression inside actually has to be simplified to a polynomial—we will therefore need a method that squares a polynomial! But before we write this, let us look at the other possible inner expressions.

The next interesting case is the lonely E. One could believe that no rewritings are possible here except the trivial reconstruction of the input expression, but this is not true. Consider the case where E is of the form sqrt(E), where E is some other expression (sorry for using two different E's here—E and E; but in the code, I use E and e for the inner expressions, which I did not want to change).

Ok: What is the square of sqrt(E)? Of course, it is simply E, and we should capture this in a specific rewriting! And now it pays off that the visitor also folds inner expressions—we can now be sure that E is already standardized, and therefore we can happily sketch the rewriting as
(sqrt(E))2 → E
Next inside operator: If there is a unary minus operator inside the square, we can get rid of that operator, and we could be tempted to write
(–E)2 → E2
But this is not correct! Assume that E is sqrt(x). Then, (–sqrt(x))2 should not be rewritten to the unary expression (sqrt(x))2, but to the polynomial x. There are two ways to solve this problem:
  • One possibility is to recursively run it again through the standardization! We only have to take care that we do not produce an infinite recursion. But as long as our standardization is a "folding," i.e., a simplification, there is some metrics on the expression that gets strictly smaller with each rewriting—e.g., the number of operators in the whole expression. And when there is a lower limit on this number (zero, in the example metrics), we know that simplification must end somewhen. We will use this for the most difficult of all operators, namely multiplication.
  • Here, we have a simpler solution: We add a more specific rewriting for square roots:
(–sqrt(E))2 → E
Do we also have to consider "minus inside minus," i.e., (– – E)2? No, because – – E must already have been simplified by a lower visitor; we'll look to that in the section about the unary minus operator.

Finally, we look at the third, most general form of an inner expression: P[V] + E. If we square it, we get of course P[V]2 + 2 P[V] E + E2. Which parts of this can be simplified? Of course, P[V]2 and 2 P[V] are both polynomials.

But we must be more careful: Suppose that P[V] is x, and E is sqrt(x). Then, (x + sqrt(x))2 = x2 + 2x sqrt(x) + x, which we would like to have simplifed to the polynomial x2 + x plus the middle expression. And there might be other cases where also the middle term has to be folded into the result somehow ... What do we do? Again, we have the two possibilities:
  • Recursive standardization.
  • Explicitly add rules for square roots that handle special cases.
Again, I opt for special casing here. The cases are, from most specific to least specific:
  • inner expression is P[V] + –sqrt(Q[V]) → square is {P[V]2 + Q[V]} + {–2P[V]}sqrt(Q[V]).
  • inner expression is P[V] + sqrt(Q[V]) → square is {P[V]2 + Q[V]} + {2P[V]}sqrt(Q[V]).
  • inner expression is P[V] + –sqrt(E) → square is {P[V]2} + ({–2P[V]}sqrt(E) + E). We parenthesize the last two terms so that the complete result is of the form P + E.
  • inner expression is P[V] + sqrt(E) → square is {P[V]2} + ({2P[V]}sqrt(E) + E). Again, the last two terms are grouped together.
  • inner expression is P[V] + –E → square is {P[V]2} + ({–2P[V]}E + (E)2).
  • inner expression is P[V] + E → square is {P[V]2} + ({2P[V]}E + (E)2).

We now have a specification for folding all expressions with square operator. But how to code it?

No comments:

Post a Comment