SWE Musings: Logical Operators in NHibernate 3.x

Just so that this blog doesn't dry up, I (allow myself to cross-)post part of a text I wrote about NHibernate 3.x's handling of logical operators. If you want to tune it to the discussion, you might want to read the group http://groups.google.com/group/nhibernate-development.

1. Considerations about the where semantics for an SQL Linq provider

Before we change the behavior of logical operators in NhHib.Linq, we should have some agreement about the intended behavior, i.e., a specification. Otherwise, one might argue that there is no problem with them - especially with the || operator! Unfortunately (and interestingly), it appears to me the answer to this is not as simple as some might think. The following are some general thoughts - bear with me, I'll talk about the logical operators, and especially the || operator, in a moment (namely in sections 2. to 6.).

The apparently correct way to specify an SQL Linq provider's (or any Linq provider's) behavior is:

    (BEHAV-1) If you have a set of objects S and evaluate a query Q using an SQL Linq provider, the result should be the same as when using the standard in-memory Linq provider Linq2Objects.

Obviously, this specification fails for methods that cannot be transformed to the SQL Linq provider's underlying technology (think virtual object methods, for example). So a more practical definition is:

    (BEHAV-2) ...BEHAV-1..., if the SQL Linq provider accepts Q, i.e., does not throw some exception saying "I do not support this and that".

But it gets more tricky: What are the possible results of Linq2Objects when evaluating Q? Essentially
(a) it can return an IQueryable; or
(b) it can throw an exception.

(BEHAV-2) would require that the SQL Linq provider throws an exception when Linq2Objects throws an exception. Here is an example (I suggest that the reader sketches this and the following class/association structures on a sheet of paper):

    Class CA has a nullable reference called B to class CB; and an int property P.
    Class CB likewise has a nullable reference called C to class CC.
    Class CC has an int property Q.
    In the database, we have a CA object na that references a CB where C is null (i.e. the CB does not reference a CC object).

    We ask

        from a in CA
        where a.B.C.Q > 25
        select a.P;

    Linq2Objects will throw a NullReferenceException when it evaluates a.B.C.Q for a = na.
    How could an SQL Linq provider mimic this exceptional behavior? I think we agree this is not simple:

Either we evaluate a.B.C.Q in-memory to force the NRE - which would of course uselessly load CB objects.
Or we create some code in the SQL statement that somehow throws an SqlException ... which would at least mystify the database's planner, I'd say.

AFAIK, every existing SQL Linq provider does something different: We allow an IQueryable result where Linq2Objects throws an exception. This contradicts (BEHAV-2)! A definition amended for this could be:

(BEHAV-3) ...BEHAV-2...; but if Linq2Objects throws a NullReferenceException, the SQL Linq provider may return an arbitrary IQueryable result.

The problem here is the "arbitrary" result: If you give this SQL Linq provider to programmers, they will come to rely on the result even in these cases - e.g., they will expect that that IQueryably result is always empty. If we could make it work so that the result is reasonably random, this could be avoided (as is the case with SQL SELECTs without ORDER BYs in database programming); but this is IMHO as hard as throwing an exception in the first place. So we give programmers consistent results even in these cases - and they will trust them. So we end up with

(BEHAV-4) ...BEHAV-2...; but if Linq2Objects throws a NullReferenceException, the SQL Linq provider may return an defined and reasonable IQueryable result.

And this is, in my opinion, the end of the story: It means that SQL Linq provider writers have to extend the semantics of Linq. The guidelines for this extension will of course be twofold:

The semantics must be reasonable (understandable; akin to known semantics like e.g. SQL).
It must be possible to implement the semantics efficiently.

I think we agree that the second point drives the definition: We implement the semantics as simple as possible; and then we explain it.

This is exactly what we now have to do for the logical operators and NHib.Linq.

------------------------------------------------------------------------------------------------------------------------

2. Logical operators in SQL Linq providers

"Logical operators" is not a term that is defined in Linq. Yet, we need to define which operators in expressions are interpreted as logical operators in HQL. A practical definition is the following:

|| with return type bool - which is mapped to OR
&& with return type bool - which is mapped to AND
! with return type bool - which is mapped to NOT
?: with return type bool - where a ? b : c is mapped to (a AND b) OR (NOT (a) AND c).

One can create Linq expressions where this mapping is wrong, but I will not talk about other ways of defining logical operators.

I will now define semantics for these operators that hopefully are "reasonable" in the sense von (BEHAV-4) above. In the course of this, I'll give a few examples that highlight problems of possible translations from Linq to SQL; and use these examples to decide on "better" semantics.

------------------------------------------------------------------------------------------------------------------------

3. Possible semantics for the || operator in the NHibernate-Linq provider

One possible semantics is obviously:

    (||-0) || is not supported.

We ignore this ridiculous idea.

Another possible semantics is:

    (||-1) If there is an entity member expressions at depth > 1 in the lhs of the ||, the same member expression must occur in the rhs at depth > 1; and vice versa.

What is an "entity member expression"?
    A member expression with a (result) type that is a mapped entity types. Other member expressions (with simple types or component types) are of no relevance to this discussion, except where explicitly noted.

What's the depth of a member expression?
    A member expression E.M in C# has a lhs expression E and a member M defined in E's type.
    When we parenthesize member expressions, the depth is the number of parentheses to the left of the expression.
    For example,
        - a.P is at depth 1 in (a.P);
        - a.B is at depth 2 in ((a.B).P) and ((a.B).C);
        - a.B is at depth 3 in (((a.B).C).Q); a.B.C is of depth 2 in the same expression.
Entity member expressions of depth > 1 are the ones that require a "Join" of some sort - that's the whole idea of the definition. Essentially, it says: If we have to join an association for the left side of the ||, we also must join it for the right side. This allows us to use only inner joins.

With (||-1), we would support

        from a in CA
        where a.B.C.Q > 25 || a.B.C.R < 15
        select a.P;

because

a.B (which is at depth 3 both in a.B.C.Q and a.B.C.R) occurs on both sides
a.B.C (which is at depth 2 on both sides) occurs on both sides

Likewise, we would support

        from a in CA
        where a.B.C != null && a.B.C.Q > 25 || a.B.C.R < 15
        select a.P;

... check out the depths yourself. However, we would not support

        from a in CA
        where a.B.C == null || a.B.C.Q < 15
        select a.P;

Reason: a.B.C is at depth 2 in a.B.C.Q. But a.B.C is at depth 1 on the left side, hence it does not occur at depth > 1 on the left side!
This is certainly not acceptable, as this is a customary idiom for checking values behind nullable references.

Here is another attempt - this time much less formal :-):

    (||-2) All entity member expressions below a || at depth > 1 are inner joined. The query is evaluated on the join result.

This is, as far as I know, the current NHib.Linq semantics. Alas, it is wrong - it does not conform with non-exceptional Linq2Objects behavior in some cases (see the example for NH-2583 in NHibernate Jira).

Next attempt:

    (||-3) All entity member expressions below a || at depth > 1 are left outer joined. The query is evaluated on the join result.

Here is an example which we should consider:

Class CD points to classes CE and CF via nullable properties E and F.
CE and CF each have a Nullable property P.
We set up a single CD object d0 that points to neither a CE nor a CF (so the tables for CE and CF remain empty); and ask

        from d in CD
        where d.E.P == null || d.F.P == null
        select d;

Obviously, this will throw a NRE if evaluated with Linq2Objects. Under (||-3), it returns d0. Is this "reasonable" as required by (BEHAV-4)? We would get the following effect: When we evaluate

        from d in CD
        where d.E.P == null
        select d;

we would not get d0: There is no ||, hence no outer join, but an inner join. But inner joining CD with CE (which is empty!) yields an empty result set. Likewise,

        from d in CD
        where d.F.P == null
        select d;

is empty. Still, the disjunction of the two queries with || yields d0! I say that this is not reasonable: The result size of an || should be less or equal than the sum of the two separate results. Let's call this common-sense rule "or-sum compatibility".

One remedy could be to always outer join classes mentioned via an entity member expression at depth > 1 anywhere in the query - even if there is no || operator:

    (||-4) All entity member expressions at depth > 1 are always left outer joined. The query is evaluated on the join result.

I do not think that this fundamentally wrong ...

... however, I propose another alternative:

    (||-5) Like (||-3); additionally, for each entity member expression M occurring in an boolean expression C at depth > 1, "M != null" is prepended to C with an &&.
        The "M != null" condition is called an "existence guard".

An example to explain:

        ...
        where ...some condition with a.B.C.Q...
        ...

becomes, under (||-5),

        where a.B != null
           && a.B.C != null
           && ...some condition with a.B.C.Q...

The reason for my proposal is that conceptually, a condition on an object (like a.B.C) should only be evaluated if the object exists - hence the term "existence guard". In the example from (||-3) above, i.e.,

        from d in CD
        where d.E.P == null || d.F.P == null
        select d;

the where condition would be interpreted as

where (d.E != null && d.E.P == null)
           || (d.F != null && d.F.P == null)

Hence, under this semantics, the complete query would not return d0 (as d0 has d.E == null and d.F == null), in agreement with the two queries asking for d.E.P == null and d.F.P == null. Thus, this would also yield an or-sum-consistent ("reasonable") behavior.

So, it's still 1:1 between (||-4) and (||-5). But here is another example:

Class CG points to class CH via a nullable property H.
CH has a Nullable property P.
First, we set up say 3 CHs; and ask

    from h in CH
    where ...condition on h...
    select h;

Let us say that this returns 2 of the 3 CHs.
Now, we set up 5 CGs; and let them point to all 3 CHs. Additionally, we set up another CG (let's call it g0) which does not point to a CH (i.e., g0.H == null). Now we ask

    (from g in CG
     where ...condition from above where h is replaced with g.H...
     select g.H)
    .Distinct();

That is, we find all CGs whose H fulfills the same condition as above; and then retrieve those CHs. Obviously, this set should just be same as the original one (as we made sure that the CGs reach all CHs). g0 should be ignored - after all, there is no CH behind it for which the condition can be evaluated and, hence, be true.

Now let us use the concrete condition

    h.P == null

in this setup. With semantics (||-5), we get exactly the expected result. With semantics (||-4), however, the g0 would be extended by null values; and "g0.H.P == null" would therefore be true; and hence, g0.H - which is null - would be returned. Thus, the second query would yield one more value (namely null) than the first query.
I call this problem the "projection anomaly". It is exhibited by designs that use outer joins but do not employ existence guards.

Added later: From a few tests that Patrick Earl (of NHibernate 3.x) and I have done, it appears that Microsoft's Entity Framework exhibits this behavior, i.e., it has (||-4) semantics.
Yes, this is a boundary case setup. But it shows that "nonchalantly including nulls in condition evaluations" can lead to, in my opinion, "non-reasonable" results. It must be noted that not only simple condition like h.P == null, but also complex conditions with ! and ?? operators (NOT and COALESCE in HQL and on the database) can "unexpectedly" return true for a vector of null values.

Because of the foregoing discussion, I want to argue that semantics (||-5) is the best one currently available.

(||-5) has the consequence that for all queries without || (but also ?: and ! - see below), the resulting HQL and SQL can be formulated with only inner joins.

(||-5) also has the consequence that "null reference violations" are a local phenomenon of the sub-expressions of an ||. In other words, if you have a condition like

        d.E.P == null || d.F.P == null

and d.E is null, then the left side evaluates to false, because it is defined to mean d.E != null && d.E.P == null. Hence, the "null reference violation" in d.E remains "local" to the left side. If now d.F exists (is != null) and d.F.P is null, the whole condition evaluates to true. This behavior cannot be "simulated" with Linq2Objects - there, the evaluation would throw a "global NRE" and hence abandon the evaluation. In this sense, (||-5) is a true semantical extension of the Linq2Objects semantics. As all of the preceding discussion should show, I hope that it is a "reasonable" extension in the sense of (BEHAV-4).

------------------------------------------------------------------------------------------------------------------------

4. Possible semantics for all the logical operator in the NHibernate-Linq provider

The discussion of the || operator has provided us with a semantics that interprets the NHib-Linq condition

    ...some condition with paths...

as roughly

    path1 != null && path2 != null && ... && ...some condition with paths...

where path1, path2, ... are outer joined. What is nice about that (||-5) semantics is that it is actually defined independently of the logical operators! So we can simply use it as a semantics for all logical conditions in NHib Linq. Here are a few examples for the other operators:

&& operator:
The NHib Linq query

        from d in CD
        where d.E.P == null && d.F.P == null
        select d;

is interpreted as equivalent to the Linq2Objects query

        from d in CD
        where (d.E != null && d.E.P == null)
           && (d.F != null && d.F.P == null)
        select d;

It is easy to see that outer joins are actually not necessary for the classes reached via E and F. Still, it is not wrong to use them, as the guards d.E != null and d.F != null will filter out joins where no d.E and d.F is reachable. Using inner joins here is a pure optimization, about which I'll talk below in a design section.

! operator:
The ! operator is actually interesting, because here we get deviations from the standard Linq2Objects semantics. Here is a first example of an NHib Linq query:

        from d in CD
        where !(d.E.P == 1)
        select d;

This is interpreted as equivalent to Linq2Objects' query

        from d in CD
        where !(d.E != null && d.E.P == 1)
        select d;

which is, by a simple de'Morgan, the same as Linq2Objects

        from d in CD
        where d.E == null || d.E.P != 1
        select d;

But if we replace !...== in the NHib Linq query with !=, i.e.,

        from d in CD
        where d.E.P != 1
        select d;

we get as equivalent Linq2Objects

        from d in CD
        where d.E != null && d.E.P != 1
        select d;

Thus, these two NHib Linq queries have two different meanings:

!(d.E.P == 1) means "not guaranteed d.E.P equal to 1, i.e., either it is some other value, or d.E is not even there"
d.E.P != 1 means "the existing d.E.P has some value other than 1; but d.E must be there"

One could call this unreasonable - then one would have to find another semantics. I think we can (or even have to) live with this. Remember that it is always possible to use only NHib Linq queries that are also valid Linq2Objects queries (i.e., not throwing an exception): Then, such extended ("funny") meanings do not occur.

In standard (propositional) logic, one can use de'Morgan's rules to rewrite an || with ! and &&. De'Morgan is actually also valid for shortcut operators, i.e. in Linq2Objects. It seems to me that De'Morgan also holds for the (||-5) semantics, but I'm too lazy to prove this here and now (if you try it: Take care that you do not replace !(... == ...) with (... != ...) - this is not equivalent, as we saw above; but this has nothing to do with de'Morgan).

?: operator:
When we interpret (a ? b : c) as (a && b) || (!(a) && c), then we can apply the semantics rules also to boolean conditions with ?: operators.

The discussion about the "right" (extended) semantics for NHibernate 3.x has just started in http://groups.google.com/group/nhibernate-development.

SWE Musings

Sunday, April 3, 2011

Logical Operators in NHibernate 3.x - Some Contribution of Mine

No comments:

Post a Comment