Saturday, April 30, 2011

Paradoxes of Test Driven Development cont'd.

3. The dimensionality paradox

After a few days, I'm no longer that happy with this name ... But let's look at a high-level example, and maybe find a better name later.

Assume that you have to write a regular expression recognizer; it only needs to output "match" or "no match" for a regular expression and one or possibly many input strings. You start with simple examples, like a or ab, then a?, ab?, continue maybe with [ab]* and (ab)* etc. The first few examples can be handled with a simple parser and a simple interpreter, but more and more complex regexes will make these machines more and more complex - and less and less performant. At some point, the iteratively developed design will no longer hold up to the combined requirements of efficiency and maintainability ...

If you take a look at modern regex recognizers, they contain a key design component: The regexp is translated into a "program" for a special "processor", which then does the input traversal. The design of this processor is influenced (a) by the problems of regexp recognition (e.g., the necessities of fast loop traversal; and backtracking in the case of non-greedy operators and back-references), but also (b) by the runtime processor of the regexp library - virtual machines in the case of Java and .Net, but maybe also real silicon processors.

This powerful, but also complicated in-between concept (between the regexp language with its semantics; and the recognition result "match" / "no match") is something you have to invent. It cannot be derived from the "mere" input-output behavior of the problem at hand. Rather, you need to step back quite far from the problem to see that this design is a possibility that will solve the problem better than more straightforward designs.
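
To make the "program for a special processor" idea concrete, here is a toy sketch of my own invention (the names Inst, Op, and ToyRegexVm are made up for illustration; a production engine looks nothing like this): the pattern ab* hand-compiled into four instructions for a minimal backtracking machine.

```csharp
using System;

// The pattern "ab*" hand-compiled into a tiny instruction list:
// 0: Char 'a'   1: Split -> 2 or 4   2: Char 'b'   3: Jmp 1   4: Match
var prog = new[] {
    new Inst(Op.Char, 'a'),
    new Inst(Op.Split, X: 2, Y: 4),
    new Inst(Op.Char, 'b'),
    new Inst(Op.Jmp, X: 1),
    new Inst(Op.Match)
};
Console.WriteLine(ToyRegexVm.Run(prog, "abbb")); // True
Console.WriteLine(ToyRegexVm.Run(prog, "ba"));   // False

enum Op { Char, Split, Jmp, Match }

record Inst(Op Op, char C = '\0', int X = 0, int Y = 0);

static class ToyRegexVm {
    // A naive backtracking interpreter for the "regex processor".
    public static bool Run(Inst[] prog, string input, int pc = 0, int sp = 0) {
        switch (prog[pc].Op) {
            case Op.Char: // consume one input character, then continue
                return sp < input.Length && input[sp] == prog[pc].C
                    && Run(prog, input, pc + 1, sp + 1);
            case Op.Split: // try both branches - this is the backtracking
                return Run(prog, input, prog[pc].X, sp)
                    || Run(prog, input, prog[pc].Y, sp);
            case Op.Jmp:
                return Run(prog, input, prog[pc].X, sp);
            default: // Op.Match: succeed only if all input is consumed
                return sp == input.Length;
        }
    }
}
```

Real engines differ in a hundred ways (compilation from a parsed regex, non-recursive execution, greedy vs. non-greedy handling), but the two-stage shape - compile, then run - is the point here.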

Now, what I said is, in a sense, unfair: Test-driven design does not prevent you from gaining more and more understanding during the design and implementation process; and one possibility is that at some point, you rip out much of the previous design and replace it with a new one that handles all the design aspects better. This is a valid test-driven proceeding, but it is certainly no longer that "incremental".

A similar, but certainly much more "theoretical" example of such an important "intermediate invention" is the heap sort algorithm: The data are first put into that strange, not at all sorted heap structure - and can then be pulled out sorted in an efficient way.
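
Here is a minimal sketch of that two-phase idea - the textbook algorithm on a plain int array, not tuned code:

```csharp
using System;

// Heap sort: first pile the data into a max-heap (an arrangement that
// looks nothing like "sorted"), then repeatedly extract the maximum.
var data = new[] { 5, 1, 4, 2, 3 };
HeapSortSketch.Sort(data);
Console.WriteLine(string.Join(", ", data)); // 1, 2, 3, 4, 5

static class HeapSortSketch {
    public static void Sort(int[] a) {
        // Phase 1: build the max-heap in place - the "strange" structure.
        for (int i = a.Length / 2 - 1; i >= 0; i--)
            SiftDown(a, i, a.Length);
        // Phase 2: swap the maximum to the end, shrink, restore the heap.
        for (int end = a.Length - 1; end > 0; end--) {
            (a[0], a[end]) = (a[end], a[0]);
            SiftDown(a, 0, end);
        }
    }

    static void SiftDown(int[] a, int root, int size) {
        while (true) {
            int child = 2 * root + 1;              // left child
            if (child >= size) return;
            if (child + 1 < size && a[child + 1] > a[child]) child++;
            if (a[root] >= a[child]) return;       // heap property holds
            (a[root], a[child]) = (a[child], a[root]);
            root = child;
        }
    }
}
```

The heap itself cannot be derived from the input-output behavior "array in, sorted array out" - it is exactly the kind of in-between concept the regex example is about.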

------

In a sense, all three paradoxes are special cases of the problem of induction: On which basis can we generalize from a few special cases? However, in another sense, I cheated: These are not paradoxes at all! Why? - because engineering is a discipline where we decide how we want something to be designed. Thus, we are not in the situation that we must discover the generalized rules (as e.g. the natural sciences have to do), but we design the generalized rules from a generalized understanding. Yet, for all the good that TDD has done, I think we should not forget this:

Also under TDD, it is admissible - and, even more, desirable - that we create designs that transcend simple case-by-case solutions to problems that we present to ourselves (or that are presented to us) as a "mere set of cases."

Tuesday, April 26, 2011

Paradoxes of Test Driven Development

It is, fortunately, common nowadays to construct code by iterative test-driven design (TDD):
  1. Declare that the code under construction is assumed to have a new behavioral property it did not have up to now.
  2. Write a piece of test code that checks for this behavior.
  3. Run the test to prove that the behavior is still missing ("make it red").
  4. Continue construction so that the behavior emerges.
  5. Run the test again to prove that the behavior is now established ("make it green").
  6. Run all previous tests to prove that the construction did not remove a previously established behavior ("make all green").
Having said this, it is still worthwhile to point out the three paradoxes of iterative TDD:
  • The infinity paradox: A limited number of concrete test cases can never define a behavior sufficiently.
  • The hill-climbing paradox: Local iterations are not guaranteed to reach a globally agreeable (or sufficient) solution.
  • The dimensionality paradox: Separation of concerns along one axis may break concerns (or aspects) that would have been useful to keep together.
Of course, all these paradoxes are easily solved by intelligent engineers (one might call this, in itself, a paradox:
  • The paradox-transgression paradox: That something consistent and useful can emerge from a paradoxical situation.)
Ok - I'm getting too philosophical. Let's look at the three paradoxes above. BTW, I have invented their names on the fly - so if you don't like them, or there are better ones (e.g. in texts I have not read), feel free to substitute your or someone else's names instead!

1. The infinity paradox

This is a well-known feature of TDD. Let's assume you have to implement a function that adds two integral numbers and returns their sum. "Mhm ... sum of numbers ... not an easy topic," you mumble. "Could you give me a very simple example?" "Sure: The sum of zero and zero is zero". "Aha, let's implement this. We do not know any other sum, so we can easily write:"

    public int Sum(int a, int b) {
        return 0;
    }
   
"But I can also tell you that the sum of three and zero is three." "Well, then so be it:"

        if (a == 3) {
            return 3;
        } else {
            return 0;
        }
       
"However, also zero plus three is three." "Fine:"

        if (a == 3 && b == 0) {
            return 3;
        } else if (a == 0 && b == 3) {
            return 3;
        } else {
            return 0;
        }
       
"And five plus six is eleven." "Weeell:"

        if (a == 3 && b == 0) {
            return 3;
        } else if (a == 5 && b == 6) {
            return 11;
        } else if (a == 0 && b == 3) {
            return 3;
        } else {
            return 0;
        }
       
"And four plus seven is also eleven." "Okaaaaayy:"

        if (a == 3 && b == 0) {
            return 3;
        } else if (a == 5 && b == 6) {
            return 11;
        } else if (a == 0 && b == 3) {
            return 3;
        } else if (a == 4 && b == 7) {
            return 11;
        } else {
            return 0;
        }
       
You get the idea.

We see: A finite and manageable number of cases is not sufficient to fully define a rule that should hold for large or even infinite input sets. We need to bring in additional techniques. There are three that I consider most important:
  • Number one is what I call "conceptualization." There is no way around "describing the complexity as it is." A sum is a sum is a sum is a sum: It cannot be simplified to something that is fundamentally weaker, like a list of cases. Only conceptualization can yield the implementation
       return a + b;

(Formalists turn this argument on its head: They claim that the formal - axiomatic, behavioral, ... - specification alone is sufficient to define a behavior. But time has disproven this - we also need examples; and preferably well-chosen examples).

  • Number two is integration (especially integration testing): When a concept is used in a broader context or environment, certain aspects of the concept might become more pronounced. So you write an integration test (or manually execute a system test) that actually uses your sum in the intended environment.
However, isn't this paradoxical again (a sub-paradox?)? After all, I have just argued that the "sum concept" cannot be reduced to a "list of cases." But integration testing, too, is only a "list of cases", albeit in a different, more encompassing domain. So why bother? The simple reason: Usage in a different and relevant domain will surface at least some implicit assumptions not spelt out in the examples or even in the conceptual specification.

For example, when using the summing function above in a scenario where numbers count upwards all the time, the topic of overflow suddenly becomes very interesting. If the same function is used to increment and decrement a number representing the size of a stack, overflow is most probably no problem, but we now have to handle negative numbers, even though we'll expect the result never to be negative. When the value represents a temperature in degrees Celsius or Fahrenheit, we finally also have to accept negative numbers as result; etc.
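
In C#, the difference between these contexts can even be made explicit in the code - a small sketch of the overflow aspect:

```csharp
using System;

// The "same" sum behaves differently depending on context: unchecked
// arithmetic wraps silently, checked arithmetic surfaces the overflow.
int a = int.MaxValue, b = 1;
Console.WriteLine(unchecked(a + b));   // -2147483648: silent wrap-around
try {
    Console.WriteLine(checked(a + b)); // never reached
} catch (OverflowException) {
    Console.WriteLine("overflow!");
}
```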

One capability I admire very much is being able to abstract away from a given context (and thereby e.g. arrive at the concept of "integral number"), while on the other hand always keeping the "real world" - the application - in focus and not "falling over the YAGNI edge."

  • Even though it only creates additional cases (and hence cannot overcome the fundamental infinity paradox), the third important technique is the intelligent selection of cases.
Good testing discipline has all the key ideas and phrases: "boundary cases", "error cases", "coverage". However, testing argues for such techniques on a basis that does not necessarily interest us in TDD: namely "typical programmer errors". In classical testing, boundary cases are selected primarily because it is assumed that programmers tend to make off-by-one errors.
In TDD, there are no programmer errors - "not yet," at least. The focus on boundary cases here is, predominantly, one of clear behavioral definition. Having cases that show that 0 + x is the same as x, and that this also holds for x + 0, underlines the fact that 0 is a neutral element of the summing function - regardless of the probability that a programmer gets it wrong.
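
A sketch of what such behavior-defining cases might look like in NUnit (assuming the conceptualized Sum from above; the fixture and test names are my inventions):

```csharp
using NUnit.Framework;

// These cases are not chosen because programmers typically get them
// wrong, but because together they state: 0 is a neutral element of
// Sum, on both sides.
[TestFixture]
public class SumBoundaryTests {
    static int Sum(int a, int b) => a + b;

    [TestCase(7)]
    [TestCase(0)]
    [TestCase(-7)]
    public void ZeroIsNeutralOnBothSides(int x) {
        Assert.AreEqual(x, Sum(0, x));
        Assert.AreEqual(x, Sum(x, 0));
    }
}
```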

Of course, there are other techniques to get the behavior right - reviews, random testing, formal and informal proofs etc. They will help, but are even more work!

2. The hill-climbing paradox

"Hill-climbing" is a mathematical optimization method for multi-dimensional functions. Essentially, it says: If you want to reach the top of the hill (the "optimum"), always go in the direction of the steepest incline. Sounds obvious, doesn't it? The problem with hill-climbing is that you might reach a "local optimum": A small hill far away from the mountain you actually wanted to conquer. Once you are atop the small hill, naive hill-climbing prevents you from going anywhere else: There simply is no incline any more in any direction.

Naive iterative development can lead you astray in similar ways. Here is a very artificial example: You have to implement a function F(int a, int b) whose result is defined by the following table (which is finite! - so there's no infinity paradox involved):

    0 1 2 3
--+--------
0 | 0 1 2 3
1 | 3 0 1 2
2 | 2 3 0 1
3 | 1 2 3 0

You might start with the first line, i.e., have cases for (0,0), (0,1), (0,2), and (0,3). Including some conceptualization (abstraction; pattern recognition), you end up with something like

    if (a == 0) {
        return b;
    }
   
Including the second line, you arrive at

    if (a == 0) {
        return b;
    } else if (a == 1) {
        return b == 0 ? 3 : b - 1;
    }

etc. If, however, you start with different "dimensions", you might see that the difference b - a alone determines the result (by noting the falling "diagonal stripes" in the matrix), and your cases might be

    b - a = 0 ... this always yields zero.
    b - a > 0 ... this always yields b - a.
    b - a < 0 ... this yields b - a + 4.

And if you take a "very holistic" approach, you might actually see that the function is just (b - a + 4) % 4.
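
The holistic formula can be checked mechanically against the complete table - a quick sketch:

```csharp
using System;

// The "holistic" solution (b - a + 4) % 4, verified against the full
// table from above, cell by cell.
int[,] table = {
    { 0, 1, 2, 3 },
    { 3, 0, 1, 2 },
    { 2, 3, 0, 1 },
    { 1, 2, 3, 0 }
};
static int F(int a, int b) => (b - a + 4) % 4;

bool allMatch = true;
for (int a = 0; a < 4; a++)
    for (int b = 0; b < 4; b++)
        allMatch &= F(a, b) == table[a, b];
Console.WriteLine(allMatch); // True
```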

The moral of that story? Depending on the method you use to plan your next step (up the hill), you might end up with very different solutions. And the simplest method of deciding on the direction - in the matrix example, the "line by line" method - might well lead to a sub-optimal solution (a local optimum).

This little example does not, of course, prove much. Still, doing the same with a large state table (as used in some types of parsers) might actually give you a quite intractable piece of code rather than a manageable one; or vice versa.

Test cases are good. Believing that common sense test case selection will lead you to a good algorithm is not.

Continued at Paradoxes of test driven development, cont'd.

The PickAkin Kata - intermezzo

Ilker sits in the same room as I do. While my Windows had to boot (it didn't let go of a directory I needed to rename), I - yes, it's my fault - started a discussion with him about my posting. Here was his major critique: "Why do you start with an integration test? Even if you do not intend to run it right now, why do you use this complex test as a means to pin down the programming model?" (In this case, the whole programming model consists of an API - there are no installs, no generators, no build processes and the like). My answer was twofold:

First, I made an implicit assumption in that series which I should make explicit: I do not write a cookbook about what I would (or you could) or should do. Rather, I want to draw a few topics out into the open; and for this, I bundle bits and pieces maybe differently than when actually doing it "live".

In a real programming exercise, you might decide on a few pieces of the programming model when writing the first test; a few more in the third test; another part in the sixth or twenty-sixth test, where you also might revise an earlier decision; etc. Even in a group, this starts to get hard: Different people in the group might believe that decisions should be made at other times - so you'd have to make that decision process more explicit, as I tried to do. But for my presentation purposes, distributing those decisions and, more interestingly, my meta-discussion about why I want to void my own assumptions, would become chaotic.

Second, that "programming model issue" is important to me. After all, this is the "user interface" that is given to other programmers. So it is worth an explicit consideration.

But why then write that acceptance test? Wouldn't a simpler test be better? My - maybe at first astonishing - answer is: This is the simplest test to write! Every other test one could conceive - e.g. using only empty collections - would have to include an additional abstraction step away from the information given to me by Ilker. So everyone who'd want to understand that test would have to accept my arguments that "this smaller test is essentially what we need". But for the data we are given, there's no need to talk about this abstraction of "essentially being the same." So, I argue that, after having read the problem description, this is actually the simplest test regarding the programming model.
It is of course not the simplest test regarding the behavior of the code to be constructed - but that's a totally different issue.

We also had an additional debate that centered on the concept of decision-making in general. But I will not go into this - maybe Ilker writes something, maybe I later.

My PickAkin kata ramblings continue with a solution without TDD ...

The PickAkin Kata - part two

First, I reverse myself: This is a great kata! Why? Exactly because you can use it to tie so many aspects of programming to it. Here are three which I did not intend to write about - as you might guess when you read the following: My test code is wrong, or bad, or "sub-optimal" - in three ways.

The first, obvious problem is that I exchanged the expected and actual parameters. This is a problem I have had since I started using JUnit in the last millennium: Customarily, in languages of C heritage, we check for a value as follows:

   if (a != 4) throw ...;

In xUnit, we write

   Assert.AreEqual(4, x);

(unless you like the new .That() style even for such simple assertions). Somewhere deep in my brain, the former seems so deeply ingrained that I need to explicitly remind myself all the time to "do it the other way in NUnit". But with everything we need to remind ourselves of explicitly, we are bound to fail, especially under stress - like publishing the code to a blog ;-). I think we can learn an important fact from this - some might call it trivial, I like to call it fundamental:

Every engineering error we human beings can make will be made sooner or later.

I leave it to you to draw conclusions about all sorts of technologies that have more physical forces at work than that simple PickAkin kata.
   
The second problem with my test code is that it checks for the most important fact last. As long as the test runs green, this is ok. If it runs red when we expect it to run red - before we implement sufficient code to make it green - this is also ok. But it is cumbersome, at least, if we expect it to run green and it runs red - i.e., when we are debugging. Then, "secondary information" like the ceoList being wrong helps less than the "primary information" that the akin list is wrong.

Last, when we are debugging, we need readable error messages. If we check for the correct akin list by

  CollectionAssert.AreEquivalent(new[] { "A1", "B2", "A2", 
      "A3", "B3", "A4", "D1", "D2", "B1" }, akinList);

but get a different result from some computed IEnumerable<>, the error message looks about as follows:

  Expected: equivalent to < "A1", "B2", "A2", "A3", "B3", "A4", "D1", "D2", "B1" >
  But was: <System.Linq.Enumerable+WhereArrayIterator`1[System.String]>


You can produce this result e.g. by trivially implementing Pick.Akin(...) as

    akinList = listAReduced = listBReduced = listA.Where(x => x.Equals(x));
   
Thus, it helps to add .ToList() or .ToArray() to IEnumerable<>s. The trivial code, when checked with

  CollectionAssert.AreEquivalent(new[] { "A1", "B2", "A2",
      "A3", "B3", "A4", "D1", "D2", "B1" }, akinList.ToList());

completes with the much nicer error message

    Expected: equivalent to < "A1", "B2", "A2", "A3", "B3", "A4", "D1", "D2", "B1" >
    But was:  < "A1", "B1", "A2", "A3", "C1", "D1", "E1", "E2" >

So here is the revised integration test - this time already with an [Ignore] tag because it is not yet time to run it. BTW, in my humble opinion, each and every [Ignore] (and [Explicit]) tag is worth a textual explanation why the respective test is only to be run explicitly, or to be ignored altogether, at this time:

  [Test]
  [Ignore("Acceptance test cannot run for not yet production-ready implementation")]
  public void AcceptanceTestUsingIlkersData() {
    IEnumerable<string> ceoList = new[] {
        "A1", "B1", "A2", "A3", "C1", "D1", "E1", "E2" };
    IEnumerable<string> ctrlList = new[] {
        "F1", "D2", "B2", "B3", "A4" };
    IEnumerable<string> ceoListAfter;
    IEnumerable<string> ctrlListAfter;
    IEnumerable<string> akinList;
    Pick.Akin(ceoList, ctrlList, (a, b) => a[0] == b[0],
        out ceoListAfter, out ctrlListAfter, out akinList);

    CollectionAssert.AreEquivalent(new[] {
        "A1", "B2", "A2", "A3", "B3", "A4",
        "D1", "D2", "B1" }, akinList.ToList());
    CollectionAssert.AreEquivalent(new[] {
        "C1", "E1", "E2" }, ceoListAfter.ToList());
    CollectionAssert.AreEquivalent(new[] { "F1" },
        ctrlListAfter.ToList());
  }


So long!

Next part: The PickAkin kata - intermezzo

 

Sunday, April 24, 2011

The PickAkin Kata - part one

My colleague Ilker made a kata posting from a small algorithm question I sent around in our department. He wrote a nice story around it, made it suitably short, and hopefully more than one got bitten by that bug ... ah, well, "found interest in that little algorithmic problem."
I thought a little bit whether I should somehow reply to Ilker's posting - and then I started to get the feeling that that problem (my problem?) was not a good kata, after all. And alone this feeling makes it worthwhile to write about it, I thought.

How would you go about solving this puzzle? Well, I think a standard idea is to write an acceptance test first. Why? Certainly, we would not want to make this test run successfully - it is much too complex for an initial unit test in a TDD culture. No, the reason is that in that step we set the stage for interesting properties of the solution - in this case, the API. Here is my acceptance test:

  [Test]
  public void AcceptanceTestUsingIlkersData() {
    IEnumerable<string> ceoList = new[] {
        "A1", "B1", "A2", "A3", "C1", "D1", "E1", "E2" };
    IEnumerable<string> ctrlList = new[] {
        "F1", "D2", "B2", "B3", "A4" };
    IEnumerable<string> ceoListAfter;
    IEnumerable<string> ctrlListAfter;
    IEnumerable<string> akinList;
    Pick.Akin(ceoList, ctrlList, (a, b) => a[0] == b[0],
        out ceoListAfter, out ctrlListAfter, out akinList);
    CollectionAssert.AreEquivalent(ceoListAfter, new[] {
        "C1", "E1", "E2" });
    CollectionAssert.AreEquivalent(ctrlListAfter, new[] {
        "F1" });
    CollectionAssert.AreEquivalent(akinList, new[] {
        "A1", "B2", "A2", "A3", "B3", "A4", "D1", "D2", "B1" });
  }


... aaah, don't hit me - or at least not that hard! Already at this point, I have made quite a lot of decisions that could have been decided differently:
  1. Implementing it in C# (instead of F#; or SQL)
  2. The API function is called Akin, on a class Pick.
  3. The API function is a static function.
  4. The API function is a generic function - although you do not see this.
  5. The inputs are two sets as well as a "being similar" function.
  6. The result is returned in three out parameters.
  7. The parameters are actually of type IEnumerable, with an additional comment that the order is not important.
These are only the ones I know that I decided explicitly. You probably see a few more that, for me, were "a given", "granted", "not even worth mentioning." I won't even defend my choices - rather, I will write down reasons why these choices are bad. After all, that's one very important job of an engineer:
  • Know the problems of what you are going to introduce.
So here are reasons against my choices:
  1. A set-based language (like SQL ...) would be a better fit - after all, we had to annotate all parameters with "order is not important", which directly contradicts the basic definition of IEnumerable.
  2. The name PickAkin is a classic function name: "do what" - similar to FindMatching, InitLazily etc. Splitting the externally defined name into class and method is strange.
  3. Static functions have problems with testability, with injection, with state (implicit singletons).
  4. Generic functions add additional comprehension complexity; and need additional tests if the type parameter can also assume struct types.
  5. Passing functions as delegates creates problems like "access to modified closure"; a classical abstract function in an abstract class with an override in the concrete application does not have this problem.
  6. Out parameters need to be defined beforehand in the client code. At the minimum, this requires additional names for the result sets. Instead, the result could be returned in a classical way as the return type - probably in a special result type. Or, it could be returned into ICollection parameters by using Add on them.
  7. Using immutable set types would directly capture the intent of the parameters.
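
To make alternative no. 6 concrete, here is a sketch of how a single result type could replace the three out parameters. PickResult, its members, and this signature are my inventions for illustration - not the API actually used in this kata - and the algorithm body is deliberately omitted:

```csharp
using System;
using System.Collections.Generic;

// A result object instead of three out parameters: callers no longer
// need to pre-declare names for the result sets.
public class PickResult<T> {
    public ICollection<T> AReduced { get; } = new List<T>();
    public ICollection<T> BReduced { get; } = new List<T>();
    public ICollection<T> Akin { get; } = new List<T>();
}

public static class Pick {
    public static PickResult<T> Akin<T>(
            IEnumerable<T> listA, IEnumerable<T> listB,
            Func<T, T, bool> akin) {
        var result = new PickResult<T>();
        // ... the algorithm itself is omitted here; only the shape of
        // the API matters for this discussion.
        return result;
    }
}
```

A caller would then write var r = Pick.Akin(ceoList, ctrlList, (a, b) => a[0] == b[0]); and read r.Akin, r.AReduced, and r.BReduced.
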
Now it's your turn: You defend my choices (write down the reasons, as I did above). Why would you do that? Because you probably had different ideas - with their own problems. Defending a different design will highlight these problems - as I did above.

Slow going, isn't it, this kata?

If you do this in a dojo session with 2, 3, 4 people, you might have completely deadlocked at this point - because all these assumptions and decisions have to be brought out. Of course, if you are versed dojo-ers, you probably know how to resolve such discussions. Still, I think that trying out e.g. three different designs alone in this simple case will teach you all (the 2, 3, 4 of you) quite a few different viewpoints. Isn't that the idea of a kata, after all? Doing it over and over to find many "right ways"?

Next part: The PickAkin Kata - part two

Sunday, April 3, 2011

Logical Operators in NHibernate 3.x - Some Contribution of Mine

Just so that this blog doesn't dry up, I (allow myself to cross-)post part of a text I wrote about NHibernate 3.x's handling of logical operators. If you want to tune in to the discussion, you might want to read the group http://groups.google.com/group/nhibernate-development.


1. Considerations about the where semantics for an SQL Linq provider
 
Before we change the behavior of logical operators in NHib.Linq, we should have some agreement about the intended behavior, i.e., a specification. Otherwise, one might argue that there is no problem with them - especially with the || operator! Unfortunately (and interestingly), it appears to me that the answer to this is not as simple as some might think. The following are some general thoughts - bear with me, I'll talk about the logical operators, and especially the || operator, in a moment (namely in sections 2. to 6.).

The apparently correct way to specify an SQL Linq provider's (or any Linq provider's) behavior is:

    (BEHAV-1) If you have a set of objects S and evaluate a query Q using an SQL Linq provider, the result should be the same as when using the standard in-memory Linq provider Linq2Objects.

Obviously, this specification fails for methods that cannot be transformed to the SQL Linq provider's underlying technology (think virtual object methods, for example). So a more practical definition is:

    (BEHAV-2) ...BEHAV-1..., if the SQL Linq provider accepts Q, i.e., does not throw some exception saying "I do not support this and that".

But it gets more tricky: What are the possible results of Linq2Objects when evaluating Q? Essentially
(a) it can return an IQueryable; or
(b) it can throw an exception.

(BEHAV-2) would require that the SQL Linq provider throws an exception when Linq2Objects throws an exception. Here is an example (I suggest that the reader sketches this and the following class/association structures on a sheet of paper):

    Class CA has a nullable reference called B to class CB; and an int property P.
    Class CB likewise has a nullable reference called C to class CC.
    Class CC has an int property Q.
    In the database, we have a CA object na that references a CB where C is null (i.e. the CB does not reference a CC object).

    We ask
  
        from a in CA
        where a.B.C.Q > 25
        select a.P;

    Linq2Objects will throw a NullReferenceException when it evaluates a.B.C.Q for a = na.
    How could an SQL Linq provider mimic this exceptional behavior? I think we agree this is not simple:
  • Either we evaluate a.B.C.Q in-memory to force the NRE - which would of course uselessly load CB objects.
  • Or we create some code in the SQL statement that somehow throws an SqlException ... which would at least mystify the database's planner, I'd say.
AFAIK, every existing SQL Linq provider does something different: We allow an IQueryable result where Linq2Objects throws an exception. This contradicts (BEHAV-2)! A definition amended for this could be:

    (BEHAV-3) ...BEHAV-2...; but if Linq2Objects throws a NullReferenceException, the SQL Linq provider may return an arbitrary IQueryable result.

The problem here is the "arbitrary" result: If you give this SQL Linq provider to programmers, they will come to rely on the result even in these cases - e.g., they will expect that that IQueryable result is always empty. If we could make the result reasonably random, this could be avoided (as is the case with SQL SELECTs without ORDER BY in database programming); but this is IMHO as hard as throwing an exception in the first place. So, in practice, we give programmers consistent results even in these cases - and they will trust them. Thus, we end up with

    (BEHAV-4) ...BEHAV-2...; but if Linq2Objects throws a NullReferenceException, the SQL Linq provider may return a defined and reasonable IQueryable result.

And this is, in my opinion, the end of the story: It means that SQL Linq provider writers have to extend the semantics of Linq. The guidelines for this extension will of course be twofold:
  • The semantics must be reasonable (understandable; akin to known semantics like e.g. SQL).
  • It must be possible to implement the semantics efficiently.
I think we agree that the second point drives the definition: We implement the semantics as simply as possible; and then we explain it.

This is exactly what we now have to do for the logical operators and NHib.Linq.

------------------------------------------------------------------------------------------------------------------------

2. Logical operators in SQL Linq providers
 
"Logical operators" is not a term that is defined in Linq. Yet, we need to define which operators in expressions are interpreted as logical operators in HQL. A practical definition is the following:
  • || with return type bool - which is mapped to OR
  • && with return type bool - which is mapped to AND
  • ! with return type bool - which is mapped to NOT
  • ?: with return type bool - where a ? b : c is mapped to (a AND b) OR (NOT (a) AND c).
One can create Linq expressions where this mapping is wrong, but I will not talk about other ways of defining logical operators.
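
The ?: mapping rests on a two-valued-logic identity, which can be checked exhaustively (note that SQL's three-valued logic with NULLs is a separate question, beyond this quick check):

```csharp
using System;

// Sanity check: a ? b : c equals (a && b) || (!a && c) for all eight
// boolean combinations - the identity behind mapping ?: to
// (a AND b) OR (NOT (a) AND c).
bool ok = true;
foreach (bool a in new[] { false, true })
foreach (bool b in new[] { false, true })
foreach (bool c in new[] { false, true })
    ok &= (a ? b : c) == ((a && b) || (!a && c));
Console.WriteLine(ok); // True
```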

I will now define semantics for these operators that hopefully are "reasonable" in the sense of (BEHAV-4) above. In the course of this, I'll give a few examples that highlight problems of possible translations from Linq to SQL; and use these examples to decide on "better" semantics.

------------------------------------------------------------------------------------------------------------------------

3. Possible semantics for the || operator in the NHibernate-Linq provider
 
One possible semantics is obviously:

    (||-0) || is not supported.
  
We ignore this ridiculous idea.

Another possible semantics is:

    (||-1) If there is an entity member expression at depth > 1 in the lhs of the ||, the same member expression must occur in the rhs at depth > 1; and vice versa.

What is an "entity member expression"?
    A member expression with a (result) type that is a mapped entity type. Other member expressions (with simple types or component types) are of no relevance to this discussion, except where explicitly noted.

What's the depth of a member expression?
    A member expression E.M in C# has a lhs expression E and a member M defined in E's type.
    When we parenthesize member expressions, the depth is the number of parentheses to the left of the expression.
    For example,
        - a.P is at depth 1 in (a.P);
        - a.B is at depth 2 in ((a.B).P) and ((a.B).C);
        - a.B is at depth 3 in (((a.B).C).Q); a.B.C is at depth 2 in the same expression.
Entity member expressions of depth > 1 are the ones that require a "Join" of some sort - that's the whole idea of the definition. Essentially, it says: If we have to join an association for the left side of the ||, we also must join it for the right side. This allows us to use only inner joins.

With (||-1), we would support

        from a in CA
        where a.B.C.Q > 25 || a.B.C.R < 15
        select a.P;

because
  • a.B (which is at depth 3 both in a.B.C.Q and a.B.C.R) occurs on both sides
  • a.B.C (which is at depth 2 on both sides) occurs on both sides
Likewise, we would support

        from a in CA
        where a.B.C != null && a.B.C.Q > 25 || a.B.C.R < 15
        select a.P;

... check out the depths yourself. However, we would not support

        from a in CA
        where a.B.C == null || a.B.C.Q < 15
        select a.P;

Reason: a.B.C is at depth 2 in a.B.C.Q. But a.B.C is at depth 1 on the left side, hence it does not occur at depth > 1 on the left side!
This is certainly not acceptable, as this is a customary idiom for checking values behind nullable references.

Here is another attempt - this time much less formal :-):

    (||-2) All entity member expressions below a || at depth > 1 are inner joined. The query is evaluated on the join result.

This is, as far as I know, the current NHib.Linq semantics. Alas, it is wrong: it does not conform to the non-exceptional Linq2Objects behavior in some cases (see the example for NH-2583 in the NHibernate Jira).

Next attempt:

    (||-3) All entity member expressions below a || at depth > 1 are left outer joined. The query is evaluated on the join result.

Here is an example which we should consider:

Class CD points to classes CE and CF via nullable properties E and F.
CE and CF each have a Nullable property P.
We set up a single CD object d0 that points to neither a CE nor a CF (so the tables for CE and CF remain empty); and ask

        from d in CD
        where d.E.P == null || d.F.P == null
        select d;

Obviously, this will throw an NRE if evaluated with Linq2Objects. Under (||-3), it returns d0. Is this "reasonable" as required by (BEHAV-4)? We would get the following effect: When we evaluate

        from d in CD
        where d.E.P == null
        select d;

we would not get d0: There is no ||, hence no outer join, but an inner join. But inner joining CD with CE (which is empty!) yields an empty result set. Likewise,

        from d in CD
        where d.F.P == null
        select d;

is empty. Still, the disjunction of the two queries with || yields d0! I say that this is not reasonable: The result size of an || should be less than or equal to the sum of the two separate results. Let's call this common-sense rule "or-sum compatibility".
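The violation of or-sum compatibility under (||-3) can be made concrete with a small Python model (the dict-based rows and the helper functions are mine, just to mimic inner vs. left outer joins over the CD/CE/CF setup):

```python
# d0 points to neither a CE nor an CF; the CE and CF "tables" are empty.
d0 = {"E": None, "F": None}
rows = [d0]

def e_p(d):  # d.E.P with null extension: a missing entity reads as None
    return None if d["E"] is None else d["E"]["P"]

def f_p(d):  # d.F.P, likewise
    return None if d["F"] is None else d["F"]["P"]

# Without ||, (||-3) uses inner joins: rows with a missing entity are dropped.
q1 = [d for d in rows if d["E"] is not None and e_p(d) is None]
q2 = [d for d in rows if d["F"] is not None and f_p(d) is None]

# With ||, (||-3) uses left outer joins: d0 survives, null-extended.
q_or = [d for d in rows if e_p(d) is None or f_p(d) is None]

assert q1 == [] and q2 == []  # each disjunct alone returns nothing
assert q_or == [d0]           # but the || query returns d0: |q_or| > |q1| + |q2|
```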

One remedy could be to always outer join classes mentioned via an entity member expression at depth > 1 anywhere in the query - even if there is no || operator:

    (||-4) All entity member expressions at depth > 1 are always left outer joined. The query is evaluated on the join result.

I do not think that this is fundamentally wrong ...

... however, I propose another alternative:

    (||-5) Like (||-3); additionally, for each entity member expression M occurring in a boolean expression C at depth > 1, "M != null" is prepended to C with an &&.
        The "M != null" condition is called an "existence guard".

An example to explain:

        ...
        where ...some condition with a.B.C.Q...
        ...

becomes, under (||-5),

        where a.B != null
           && a.B.C != null
           && ...some condition with a.B.C.Q...

The reason for my proposal is that conceptually, a condition on an object (like a.B.C) should only be evaluated if the object exists - hence the term "existence guard". In the example from (||-3) above, i.e.,

        from d in CD
        where d.E.P == null || d.F.P == null
        select d;

the where condition would be interpreted as

        where (d.E != null && d.E.P == null)
           || (d.F != null && d.F.P == null)
Hence, under this semantics, the complete query would not return d0 (as d0 has d.E == null and d.F == null), in agreement with the two queries asking for d.E.P == null and d.F.P == null. Thus, this would also yield an or-sum-consistent ("reasonable") behavior.
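A quick Python model of the guarded condition (again with my dict-based stand-in for the entities) confirms that d0 is filtered out under (||-5), restoring or-sum compatibility:

```python
d0 = {"E": None, "F": None}  # points to neither a CE nor a CF

def guarded(d):
    # (||-5): each disjunct carries its existence guard
    return (d["E"] is not None and d["E"]["P"] is None) \
        or (d["F"] is not None and d["F"]["P"] is None)

# Both guards fail on d0, so the whole || is false: d0 is not returned,
# in agreement with the two single queries.
assert guarded(d0) is False
```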

So, it's still 1:1 between (||-4) and (||-5). But here is another example:

Class CG points to class CH via a nullable property H.
CH has a Nullable property P.
First, we set up say 3 CHs; and ask

    from h in CH
    where ...condition on h...
    select h;
  
Let us say that this returns 2 of the 3 CHs.
Now, we set up 5 CGs; and let them point to all 3 CHs. Additionally, we set up another CG (let's call it g0) which does not point to a CH (i.e., g0.H == null). Now we ask

    (from g in CG
     where ...condition from above where h is replaced with g.H...
     select g.H)
    .Distinct();
  
That is, we find all CGs whose H fulfills the same condition as above; and then retrieve those CHs. Obviously, this set should just be the same as the original one (as we made sure that the CGs reach all CHs). g0 should be ignored - after all, there is no CH behind it for which the condition can be evaluated and, hence, be true.

Now let us use the concrete condition

    h.P == null
  
in this setup. With semantics (||-5), we get exactly the expected result. With semantics (||-4), however, g0 would be extended by null values; "g0.H.P == null" would therefore be true; and hence, g0.H - which is null - would be returned. Thus, the second query would yield one more value (namely null) than the first query.

I call this problem the "projection anomaly". It is exhibited by designs that use outer joins but do not employ existence guards.

Added later: From a few tests that Patrick Earl (of the NHibernate 3.x team) and I have done, it appears that Microsoft's Entity Framework exhibits this behavior, i.e., it has (||-4) semantics.
Yes, this is a boundary case setup. But it shows that "nonchalantly including nulls in condition evaluations" can lead to, in my opinion, "non-reasonable" results. It must be noted that not only simple conditions like h.P == null, but also complex conditions with ! and ?? operators (NOT and COALESCE in HQL and on the database) can "unexpectedly" return true for a vector of null values.
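The projection anomaly can be replayed in the same Python model (the CG/CH data below is my hypothetical instantiation of the setup described above):

```python
# 3 CHs; the concrete condition is h.P == null, which 2 of them satisfy.
ch1, ch2, ch3 = {"P": None}, {"P": None}, {"P": 7}
chs = [ch1, ch2, ch3]

# 5 CGs reaching all 3 CHs, plus g0 with no CH behind it.
g0 = {"H": None}
cgs = [{"H": h} for h in (ch1, ch2, ch3, ch1, ch2)] + [g0]

base = {id(h) for h in chs if h["P"] is None}  # the first query: ch1, ch2

# (||-4): outer join, no guard -> g0.H.P reads as null, the condition
# holds, and the projection emits g0.H, i.e. null.
proj_4 = {id(g["H"]) if g["H"] is not None else None
          for g in cgs
          if (g["H"]["P"] if g["H"] is not None else None) is None}

# (||-5): the existence guard g.H != null filters g0 out first.
proj_5 = {id(g["H"]) for g in cgs if g["H"] is not None and g["H"]["P"] is None}

assert proj_5 == base           # (||-5) reproduces the original result set
assert proj_4 == base | {None}  # (||-4) adds a spurious null: the projection anomaly
```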

Because of the foregoing discussion, I want to argue that semantics (||-5) is the best one currently available.

(||-5) has the consequence that for all queries without || (but also ?: and ! - see below), the resulting HQL and SQL can be formulated with only inner joins.

(||-5) also has the consequence that "null reference violations" are a local phenomenon of the sub-expressions of an ||. In other words, if you have a condition like

        d.E.P == null || d.F.P == null

and d.E is null, then the left side evaluates to false, because it is defined to mean d.E != null && d.E.P == null. Hence, the "null reference violation" in d.E remains "local" to the left side. If now d.F exists (is != null) and d.F.P is null, the whole condition evaluates to true. This behavior cannot be "simulated" with Linq2Objects - there, the evaluation would throw a "global NRE" and hence abandon the evaluation. In this sense, (||-5) is a true semantical extension of the Linq2Objects semantics. As all of the preceding discussion should show, I hope that it is a "reasonable" extension in the sense of (BEHAV-4).
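This "local vs. global" difference can be demonstrated directly; in the Python model below (my stand-in, with a TypeError playing the role of the NRE), strict Linq2Objects-style evaluation aborts, while the (||-5) guard confines the violation to the left disjunct:

```python
d = {"E": None, "F": {"P": None}}  # d.E is null; d.F exists with P == null

def strict(d):
    # Linq2Objects-style evaluation: dereferencing null throws
    return d["E"]["P"] is None or d["F"]["P"] is None

def guarded(d):
    # (||-5): the guard turns the failing left disjunct into false
    return (d["E"] is not None and d["E"]["P"] is None) \
        or (d["F"] is not None and d["F"]["P"] is None)

try:
    strict(d)
    raised = False
except TypeError:  # None["P"] - the model's "global NRE"
    raised = True

assert raised               # Linq2Objects abandons the whole evaluation
assert guarded(d) is True   # (||-5): the violation stays local; the rhs decides
```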

------------------------------------------------------------------------------------------------------------------------

4. Possible semantics for all the logical operators in the NHibernate-Linq provider
 
The discussion of the || operator has provided us with a semantics that interprets the NHib-Linq condition

    ...some condition with paths...
  
as roughly

    path1 != null && path2 != null && ... && ...some condition with paths...

where path1, path2, ... are outer joined. What is nice about the (||-5) semantics is that it is actually defined independently of the logical operators! So we can simply use it as a semantics for all logical conditions in NHib Linq. Here are a few examples for the other operators:

&& operator:
The NHib Linq query

        from d in CD
        where d.E.P == null && d.F.P == null
        select d;

is interpreted as equivalent to the Linq2Objects query

        from d in CD
        where (d.E != null && d.E.P == null) 
           && (d.F != null && d.F.P == null)
        select d;

It is easy to see that outer joins are actually not necessary for the classes reached via E and F. Still, it is not wrong to use them, as the guards d.E != null and d.F != null will filter out joined rows where no d.E or d.F is reachable. Using inner joins here is a pure optimization, about which I'll talk below in a design section.
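That the guards make the outer join agree with an inner join for && can be checked in the same Python model (data invented for illustration):

```python
d0 = {"E": None, "F": None}                   # missing E and F
d1 = {"E": {"P": None}, "F": {"P": None}}     # both present, both P == null
rows = [d0, d1]

def guarded_and(d):
    # (||-5) reading of: d.E.P == null && d.F.P == null
    return (d["E"] is not None and d["E"]["P"] is None) \
       and (d["F"] is not None and d["F"]["P"] is None)

# Inner join: drop rows with a missing E or F, then evaluate the raw condition.
inner = [d for d in rows
         if d["E"] is not None and d["F"] is not None
         and d["E"]["P"] is None and d["F"]["P"] is None]

# Outer join with guards: keep all rows, let the guards do the filtering.
outer = [d for d in rows if guarded_and(d)]

assert inner == outer == [d1]  # for &&, the two strategies coincide
```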

! operator:
The ! operator is actually interesting, because here we get deviations from the standard Linq2Objects semantics. Here is a first example of an NHib Linq query:

        from d in CD
        where !(d.E.P == 1)
        select d;

This is interpreted as equivalent to Linq2Objects' query

        from d in CD
        where !(d.E != null && d.E.P == 1)
        select d;

which is, by a simple de'Morgan, the same as the Linq2Objects query

        from d in CD
        where d.E == null || d.E.P != 1
        select d;

But if we replace !...== in the NHib Linq query with !=, i.e.,

        from d in CD
        where d.E.P != 1
        select d;

we get the equivalent Linq2Objects query

        from d in CD
        where d.E != null && d.E.P != 1
        select d;

Thus, these two NHib Linq queries have two different meanings:
  • !(d.E.P == 1) means "not guaranteed d.E.P equal to 1, i.e., either it is some other value, or d.E is not even there"
  • d.E.P != 1 means "the existing d.E.P has some value other than 1; but d.E must be there"
One could call this unreasonable - then one would have to find another semantics. I think we can (or even have to) live with this. Remember that it is always possible to use only NHib Linq queries that are also valid Linq2Objects queries (i.e., not throwing an exception): Then, such extended ("funny") meanings do not occur.
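The different meanings of the two queries can again be checked in the Python model (my encoding of the two (||-5) interpretations derived above):

```python
d_missing = {"E": None}       # d.E is null
d_other   = {"E": {"P": 2}}   # d.E exists, P has some value other than 1

def not_eq(d):
    # NHib Linq !(d.E.P == 1) under (||-5): !(d.E != null && d.E.P == 1)
    return not (d["E"] is not None and d["E"]["P"] == 1)

def neq(d):
    # NHib Linq d.E.P != 1 under (||-5): d.E != null && d.E.P != 1
    return d["E"] is not None and d["E"]["P"] != 1

assert not_eq(d_missing) is True   # "not guaranteed equal to 1": d.E may be absent
assert neq(d_missing) is False     # "!=" requires d.E to exist
assert not_eq(d_other) is True and neq(d_other) is True  # both agree when d.E exists
```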

In standard (propositional) logic, one can use de'Morgan's rules to rewrite an || with ! and &&. De'Morgan is actually also valid for short-circuit operators, i.e., in Linq2Objects. It seems to me that de'Morgan also holds for the (||-5) semantics, but I'm too lazy to prove this here and now (if you try it: Take care that you do not replace !(... == ...) with (... != ...) - this is not equivalent, as we saw above; but this has nothing to do with de'Morgan).

?: operator:
When we interpret (a ? b : c) as (a && b) || (!(a) && c), then we can apply the semantics rules also to boolean conditions with ?: operators.
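For boolean operands, the rewrite of ?: into || and && is an exhaustive-check away; a quick Python truth-table sweep confirms it:

```python
from itertools import product

def ternary(a, b, c):
    # the C# conditional operator (a ? b : c) for booleans
    return b if a else c

def rewritten(a, b, c):
    # the rewrite used above: (a && b) || (!(a) && c)
    return (a and b) or ((not a) and c)

# Exhaustive check over all 8 boolean combinations.
for a, b, c in product([False, True], repeat=3):
    assert ternary(a, b, c) == rewritten(a, b, c)
```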

The discussion about the "right" (extended) semantics for NHibernate 3.x has just started in http://groups.google.com/group/nhibernate-development.