Saturday, November 24, 2018

Euler's Identity

A little off the track of software but not entirely disconnected...

I've recently taken to mentioning Euler's Identity when we talk in Algorithms class about Euler and the Bridges of Königsberg.
I point out that it includes five of the most important numbers in mathematics: 0, 1, π, i (the square root of -1), and e, the base of natural logarithms (Euler's number); it also involves four of the most important operators: +, =, * and exponentiation.

The only numbers or operators that could be reasonably considered to complete the set would be the number 2 and division.

Have you ever wondered about the definition of π as the ratio of the circumference of a circle to its diameter? Why the diameter? Why not the radius? There are so many situations where we have to talk about 2π, for example the number of radians in a complete circle, or the "reduced" Planck constant (h/2π) as used in Schrödinger's equation).

So, what would be the effect of redefining π as the ratio of the circumference of a circle to its radius? To avoid the most appalling confusion, we would of course have to give it a different symbol. The greek letter tau has been proposed. Employing 𝝉 = 2π, the Euler identity would appear thus:
Now, we would have six numerical quantities and five operators. I have to admit though that it doesn't look quite so elegant this way.

For a more complete discussion of this use of 𝝉, please see Turn (geometry): section Tau Proposals.

OK, back to work!

Wednesday, July 18, 2018

The Sherlock Holmes Guide to Programming, Debugging and Performance Tuning

Back in 2009, I published on this very blog a set of programming "laws" which I modestly called "Hillyard's Laws of Programming." Here they are:
There has even been a fourth law, although that one is a little more nebulous even than the first three.

Recently, I have been working my way through the Sherlock Holmes canon, but listening to audio recordings rather than reading. I know the stories almost by heart but there is nothing like listening to someone else's interpretation to trigger little observations that may have escaped one on previous readings. I have thus realized that Sherlock Holmes has made many pronouncements to Watson, his amanuensis, that show that the art of detection and that of software development have so much in common that Holmes would have been a first-rate programmer if there had been computers in his day.

Of course, we must not forget that Holmes, despite his familiarity, was in fact fictional--the creation of Sir Arthur Conan Doyle. Doyle was a strange man. Despite being scientifically trained, he was nevertheless a believer in all sorts of hocus-pocus. But the statements which he has Holmes make are prescient when considered in the realm of programming.

There are of course many sub-disciplines involved in software engineering, development, coding, whatever you want to call it, including (but not limited to):
  • programming (relating use cases to a particular design);
  • debugging;
  • performance tuning.
Of these, the greatest degree of mystery pertains to debugging and, perhaps to a lesser extent, performance tuning. It is to these activities that most of these statements relate most appropriately.

From The Sign of Four, one of the early novellas, he writes:
"Eliminate all other factors, and the one which remains must be the truth."
Or, similarly, look first at the following statement from the The Adventure of the Beryl Coronet:
"It is an old maxim of mine that when you have excluded the impossible, whatever remains, however improbable, must be the truth."
I alluded to this in my "First Law." If you have positively eliminated a fragment of code from your pool of suspicion, then the problem must be in some other part of the program, even if that seems highly unlikely. I have personally spent hours looking at the same bit of code, trying to find a flaw in it, only to realize later that I was looking in the wrong place! 

It is all too easy to assume that some part of the code (which was "working before" or which has been tested by someone else, etc.) is perfect. Sherlock Holmes puts it well in The Adventure of the Reigate Squires: 
“Now, I make a point of never having any prejudices, and of following docilely wherever fact may lead me, …”
This observation applies manifestly to performance tuning also. It is so easy to make assumptions, such as "if I cache this, then the performance must improve." Never do any such thing without testing the result.

When faced with a plethora of possibly conflicting results, it is important to know which you should trust the most. For example, you cannot trust the order in which buffered I/O occurs. If you need to be sure of the order, then you should use logs or unbuffered I/O. 

Sherlock Holmes summed it up thus, again from the same story:
“It is of the highest importance in the art of detection to be able to recognize, out of a number of facts, which are incidental and which vital. Otherwise your energy and attention must be dissipated instead of being concentrated.”
So, to employ our example above, the order of buffered output is incidental whereas the order in logs is (usually) vital.

My second "law" has to do with the situation you sometimes find yourself in where there are two seemingly independent problems with your code. Let's say you are concentrating on problem A, which is proving challenging but so far intractable, while you are aware of an apparently minor problem B for which you think you have a simple solution. It's tempting to concentrate your efforts on the more interesting problem (A). But you would be well advised to take a slight detour and fix problem B. You never know: that fix might also be the solution to problem A (it's happened to me many times).

Holmes understood this also, as evidenced by this comment from The Adventure of the Musgrave Ritual: 
“‘At least,' said [Holmes], 'it gives us another mystery, and one which is even more interesting than the first. It may be that the solution of the one may prove to be the solution of the other.”
The third "law" relates to the practice of peer programming. I can't count the number of times I've asked someone for help and then, midway through explaining the background of the problem, I've realized my own error. Holmes was aware of this phenomenon too, for he states in The Adventure of the Blue Carbuncle: 
“Not at all. I am glad to have a friend with whom I can discuss my results.”
And, even more explicitly, he discusses it in The Adventure of Silver Blaze:
“At least I have got a grip of the essential facts of the case. I shall enumerate them to you, for nothing clears up a case so much as stating it to another person, and I can hardly expect your co-operation if I do not show you the position from which we start.”
A certain amount of imagination is also extremely helpful when trying to solve a problem. If you imagine a particular scenario, it may follow that the currently mystifying behavior of your code comes to be a natural outcome of your imagined situation. Again from Silver Blaze (incidentally, one of the very best stories):
”See the value of imagination," said Holmes. "It is the one quality which Gregory lacks. We imagined what might have happened, acted upon the supposition, and find ourselves justified. Let us proceed."
Sometimes a clue comes to you not from observed behavior but from expected behavior that you do not observe. Many's the time I have instrumented some method with a log message or unbuffered print statement only to find that I get no output whatsoever. This usually is enough to tell me that, despite my expectations, the method was never actually called. One of the most famous exchanges of Sherlock Holmes covers this point (again from Silver Blaze):
[Inspector Gregory] “Is there any other point to which you would wish to draw my attention?” 
“To the curious incident of the dog in the night time.” 
“The dog did nothing in the night-time.” 
"That was the curious incident,” remarked Sherlock Holmes.
Let us now return to the second passage quoted above, having to do with casting any prior prejudices aside. I would venture to suggest that this is perhaps the most important guideline that Holmes give us: to carefully gather as much of our evidence as possible before forming a theory. He sums this attitude up in the very first of the Sherlock Holmes stories published in the Strand Magazine--A Scandal in Bohemia:
“This is indeed a mystery,” I [Watson] remarked [to Holmes]. “What do you imagine that it means?” 
“I have no data yet. It is a capital mistake to theorise before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”
His basic theme is similar in the following statement from The Sign of Four:


“No, no: I never guess. It is a shocking habit,—destructive to the logical faculty.” 
I hope that these utterances of Sherlock Holmes will help you take the proper course of action when presented with a problem in programming, debugging or performance tuning. Clearly, my remarks are intended to apply to any program language or system, not just Java.

OK, back to work!

Updated with quotations from The Sign of Four, Mar 12 2019

Friday, January 19, 2018

Things that Java got wrong - part 4

This is perhaps rather less serious than some of my other nitpicks regarding Java. This relates to the admonishment by the Java collections classes that they only work correctly if the hashCode and equals methods of an object are consistent.

If that is a requirement--and I think we'd all agree that it is--then why allow it to be otherwise? Require that those two methods are delegated to an inner class that forces them to be consistent.

This is the kind of thing I have in mind...

The first component is a class called Equable which implements both equals and hashCode. There is a constructor which takes an Iterable of objects, which correspond to the fields of a user type. Here's a link to the source on github.

BaseEquable is an abstract class which can be extended by a user type and which defines both equals and hashCode in terms of an abstract method getEquable which returns, for a sub-type, an instance of Equable. Source link.

A user type which extends BaseEquable simply has to implement getEquable something like this (where, in this example, there are two fields: x and y):

@Override
public Equable getEquable() {
Collection<Object> elements = new ArrayList<>();
elements.add(x);
elements.add(y);
return new Equable(elements);
}

Now, because the actual work of equals and hashCode is delegated to methods which enforce consistency, there is no danger of those two methods being inconsistent. I've also demonstrated (in the same repository of github) that it's easy to extend to including a consistent version of compareTo also.

OK, back to work!