Saturday, November 24, 2018

Euler's Identity

A little off the track of software but not entirely disconnected...

I've recently taken to mentioning Euler's Identity when we talk in Algorithms class about Euler and the Bridges of Königsberg.
I point out that it includes five of the most important numbers in mathematics: 0, 1, π, i (the square root of -1), and e, the base of natural logarithms (Euler's number); it also involves four of the most important operators: +, =, * and exponentiation.

The only numbers or operators that could be reasonably considered to complete the set would be the number 2 and division.

Have you ever wondered about the definition of π as the ratio of the circumference of a circle to its diameter? Why the diameter? Why not the radius? There are so many situations where we have to talk about 2π, for example the number of radians in a complete circle, or the "reduced" Planck constant (h/2π) as used in Schrödinger's equation).

So, what would be the effect of redefining π as the ratio of the circumference of a circle to its radius? To avoid the most appalling confusion, we would of course have to give it a different symbol. The greek letter tau has been proposed. Employing 𝝉 = 2π, the Euler identity would appear thus:
Now, we would have six numerical quantities and five operators. I have to admit though that it doesn't look quite so elegant this way.

For a more complete discussion of this use of 𝝉, please see Turn (geometry): section Tau Proposals.

OK, back to work!

Wednesday, July 18, 2018

The Sherlock Holmes Guide to Programming, Debugging and Performance Tuning

Back in 2009, I published on this very blog a set of programming "laws" which I modestly called "Hillyard's Laws of Programming." Here they are:


There has even been a fourth law, although that one is a little more nebulous even than the first three.

Recently, I have been working my way through the Sherlock Holmes canon, but listening to audio recordings rather than reading. I know the stories almost by heart but there is nothing like listening to someone else's interpretation to trigger little observations that may have escaped one on previous readings. I have thus realized that Sherlock Holmes has made many pronouncements to Watson, his amanuensis, that show that the art of detection and that of software development have so much in common that Holmes would have been a first-rate programmer if there had been computers in his day.

Of course, we must not forget that Holmes, despite his familiarity, was in fact fictional--the creation of Sir Arthur Conan Doyle. Doyle was a strange man. Despite being scientifically trained, he was nevertheless a believer in all sorts of hocus-pocus. But the statements which he has Holmes make are prescient when considered in the realm of programming.

There are of course many sub-disciplines involved in software engineering, development, coding, whatever you want to call it, including (but not limited to):

  • programming (relating use cases to a particular design);
  • debugging;
  • performance tuning.
Of these, the greatest degree of mystery pertains to debugging and, perhaps to a lesser extent, performance tuning. It is to these activities that most of these statements relate most appropriately.

Let us look first at the following statement from the The Adventure of the Beryl Coronet:
"It is an old maxim of mine that when you have excluded the impossible, whatever remains, however improbable, must be the truth."
I alluded to this in my "First Law." If you have positively eliminated a fragment of code from your pool of suspicion, then the problem must be in some other part of the program, even if that seems highly unlikely. I have personally spent hours looking at the same bit of code, trying to find a flaw in it, only to realize later that I was looking in the wrong place! 

It is all too easy to assume that some part of the code (which was "working before" or which has been tested by someone else, etc.) is perfect. Sherlock Holmes puts it well in The Adventure of the Reigate Squires: 
“Now, I make a point of never having any prejudices, and of following docilely wherever fact may lead me, …”
This observation applies manifestly to performance tuning also. It is so easy to make assumptions, such as "if I cache this, then the performance must improve." Never do any such thing without testing the result.

When faced with a plethora of possibly conflicting results, it is important to know which you should trust the most. For example, you cannot trust the order in which buffered I/O occurs. If you need to be sure of the order, then you should use logs or unbuffered I/O. 

Sherlock Holmes summed it up thus, again from the same story:
“It is of the highest importance in the art of detection to be able to recognize, out of a number of facts, which are incidental and which vital. Otherwise your energy and attention must be dissipated instead of being concentrated.”
So, to employ our example above, the order of buffered output is incidental whereas the order in logs is (usually) vital.

My second "law" has to do with the situation you sometimes find yourself in where there are two seemingly independent problems with your code. Let's say you are concentrating on problem A, which is proving challenging but so far intractable, while you are aware of an apparently minor problem B for which you think you have a simple solution. It's tempting to concentrate your efforts on the more interesting problem (A). But you would be well advised to take a slight detour and fix problem B. You never know: that fix might also be the solution to problem A (it's happened to me many times).

Holmes understood this also, as evidenced by this comment from The Adventure of the Musgrave Ritual: 
“‘At least,' said [Holmes], 'it gives us another mystery, and one which is even more interesting than the first. It may be that the solution of the one may prove to be the solution of the other.”
The third "law" relates to the practice of peer programming. I can't count the number of times I've asked someone for help and then, midway through explaining the background of the problem, I've realized my own error. Holmes was aware of this phenomenon too, for he states in The Adventure of the Blue Carbuncle: 
“Not at all. I am glad to have a friend with whom I can discuss my results.”
And, even more explicitly, he discusses it in The Adventure of Silver Blaze:
“At least I have got a grip of the essential facts of the case. I shall enumerate them to you, for nothing clears up a case so much as stating it to another person, and I can hardly expect your co-operation if I do not show you the position from which we start.”
A certain amount of imagination is also extremely helpful when trying to solve a problem. If you imagine a particular scenario, it may follow that the currently mystifying behavior of your code comes to be a natural outcome of your imagined situation. Again from Silver Blaze (incidentally, one of the very best stories):
”See the value of imagination," said Holmes. "It is the one quality which Gregory lacks. We imagined what might have happened, acted upon the supposition, and find ourselves justified. Let us proceed."
Sometimes a clue comes to you not from observed behavior but from expected behavior that you do not observe. Many's the time I have instrumented some method with a log message or unbuffered print statement only to find that I get no output whatsoever. This usually is enough to tell me that, despite my expectations, the method was never actually called. One of the most famous exchanges of Sherlock Holmes covers this point (again from Silver Blaze):
[Inspector Gregory] “Is there any other point to which you would wish to draw my attention?” 
“To the curious incident of the dog in the night time.” 
“The dog did nothing in the night-time.” 
"That was the curious incident,” remarked Sherlock Holmes.
Let us now return to the second passage quoted above, having to do with casting any prior prejudices aside. I would venture to suggest that this is perhaps the most important guideline that Holmes give us: to carefully gather as much of our evidence as possible before forming a theory. He sums this attitude up in the very first of the Sherlock Holmes stories published in the Strand Magazine--A Scandal in Bohemia:
“This is indeed a mystery,” I [Watson] remarked [to Holmes]. “What do you imagine that it means?” 
“I have no data yet. It is a capital mistake to theorise before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”
I hope that these utterances of Sherlock Holmes will help you take the proper course of action when presented with a problem in programming, debugging or performance tuning. Clearly, my remarks are intended to apply to any program language or system, not just Java.

OK, back to work!

Friday, January 19, 2018

Things that Java got wrong - part 4

This is perhaps rather less serious than some of my other nitpicks regarding Java. This relates to the admonishment by the Java collections classes that they only work correctly if the hashCode and equals methods of an object are consistent.

If that is a requirement--and I think we'd all agree that it is--then why allow it to be otherwise? Require that those two methods are delegated to an inner class that forces them to be consistent.

This is the kind of thing I have in mind...

The first component is a class called Equable which implements both equals and hashCode. There is a constructor which takes an Iterable of objects, which correspond to the fields of a user type. Here's a link to the source on github.

BaseEquable is an abstract class which can be extended by a user type and which defines both equals and hashCode in terms of an abstract method getEquable which returns, for a sub-type, an instance of Equable. Source link.

A user type which extends BaseEquable simply has to implement getEquable something like this (where, in this example, there are two fields: x and y):

@Override
public Equable getEquable() {
Collection<Object> elements = new ArrayList<>();
elements.add(x);
elements.add(y);
return new Equable(elements);
}

Now, because the actual work of equals and hashCode is delegated to methods which enforce consistency, there is no danger of those two methods being inconsistent. I've also demonstrated (in the same repository of github) that it's easy to extend to including a consistent version of compareTo also.

OK, back to work!

Wednesday, February 11, 2015

Things that Java got wrong, part 3

Last time in this series (Things that Java got wrong, part 2), I talked about interfaces and the lack of an ability to include default method bodies. Happily, this major oversight has been fixed in Java 1.8.

But now I want to rail against another aspect of interfaces and abstract types (or rather their usage) that I think is by far the worst thing that the Java designers ever messed up. The "Number" type.

First, let's look at three simple reasons why java.lang.Number is so bad.
  • it should be an interface -- or rather several interfaces -- but instead it is an abstract type;
  • if it is going to be an abstract type, then at least let's have it implement Comparable<Number> --- but it doesn't; all of its sub-classes, say X, implement Comparable<X> but that's not the same thing at all: if you need a generic type that implements Number and Comparable, it can't be done without creating your own type!
  • it doesn't even have a method to let you find out if the type is integral or real (forget about complex) -- you have a Number, you can check if it implements Integer, Long, etc. but if somebody creates a new sub-class of Number that happens to be integral, you won't catch it.
I ran into all of these problems recently while working on a new open-source framework for dealing with fuzzy objects (i.e. objects with uncertainty) and they caused me some big headaches. You can find the project at FuzzyJ.

Let's think about how we would go about defining an interface (or interfaces) to represent numbers. Sounds pretty straightforward, right? But it isn't quite that simple. There's a world of difference between the integers, where the successor or predecessor operators make perfect sense, and the real numbers where those operators don't make much sense -- while operators such as round are useful. And then there are complex numbers, rational numbers, irrational numbers, etc. etc. In other words, different types of numbers require different methods. In fact, to put it another way by inverting the question, the operators essentially define the number classes. Is there a fundamental set of operators that would apply to all numbers? There really isn't. But a reasonable set that works with most types of number is this: addition, multiplication, negation, perhaps some others, including compare with.

But already we run into problems. If the set of numbers you're modeling is the positive integers, then negation makes no sense.

So, let's start out with something like this:

public interface Numeric extends Comparable<numeric> {
 Numeric add(Numeric other);
 Numeric multiply(Numeric other);
}
This will work for the positive integers and most other classes. If we want to extend the class to all integers, then we can define the following:
public interface Integral extends Numeric {
 Numeric negate(Numeric other);
}

So far so good. We can now define an IntegralBase class based on the int primitive:

public class IntegralBase implements Integral {

 private int value;

 public IntegralBase(int value) {
  super();
  this.value = value;
 }

 @Override
 public Numeric add(Numeric other) {
  if (other instanceof IntegralBase)
   return new IntegralBase(this.value+((IntegralBase) other).value);
  throw new RuntimeException("cannot add non-IntegralBase object");
 }

 @Override
 public Numeric multiply(Numeric other) {
  if (other instanceof IntegralBase)
   return new IntegralBase(this.value*((IntegralBase) other).value);
  throw new RuntimeException("cannot multiply non-IntegralBase object");
 }

 @Override
 public int compareTo(Numeric other) {
  if (other instanceof IntegralBase)
   return Integer.compare(this.value, ((IntegralBase) other).value);
  throw new RuntimeException("cannot add non-IntegralBase object");
 }

 @Override
 public Integral negate(Integral other) {
  return new IntegralBase(-this.value);
 }
}
But we're already beginning to get into difficulties. We don't have anything good to do if we try to add (multiply, or compare) an object which is not Integral (or, more specifically, an IntegralBase). What if the "int" primitive isn't sufficient for our purposes and we need a BigInteger? We could define a BigIntegral class just like the one above. Or we could make Integral generic, except of course that "int" cannot be a generic type because it's a primitive.

But even this is better than the setup that the Java designers gave us. What we have in Java is an abstract type (not an interface) called Number.

public abstract class Number implements java.io.Serializable {

    public abstract int intValue();

    public abstract double doubleValue();

    // etc. etc.
}
That's basically all there is apart from longValue, floatValue, etc. There's no good way to find out if the object we are dealing with is a whole number (operable with one set of operations) or a real number (operable with another set, with some overlap).

The designers of the math3 package from Apache "commons" have helped somewhat. They do bring in a little mathematics withe the Field and FieldElement interfaces. And they provide a type for rational numbers in BigFraction.

But in my humble opinion, Java, while it is admittedly a general-purpose language, could have done so much better right from the start.

OK, back to work.

Monday, November 17, 2014

TiVo

I love TiVo, that's to say I love digital video recorders. I've been letting TiVo simplify my life -- and avoid commercials -- for 13 years now.

Nevertheless, I'm going to use the TiVo user interface as an example of how not to write user interfaces. It seems that they cobbled together something pretty basic when they got started in 1999 and they haven't improved it since. There have been a few minor tweaks and/or name changes but nothing substantial. I don't have the Roamio -- maybe the user interface there is different [see postscript] -- but the classic UI on my "series 3" is simply a bad design that has never been fixed.

According to Wikipedia, there are seven principles of user interface design. While the TiVo design does an adequate job with six of the seven principles, I believe it falls quite short in the seventh:
  • Conformity with user expectations: the dialogue conforms with user expectations when it is consistent and corresponds to the user characteristics, such as task knowledge, education, experience, and to commonly accepted conventions.
What this says in other words is that the UI should operate on the same model of the world (or, more specifically, the relevant subset of the world) as does the user. That makes it user-centric, rather than information-centric, system-centric or whatever. It is the job of the UI (not the user) to translate between the user's model of the world and the system's internal model.

Let me start with the simplest and most fundamental error: when you are, say, watching live TV and you go up to the top-level menu, you would naturally expect that "live TV" would be the current selection. But no, "Now Playing List" is the new selection. That means that if you inadvertently clicked up to the menu and then pressed "Select" you would expect to be back watching live TV -- but you aren't. That breaks perhaps the #1 rule of user interface design: the principle of least surprise. Or, to put it in terms of the above definition, the UI is supposed to be conform to user expectations and be consistent.

Another major mismatch between the TiVo UI model and the way viewers think: channels. Back in the day when there were just a few channels available, essentially one per network, the concept of a channel meant something. You just "knew" which channel a program would be on and it didn't make any sense for it to be on a different channel. But that situation was long gone, here in the USA at least, when TiVo was introduced so it has never made any sense. The viewer simply doesn't care which channel something is on. And, truth be told, neither does TiVo. Yet the user is required, when setting up a Season Pass, for example, to specify the channel. The Season Pass largely ignores this information because it actually lists all of the upcoming episodes, regardless of channel. The user does distinguish between first-run and repeats. And TiVo asks about that. But when listing episodes, it doesn't make any distinction. Consistency!

Another issue that is a fundamental breach of UI design (but strangely is not mentioned in the Wiki article) is that the controller should always be "live." That is to say, there should never be an operation that the user can initiate that he can't cancel or switch to some other operation. Frequently, TiVo goes into a funk while it is reacting to a user command -- and the user is helpless until the action finishes. And there isn't even an indication of how long the action is likely to take.

But my biggest complaint of all is that TiVo has not changed the model to accommodate high-definition TV. Although HDTV was, in theory at least, around in 1999 when TiVo was launched, it didn't become mainstream until the mid-2000s. The PBS HD channel began operations in 2004, for example. Should TiVo have anticipated HD? Of course they should, but it probably would have been acceptable for them to remodel the UI after their first few years of operation. Note that I am talking about the UI here. At some point (around 2005?) new TiVos did support recording and playing HD programs. But the UI continues in blissful ignorance of this rather important concept. For example, when setting up a Season Pass, you cannot specify that you do (or do not) want to record in HD. You can try to persuade the TiVo by specifying you want to record from an HD-only channel (in order to do this, however, you have to delete the old season pass and reprogram it -- unbelievable). But even then, the only way you can insist that a program be recorded in HD is to tell TiVo that you don't receive the corresponding non-HD channel(s). Bizarre in the extreme.

There are many other issues that I have with the TiVo UI. Things that they certainly ought to have fixed in 13 years! But I've covered the main points.

The conclusion? When you're designing a UI, don't think about the way your system works, or how the information is stored in your internal storage. Think about the way the user will want to interact with the system, how he or she will "think" about what they are doing. Model that instead and make all of the interactions consistent with that model. Yes, that's work. But isn't that what your paid for?

OK, back to work!

Postscript: I drafted this on 10/31 and the very next day my TiVo expired (the fan stopped working). No, I don't think it was a conspiracy between Google and TiVo. The TiVo people were quite helpful in getting me an upgrade to the Roamio. It was a significant operation to get it working, requiring collaboration with three different and not entirely cooperative entities: TiVo, Comcast and me. And my old expander disc, while "compatible" with the new model, is completely unreadable. So, basically, I lost everything that I had previously recorded. This seems the height of poor system design. Why on earth would they consider the internal disc and the external disc to be one single volume?

But the look and feel has improved enormously. The TiVo menus are now in HD and they have fixed quite a few of the problems I mentioned above. How is it, though, that those improvements were not available to the old Series 3? There are still breaches of the principle of least surprise. For instance, if you set up a season pass now and choose "new" only, the default channel chosen will, it seems, most likely be a channel that only shows re-runs. You can change it if you happen to notice, but TiVo will not warn you that the season pass will do nothing.

Wednesday, August 27, 2014

Exception handling -- part 2

I last previously talked about exception handling in a blog a couple of years ago: Exception Handling. That was a fairly short blog which attempted to cure some of the more nefarious problems in exception handling which I sometimes see in code. Following these recommendations (nothing that isn't already very obvious) will result in "OK" code.

Now, I want to write up some more advanced guidelines such that following them will, in my humble opinion of course, improve "OK" code to "good" code.

I'm going to refer to the excellent tutorial on Java Exceptions as I continue. You should definitely read and inwardly digest that material. I particularly recommend the section Unchecked Exceptions -- the Controversy.

In the throwing and handling of exceptions, it seems to me that context is everything. This is why we sometimes wrap one exception in another -- because it allows us to add context. But there's another reason to wrap an exception. Here's a common situation:

import java.security.GeneralSecurityException;
import java.sql.Blob;
import java.sql.SQLException;

import javax.crypto.Cipher;

import org.apache.derby.iapi.jdbc.BrokeredConnection;
import org.apache.derby.iapi.jdbc.BrokeredConnectionControl;

public class MyConnection extends BrokeredConnection {

 public MyConnection(final BrokeredConnectionControl bcc, final Cipher cipher) throws SQLException {
  super(bcc);
  this.cipher = cipher;
 }

 public Blob createBlob() throws SQLException {
  return new EncryptedBlob(this.cipher) {

   public byte[] getBytes(final long pos, final int length) throws SQLException {
    final byte[] data = new byte[length];
    // Fill in the actual data from somewhere
    try {
 return this.cipher.doFinal(data, (int) pos, 0);
    } catch (final GeneralSecurityException e) {
 throw new SQLException("crypto problem with cipher "+this.cipher
   + ", length: " + length, e);
    }
   }
  };
 }

 Cipher cipher;
}

public abstract class EncryptedBlob implements Blob {

 protected Cipher cipher;

 /**
  * @param cipher the cipher to use for encryption/decryption
  * 
  */
 public EncryptedBlob(Cipher cipher) {
  super();
  this.cipher = cipher;
 }

 public long length() throws SQLException {
  return 0;
 }

        // etc. etc.
}

Here, we are extending the (Apache) Derby Connection implementation to allow for encryption/decryption of blobs. The details aren't important. But note the signature of the getBytes() method in the blob implementation. It throws a SQLException. But when we try to perform encryption, we are going to have to deal with a GeneralSecurityException. We have no choice about whether to catch or specify: we must catch it. We could eat the exception but that wouldn't be very good (see previous blog)! But since we have the ability to throw a SQLException, we will do just that: wrap the caught exception inside a SQLException. This of course also gives us the opportunity to provide some context: in this case, we don't want to pass back actual data which would be potentially insecure but, since the cipher details and the length of the byte array are quite likely to be relevant, we add those in.

What happens when we are implementing a method that doesn't throw an exception? An example of this is in the ActionListener interface, the actionPerformed(ActionEvent e) method.

 new ActionListener() {
   
  public void actionPerformed(ActionEvent e) {
   // call method that throws a checked exception  
   }
  };

The problem here is that we can't specify the exception in the method signature and we can't wrap the exception in a checked exception. We have to either handle it somehow, or throw it as a RuntimeException. That's OK. It essentially will therefore treat whatever exception is thrown as a programming (logic) error. In other words, by the time the exception bubbles up to a potential handler, we really won't be able to handle it unless we specifically catch RuntimeExceptions. And then we would have to look to see if the cause was possible to be handled. This doesn't really make good sense.

However, I should also note that, as beauty is in the eye of the beholder, so too an exception is a programming (logic) error according to the programmer. Suppose you parsing a String as a Number. If the String was created by code, then the exception is probably justified as a RuntimeException (which it is: NumberFormatException extends RuntimeException). But what if this String was read from a file or the user just typed it in. That's not a logic error. Now, we want it to be a checked exception because we must handle it somehow. So, I'm not sure that the line between checked and unchecked exceptions is quite as clear as the tutorial suggests. In this particular case, for example, the Java designers seem to have got it wrong.

If you find yourself wrapping an exception of type A in a new exception of type A, then you almost certainly shouldn't be doing it. You wrap for necessity (as above) or context. Unless there's significant context to add, you should probably leave the exception alone and let it be passed up the stack.

Now, I want to talk about handling exceptions by logging. Let's say you decide to catch an exception rather than passing it up as is. You must handle the exception by one of the following:
  • performing some other logic based on the information that an exception was thrown [for example converting a String to a Number: you first call Integer.parse(x) and if that throws a NumberFormatException  you instead try Double.parse(x)].
  • wrapping it in a new exception (as described above).
  • logging it.
In the last case, you are asserting that it's OK to continue. Meanwhile, for the purpose of improving the product you have kept a record of the incident and, if it was caused by a bug you have, in the logs, the stack trace to help in debugging.

But you shouldn't do more than one of these things. Think of it this way: there shouldn't be more than one reference to the exception. We should never, for example, find an exception has been logged twice. In the following code, the try/catch in doMain() is completely unnecessary. And it results in two copies of the exception going forward. Bad practice.
public class X {

 public void doSomethingUseful() throws Exception {
  throw new Exception("problem");
 }

 protected void doMain() throws Exception {
  try {
   doSomethingUseful();
  } catch (final Exception e) {
   logger.log(Level.WARNING, "something bad", e);
   throw new Exception("wrapped", e);
  }
 }

 private static Logger logger = Logger.getLogger(X.class.getName());

 public static void main(final String[] args) {
  final X x = new X();
  try {
   x.doMain();
  } catch (final Exception e) {
   e.printStackTrace();
  }
 }
}

If you are writing UI code and an exception bubbles up from below, then it makes sense to do the following:
  1. log it; and
  2. if appropriate, tell the user what went wrong (in terms the user will understand, which is generally not the way exceptions are created) and what he/she can do about it.
Finally, I want to strongly suggest the following rules (which I will not attempt to justify):

  • There should be no more than one try/catch/finally block in any one method. Parallel to this rule is that there should be no more than one loop clause in a method (and ideally only one if construct).
  • Don't bother to catch any exceptions in a private method unless you are going to do something really useful with it.
  • As much as possible (Java 1.7 is good here) bunch the catch clauses together.
  • Be careful only to catch the types of exception that you really want to handle (and can handle) -- don't for example specify Exception in a catch clause because you will end up catching RuntimeExceptions and you won't know if it's safe to proceed. It's OK to catch a superclass (e.g. GeneralSecurityException) but Exception is just too generic.
  • User try-with-resource when appropriate (Java 1.7).
OK, back to work!

Friday, April 18, 2014

Reactive Programming

LinkedIn invited me to write a blog on the site (I don't know if this is a special privilege or the ask everyone) but I wanted to write about Reactive Programming. I have a friend who is looking for work and has been around the block in the software industry for a long time (we're the same age). So, I was able to use his situation as the seed for the idea.

If you'd like to read it, here it is.

OK, back to work!