Friday, October 19, 2012

More ways to try to maximize quality

Yesterday we looked at unit testing, especially test-driven development.  I'm going to add one more type of test here: a test for a private method (or any method which is otherwise out of scope).

Take a look at these two methods in the test class:
 @Test
 public void getTotal_A$() throws Exception {
  final Randomizer target = new Randomizer(getRandom());
  assertEquals(0, getTotal(target));
  target.setBands(1, 2, 3, 4);
  assertEquals(10, getTotal(target));
 }

 private int getTotal(final Randomizer target) throws Exception {
  Object total = TestUtilities.invokeInstanceMethod(target, "getTotal", new Class[] {}, new Object[] {}); //$NON-NLS-1$
  assertTrue(total instanceof Integer);
  return ((Integer) total).intValue();
 }


Most of the time, we can probably skip testing private methods.  But there are times when the private method contains some especially tricky logic, which is then further complicated by its caller(s) and we want to isolate the test to focus on precisely the private method.  I use a utility method which is in a class called TestUtilities which I put in the test directory along with the unit test classes.  The particular method used here is simple enough - I'll just show it in-lined (although the class actually has a number of methods with convenient signatures and is rather more elegant).
 public static Object invokeInstanceMethod(final Object instance, final String methodName, final Class[] paramClasses,
   final Object[] paramObjects) throws Exception {
  final Method method = instance.getClass().getDeclaredMethod(methodName, paramClasses);
  method.setAccessible(true);
  return method.invoke(instance, paramObjects);
 }


Now let's take a look at some tools to try to improve code quality.  In a previous blog, I extolled the virtues of cleaning up code (using the cleanup tool in Eclipse, for example) before committing to source control.  However, perhaps I didn't sufficiently emphasize that it is not just the formatting that is significant, but the cleanup of the code itself.  So, let's assume that our code has been cleaned up by Eclipse (or whatever), compiles, and passes all of its unit tests.

First, I want to take a look at the dreaded NullPointerException.  You definitely want to turn on the Java compiler warning which warns that this is possible ("Null pointer access" under "Potential programming problems"). This can find some possible problems, but it's not nearly as good in Eclipse Indigo as it is in Juno (Eclipse version 4).  Juno has annotations to allow you to explicitly set the expected behavior of references.  There are also the
javax.validation.constraints annotations like @NotNull, though I'm not sure how useful these really are in practice.  Until we can adopt the Juno annotations, we can try to avoid using variables (initialized to null) where we should be using constants.  And, unless it is explicitly documented as a potential and valid return value, methods should never return null.  If null is an invalid response, it is better to throw an exception.  Assertions can also be used (by specifying the -ea option when launching the JVM), but these have always seemed a crude way of doing things.  Annotations are much more appropriate.
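
Here's a minimal sketch of that advice (the getBandName/getBandNames names are made up, purely for illustration): rather than initializing a variable to null and hoping some later branch fills it in, prefer a final local and throw when there is no valid value:
 /**
  * Hypothetical example: look up a band name by index.
  * Never returns null - an invalid index is reported by an exception instead.
  */
 private String getBandName(final int index) {
  // getBandNames() is a made-up helper, assumed to return a Map<Integer, String>
  final String name = getBandNames().get(index);
  if (name == null)
   throw new IllegalArgumentException("no band defined for index " + index);
  return name;
 }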

I like another Eclipse plugin called FindBugs.  This will run on selected code (whole projects if you like) and come up with some potential bugs.  In this case, it takes mild exception to the test class, generating 7 warnings like the following:
RandomizerTest.java:54 Random object created and used only once in RandomizerTest.nextBoolean_A$()
This is not particularly helpful because we knew that already - this is the test class, remember?  Strangely, it doesn't seem to care that there are about a dozen potential null-pointer exceptions.  Occasionally, it will complain about something that you want to keep as is.  There is generally an annotation that you can specify which will tell it not to "bug" you about that in the future (similar to the annotations that you can use to switch off specific compiler warnings).
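
For the compiler-side case, the mechanism is the standard @SuppressWarnings annotation; FindBugs ships an analogous annotation in its own annotations jar (the exact class name depends on the FindBugs version, so check yours).  A trivial, made-up illustration of the compiler-side version:
 // Hypothetical example: silence one specific compiler warning, in one specific place,
 // rather than turning the warning off globally.
 @SuppressWarnings("unused")
 private void demonstrateSuppression() {
  final int neverRead = 42; // would normally trigger the "unused local variable" warning
 }
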
Then finally, when you've fixed all of the warnings that the compiler gives, and all of the bugs that FindBugs finds, and dealt with any null pointer issues, it's time to look for code "smells".  This is my favorite part.  Most of the time I just look and get that feeling that something's, well, fishy.  But you can do it more scientifically.  The tool I'm most familiar with, another Eclipse plugin, is called JDeodorant.

It has a strange user interface that I really dislike, but it does work and can detect four "flavors" of code smell: God class, Long method, Type checking and Feature envy.  Even though my example is quite short and reasonably well programmed, JDeodorant does find a few odors, along with recommendations for fixing them: under God class, for example, it suggests a refactoring, and similarly under Long method.  It will offer to refactor the code for you.  This is all well and good if the changes are confined to one file, because you can revert to a previous version if you like.  But be careful if several files will be refactored, as Feature envy often suggests.  I would only do that after checking everything else into source control.

Going beyond the level of source code, there are of course profilers and other performance-enhancing tools, but these are beyond the scope of this article.  There is one other tool that I like to help maintain the architectural integrity of packages (packages are perhaps the least well thought-out aspect of Java).  It is called SonarJ and can really help with getting your house in order, architecturally speaking, by detecting cycles (A calls B, B calls C, C calls A, etc.).  It will recommend how to break these cycles by introducing interfaces.  Even a relatively simple package (or set of packages) can be surprisingly complex - and you really need a good tool to keep things in order.
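
To make the cycle-breaking idea concrete, here is a hypothetical sketch (the package and type names are invented for illustration).  Suppose the engine notifies the console of its progress and the console also starts the engine, so phasmid.engine and phasmid.ui each depend on the other.  Defining a callback interface alongside the engine, and having the engine call only that interface, removes the engine-to-ui dependency and breaks the cycle:
// File 1: the callback interface lives with the engine, so the engine never needs to import phasmid.ui.
package phasmid.engine;

public interface ProgressListener {
 void progress(int percentComplete);
}

// File 2: the UI depends on the engine (one direction only) by implementing the interface.
package phasmid.ui;

import phasmid.engine.ProgressListener;

public class Console implements ProgressListener {
 @Override
 public void progress(final int percentComplete) {
  System.out.println("progress: " + percentComplete + "%");
 }
}
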
OK, back to work!

Thursday, October 18, 2012

Test-driven development

To my mind, the most important software development tool, next to the compiler, I suppose, is a unit test framework.  I wrote one of these back in 1985 and I regret that I couldn't persuade the company to develop it further, perhaps even replacing our actual product.  But back in those days, the other tools just weren't ready for unit testing -- everything took far too long and we didn't have anything like continuous integration or even an IDE.  There was a powerful Q/A lobby which saw unit testing as trying to put them out of work!

Fast forward to 2012 and not only do we have unit testing, we have mockers and test-code generators.  I will admit right now that I am not an expert (barely a novice) in the realm of mocking.  It may be that some of what I am about to describe could be done better in a different way.  No matter, onward.

I like to use a little tool called JUnitHelper which is an Eclipse plugin.  Now, it certainly has its quirks and idiosyncrasies.  But it does do a basic first cut at a set of methods that "cover" all of the (non-private) methods declared in a source file.  And it makes it easy to hop back and forth between the test class and the test target class (Alt 9 and Alt 8).  When you enter Alt 9 (you can of course do it via menu commands too) it will add any test methods that have become necessary since you last did an Alt 9.  If there is no test class at all (identified by appending "Test" to the type name), it will create one for you.  There are options so you can vary (a little) the various things that it does for you.

So here's what it produces for the Randomizer class that we talked about in the last blog (assuming we have it set to generate JUnit 4 tests):
package phasmid;

import static org.hamcrest.CoreMatchers.equalTo;
import static org.hamcrest.CoreMatchers.is;
import static org.hamcrest.CoreMatchers.notNullValue;
import static org.junit.Assert.assertThat;

import org.apache.commons.math3.random.RandomGenerator;
import org.junit.Test;

public class RandomizerTest {

 @Test
 public void type() throws Exception {
  // TODO auto-generated by JUnit Helper.
  assertThat(Randomizer.class, notNullValue());
 }

 @Test
 public void instantiation() throws Exception {
  // TODO auto-generated by JUnit Helper.
  RandomGenerator random = null;
  Randomizer target = new Randomizer(random);
  assertThat(target, notNullValue());
 }

 @Test
 public void nextBoolean_A$() throws Exception {
  // TODO auto-generated by JUnit Helper.
  RandomGenerator random = null;
  Randomizer target = new Randomizer(random);
  boolean actual = target.nextBoolean();
  boolean expected = false;
  assertThat(actual, is(equalTo(expected)));
 }
etc. etc.....
I'm not in love with having those // TODO lines added but I suppose that's their way of advertising.  I'd prefer it to use more readable assertions such as:
  assertNotNull(Randomizer.class);

instead of

  assertThat(Randomizer.class, notNullValue());

or

  boolean actual = target.nextBoolean();
  assertFalse(actual);

instead of

  boolean actual = target.nextBoolean();
  boolean expected = false;
  assertThat(actual, is(equalTo(expected)));

or, better yet,

  assertFalse(target.nextBoolean());


But these details aren't too significant.  Another possible improvement might be to create a @Before setup method which reinitializes a field to a given state before each test.  Then you would, typically, only need to have one initializer for each field, instead of an initializer per field per method.  But, again, small potatoes.

I will actually create that method just so that the code is a little more readable.

So, now let's get to some real work.  We want to be able to set an array of relative frequencies for the various "bands" of the wheel of fortune.  So we create a test method to set those bands.  We will also create a test method to get the next index.  We will try to name these tests according to the scheme specified in the options for JUnitHelper (but if we don't, it won't be the end of the world, we will just ultimately end up with two test methods for one target method).

This is how the start of our test class looks now:
 private RandomGenerator random;

 @Before
 public void setup() {
  setRandom(new JDKRandomGenerator());
  getRandom().setSeed(0);
 }
 
 @Test
 public void setBands_A$() {
  Randomizer target = new Randomizer(getRandom());
  target.setBands(new int[]{1,2});
 }

 @Test
 public void nextIndex_A$() {
  Randomizer target = new Randomizer(getRandom());
  int actual = target.nextIndex();
 }

 @Test
 public void type() throws Exception {
  assertNotNull(Randomizer.class);
 }


Errors appear for setBands(...) and nextIndex() because these methods don't yet exist (so you also don't get any auto-complete when typing those names, obviously).

So, we opt for the appropriate quick fix: create the methods in the target class. But before fleshing these new method bodies out, let's actually define the behavior that we want right now.

We want the default behavior (if no bands are set) of nextIndex() to match what nextInt() would return if its parameter were Integer.MAX_VALUE.  So, let's code that in our test method for nextIndex.  But we will also add logic to test the invocation of setBands() and a subsequent call to nextIndex() [actually, we will make many such calls].

We also check that the delegated methods behave as expected.

This is what (the start of) our test class looks like now:
 private static final int SEED = 0;

 private RandomGenerator random;

 @Before
 public void setup() {
  setRandom(new JDKRandomGenerator());
  getRandom().setSeed(SEED);
 }

 @Test
 public void setBands_A$() {
  Randomizer target = new Randomizer(getRandom());
  target.setBands(new int[] { 1, 2 });
  int[] counts = new int[] { 0, 0};
  final int n = 100000;
  for (int i = 0; i < n; i++) {
   int nextIndex = target.nextIndex();
   assertTrue(nextIndex == 1 || nextIndex == 0);
   counts[nextIndex]++;
  }
  assertTrue(counts[0] + counts[1] == n);
  assertTrue(Math.abs(counts[1] - 2 * counts[0]) < 500);
 }

 @Test
 public void nextIndex_A$() {
  Randomizer target = new Randomizer(getRandom());
  int actual = target.nextIndex();
  getRandom().setSeed(SEED);
  int nextInt = target.nextInt(Integer.MAX_VALUE);
  assertEquals(nextInt, actual);
 }

 @Test
 public void type() throws Exception {
  assertNotNull(Randomizer.class);
 }

 @Test
 public void instantiation() throws Exception {
  Randomizer target = new Randomizer(getRandom());
  assertThat(target, notNullValue());
 }

 @Test
 public void nextBoolean_A$() throws Exception {
  Randomizer target = new Randomizer(getRandom());
  final boolean actual = target.nextBoolean();
  boolean expected = new Random(SEED).nextBoolean();
  assertEquals(expected, actual);
 }

 @Test
 public void nextBytes_A$byteArray() throws Exception {
  Randomizer target = new Randomizer(getRandom());
  byte[] actual = new byte[] {};
  target.nextBytes(actual);
  byte[] expected = new byte[] {};
  new Random(SEED).nextBytes(expected);
  assertTrue(Arrays.equals(expected, actual));
 }
Notice that we've completely defined the desired behavior of the setBands() and nextIndex() methods, but we still haven't actually coded the case where there are bands set.  Not surprisingly, the unit tests run, except that there is an assertion error in the setBands_A$() method.

Wouldn't it be nice if we could get that written for us?  Well, on second thoughts, wouldn't we be out of a job then?  Actually, there are ways to create such conforming code, via evolutionary programming, but for now we'll just do it our own way.

This is what we end up with for the two methods in question:
 /**
  * @param bands
  *            a set of relative frequencies for the index values 0, 1, ...
  */
 public void setBands(int... bands) {
  this._bands = bands;
  int count = 0;
  for (int i = 0; i < getBands().length; i++)
   count += getBands()[i];
  setTotal(count);
 }

 /**
  * The behavior of this method depends on whether any bands were defined for
  * this randomizer. If bands were defined, the result is an index into the
  * bands. If bands were not defined, the result is a number uniformly
  * distributed between 0 and {@link Integer#MAX_VALUE}. In both cases, the
  * superclass method {@link RandomGenerator#nextInt(int)} is called.
  * 
  * @return the "index" as defined above.
  */
 public int nextIndex() {
  if (getTotal() > 0) {
   int randomizer = nextInt(getTotal());
   for (int i = 0; i < getBands().length; i++) {
    randomizer -= getBands()[i];
    if (randomizer < 0)
     return i;
   }
   return -1; // logic error
  }

  return nextInt(Integer.MAX_VALUE);
 }


Of course it isn't necessary to do test-driven development as shown above.  It's perfectly valid to write the method and then do the testing.  But it can actually be easier and more obvious what's going on when you do it this way.  And, moreover, your unit tests then actually become documentation on how your methods should be used.

Finally, you need to check coverage.  The full suite of tests that I've defined above is not really a complete set.  There is a lot more we could test for.  But let's see what sort of coverage we get, by running EclEmma, another Eclipse plugin that I find works very well.  You might prefer Cobertura or one of the others.  Come to think of it, you might prefer a test runner other than JUnit.

EclEmma tells us that things aren't too bad: 356 covered instructions, 4 missed.  I think we could do rather more testing of the nextIndex() method, perhaps using a greater number of trials than 100,000, maybe with a greater number of bands.
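
For instance, a further test along these lines (a hypothetical addition, reusing the helpers already in the test class) could exercise more bands and more trials; the tolerance at the end is just a rough statistical bound:
 @Test
 public void setBands_A$manyBands() {
  final Randomizer target = new Randomizer(getRandom());
  target.setBands(1, 2, 3, 4); // four bands with relative frequencies 1:2:3:4
  final int[] counts = new int[4];
  final int n = 1000000;
  for (int i = 0; i < n; i++)
   counts[target.nextIndex()]++;
  assertEquals(n, counts[0] + counts[1] + counts[2] + counts[3]);
  // with a total weight of 10, band 3 should be hit about four times as often as band 0
  assertTrue(Math.abs(counts[3] - 4 * counts[0]) < 5000);
 }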

Tomorrow we're going to look at unit testing private methods and also search for bugs and code smells.

OK, back to work!

Monday, October 15, 2012

My process for developing software: creating a simple class

This is the first of several blogs that outline the way I like to develop (or modify) code. I'm talking in particular about Java development using Eclipse, although most of what I'm saying should be generally applicable.

In just about every phase of development, I avoid typing as much as is possible.  Not only is typing relatively slow (although I'm fairly quick myself) but it is error-prone.

And bear in mind that this is a personal view - I don't expect everyone to agree with it in every detail.  Also, I am going to discuss certain tools, but there are usually alternatives to those specific tools that do something similar.  For example, I am using Eclipse Indigo.  I haven't managed to get Juno to behave properly yet.  And you may prefer IntelliJ or something else.

So, let's get started.

For now, let's assume that you are developing a new class (or class family that implements a new interface) and you're starting pretty much from scratch.

Before I do anything else, I naturally search the web for something similar.  Anything that is some sort of utility code has almost certainly been done by someone else somewhere.  This is especially true of anything to do with Collections.

So, let's assume that we couldn't find anything suitable (or we just didn't like what there was).  As an example, we're going to develop a "wheel of fortune" class which we will call Randomizer.

Unfortunately, in the early days of Java, they didn't "eat their own dog food" and did not create interfaces to define many of the standard classes.  java.util.Random is a case in point.  So the good people of Apache created an interface called RandomGenerator which does define the methods we want.

So, we will start by creating a new class Randomizer and have it implement RandomGenerator and Serializable.  But we will arrange for it to delegate its random number generation duties.

In the spirit of avoiding typing, I have the new-class wizard generate things like comments and constructors for me.  But I never want a main method (well, maybe once in every few hundred classes) and, because we're going to delegate the random number generation responsibilities, we will not for now implement the inherited abstract methods.

This is what results:


/**
 * Phasmid blog sandbox
 * 
 * Module: Randomizer
 * Created on: Oct 15, 2012
 */

package phasmid;

import java.io.Serializable;

import org.apache.commons.math.random.RandomGenerator;

/**
 * @author phasmid
 * 
 */
public class Randomizer implements RandomGenerator, Serializable {

 /**
  * 
  */
 public Randomizer() {
  // XXX Auto-generated constructor stub
 }

}

Randomizer is underscored in red in Eclipse because it doesn't implement those abstract methods.  Let's talk about warnings now.  I have Eclipse give me just about every possible warning.  Everything which by default is ignored, I change to warn.

So, first we go to the constructor and replace the comment with:

super();


Yes, I know that it isn't necessary but, to me, a constructor without anything in it looks naked.  Now we click on the constructor and invoke the Change Method Signature refactoring (Alt/Shift/C in Eclipse).  We're going to add a parameter called random of type RandomGenerator.  The default value doesn't matter because nothing calls the constructor yet so null will do.  But if the constructor is in use, I usually try to name the parameter with the same name - that might just match the appropriate identifier in the caller or, if not, it will at least be clear what's needed.  I'm all for naming identifiers according to context, but something that might be passed from one constructor to the next will usually only need one universal name.

At this point, the parameter name should be underscored in yellow (a warning showing that this is an unused parameter in a non-overridden method).  [In Preferences->Java->Compiler->Errors/Warnings->Unnecessary code->Value of parameter is not used, I have unchecked the checkbox for Ignore parameters documented with @param tag.]

So now I click on the parameter name (random) and invoke the quick-fix (Ctrl/1), selecting the Assign parameter to new field option.  For some reason I don't understand, the options you get when you hover over the warning do not include this most obvious solution.  The quick fixes are like that - very inconsistent and frequently missing the most obvious solution.

So, with a (private and final) field named _random defined, this is what the contents of the class look like now.

 private final RandomGenerator _random;

 /**
  * @param random
  *            XXX
  * 
  */
 public Randomizer(RandomGenerator random) {
  super();
  _random = random;
 }

We will eventually clean this up but not just yet.  We've got warnings (underscored here) and we are going to address those immediately.

We click on _random and invoke the Source -> Generate Delegate Methods option (Alt/Shift/S).  This fills in all of our delegate methods and, since the field implements the same interface as the class, our error goes away.

But we end up with some warnings for every reference to _random.  This is because it is an instance field and we have not prefixed the references with "this."  Up until a few years ago, I always wanted to suppress those qualifiers.  But now, it seems to me that they add some clarity and do no harm.  Here's an Eclipse annoyance: if we choose the quick fix "create getter and setter", it works well.  But if you later add more references, you can't convert them all to use the getter at the same time.  BTW, I give getRandom() private scope since it would be overkill to have the class implement the interface and also give public access to a field that does the same thing.

We're left with three warnings: two references to _random which need to be qualified, and the fact that we haven't got a serial version UID.  All three of these problems can be easily fixed with quick fixes.

A bit of (automatic) cleanup [we'll talk more about this later] and we have the following code:
[starting with]

 /**
  * @param random
  *            XXX
  * 
  */
 public Randomizer(final RandomGenerator random) {
  super();
  this._random = random;
 }

 /**
  * {@inheritDoc}
  * 
  * @see org.apache.commons.math.random.RandomGenerator#nextBoolean()
  */
 @Override
 public boolean nextBoolean() {
  return getRandom().nextBoolean();
 }


[and ending with]

 /**
  * {@inheritDoc}
  * 
  * @see org.apache.commons.math.random.RandomGenerator#setSeed(long)
  */
 @Override
 public void setSeed(final long arg0) {
  getRandom().setSeed(arg0);
 }

 /**
  * @return the random
  */
 private RandomGenerator getRandom() {
  return this._random;
 }

 private static final long serialVersionUID = 1238102722233258232L;

 private final RandomGenerator _random;



Next blog: test-driven development.

OK, back to work!

Thursday, October 11, 2012

Method scope

I want to talk in this blog about an aspect of Java that is often paid little attention: method scope.  My way of doing things might not be 100% mainstream, but I believe it makes good sense.

If your class extends a super-class, then you obviously will not be able to reduce the scope of any methods that you override.  Should you widen the scope of any super-methods?  Probably not.

What about non-overridden methods?  If the method is defined by an interface, then it must be declared public -- end of story.  But should any other methods be public?  Well, let's think about how methods in our class will get invoked.  One mechanism of invocation is via reflection, in particular, introspection.  Introspection looks for public methods with names of the form:
  • boolean isX() and its mate setX(boolean);
  • T getX() and its mate setX(T).
These are known as "bean" methods, or property descriptor setters/getters.  Any such method can be invoked by reflection, in particular by inversion-of-control-container (dependency injection) type configuration.  So, if you want these properties to be settable by reflection, you need to declare them public.  There's only one snag with this mechanism: Java doesn't provide any way to mark these bean methods as discoverable (and invokable)  via reflection.  You just have to "know." In practice, there may also be other reflection-invokable methods such as void addX(T) or void putX(String,T).
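
To make the introspection point concrete, here is a minimal sketch (the class name is invented for illustration) of how a container - or any other code - can discover those bean methods at runtime using java.beans.Introspector:
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;

// Hypothetical utility: list every bean property (getter/setter pair) that introspection can find.
public final class BeanInspector {

 public static void listProperties(final Class<?> clazz) throws Exception {
  final BeanInfo info = Introspector.getBeanInfo(clazz);
  for (final PropertyDescriptor pd : info.getPropertyDescriptors())
   System.out.println(pd.getName() + ": getter=" + pd.getReadMethod() + ", setter=" + pd.getWriteMethod());
 }
}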

If you have these reflectible methods or other non-overridden methods that you want to be invoked from other classes by the normal calling mechanism, then you need to declare them public and, if the method receiver type will normally be an interface, then they must go in the interface.

What this implies is that, for a class that will normally be referenced via its interface (thus appropriately hiding the specifics of any concrete class), the only public methods in that class will be the bean methods and the interface methods.  No other method needs to be declared public because there will be no way to invoke it.

Meanwhile, what about protected and default scope?  I tend to use protected scope only for methods that are used internally (that's to say they are invoked by a base class) and declared either abstract or with a trivial default behavior and which are expected to be overridden by concrete classes to define class behavior. Occasionally, I will define a concrete non-overridable method as protected when I am sure that I only want it to be invoked by sub-classes.  And, typically, the only need for default scope is when you create an inner class within a class and you want to create a method which will allow communication between an inner class and its outer class.
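
Here's a hypothetical sketch of that protected-scope pattern (the class names are invented): the base class owns the public entry point and calls a protected hook, which concrete sub-classes override to define the actual behavior:
public abstract class AbstractProcessor {

 /** The public API: callers only ever see this method. */
 public final void process() {
  // common setup would go here
  doProcess(); // invoked internally by the base class
  // common teardown would go here
 }

 /** Overridden by concrete sub-classes to supply the class-specific behavior. */
 protected abstract void doProcess();
}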

Everything else should be declared private.  I like to use long names for private methods, which thus give a good description of what the private method does (naturally, it only does one thing!).  I do not create javadoc comments for private methods or fields because I typically filter out all private members from the generated javadoc.

I also feel that private methods should not normally handle any exceptions.  Exceptions should be handled at a level where handling is either required or opportune.  Therefore, private methods may have a long list of thrown exceptions, as required.
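
As a hypothetical illustration (the helper and its use of reflection are invented for this example, in the spirit of the TestUtilities class from an earlier post): the private method simply declares everything it might throw, and its public caller decides whether to log, wrap or rethrow:
 // Hypothetical private helper: it declares the checked exceptions it can encounter
 // and lets its (public) caller decide how to handle them.
 private Object invokeGetter(final Object bean, final String name)
   throws NoSuchMethodException, IllegalAccessException, java.lang.reflect.InvocationTargetException {
  return bean.getClass().getMethod(name).invoke(bean);
 }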

And while good practice suggests keeping short parameter lists for public methods, private methods can use as many parameters as they please.

Finally, a word on the other method modifiers, apart from scope.  Marking individual methods as final is unusual, but is necessary in certain cases, for example delegated callback methods.  And what about the distinction between class methods (static) and instance methods?  I have the Java compiler configured to warn me about methods which can be defined as static, but aren't.  This is very useful and something which I have long looked for.  I feel that declaring a method as an instance method when it really should be a class method is just plain wrong because it implies a dependence on this when none exists.
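
As a trivial, made-up example of what that warning catches: the following method never touches any instance state (it never uses this), so it should be declared static:
 // No reference to instance fields or methods, so there is no reason for this to be an instance method.
 private static int clamp(final int value, final int min, final int max) {
  return Math.max(min, Math.min(max, value));
 }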

OK, back to work!

Wednesday, October 3, 2012

Variables considered harmful

When I look through code written by other developers, I'm often shocked at how often I find variables used when there is no reason for them to be variable.

Let's take a look at this awful specimen:

public class VariablesConsiderHarmful {

 public static void main(String[] args) {
  new VariablesConsiderHarmful().doYourStuff(args);
 }

 /**
  * @param args
  */
 private void doYourStuff(String[] args) {
  // Do some initialization
  
  String x = null;
  String y = null;
  if ("x".equals(args[0]))
   x = args[1];
  if ("y".equals(args[0]))
   y = args[1];
  if (x == null)
   x = "xxx";
  if (y == null)
   y = "yyy";
  
  System.out.println("x=" + x + ", y=" + y);
 }
}

Let's suppose that the middle section of doYourStuff (between // Do some initialization and the System.out.println line) is much longer and more complex than I've suggested here.  Naturally, you would like to refactor it into a private method getXY().  But if you do that, you will find it is impossible due to there being two returned values (x and y).

How about refactoring into two private methods: getX() and getY()?  The code is so convoluted (believe me, I've seen much worse) that it is very difficult to separate the definition of x from the definition of y.  That's largely because of the way they have been expressed, quite unnecessarily, as variables.  And yet, there never was any reason to make them variables.  I believe this idea of declaring all of your variables at once and later filling in their values is a relic of the bad old days of programming in languages like FORTRAN IV, where everything was a variable (even things that you think are constants).  But we're going back 30+ years now.  Why would anyone program this way in Java today?  Yet they do!

Apart from obfuscation, there is a serious danger in this manner of coding: it's easy to miss initializing something because of a logic flaw.  If you have the appropriate Java warning turned on, it will warn you in the case where you initialized the variable with null (implicitly or explicitly).  But I've seen cases where the variable was initialized with an empty String even though that empty value would be invalid.  The compiler can't protect you then from your own folly.
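
A small, made-up example of that second pitfall (this would sit in a method like doYourStuff above, with args in scope): initializing with "" instead of null keeps the compiler quiet, but a missed branch silently ships an invalid empty value:
  String mode = ""; // "" is not a valid mode, yet the null-analysis warning can no longer help us
  if ("fast".equals(args[0]))
   mode = "fast";
  else if ("slow".equals(args[0]))
   mode = "slow";
  // oops: no final else - an unrecognized argument falls through with mode still ""
  System.out.println("mode=" + mode);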

Let's clean up the doYourStuff method somewhat:

 private void doYourStuff(String[] args) {
  // Do some initialization

  final String x = "x".equals(args[0]) ? args[1] : "xxx";
  final String y = "y".equals(args[0]) ? args[1] : "yyy";

  System.out.println("x=" + x + ", y=" + y);
 }

Isn't that so much more elegant?  If you want to, we can easily create a getX() and getY() method using our "Extract Method" refactoring twice.
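
For the record, this is roughly what those two extracted methods might look like (a sketch - the exact shape depends on how the refactoring treats args):
 private String getX(final String[] args) {
  return "x".equals(args[0]) ? args[1] : "xxx";
 }

 private String getY(final String[] args) {
  return "y".equals(args[0]) ? args[1] : "yyy";
 }
After which doYourStuff simply declares final String x = getX(args); and final String y = getY(args);.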

Yes, of course, some of you will be thinking that we should be using Scala or some other functional programming language where variables are the exception rather than the norm.  I totally agree.  But let's try to avoid using variables when variability serves no purpose whatsoever!

OK, back to work!