Viewing a dependency tree in Maven

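To find out which dependencies you are getting and where they come from, run “mvn dependency:tree”. To send the output to a file, use “mvn dependency:tree -Doutput=file” (newer versions of the dependency plugin name this property “outputFile”, so the equivalent is “-DoutputFile=file”).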

Election Guide, May 2010

This coming Tuesday, we have primary elections. I have been doing my research on the candidates for the various races — all primary elections, and I am registered as a Democrat. I will summarize the results of that research here along with my endorsements and intended votes. [More...]

Petitioning the FCC on Net Neutrality

I sent the following message to the FCC, which is currently accepting public input prior to promulgating new rules on “Net Neutrality”. [More...]

Logging APIs – Evaluating Options

In my previous post, I defined a number of different features that logging libraries could have. This time, I will evaluate some Java libraries based on those features. [More...]

Logging APIs – Feature List

Logging is not the world’s most interesting computing problem, but it is important, and it’s been on my mind lately because people have been pointing out that my company’s use of logging is currently a bit of a mess and ought to be cleaned up. [More...]

Password in Pieces

I came across the following question on reddit:

My bank on the online banking login instead of having a password field it presents you with 3 password fields 1 character each where it asks you for 3 characters from your password, chosen randomly. E.g. the 2nd, 4th and 7th.

I wanted to respond to this, because not only is it an incredibly misguided attempt at security which seriously weakens actual security, it also sounds familiar. Because a few months ago my employer considered doing something just like this. Let me recount the story [More...]

Raising the limit on IDs processed

It is a fairly simple screen for entering “mass alerts”. There are (omitting some irrelevant details) just two fields: one in which the user enters the text of an alert, and the other in which they enter a list of customer-ids specifying who we should show the alert to. This is normally pasted in from a spreadsheet by the users who are setting up new alert messages.

The feature that we need to implement (or “story” in Scrum parlance) is an increase in the maximum number of customers that can be set at once. You see, there is a “feature” that limits the number of IDs that can be set at one time to about 200. (“About” 200 because most IDs are 9 digits long and they are separated by whitespace; the actual limit is 2000 characters, enforced in Javascript as the field is filled in.) So when they need to set an alert on 600 IDs, they run through the screen 3 times. When they have 2.5 million IDs to update they open up a “story” for the development team.

I think we asked someone why it was limited to 200 IDs. No one is quite sure, but it’s probably to avoid overtaxing the database query or running a middleware service that takes too long… something like that. “Sure,” we say, “we can increase the limit.” We figure maybe we’ll group it in chunks of 200 and call it in a loop or something. We schedule it to be worked on in this month’s “sprint”.

A couple of man-days of effort go into building it. Some testing determines that (on much less powerful dev hardware) a single call can easily handle thousands of IDs without running into timeout issues (more than that, actually, as we left a factor of 4 or 5 for safety). So the front end breaks the list into chunks of that size. We thought we’d build it to handle unlimited capacity, but there’s an IE6 bug (yes, our corporate overlords require the use of an obsolete, broken browser) that limits us to about 60,000 IDs.
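The chunking step described above can be sketched roughly as follows. This is only an illustration in Java, not the code from the project (the real front end ran in the browser); the class name IdChunker, the method names, and the chunk size are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class IdChunker {
    // Hypothetical batch size: the post only says "thousands", minus a safety factor.
    public static final int CHUNK_SIZE = 1000;

    /** Splits a pasted, whitespace-separated block of IDs into individual tokens. */
    public static List<String> parseIds(String pasted) {
        List<String> ids = new ArrayList<>();
        for (String token : pasted.trim().split("\\s+")) {
            if (!token.isEmpty()) {   // an all-blank input yields one empty token
                ids.add(token);
            }
        }
        return ids;
    }

    /** Breaks the ID list into consecutive chunks of at most chunkSize entries. */
    public static List<List<String>> chunk(List<String> ids, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, ids.size());
            chunks.add(new ArrayList<>(ids.subList(start, end)));
        }
        return chunks;
    }
}
```

Each chunk would then be sent to the back end in its own call, so no single call has to handle more IDs than testing showed to be safe.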

Our Corporate Overlords

So we have completed the feature and the business can now enter more than 50x as many IDs at a time. But that’s not quite the end of the story. Because as part of regression testing, our QA staff does some exhaustive testing of the screen, and they discover that there apparently isn’t a limit on the size of the other field, the one that contains the alert message. We check the database table for the appropriate max message length, and it turns out to be exactly 2000 characters.

Wait… I think I’ve heard that number before.

Apparently, whoever built this page in the very first place accidentally limited the length of the wrong field. There never was a reason for a limit on the number of IDs processed at once… the limit came entirely because of a bug. Yet we’ve been living with this absurd limitation for several years, simply because no one ever questioned the limit. (Or if they did question it, they got some vague answer like “I assume it’s for performance reasons.”)

I’m sure there is some lesson we should draw from this experience… I’ll leave it to you to figure out what the lesson is.

Upgrading GWT/AppEngine to v1.6+

I had a project using Google Web Toolkit (GWT) and App Engine. It was developed in Eclipse (which I don’t like much, mostly because I don’t know how to use it very well) because Google recommends this and provides support in the form of Eclipse plugins for working with these tools.

Well, they released a new version and I hit the “upgrade” button. After that, my project didn’t work anymore. I tried for a day to resolve it and I just couldn’t understand anything. Finally I “solved” it by uninstalling Eclipse and reinstalling it, then following the tutorial steps to create a brand new project and copying in my old files one-by-one. Another full day lost (I can only work a couple of hours per day on hobby projects).

Surely they wouldn’t do it again, right? So I carefully saved everything and held my breath the next time Google released an upgrade. It promptly broke everything like last time. Only this time I solved it differently: I uninstalled Eclipse and did NOT reinstall it.

The big difference is that in the intervening month JetBrains had announced that a slightly-impaired version of IntelliJ IDEA would be available for free. The stripped down version doesn’t have support for GWT and App Engine (which the paid version does have), but it’s something I can use. At work, I use IntelliJ (properly paid for) but it’s awfully expensive to pay for my own copy at home. (Can’t use the same copy because that would disturb the corporate bean-counters, even though it is allowed by the license.) The stripped down version is fine if I can run from the command line.

There are instructions for running GWT via ant. And there are instructions for adding support for App Engine. But they are broken in (what I think is) exactly the same way that the Eclipse plugin is broken. Details from a forum posting led me to realize the problem was that it now needs a “javaagent” specified. A “javaagent” is a jar containing a class whose premain method the JVM runs before main(), typically so it can instrument classes as they are loaded. The mechanism was introduced with Java 1.5.
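For anyone who has not met the mechanism before, a minimal agent looks something like the sketch below. This is not the App Engine agent, just a hypothetical HelloAgent class that shows the premain entry point from the standard java.lang.instrument package. To actually load it, the class would have to be packaged in a jar whose manifest names it in a Premain-Class entry.

```java
import java.lang.instrument.Instrumentation;

public class HelloAgent {
    // Records the options string passed after "=" on the -javaagent argument.
    public static volatile String lastArgs;

    // The JVM calls this before the application's main() when it is started
    // with -javaagent:path/to/agent.jar=options
    public static void premain(String agentArgs, Instrumentation inst) {
        lastArgs = agentArgs;
        // A real agent would typically register a ClassFileTransformer on
        // "inst" here; this sketch only announces that it ran.
        System.out.println("Agent ran before main(), args=" + agentArgs);
    }
}
```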

So after following Google’s instructions, I now add the following: In my <hosted> target, along with the other <jvmarg> elements, I add a new one which looks like this:

<jvmarg value="-javaagent:${appengine.sdk}/lib/agent/appengine-agent.jar"/>

After that, I can build it using ant. I’ll also need to use the command line for deploys, which looks like this:

"C:\Program Files\appengine-java-sdk-1.2.6\bin\appcfg" update war

And now it works again.

Estimate Units

When you estimate tasks, should the estimates be done in hours, or in days?

As I see it, the big advantage of estimating in hours is that if you THINK in hours, you tend to get a more accurate estimate. There are lots of development tasks which will seem like they should take “no more than 2 days”, but if you think about all the individual steps (I have to create the page and the new service. And the stored procedure. And I’ll have to get a security review and a code review. And I have to remember to do the unit tests. Oh yes, and save time for bug fixes), the total comes out a bit bigger.

As I see it, the big advantage of estimating in days is that it’s quicker and simpler. If your team is sitting there arguing whether a task is 3 hours or 4 hours, then you’re wasting time. After all, development estimates are never THAT accurate anyway: we always need to allow for the unexpected.

Considering these, I could be persuaded to do it either way. What is NOT useful is to think how many days it will take, multiply by the number of hours per day, then spend time arguing about whether it is one more or one less than this number.

An Exception to Every Rule

I like automated code scanners, really I do. They can scan your code either before or after you check it in and review it for code formatting, memory errors, or even potential security problems. They can prevent lots of foolish errors and unnecessary inconsistencies.

But there is one catch [More...]