Logging APIs – Feature List
2010-02-01 | Filed Under Programming
Logging is not the world’s most interesting computing problem, but it is important, and it’s been on my mind lately because people have been pointing out that my company’s use of logging is currently a bit of a mess and ought to be cleaned up. [More...]
Password in Pieces
2009-12-05 | Filed Under Programming
I came across the following question on reddit:
My bank on the online banking login instead of having a password field it presents you with 3 password fields 1 character each where it asks you for 3 characters from your password, chosen randomly. E.g. the 2nd, 4th and 7th.
I wanted to respond to this, because not only is it an incredibly misguided attempt at security which seriously weakens actual security, it also sounds familiar. Because a few months ago my employer considered doing something just like this. Let me recount the story:
I work for a bank, so we care a LOT about security. Customers call into our call center and to identify themselves they get connected to the IVR (interactive voice response unit… telephone system) to enter their PIN (a 4 to 8 digit passcode). An important feature is that we cut out the phone reps from hearing this… because we want your password to be a secret EVEN FROM OURSELVES. All of this is good security design.
We opened up a new call center in Hawaii, and they had some problems. Apparently the phone system we were using had a time limit when transferring a call — if it wasn’t picked up by the remote phone switch within a few milliseconds then it was disconnected. The ping time between the Hawaii call center and our east-coast data center was just a little too long and many of the calls were being disconnected when they were transferred to the IVR to enter the PIN.
The first solution they thought of was to stop using the IVR to enter PINs. Instead, they would create a system where the phone reps asked customers for certain digits out of their PIN (just as described in enanoretozon’s reddit question). The rep would type these in and then the customer could log in. Apparently, this was the standard practice at our German subsidiary, and had somehow become blessed as the official corporate-wide best practice.
Well, it may be an official “best practice”, but it’s still a very bad idea, for two reasons. The first reason should be completely obvious if you just try it. First, say your phone number out loud. Now say the 3rd, 6th, and 4th characters of it. For most normal people, the second will take many times longer, and be much harder, even though it is only 3 digits. There is always a tradeoff between security and usability (We could provide perfect security if we never allowed anyone to take their money out of the bank. Of course, usability would have dropped to zero.), and entering random digits has SUCH poor usability that it is not worth it.
Besides that, it is also less secure. There are, if you consider it, several different kinds of attack that we need to protect against. One kind, certainly, is an attack by an unscrupulous bank employee who might misuse a customer’s login credentials. But another, far more likely, attack is a third party who wants to steal from a customer’s account.
Such an attacker, if they didn’t know the customer’s PIN, would have to guess. To prevent repeated guessing, we will temporarily lock out a customer’s account after a certain number of incorrect login attempts. But a clever attacker would just try different customers, making just one guess for each one.
With (for instance) a 6-digit PIN, the expected number of guesses before the attacker got one right is around 1,000,000. Long before an attacker managed to try even a small fraction of 1,000,000 guesses, we would have noticed what they were doing and put a stop to it. But if we only ask for 3 particular digits out of the PIN, then the attacker only needs to try about 1,000 times before they are expected to guess correctly. There is a good chance that we would catch that, but (particularly if they spoof their phone number) we might not.
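To put rough numbers on that comparison, here is a minimal Python sketch. It assumes PINs are uniformly random decimal digits and that the attacker makes one guess per account, so every guess is an independent trial; the function name is mine and purely illustrative.

```python
def expected_guesses(digits_checked: int) -> int:
    """Expected number of guesses when each attempt targets a fresh account.

    Each guess succeeds with probability 1 / 10**digits_checked, so the
    mean of the resulting geometric distribution is 10**digits_checked.
    """
    return 10 ** digits_checked


full_pin = expected_guesses(6)     # attacker must match all 6 digits
partial_pin = expected_guesses(3)  # attacker must match only the 3 requested digits

print(full_pin)                 # 1000000
print(partial_pin)              # 1000
print(full_pin // partial_pin)  # 1000
```

Under those assumptions, asking for only 3 digits makes the attacker's job about a thousand times easier, whatever the length of the full PIN.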
So we traded better defense against a rare attack (we don’t hire a lot of employees who commit bank fraud) for much worse defense against a common attack (we detect and stop attempted attacks of various sorts every single week!). It is NOT an improvement.
So… after these points were raised, we chose not to implement our German counterpart’s policy. What did we do instead?
Very simple: we fixed the phone system so it could transfer calls properly.
Raising the limit on IDs processed
2009-11-13 | Filed Under Programming
It is a fairly simple screen for entering “mass alerts”. There are (omitting some irrelevant details) just two fields: one in which the user enters the text of an alert, and the other in which they enter a list of customer-ids specifying who we should show the alert to. This is normally pasted in from a spreadsheet by the users who are setting up new alert messages.
The feature that we need to implement (or “story” in Scrum parlance) is an increase in the maximum number of customers that can be set at once. You see, there is a “feature” that limits the number of IDs that can be set at one time to about 200. (“About” 200 because most IDs are 9 digits long and they are separated by whitespace; the actual limit is 2000 characters, enforced in JavaScript as the field is input.) So when they need to set an alert on 600 IDs, they run through the screen 3 times. When they have 2.5 million IDs to update they open up a “story” for the development team.
I think we asked someone why it was limited to 200 IDs. No one is quite sure, but it’s probably to avoid overtaxing the database query or running a middleware service that takes too long… something like that. “Sure,” we say, “we can increase the limit.” We figure maybe we’ll group it in chunks of 200 and call it in a loop or something. We schedule it to be worked on in this month’s “sprint”.
A couple of man-days of effort go into building it. Some testing determines that (on much less powerful dev hardware) a single call can easily handle thousands of IDs without running into timeout issues (more than that, actually, as we left a factor of 4 or 5 for safety). So the front end breaks the list into chunks of that size. We thought we’d build it to handle unlimited capacity, but there’s an IE6 bug (yes, our corporate overlords require the use of an obsolete, broken browser) that limits us to about 60,000 IDs.
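The chunking itself is nothing fancy. Here is a rough Python sketch of the idea (the real front end was not written in Python, and the function name and chunk size here are made up for illustration): split the pasted text on whitespace, then hand the IDs out in fixed-size batches.

```python
from typing import Iterator, List


def chunk_ids(raw_text: str, chunk_size: int) -> Iterator[List[str]]:
    """Split whitespace-separated IDs into lists of at most chunk_size."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    ids = raw_text.split()  # handles spaces, tabs, and newlines from a spreadsheet paste
    for start in range(0, len(ids), chunk_size):
        yield ids[start:start + chunk_size]


# Hypothetical IDs; a real call would use a chunk size in the thousands.
pasted = "100000001 100000002\n100000003\t100000004 100000005"
for batch in chunk_ids(pasted, chunk_size=2):
    print(batch)
```

Each batch would then go to the back end as one call, so the per-call size stays well under whatever the service can handle.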

Our Corporate Overlords
So we have completed the feature and the business can now enter more than 50x as many IDs at a time. But that’s not quite the end of the story. Because as part of regression testing, our QA staff does some exhaustive testing of the screen, and they discover that there apparently isn’t a limit on the size of the other field, the one that contains the alert message. We check the database table for the appropriate max message length, and it turns out to be exactly 2000 characters.
Wait… I think I’ve heard that number before.
Apparently, whoever built this page in the very first place accidentally limited the length of the wrong field. There never was a reason for a limit on the number of IDs processed at once… the limit came entirely because of a bug. Yet we’ve been living with this absurd limitation for several years, simply because no one ever questioned the limit. (Or if they did question it, they got some vague answer like “I assume it’s for performance reasons.”)
I’m sure there is some lesson we should draw from this experience… I’ll leave it to you to figure out what the lesson is.
Upgrading GWT/AppEngine to v1.6+
2009-11-09 | Filed Under Programming
I had a project using Google Web Toolkit (GWT) and App Engine. It was developed in Eclipse (which I don’t like much, mostly because I don’t know how to use it very well) because Google recommends this and provides support in the form of Eclipse plugins for working with these tools.
Well, they released a new version and I hit the “upgrade” button. After that, my project didn’t work anymore. I tried for a day to resolve it and I just couldn’t understand anything. Finally I “solved” it by uninstalling Eclipse and reinstalling it, then following the tutorial steps to create a brand new project and copying in my old files one-by-one. Another full day lost (I can only work a couple of hours per day on hobby projects).
Surely they wouldn’t do it again, right? So I carefully saved everything and held my breath the next time Google released an upgrade. It promptly broke everything like last time. Only this time I solved it differently: I uninstalled Eclipse and did NOT reinstall it.
The big difference is that in the intervening month JetBrains had announced that a slightly-impaired version of IntelliJ IDEA would be available for free. The stripped down version doesn’t have support for GWT and App Engine (which the paid version does have), but it’s something I can use. At work, I use IntelliJ (properly paid for) but it’s awfully expensive to pay for my own copy at home. (Can’t use the same copy because that would disturb the corporate bean-counters, even though it is allowed by the license.) The stripped down version is fine if I can run from the command line.
There are instructions for running GWT via ant. And there are instructions for adding support for App Engine. But they are broken in (what I think is) exactly the same way that the Eclipse plugin is broken. Details from a forum posting led me to realize the problem was that it now needs a “javaagent” specified. A “javaagent” is some sort of a pre-processor that runs before main() — apparently introduced with Java 1.5.
So after following Google’s instructions, I now add the following: In my <hosted> target, along with the other <jvmarg> elements, I add a new one which looks like this:
<jvmarg value="-javaagent:${appengine.sdk}/lib/agent/appengine-agent.jar"/>
After that, I can build it using ant. I’ll also need to use the command line for deploys, which looks like this:
"C:\Program Files\appengine-java-sdk-1.2.6\bin\appcfg" update war
And now it works again.
Estimate Units
2009-02-13 | Filed Under Uncategorized
When you estimate tasks, should the estimates be done in hours, or in days?
As I see it, the big advantage of estimating in hours is that if you THINK in hours, you tend to get a more accurate estimate. There are lots of development tasks which will seem like they should take “no more than 2 days”, but if you think about all the individual steps (I have to create the page and the new service. And the stored procedure. And I’ll have to get a security review and a code review. And I have to remember to do the unit tests. Oh yes, and save time for bug fixes), the total comes out a bit bigger.
As I see it, the big advantage of estimating in days is that it’s quicker and simpler. If your team is sitting there arguing whether a task is 3 hours or 4 hours, then you’re wasting time. After all, development estimates are never THAT accurate anyway: we always need to allow for the unexpected.
Considering these, I could be persuaded to do it either way. What is NOT useful is to think how many days it will take, multiply by the number of hours per day, then spend time arguing about whether it is one more or one less than this number.
An Exception to Every Rule
2008-12-31 | Filed Under Programming
I like automated code scanners, really I do. They can scan your code either before or after you check it in and review it for code formatting, memory errors, or even potential security problems. They can prevent lots of foolish errors and unnecessary inconsistencies.
But there is one catch [More...]
The Death of Ontology
2008-12-29 | Filed Under Programming
Once upon a time, all good software used some sort of a command language. Whether it was a word processor like Emacs, a typesetting program like TeX, or even something like a graphing program, everything had a command line at the bottom and some kind of command language that could be used to control it. Learning to use a piece of software meant reading the manual to see what commands it had, and what kinds of modifiers and arguments those commands took. There was, I expect, a whole science behind the creation of good command languages. I know, for instance, that many allowed you to use abbreviations so long as they were unambiguous, so the better designed ones used a vocabulary that was carefully selected to be memorable but with no words that shared more than a couple of leading letters.
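For anyone who never used such a system, the abbreviation rule can be sketched in a few lines of Python. This is only an illustration of the general idea, not the code of any particular program; the command names and the `resolve` function are invented for the example.

```python
from typing import List, Optional


def resolve(abbrev: str, commands: List[str]) -> Optional[str]:
    """Return the single command that abbrev stands for, or None if it matches zero or several."""
    if abbrev in commands:
        return abbrev  # an exact match wins even if it is also a prefix of another command
    matches = [c for c in commands if c.startswith(abbrev)]
    return matches[0] if len(matches) == 1 else None


commands = ["save", "search", "quit", "query"]  # hypothetical vocabulary
print(resolve("sa", commands))   # save
print(resolve("s", commands))    # None
print(resolve("qui", commands))  # quit
```

With this vocabulary a single letter is never enough, which is exactly the sort of thing a careful designer would have tried to avoid by picking words with different leading letters.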
Then with a single innovation, the entire science of command language design became forever irrelevant. [More...]
How Long is an Email Address?
2008-12-17 | Filed Under Programming
Suppose you are setting up your database table, and you want to create a column to store an email address. How many characters should you allow in the field? [More...]
My Security Nightmare
2008-12-04 | Filed Under Programming
As Willie Sutton didn’t say, “I rob banks because that’s where the money is.”
I work for a bank, and so I worry more about security than most programmers. After all, if a hacker were truly motivated and competent, who would they pick to go after? Probably a bank (the other good option is political or corporate espionage). Recently I saw two security-related stories which, when combined, form my ultimate nightmare: an effective attack for which I cannot imagine a possible defense. [More...]
Election Guide, Nov 2008
2008-11-01 | Filed Under Politics
Here is a description of all items that will be on my local ballot for this upcoming election, along with my own personal recommendations on how I expect to vote, and why. For quite some time now, I’ve done this sort of research before elections; this time I decided to write it out. [More...]
