CAPTCHAs are those odd little boxes that show some badly malformed letters and numbers and ask you to type them in. The idea is to check whether you are a human.
The problem is that CAPTCHAs are pretty difficult for humans, and fairly easy for computers. There are simple work-arounds (like paying to have CAPTCHAs broken on Mechanical Turk), and there are high-tech solutions where you simply build a computer that can solve them. My biggest concern, though, is the new kind of CAPTCHA that people have begun using. I find it to be a real problem, and it, too, can be worked around by anyone who is sufficiently motivated, but it is becoming a disturbingly common new way of identifying real humans:
I wanted to pass on an excellent idea that I read on Martin Fowler's blog. He calls it Immutable Servers, but I claim that, if you think about it properly, it is merely the application of version control to systems administration.
Everyone understands just how much version control has transformed the development of software. It enables developers to make changes freely, rolling back those changes if they need to. It lets them look back in history and find out how things stood at any point in time, what was changed on a certain date, or when a given change was introduced. And with advanced usage, it allows “branching”, where one can experiment with a group of changes for a long time (while still working on the original branch as well) and then merge them together later.
These features aren’t just for code. They are great for text documents that get edited frequently. They are a great idea for file systems. And system administrators are familiar with the idea of keeping all of their system administration scripts in a version control system. But some things are extremely difficult to put under version control. Databases are notoriously difficult to version (although Capital One 360 manages it). And servers, being pieces of physical hardware, are impossible to check into Git.
Except that they’re not. Servers are not pieces of physical hardware anymore… they were until about a decade ago, but in recent years that has changed. The vast majority of the servers in our data center either run or can run on virtual servers. The current buzzword is “cloud computing”, but whatever you call it, we have the technology to spin up and deploy servers from a template in a matter of minutes. (The fact that it takes weeks to get a server set up for your project has nothing to do with technical problems… that’s just our own failure to take full advantage of the technology that we own.)
So, given that the servers are probably running on a virtual machine anyway, it’s a good idea to keep a virtual machine template with the correct configuration (for quickly restoring the machine). Of course, if you do this you will need to update the template every time you make a significant configuration change. Updating the image doesn’t necessarily mean you launch a virtual machine each time, make the change, and then save a new image — you can use tools like Puppet or Chef as part of the image deployment process, so often it is just a matter of editing a configuration file.
For the final step, Martin Fowler proposes taking this to its logical conclusion. If every change needs to be made on the real server AND on the template, why not simplify your workflow (and make it more reliable at the same time) by making the changes directly to the image and deploying a new copy each time? You never change the production server; you just roll out a new one. This sounds crazy to anyone who hasn’t yet drunk the “cloud computing” Kool-Aid, or to anyone for whom creating a new instance of a server takes more than a couple of minutes, but if you DO have an environment that flexible, then you get all the benefits of version control, but for servers. Netflix is one example of a major company that has taken this approach quite successfully.
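The workflow above can be sketched in a few lines. This is a toy illustration, not a real provisioning API: the template dictionaries and function names are hypothetical, standing in for whatever image-baking tool (Packer, Puppet, Chef, etc.) you actually use.

```python
# Immutable-server workflow in miniature: every configuration change is
# committed as a new template revision, and deployment always builds a
# fresh server from a revision -- running servers are never patched.
import copy

TEMPLATE_HISTORY = []  # acts like version control for server images


def commit_template(template):
    """Save a new immutable revision of the server template."""
    TEMPLATE_HISTORY.append(copy.deepcopy(template))
    return len(TEMPLATE_HISTORY) - 1  # revision number


def deploy(revision):
    """Spin up a fresh server from a template revision; never patch in place."""
    return copy.deepcopy(TEMPLATE_HISTORY[revision])


# First release:
rev0 = commit_template({"os": "ubuntu-12.04", "nginx": "1.2", "app": "v1"})
server = deploy(rev0)

# A config change: edit the template, commit, and roll out a NEW server.
rev1 = commit_template({"os": "ubuntu-12.04", "nginx": "1.2", "app": "v2"})
server = deploy(rev1)  # the old instance is discarded, not updated

# Rolling back is just deploying an earlier revision.
server = deploy(rev0)
```

The point of the sketch is the discipline, not the code: rollback and history come for free once the template, rather than the running server, is the thing you edit.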
“Satoshi Nakamoto” is the alias of the anonymous person who invented and published the protocol for Bitcoin. So far, no one knows for sure who it is, although attempts have been made to unmask the person (or people) by an analysis of their writing style and similar indicators. Now, in a blog post, Sergio Demian Lerner has found a way to recognize coins mined by the same computer and has picked out the distinctive pattern of a certain individual who began mining almost from block one and continued mining at a consistent rate with regular restarts for a long time, without spending any of those coins.
This, he says, is Satoshi, and I applaud Sergio for this clever way to recognize an individual miner. Like Sergio, I am pleased that Satoshi’s fortune in Bitcoins is now apparently worth around $100 million USD. But Sergio also suggests that he expects this will lead to the unmasking of Satoshi once others track this to a Bitcoin somewhere which HAS been spent. (Bitcoin has many advantages, but it is NOT fully anonymous: in fact, anyone can track a payment back to see which (anonymous) account it came from previously.)
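That traceability is worth seeing concretely: every Bitcoin transaction names the earlier output it spends, so anyone holding the public ledger can walk a coin's history backwards. Here is a toy model of that; the ledger, transaction IDs, and addresses are made up for illustration.

```python
# A toy public ledger: each transaction records which earlier transaction
# it spends. A coinbase (newly mined) transaction spends nothing.
ledger = {
    "tx1": {"to": "addr_A", "spends": None},   # mined coin
    "tx2": {"to": "addr_B", "spends": "tx1"},
    "tx3": {"to": "addr_C", "spends": "tx2"},
}


def trace_back(txid):
    """Follow a payment back through the public ledger to its origin."""
    chain = []
    while txid is not None:
        chain.append(txid)
        txid = ledger[txid]["spends"]
    return chain
```

Starting from "tx3", anyone can see the coin passed through addr_B and was originally mined in "tx1" — which is exactly why a single spend from Satoshi's hoard could link the whole pattern to a real-world identity.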
I hope he is wrong about the unmasking. I prefer to imagine that Satoshi Nakamoto is living and working a normal job, still haunting cryptography boards in the evenings and on weekends, and occasionally checking the news to see how that Bitcoin thing is progressing. I imagine that someday, many years from now, when she dies, her husband will open that envelope she left in the safe-deposit box and it will contain a hard drive and a stack of papers labeled “Now that I am gone, please publish this for the world to read.”
Okay, it’s just a romantic dream, but I’m hanging onto it as long as I can.
2013-01-31 | Filed Under Software Development
So, “Sunil Kumar” of Panzer Solutions wrote to me ten days ago offering a position. Normally, I appreciate hearing from recruiters. As it happens, I have no interest in a new job; I am happy with my current position and have had plenty of new challenges there recently. But it is nice to see signs that my industry is doing well, and keeping up contacts with recruiters in my area is a good idea.
But Mr. Kumar didn’t write me about a position commensurate with my specific skills; he wrote to tell me “We have more than 100 W2 working currently with successful hit.” (That’s not quite English, but it’s fairly close.) There are recruiters who work hard to match up a particular applicant with a position where their skills and their career/environment preferences are a good fit. When I am doing the hiring (and just to note, Capital One is hiring right now in the Wilmington area), I love working with these recruiters: they bring me just 3 resumes and I end up wanting to bring in all 3 for further interviews. That’s a much more pleasant experience than digging through a stack of resumes most of which come from candidates who can’t pass the FizzBuzz test.
Mr. Kumar is in a different category altogether: he clearly thinks recruiting is a numbers game: if he just sends enough applicant names to enough open positions then he’ll be successful. He won’t be, because he’s not adding value. So I politely wrote back to Mr. Kumar explaining this and asking that he not send me “blind mailing” style job offers. Within a week I had received TWO more emails from Mr. Kumar stating that “Panzer Solutions is looking to hire 10-20 New H1b’s and OPT EAD’s in coming one month.” (Still not quite English.) Besides being a violation of federal employment law (I’m not a lawyer, but I was under the impression that companies were not permitted to favor H1B holders over citizens), this is no better than spam, either for the recipient (me) or for the employer to whom the names are offered.
So I am Naming and Shaming Mr. Sunil Kumar of Panzer Solutions, and I will never do business with him or his company. Here’s hoping this article jumps to the top of the search rankings for those names so that others will recognize their uselessness sooner and Panzer and Mr. Kumar can quickly go out of business and leave space for better recruiters who actually make the hiring process easier, not harder.
2012-09-21 | Filed Under Technology
After looking at the iPhone 5, I see people saying that phones already have everything they need… that nothing new will happen. That’s completely absurd. There are tons of little things: for instance, I want a browser that can do everything the PC browsers can do. But there are also HUGE changes needed. Here are some things that I want for my phone:
- Talk to it. Today I *almost* have this: my Jellybean-based phone can perform near real-time voice recognition (without a network connection). But the error rate is still high enough that after adding in the time to go back and correct errors the whole process takes longer than typing it in on the device’s keyboard. But not much longer… I expect this one very soon.
- Context aware. My phone should know when it’s OK to ring and when it isn’t (if I’m in a meeting or a movie). While I’m driving, it shouldn’t interrupt me with texts. When I start asking for directions, it should guess (with reasonable accuracy) where I might want to go.
- Expandable screen. I want an iPad sized screen, but I want it to fit in my pocket. The only way to do that is to have an expandable or pull-out screen of some sort, or perhaps a projector.
- Keyboard. Something real that I can type on — keyboards are SO amazingly effective. But I don’t like carrying around a bluetooth keyboard (they’re either too small to type on or too big to carry comfortably).
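The "context aware" wish above amounts to a simple decision function over a few signals. Here is a minimal sketch; the signal names are hypothetical, standing in for whatever a real phone would read from the calendar, sensors, or car connection.

```python
# Decide what the phone should do with an incoming call or text, based on
# a few context signals. Defaults assume no special context.
def ring_policy(in_meeting=False, in_movie=False, driving=False):
    """Return the phone's behavior for an incoming notification."""
    if driving:
        return "hold-texts"   # queue texts silently until I stop driving
    if in_meeting or in_movie:
        return "silent"       # vibrate/flash only
    return "ring"
```

The hard part, of course, isn't this function; it's inferring `in_meeting` or `driving` reliably from real-world data.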
2012-08-30 | Filed Under Programming
Ben Northrop wrote to complain that story points are not accurate. They don’t (always) map linearly to hours spent, so adding up story points over a large project won’t accurately give hours for the project. In the spirit of expressing controversial opinions, I will agree, and explain why I think that’s a good thing.
I believe that story points serve as a “rough” estimate. In the teams I work with, story point estimates are made quickly (a few minutes to be sure we understand the story, then quickly discuss and reach a consensus estimate). They are quantized (must round off to some Fibonacci number) which means that any given estimate is necessarily imperfect.
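The quantization step can be made concrete: a raw gut-feel estimate gets snapped to the nearest value on the point scale, so every story-point estimate is deliberately imprecise. The scale below is one common choice, not a standard.

```python
# A common Fibonacci-style story-point scale (an assumption; teams vary).
POINT_SCALE = [1, 2, 3, 5, 8, 13, 21]


def to_story_points(raw_estimate):
    """Snap a raw gut-feel estimate to the nearest allowed point value."""
    return min(POINT_SCALE, key=lambda p: abs(p - raw_estimate))
```

So a story that "feels like a 6" becomes a 5, and anything past the top of the scale is a signal to split the story rather than estimate it.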
As such, they provide a cheap (didn’t take long to generate) but rough (not perfectly accurate) estimate, and they have to be respected as such. Story point estimates would not be useful to answer questions like “Will this project deliver in October or November?”, but they ARE useful for questions like “Would this be a 3-month project or a 1-year project?” For some purposes, a more precise estimate is needed, and then it may be necessary to invest a few hours to a few weeks of detailed work to generate one. However, I think that such situations are rare: people *want* perfect estimates ahead of time but rarely *need* them. Also, I think that people are usually fooling themselves: most (usually waterfall) projects with precise up-front estimates later discover that those estimates are not accurate.
One of the strengths of story points is that everyone (including the customer) REALIZES that they are rough and don’t correspond to a precise delivery date — something that can be difficult to explain for estimates expressed in hours.
2012-03-12 | Filed Under Programming
Suppose you wanted to build a tool for anonymously capturing the websites that a user visited and keeping a record of the public sites while keeping the users completely anonymous so their browsing history could not be determined. One of the most difficult challenges would be finding a way to decide whether a site was “public” and to do so without keeping any record (not even on the user’s own machine) of the sites visited or even tying together the different sites by one ID (even an anonymous one). [More...]
2012-03-10 | Filed Under Programming
Suppose you were building a tool integrated with web browsers to anonymously capture the (public) websites that a user visited and store them to a P2P network shared by the users of this tool. What would the requirements be for this storage P2P network? [More...]
2012-03-05 | Filed Under Programming
2012-03-04 | Filed Under Programming
Do you remember Google Web Accelerator? The idea was that you downloaded all your pages through Google’s servers. For content that was static, Google could just load it once, then cache it and serve up the same page to every user. The advantage to the user was that they got the page faster, and more reliably; the advantage to Google was that they got to crawl the web “as the user sees it” instead of just what Googlebot gets… and that they got to see every single page you viewed, thus feeding even more into the giant maw of information that is Google.
Well, Google eventually dropped Google Web Accelerator (I wonder why?), but the idea is interesting. Suppose you wanted to build a similar tool that would capture the web viewing experience of thousands of users (or more). For users, it could provide a reliable source for sites that go down or that get hit with the “slashdot” effect. For the Internet Archive or a smaller search engine like DuckDuckGo, it would provide a means of performing a massive web crawl. For someone like the EFF or human-rights groups, it would provide a way to monitor whether some users (such as those in China) are being “secretly” served different content. But unlike Google Web Accelerator, a community-driven project would have to solve one very hard problem: how to do this while keeping the user’s browsing history secret — the exact opposite of what Google’s project did. [More...]