<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Dragons in the Algorithm &#187; Programming</title>
	<atom:link href="http://mcherm.com/permalinks/1/category/programming/feed" rel="self" type="application/rss+xml" />
	<link>http://mcherm.com</link>
	<description>Adventures in Programming</description>
	<lastBuildDate>Fri, 03 Feb 2012 20:21:09 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.4</generator>
		<item>
		<title>Host Error 2</title>
		<link>http://mcherm.com/permalinks/1/host-error-2</link>
		<comments>http://mcherm.com/permalinks/1/host-error-2#comments</comments>
		<pubDate>Fri, 03 Feb 2012 20:21:09 +0000</pubDate>
		<dc:creator>mcherm</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://mcherm.com/?p=617</guid>
		<description><![CDATA[Another posting on how to understand Profile errors. If you ever see &#8220;Host error number XXX&#8221;, it means that this was the XXX&#8217;th error of the day that this Profile instance wrote to the logs. Get someone to look it up in the Profile logs. Also, Calling mrpc ZWRAP with [925, 8864, ""44758220"", &#124;!&#124;] will [...]]]></description>
			<content:encoded><![CDATA[<p>Another posting on how to understand Profile errors.<span id="more-617"></span></p>
<p>If you ever see &#8220;Host error number XXX&#8221;, it means that this was the XXX&#8217;th error of the day that this Profile instance wrote to the logs. Get someone to look it up in the Profile logs.</p>
<p>Also, <em>Calling mrpc ZWRAP with [925, 8864, ""44758220"", |!|]</em> will fail if 8864 is not a valid profile userid (which is the case for me).</p>
]]></content:encoded>
			<wfw:commentRss>http://mcherm.com/permalinks/1/host-error-2/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Namespace for a valid SOAP message</title>
		<link>http://mcherm.com/permalinks/1/namespace-for-a-valid-soap-message</link>
		<comments>http://mcherm.com/permalinks/1/namespace-for-a-valid-soap-message#comments</comments>
		<pubDate>Mon, 12 Dec 2011 14:35:29 +0000</pubDate>
		<dc:creator>mcherm</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://mcherm.com/?p=606</guid>
		<description><![CDATA[A brief hint: if you see an error message like this: InputStream does not represent a valid SOAP 1.1 Message check the namespace of the SOAP envelope SOAP 1.1: http://schemas.xmlsoap.org/soap/envelope/ SOAP 1.2: http://www.w3.org/2003/05/soap-envelope]]></description>
			<content:encoded><![CDATA[<p>A brief hint: if you see an error message like this:</p>
<p style="padding-left: 30px;">InputStream does not represent a valid SOAP 1.1 Message</p>
<p>check the namespace of the SOAP envelope</p>
<p>SOAP 1.1: <a rel="nofollow" href="http://schemas.xmlsoap.org/soap/envelope/" target="_blank">http://schemas.xmlsoap.org/soap/envelope/</a></p>
<p>SOAP 1.2: <a rel="nofollow" href="http://www.w3.org/2003/05/soap-envelope" target="_blank">http://www.w3.org/2003/05/soap-envelope</a> (no trailing slash, unlike SOAP 1.1)</p>
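<p>As a rough illustration of the check described above, here is a minimal Python sketch that reports which SOAP version an envelope claims, based only on the namespace of its root element. The function name <code>soap_version</code> is made up for this example. Namespace comparison is an exact string match, so the table uses the SOAP 1.2 namespace as the W3C defines it, with no trailing slash.</p>

```python
import xml.etree.ElementTree as ET

# Envelope namespaces for the two SOAP versions (exact string match matters).
SOAP_VERSIONS = {
    "http://schemas.xmlsoap.org/soap/envelope/": "1.1",
    "http://www.w3.org/2003/05/soap-envelope": "1.2",
}


def soap_version(xml_text):
    """Return '1.1' or '1.2' for a SOAP envelope, or None if the root
    element is not an Envelope in one of the known namespaces."""
    root = ET.fromstring(xml_text)
    # ElementTree renders a namespaced tag as '{namespace}localname'.
    if not root.tag.startswith("{"):
        return None
    namespace, local_name = root.tag[1:].split("}", 1)
    if local_name != "Envelope":
        return None
    return SOAP_VERSIONS.get(namespace)
```

<p>If this returns None for a message you believed was SOAP 1.1, the envelope namespace is the first thing to inspect.</p>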
]]></content:encoded>
			<wfw:commentRss>http://mcherm.com/permalinks/1/namespace-for-a-valid-soap-message/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Binary Backward Compatibility</title>
		<link>http://mcherm.com/permalinks/1/binary-backward-compatibility</link>
		<comments>http://mcherm.com/permalinks/1/binary-backward-compatibility#comments</comments>
		<pubDate>Thu, 08 Dec 2011 03:00:12 +0000</pubDate>
		<dc:creator>mcherm</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://mcherm.com/?p=601</guid>
		<description><![CDATA[I saw this interesting article about a weakness in the Scala language. The weakness applies not just to Scala, but to pretty much any language: the community using the language cannot grow past a certain point until it somehow solves the problem of libraries depending on other libraries in a large (deep) tree. Why is [...]]]></description>
			<content:encoded><![CDATA[<p>I saw this <a href="http://lift.la/scalas-version-fragility-make-the-enterprise">interesting article</a> about a weakness in the Scala language. The weakness applies not just to Scala, but to pretty much any language: the community using the language cannot grow past a certain point until it somehow solves the problem of libraries depending on other libraries in a large (deep) tree.<span id="more-601"></span> Why is this a problem? Because when the language moves forward (to the next version) a deep dependency tree means you can&#8217;t move forward until <em>every</em> library in the tree is moved to the new version, and making every library in the community do that simultaneously is extremely difficult. You can see the problem right now in the Python community: Python 3 was released THREE YEARS ago, but today many major libraries still don&#8217;t support it.</p>
<p>What I found most interesting was something that David Pollak (the post&#8217;s author) alluded to but did not emphasize: an example of a language that <em>has</em> solved this problem. Surprisingly, it is the much-maligned Java. (And perhaps this feature is one of the reasons for Java&#8217;s success in &#8220;the enterprise&#8221;, where backward compatibility with old or unmaintained libraries is often a very big deal.) The Java solution is to provide an incredibly strong degree of backward compatibility at the binary level (not just the source). As far as I know, essentially all code written under Java 1.0 (16 years ago) will still compile under the most recent Java release, and code <em>compiled</em> by that Java 1.0 compiler will still run under the most recent JVM. The price paid is some real ugliness in the name of backward compatibility (like old APIs that still return Hashtable or ArrayList instead of Map or List, and type erasure that makes typed collections less powerful than they could be). But however much you may scoff at Java for poor language design, this feat of backward compatibility is something quite impressive.</p>
]]></content:encoded>
			<wfw:commentRss>http://mcherm.com/permalinks/1/binary-backward-compatibility/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Story Points</title>
		<link>http://mcherm.com/permalinks/1/story-points</link>
		<comments>http://mcherm.com/permalinks/1/story-points#comments</comments>
		<pubDate>Thu, 29 Sep 2011 01:27:08 +0000</pubDate>
		<dc:creator>mcherm</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://mcherm.com/?p=585</guid>
		<description><![CDATA[If you have complete and accurate requirements for your project which won&#8217;t change, and your development team is spot-on in estimating and highly consistent in their development pace, and there are no surprises, then you can produce highly accurate project timeline estimates up front. Such accurate estimates are (or, more accurately, would be) quite useful [...]]]></description>
			<content:encoded><![CDATA[<p>If you have complete and accurate requirements for your project which won&#8217;t change, and your development team is spot-on in estimating and highly consistent in their development pace, and there are no surprises, then you can produce highly accurate project timeline estimates up front. Such accurate estimates are (or, more accurately, would be) quite useful and well worth the effort it takes to produce them because of how nicely you can schedule everything. But how about the rest of us, for whom none of this is true?<span id="more-585"></span></p>
<p>There really isn&#8217;t much benefit to putting in lots of hours developing a detailed estimate if the project isn&#8217;t going to proceed according to plan <em>anyway</em> (and it rarely does). This is why most agile development approaches &#8212; including Scrum &#8212; use a less-precise but also less time-consuming approach. By going with rough requirements and a simple, imprecise estimation process, a team can produce rough estimates in a surprisingly short amount of time. The time saved writing requirement documents and producing estimates can be used to build something useful instead.</p>
<p>The process that I have found to be most useful starts out with requirements that are simple: just a paragraph or two written down for a feature and a few minutes discussion to make sure everyone understands it. The team meets, making sure to include someone from the &#8220;business side&#8221; who can answer questions about what is needed, the developers, QA, DBAs, and whatever other specialists are needed. The business person explains what is needed; the team talks through how they will code and how it will be tested. Then we&#8217;re ready to estimate.</p>
<p>Everyone just says how long they think it will take. To avoid &#8220;groupthink&#8221; where everyone just agrees with the first person to speak, it&#8217;s good to have each person come up with their idea independently before comparing: selecting cards and all revealing at the same time is one way to do this. Everyone estimates: yes, that means the DBA may estimate a Java coding task, but that&#8217;s OK. To avoid long useless debates over whether it&#8217;s 23.2 or 23.4 we usually limit the estimated sizes to some discrete values: 1, 2, 3, 5, 8, 13, 20, and &#8220;more&#8221; are a widely used set of values (the values chosen to make it easy to split a task). If, after hearing what was said, we all agree on the size, then we&#8217;re done (this is where we all discount the DBA&#8217;s estimate of the Java task); if not, then we discuss for a few more minutes: maybe someone realized an extra step the others missed or knows where to find test data without having to enter it. If we still disagree after that, we just take the larger estimate.</p>
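<p>The decision rule in that paragraph is small enough to sketch in a few lines of Python. This is only an illustration: the name <code>settle</code> and the idea of passing the votes in as two lists (one per round of voting) are assumptions made for the example, not part of any Scrum tool.</p>

```python
# The allowed card values, smallest to largest; "more" outranks every number.
CARDS = (1, 2, 3, 5, 8, 13, 20, "more")


def settle(first_round, second_round):
    """Return the estimate for one story.

    first_round:  votes revealed simultaneously, before any discussion.
    second_round: votes after a few minutes of discussion (ignored, and may
                  be empty, when the first round was already unanimous).
    """
    for vote in list(first_round) + list(second_round):
        if vote not in CARDS:
            raise ValueError("not an allowed card: %r" % (vote,))
    if len(set(first_round)) == 1:
        return first_round[0]           # everyone agreed straight away
    if len(set(second_round)) == 1:
        return second_round[0]          # discussion produced agreement
    # Still no agreement: take the larger estimate.
    return max(second_round, key=CARDS.index)
```

<p>For example, votes of 3, 8 and 5 followed by 5, 8 and 5 settle on 8.</p>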
<p>That&#8217;s it! It takes only a few minutes to produce estimates this way. Of course, the estimates are worth what you put into them: the business MUST realize that these are only rough numbers. A common way to do that is to estimate in &#8220;Story Points&#8221; instead of &#8220;hours&#8221; or &#8220;days&#8221;. Speaking in terms of a unit that is less concrete seems to help remind everyone that this is only a rough value. But it is a rough value that did NOT take weeks of preparation, and thus well worth it.</p>
]]></content:encoded>
			<wfw:commentRss>http://mcherm.com/permalinks/1/story-points/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How Even Immutables are Hard with Threads</title>
		<link>http://mcherm.com/permalinks/1/how-even-immutables-are-hard-with-threads</link>
		<comments>http://mcherm.com/permalinks/1/how-even-immutables-are-hard-with-threads#comments</comments>
		<pubDate>Wed, 24 Aug 2011 03:09:52 +0000</pubDate>
		<dc:creator>mcherm</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://mcherm.com/?p=574</guid>
		<description><![CDATA[Armin Rigo has a blog posting (worthy of an article of its own) proposing using STM (Software Transactional Memory) in PyPy. In a discussion on reddit someone suggested that you could have weaker threading guarantees and just use locks manually. It wouldn&#8217;t be so hard, they explained, because: You really only have to do it [...]]]></description>
			<content:encoded><![CDATA[<p>Armin Rigo has <a href="http://morepypy.blogspot.com/2011/08/we-need-software-transactional-memory.html">a blog posting</a> (worthy of an article of its own) proposing using STM (Software Transactional Memory) in PyPy. In <a href="http://www.reddit.com/r/Python/comments/jrm0t/pypy_status_blog_we_need_software_transactional/">a discussion on reddit</a> someone suggested that you could have weaker threading guarantees and just use locks manually.<span id="more-574"></span> It wouldn&#8217;t be so hard, they explained, because:</p>
<blockquote><p>You really only have to do it for data that is not read-only. I would for example say that it&#8217;s pretty rare for classes to change after they have been set up for the first time (presumably before any threads are even started), making the class basically read-only, which could be safely shared across threads.</p></blockquote>
<p>I wanted to give a detailed response explaining why this approach is naive. Actually, it has been tried before and failed. It may work OK with certain kinds of languages (mostly &#8220;functional&#8221; languages), but fails with other kinds of languages, and Python is an extreme example of the kind of language where it won&#8217;t work.</p>
<p>For an example, consider Java. The JVM (Java Virtual Machine) has special features that were added to support exactly this behavior, but in practice few programmers use them. Let&#8217;s take a really simple example: suppose you create some data structure and a function to initialize it. In thread A you create the object, then initialize it, then pass it off to existing threads B and C. Threads B and C simultaneously read stuff from the data structure in ways that WOULD be dangerous except that the data structure is immutable after initialization.</p>
<p>The problem is that the guarantees provided in threading are MUCH weaker than you think. It is tempting to picture different threads all working at the same time, reading and writing the same memory locations, but the architecture of modern CPUs makes that simple picture impossible. You see, it takes hundreds of times longer to read something from memory or write it to memory than it takes to process something in the registers. So to execute &#8220;X = Y + 1&#8221;, the computer COULD spend 100 cycles reading Y, then 1 cycle adding 1, then 100 cycles writing X, for a total of 201 cycles to execute. But that would be unbearably slow. Instead, it takes 100 cycles to do a bulk read of the whole memory area around where Y is stored into high-speed caches. It takes another 100 cycles to do a bulk read of the whole memory area around where X is stored. It takes 1 cycle to add, then takes 100 cycles to do a bulk write of the memory area containing X. That&#8217;s 301 cycles&#8230; which sounds even worse.</p>
<p>But it&#8217;s NOT worse if the compiler cheats. Instead, it spends 100 cycles reading Y and 100 cycles reading X. Then it executes the +1 for one cycle. Then, BEFORE writing out X it does some OTHER calculations on the chunks of memory that have been read in. If the program has good cache locality (active objects are near each other in memory) it may get 75 cycles of useful work done before it needs to spend 100 cycles to &#8220;flush the cache out&#8221; (write X and the other things that were updated). That would be a total of 375 cycles to do 75 bits of work, or just 5 cycles per line &#8212; a LOT better than 201!</p>
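<p>The arithmetic behind those figures can be checked with a short Python sketch. The numbers (100 cycles per bulk memory access, 75 one-cycle steps of useful work) are the illustrative ones from the text, not measurements of any real CPU, and the function names are invented for this example.</p>

```python
MEMORY_ACCESS_CYCLES = 100  # assumed cost of one bulk read or write


def unbatched_cycles():
    """Read Y, add 1, write X, with no other work sharing the memory traffic."""
    return MEMORY_ACCESS_CYCLES + 1 + MEMORY_ACCESS_CYCLES


def batched_cycles(useful_steps=75):
    """Two bulk reads, then `useful_steps` one-cycle steps (the +1 is one of
    them), then a single bulk flush. Returns (total, cycles per step)."""
    total = 2 * MEMORY_ACCESS_CYCLES + useful_steps + MEMORY_ACCESS_CYCLES
    return total, total / useful_steps
```

<p>With these assumptions the unbatched statement costs 201 cycles, while the batched version costs 375 cycles in total, which works out to 5 cycles per step.</p>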
<p>But in order to do this, the compiler has to &#8220;cheat&#8221;. It has to execute bits of work out of order, although it can take special precautions to make sure that it gets the same answer as if it executed them in the order written. As seen by THIS thread. But as seen by a DIFFERENT thread, the steps may appear to happen in a very different order. The other thread won&#8217;t see the effects until they get flushed to main memory, and that won&#8217;t happen after every computation (unless it is running 100x too slow!!!).</p>
<p>WHEW!! Big wall of text there, but the story should explain why one thread in a program may see the computations by another thread happen in a different order. So imagine this:</p>
<p>&#8220;In thread A you create the object, then initialize it, then pass it off to existing threads B and C.&#8221;</p>
<p>But imagine that from thread C&#8217;s point of view, A created it, then passed it off, and only initializes it LATER. In fact, perhaps C will start using it at the same time that A is initializing it &#8212; so it&#8217;s not really immutable, and terrible errors result. This is NOT just a theoretical risk: I have written real code that exhibited this behavior when running on a multi-core machine.</p>
<p>In order to help protect against this, the Java language added a special exception to the Java threading model. Despite all other threading rules, if the fields of an object are declared &#8220;final&#8221; (immutable), then the values assigned to those fields within the constructor are guaranteed to be in place by the time the constructor ends EVEN AS SEEN BY OTHER THREADS (so long as the constructor does not hand out a reference to the object before it finishes). In theory, this is a great tool for creating data structures ahead of time and then reading them after initialization from other threads.</p>
<p>But in <strong>practice</strong> it isn&#8217;t so good. Initializing everything within the constructor of an immutable object turns out to be a real pain. Often you really want to use a hashtable (not immutable), or use Spring injection to populate your objects after the constructor, or a slew of other choices that make it hard to stuff all your setup code inside of constructors. Some languages support this better: Scala and Clojure are examples of languages on the JVM that use this feature well, but in Java it is awkward because there&#8217;s little special support for working with immutable objects. Python, as a language, is even worse: there is NO way to declare an ordinary class truly immutable in Python! So while it might be possible to do as you suggest (create special locks and use them around every __init__ method, then carefully make sure nothing is modified outside of __init__), the resulting language wouldn&#8217;t really read like Python.</p>
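<p>For comparison, here is a minimal Python sketch of the scenario above (thread A creates and initializes a structure, then passes it to threads B and C) using an explicit, synchronized handoff instead of relying on immutability. It is not what either the reddit commenter or the PyPy post proposes. It simply shows that <code>queue.Queue</code>, which does its own locking, gives a well-defined point at which the finished object changes hands. All function names are invented for the example.</p>

```python
import queue
import threading


def build_table():
    """Thread A's job: create the structure and finish initializing it."""
    table = {}
    for n in range(5):
        table[n] = n * n
    return table


def reader(inbox, results, name):
    """Threads B and C: wait for the handoff, then only read."""
    table = inbox.get()               # blocks until A has put the object
    results[name] = sum(table.values())


def run():
    inboxes = {"B": queue.Queue(), "C": queue.Queue()}
    results = {}
    readers = [threading.Thread(target=reader, args=(inbox, results, name))
               for name, inbox in inboxes.items()]
    for thread in readers:
        thread.start()                # B and C already exist before the handoff
    table = build_table()             # initialization completes first...
    for inbox in inboxes.values():
        inbox.put(table)              # ...and only then is the object published
    for thread in readers:
        thread.join()
    return results
```

<p>The cost is exactly the one the post complains about: every structure shared this way needs its handoff written out by hand.</p>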
]]></content:encoded>
			<wfw:commentRss>http://mcherm.com/permalinks/1/how-even-immutables-are-hard-with-threads/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>When to Wrap a Library</title>
		<link>http://mcherm.com/permalinks/1/when-to-wrap-a-library</link>
		<comments>http://mcherm.com/permalinks/1/when-to-wrap-a-library#comments</comments>
		<pubDate>Sun, 03 Jul 2011 16:20:21 +0000</pubDate>
		<dc:creator>mcherm</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://mcherm.com/?p=562</guid>
		<description><![CDATA[I find that this comes up fairly frequently. You find some useful library: perhaps it does logging, or enforces design-by-contract, or it provides an API for calling web services. But someone on the team suggests that instead of using the library directly, we should create a wrapper: &#8220;that way, if we ever decide to switch [...]]]></description>
			<content:encoded><![CDATA[<p>I find that this comes up fairly frequently. You find some useful library: perhaps it does logging, or enforces design-by-contract, or it provides an API for calling web services. But someone on the team suggests that instead of using the library directly, we should create a wrapper: &#8220;that way, if we ever decide to switch to a different library instead it will be easy to switch&#8221;. Is this a good idea?<span id="more-562"></span></p>
<p>There are a few really good reasons for wrapping a library. The most important of these is to add functionality or to simplify use of the library. For instance, in a recent project we used Spring&#8217;s library for web service calls in order to make calls to our own company&#8217;s collection of web services. But when calling <em>our</em> web services, there are a bunch of things that would be nice to do. We always want the same value for the address to connect to, the timeout for the calls, and the set of headers to provide. We want additional special handling for errors wrapped around every call. Adding these features in a wrapper makes the wrapper <em>less</em> powerful (now it&#8217;s good only for calling <em>our</em> services whereas Spring&#8217;s original library could call any web service), but at the same time makes it much more useful for that one specific purpose.</p>
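<p>A wrapper of that kind might look roughly like the following Python sketch. Everything in it is hypothetical: the class and exception names, the base address, the header, and the <code>transport</code> callable that stands in for the general-purpose library. It is not the Spring API and not the code from the project mentioned above.</p>

```python
class ServiceError(Exception):
    """Raised for any failed call, so callers handle one error type."""


class CompanyServiceClient:
    # Hypothetical fixed settings that every call should share.
    BASE_URL = "https://services.example.invalid"
    TIMEOUT_SECONDS = 10
    HEADERS = {"X-Client": "example-app"}

    def __init__(self, transport):
        # transport(url, payload, headers=..., timeout=...) stands in for the
        # underlying library, which could call any web service.
        self._transport = transport

    def call(self, operation, payload):
        """Call one of our own services; address, headers and timeout are fixed."""
        url = "%s/%s" % (self.BASE_URL, operation)
        try:
            return self._transport(url, payload,
                                   headers=dict(self.HEADERS),
                                   timeout=self.TIMEOUT_SECONDS)
        except OSError as exc:
            # The special error handling wrapped around every call.
            raise ServiceError("%s failed: %s" % (operation, exc)) from exc
```

<p>The wrapper can no longer reach an arbitrary address, which is the loss of power described above, but a caller now needs only an operation name and a payload.</p>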
<p>I have also seen cases where the existing library had a terrible interface (API), and the wrapper attempts to make it palatable. The &#8220;<a title="Slick" href="http://slick.cokeandcode.com/">Slick</a>&#8221; library is a <del>Python</del>[ed] Java wrapper around <a href="http://www.lwjgl.org/">LWJGL</a> adding no real functionality but making it decent enough to use. This is a rare use case: most libraries that have a terrible interface also have lousy features and you&#8217;re better off finding a different library instead.</p>
<p>The most common argument that I hear is neither of these cases: the most common argument that I hear is that we should wrap the library so we can easily switch to a different library. In fact, I most often hear this from people developing in a language with <a title="strong typing defined" href="http://www.artima.com/weblogs/viewpost.jsp?thread=7590">strong typing</a>, such as Java. I find this argument completely unpersuasive, for two reasons. First of all, when you switch libraries, the new library typically does <em>not</em> have exactly the same API. For example, when we <a title="My previous article on why we switched" href="http://mcherm.com/permalinks/1/logging-apis-evaluating-options">switched</a> to SLF4J for logging one of the reasons for doing so was that it offered a better API that allowed functionality not possible with the previous API. Secondly, if you DO switch to a library with an API that is equivalent, in a strongly-typed language you can use standard refactoring tools to perform the switch without any risk of introducing bugs. (If the APIs are close enough a simple search-and-replace for an import statement may do it.)</p>
<p>There are advantages to using a library directly. Developers who have encountered the library elsewhere may already be familiar with it. The documentation for the library is likely to be far more extensive than the documentation for your wrapper. It is often safe to assume that the designers of the library are better at designing an API for this feature than you are. As the library is upgraded, newer features will automatically be available. Most of all, having one fewer layer means there is simply less to learn to understand the system.</p>
<p>There are still a few advantages to wrapping without extra features. It gives you a place to add some logging code, or timers around an external call (for profiling), or validation checks. And there are some cases where you want to be able to use <em>different</em> libraries with the same codebase &#8212; then a wrapper is indispensable. <a href="http://commons.apache.org/logging/">Commons Logging</a> is an example of this: it allows a library to use different logging frameworks depending on what application it has been embedded in.</p>
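<p>The first of those advantages, a single place to hang logging or timing, can be sketched in Python with a decorator. The names <code>timed</code> and <code>wrapper_timing</code> are made up for this example, and the decorator is only one of several ways to build such a thin wrapper.</p>

```python
import functools
import logging
import time

logger = logging.getLogger("wrapper_timing")  # hypothetical logger name


def timed(func):
    """Wrap an external call so that its duration is logged every time."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether the call returned or raised.
            elapsed = time.perf_counter() - start
            logger.info("%s took %.6f s", func.__name__, elapsed)
    return wrapper
```

<p>Applying <code>@timed</code> to the one function that talks to the library adds profiling without changing what that function returns or raises.</p>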
<p>So my approach to the &#8220;wrap or not to wrap&#8221; question goes like this. First of all, will my wrapper add functionality, or will it remove functionality but thereby simplify the interface? If so, then wrapping makes sense. Secondly, if I haven&#8217;t yet chosen which library to use or if I want to switch back and forth between libraries, then a wrapper will be required. If neither of these applies, then I begin with a strong presumption that I should use the library on its own, and only a real need to add logging, monitoring, or other wrapped behavior will persuade me otherwise.</p>
]]></content:encoded>
			<wfw:commentRss>http://mcherm.com/permalinks/1/when-to-wrap-a-library/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Wrong SAAJ Version &#8211; a Spring bug</title>
		<link>http://mcherm.com/permalinks/1/wrong-saaj-version-a-spring-bug</link>
		<comments>http://mcherm.com/permalinks/1/wrong-saaj-version-a-spring-bug#comments</comments>
		<pubDate>Sun, 08 May 2011 14:09:48 +0000</pubDate>
		<dc:creator>mcherm</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://mcherm.com/?p=550</guid>
		<description><![CDATA[A few notes on a bug I had so next time I won&#8217;t make the same mistake. In spring-ws, in the class org.springframework.ws.soap.saaj.SaajSoapMessage, in the method getImplementation(), it uses SaajUtils.getSaajVersion(SOAPMessage) to determine the SAAJ version of this message. Unfortunately, that has a bug in it (or at LEAST a poor design) which can be quite [...]]]></description>
			<content:encoded><![CDATA[<p>A few notes on a bug I had so next time I won&#8217;t make the same mistake.<span id="more-550"></span></p>
<p>In spring-ws, in the class <code>org.springframework.ws.soap.saaj.SaajSoapMessage</code>, in the method <code>getImplementation()</code>, it uses <code>SaajUtils.getSaajVersion(SOAPMessage)</code> to determine the SAAJ version of this message. Unfortunately, that has a bug in it (or at LEAST a poor design) which can be quite confusing.</p>
<p><code>getSaajVersion()</code> calls <code>SOAPMessage.getSOAPPart().getEnvelope()</code>. If the message was badly formed (in my case, it declared an <code>xsi:schemaLocation</code>, but failed to declare <code>xmlns:xsi</code>) then this is where the exception will be thrown. The code in <code>getSaajVersion()</code> then catches <i>any generic SOAPException</i> and swallows it (assuming the message must be SAAJ_11). The &#8220;parse was invalid&#8221; SOAPException will be ignored.</p>
<p>That wouldn&#8217;t be so bad if it weren&#8217;t for the fact that the SOAPMessage is mutated. The <code>getSOAPPart()</code> and <code>getEnvelope()</code> start with &#8220;get&#8221; so it suggests that they do not mutate the SOAPMessage, but in fact they DO! In the normal flow of processing, this just happens to be the <em>first</em> time that the content of the message gets looked at. During the first time, the InputStream is read, but after that first time it won&#8217;t be read again. So the NEXT time these are called, they will behave differently (returning null instead of throwing an exception). This confused me for a couple of days, thinking I had 2 different errors (it was returning null, and my SAAJ version was messed up), and producing the truly bewildering behavior (which <i>should</i> have tipped me off) that I could fix things by looking at something in the debugger.</p>
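<p>The combination that caused the confusion (a &#8220;get&#8221; method that consumes its input on first use, plus a caller that swallows the resulting exception) can be reproduced in miniature. The Python below is an analogy only: <code>LazyMessage</code> and <code>guess_version</code> are invented stand-ins, not the SAAJ or spring-ws code.</p>

```python
import io


class LazyMessage:
    """Hypothetical message that parses its input stream on first access."""

    def __init__(self, stream):
        self._stream = stream
        self._parsed = False
        self._envelope = None

    def get_envelope(self):
        if not self._parsed:
            self._parsed = True            # state changes despite the "get" name
            text = self._stream.read()     # the stream is consumed exactly once
            if not text.startswith("<Envelope"):
                raise ValueError("parse was invalid")
            self._envelope = text
        return self._envelope


def guess_version(message):
    """Mimics the swallowing pattern: any failure is taken to mean '1.1'."""
    try:
        message.get_envelope()
    except ValueError:
        pass                               # the real cause is lost here
    return "1.1"


message = LazyMessage(io.StringIO("not an envelope"))
version = guess_version(message)   # the first access fails, silently
later = message.get_envelope()     # the second access returns None instead
```

<p>Anything that touches the getter first, including a debugger evaluating it, changes what every later caller sees.</p>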
]]></content:encoded>
			<wfw:commentRss>http://mcherm.com/permalinks/1/wrong-saaj-version-a-spring-bug/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Eric Lippert Tree Challenge</title>
		<link>http://mcherm.com/permalinks/1/eric-lippert-tree-challenge</link>
		<comments>http://mcherm.com/permalinks/1/eric-lippert-tree-challenge#comments</comments>
		<pubDate>Fri, 10 Sep 2010 03:48:58 +0000</pubDate>
		<dc:creator>mcherm</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://mcherm.com/?p=484</guid>
		<description><![CDATA[In his blog, Eric Lippert issued an interesting programming challenge. (Follow the link for details of the requirements.) Here is my solution. # # Programming challenge from # http://blogs.msdn.com/b/ericlippert/archive/2010/09/09/old-school-tree-display.aspx # # Done in Python, by Michael Chermside # import unittest import itertools # ============== Provided Problem ============== class Node: """This class is, by gentleman's agreement, [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blogs.msdn.com/b/ericlippert/archive/2010/09/09/old-school-tree-display.aspx">In his blog</a>, Eric Lippert issued an interesting programming challenge. (Follow the link for details of the requirements.) Here is my solution.<span id="more-484"></span></p>
<pre>
#
# Programming challenge from
#  http://blogs.msdn.com/b/ericlippert/archive/2010/09/09/old-school-tree-display.aspx
#
# Done in Python, by Michael Chermside
#

import unittest
import itertools

# ============== Provided Problem ==============
class Node:
    """This class is, by gentleman's agreement, immutable.
    Do not modify it after creation."""
    def __init__(self, text, *children):
        self.text = text
        self.children = children

# ============== SOLUTION ==============
NEWLINE = '\n'

def dumper(root):
    """Passed top node of the tree. Returns string for the tree."""
    return ''.join( dumper_helper(root, ()) )

def sequence_and_item(sequence, item):
    """Returns an iterable (one which can be iterated multiple times)
    whose items are taken from the given sequence, followed by the
    given item."""
    return list(itertools.chain(sequence, (item,)))

def dumper_helper(node, prefix):
    """Generator which is passed a node and a sequence of prefix
    strings to be applied for each line. (Note: it must be possible
    to iterate the prefix sequence multiple times.)

    Yields a series of strings that, when assembled, will contain
      (1) the node name,
      (2) a newline,
      (3) line contents for all its children, each preceded by the specified prefix."""
    # -- the node name --
    yield node.text
    # -- newline --
    yield NEWLINE
    children = node.children
    if len(children) >= 1:
        # -- All but the last child --
        for child in children[:-1]:
            for x in prefix:
                yield x
            yield '├─'
            for x in dumper_helper(child, sequence_and_item(prefix, '│ ')):
                yield x
        # -- Last child --
        child = children[-1]
        for x in prefix:
            yield x
        yield '└─'
        for x in dumper_helper(child, sequence_and_item(prefix, '  ')):
            yield x

# ============== Unit Tests ==============

class TestDumper(unittest.TestCase):
    def test_min_tree(self):
        tree = Node('a')
        self.assertEqual(dumper(tree), 'a\n')
    def test_one_top_child(self):
        tree = Node('a', Node('b'))
        self.assertEqual(dumper(tree),
                         'a\n'
                         '└─b\n')
    def test_some_top_childs(self):
        tree = Node('a', Node('b'), Node('c'), Node('d'))
        self.assertEqual(dumper(tree),
                         'a\n'
                         '├─b\n'
                         '├─c\n'
                         '└─d\n')
    def test_three_levels(self):
        tree = Node('a', Node('b', Node('c')))
        self.assertEqual(dumper(tree),
                         'a\n'
                         '└─b\n'
                         '  └─c\n')
    def test_full_to_three_levels(self):
        tree = Node('a',
                    Node('aa',
                         Node('aaa'),
                         Node('aab'),
                         Node('aaz')
                    ),
                    Node('ab',
                         Node('aba'),
                         Node('abb'),
                         Node('abz')
                    ),
                    Node('az',
                         Node('aza'),
                         Node('azb'),
                         Node('azz')
                    )
                )
        self.assertEqual(dumper(tree),
                         'a\n'
                         '├─aa\n'
                         '│ ├─aaa\n'
                         '│ ├─aab\n'
                         '│ └─aaz\n'
                         '├─ab\n'
                         '│ ├─aba\n'
                         '│ ├─abb\n'
                         '│ └─abz\n'
                         '└─az\n'
                         '  ├─aza\n'
                         '  ├─azb\n'
                         '  └─azz\n')
    def test_erics_example(self):
        tree = Node("a",
                   Node("b",
                       Node("c",
                           Node("d")),
                       Node("e",
                           Node("f"))),
                   Node("g",
                       Node("h",
                           Node("i")),
                       Node("j")))
        self.assertEqual(dumper(tree),
                         'a\n'
                         '├─b\n'
                         '│ ├─c\n'
                         '│ │ └─d\n'
                         '│ └─e\n'
                         '│   └─f\n'
                         '└─g\n'
                         '  ├─h\n'
                         '  │ └─i\n'
                         '  └─j\n')

if __name__ == '__main__':
    unittest.main()
</pre>
<p>DEVELOPMENT AND DESIGN NOTES:</p>
<p>(1) I chose to write this in Python (Python 3). I don&#8217;t know .Net that well and didn&#8217;t have a compiler handy. Python seemed a good, readable choice. Its performance characteristics are different, but I don&#8217;t care about performance.</p>
<p>(2) I chose to write unit tests. Unusually for me, I wrote this using &#8220;Test Driven Design&#8221;: writing a test, then afterward modifying the code to make it work. TDD is perfect for this kind of small, contained problem when I have a fairly good idea of how to proceed but I need to be careful of subtle errors (always a danger with recursion).</p>
<p>(3) I chose a functional, not imperative, approach. This was for two reasons: because I wanted to practice that style of coding, and because the nature of the problem (tree processing) is well-suited for such an approach.</p>
<p>(4) I was NOT smart enough to write perfectly clean code on the first try. There was a final pass where I renamed functions and variables and removed unnecessary layers.</p>
<p>(5) Nor was I smart enough to see the elegant recursive subroutine on my first glance at the problem. At first, I passed the list of children (and a prefix) to my recursive subroutine and returned complete lines. Partway through, I looked at the structure of my code and realized that it was far cleaner if I passed in nodes (and a prefix) and returned &#8220;the node name, a newline, and the lines underneath it&#8221;. Perhaps I could have figured that out from the start by examining the sample output more carefully&#8230; but the cool thing was that making the change was a simple, easy transformation, and realizing there was a cleaner approach came from looking at the structure of my code. So I conclude that just starting in without a perfect understanding of the ideal recursive unit was <em>just fine</em> because I could clean it up later with refactoring. This was, perhaps, the most interesting thing I learned from the exercise.</p>
<p>(6) I chose to write my recursive function as a &#8220;pure function&#8221; (no side effects, immutable arguments). So a call returned the output for that sub-section of the tree. The other alternative would have been to pass some sort of &#8220;output object&#8221; to the function that it would write to as it went. I chose this approach mostly to practice functional programming. I feel it still came out very readable.</p>
<p>(7) At first, I was returning strings. But I felt bad about constantly building strings. After all, strings are immutable (in so many modern languages, including Python, .Net, and Java) and I worried that all the string manipulation would be a performance problem. Even though I didn&#8217;t care about performance, I chose to return lists of lines instead. Then I realized I could return lists of bits of string and never have to concatenate any strings until the final step. This approach would look perfectly normal in a language like Haskell. By the way, I never did actual performance testing so my belief that the string manipulation would be slow might be completely wrong. I <em>know</em> that intuition is unreliable for performance questions, but one often still must make a decision without complete information.</p>
<p>(8) Python has this nifty feature called a &#8220;generator&#8221;. It&#8217;s a function where you use the keyword &#8220;yield&#8221;. The compiler turns the function into one returning an iterator which will give out the values you would get if you ran the function and each &#8220;yield&#8221; put an item onto a list. Except that it does not actually run any bit of code until needed, and it does not materialize the list in memory. Another way to think about it is that when you pull a value from the iterator, the function executes until the first &#8220;yield&#8221; statement, then it pauses. The NEXT time you pull a value from the iterator the function <em>picks up where it left off</em> with <em>all local variable state intact</em> and continues until the next &#8220;yield&#8221;. It is a VERY cool feature, and I used it because of the amazing way it made my function very readable. Other languages should adopt this idea!</p>
<p>(9) There was one place where I wanted to do something simple, but the most readable syntax I could find for it was unbearably ugly and complex. So I made it a subroutine, just for the sake of readability and of having a function comment explaining what the heck it was doing. That is why the function &#8220;sequence_and_item&#8221; exists.</p>
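<p>To make notes (7) and (8) concrete, here is a small stand-alone sketch. It is not the tree dumper itself: the function <code>numbered_lines</code> and the <code>log</code> list are hypothetical names made up for this illustration. It shows that a generator body does not run until a value is pulled, that local state survives between yields, and that the bits of string are only joined in the final step.</p>

```python
log = []  # hypothetical helper: records when the generator body actually runs


def numbered_lines(names):
    """Yield small bits of text; the caller joins them at the very end."""
    log.append('body started')
    count = 0                 # local state, preserved between yields
    for name in names:
        count += 1
        yield str(count)
        yield ': '
        yield name
        yield '\n'


pieces = numbered_lines(['a', 'b'])   # creates the iterator only
started_before_pull = list(log)       # still empty: no body code has run

first = next(pieces)                  # runs the body up to the first yield
rest = ''.join(pieces)                # resumes after that yield, count intact
```

<p>Calling <code>''.join(...)</code> on the iterator is the only place where a full string gets built, which is the same shape as returning bits of string and concatenating once at the end.</p>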
]]></content:encoded>
			<wfw:commentRss>http://mcherm.com/permalinks/1/eric-lippert-tree-challenge/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Error message was &#8216;3&#8217;.</title>
		<link>http://mcherm.com/permalinks/1/error-message-was-3</link>
		<comments>http://mcherm.com/permalinks/1/error-message-was-3#comments</comments>
		<pubDate>Fri, 20 Aug 2010 20:07:59 +0000</pubDate>
		<dc:creator>mcherm</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://mcherm.com/?p=479</guid>
		<description><![CDATA[Just a brief entry so the NEXT time this happens I can search and find the solution. Once before I had this problem, but I couldn&#8217;t remember the solution so this time I am writing it down. We (a Profile developer and I) were adding a new MRPC. When I tried calling it from Java I [...]]]></description>
			<content:encoded><![CDATA[<p>Just a brief entry so the NEXT time this happens I can search and find the solution.<span id="more-479"></span> Once before I had this problem, but I couldn&#8217;t remember the solution so this time I am writing it down. We (a Profile developer and I) were adding a new MRPC. When I tried calling it from Java I got the following mysterious message:</p>
<pre>dg.DirectGatewayException: Error on call to mrpc ZMRPCSETRATE with parameters [PRIME, 19/08/2010, 3.75], error message was '3'.</pre>
<p>The mysterious &#8220;error message was &#8216;3&#8217;.&#8221; is the key here. The cause is an error in how the exit from the MRPC is coded. This message appears when the author attempted to return a value via the QUIT statement:</p>
<blockquote>
<pre>..do stuff..
Q 1
</pre>
</blockquote>
<p>Instead, they should have set the return (or ret) parameter and then returned nothing via the QUIT statement:</p>
<blockquote>
<pre>...do stuff..
S RET=1
Q ""
</pre>
</blockquote>
<p>With this change, the error message goes away.</p>
]]></content:encoded>
			<wfw:commentRss>http://mcherm.com/permalinks/1/error-message-was-3/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Logging APIs &#8211; Evaluating Options</title>
		<link>http://mcherm.com/permalinks/1/logging-apis-evaluating-options</link>
		<comments>http://mcherm.com/permalinks/1/logging-apis-evaluating-options#comments</comments>
		<pubDate>Tue, 09 Feb 2010 13:30:58 +0000</pubDate>
		<dc:creator>mcherm</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://mcherm.com/?p=381</guid>
		<description><![CDATA[In my previous post, I defined a number of different features that logging libraries could have. This time, I will evaluate some Java libraries based on those features. I&#8217;ll start by ranking these according to how important I think they are, at least for my purposes. Severity &#8211; mandatory: no logging system should be without [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://mcherm.com/permalinks/1/logging-apis-feature-list">my previous post</a>, I defined a number of different features that logging libraries could have. This time, I will evaluate some Java libraries based on those features. <span id="more-381"></span>I&#8217;ll start by ranking these according to how important I think they are, at least for my purposes.</p>
<ol>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#severity">Severity</a> &#8211; <em>mandatory</em>: no logging system should be without this</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#tree">Tree of Log Topics</a> &#8211; <em>mandatory</em>: no logging system should be without this</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#configurable">Configurable</a> &#8211; <em>vital</em>: configuring log levels at runtime is something we use often</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#rotating">Rotating Log Files</a> &#8211; <em>vital</em>: our log files would be too big for the OS without this</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#lineformat">Configurable Log Line Format</a> &#8211; <em>vital</em>: it is unlikely that the off-the-shelf fields would be the ones we want to use</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#exceptions">Logging of Exceptions</a> &#8211; <em>vital</em>: getting stack traces from logs is one of our most productive debugging techniques</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#delayed">Delayed String Construction</a> &#8211; <em>vital</em>: I consider this to be a very undervalued feature. Without it, software <em>will</em> be slowed significantly and also will be less readable.</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#locations">Log to Multiple Locations</a> &#8211; <em>desirable</em>: sometimes this is handy. We used to use it, but at the moment we don&#8217;t.</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#directed">Configure where Logs are Directed</a> &#8211; <em>desirable</em>: this, too, we have used in the past but are not using right now.</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#standard">Standard or Widely Used</a> &#8211; <em>desirable</em>: in the Java world, only Log4J (the most widely used library) and java.util logging (which is in the standard library) qualify.</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list">Unique Messages</a> &#8211; <em>desirable</em>: as suggested in the comments, an ability to identify each log usage uniquely (probably by source file and line number) would be handy.</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#fallbacks">Sensible Fallbacks</a> &#8211; <em>desirable</em>: it&#8217;s nice that the library works OK when your config fails, because it helps in debugging the config problem.</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#queued">Logging Queued to Avoid Delays</a> &#8211; <em>nice extra</em>: although this seems like it would be a very useful feature, I do not know of any serious production logging tools that implement it.</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#threadlocal">Threadlocal Context Data</a> &#8211; <em>nice extra</em>: theoretically, this is extremely useful. In practice, people usually live without it and find the data by reading back through the log.</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#filtering">Log Filtering</a> &#8211; <em>nice extra</em>: if this were easy, we would use it to filter out SSNs and passwords from our logs, but we can live without it.</li>
<li><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#internationalization">Internationalization</a> &#8211; <em>undesirable</em>: I recommend against ever using this. Use logging ONLY for developers, NOT for end users; all developers should speak the same language.</li>
</ol>
<p>Next, I will assemble a list of different logging libraries in Java to be evaluated.</p>
<ul>
<li><strong><a href="http://logging.apache.org/log4j/">Log4J</a></strong>: Log4J by Apache is the most widely used logging framework in Java.</li>
<li><strong><a href="http://java.sun.com/javase/6/docs/api/index.html?java/util/logging/package-summary.html">Java util logging</a></strong>: Rather than adopting Log4J as the standard for logging in Java, Sun chose to clone it, creating something almost-but-not-quite the same as a standard Java library.</li>
<li><strong><a href="http://commons.apache.org/logging/">Commons Logging/Log4J</a></strong>: Commons Logging is a wrapper from Apache which is designed for use in libraries. It simply delegates to an underlying logging framework. The purpose is so a library can be configured to use the same logging system as the rest of the application. I will consider Commons Logging backed by Log4J.</li>
<li><strong><a href="http://www.slf4j.org/">SLF4J/Log4J</a></strong>: SLF4J is a project begun by the original author of Log4J. It is intended to provide a better API for calling into a logging framework, and it can connect to different logging back ends. I will consider SLF4J backed by Log4J.</li>
</ul>
<p>There are others (<a href="http://logback.qos.ch/">logback</a>, <a href="http://jlo.jzonic.org/">jLo</a>, and <a href="http://www.java-logging.com/">many others</a>), but I am fairly confident that one of these 4 will be the final choice, so at this point I am going to perform a full analysis just on these 4.</p>
<table border="1">
<tr align="center">
<th>Feature</th>
<th>Importance</th>
<th>Log4J</th>
<th>JavaUtil</th>
<th>Commons<br/>/Log4J</th>
<th>SLF4J/Log4J</th>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#severity">Severity</a></td>
<td>mandatory</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#tree">Tree of Log Topics</a></td>
<td>mandatory</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#configurable">Configurable</a></td>
<td>vital</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#rotating">Rotating Log Files</a></td>
<td>vital</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#lineformat">Configurable Log Line Format</a></td>
<td>vital</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#exceptions">Logging of Exceptions</a></td>
<td>vital</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#delayed">Delayed String Construction</a></td>
<td>vital</td>
<td>No</td>
<td>No</td>
<td>No</td>
<td>Yes</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#locations">Log to Multiple Locations</a></td>
<td>desirable</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#directed">Configure where Logs are Directed</a></td>
<td>desirable</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#standard">Standard or Widely Used</a></td>
<td>desirable</td>
<td>Yes</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list">Unique Messages</a></td>
<td>desirable</td>
<td>No</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#fallbacks">Sensible Fallbacks</a></td>
<td>desirable</td>
<td>Meh</td>
<td>Meh</td>
<td>Meh</td>
<td>Meh</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#queued">Logging Queued to Avoid Delays</a></td>
<td>nice extra</td>
<td>No</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#threadlocal">Threadlocal Context Data</a></td>
<td>nice extra</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>Yes</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#filtering">Log Filtering</a></td>
<td>nice extra</td>
<td>No</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
<tr align="center">
<td align="left"><a href="http://mcherm.com/permalinks/1/logging-apis-feature-list#internationalization">Internationalization</a></td>
<td>undesirable</td>
<td>No</td>
<td>No</td>
<td>No</td>
<td>No</td>
</tr>
</table>
<hr/>
<p><a href="http://www.flickr.com/photos/melodysk/3035450347/"><img src="http://mcherm.com/blog/wp-content/uploads/2010/01/woodpile.jpg" alt="Woodpile" title="woodpile" width="500" height="333" class="aligncenter size-full wp-image-434" /></a></p>
<p>So, after considering all of these options, I have concluded that my preference is to use SLF4J as an interface, with the implementation from Log4J. Most of the options I have considered have more or less the same features. The deciding factors are (1) threadlocal storage (MDC) is useful and not present in java.util.logging, and (2) the API for SLF4J provides an elegant way to delay string construction, which is rather important for performance.</p>
<p>Conveniently, SLF4J publishes a tool for automatically converting existing java.util.logging and Log4J code to SLF4J, so the conversion should be relatively painless.</p>
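<p>For readers who have not seen delayed string construction in action, here is a minimal sketch of the idea. SLF4J is a Java API; this sketch swaps in Python&#8217;s standard <code>logging</code> module, which offers the same style of call (a format string plus separate arguments), so it is an analogy and not SLF4J code. The class <code>Expensive</code> and the logger name <code>demo.delayed</code> are hypothetical, made up for the illustration.</p>

```python
import io
import logging


class Expensive:
    """Hypothetical object whose string form is costly to build."""

    def __init__(self):
        self.str_calls = 0    # counts how often the string is really built

    def __str__(self):
        self.str_calls += 1
        return 'expensive description'


stream = io.StringIO()
logger = logging.getLogger('demo.delayed')
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False      # keep the demo output in our own stream only

obj = Expensive()

# DEBUG is disabled, so the message is never formatted and
# obj.__str__ is never called.
logger.debug('state is %s', obj)
calls_when_disabled = obj.str_calls

# INFO is enabled, so the string is built once, when the handler emits it.
logger.info('state is %s', obj)
calls_when_enabled = obj.str_calls
```

<p>The point of the design is that the caller hands over the pieces and the logging framework decides whether the string is ever assembled, so disabled log statements cost very little and need no surrounding &#8220;is debug enabled&#8221; check.</p>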
]]></content:encoded>
			<wfw:commentRss>http://mcherm.com/permalinks/1/logging-apis-evaluating-options/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>


