Code archeology

April 13th, 2009

After a few years of service, the time had come for us to refactor one of the components of our system. Like many components, it had never been fully specified and had evolved over time, so I needed to do some preliminary study of the code to scope the amount of work required and understand more precisely what it was doing before starting the actual redevelopment.

This was a very interesting exercise which made me feel at times like I was doing the job of an archeologist.

When I started looking at the code, I wasn’t quite sure what to expect.


Hieroglyphs

Most of it had not been touched for a while and the original developer was long gone. Because no one actually understood how it all worked, changing that code had become somewhat taboo. Other developers would not dare approach it, as if it were cursed; this went beyond the “if it ain’t broke, don’t fix it” dogma, there was a sense that any modification was more likely to break something than to do any good. And I guess it did not help that some of us remembered that these parts were the result of long hours of what appeared to be tedious coding.

Other parts had been modified more recently by various developers as new features were needed.

What surprised me while navigating through the code was that I could identify not only who wrote it without looking at the author in CVS, but also the level of expertise of the programmer and the conditions under which the code was written.

For a start, I could easily identify the golden age period, when several parts of the code were beautifully organised. The author was confident in his skills, knew what he was doing, and had enough time to do it well. The different pieces fitted together within a well-thought-out architecture. The code was nicely commented, the coding style was consistent, errors were cleverly caught, functions were purposeful, names meaningful, etc… This was the work of the original developer, a few months into his work on the component, as the CVS confirmed. It was like reading hieroglyphs on the walls of ancient Egypt’s temples.

I thought this was going to make my work a lot easier; maybe we could even reuse some of the algorithms.

But I also found what appeared to have been developed in a period of crisis, something like a looming deadline. It was from the same author, but there were very few comments, and some of those were not relevant and looked suspiciously like they belonged to code that had been copied there and adapted in a hurry. There were TODOs and FIXMEs scattered here and there. Some variables and functions had been left unused, and large portions of the code were simply long sequences of instructions that had not been organised; errors were caught simply to prevent them from propagating. This was obviously rushed code, and it did not seem to have been pleasant programming.

Now, this looked like I would have to pay back some technical debt… We would have to work from the few clues lying around to know what was worth refactoring…

Other parts of the code seemed more experimental: the variable names were not meaningful and function names did not correspond to what they were supposed to do; several chunks included code that was commented out; and even though the code was profusely commented, it seemed like the comments were there more for the author to make sense of what he was doing than to share knowledge about the code. This could have been the work of a developer who did not fully understand what he needed to build; maybe the programmer was not given clear specifications. The inconsistency in the coding style could be the result of a programmer learning, or of several programmers participating in the code.

Overall, the exercise told me the story of the component. I knew not only what worked, but what was done, in what order and why. Even if it was not all properly documented, the code was leaving clues, just as the shape of a clay pot made some 2,000 years ago helps archeologists make sense of the events of the time. In the end, we felt better equipped to refactor it effectively.
But it also reminded us of the value of not accumulating too much technical debt and that good comments are gold.

The blind men and the elephant

March 31st, 2009

There is a classic Indian tale about blind men and an elephant:

Once upon a time there was a certain raja who called to his servant and said, ‘Come, good fellow, go and gather together in one place all the men of Savatthi who were born blind… and show them an elephant.’ ‘Very good, sire,’ replied the servant, and he did as he was told. He said to the blind men assembled there, ‘Here is an elephant,’ and to one man he presented the head of the elephant, to another its ears, to another a tusk, to another the trunk, the foot, back, tail, and tuft of the tail, saying to each one that that was the elephant.

When the blind men had felt the elephant, the raja went to each of them and said to each, ‘Well, blind man, have you seen the elephant? Tell me, what sort of thing is an elephant?’

Thereupon the men who were presented with the head answered, ‘Sire, an elephant is like a pot.’ And the men who had observed the ear replied, ‘An elephant is like a winnowing basket.’ Those who had been presented with a tusk said it was a ploughshare. Those who knew only the trunk said it was a plough; others said the body was a granary; the foot, a pillar; the back, a mortar; the tail, a pestle; the tuft of the tail, a brush.

Then they began to quarrel, shouting, ‘Yes it is!’ ‘No, it is not!’ ‘An elephant is not that!’ ‘Yes, it’s like that!’ and so on, till they came to blows over the matter.

This is the Buddhist version — there are other versions. Buddha used this parable to make a philosophical point about the relativity of one’s perspective.

Elephantine perfection

When we develop software, it is important that the various members of the team do not feel like blind men examining an elephant. While different perspectives are essential to avoid a dogmatic approach and foster creative thinking, everyone should still agree on what is actually being developed. Reconciling people’s perspectives around a common project is a good way to avoid conflict and ensure that everyone is on the same wavelength.

It is the role of software architects not only to bring their vision to the rest of the team, but also to take everyone’s perspective into account and include it in a common project.

Of the importance of choosing the right word

March 27th, 2009

Nearly a century ago, in his book The Problems of Philosophy, Bertrand Russell commented on the work of Bishop Berkeley:

It is often said, as though it were a self-evident truism, that we cannot know that anything exists which we do not know. [...]

And he made an important remark:

Again, it is by no means a truism, and is in fact false, that we cannot know that anything exists which we do not know. The word ‘know’ is here used in two different senses. (1) In its first use it is applicable to the sort of knowledge which is opposed to error, the sense in which what we know is true, the sense which applies to our beliefs and convictions, i.e. to what are called judgements. In this sense of the word we know that something is the case. This sort of knowledge may be described as knowledge of truths. (2) In the second use of the word ‘know’ above, the word applies to our knowledge of things, which we may call acquaintance. This is the sense in which we know sense-data. (The distinction involved is roughly that between savoir and connaître in French, or between wissen and kennen in German.)

The Problems of Philosophy - by Bertrand Russell

Bertrand Russell had realised that the problem was not so much an epistemological problem as a problem of terminology that led to a fallacious argument. Bishop Berkeley, whom he was criticising, had failed to notice that the verb ‘to know’ was equivocal, and had used both meanings of the word alternately — and this caused endless discussions amongst philosophers, especially the English ones.


Although it may appear obvious once it is pointed out, this was not easy to notice. Continental philosophers – by using Latin, French or German – had circumvented the argument because they already had a different word for each meaning, and could therefore readily make the distinction between the two.

On our projects, we choose terms for our features, classes, functions and other concepts because we think they reflect what we intend them to be. But sometimes they end up being quite different from the original intent. Even if the terms seem appropriate at the time, it is important to ensure that their meaning actually reflects what they are used for, especially when designing an API or a key piece of functionality of the system.

If we do not, we run the risk of inducing confusion – not only amongst users and other developers, but amongst ourselves.

In one instance recently, we became aware that something was wrong with one of the features of our software: we could not describe it properly. We could not quite put our finger on why, but we were having frustrating discussions because its definition kept shifting. This was because we were not actually talking about the same thing.

We thought we were, but as it happens, that feature had evolved over time and had started to be used in different contexts, where it did slightly different things — and therefore the term for it had started to mean different things: it had become equivocal.

Part of the problem came from the fact that we had not properly defined that feature initially. Having an umbrella term was convenient at first. In fact, it was almost too good, because it always seemed to express what we wanted regardless of the context. But the more generic the term, the more likely it is to acquire different meanings over time. The fact that it was so fitting and so generic regardless of the context should have given us a clue that we needed to specify it better.

Through usage, words evolve over time, acquiring new meanings and sometimes discarding old ones. My trusted dictionary provides about a dozen different meanings for ‘know’ as a verb; one of these meanings, “to have sexual intercourse with”, has clearly fallen into disuse.

The meanings of words are bound to evolve because language is a social institution. The terms that we use are simply labels that we attach to things; when these things change, or the meaning of a term changes, we expose ourselves to confusion and misunderstanding, so it is better to anticipate it.

Carefully choosing the word that describes a feature, a function or a piece of code is time saved later. It helps bring clarity and avoid pointless discussions caused by equivocation. If philosophers could be fooled, I don’t see why software engineers would not be fooled as well.

A good dictionary should be part of every developer’s toolkit and a quick look at the thesaurus is not a waste of time. It will make life easier for other developers working on the software, it will make it easier to communicate with managers, and it will make it easier for the sales and marketing folks to explain the software.

Relax and create

March 1st, 2009

It occurred to me that the nature of my job requires me to be alternately productive and creative, but I realised that it is often practically impossible to be both simultaneously. This reminded me of an article I read a few years ago in a scientific publication on how our stress level affects our creativity and productivity differently.

Creativity vs productivity

In a nutshell, it argued that we reach our creative peak only at a moderate level of stress. This is because we need enough stress to focus on a particular topic or problem, but too much stress eliminates creative solutions by narrowing our focus. In a complete state of relaxation, our mind wanders, so we do not create; we procrastinate. And when stress levels rise past a given point, our brain can no longer take into account the subtleties that permit creative thinking, or analyse the problem from different angles, so our creativity declines.

On the other hand, when the stress level increases, our focus increases, and we are therefore able to produce output-driven solutions more efficiently. If I recall correctly, the process at work was somewhat similar to what happens when we are in a state of fear: the brain responds as efficiently as possible by taking in the minimum amount of information needed to yield solutions quickly. If stress levels rise even further, the brain enters a state of panic where we are unable to produce anything of much use, either productively or creatively.

Another non-negligible aspect was that our brains can only make effective use of a given level of stress for a period of time that is inversely proportional to that stress level. So even if we are able to be very productive thanks to an increased stress level, we will certainly not be able to sustain that level of productivity for very long: someone who is chronically stressed is neither productive nor creative.

Despite the fact that ‘creativity’ and ‘productivity’ can be very vague terms without context, this certainly rings true with what I do.

In my experience, there is nothing like a looming deadline to make you churn out code furiously and feel like the conductor of an orchestra performing Wagner’s ‘Ride of the Valkyries’. There is a sense of heightened focus and high efficiency. But I have rarely been at my most creative at these times, and I can easily look back and think I could have done things differently. These would also typically be the times when I would accumulate technical debt… and I would feel drained after a few hours of intense coding.

I also can’t help but notice that some of the most inventive software solutions I designed came when I had a bit more time on my hands, so that I could approach the problem from various angles and explore a few ideas.

Our jobs as software engineers, architects, developers and programmers require us to come up with creative solutions when we design, and to actually produce something efficiently when we implement. So if there is some truth in this, it means that we need to perform a balancing act between stress and relaxation. Alternating between the two allows us to be more effective at each given task. Allowing ourselves to relax is essential not only for producing creative solutions, but also for being more efficient at getting things done when deadlines are coming and objectives need to be met.

Note that, unfortunately, I have lost all references to the original article; it may have had something to do with cortisol and been the result of observing a few rats in a lab… but stress was to be understood as a measurable physiological characteristic.

Returning to the cave

February 13th, 2009

In The Republic, Plato imagines a group of people locked in a cave, forced to make sense of the world from the shadows projected on the wall in front of them. For them, there is no other reality, because they are unable to see what creates the shadows outside the cave. But someone outside the cave would be able to see the true nature of things and know that the shadows are mere projections of other objects.

Software users are in the same position as the prisoners in the cave — they are stuck behind the screen with the user interface. Only developers are able to appreciate the true nature of the software. Users build their own reality of the software based on the interface they see and interact with, blissfully unaware of the lines of code, the algorithms, functions and classes that bring it to life.

Recently, I was asked to revisit a piece of software to which I had contributed. I had not looked at the code for a while; I had forgotten the true reality. But in a way, that was a good thing: for a while, as I slowly rediscovered the functions and the architecture, I was able to hold both perspectives.

In Plato’s story, prisoners who have been released and have stepped outside eventually cease to notice the shadows because they only see the true nature of things. This is exactly what happens to senior developers, and it is a problem: in the business of software, they often take on the role of the shadow puppeteers, yet they are unable to imagine the world that they create for their users.

In the end, Socrates wonders whether released prisoners who decide to return to the cave are still any good at making sense of the shadows; he assumes that, blinded by the sun, they may no longer be accustomed to the darkness. I could tell him that a lot of developers struggle to understand the user’s perspective because they have been blinded by the code.

URI encoding in location bar, different strategies

February 8th, 2009

In my previous post, I mentioned that some browsers escape non-ASCII characters in URIs in an inconsistent manner.

I started to look a little deeper into the issue and tried a few more characters in the location bar to see what strategies the browsers adopted.

Default character set

First, when I tried characters which are defined in Windows-1252 but not in ISO-8859-1, I found that both Firefox and IE actually use my machine’s default encoding, Windows-1252, and not simply ISO-8859-1, to replace characters in the query string.

The euro symbol € (U+20AC) was replaced by %80 (Windows-1252) whereas it should be %E2%82%AC (utf-8); the € symbol is not part of ISO-8859-1. I had similar results with the mdash — (U+2014): %97 instead of %E2%80%94. The mdash is also part of Windows-1252 but not of ISO-8859-1.

Of course, this means that these escaped byte sequences will make little sense for a Linux or MacOS server.
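
This behaviour is easy to reproduce outside the browser. Below is a minimal Java sketch of my own (the class and method names are mine, purely for illustration) that percent-encodes the bytes of a string under a given character set:

  import java.nio.charset.Charset;
  import java.nio.charset.StandardCharsets;

  public class EuroEscaping {
      // Percent-encode every byte of the string under the given charset.
      static String percentEncode(String s, Charset cs) {
          StringBuilder sb = new StringBuilder();
          for (byte b : s.getBytes(cs)) {
              sb.append(String.format("%%%02X", b & 0xFF));
          }
          return sb.toString();
      }

      public static void main(String[] args) {
          Charset cp1252 = Charset.forName("windows-1252");
          System.out.println(percentEncode("€", cp1252));                 // %80
          System.out.println(percentEncode("€", StandardCharsets.UTF_8)); // %E2%82%AC
      }
  }

The first output is the Windows-1252 escape that the browsers sent from my machine; the second is what a utf-8 server would expect.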

When outside the range

Then I tried to see what happened with characters that were not part of Windows-1252, for example the Greek small letter alpha α (U+03B1). Here, IE and Firefox started to differ. Firefox opted for utf-8 this time, so I got %CE%B1 as expected. IE assumed that I must have wanted a latin small letter a, and replaced the α (alpha) with a. With the Greek small letter lambda λ (U+03BB), Firefox used utf-8 again and gave me %CE%BB; IE, unable to provide a suitable replacement, replaced my character with a question mark ?.

At least Firefox provides a way for me to access sites using a different character set. IE, on the other hand, assumes that I would only type the URLs of sites which use the same character set as mine. So if I want to type the URL of a site which uses, say, Greek characters, it probably won’t work.

Mixing characters

I decided to test the strategy adopted when mixing Unicode characters that appear only in different ISO-8859-x character sets. I used ‘é’ (U+00E9) from ISO-8859-1 and ‘α’ (U+03B1) from ISO-8859-7.

Firefox used utf-8 to transform that part of the query string and replaced it with %C3%A9%CE%B1. However, IE persisted with Windows-1252 and replaced it with %E9a.

I am glad that Firefox realised that it required a universal character set to correctly represent the URL. But this means that the encoding of characters in the query string depends on the other characters: without the alpha, my e with an acute accent is encoded using Windows-1252. IE is more consistent, but makes it impossible for me to type a URL that contains characters from another character set.
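
A server confronted with these inconsistent escapes can only guess. One defensive strategy (just a sketch of my own, not what any particular server actually does) is to decode the percent-decoded bytes strictly as utf-8 first, and fall back on Windows-1252 when the sequence is not valid utf-8:

  import java.nio.ByteBuffer;
  import java.nio.charset.CharacterCodingException;
  import java.nio.charset.Charset;
  import java.nio.charset.CharsetDecoder;
  import java.nio.charset.CodingErrorAction;
  import java.nio.charset.StandardCharsets;

  public class QueryDecoding {
      // Try strict utf-8 first; if the bytes are not valid utf-8,
      // assume they came from a Windows-1252 browser instead.
      static String guessDecode(byte[] bytes) {
          CharsetDecoder utf8 = StandardCharsets.UTF_8.newDecoder()
                  .onMalformedInput(CodingErrorAction.REPORT)
                  .onUnmappableCharacter(CodingErrorAction.REPORT);
          try {
              return utf8.decode(ByteBuffer.wrap(bytes)).toString();
          } catch (CharacterCodingException e) {
              return new String(bytes, Charset.forName("windows-1252"));
          }
      }

      public static void main(String[] args) {
          byte[] ie = {(byte) 0xE9};              // IE’s %E9
          byte[] ff = {(byte) 0xC3, (byte) 0xA9}; // Firefox’s %C3%A9
          System.out.println(guessDecode(ie));    // é, via the Windows-1252 fallback
          System.out.println(guessDecode(ff));    // é, decoded as valid utf-8
      }
  }

This heuristic fails only in the rare cases where a Windows-1252 byte sequence happens to be valid utf-8 as well.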

Summary

  • Firefox assumes the default character set, and if it cannot use it, it falls back on utf-8;
  • IE always uses the default character set, and therefore tries to find replacement characters; if it cannot, it gives up;
  • Google Chrome uses utf-8 only.

Additional Info

I tested this on Firefox 3.0.6, Internet Explorer 7.0.6 and Google Chrome 1.0.

Below are the characters that I have used and their byte representation in various character encodings.

Char   Unicode   ISO-8859-1   CP-1252   ISO-8859-7   UTF-8
é      U+00E9    E9           E9        –            C3 A9
€      U+20AC    –            80        A4           E2 82 AC
—      U+2014    –            97        –            E2 80 94
α      U+03B1    –            –         E1           CE B1
λ      U+03BB    –            –         EB           CE BB

Below are the results of the characters I tested in the query string of a URI typed in the location bar:

Char(s)   IE     Firefox        Chrome
é         %E9    %E9            %C3%A9
€         %80    %80            %E2%82%AC
—         %97    %97            %E2%80%94
α         a      %CE%B1         %CE%B1
λ         ?      %CE%BB         %CE%BB
éα        %E9a   %C3%A9%CE%B1   %C3%A9%CE%B1

A puzzling URI encoding affair

February 6th, 2009

While working on a URI template/pattern Java library, I started looking into the details of URI encoding.

I realised that browsers make puzzling decisions when it comes to encoding URIs that the user types in the location bar: they seem to use different encodings for the different parts of the URI.

Consider the following URL:

  http://acme.com/café?variété=#caféine

This isn’t a valid URI, because the URI generic syntax (RFC 3986) requires all bytes that are not 7-bit ASCII to be percent-encoded. But it is perfectly legitimate from the user’s perspective: users do not know about RFC 3986, and they should not have to, so the browser should do the work on their behalf.

The W3C has indicated how non-ASCII characters in URI attribute values should be handled in an appendix of the HTML specifications:

Although URIs do not contain non-ASCII values (see [URI], section 2.1) authors sometimes specify them in attribute values expecting URIs (i.e., defined with %URI; in the DTD). For instance, the following href value is illegal:

<A href="http://foo.org/Håkon">...</A>

We recommend that user agents adopt the following convention for handling non-ASCII characters in such cases:

  1. Represent each character in UTF-8 (see [RFC2279]) as one or more bytes.
  2. Escape these bytes with the URI escaping mechanism (i.e., by converting each byte to %HH, where HH is the hexadecimal notation of the byte value).

This procedure results in a syntactically legal URI (as defined in [RFC1738], section 2.2 or [RFC2141], section 2) that is independent of the character encoding to which the HTML document carrying the URI may have been transcoded.
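
The convention is straightforward to implement. Here is a minimal sketch of my own in Java; note that it only escapes the non-ASCII bytes and ignores the reserved ASCII characters that RFC 3986 also requires to be escaped:

  import java.nio.charset.StandardCharsets;

  public class W3CEscaping {
      // Represent each character in utf-8, then escape each
      // non-ASCII byte as %HH per the convention quoted above.
      static String escapeNonAscii(String s) {
          StringBuilder sb = new StringBuilder();
          for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
              int v = b & 0xFF;
              if (v < 0x80) {
                  sb.append((char) v);                   // plain ASCII: kept as-is
              } else {
                  sb.append(String.format("%%%02X", v)); // non-ASCII byte: %HH
              }
          }
          return sb.toString();
      }

      public static void main(String[] args) {
          System.out.println(escapeNonAscii("http://foo.org/Håkon"));
          // prints http://foo.org/H%C3%A5kon
      }
  }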

I thought this rule would similarly apply to addresses typed in the location bar. But both Firefox and Internet Explorer (and apparently Opera too) will generate the following URI in the HTTP request:

  http://localhost/caf%C3%A9?vari%E9t%E9=#caf%C3%A9ine

The ‘é’ (U+00E9) has been escaped differently:

  • The path and fragment parts contain a percent-encoded version of the utf-8 byte encoding: C3 A9.
  • The query string, however, was encoded using ISO-8859-1 (or a superset): E9.

This leaves us with a URI that uses two different character encodings — what a mess!

Google Chrome however produces the URI that the W3C prescribes by using UTF-8 for all parts of the URL.

Strato-cumulus

January 28th, 2009

It was only after reading Jason Scott’s F*ck the Cloud that I realised that my two previous posts were in fact touching upon the same subject from two different angles. Though I do not necessarily agree with all of Jason’s points, because his definition of “The Cloud” seems a little vague, he makes several good ones.

I have to admit I am still not sure what “the cloud” is — people seem to have many different views, a bit like with Web2.0. I note that the Wikipedia entry for cloud computing is move-protected due to vandalism, and that a large number of techies prefer surrounding the term with inverted commas.

To clarify this post, I will refer to “the cloud” as the collection of software and hardware services which, by using the Internet as the underlying infrastructure, enable data to live outside the home/office space. It therefore relies on SaaS, virtualisation and Web2.0 to make it happen. This definition includes GMail, blog platforms and social networks, as well as Amazon EC2. To me, the term is simply a convenient way to refer to the current trend in Web development; even if, given the lack of integration and interoperability, we should really use the plural form…

In my post on Google’s approach to business rationalisation, I was looking at the service provider’s end, wondering about the effect of shutting down Web services for a company which is actively promoting cloud computing. Because companies like Google and Amazon are at the steering wheel, people are watching their moves, especially service providers in search of a good business model. Freemium might be the way to go because it allows a service to reach critical mass, but I am sure that other models will emerge.

What I was implying was that providers are not only selling productivity, they are selling trust as well. The issue of trust isn’t new, but when you have control over the software and hardware, it is easier to take responsibility for the data. When users lose direct control of their data, trust becomes vital. After all, there could be thorny legal issues regarding data retention, liability, etc… At the moment, providers take no responsibility (read the fine print!), which makes it theoretically risky to use “the cloud” for anything that is mission critical or sensitive.

But people are bad risk assessors, and if “the cloud” solves their problem, they will embrace it. As Dare Obasanjo mentioned on his blog, given the massive adoption, trust might already be an issue that is ready to evaporate. To follow on his example, it took a few decades for people to realise that seat belts and airbags might be good ideas, and drink driving and speeding not such good ones… The fatality rate did not deter people from using cars: gradually, manufacturers made cars safer while traffic authorities enforced rules and educated people.

In my other post, I mentioned an article published in a French magazine that reconstructed the life of an Internet user from all the information he had left on the Internet. What I found interesting was that people were putting so much data, and therefore so much faith, in “the cloud”. Of course, in the case of social applications such as Facebook or Twitter, the data is generally trivial and can hardly be considered mission critical or sensitive — although a lot of people would not appreciate losing their lists of friends, photos, business contacts, etc…

I was pointing out that the coming generation, not trailing the same history of problems as the previous ones, is making anyone 30+ sound grumpy — in fact, most of the criticism was coming from experienced professionals. There was a time when people would print everything they typed because their hard drive was not safe enough; nowadays, they say that only their hard drive (for personal users) or their data center (for businesses) is safe enough, and they see “the cloud” as a big fluffy thing that will disappear. Maybe they would appreciate my old dot matrix printer.

My guess is that users will continue to take advantage of “the cloud”, and that they will learn to decide what data is important. Businesses will learn too, and because they are better risk assessors, they will pay a premium for better guarantees and service when needed. Providers will probably start offering better interoperability, and continue to adapt their services to a growing demand.

Trust (or lack thereof) did not affect adoption, but risk awareness eventually changed the behaviour of users and manufacturers. In that regard, what happened with the car industry will happen with the Web.

“The cloud” is no silver bullet; we just need to understand better when it is appropriate to use it. It will gradually disappear though, but only because it is a silly term.

The Life Of – related articles

January 27th, 2009

A few articles worth mentioning relating to my previous post “The Life Of”.

From the New Scientist:

From ReadWriteWeb:

Interestingly, I just saw this:

I might add a few more later…

Google’s new year clean-up

January 21st, 2009

There have been many reactions to Google’s announcement that it will cut back on services and shelve several of its pet projects. I have to admit that I am actually not familiar with several of the services that are being shut down, and there are already lots of well-informed comments on each individual project, so this is more of a general comment.

I understand the need for a company to focus on its core assets and shut down products which bring little revenue, especially during a downturn. In fact, I wonder why it took them so long to deal with Google Video, given that YouTube is a lot more popular and offers a similar service. It was a redundant service, so I guess they could have merged it or removed it a lot earlier — they didn’t have to wait for the credit crunch.

I am sure that the same could be said for other services.

But that did not seem to be the case for several other services that were given the knife. A lot of bloggers felt that projects like Jaiku or Google Notebook had a lot of potential and were never given a real chance, which led to speculation about Google’s intentions in the first place — was it just a way to acquire new talented development teams? If that is the case, it confirms that Google is adopting conventional strategies. Fine.

But I wonder if Google may also be shooting themselves in the foot…

Rationalising a network of services

A few years ago, I became really interested in small-world networks; one example stuck in my mind because it was counter-intuitive. In one of the books, there was a chapter about transport optimisation, explaining how having some lines operating at a loss enabled other lines to be more profitable, therefore generating a global profit.

There were several examples of transport networks eventually collapsing because, instead of evaluating the network as a whole, lines were evaluated individually and independently. When non-profitable lines were chopped off, or the frequency of service was reduced to reflect attendance, users would start looking at other options. But they would also stop using other lines, which would then become less profitable. As the rationalisation process continued, other lines would be chopped off until only a handful of profitable lines survived, or none at all. Globally though, the network would not generate any more revenue, but it would provide a lot less service, and not necessarily more efficiently.

This was because the structural properties of the network ensured that the whole structure was sustainable: the non-profitable lines had not been valued properly. The network had to be rationalised as a whole to be made more efficient, not assessed purely on individual lines. Management had failed to understand that these services were interdependent and what it meant to remove them. Commuters did not want to go just from one station to another; they wanted to go from one place to another, and would choose the means of transport that best suited their needs.

What does that have to do with Google?

I certainly don’t think that Google is going to shrink anytime soon due to some drastic cost-cutting — they are smarter than that, and I am sure that they consider each project carefully. And of course, they have every right to shut down any service they please; they run a business after all, and most of these services are provided for free.

But I think they may be neglecting the interdependence between their products and the relationship with their users. People don’t want to use just that particular service, they are trying to find solutions to their problems. Google uses the Web to provide solutions and I believe that the interconnected nature of the Web should make them consider business rationalisation differently.

One reason why we use Google services instead of others is not necessarily that they are better, but that we feel Google is more reliable. A start-up could go bust, and since I don’t want to lose my data or change my habits, I will be more hesitant before committing to one of its services. As Pisani mentioned on Transnets (in French), there is a moral contract between users and Google: by interrupting some web services, we are reminded that maybe we should not entrust the Internet giant with all our online documents, emails, blogs, videos, feeds, applications, etc…

We are in an interesting transition period where the Web is supplanting the Operating System as a development platform. By shutting down these services even if they operate at a loss, Google is giving a cold shower to those who believe in moving applications to the web and were counting on Google to spearhead the move. The trend towards web-based applications is not going to stop, but we now have a reason to think twice before using Google’s services and their APIs.

I am still glad that there are companies out there like Google to break new ground, but their recent house cleaning is a good reminder that if we follow, we may also need to backtrack.