Posts Tagged ‘Software’

Code archeology

Monday, April 13th, 2009

After a few years of service, the time had come for us to refactor one of the components of our system. Like many components, it had never been fully specified and had evolved over time, so I needed to study the code first, both to scope the amount of work needed and to know more precisely what it was doing before starting the actual redevelopment.

This was a very interesting exercise which made me feel at times like I was doing the job of an archeologist.

When I started looking at the code, I wasn’t quite sure what to expect.

 

Hieroglyphs

Most of it had not been touched for a while and the original developer had long gone. Because no one actually understood how it all worked, changing that code had become somewhat taboo. Other developers would not dare approach it, as if it were cursed. This went beyond the “if it ain’t broke, don’t fix it” dogma: there was a sense that any modification was more likely to break something than to do any good. I guess it did not help that some of us remembered that these parts were the result of long hours of what appeared to be tedious coding.

Other parts had been modified more recently by various developers when new features were needed.

What surprised me while navigating through the code is that I could identify not only who wrote it, without looking at the author in CVS, but also the level of expertise of the programmer and the conditions under which the code was written.

For a start, I could easily identify the golden age, when several parts of the code were beautifully organised. The author was confident in his skills, knew what he was doing, and had enough time to do it well. The different pieces fitted together within a well-thought-out architecture. The code was nicely commented, the coding style was consistent, errors were cleverly caught, functions were purposeful, names meaningful, etc. This was the work of the original developer, a few months after he had started working on the component, as CVS confirmed. This was like reading hieroglyphs on the walls of ancient Egypt’s temples.

I thought this was going to make my work a lot easier; maybe we could even reuse some of the algorithms.

But I also found code that appeared to have been developed in a period of crisis, something like a looming deadline. It was from the same author, but there were very few comments, and some of those comments were not relevant to the code around them, which looked suspiciously like it had been copied there and adapted in a hurry. There were TODOs and FIXMEs scattered here and there. Some variables and functions had been left unused, and large portions of the code were simply long sequences of instructions that had not been organised. Errors were caught simply to prevent them from propagating. This was obviously rushed code, and it did not seem to have been pleasant to write.

Now, this looked like I would have to pay back some technical debt… We would have to work from the few clues lying around to know what was worth refactoring…

Other parts of the code seemed more experimental. The variable names were not meaningful and the function names did not correspond to what the functions actually did. Several chunks included code that was commented out, and even though the code was profusely commented, it seemed like the comments were there more for the author to make sense of what he was doing than to share knowledge about the code. This could have been the work of a developer who did not fully understand what he needed to build; maybe the programmer had not been given clear specifications. The inconsistency in the coding style could be the result of a programmer still learning, or of several programmers contributing to the code.

Overall, the exercise told me the story of the component. I knew not only what worked, but what was done, in what order and why. Even if it was not all properly documented, the code left clues, just like the shape of a clay jar made some 2000 years ago helps archeologists make sense of the events of its time. In the end, we felt like we were better equipped to know how to refactor it effectively.
But it also reminded us of the value of not accumulating too much technical debt and that good comments are gold.

Of the importance of choosing the right word

Friday, March 27th, 2009

Nearly a century ago, in his book The Problems of Philosophy, Bertrand Russell commented on the work of Bishop Berkeley:

It is often said, as though it were a self-evident truism, that we cannot know that anything exists which we do not know. [...]

And he made an important remark:

Again, it is by no means a truism, and is in fact false, that we cannot know that anything exists which we do not know. The word ‘know’ is here used in two different senses. (1) In its first use it is applicable to the sort of knowledge which is opposed to error, the sense in which what we know is true, the sense which applies to our beliefs and convictions, i.e. to what are called judgements. In this sense of the word we know that something is the case. This sort of knowledge may be described as knowledge of truths. (2) In the second use of the word ‘know’ above, the word applies to our knowledge of things, which we may call acquaintance. This is the sense in which we know sense-data. (The distinction involved is roughly that between savoir and connaître in French, or between wissen and kennen in German.)

 

The Problems of Philosophy - by Bertrand Russell


Bertrand Russell had realised that the problem was not so much an epistemological problem as a problem of terminology that led to a fallacious argument. Bishop Berkeley, whom he was criticising, had failed to notice that the verb ‘to know’ was equivocal and had switched between its two meanings without noticing. This caused endless discussions amongst philosophers, especially the English ones.

 

Although it may appear obvious once it is pointed out, this was not easy to notice. Continental philosophers – by using Latin, French or German – had sidestepped the argument because they already had a different word for each meaning and therefore could readily make the distinction between the two.

On our projects, we choose terms for our features, classes, functions and other concepts because we think that they reflect what we intend them to be. But sometimes, the things they name end up being quite different from the original intent. Even if the terms seem appropriate at the time, it is important to ensure that their meaning actually reflects what they are used for, especially when designing an API or a key piece of functionality of the system.

If we do not, we run the risk of inducing confusion – not only amongst users and other developers, but amongst ourselves.

In one instance recently, we became aware that something was wrong with one of the features of our software: we could not describe it properly. We could not quite put our finger on why, but we were having frustrating discussions because its definition kept shifting. This was because we were not actually talking about the same thing.

We thought we were, but as it happens, that feature had evolved over time. It had started to be used in different contexts, where it did slightly different things, and therefore the term for it had started to mean different things: it had become equivocal.

Part of the problem came from the fact that we had not properly defined that feature initially. Having an umbrella term was convenient originally. In fact, it was almost too good because it always seemed to express what we wanted regardless of the context. But the more generic the term, the more likely it is to acquire different meanings over time. The fact that it was so fitting and generic regardless of the context should have given us a clue that we should specify it better.
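To make this concrete, here is a minimal sketch in Python of how an umbrella term can drift. It is not taken from our software: the `Document` class and the names `publish`, `make_public` and `email_to_subscribers` are all hypothetical, chosen only to illustrate the point.

```python
from dataclasses import dataclass, field


# Hypothetical example: none of these names come from the real project.
@dataclass
class Document:
    title: str
    visible_to_public: bool = False
    emailed_to: list = field(default_factory=list)


def publish(doc, subscribers=None):
    """Equivocal: 'publish' has acquired two meanings over time."""
    if subscribers is None:
        # Meaning 1: make the document visible on the site.
        doc.visible_to_public = True
    else:
        # Meaning 2: send the document to a list of subscribers.
        doc.emailed_to.extend(subscribers)
    return doc


def make_public(doc):
    """One name, one meaning: make the document visible."""
    doc.visible_to_public = True
    return doc


def email_to_subscribers(doc, subscribers):
    """One name, one meaning: send the document to subscribers."""
    doc.emailed_to.extend(subscribers)
    return doc
```

With the umbrella name, two people can both say “the document was published” and mean different things. With one name per meaning, the call site itself says which one happened.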

Through their usage, words evolve over time by acquiring new meanings and sometimes discarding old ones. My trusted dictionary provides about a dozen different meanings for ‘know’ as a verb; one of these meanings, “to have sexual intercourse with”, has clearly fallen into disuse.

The meanings of words are bound to evolve because language is a social institution. The terms that we use are simply labels that refer to things. When these things change, or when the meaning of the term changes, we expose ourselves to confusion and misunderstanding, so it is better to anticipate.

Carefully choosing the word describing a feature, a function or a piece of code is time saved later. It helps bring clarity and avoid pointless discussions caused by equivocation. If philosophers could be fooled, I don’t see why software engineers would not be fooled as well.

A good dictionary should be part of every developer’s toolkit and a quick look at the thesaurus is not a waste of time. It will make life easier for other developers working on the software, it will make it easier to communicate with managers, and it will make it easier for the sales and marketing folks to explain the software.

Returning to the cave

Friday, February 13th, 2009

In The Republic, Plato imagines a group of people locked in a cave forced to make sense of the world from the shadows projected on the wall in front of them. There is no other reality because they are unable to see what creates the shadows outside the cave. But someone outside the cave would be able to see the true nature of things and know that the shadows are mere projections of other objects.

Software users are in the same position as the prisoners in the cave: they are stuck behind the screen with the user interface. Only developers are able to appreciate the true nature of software. Users build their own reality of the software based on the interface they see and interact with, blissfully unaware of the lines of code, the algorithms, functions and classes that bring it to life.

Recently, I was asked to revisit a piece of software to which I had contributed. I had not looked at the code for a while; I had forgotten the true reality. But in a way, that was a good thing because for a while, as I slowly rediscovered the functions and the architecture, I was able to have both perspectives.

In Plato’s story, prisoners who have been released and have stepped outside eventually cease to notice the shadows because they only see the true nature of things. This is exactly what happens to senior developers. And this is a problem, because in the business of software they often take the role of the shadow puppeteers, yet they are unable to imagine the world that they create for their users.

In the end, Socrates wonders whether released prisoners who decide to return to the cave are still any good at making sense of the shadows. He assumes that, blinded by the sun, they may no longer be accustomed to the darkness. I could tell him that a lot of developers are struggling to understand the user’s perspective because they have been blinded by the code.