posted under category: Software Quality on February 13, 2020 by Nathan
Maybe you don’t remember, so let me remind us all how hard programming is.
I have four children. Believe me when I say that I have been trying to get them to program from a very young age, and so far it’s not all roses and candy treats. Usually there is an amount of familiarization that a flourishing person needs to have with a subject in order to comprehend it fully. Learning how to read, for example, takes sounding out letters, followed by dick-and-jane books, followed by frustration, some time off (usually months), and then retrying, usually with a higher amount of success. Learning to read is a long process and doing it well involves creating many new brain pathways.
Imagine trying to teach your parents to program. With only a few exceptions, there’s a very high chance they simply won’t be able to get it.
This is something I wrestle with, and recently I’ve come to the conclusion that the standard human brain can’t easily grasp the concept of templating. Instead of building something, we programmers are building something for someone else to build something. The more meta up that chain our brains can go, the better a programmer we tend to be. Problems are always solved on a higher level than they were created.
Let’s say that we learned programming, we make our first application, we send it out for our users to try it, and what happens? It doesn’t work! We thought of every conceivable possibility. The problem was that we didn’t conceive of enough possibilities.
“Software is not governed by Moore’s Law; more like Murphy’s Law”
– Douglas Crockford
Even when we find a good solution, that doesn’t mean we’ve reached programming bliss.
“Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.”
– Jamie Zawinski
Sometimes progress is very hard and it can literally take years to get past our hurdles. My friend Ben wrote this a number of years ago:
“Every time I think I am making progress, I come to realize that I only have more confusion over the ‘right’ way to implement something”
– Ben Nadel
But don’t forget, programming is very difficult. It can be near-infinitely complex. Think about this for a couple minutes:
“Computer programs are the most complex things that humans make. Programs are made up of a huge number of parts, expressed as functions, statements, and expressions that are arranged in sequences that must be virtually free of error. The runtime behavior has little resemblance to the program that implements it. Software is usually expected to be modified over the course of its productive life. The process of converting one correct program into a different correct program is extremely challenging.”
– Douglas Crockford
If you’re programming, you are doing something extremely intricate. You are valuable because most people cannot do what you are doing. But wait! Your value isn’t only in programming, you are especially valuable because of your ability to put business processes to computers, and to explain these computer systems back to the business.
posted under category: Software Quality on February 7, 2020 by Nathan
One of the oldest and rawest forms of software measurement is the inimitable count of the number of lines of code. Let’s talk about that.
“Measuring programming progress by lines of code is like measuring aircraft building progress by weight.”
I work at an airplane manufacturer. We actually know the final delivery weight of an airplane, even adjusted for paint and seating arrangements, and in airplane manufacturing, knowing the current and final weight could help. On the other hand, a work of software is done when all the features are complete and the bugs are worked out, or better yet, when we know it will make or save enough money to start selling or using.
Lines of code differ drastically between systems. Choice of programming language is one of the primary reasons for a drastic change in the lines of code. One lower level language may allow us to implement a solution in 100 lines of code, while a higher level one has it as a built-in-function - one line. That doesn’t mean either language is worse, just different. Typically we see higher level languages develop applications much faster and debug easier thanks to the smaller codebases, but there is a tradeoff in performance and resource usage.
Another distinguishing factor is what function the application is expected to perform. It’s hard to compare a shopping cart system to a manufacturing system, they have completely different uses and thus a line count to compare them does essentially nothing. Perhaps one application is more straightforward and requires less code, even if it produces more in some kind of measurable output.
Then there’s the people problem. A developer on one system may have clever tricks that reduce the line count. This could make the codebase equally better or worse, generally depending on the cleverness of the solution, and thus the cleverness of the programmer. We often think we are smart. 90% of programmers believe they are above average, thanks to the cognitive bias called illusory superiority.
This then leads us to the yin-yang problem. More features need more lines of code, however there is a tax because more lines of code become harder for the software writer to comprehend and modify successfully. This is the exact nature of software complexity, and leads to the introduction of bugs, the creation of testing frameworks, the discussion of software quality, and the entire software industry. Remember that no lines of code means no bugs, no rework, no expensive developers, and no security hacks.
Next, we have to consider how the number of lines are calculated. Does that count include 3rd party libraries? Does it include blank lines and whitespace? Does it include comments? We can make a strong case both that all of these should or should not be included.
Lines of code does tell one one important thing however. It tells us how many lines of code are in the application’s codebase. It can give us a general order of magnitude for what we will find. It can give us the feeling of largeness, especially when we compare file-by-file (though typically a byte count would suffice). That’s it. Number of lines of code can tell us how many lines of code there are. That is all.
posted under category: Software Quality on February 1, 2020 by Nathan
Why do we want to measure software?
Software measurement is what we software developers do instinctively when we start a new job, join a new project, or enter a new team. It’s critical for us to understand the scope of a product so that we can learn our boundaries and understand our place. Understanding software is one of our best skills - the others being the ability to explain it, and the ability to write it (okay, there may be more to it). Our fight against the chaos of the unknown requires that we split up what we can observe into categories that we can understand.
Many times I’ve been give a legacy codebase without a proper introduction. That’s the worst feeling, and was my motivation for writing a big post on how much documentation you should have. If we could somehow bottle up this knowledge and pass it on to our colleagues, it could be very beneficial to future maintainers, helping to lower the long tail of operational cost on a product, and maybe even stave off the inevitable retirement of an app and the cost of rewriting it for a long time. And that’s our goal - it’s to remove cost, not to add it. It’s to understand the world and to be able to explain it.
Good measurements of software can bring a valuable understanding to us and to the people paying our bills. Is the software tiny and easy? That’s good to know! Is the software humongous and complex? That’s good to know! Is the software terrible and in dire need of help? That’s good to know!
How do we measure software?
Answer that and you’re likely to get a grant. The truth is, software isn’t specifically measurable, for the same way we can’t make software perfect. Sure, at a certain small size we can verify that it works as expected, but, you know, can it read mail yet?
Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can.
That’s tongue in cheek. These days we could change Zawinski’s Law to interacting with social media or running as a native mobile app.
Our trouble with measuring software is that measuring by any single logical metric is a sure way to miss the mark altogether. Software measurement has to be on a different level, in fact, many different levels. Measuring software can only be done when we take in a system holistically. That means understanding it on a much broader level.
As soon as we measure the size on disk, somebody’s in there changing things. As soon as we measure the memory footprint, someone logs in, adding data. As soon as we measure the performance, another proces ties up the CPU and throws off our measurements. Software is soft, and changes too frequently to make accurate measurements that matter. Software is fluid, moving from disk to RAM, and transmitting across networks. It doesn’t take much of a program to utilize all of the hardware given it. It seems that simple metrics just won’t be enough to measure software. Measuring software by its metrics can give a picture of the whole, though, and there’s value in that.
When thinking holistically, one of the aspects that would need to be measured is the quality of a work. We need to measure the quality to complete this picture.
How can we measure software quality?
There are more intangible ways of measuring software, such as number of customer requirements, size or activity of the product’s backlog, and number of users. In particular, we can concentrate on the number of logged defects, if they have been logged at all. None of these are a holistic measurement of the quality of a piece of software, but they do help to paint the broad picture.
I believe that quality is a collection of mostly-intangible aspects. This means quality cannot be measured with a number or a picture. I think maybe a dozen pictures and user interviews and developer deep dives could help complete a mental model of the quality of an application.
There are quality metrics, such as the amount of code coverage (amount of code that is covered in unt tests). That’s a potential measurement of quality, but it could still be a terribly built app with great unit tests. Cyclomatic complexity is one of my favorites, having built a McCabe Indexing tool of my own a few years back. It tells you how complex your CFML files and functions are, and can show line-by-line where your most complex areas are. Another one I’ve looked at is function point analysis, which can map the central parts of an application by usage.
Once a software analyst understands the system holistically, they should be able to express that holistic idea to others. Of course this is a breakdown of its own type, and relies on the communication skills of the analyst.
What are the aspects of software quality?
I’ve compiled this list that I’ll go over at a later date, but feel free to suggest more. Quality is:
- Client Portability
- Server Portability
posted under category: Software Quality on January 30, 2020 by Nathan
The official software crisis was about computing power versus software power in the 1960’s. Hardware was outpacing software, which led to the need for higher level languages, better development methodologies, and more workers to write programs. Ask anyone in their later years who had been a programmer for their career and you may learn more than an earful about these problems. Something ongoing from the old software crisis were some of the signs and symptoms that we see today:
- Software is over budget
- Software is delivered late
- Software is low quality
- Programs run slowly
- Results don’t meet requirements
- Code is unmanageable
- Sometimes applications are never completed, vaporware
We see a good amount of that today. Personally, I see it all over enterprise software, and frequently in outsourced programs – that’s another topic for another day. I also see it in inexperienced programmers, especially when they are allowed to work alone, and sometimes in development done outside of normal software companies and departments. For real though, sometimes we even see it in more experienced looking teams as well.
The struggle is real. A lot of this software crisis is still in effect. It just doesn’t act like it did back before I was born.
Better software lifecycle processes typically help with late, over-budget software and with missed requirements. I personally have seen how Agile was a giant victory over the 6-month deployment cycle in years before. Tools like CI/CD enablers can put working code in front of our users even faster. Unfortunately, these processes don’t remove the problem of low-quality code, though they help mitigate many of the hurdles.
Isn’t that just software life though? We struggle through it every day in order to make the world a better place. That’s the programming life. That’s what we get paid to do. Maybe it’s kind of the dark side, sure, but that’s where we live. What do you think? Is the software crisis still a term worth using today?
posted under category: Software Quality on January 26, 2020 by Nathan
I really enjoy a lot of the content on Y Combinator’s Hacker News. One recent article that came up was Why do we fall into the rewrite trap?, which was a great look into the anti-pattern of software rewrites instead of thoughtful refactoring.
The comments tend to be as good as the articles, for example, one commentor mentioned Theseus’ Gambit - a philosophical engineering question about replacing every plank on a ship, at what point is it a new ship? This is a great question when it comes to software rewrites. Isn’t a refactor somehow creating a new work? Or is it only a rewrite if you start over?
So many comments agreed with the author and added their own notes. One wrote “rewriting a codebase is a lose-lose proposition: I’ll never suggest or recommend it again.” Others mentioned the need to rewrite based on technologies, like old Flash applications.
Not long after that, #TeamRefactor got a victory on a desktop app - it was a bigger team and while the code was bad - I mean like, a complete misunderstanding of OOP - we decided to refactor a lot of it instead of starting over. It was a somewhat political decision because we didn’t want to make the previous dev team angry. Unfortunately, they were equally offended that we were planning a refactor. Some drama took place. The refactor effort took a long time. We finished it eventually, but we ended up with an app that we still didn’t like (it was better, and many of the initial underlying problems were fixed, but…), our friends didn’t like us anymore, and a lot of time was wasted - far more than we expected. I would vote that refactoring this app was an overall failure.
A little later I was given charge of an IT app that was built only 6 months earlier and still in active development. Its foundation was in ASP.NET WebForms and VB.NET. After I gagged a tiny bit, I swallowed my pride and tried to hunker down and start on improvements. Backing up, our department has been a C# MVC group for some years, and when we follow one path, we try to make the rest of our choices as vanilla, gravy, plain, and standard as possible. This helps with ramping up new employees. The group that started this app chose .NET but then went with every other different choice, probably just based on familiarity. I took a task off the backlog for creating a mobile-friendly version, which turned into a reimplementation of the entire app, running much faster, was natively responsive, perfectly printable, and ended up being the spearhead of the larger vision for version 2: a complete rewrite. I did it with tact, not offending any parties, announcing my successes and advertising my work while helping the product team make the right choices on the vision. The end result was a much better product that fell closer in line with department goals. Another point for #TeamRewrite!
Not long after, I was handed a C# MVC + React app. The current steward was about to leave the company, so I had a transition week and then it was mine. The app was fine, there was nothing really wrong with it. On day one of independence I had about 30 commits to fix a bunch of things that I thought needed changing but didn’t want to do it and stomp on my former co-worker’s code and feelings. Very quickly, a number of things started working better, I caught up with all the changed from the data team that worked out of our Seattle office, I refactored some things that probably needed it, created my own backlog, and pretty much got bored about 3 weeks in. This is when I started to rewrite the UI in Vue.js. There were some requirements coming down that would have me using Redux, and I do not want to use Redux again, also I couldn’t figure out React-Router’s chunk splitting (I guess it depends on
React.lazy()), and I don’t like the React runtime’s bloated file size. On the other hand, I’m a huge fan of Vue, which is smaller, faster and I just like it more. I’m opinionated! I was pretty sure I could keep up with the React app changes and create the Vue version (I could), chalk up the UI change from React-Semantic-UI to Vuetify (Material) as just being a visual update (I did), and get it all done in 4 weeks (I didn’t). It took me 6 weeks. I floated it with the data team and they were positive on it, so I pushed it to the users… they barely noticed. After a while I got some comments about how much they liked the changes. The result was an application that ran a little faster, downloaded quicker, was more maintainable, could easily support all the new requirements, and I just plain liked it more. #TeamRewrite for the win. Was it worthwhile? Maybe, until I consider the pain I would have to go through for implementing Redux, then it was definitely worth it!
So what have we learned?
Maybe that Nathan just likes to rewrite applications for fun. Am I an anomaly if I keep having a lot of success with it? I work on internal enterprise systems; I wonder of the applications I keep inheriting are just really, terribly bad.
For the record, I’ve worked on plenty of systems that I did not suggest rewriting. I have one app that I inherited and have kept alive for twelve years now. It’s been through some major refactors, but it’s been evolutionary the whole time. Of course now it’s time for a rewrite - haha! It’s ColdFusion after all, and we don’t do that anymore. How did this application avoid Nathan’s swift justice against poorly coded enterprise apps? Well it was very well structured from the start - strong MVC roots, layered architecture, it was built on a firm foundation. Maybe #TeamRefactor should get a few more points.
While refactoring is typically less expensive, I have a list of reasons that will ping that rewrite bell in my head.
- When the existing application is very obviously, truly a disaster
- The UI looks like it’s 20+ years old
- When the existing technology platform is inadequate
- When upcoming changes require significant rework
I remember reading - though I can’t recall where - a hard figure. It was something to the extent of: when more than 60% of an application or component needs to be refactored, that application or component should probably be rewritten.
I really don’t prefer to rewrite, and neither should anyone, however, I believe that rewrites are much more possible than we are led to believe.