dopefly dot com - The Dopefly Tech Blog

Lines of Code

posted under category: Software Quality on February 7, 2020 by Nathan

One of the oldest and rawest forms of software measurement is the inimitable count of the number of lines of code. Let’s talk about that.

“Measuring programming progress by lines of code is like measuring aircraft building progress by weight.”
-Bill Gates

I work at an airplane manufacturer. We actually know the final delivery weight of an airplane, even adjusted for paint and seating arrangements, and in airplane manufacturing, knowing the current and final weight could help. On the other hand, a work of software is done when all the features are complete and the bugs are worked out, or better yet, when we know it will make or save enough money to start selling or using.

Lines of code differ drastically between systems. Choice of programming language is one of the primary reasons for a drastic change in the lines of code. One lower level language may allow us to implement a solution in 100 lines of code, while a higher level one has it as a built-in-function - one line. That doesn’t mean either language is worse, just different. Typically we see higher level languages develop applications much faster and debug easier thanks to the smaller codebases, but there is a tradeoff in performance and resource usage.

Another distinguishing factor is what function the application is expected to perform. It’s hard to compare a shopping cart system to a manufacturing system, they have completely different uses and thus a line count to compare them does essentially nothing. Perhaps one application is more straightforward and requires less code, even if it produces more in some kind of measurable output.

Then there’s the people problem. A developer on one system may have clever tricks that reduce the line count. This could make the codebase equally better or worse, generally depending on the cleverness of the solution, and thus the cleverness of the programmer. We often think we are smart. 90% of programmers believe they are above average, thanks to the cognitive bias called illusory superiority.

This then leads us to the yin-yang problem. More features need more lines of code, however there is a tax because more lines of code become harder for the software writer to comprehend and modify successfully. This is the exact nature of software complexity, and leads to the introduction of bugs, the creation of testing frameworks, the discussion of software quality, and the entire software industry. Remember that no lines of code means no bugs, no rework, no expensive developers, and no security hacks.

Next, we have to consider how the number of lines are calculated. Does that count include 3rd party libraries? Does it include blank lines and whitespace? Does it include comments? We can make a strong case both that all of these should or should not be included.

Lines of code does tell one one important thing however. It tells us how many lines of code are in the application’s codebase. It can give us a general order of magnitude for what we will find. It can give us the feeling of largeness, especially when we compare file-by-file (though typically a byte count would suffice). That’s it. Number of lines of code can tell us how many lines of code there are. That is all.