Outcome and Aftermath

There are two kinds of statistics, the kind you look up and the kind you make up.

Archie Goodwin

Table 1.7, “Result summary” summarizes my results. I have marked cells where an operating system excels with a + and corresponding laggards with a –. For a number of reasons, it would be a mistake to read too much into this table. First of all, the weights of the table's metrics are not calibrated according to their importance. In addition, it is far from clear that the metrics I used are functionally independent, or that they provide a complete or even representative picture of the quality of C code. Finally, I entered the +/– markings subjectively, trying to identify clear cases of differentiation in particular metrics.

Table 1.7. Result summary

Metric  FreeBSD  Linux  Solaris  WRK
File Organization
Length of C files   
Length of header files  +  
Defined global functions in C files   
Defined structures in header files    
Directory organization  +  
Files per directory    
Header files per C source file     
Average structure complexity in files  +  
Code Structure
Extended cyclomatic complexity  +  
Statements per function  +   
Halstead complexity  +  
Common coupling at file scope    
Common coupling at global scope  +   
% global functions  +  
% strictly structured functions   +
% labeled statements   +
Average number of parameters to functions     
Average depth of maximum nesting   
Tokens per statement     
% of tokens in replicated code +  
Average structure complexity in functions +   
Code Style
Length of global identifiers    +
Length of aggregate identifiers    +
% style conforming lines   +
% style conforming typedef identifiers  +
% style conforming aggregate tags +
Characters per line     
% of numeric constants in operands  + +
% unsafe function-like macros    
Comment density in C files   +
Comment density in header files   +
% misspelled comment words    +
% unique misspelled comment words    +
Preprocessing
Preprocessing expansion in functions  +  
Preprocessing expansion in files    
% of preprocessor directives in header files  +
% of non-#include directives in C files  +  
% of preprocessor directives in functions  +  
% of preprocessor conditionals in functions + +  
% of function-like macros in defined functions  +  
% of macros in unique identifiers  + +
% of macros in identifiers  +  
Data Organization
Average level of namespace pollution in C files +   
% of variable declarations with global scope  +  
% of variable operands with global scope +   
% of identifiers with wrongly global scope  +  
% of variable declarations with file scope +   
% of variable operands with file scope  +  
Variables per typedef or aggregate   +
Data elements per aggregate or enumeration   +

Nevertheless, by looking at the distribution and clustering of markings, we can arrive at some important plausible conclusions. The most interesting result, which I drew from both the detailed results listed in the previous sections and the summary in Table 1.7, “Result summary”, is the similarity of the values among the systems. Across various areas and many different metrics, four systems developed using wildly different processes score comparably. At the very least, the results indicate that the structure and internal quality attributes of a large and complex working software artifact will represent, first and foremost, the formidable engineering requirements of its construction, with the influence of process being marginal, if any. If you're building a real-world operating system, a car's electronic control units, an air traffic control system, or the software for landing a probe on Mars, it doesn't matter whether you're managing a proprietary software development team or running an open source project: you can't skimp on quality. This does not mean that process is irrelevant, but that processes compatible with the artifact's requirements lead to roughly similar results. In the field of architecture this phenomenon has been popularized under the motto “form follows function” [Small 1947].

One can also draw interesting conclusions from the clustering of marks in particular areas. Linux excels in various code structure metrics, but lags in code style. This could be attributed to the work of brilliant, motivated programmers who are not, however, managed effectively enough to pay attention to the details of style. In contrast, the high marks of WRK in code style and low marks in code structure could be attributed to the opposite effect: programmers who are effectively micro-managed to care about the details of style, but are not given sufficient creative freedom to develop the techniques, design patterns, and tools that would allow them to conquer large-scale complexity.
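
To make the distinction between the two groups of metrics concrete, consider the following contrived C fragment; it is my own illustration and comes from none of the four kernels. Both functions have the same simple structure, but only the second would score well on style metrics such as identifier length, the percentage of style-conforming lines, and comment density.

    #include <stdio.h>
    #include <stddef.h>

    /* Structurally simple but stylistically sloppy: terse identifiers,
       inconsistent spacing, everything crammed onto one long line. */
    static size_t cnt(const int *a,size_t n,int v){size_t c=0;for(size_t i=0;i<n;i++)if(a[i]==v)c++;return c;}

    /*
     * The same logic after a style pass: descriptive identifiers, one
     * statement per line, and a comment stating the function's purpose.
     * Its structure (control flow and complexity) is unchanged.
     */
    static size_t
    count_matches(const int *values, size_t length, int wanted)
    {
            size_t matches = 0;

            for (size_t i = 0; i < length; i++)
                    if (values[i] == wanted)
                            matches++;
            return matches;
    }

    int
    main(void)
    {
            int data[] = { 1, 2, 2, 3, 2 };

            printf("%zu %zu\n", cnt(data, 5, 2), count_matches(data, 5, 2));
            return 0;
    }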

The high marks of OpenSolaris in preprocessing could also be attributed to programming discipline. The problems caused by use of the preprocessor are well known, but its allure is seductive. It is often tempting to use the preprocessor to create elaborate domain-specific programming constructs, and it is often easy to fix a portability problem by means of conditional compilation directives. However, both approaches can be problematic in the long run, and we can hypothesize that in an organization like Sun, programmers are discouraged from relying on the preprocessor.
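
Both temptations are easy to illustrate. The sketch below is mine rather than code taken from any of the systems I examined: a function-like macro that builds a small domain-specific looping construct, and a conditional-compilation fix for a platform difference. Each is convenient when first written, yet it hides control flow from readers and tools, and multiplies the configurations that must be tested.

    #include <stdio.h>

    /*
     * A tempting domain-specific construct: a macro that loops over every
     * slot of a fixed-size table.  It works, but it hides control flow,
     * and the looping logic cannot be stepped through in a debugger.
     */
    #define FOREACH_SLOT(tbl, p) \
            for ((p) = (tbl); (p) < (tbl) + sizeof(tbl) / sizeof((tbl)[0]); (p)++)

    /*
     * A quick portability fix through conditional compilation.  Every new
     * platform adds another branch, and untaken branches are rarely tested.
     */
    #if defined(_WIN32)
    #define PATH_SEPARATOR '\\'
    #else
    #define PATH_SEPARATOR '/'
    #endif

    int
    main(void)
    {
            const char *names[] = { "kern", "vm", "net" };
            const char **p;

            FOREACH_SLOT(names, p)
                    printf("subsystem%c%s\n", PATH_SEPARATOR, *p);
            return 0;
    }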

A final interesting cluster appears in the low marks for preprocessor use in the FreeBSD kernel. This could be attributed to the age of the code base in conjunction with a gung-ho programming attitude that assumes code will be read by developers at least as smart as the one who wrote it. However, a particularly low level of namespace pollution across the FreeBSD source code could be a result of using the preprocessor to set up and access conservatively scoped data structures.
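
What such a use might look like is sketched below; the macro and identifiers are hypothetical, invented for illustration rather than lifted from FreeBSD. The preprocessor generates the boilerplate, yet every identifier it introduces has static (file) scope, so nothing leaks into the global namespace.

    #include <stdio.h>

    /*
     * Hypothetical illustration: a macro that sets up a counter with file
     * (static) scope together with equally private accessor functions.
     * The boilerplate is generated by the preprocessor, but none of the
     * resulting identifiers escape this translation unit.
     */
    #define DEFINE_COUNTER(name)                                            \
            static unsigned long name##_count;                              \
            static void name##_inc(void) { name##_count++; }                \
            static unsigned long name##_get(void) { return name##_count; }

    DEFINE_COUNTER(packets)
    DEFINE_COUNTER(errors)

    int
    main(void)
    {
            packets_inc();
            packets_inc();
            errors_inc();
            printf("packets=%lu errors=%lu\n", packets_get(), errors_get());
            return 0;
    }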

Despite various claims regarding the efficacy of particular open or closed-source development methods, we can see from the results that there is no clear winner (or loser). One system with a commercial pedigree (OpenSolaris) has the most favorable balance of positive to negative marks. On the other hand, WRK has the largest number of negative marks, while OpenSolaris has the second lowest number of positive marks. Looking at the open source systems, although FreeBSD has the higher number of negative marks and the lower number of positive marks of the two, Linux has the second highest number of positive marks. Therefore, the most we can read from the overall balance of marks is that open source development approaches do not produce software of markedly higher quality than proprietary software development.