blog dds: 2003-06-21 — FreeBSD Committer

I became a FreeBSD committer. I've been using BSD Unix systems since 1986, starting with 4.3 BSD on a pair of VAX 780 machines. In 1992, as a bored PhD student, I reimplemented sed(1) and contributed it to the unencumbered BSD version that was then being put together; it is now part of the *BSD family. I crossed paths with BSD software again when the prize of the 2000 Usenix technical conference "win a pet Shark" contest, Digital's Network Appliance Reference Design (DNARD), came with a NetBSD boot image. I used that code to draw about 500 examples for my book Code Reading: The Open Source Perspective (Addison-Wesley 2003), which details how to read software code others have written. Since 2001 I've been using FreeBSD to control my home's security, communications, and entertainment systems, as described in a SANE conference paper and a recent article in Personal and Ubiquitous Computing (as an academic I have to live by the "publish or perish" motto).

Why did I become a committer? My feeling is that FreeBSD, although less visible than other systems, exemplifies the state of the art in software engineering, both as a product and as a process. Its scale (6 MLOC), quality, level of integration, legacy, and development practices could well be unmatched in both proprietary and open-source software. I therefore want to be close to this effort and will be proud to contribute further to it.

My (longish term) plans as a FreeBSD committer:

doc

Continue work on the consistency of the man pages:

Correct .Xr references (docs/51480)
Experiment with expanding the idea to check:
- command-line arguments (section 1 and 8)
- system call errors (section 2) (see e.g. docs/43891)
Integrate the manual checking script I wrote in the tools collection

src

Userland commands

Modify ash (src/bin/sh) to support network pipes
Add SIGINFO support to commands that could benefit (e.g. sed, make (silent make option))
Ensure commands detect and report write(2) errors on standard output
Correct command bugs (see e.g. bin/48424)
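To make the write(2) item concrete, here is a minimal sketch of the kind of check a command could run before exiting. The helper name check_output and its message format are my own assumptions for illustration, not existing FreeBSD code; the idea is simply that buffered output can fail only when it is flushed, so the flush result and the stream's error flag both need to be examined.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of any FreeBSD library): flush fp and
 * report whether all buffered output was written successfully.
 * Returns 0 on success, -1 if the flush failed or an earlier write
 * had already set the stream's error indicator.
 */
int
check_output(FILE *fp, const char *progname)
{
	if (fflush(fp) == EOF || ferror(fp)) {
		/*
		 * errno is reliable when fflush() itself failed; for an
		 * error recorded by an earlier call it may be stale.
		 */
		fprintf(stderr, "%s: write error: %s\n", progname,
		    strerror(errno));
		return (-1);
	}
	return (0);
}
```

A command would call check_output(stdout, argv[0]) just before returning from main and exit with a non-zero status when it fails, so that a full disk or a closed pipe does not go unnoticed.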

lib

Optimize libc/regex to build the finite automaton with native code instead of interpreting it (I am currently experimenting with a similar approach based on the JVM).
Locate candidate code for moving into a library
Investigate how kevent(2) can be used to aggressively cache file contents in library lookup operations (get*). (Do an strace(1) on apache's logresolve(8) to see what I mean).

kernel

Integrate and enhance my PCL-724 driver (i386/46238)
Fix the occasional bug (e.g. kern/46116)

src

The CScout refactoring browser I have implemented can parse arbitrary collections of C programs and allow its user to browse and safely rename identifiers, even in the presence of the most complex C preprocessor constructs. As a test case, I have already successfully processed bwk's awk source code and the complete apache distribution. I have calculated that the current implementation of CScout could process the complete FreeBSD distribution on a 1GHz processor in 12 hours using 5GB of physical and 12GB of virtual memory. It would therefore be interesting to initiate an effort to:

locate unused identifiers and dead code
improve identifier naming consistency

across the complete FreeBSD source tree. As an example, a quick run on just the source code of bin/cp reveals that the macro definition RETAINBITS in src/bin/cp/utils.c is not being used.
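As a rough illustration of what "unused" means here, the sketch below counts whole-word occurrences of an identifier in a source buffer; a macro whose name appears exactly once (in its own definition) is a candidate for removal. This is a deliberately naive textual check and not how CScout works: it ignores comments, string literals, conditional compilation, and token pasting, all of which CScout has to handle. The function name count_identifier is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* True if c can appear inside a C identifier. */
static int
is_ident_char(int c)
{
	return (isalnum(c) || c == '_');
}

/*
 * Hypothetical helper: count whole-word occurrences of the identifier
 * name in the NUL-terminated source text src.  Purely textual; it does
 * not understand comments, strings, or the preprocessor.
 */
size_t
count_identifier(const char *src, const char *name)
{
	size_t n = 0, len = strlen(name);
	const char *p = src;

	if (len == 0)
		return (0);
	while ((p = strstr(p, name)) != NULL) {
		/* The match must not be part of a longer identifier. */
		int left_ok = (p == src) ||
		    !is_ident_char((unsigned char)p[-1]);
		int right_ok = !is_ident_char((unsigned char)p[len]);

		if (left_ok && right_ok)
			n++;
		p += len;
	}
	return (n);
}
```

For example, on the text "#define EXAMPLE_MASK 1\nint x;\n" a call to count_identifier with "EXAMPLE_MASK" yields 1, which flags the (made-up) macro as defined but never referenced.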

Given the memory requirements of this task, it would also be an interesting test case for the 64-bit FreeBSD version. This will be a massive effort, so volunteers with time and access to appropriate hardware are more than welcome.
