The Psychology of the AWS Outage
Unless you've been living on another planet, you're certainly aware that over the past couple of hours Amazon's AWS S3 service has experienced a serious outage, which has affected thousands of sites and services around the world. For reasons I will elaborate in this post, the coverage of this outage has been blown completely out of proportion. So, what's the difference between the perceived risk associated with the AWS outage and the actual risk of this outage?
Continue reading "The Psychology of the AWS Outage"
The Road to Debugging Success
A colleague recently asked me how to debug a Linux embedded system that crashed in the Unix shell (and only there), when its memory got filled through the buffer cache. He added that when he emptied the buffer cache the crash no longer occurred.
Continue reading "The Road to Debugging Success"
Debugging PCSecrets Synchronization
A reader of my Effective Debugging book commented that debugging is learned through experience. I think he's partly right, so I'll periodically describe here techniques and tools I use when debugging. A problem I faced today was the inability of the PC-based PCSecrets program to sync with the Secrets for Android counterpart. Here is how I troubleshot and solved the problem.
Continue reading "Debugging PCSecrets Synchronization"
Netdata on a Raspberry Pi
A couple of days ago I had the privilege of seeing a demo of netdata by Costa Tsaousis, the person behind this project. The project offers comprehensive real-time monitoring of a Linux computer with low overhead in a single easily installed and self-contained system. I thought this was too good to be true, but the number of users and installations hinted that the claims could well hold up. I therefore decided to install the system on a Raspberry Pi I'm configuring to replace an ancient 20-year-old IBM PS/2 server.
Continue reading "Netdata on a Raspberry Pi"
Debugging a File Synchronization Problem
In Effective Debugging I write that if a web search doesn't return any useful results, then maybe you're barking up the wrong tree. Here's an example.
Continue reading "Debugging a File Synchronization Problem"
Service Orchestration with Rundeck
Increasingly, software is provided as a service. Managing and controlling the service’s provision is tricky, but tools for service orchestration, such as Rundeck, can make our lives easier. Take software deployment as an example. A well-run IT shop will have automated both the building of its software using tools like make, Ant, and Maven and the configuration of the hosts the software runs on with CFEngine, Chef, or Puppet (see the post “Don’t Install Software by Hand”). Furthermore, version control tools and continuous integration will manage the software and the configuration recipes, handling developer contributions, reviews, traceability, branches, logging, and sophisticated workflows. However, these tools still leave a gap between the software that has been built and is ready to deploy, and the server that has been configured with the appropriate components and libraries and is ready to run the software.
Continue reading "Service Orchestration with Rundeck"
Developing in the Cloud
Running a top-notch software development organization used to be a capital-intensive endeavor, requiring significant technical and organizational resources, all managed through layers of bureaucracy. Not anymore. First, many of the pricey systems and tools that we developers need to work effectively are usually available for free as open source software. More importantly, cheap, cloud-based offerings do away with the setup, maintenance, and user support costs and complexity associated with running these systems. Here are just a few of the services and providers that any developer group can easily tap into
(you can find many more listed here):
Continue reading "Developing in the Cloud"
How to Create Your Own Git Server
Although I'm a happy (also paying) user of GitHub's offerings,
there are times when I prefer to host a private repository
on a server I control.
Setting up your own Git server can be useful
if you're isolated from the public internet,
if you're subject to inflexible regulations,
or if you simply want features different from those offered by GitHub
(and other similar providers).
Setting up a Git server on a Unix (Linux, Mac OS X, *BSD, Solaris, AIX)
machine isn't difficult,
but there are many details to observe.
Here is a complete guide.
Continue reading "How to Create Your Own Git Server"
Virtualize Me
The virtual machine (VM) is the most dazzling comeback in information technology. IBM implemented a VM platform architecture in the late 1960s in its CP/CMS operating system. The company’s goal was to provide the time-sharing capabilities that its batch-oriented System/360 lacked. Thus a simple control program (CP) created a VM environment where multiple instances of the single-user CMS operating system could run in parallel. Thirty years later, virtualization was rediscovered when companies like VMware found ways to virtualize the less accommodating Intel x86 processor architecture. The popularity of Intel’s platform and the huge amount of software running on it made virtualization an attractive proposition, spawning within a decade tens of proprietary and open source virtualization platforms.
Continue reading "Virtualize Me"
Don't Install Software by Hand
An IT system’s setup and configuration is a serious affair. It increasingly affects us developers mainly due to the proliferation and complexity of internet-facing systems. Fortunately, we can control and conquer this complexity by adopting IT-system configuration management tools.
Continue reading "Don't Install Software by Hand"
Why are AWS Command-Line Tools so Slow?
Amazon's Elastic Compute Cloud command-line tools are useful building
blocks for creating more complex shell scripts.
They allow you to start and stop instances, get their status,
add tags, manage storage, IP addresses, and so on.
They have one big disadvantage: they take a long time to run.
For instance, running ec2-describe-instances for six instances
takes 19 seconds on an m1.small AWS Linux instance.
One answer given
is that this is caused by JVM startup overhead.
I found that hard to believe,
because on the same machine a Java "hello world" program executes in 120ms,
and running ec2-describe-instances --help takes just 321ms.
So I set out to investigate, and, using multiple tracing tools and techniques,
this is what I found.
Continue reading "Why are AWS Command-Line Tools so Slow?"
Package Management Systems
DLL hell was a condition that often afflicted unfortunate users of old Microsoft Windows versions. Under it, the installation of one program would render others unusable due to incompatibilities between dynamically linked libraries. Suffering users would have to carefully juggle their conflicting DLLs to find a stable configuration. Similar problems distress any administrator manually installing software that depends on incompatible versions of other helper modules.
Continue reading "Package Management Systems"
Using the HP 4470c Scanner Under Windows 7
Neither Hewlett Packard nor
Microsoft Windows 7
offer native support for my HP 4470c scanner.
Throwing a working scanner away to buy a new one only because some
software was missing seemed like a waste,
so I looked for an alternative solution.
This is how I made it work using SANE,
an open source framework for scanners.
Continue reading "Using the HP 4470c Scanner Under Windows 7"
Sane vim Editing of Unicode Files
Being able to use plain alphabetic keys as editing commands
is for many of us a great strength of the vi editor.
It allows us to edit without hunting for the placement of
the various movement keys on each particular keyboard,
and, most of the time,
without having to juggle key combinations.
However, this advantage can turn into a curse when editing files
using a non-ASCII keyboard layout.
When the keyboard input method is switched to another script
(Greek in my case, or, say, Cyrillic for others)
vi will stop responding to its normal commands, because it will
encounter unknown characters.
Here is how I've dealt with this problem.
Continue reading "Sane vim Editing of Unicode Files"
Batch Files as Shell Scripts Revisited
Four years ago I wrote
about a method that could be used to have the Unix Bourne shell interpret
Windows batch files.
I'm using this trick a lot, because programming using the Windows/DOS
batch file facilities is decidedly painful, whereas the Bourne
shell remains a classy programming environment.
There are still many cases where the style of Unix shell programming
outshines and outperforms even modern scripting languages.
Continue reading "Batch Files as Shell Scripts Revisited"
Useful Polyglot Code
Four years ago I blogged about an
incantation that would allow the Windows command interpreter (cmd) to execute
Unix shell scripts written inside plain batch files.
Time for an update.
Continue reading "Useful Polyglot Code"
The Risk of Air Gaps
As some readers of this blog know,
from this month onward I'm on a leave of absence
to head the
Greek Ministry of Finance
General Secretariat of Information Systems.
The job's extreme demands explain the paucity of blog postings here.
I'll describe the many organizational and management
challenges of my new position in a future blog post.
For now let me concentrate on a small but interesting technical aspect:
the air gap we use to isolate the systems involved in processing
tax and customs data from the systems used for development and production
Continue reading "The Risk of Air Gaps"
Madplay on an Intel Mac
Numerous MP3 players around my house pull music from a central file server.
The hardware I'm using is extremely diverse and many devices
can nowadays be politely described as junk:
they include 100MHz Pentiums with 16MB RAM, and an ARM-based prototype
lacking support for floating point operations.
For the sake of simplicity I've standardized the setups around
a web server running on each machine to serve static HTML pages
listing the available music files,
and simple shell-based CGI clients that invoke
madplay to play the music.
When I added an Intel-based Mac to the mix I found that madplay
refused to work, producing only a white noise hiss.
Continue reading "Madplay on an Intel Mac"
Parallelizing Jobs with xargs
With multi-core processors sitting idle most of the time
and workloads always increasing,
it's important to have easy ways to make the CPUs earn their money's worth.
Today I was told how the Unix xargs command can help in this regard.
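As a rough illustration of the idea, the sketch below uses the -P option found in GNU and BSD xargs (it is not part of POSIX) to run several jobs at once. The function name compress_logs, the *.log pattern, and the choice of gzip with four parallel jobs are assumptions made for this example, not details taken from the post.

```shell
#!/bin/sh
# Hypothetical helper: compress every *.log file under a directory,
# running up to four gzip processes in parallel.
compress_logs() {
    # -print0 / -0: pass file names safely, even if they contain spaces
    # -n 1: give each gzip invocation a single file
    # -P 4: keep up to four invocations running at the same time
    find "$1" -type f -name '*.log' -print0 | xargs -0 -n 1 -P 4 gzip
}
```

It could be invoked as compress_logs /some/log/directory; the number passed to -P would normally be tuned to the number of available cores.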
Continue reading "Parallelizing Jobs with xargs"
A Well-Tempered Pipeline
I am studying the use of open source software in industry.
One way to obtain empirical data is to look at the operating systems and
browsers used by the Fortune 1000 companies by examining browser logs.
I obtained a list of the Fortune 1000 domains and wrote a pipeline
to summarize results by going through this site's access logs.
Continue reading "A Well-Tempered Pipeline"
Monitor Process Progress on Unix
I often run file-processing commands that take many hours to
finish, and I therefore need a way to monitor their progress.
The Perkin-Elmer/Concurrent OS32 system I worked on for a couple
of years back in 1993 (don't ask)
had a facility that displayed for any executing
command the percentage of work that was completed.
When I first saw this facility working on the programs I maintained,
I couldn't believe my eyes, because I was sure that those rusty
Cobol programs didn't contain any functionality to monitor their progress.
Continue reading "Monitor Process Progress on Unix"
An Inadvertent Denial of Service Attack
If you're wondering why this blog was down for the past few hours, here is
the explanation.
In an earlier blog post I listed a small script
I'm using to lock out door knockers who attempt to break into our
group's computer by trying various passwords.
If you like puzzles, read the script again and think how it
could be used against us by isolating our computer from the entire world.
Continue reading "An Inadvertent Denial of Service Attack"
Suspend Windows from the Command Line
I used to leave my computer up all night, but I've come to realize that this
is ecologically unsound.
Now I suspend it before going to sleep, but this meant missing
a daily job that used to run at 3:00 am.
The job marks my students' exercises and sends me email.
I thus decided to schedule the task to wake up my computer at 3:00 am,
run the job, and then suspend it again.
The Windows scheduler allows you to specify a wakeup option,
but not a subsequent suspend.
Furthermore, it seems that Windows lacks a way to suspend from the
command line (while maintaining the ability to hibernate), and the
only free tools on the web are distributed in executable form,
so I ended up writing a small tool myself.
Continue reading "Suspend Windows from the Command Line"
LTO Tape Drive Compression Considered Harmful
I used to think that tape drive compression was a silly marketing trick
used by manufacturers to inflate the advertised capacity of their tape drives.
Apparently it is worse than that.
Continue reading "LTO Tape Drive Compression Considered Harmful"
The Relativity of Performance Improvements
Today, after receiving a 1.7MB daily security log message containing
thousands of ssh failed login attempts from bots around the
world, I decided I'd had enough.
I enabled IPFW on a FreeBSD system I maintain, and added a script
to find and block the offending IP addresses.
In the process I improved the script's performance.
The results of the improvement were unintuitive.
Continue reading "The Relativity of Performance Improvements"
The Memory Savings of Shared Libraries
A recent thread in the
FreeBSD ports mailing list
discusses the benefits and drawbacks of static builds.
How can we measure the memory savings of shared libraries?
Continue reading "The Memory Savings of Shared Libraries"
Handling Traffic
In an earlier blog entry
I described ACM's imaginative way to handle web site downtime.
Today I noticed another web site that
uses an equally imaginative (and low-tech) way to handle excessive web traffic.
Continue reading "Handling Traffic"
The Treacherous Power of Extended Regular Expressions
I wanted to filter out lines containing the word "line" or a double quote
from a 1GB file.
This can be easily specified as an extended regular expression,
but it turns out that I got more than I bargained for.
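For concreteness, a filter of this kind might be written as follows. This is only a sketch: the function name drop_lines is hypothetical, and the pattern matches "line" anywhere in a line rather than strictly as a whole word.

```shell
#!/bin/sh
# Hypothetical helper: print only the lines that contain neither
# the string "line" nor a double quote.
drop_lines() {
    # -E: extended regular expression, so | means alternation
    # -v: invert the match, printing the lines that do NOT match
    grep -E -v 'line|"' "$@"
}
```

With no arguments the function reads standard input; given file names, it filters those files instead.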
Continue reading "The Treacherous Power of Extended Regular Expressions"
Location-Based Dictionary Attacks
I get daily security reports from the hosts I manage.
Typically these contain invalid user attempts for users like
guest, www, and root.
(Although FreeBSD doesn't allow remote logins for root,
I was surprised to find out that many Linux distributions allow them.)
Continue reading "Location-Based Dictionary Attacks"
Handling Downtime
Ideally web sites should be up on a 24 by 7 basis.
This is, however, a difficult and often expensive proposition.
Today I saw on the ACM Portal site
an innovative alternative.
Continue reading "Handling Downtime"
Breaking into a Virtual Machine
Say you're running your business on a rented
virtual private server.
How secure is your setup?
I wouldn't expect it to be more secure than the system your server runs
on, and a simple experiment confirmed it.
Continue reading "Breaking into a Virtual Machine"
I Spy
Knowledge is power.
—Sir Francis Bacon
The ultimate source of truth regarding a program is its execution. When a program runs, everything comes to light: correctness, CPU and memory utilization, even interactions with buggy libraries, operating systems, and hardware. Yet, this source of truth is also fleeting, rushing into oblivion to the tune of billions of instructions per second. Worse, capturing that truth can be a tricky, tortuous, or downright treacherous affair.
Continue reading "I Spy"
A Humbling Upgrade
Yesterday I upgraded one of the servers I maintain from
FreeBSD 4.11, which had reached its
end of life, to the latest
production release 6.2.
It was a humbling experience.
Continue reading "A Humbling Upgrade"
Software Rejuvenation is Counterproductive
In the February issue of Computer magazine
Grottke and Trivedi propose four strategies for
fighting bugs that are difficult to detect and reproduce.
Retrying an operation and replicating software are indeed time-honored and practical
solutions. When coupled with appropriate logging, they may allow an
application to continue functioning, while also alerting its maintainers
that something is amiss. On the other hand, the proposal to restart
applications at regular intervals (rejuvenation, as the authors call
it) doesn't allow us to find latent bugs, sweeping them instead under
the carpet. This lowers the bar on the quality we expect from software,
and will doubtless result in a higher density of bugs and increasingly
complicated failure modes.
Continue reading "Software Rejuvenation is Counterproductive"
Open Source and Professional Advancement
Doing really first-class work, and knowing it, is as good as wine, women (or men) and song put together.
— Richard Hamming
I recently participated in an online discussion regarding the advantages of the various certification programs. Some voiced skepticism regarding how well one can judge a person's knowledge through answers to narrowly framed multiple choice questions. My personal view is that the way a certification's skills are examined is artificial to the point of uselessness. In practice I often find solutions to problems by looking for answers on the web. Knowing where and how to search for an answer is becoming the most crucial problem-solving skill, yet typical certification exams still test rote learning. Other discussants suggested that certification was a way to enter into a job market where employers increasingly asked for experience in a specific technology. My reaction to that argument was that open source software development efforts offer us professionals a new and very valuable way to obtain significant experience in a wide range of areas. In this column I'll describe how we can advance professionally by contributing to open source projects.
Continue reading "Open Source and Professional Advancement"
What Can System Administrators Learn from Programmers?
Although we often hear about program bugs and techniques to get
rid of them, we seldom see a similar focus in the field of system
administration.
This is unfortunate, because increasingly the reliability of an IT system
depends as much on the software comprising the system as on the support
infrastructure hosting it.
Continue reading "What Can System Administrators Learn from Programmers?"
Efficient Human Multitasking
I sometimes hear colleagues complaining that they can't get anything done,
because they have too many tasks in their head.
I've found that in order to increase the efficiency of my work
I need a moderately large selection of pending tasks.
This allows me to match the type of work I can do at a given moment
with a task in the best possible way.
Continue reading "Efficient Human Multitasking"
Batch Files as Shell Scripts
Although the Unix Bourne shell offers a superb environment for combining
existing commands into sophisticated programs, using a Unix shell
as an interactive command environment under Windows can be painful.
Continue reading "Batch Files as Shell Scripts"
Surprising Findings on Software Reuse
Kevin DeSouza and his colleagues in a recent
article in the
Communications of the ACM published some surprising
findings regarding software reuse:
reuse is practiced more by novices than by experts,
within projects rather than across them, and in
transient teams rather than permanent ones.
The statement regarding the higher propensity of rookies to reuse
compared to older professionals rang particularly true to my ears.
Continue reading "Surprising Findings on Software Reuse"
Project Asset Portability
It's said that real computer scientists don't program in assembler; they don't write in anything less portable than a number two pencil. Joking aside, at the end of the 1970s, the number of nonstandard languages and APIs left most programs tied to a very specific and narrow combination of software and hardware. Entire organizations were locked in for life to a specific vendor, unable to freely choose the hardware and software where their code and data would reside. Portability and vendor independence appeared to be a faraway, elusive goal.
Continue reading "Project Asset Portability"
Hard Disk Failure
I tell everybody that the question is not whether your hard drive
will fail, but when it will fail.
My laptop's drive started emitting a loud grinding sound last Saturday.
Continue reading "Hard Disk Failure"
System administration stories: The Revolt
Can a small embedded system the size of a paperback
lead a group of machines into revolt?
Continue reading "System administration stories: The Revolt"
Detective Work and Dropped TCP Connections
I had problems with TCP connections (mostly long-lasting ssh sessions)
getting dropped on my ADSL line.
In the end, I found that the problem had two different roots.
The detective work behind establishing them is, I believe, interesting.
It also shows how accessible source code, and the will to use it,
can be a tremendous help in solving difficult system administration problems.
Continue reading "Detective Work and Dropped TCP Connections"
Optimizing ppp and Code Quality
While debugging a problem with my ppp connection I noticed that
ppp was apparently doing a protocol lookup (with a file open,
read, close sequence) for every packet it read.
This is an excerpt from the log of strace, one of my
favourite debugging tools.
Continue reading "Optimizing ppp and Code Quality"