blog dds: 2020-09-29 — Error handling under Unix and Windows

One thing that struck me when I first encountered the 4.3BSD Unix system call documentation in the 1980s was that each call was followed by an exhaustive list of the errors associated with it. Ten years later, when I was going through the Windows API, I was disappointed to see that very few functions documented their error conditions. This is a big deal.

Consider as an example how the open function can fail. The 4.3BSD manual page documents 23 possible error conditions. Here is the complete list.

[ENOTDIR] A component of the path prefix is not a directory.
[EINVAL] The pathname contains a character with the high-order bit set.
[ENAMETOOLONG] A component of a pathname exceeded 255 characters, or an entire path name exceeded 1023 characters.
[ENOENT] O_CREAT is not set and the named file does not exist.
[ENOENT] A component of the path name that must exist does not exist.
[EACCES] Search permission is denied for a component of the path prefix.
[EACCES] The required permissions (for reading and/or writing) are denied for the named flag.
[EACCES] O_CREAT is specified, the file does not exist, and the directory in which it is to be created does not permit writing.
[ELOOP] Too many symbolic links were encountered in translating the pathname.
[EISDIR] The named file is a directory, and the arguments specify it is to be opened for writing.
[EROFS] The named file resides on a read-only file system, and the file is to be modified.
[EMFILE] The system limit for open file descriptors per process has already been reached.
[ENFILE] The system file table is full.
[ENXIO] The named file is a character special or block special file, and the device associated with this special file does not exist.
[ENOSPC] O_CREAT is specified, the file does not exist, and the directory in which the entry for the new file is being placed cannot be extended because there is no space left on the file system containing the directory.
[ENOSPC] O_CREAT is specified, the file does not exist, and there are no free inodes on the file system on which the file is being created.
[EDQUOT] O_CREAT is specified, the file does not exist, and the directory in which the entry for the new file is being placed cannot be extended because the user’s quota of disk blocks on the file system containing the directory has been exhausted.
[EDQUOT] O_CREAT is specified, the file does not exist, and the user’s quota of inodes on the file system on which the file is being created has been exhausted.
[EIO] An I/O error occurred while making the directory entry or allocating the inode for O_CREAT.
[ETXTBSY] The file is a pure procedure (shared text) file that is being executed and the open call requests write access.
[EFAULT] Path points outside the process’s allocated address space.
[EEXIST] O_CREAT and O_EXCL were specified and the file exists.
[EOPNOTSUPP] An attempt was made to open a socket (not currently implemented).

This list allows programmers to judge which errors are likely to occur in a given situation, which can be handled, and what to do about the rest. Modern FreeBSD expands this list to 40 documented errors, while the Debian distribution of GNU/Linux documents 32.
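To make this concrete, here is a minimal sketch in C of how a documented error list can be turned into a handling policy. The enum open_action, the functions classify_open_error and open_for_reading, and the grouping of errors into recovery strategies are all hypothetical choices made for illustration; they are not part of any manual page, and a real program would pick its own grouping.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical recovery strategies; the grouping is an assumption. */
enum open_action {
    OPEN_OK,          /* the file was opened */
    OPEN_FIX_PATH,    /* bad path supplied; ask the user for another */
    OPEN_FIX_ACCESS,  /* permission or read-only problem; report it */
    OPEN_RETRY_LATER, /* resource exhaustion; may succeed after cleanup */
    OPEN_GIVE_UP      /* anything else; report and abandon the operation */
};

/* Map an errno value set by open() to one of the strategies above. */
enum open_action classify_open_error(int err)
{
    switch (err) {
    case 0:
        return OPEN_OK;
    case ENOENT:
    case ENOTDIR:
    case ENAMETOOLONG:
    case ELOOP:
    case EISDIR:
        return OPEN_FIX_PATH;
    case EACCES:
    case EROFS:
        return OPEN_FIX_ACCESS;
    case EMFILE:
    case ENFILE:
    case ENOSPC:
        return OPEN_RETRY_LATER;
    default:
        /* Errors we did not plan for (EIO, EFAULT, ...) end up here. */
        return OPEN_GIVE_UP;
    }
}

/*
 * Try to open path read-only. Returns the file descriptor on success.
 * On failure returns -1, prints a diagnostic, and stores the suggested
 * strategy in *action.
 */
int open_for_reading(const char *path, enum open_action *action)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        int saved = errno; /* save it before another call can change it */
        *action = classify_open_error(saved);
        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        return -1;
    }
    *action = OPEN_OK;
    return fd;
}
```

A caller would invoke open_for_reading(path, &action) and branch on action when the result is -1. The point is that the switch statement can be written by reading the manual page, before the program ever runs.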

In contrast, the documentation of the Windows OpenFile function only specifies that “If the function fails, the return value is HFILE_ERROR. To get extended error information, call GetLastError.” GetLastError, in turn, can return any of about 6000 error codes. Because the codes that actually apply are revealed only at runtime, the main way programmers can find out which errors they need to handle in order to make their programs more resilient is by painful trial and error.

I first identified this problem in an article I wrote in 1997, titled A Critique of the Windows Application Programming Interface. At that time there were 1130 possible error codes. As exemplified by the documentation of the Windows OpenFile function, things have not improved much in the past quarter century. Yet, there is a glimmer of hope. As I was reading the documentation of the RegQueryValueExA function, I was struck that it actually documented the reasons it could fail.

If the lpData buffer is too small to receive the data, the function returns ERROR_MORE_DATA.
If the lpValueName registry value does not exist, the function returns ERROR_FILE_NOT_FOUND.

Excitedly, I installed the 1999 MSDN Library documentation to see how it was documented back then. The error conditions were indeed missing: “ERROR_SUCCESS indicates success. A nonzero error code defined in Winerror.h indicates failure. To get a generic description of the error, call FormatMessage with the FORMAT_MESSAGE_FROM_SYSTEM flag set.” Although the error behavior of thousands of Windows functions must still be properly specified, adding such documentation is a move in the right direction. After all, the 1979 Seventh Research Edition Unix manual also lacked detailed system call error documentation, lamely arguing that “The possible error numbers are not recited with each writeup in section 2, since many errors are possible for most of the calls.” This was fixed a few years later, and we’re still enjoying the fruits of that labor.
