Who are the Publishers of Computer Science Research?
To answer this question, I downloaded the DBLP database and used the DOI publisher prefix of each publication to determine its publisher. I grouped the 3.4 million entries by publisher and joined the numeric prefixes with the publisher names available in the list of Crossref members. Based on these data, here is a pie chart of the major publishers of computer science research papers.
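The grouping step described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the post: the function names `doi_prefix` and `count_by_publisher` are hypothetical, and the small `member_names` table stands in for the much larger Crossref member list.

```python
from collections import Counter


def doi_prefix(doi):
    """Return the registrant prefix of a DOI ('10.1145' for '10.1145/1.2'), or None."""
    doi = doi.strip()
    # Strip common resolver or scheme forms so that only the bare DOI remains.
    for lead in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.lower().startswith(lead):
            doi = doi[len(lead):]
            break
    prefix, sep, _suffix = doi.partition("/")
    if not sep or not prefix.startswith("10."):
        return None  # Not a well-formed DOI
    return prefix


def count_by_publisher(dois, member_names):
    """Count DOIs per publisher name, falling back to the raw prefix when unknown."""
    counts = Counter()
    for doi in dois:
        prefix = doi_prefix(doi)
        if prefix is None:
            continue
        counts[member_names.get(prefix, prefix)] += 1
    return counts


# Illustrative sample data (assumed, not taken from the DBLP dump).
member_names = {"10.1145": "ACM", "10.1109": "IEEE"}
sample = ["10.1145/1", "https://doi.org/10.1109/x", "10.1145/2", "not-a-doi"]
print(count_by_publisher(sample, member_names).most_common())
```

Falling back to the raw prefix keeps unmatched entries visible in the totals instead of silently dropping them.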
Continue reading "Who are the Publishers of Computer Science Research?"
Interoperability, at Last
Language is a very powerful way to describe behavior.
Therefore, even when I create pictures, instead of dragging around my mouse,
I use declarative tools like GraphViz.
These allow me to describe what I want to draw, instead of
how I want the end result to look.
The truth, however, is that the end results are not always perfect.
Today I realized that the state of the art has advanced to the point
where I can create the drawing declaratively, and then visually
polish the final drawing.
Continue reading "Interoperability, at Last"
Using and Abusing XML
Words are like leaves; and where they most abound,
Much fruit of sense beneath is rarely found.
— Alexander Pope
I was recently gathering GPS coordinates and cell identification data, researching the algorithms hiding behind Google’s “My Location” facility.
While working on this task, I witnessed the great interoperability benefits we get from XML. With a simple 140-line script, I converted the data I gathered into a de facto standard, the XML-based GPS-exchange format called GPX. Then, using a GPS-format converter, I converted my data into Google Earth’s XML data format. A few mouse clicks later, I had my journeys and associated cell tower switchovers beautifully superimposed on satellite pictures and maps.
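To give a feel for what such a conversion involves, here is a minimal sketch that writes a list of positions as a GPX 1.1 track. It is not the 140-line script mentioned above: the function name `to_gpx`, the `creator` value, and the sample coordinates are assumptions for illustration, and the cell identification data is left out.

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"


def _tag(name):
    """Return the namespace-qualified tag name that ElementTree expects."""
    return f"{{{GPX_NS}}}{name}"


def to_gpx(points, creator="example-script"):
    """Serialize (latitude, longitude, ISO 8601 time) tuples as one GPX track segment."""
    # Emit GPX as the default namespace rather than an auto-generated ns0 prefix.
    ET.register_namespace("", GPX_NS)
    gpx = ET.Element(_tag("gpx"), {"version": "1.1", "creator": creator})
    trk = ET.SubElement(gpx, _tag("trk"))
    seg = ET.SubElement(trk, _tag("trkseg"))
    for lat, lon, timestamp in points:
        pt = ET.SubElement(seg, _tag("trkpt"),
                           {"lat": f"{lat:.6f}", "lon": f"{lon:.6f}"})
        ET.SubElement(pt, _tag("time")).text = timestamp
    return ET.tostring(gpx, encoding="unicode")


# Made-up sample fixes, two points five seconds apart.
print(to_gpx([
    (37.9838, 23.7275, "2008-01-01T10:00:00Z"),
    (37.9840, 23.7280, "2008-01-01T10:00:05Z"),
]))
```

Once the data is in GPX, any converter that understands the format can take it from there, which is the interoperability benefit the paragraph describes.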
Continue reading "Using and Abusing XML"
Backwards Compatibility in Office Open XML
As a member of my country's
national standards body
committee on electronic data processing, I lately spend considerable time
deliberating what our position should be in the upcoming Office Open XML
ISO Ballot Resolution Meeting in Geneva.
My biggest objection concerns large parts of the standard that
are proposed to live in an Annex containing normative descriptions of
deprecated features that will only be used by existing binary documents.
The rationale behind this decision is backwards compatibility.
My opinion is that this solution is counterproductive for a number
Continue reading "Backwards Compatibility in Office Open XML"
Make vs Ant: Observability
I've long felt uncomfortable with ant
as a build management tool.
I thought that my uneasiness stemmed from the verbose XML used for
describing tasks, and the lack of default dependency resolution.
Today, email from a UMLGraph user
struggling with a complex ant task
made me realize another problem:
lack of observability.
Continue reading "Make vs Ant: Observability"
Xerces v Flex
What is the fastest way to process an XML file?
I was faced with this question when I recently wanted to
process a 452GiB XML file; for this amount of data, speed matters.
Some obvious choices were XML libraries, hand-crafted code, and
lexical analyzer generators.
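As a point of reference for the library option, here is a minimal sketch of streaming XML processing with Python's standard library. It is neither of the approaches compared in the post; the function name `count_tags` and the tiny in-memory document are assumptions for illustration.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(source):
    """Count elements by tag name while parsing the input incrementally."""
    counts = Counter()
    # An "end" event fires once an element and all of its children are parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] += 1
        # Drop the element's text, attributes, and children to limit memory use.
        elem.clear()
    return counts


# A small in-memory document stands in for a large file on disk.
sample = io.BytesIO(b"<a><b/><b/><c>text</c></a>")
print(count_tags(sample))
```

Clearing each element after use keeps its content from accumulating, although the emptied elements remain attached to the root, so for very large inputs the root should be cleared periodically as well.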
Continue reading "Xerces v Flex"
If STL Had Been Designed by a Committee
I've been reading on XML schema, and it's embarrassingly obvious
that it has been designed by a committee.
Continue reading "If STL Had Been Designed by a Committee"
Tool Writing: A Forgotten Art?
Merely adding features does not make it easier for users to do things—it just makes the manual thicker. The right solution in the right place is always more effective than haphazard hacking.
— Brian W. Kernighan and Rob Pike
In 1994 Chidamber and Kemerer defined a set of six simple metrics for object-oriented programs. Although the number of object-oriented metrics swelled to above 300 in the years that followed, I had a case where I preferred to use the original classic metric set for clarity, consistency, and simplicity. Surprisingly, none of the six open-source tools I found and tried to use fitted the bill. Most tools calculated only a subset of the six metrics, some required tweaking to make them compile, others had very specific dependencies on other projects (for example Eclipse), while others were horrendously inefficient. Although none of the tools I surveyed managed to calculate correctly the six classic Chidamber and Kemerer metrics in a straightforward way, most of them included numerous bells and whistles, such as graphical interfaces, XML output, and bindings to tools like ant and Eclipse.
Continue reading "Tool Writing: A Forgotten Art?"
XML Abstraction at the Wrong Level
Over the last month I've encountered two applications
that use XML at the wrong level of abstraction.
Instead of tailoring the schema to their needs, they
use a very abstract schema, and encode their elements
at a meta level within the XML data.
This approach hinders the verification and manipulation of the corresponding
Continue reading "XML Abstraction at the Wrong Level"
XML Versus Text Files
package dependency analyzer can output its results
either as XML or as plain text.
Instead of using the XML output,
I found myself processing the text output using awk.
Am I becoming tied to old-world thinking,
or are text files easier to process?
Continue reading "XML Versus Text Files"