blog dds

2017.09.15

Who are the Publishers of Computer Science Research?

To answer this question, I downloaded the DBLP database and used the DOI publisher prefix of each publication to determine its publisher. I grouped the 3.4 million entries by publisher and joined the numeric prefixes with the publisher names available in the list of Crossref members. Based on these data, here is a pie chart of the major publishers of computer science research papers.
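The grouping step described above can be sketched roughly as follows. This is a minimal illustration, not the script actually used: the function names, the sample DOIs, and the small prefix-to-name table are assumptions made for the example, whereas the real analysis drew on the full DBLP data and the Crossref member list.

```python
from collections import Counter

# Resolver URL forms that may precede the bare DOI (assumed input variations).
RESOLVERS = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
)


def doi_prefix(doi):
    """Return the publisher prefix of a DOI (the part before the first
    slash, e.g. '10.1145'), or None if the string is not a DOI."""
    doi = doi.strip()
    for resolver in RESOLVERS:
        if doi.lower().startswith(resolver):
            doi = doi[len(resolver):]
            break
    prefix, sep, _ = doi.partition("/")
    if not sep or not prefix.startswith("10."):
        return None
    return prefix


def count_by_publisher(dois, prefix_names):
    """Count entries per publisher, joining each prefix with its name.
    Prefixes missing from prefix_names are reported under the raw prefix."""
    counts = Counter()
    for doi in dois:
        prefix = doi_prefix(doi)
        if prefix is None:
            continue  # skip entries without a usable DOI
        counts[prefix_names.get(prefix, prefix)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample_dois = [
        "10.1145/example.1",
        "10.1145/example.2",
        "https://doi.org/10.1109/example.3",
    ]
    sample_names = {"10.1145": "ACM", "10.1109": "IEEE"}
    for publisher, n in count_by_publisher(sample_dois, sample_names).most_common():
        print(publisher, n)
```

Falling back to the raw prefix when no name is known keeps unmatched publishers visible in the totals rather than silently dropping them.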

Continue reading "Who are the Publishers of Computer Science Research?"

2008.05.24

Interoperability, at Last

Language is a very powerful way to describe behavior. Therefore, even when I create pictures, instead of dragging my mouse around, I use declarative tools like GraphViz, gnuplot, and UMLGraph. These allow me to describe what I want to draw, instead of how I want the end result to look. The truth, however, is that the end results are not always perfect. Today I realized that the state of the art has advanced to the point where I can create the drawing declaratively, and then visually polish the final drawing.

Continue reading "Interoperability, at Last"

2008.03.01

Using and Abusing XML

Words are like leaves; and where they most abound, Much fruit of sense beneath is rarely found.

— Alexander Pope

I was recently gathering GPS coordinates and cell identification data, researching the algorithms hiding behind Google’s “My Location” facility. While working on this task, I witnessed the great interoperability benefits we get from XML. With a simple 140-line script, I converted the data I gathered into a de facto standard, the XML-based GPS-exchange format called GPX. Then, using a GPS-format converter, I converted my data into Google Earth’s XML data format. A few mouse clicks later, I had my journeys and associated cell tower switchovers beautifully superimposed on satellite pictures and maps.
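To give a feel for the conversion step, here is a minimal sketch of writing a GPX track. It is not the 140-line script mentioned above: the `to_gpx` function, the `(lat, lon, timestamp)` input layout, the creator string, and the choice of GPX version 1.1 are all assumptions made for illustration, and the cell identification data is left out.

```python
import xml.etree.ElementTree as ET

# Namespace of the GPX 1.1 schema (the version is an assumption here).
GPX_NS = "http://www.topografix.com/GPX/1/1"


def to_gpx(points, creator="example-script"):
    """Serialize (lat, lon, ISO 8601 timestamp) tuples as a single-track
    GPX document and return it as a string."""
    ET.register_namespace("", GPX_NS)
    gpx = ET.Element(f"{{{GPX_NS}}}gpx", {"version": "1.1", "creator": creator})
    trk = ET.SubElement(gpx, f"{{{GPX_NS}}}trk")
    seg = ET.SubElement(trk, f"{{{GPX_NS}}}trkseg")
    for lat, lon, timestamp in points:
        # Each fix becomes a track point with its coordinates as attributes.
        pt = ET.SubElement(
            seg,
            f"{{{GPX_NS}}}trkpt",
            {"lat": f"{lat:.6f}", "lon": f"{lon:.6f}"},
        )
        ET.SubElement(pt, f"{{{GPX_NS}}}time").text = timestamp
    return ET.tostring(gpx, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    print(to_gpx([
        (37.983800, 23.727500, "2008-03-01T10:00:00Z"),
        (37.984100, 23.728000, "2008-03-01T10:00:30Z"),
    ]))
```

Because the output follows a published schema, any GPX-aware converter can pick it up from there, which is the interoperability benefit the paragraph describes.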

Continue reading "Using and Abusing XML"

2008.02.21

Backwards Compatibility in Office Open XML

As a member of my country's national standards body committee on electronic data processing, I have lately spent considerable time deliberating what our position should be in the upcoming Office Open XML ISO Ballot Resolution Meeting in Geneva. My biggest objection concerns large parts of the standard that are proposed to live in an Annex containing normative descriptions of deprecated features that will only be used by existing binary documents. The rationale behind this decision is backwards compatibility. My opinion is that this solution is counterproductive for a number of reasons.

Continue reading "Backwards Compatibility in Office Open XML"

2007.03.15

Make vs Ant: Observability

I've long felt uncomfortable with ant as a build management tool. I thought that my uneasiness stemmed from the verbose XML used for describing tasks, and the lack of default dependency resolution. Today, email from a UMLGraph user struggling with a complex ant task made me realize another problem: lack of observability.

Continue reading "Make vs Ant: Observability"

2006.04.13

Xerces v Flex

What is the fastest way to process an XML file? I was faced with this question when I recently wanted to process a 452GiB XML file; for this amount of data, speed matters. Some obvious choices were XML libraries, hand-crafted code, and lexical analyzer generators.

Continue reading "Xerces v Flex"

2005.12.07

If STL Had Been Designed by a Committee

I've been reading on XML schema, and it's embarrassingly obvious that it has been designed by a committee.

Continue reading "If STL Had Been Designed by a Committee"

2005.07.01

Tool Writing: A Forgotten Art?

Merely adding features does not make it easier for users to do things—it just makes the manual thicker. The right solution in the right place is always more effective than haphazard hacking.

— Brian W. Kernighan and Rob Pike

In 1994 Chidamber and Kemerer defined a set of six simple metrics for object-oriented programs. Although the number of object-oriented metrics swelled to above 300 in the years that followed, I had a case where I preferred to use the original classic metric set for clarity, consistency, and simplicity. Surprisingly, none of the six open-source tools I found and tried to use fitted the bill. Most tools calculated only a subset of the six metrics, some required tweaking to make them compile, others had very specific dependencies on other projects (for example Eclipse), while others were horrendously inefficient. Although none of the tools I surveyed managed to correctly calculate the six classic Chidamber and Kemerer metrics in a straightforward way, most of them included numerous bells and whistles, such as graphical interfaces, XML output, and bindings to tools like ant and Eclipse.

Continue reading "Tool Writing: A Forgotten Art?"

2005.06.23

XML Abstraction at the Wrong Level

Over the last month I've encountered two applications that use XML at the wrong level of abstraction. Instead of tailoring the schema to their needs, they use a very abstract schema, and encode their elements at a meta level within the XML data. This approach hinders the verification and manipulation of the corresponding XML files.

Continue reading "XML Abstraction at the Wrong Level"

2005.02.18

XML Versus Text Files

The JDepend package dependency analyzer can output its results either as XML or as plain text. Instead of using the XML output, I found myself processing the text output using awk. Am I becoming tied to old-world thinking, or are text files easier to process?

Continue reading "XML Versus Text Files"


Last update: Sunday, November 19, 2017 2:36 pm
Unless otherwise expressly stated, all original material on this page created by Diomidis Spinellis is licensed under a Creative Commons Attribution-Share Alike 3.0 Greece License.