Another cry for releasing scientific code

written by Trevor on February 09, 2010, at 06:55 PM

In a previous post I voiced my concern that scientific researchers do not have to provide the source code that produces their results. Now, Darrel Ince, Professor of Computing at The Open University, has written an article for the Guardian also calling for scientists to release their source code. Slashdot covered the story this morning.

Ince argues this as part of the fallout of "climategate," the incident in which leaked emails prompted accusations that climate researchers had been fudging scientific data to produce desired results. It is a compelling example, but Ince goes on to say that "the situation is by no means bad across academia." He cites the example of the four colour theorem from graph theory, which was proved partly by a computer program. The program was published and highly scrutinized, and eventually found to be correct.

However, not all results in mathematics and computer science are that tidy. In his PhD thesis, Terry Stewart notes one result that, to this day, has not been reproduced.

SHRDLU

SHRDLU was a natural language understanding system developed from 1968 to 1970 by Terry Winograd at MIT. This 40-year-old system produces some very interesting interactions, and was one of the first great results in artificial intelligence research. The following are two snippets from a demo found at Winograd's website:

Person: PICK UP A BIG RED BLOCK.
Computer: OK. (does it)
Person: GRASP THE PYRAMID.
Computer: I DON'T UNDERSTAND WHICH PYRAMID YOU MEAN.
Person: FIND A BLOCK WHICH IS TALLER THAN THE ONE YOU ARE HOLDING AND PUT IT INTO THE BOX.
Computer: BY "IT", I ASSUME YOU MEAN THE BLOCK WHICH IS TALLER THAN THE ONE I AM HOLDING.
Computer: OK. (does it)
Person: WHAT DOES THE BOX CONTAIN?
Computer: THE BLUE PYRAMID AND THE BLUE BLOCK.
Person: WHAT IS THE PYRAMID SUPPORTED BY?
Computer: THE BOX.

...

Person: HAD YOU TOUCHED ANY PYRAMID BEFORE YOU PUT THE GREEN ONE ON THE LITTLE CUBE?
Computer: YES, THE GREEN ONE.
Person: WHEN DID YOU PICK IT UP?
Computer: WHILE I WAS STACKING UP THE RED CUBE, A LARGE RED BLOCK AND A LARGE GREEN CUBE.
Person: WHY?
Computer: TO GET RID OF IT.
Person: WHY DID YOU DO THAT?
Computer: TO CLEAN OFF THE RED CUBE.
Person: WHY DID YOU CLEAR OFF THAT CUBE?
Computer: TO PUT IT ON A LARGE GREEN CUBE.
Person: WHY DID YOU DO THAT?
Computer: TO STACK UP THE RED CUBE, A LARGE RED BLOCK AND A LARGE GREEN CUBE.
Person: WHY DID YOU DO THAT?
Computer: BECAUSE YOU ASKED ME TO.

That is crazy impressive. The question is: why haven't we seen systems like it in common use today? A page maintained by Semaphore Corp. that provides source code for an implementation of SHRDLU gives some insight:

The current code isn't capable of completely reproducing the classic demo dialog and is fairly brittle and easily crashable, but it does correctly handle a large portion of the classic input sentences and many reasonable variations.

Forty years of natural language research and vastly more powerful computers have still not managed to reproduce SHRDLU's 1970 results. How is this possible?

One of Winograd's students, Dave McDonald, notes, "In the rush to get [SHRDLU] ready for his thesis defense [Winograd] made some direct patches to the Lisp assembly code and never back propagated them to his Lisp source... We kept around the very program image that [Winograd] constructed and used it whenever we could."

Back in those days, operating systems differed widely from one computer to the next, so as changes were made to the operating system of the computer Winograd and his students were using, the program became less robust (a phenomenon known as software rot).

So, even though the source code exists in some form, the original results cannot be replicated, both because that source does not include Winograd's direct patches and because the platform the code was created for no longer exists.

The lesson

What should be taken from this example is that mathematics and computer science still suffer from the same problem of irreproducible results. Ever-changing platforms are less of a problem now -- a MATLAB program produced on one computer should run the same on any other computer -- but source code is rarely shared as part of the publication process.

I think it should be. But for that to happen we need a central repository to store code and meaningfully link it to the papers that use it.

This repository should have the same amount of peer review and, therefore, authority that scientific journals have now. Maybe that can happen by existing journals adding the ability to link code to a paper (and enforcing that any code used to generate results is included), or maybe a new organization has to rise to the challenge (I would love to see code.arxiv.org).
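
To make the idea of linking code to papers a little more concrete, here is a rough Python sketch of what one entry in such a repository might hold. Nothing here describes a real service: the names CodeRecord and Repository, the identifier format, and the rule that a submission must name the results it generates are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeRecord:
    """One hypothetical repository entry linking a paper to its code."""
    paper_id: str         # identifier of the paper (made-up format)
    code_archive: str     # name of the uploaded source archive
    archive_sha256: str   # checksum so readers can verify their download
    results: tuple = ()   # figures/tables in the paper this code generates


class Repository:
    """In-memory stand-in for the central repository (illustration only)."""

    def __init__(self):
        self._by_paper = {}

    def submit(self, record):
        # Loose stand-in for enforcement: refuse entries that do not say
        # which published results the code is supposed to generate.
        if not record.results:
            raise ValueError("record must name the results the code generates")
        self._by_paper.setdefault(record.paper_id, []).append(record)

    def code_for(self, paper_id):
        # Return a copy so callers cannot mutate the stored list.
        return list(self._by_paper.get(paper_id, []))
```

A journal could play the role of Repository just as well as a new organization could; the only point of the sketch is that the link between a paper and its code is stored as data that can be checked, rather than as a sentence in the paper.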

Already I can hear the outcry of scientists claiming that their code is "sloppy" and "not ready to be released," but those concerns are simply irrelevant: all that matters is that the code produces the output cited in the paper given the input cited in the paper. That's it. If another researcher finds your result interesting, then let it be up to them to wade through your code -- it's probably still way better than trying to reproduce your result based on the prose that describes your algorithm.
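
The "cited input in, cited output out" standard is simple enough to check mechanically. Below is a minimal Python sketch of such a check. It is only an illustration under stated assumptions: mean_analysis is a made-up stand-in for a paper's real code, and comparing SHA-256 digests of JSON-encoded output is one possible way to define "the same output," not an established convention.

```python
import hashlib
import json


def digest(obj):
    """Return a SHA-256 hex digest of a JSON-serializable object."""
    # sort_keys gives a canonical encoding, so equal data hashes equally.
    encoded = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def reproduces(analysis, cited_input, cited_output_digest):
    """True if running the analysis on the cited input gives the cited output."""
    return digest(analysis(cited_input)) == cited_output_digest


# Hypothetical stand-in for a paper's analysis code: a plain mean.
def mean_analysis(samples):
    return {"n": len(samples), "mean": sum(samples) / len(samples)}


cited_input = [2, 4, 6, 8]
# In practice the authors would record this digest when they publish.
cited_digest = digest(mean_analysis(cited_input))

print(reproduces(mean_analysis, cited_input, cited_digest))
```

Note that the check says nothing about whether the code is clean, only whether it reproduces the published numbers. An exact digest is also deliberately strict; results involving floating-point arithmetic might need a comparison with a tolerance instead.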

Comments

  1. By Trevor, on February 10, 2010, at 03:51 PM
    Yet another related article can be found on Ars Technica: Keeping computers from ending science's reproducibility (via Terry Stewart)
