
Call For Scientific Research Code To Be Released

Soulskill posted more than 4 years ago | from the but-then-people-will-see-how-awful-it-is dept.

Programming 505

Pentagram writes "Professor Ince, writing in the Guardian, has issued a call for scientists to make the code they use in the course of their research publicly available. He focuses specifically on the topical controversies in climate science, and concludes with the view that researchers who are able but unwilling to release programs they use should not be regarded as scientists. Quoting: 'There is enough evidence for us to regard a lot of scientific software with worry. For example Professor Les Hatton, an international expert in software testing resident in the Universities of Kent and Kingston, carried out an extensive analysis of several million lines of scientific code. He showed that the software had an unacceptably high level of detectable inconsistencies. For example, interface inconsistencies between software modules which pass data from one part of a program to another occurred at the rate of one in every seven interfaces on average in the programming language Fortran, and one in every 37 interfaces in the language C. This is hugely worrying when you realise that just one error — just one — will usually invalidate a computer program. What he also discovered, even more worryingly, is that the accuracy of results declined from six significant figures to one significant figure during the running of programs.'"
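The precision-decay effect Hatton describes is easy to demonstrate in miniature. The following is an illustrative sketch (a made-up example, not Hatton's actual test code) showing how naive single-precision accumulation, common in older Fortran and C codes, silently erodes significant figures:

```python
import numpy as np

# Naive accumulation in single precision (float32), in the style of
# legacy scientific code: add 0.1 one million times.
naive = np.float32(0.0)
for _ in range(1_000_000):
    naive += np.float32(0.1)

exact = 100000.0  # the true sum
rel_err = abs(float(naive) - exact) / exact
print(f"naive float32 sum: {float(naive):.2f}")
print(f"relative error:    {rel_err:.1e}")
```

The printed sum looks plausible at a glance, but only its leading digits are trustworthy; the rest is accumulated rounding error, which is exactly the kind of silent degradation being described.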


Seems reasonable (4, Insightful)

NathanE (3144) | more than 4 years ago | (#31071878)

Particularly if the research is publicly funded.

Re:Seems reasonable (5, Insightful)

fuzzyfuzzyfungus (1223518) | more than 4 years ago | (#31072286)

The "the public deserves access to the research it pays for" position seems so self-evidently reasonable that further debate is simply unnecessary (though, unfortunately, the journal publishers have a strong financial interest in arguing the contrary, so the "debate" actually continues, against all reason). Similarly, the idea that software falls somewhere in the "methods" section and is as deserving of peer review as any other part of the research seems wholly reasonable. Again, I suspect that getting at the bits written by scientists, with the possible exception of those working in fields (oil geology, drug development, etc.) that also have lucrative commercial applications, will mainly be a matter of developing norms and mechanisms around releasing code. Academic scientists are judged, promoted, and respected largely according to how much (and where) they publish, so getting them to publish more probably won't be the world's hardest problem. The more awkward bit is that large amounts of modern scientific instrumentation, and some analysis packages, include giant chunks of closed-source software that is also worth serious cash. You can absolutely forget getting a BSD/GPL release, and even a "no commercial use, all rights reserved, for review only, mine, not yours" code release will be like pulling teeth.

On the other hand, I suspect some of this hand-wringing is little more than special pleading. "This is hugely worrying when you realise that just one error — just one — will usually invalidate a computer program." Right. I know that I definitely live in the world where all my important stuff: financial transactions, recordkeeping, product design, and so forth, is carried out by zero-defect programs, delivered to me over the internet by routers with zero-defect firmware, and rendered by a variety of endpoint devices running zero-defect software on zero-defect OSes. Yup, that's exactly how it works. Outside of hyper-expensive embedded stuff, military avionics, landing-gear firmware, and FDA-approved embedded medical widgets (that still manage to Therac people from time to time), zero-defect is pure fantasy. A very pleasant fantasy, to be sure, but still fantasy. The revelation that several million lines of code, in a mixture of Fortran and C, most likely written under time and budget constraints, isn't exactly a paragon of code quality seems utterly unsurprising, and utterly unrestricted to scientific areas. Code quality is definitely important, and science has to deal with the fact that software errors have the potential to make a hash of its data; but science seems to attract a whole lot more hand-wringing when its conclusions are undesirable...

Re:Seems reasonable (1)

upmufa (702569) | more than 4 years ago | (#31072746)

In the case of publicly-funded research, I'd first like to compel scientists to publish their work in a free-to-read format. Many journals do not make the articles available to the public. It seems a little wrong that public money goes to pay for research and then the results of the research -- the journal articles -- aren't available to the public.

Re:Seems reasonable (0, Flamebait)

kenp2002 (545495) | more than 4 years ago | (#31072872)

The public doesn't fund any research, you insensitive clod; the government does. It's the government's money, not yours...

Seriously, when in any debate with a politician have you heard "We need to be careful how we spend THEIR money..."? Never; it is always "OUR money".

Now go pay homage to dear leader! OBEY!

Why release it? (-1, Flamebait)

Montezumaa (1674080) | more than 4 years ago | (#31071884)

I mean, it is just easier to keep lying to people to get them to believe what you want them to believe. It works for politicians.

More to the point, people increasingly don't (4, Insightful)

aussersterne (212916) | more than 4 years ago | (#31071940)

seem to understand the very idea of scientific methods or processes, or the reasoning behind empiricism and careful management of precision.

It's a failure of education, not so much in science education, I think, as in philosophy: formal and informal logic, epistemology and ontology, etc. People appear increasingly unable to understand why any of this matters; they essentialize the "answer" as always "true" for any given process that can be described, so science becomes an act of creativity by which one tries to create a cohesive narrative of process that arrives at the desired result. If it has no intrinsic breaks or obvious discontinuities, it must be true.

If another study that contradicts it also suffers from no breaks or discontinuities, they're both true! After all, everyone gets to decide what's true in their own heart!

Re:More to the point, people increasingly don't (4, Insightful)

bsDaemon (87307) | more than 4 years ago | (#31072130)

I think a lot of it has to do not just with failures in education, but also with the way science (in particular, but everything in general) is reported in the media. One week a study saying coffee will kill you gets reported; a couple of days later, a story saying another study shows coffee will make you immortal is reported, both presented with equal conviction, neither with expert commentary or perspective. C+ students who look good on camera banter back and forth about it, laughing jocularly, and their own dismissiveness and misunderstanding get perpetuated to their viewers.

It's come to the point where many, many people just dismiss the whole business of science. "They can't even make up their minds!" they say, as if the point of science is to make up one's mind. Of course, this is where the failure of education to actually educate comes into play. Classical liberalism has been turned over, spanked and made into the servant of corporate mercantilism, and we're all just now supposed to sit down and shut up. Science is, in its essence, a libertarian (note small 'l') pursuit through which one questions all authority, up to and including the fabric of existence itself -- all assumptions are out the window, and any that cannot pass muster is done away with.

But, just like socio-political anarchism (libertarian socialism), the spirit of rebellion and anti-authoritarianism inherent in science has been packaged and sold in a watered-down, safe-for-children package at the local shopping mall, only to be taken out of the box when the powers that be feel they can use it for their own purposes. Not to be a downer or anything; it's just that I really do think this is bigger than just science. It's to do with people willingly leading themselves as sheep to the slaughter on behalf of the farmer, to make the dog's job easier.

Re:More to the point, people increasingly don't (0)

Anonymous Coward | more than 4 years ago | (#31072600)

I loved that post. But I would add that classical liberalism and Austrian economics are based on the old rationalist tradition and deductive reasoning, whereas modern science is based on empiricism and inductive reasoning.

On the other hand, Einstein did say this:

"There is no inductive method which could lead to the fundamental concepts of physics. [...] In error are those theorists who believe that theory comes inductively from experience."
– Albert Einstein, The Method of Theoretical Physics, Oxford, 1933

And I would assume that, though Edison said genius is 99% perspiration, Tesla would probably agree more with Einstein on the importance of imagination, and reason itself rather than grueling empiricism.

Reality is incompatible with academia. (0)

Anonymous Coward | more than 4 years ago | (#31072206)

People understand the theory behind science just fine. The problem, however, is that it is nothing but theory. And theory, like most things in academia, only works properly under a highly controlled "reality".

In the real world, people need to eat. People need vehicles. People need clothing. People feel the need for luxury items. To get these things, people need money. To get money, the vast majority of people need to do something of value for somebody who already has money.

When it comes to scientists, they need to get their funding. Much of that funding these days comes from corporations. Corporations are often in the business of fucking over other people. Poor science often helps them achieve this. Who provides this lousy science? Scientists, of course.

But wait, you'll say that some scientists get funding from the government. That is true. But keep in mind that in most Western nations, the governments are selected by the "people" from a slate of candidates funded by a small number of corporations or industry groups. So even the scientists who get their funding from politicians end up having to create "research" that fulfills the needs of the corporations funding the politicians.

Price of technocracy (1)

tjstork (137384) | more than 4 years ago | (#31072228)

It's the price of a society that doesn't actually value the liberal arts along with the technology. Studying the Greeks and Romans matters, and you need to be a well-rounded thinker.

Don't use that word (1, Insightful)

Anonymous Coward | more than 4 years ago | (#31072400)

...so science becomes an act of creativity by which one tries to create a cohesive narrative of process that arrives at the desired result.

As someone who listens to Talk Radio on occasion, that sounds like you're creating a work of fiction. Rush and Hannity would have a whole week of shows based on that statement.

I would put it more like "piecing the narrative from the evidence" or "from facts" or something like that.

Scientists need to realize that if they're going to get public support, they really need to be very careful with their choice of wording. Like it or not, the scare mongers, and I mean scare mongers in the sense that there are people who are trying to scare folks into believing that Global Warming is some sort of wealth redistribution scheme by the socialists, are going to use any hint, real or not, that scientists are making up their findings.

Re:Don't use that word (3, Interesting)

harvey the nerd (582806) | more than 4 years ago | (#31072780)

Real scientists don't use simulators with incomplete equations and fudge factors to match highly manipulated historical data to "prove" their case with game machines that have no predictive capability or other external validation. That simply is not the way you build a valid fundamentals-based model starting from the equations of motion. IPCC reports previously noted whole terms in the equations' energy budget that were inadequately described or represented, yet no research has been done to fill in those terms; modellers just zero them out or put in small constants for significant *variables*. These are not real scientists; their processes and practices have been clearly shown to be antithetical to valid science.

These models are just primitive speculative tools, often reflecting personal biases in data selection and derivation, NOT fundamental equations. The models are NOT valid physics data or experiments.

On prediction failure, Hansen's 1988 "A,B,C" forecasts of rising temperature are rapidly diverging from the cooling we are actually experiencing right now, where case C assumed we massively limited CO2 also. Missed the side of a barn with a shotgun, tsk, tsk, tsk.

Re:More to the point, people increasingly don't (1)

monoqlith (610041) | more than 4 years ago | (#31072520)

I'm with you that people don't seem to understand the motivation for empiricism or the scientific method, but I think that's an overly complex explanation of why.

Science *is* somewhat an act of creativity, but in a different respect. In order to explain observation, one has to creatively intuit the path to the precise explanation in novel, non-obvious ways. Einstein was creative in his science, and he was also a brilliant scientist.

You say, one has to try to create a cohesive narrative of a process. Well, yes, that's what science aims to do. What people don't seem to understand is that not all explanations are created equal, which is where I agree with you again. There is a critical difference between a framework that does explain and predict observation and is falsifiable and one that isn't. People by and large don't seem to get that.

Why the fuck do people increasingly do (1)

xtracto (837672) | more than 4 years ago | (#31072620)

start their posts in the title? It makes it seem as if they do not think about what they are going to write.

On topic with the article, I completely agree with this "release the scientific code" position. I am currently working within an EU project in which we are developing ABM*. In my project it was made clear from the beginning that the code license will be GPL.

However, the place where I work has some programs they have used to conduct simulations, and these programs are completely closed (only a handful of people, all within the institute, have access to the code). Nevertheless, simulations performed with such programs have been used for several publications (journals, conferences, symposiums and even PhD theses!).

And people in my line of work wonder why simulations are not taken more seriously (e.g. accepting papers) by people in more "classical" research fields.

Re:Why release it? (2, Insightful)

ShadowRangerRIT (1301549) | more than 4 years ago | (#31071954)

Please apply Hanlon's razor [wikipedia.org] before leaping to conspiracy theories. Or Occam's razor [wikipedia.org] might inform you that a conspiracy among thousands of scientists is a highly improbable occurrence; look for a solution that doesn't involve a perfect lid of secrecy among a group of (frequently) socially inept people.

Re:Why release it? (0)

Anonymous Coward | more than 4 years ago | (#31072118)

You are correct that it's not a conspiracy: it's more of an organic phenomenon. Politicians do not generally gain renown and reelection for doing nothing. So research dollars go to people who can show politicians that they should be doing something. Eventually, the profession selects for people who already have a certain lens for looking at the world. And here we are.

Conspiracy? (2, Insightful)

Coolhand2120 (1001761) | more than 4 years ago | (#31072148)

Nobody said conspiracy, just plain crappy code. You don't need a conspiracy: if you are "trying to prove" something, your crappy code spits out what you want to see and you run with it. You just need plain old incompetence.

Re:Conspiracy? (4, Insightful)

obarthelemy (160321) | more than 4 years ago | (#31072238)

Yes and no. Which assertion do you think more probable:

1- "These are not the desired results. Check your code".

2- "These are the desired results. Check your code".

No conspiracy, but a conspiracy-like end result.

Re:Conspiracy? (2, Insightful)

bunratty (545641) | more than 4 years ago | (#31072358)

Let's think through what would really happen if scientists released their code. The code has bugs, as all code does. People with an ulterior motive would point to the bugs and say "Look here! A bug! The science cannot be trusted!" And millions of sheeple would repeat "Yes! The code has bugs! And therefore I refuse to believe it!" It won't matter whether the bugs are relevant to the science; the fact that there are any bugs at all will cause people who want to disagree to say there's doubt about the results. Meanwhile, they will go about their business using computer systems that are riddled with bugs but function well enough that, the vast majority of the time, they're not even aware of the bugs.

Re:Conspiracy? (4, Insightful)

crmarvin42 (652893) | more than 4 years ago | (#31072608)

And then they fix the bug and either...

A. The results change, indicating that the bug was important in some way. In this case, fixing it not only silences the critics but improves our understanding.


B. The results don't change, indicating that the bug, while still a bug, was not important to the final result. In this case, we've fixed a bug that the critics were using as a banner and shown that they were mistaken about its importance. We don't get the improved understanding, but we do get a chance to politely say STFU to the more vocal/less qualified critics.

Either way looks like win/win to me.

Re:Conspiracy? (3, Informative)

xtracto (837672) | more than 4 years ago | (#31072688)

Agreed 100%.

You would not believe the amount of crappy code produced during "research projects", especially when the research is in a field completely unrelated to Comp. Sci. or Soft. Eng.

I have personally seen software related to agronomy, biology (ecology) and economics. The problem with a lot of that code is that researchers often want to use the power of computers (say, for simulation) but do not know how to code; they read a bit about some programming language and implement their programs as they are learning it.

The result? You can imagine.

Re:Why release it? (1)

INT_QRK (1043164) | more than 4 years ago | (#31072314)

You're right, Occam's Razor. Conspiracy is generally too hard, even if you know what you're doing. Who needs conspiracy? Group-think, socio-political cliques, popular public funding streams, fashion, peer pressure, yearning for acceptance by an in-crowd. Know what really brought the US to its knees in Viet Nam? Hippy Chicks. Wanted to get laid? You were anti-war.

Re:Why release it? (0, Offtopic)

Foolicious (895952) | more than 4 years ago | (#31072392)

Since you brought up the socially inept idea, I might also suggest taking a look at Bic's razor [wikipedia.org] or Gillette's razor [wikipedia.org] .

Re:Why release it? (1)

jgtg32a (1173373) | more than 4 years ago | (#31072564)

Who said it was a conspiracy of a thousand people? Isn't the claim that the IPCC reports are written by a thousand people? That's not exactly true. The actual report was written by something like 50 people; the thousand number comes from the supporting data. There were a lot of groups of people who did independent research that showed "whatever", and if the "whatever" supports the "conclusion" it is used, while things that don't support the "conclusion" are suppressed.
Now we have a fancy report; we pass that around, and it is solid and everyone agrees with what they see.

Re:Why release it? (1)

Anonymusing (1450747) | more than 4 years ago | (#31072078)

And it's situations like this which make the general public distrust scientists, or even science in general.

The media plays a major role, as well -- it oversimplifies and dramatizes scientific research as if it comes to conclusions that it usually doesn't -- but when it comes to light that a scientist has made a mistake, or that a research paper has had false premises or inaccurate results, then the average Joe Public thinks to himself, "Can't trust those scientists. Shoulda known."

fear over fact (1)

xzvf (924443) | more than 4 years ago | (#31072346)

Humans are hardwired for fear and have to learn to think factually. Like most scientific issues that become political, fear and misinformation dominate over political fact. There will always be a certain segment of the population that believes vaccines cause autism and global warming is a trick to tax us with cap and trade. With vaccines you wait for the kids of autism avoiders to die of measles and polio. With global warming change the message from tax to disincentive to tax credit to incentive. Energy independence (make it a defense issue), tax credits for solar and wind that make the payback for a home owner less than a decade (I suspect a five year payback will get homeowners and home builders forking over for energy improvements).

Re:fear over fact (1)

INT_QRK (1043164) | more than 4 years ago | (#31072736)

"Political fact"? Freudian slip?

great! (3, Insightful)

StripedCow (776465) | more than 4 years ago | (#31071920)


I'm getting somewhat tired of reading articles where there is little or no information regarding program accuracy, total running time, memory used, etc.
And in some cases, I'm actually questioning whether the proposed algorithms actually work in practical situations...

Re:great! (1)

xtracto (837672) | more than 4 years ago | (#31072722)

And in some cases, i'm actually questioning whether the proposed algorithms actually work in practical situations...

The problem is not only the algorithms but their implementation. I have read theses in which a certain algorithm is given as explaining the dynamics of a simulation, and when I actually looked at the code (closed, for in-house analysis only), several things were different.

Stuff like Sweave (3, Interesting)

langelgjm (860756) | more than 4 years ago | (#31071944)

Much quantitative academic and scientific work could benefit from the use of tools like Sweave, [wikipedia.org] which allows you to embed the code used to produce statistical analyses within your LaTeX document. This makes your research easier to reproduce, both for yourself (when you've forgotten what you've done six months from now) and others.

What other kinds of tools like this are /.ers familiar with?

Re:Stuff like Sweave (1)

StripedCow (776465) | more than 4 years ago | (#31072498)

This raises the question of which programming language scientific code should be published in.

Should there be a universal language, so that stronger guarantees are obtained on the reproducibility of the work?

Of course, this is a difficult topic since a lot of scientific programs are specifically designed for (specific) clusters.

Re:Stuff like Sweave (2, Interesting)

xtracto (837672) | more than 4 years ago | (#31072756)

Should there be a universal language,

It is called Z notation [wikipedia.org]. I have seen it used in several articles and in at least one book on multi-agent systems.

Re:Stuff like Sweave (1)

shabtai87 (1715592) | more than 4 years ago | (#31072638)

There are a lot of nice extensions using the listings package in LaTeX. I use a lot of MATLAB, so I usually end up using the mcode.sty available on MathWorks (http://www.mathworks.com/matlabcentral/fileexchange/8015). It's got the color-coding right too, which is nice for readability. More importantly, I'll save the code as it stood at the time of that report along with the report itself, just in case I get really drunk and decide to try to "fix" any base code.

Re:Stuff like Sweave (1)

xtracto (837672) | more than 4 years ago | (#31072812)

The problem with "adding code to paper" is the length of the paper.

I find it is better to submit the actual code to a "publisher repository" which can make it available on a long-term basis (as opposed to having it on the researcher's web page, which is closed when they leave the position, or which the researcher himself can remove after some time).

Of course it may be useful to reproduce some snippets of the algorithm in the article's text; however, I wouldn't suggest showing the full code, because not all of the audience will know the notation (very likely outside Comp. Sci. and Soft. Eng. circles).

It should be released and under a free licence! (3, Interesting)

bramp (830799) | more than 4 years ago | (#31071946)

I've always been a big fan of releasing my academic work under a BSD licence. My work is funded by the taxpayers, so I think the taxpayers should be able to do what they like with my software. So I fully agree that all software should be released. It is not always enough to just publish a paper; you should also release your code so others can fully review the accuracy of your work.

About time! (5, Informative)

sackvillian (1476885) | more than 4 years ago | (#31071948)

The scientific community needs to get as far as we can from the policies of companies like Gaussian Inc., who will ban [bannedbygaussian.org] you and your institution for simply publishing any sort of comparative statistics on calculation time, accuracy, etc. from their computational chemistry software.

I can't imagine what they'd do to you if you started sorting through their code...

Re:About time! (0)

Anonymous Coward | more than 4 years ago | (#31072834)

Why keep picking on a small company? I don't see Microsoft making Excel's code available for checking, or SAS making their source code available so you can verify that the statistics run correctly... Why not make the argument that if you want it open, it has to be non-commercial?

I'm not even sure the lab instrument people make their code/schematics/whatever open - how do you know the experimental data's ok?

Engineering Course Grade = F (4, Interesting)

BoRegardless (721219) | more than 4 years ago | (#31071980)

One significant figure?

Re:Engineering Course Grade = F (1)

savanik (1090193) | more than 4 years ago | (#31072404)

One significant figure?

Yeah. My eyes bugged out when I saw that, too.

This is why Statistics should be taught to anyone attempting to do scientific research. If you don't understand why this is happening and how to prevent it, please turn in your PhD now.

Re:Engineering Course Grade = F (2, Interesting)

natoochtoniket (763630) | more than 4 years ago | (#31072452)

That actually surprised me, too. Loss of precision is nothing new. When you use floats to do the arithmetic, you can lose precision in each operation, particularly when you add two numbers with very different scales (exponents) or subtract two nearly equal ones. The thing that surprised me was not that a calculation could lose precision. It was the assertion that any precision would remain, at all.

Numeric code can be written using algorithms that minimize loss of precision, or that are able to quantify the amount of precision that is lost (and that remains) in the final answers. But, if you don't use those algorithms, or don't use them correctly and carefully, you really cannot assert _any_ precision in the result.

If you know your confidence interval, you can state your result with confidence. But, if you don't bother to calculate the confidence interval, or if you don't know what a CI is, or if you are not careful, it usually ends up being plus-or-minus 100 percent of the scale.
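As a concrete example of the kind of precision-preserving algorithm mentioned above, here is a sketch of Kahan compensated summation (a generic textbook technique, not code from any of the work discussed), which carries a correction term for the bits rounded away at each addition:

```python
import numpy as np

def kahan_sum(values):
    """Compensated summation: error stays a few ulps instead of growing with n."""
    total = np.float32(0.0)
    comp = np.float32(0.0)          # running compensation term
    for v in values:
        y = np.float32(v) - comp    # apply the correction from the last step
        t = total + y
        comp = (t - total) - y      # the low-order bits of y lost in the add
        total = t
    return float(total)

data = [np.float32(0.1)] * 1_000_000

# Naive single-precision accumulation, for comparison
naive = np.float32(0.0)
for v in data:
    naive += v

print(f"naive: {float(naive):.2f}  kahan: {kahan_sum(data):.2f}  exact: 100000.00")
```

With compensation, the single-precision total stays within a few units in the last place of the true sum, while the naive loop drifts far enough to corrupt several significant figures.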

MaDnEsS ! (5, Funny)

Airdorn (1094879) | more than 4 years ago | (#31071998)

What? Scientists showing their work for peer-review? It's MADNESS I tell you. MADNESS !

Re:MaDnEsS ! (0)

Anonymous Coward | more than 4 years ago | (#31072180)

This is not madness, this is Sparta! Now, where's that darn well ...

Re:MaDnEsS ! (0)

Anonymous Coward | more than 4 years ago | (#31072704)

What? Scientists showing their work for peer-review?

It's MADNESS I tell you. MADNESS !


I'd like to see the code... (1)

argent (18001) | more than 4 years ago | (#31072040)

I'd like to see actual examples of the code failures mentioned in the T experiments paper.

Or at least Figure 9.

This is not science. (4, Insightful)

Coolhand2120 (1001761) | more than 4 years ago | (#31072052)

Re:This is not science. (2, Insightful)

Cyberax (705495) | more than 4 years ago | (#31072080)

His colleague was _sued_ (by a crank) based on released FOIA data. It might explain a certain reluctance to disclose data to known trolls.

Re:This is not science. (5, Insightful)

Idiot with a gun (1081749) | more than 4 years ago | (#31072190)

Irrelevant. If you can't take some trolls, maybe you shouldn't be in such a controversial topic. The accuracy of your data is far more significant than your petty emotions, especially if your data will be affecting trillions of dollars worldwide.

Re:This is not science. (1)

Cyberax (705495) | more than 4 years ago | (#31072318)

I demand that you post all your work, including all your post-it notes, personal notebooks, and written per-hour documentation of all your movements. Trillions might depend on it!

1) Do you seriously think that the whole climate science depends on one scientist's data?

2) CRU was trolled by FOIA requests. They are a nuisance to deal with, as far as I was told.

3) Scientists are people, people have emotions. That's why peer review is used.

Re:This is not science. (1)

Idiot with a gun (1081749) | more than 4 years ago | (#31072472)

1) Almost. Our politicians are retarded, and more interested in appeasing people than actually fixing things. They'll act on bad data. 2) So? There are trolls on the internet too. They're receiving grants, get some lawyers. 3) Clearly it didn't work too well. 1 sig fig.....

Re:This is not science. (0)

Anonymous Coward | more than 4 years ago | (#31072876)

It seems somewhat unfair to foist the responsibility of trillions of dollars onto a scientist who does not get enough funding to validate his or her own research *to satisfy the trillions of dollars expectation* nor get personally compensated enough to shoulder that responsibility. The scientist is simply trying to uncover some truth: it is the response of the governmental officials that you should be worried about. To place the blame on the scientist would cause such a chilling effect that it would scare away *any* research into the topic at hand.

I'm also trying to figure out how you equate *being sued* with "petty emotions".

Re:This is not science. (0, Flamebait)

Mashiki (184564) | more than 4 years ago | (#31072194)

I personally don't care if he was sued by the 4th Emperor of the Lastman Squealing dynasty. Post your work, put it up for review and suck it up buttercup when dealing with scientific review.

Re:This is not science. (1)

Attila Dimedici (1036002) | more than 4 years ago | (#31072336)

His colleague was _sued_ (by a crank) based on released FOIA data. It might explain a certain reluctance to disclose data to known trolls.

"known trolls" now equals people who have found significant errors in another scientist's released data?

What about McIntyre's faulty data? (1, Interesting)

Anonymous Coward | more than 4 years ago | (#31072544)

What about McIntyre's faulty data?

Ah, no FOIA there, because he's toeing the party line.

Note: He's not the only denial ditto who refuses to release his code:

http://www.realclimate.org/index.php/archives/2009/12/please-show-us-your-code/ [realclimate.org]

Oh, the meeja is quiet about that, isn't it...

Re:This is not science. (2, Insightful)

jgtg32a (1173373) | more than 4 years ago | (#31072742)

Shit like this is why I'm hesitant about going along with Climate Change. I'm in no way qualified to review scientific data, but I can tell when someone is shady, and I don't trust shady people.

Slashdot Egocentrism. (2, Insightful)

stewbacca (1033764) | more than 4 years ago | (#31072056)

My bet is there is a simple explanation...namely that scientists outside of computer science are too busy in their respective fields to know anything about code, or even care. The egocentric Slashdot-worldview strikes at the heart of logic yet again.

Re:Slashdot Egocentrism. (2, Interesting)

quadelirus (694946) | more than 4 years ago | (#31072348)

Unfortunately computer science is pretty closed off as well. Too few projects end up in freely available open code. It hinders advancement (because large departments and research groups can protect their field of study from competition by having a large enough body of code that nobody else can spend the 1-2 years required to catch up) and it hinders verifiability (because they make claims on papers about speed/accuracy/whatever and we basically have to stake it on their word and reputation and whether it SEEMS plausible--this also means that surprising results from lesser known researchers might be less likely to get published).

I think it is our duty as scientists to ALWAYS release the code, even if it is uncommented and unclean. I'm very glad to be researching under an advisor who requires that we always release our code as open source after papers have been published so that other groups can build on what we've done. This should absolutely be universal.

Re:Slashdot Egocentrism. (2, Insightful)

FlyingBishop (1293238) | more than 4 years ago | (#31072396)

What's your point? If a Biologist has no understanding of code, they have no business running a simulation of an ecological system. If a physicist has no understanding of code, they have no business writing software to simulate atomic processes. If a Geneticist has no understanding of code, they have no business writing software that does pattern matching across genes.

Those who don't want to write software to aid in their research may continue not to do so (and continue to lose relevance). But if they're going to use software, they have to use best practices; doing otherwise likewise makes their work fade quickly in relevance.

Nothing to do with CS (2, Insightful)

nten (709128) | more than 4 years ago | (#31072570)

I am suspicious of the interface reference. Are they counting things where an enumeration got used as an int, or where there was an implicit cast from a 32-bit float to a 64-bit one? From a recent TV show: "A difference that makes no difference is no difference." Stepping back a bit, there will be howls from OO/functional/FSM zealots who look at a program and declare that its inferior architecture, lack of maintainability, etc. indicate its results are wrong. These are programs written to be run once, to turn one set of data into a more understandable and concise one. A truth test set run through it is good enough; they don't need ISO-compliant, triply refactored, perfectly architected code to get the right answer. I don't think any of my CS profs would have cared about such inane drivel; they barely paid attention to what language we each picked to solve the assignment in. My software engineering prof would have yelled about comment density and coding-standards compliance, but I consider that a different discipline, primarily applicable to widely used and/or safety-critical code.

Keeping track of digit precision through a calculation isn't CS; it's fundamental grade-school science. That is only one step from forgetting to do unit analysis for a sanity check. If they are forgetting that, they are probably also not looking at numerical conditioning, or trying to get by with doubles when they need bignums. None of this is CS egocentrism; it's stuff we learn in math and science courses.
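To make the precision point concrete, here's a quick Python sketch (my own toy example, nothing to do with Hatton's study): the naive quadratic formula loses nearly all of its significant figures to cancellation, while an algebraically equivalent rearrangement keeps them.

```python
import math

# Solve x^2 - 1e8*x + 1 = 0; the small root is very nearly 1e-8.
b = 1e8
true_root = 1e-8

# Naive textbook formula: subtracting two nearly equal numbers
# (b and sqrt(b^2 - 4)) wipes out most of the significant figures.
naive = (b - math.sqrt(b * b - 4)) / 2

# Stable rearrangement: multiply through by the conjugate so the
# subtraction of nearly equal quantities never happens.
stable = 2 / (b + math.sqrt(b * b - 4))

print(f"naive:  {naive:.6e}  rel. error {abs(naive - true_root) / true_root:.1e}")
print(f"stable: {stable:.6e}  rel. error {abs(stable - true_root) / true_root:.1e}")
```

In double precision the naive root comes out off by tens of percent: plenty of figures in, roughly one figure out, which is exactly the kind of silent decay the summary describes.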

Re:Slashdot Egocentrism. (0)

Anonymous Coward | more than 4 years ago | (#31072578)

In my department (applied mechanics), that probably holds true for a lot of cases (which is also very troubling, as anyone who reads your work will almost certainly want to look at the code eventually), but there definitely is a large part of egoism behind this as well.

I honestly don't understand why: almost all the time, the code they have been tinkering with for a decade is not even remotely good enough to be sold anyway. And they already have a good job that pays well, so most of them probably want a good reputation to go with it. The easiest way to do this is to open up the code. Would I ever have heard of X and Y if it wasn't for them releasing their meshing or computational code under some FLOSS license? No fucking way.

Re:Slashdot Egocentrism. (3, Interesting)

AlXtreme (223728) | more than 4 years ago | (#31072594)

that scientists outside of computer science are too busy in their respective fields to know anything about code, or even care.

If their code results in predictions that affect millions of lives and trillions of dollars, perhaps they should learn to care.

What I've personally seen of scientists is a frantic determination to publish papers anywhere and everywhere, no matter how well-founded the results in those papers are. The IPCC-gate is merely a symptom of a deeper problem within scientific research.

If scientists are too busy because of publication quotas and funding issues to focus on delivering proper scientific research, maybe we should question our current means of supporting scientific research. Currently we've got quantity, but very little quality.

This is news? (1)

andyh-rayleigh (512868) | more than 4 years ago | (#31072058)

Nothing seems to change ...
30 years ago it was a standard joke that most "fundamental particles" were bugs in the Fortran programs of the day.

I wouldn't be surprised to discover that some of the programs investigated are just the result of 30 years of further modification of the ones we knew ... and that nobody understands them now!

Peer Review vs. Funding (2, Informative)

stokessd (89903) | more than 4 years ago | (#31072120)

I got my PhD in fluid mechanics funded by NASA, and as such my findings are easily publishable and shared with others. My analysis code (such as it was) was and is available for those who would like to use it. More importantly, my experimental data is available as well.

This represents the classical pure research side of research where we all get together and talk about our findings and there really aren't any secrets. But even with this open example, there are still secrets when it comes to ideas for future funding. You only tip your cards when it comes to things you've already done, not future plans.

But more importantly, there are whole areas of research that are very closed off. Pharma is a good example. Sure there are lots of peer reviewed articles published and methods discussed, but you'll never really get into their shorts like this guy wants. There's a lot that goes on behind that curtain. And even if you are a grad student with high ideals and a desire to share all your findings, you may find that the rules of your funding prevent you from sharing.


Re:Peer Review vs. Funding (4, Insightful)

PhilipPeake (711883) | more than 4 years ago | (#31072804)

... and this is the problem. The move from direct government grants to research to "industry partnerships".

Well, (IMHO) if industry wants to make use of the resources of academic institutions, they need to understand the price: all the work becomes public property. I would go one step further, and say that one penny of public money in a project means it all becomes publicly available.

Those that want to keep their toys to themselves are free to do so, but not with public money.

Does this apply to climate deniers too? (0)

Anonymous Coward | more than 4 years ago | (#31072122)

Will ExxonMobil release all code that their scientists use?

Re:Does this apply to climate deniers too? (1)

harvey the nerd (582806) | more than 4 years ago | (#31072406)

ExxonMobil, et al, buy their large scale simulators that their business depends on from commercial 3rd parties.

Absolutely (1)

RandCraw (1047302) | more than 4 years ago | (#31072134)

Aside from logistics, there is no excuse for not doing this. In my experience, software innovations are notoriously sensitive to subtleties in input data (e.g. data mining, AI, image processing). Posting both code & data (and a test driver, of course) should be mandatory for all publications that claim to have found a signal in data, or to do something better or faster.

The question is, how do we maintain code & data long after the paper is published? IMHO, any peer-reviewed publication should be required to maintain such a repository for perhaps 20-30 years, ideally under the GPL (or its kin) so access to it would be free in perpetuity.

Maybe such a service would finally justify peer reviewed pubs' exorbitant fees for non-subscriber access.

That's all wrong (2, Interesting)

Gadget_Guy (627405) | more than 4 years ago | (#31072140)

The scientific process is to invalidate a study if the results cannot be reproduced by anyone else. That way you can eliminate all potential problems like coding errors, invalid assumptions, faulty equipment, mistakes in procedures, and a hundred other things that can produce dodgy results.

It can be misleading to search through the code for mistakes when you don't know which code was eventually used in the final results (or in which order). I have accumulated quite a lot of snippets of code that I used to fix a particular need at the time. I am sure that many of these hacks were ultimately unused because I decided to go down a different path in data processing. Or the temporary tables used during processing are no longer around (or are in a changed format since the code was written). There is also the problem of some data processing being done by commercial products.

It's just too hard. The best solution is to let science work the way it has found to be the best. Sure you will get some bad studies, but these will eventually be fixed over time. The system does work, whether vested interests like it or not.

Re:That's all wrong (1)

quadelirus (694946) | more than 4 years ago | (#31072424)

Not entirely. Another problem is that a research group may have a large body of code that is required to do research in an area. A new research group entering the area would currently have to duplicate all that code in order to be able to add to it. In my field, there are certain subfields that we won't even touch because it would take a year of coding to build up the necessary platform to be able to compete against established groups. If all code were open, anyone could download and begin improving/extending it. The result is that certain subfields that require a large body of background code to do study in have only a few players and no-one else can really enter the subfield without sacrificing a few years of publishing. This is bad, because then all research in an area is being done by a very small number of individuals and there isn't any cross pollination from other fields since the cost of entry is too high. The simple fix is to make the code open. Then anyone can make improvements.

Also, as to verifiability. You can't spend a year writing code to verify someone else's results. That may be the utopian ideal, but in practice it never happens. You don't get papers out of spending a full year doing nothing but verifying that yes, the research was actually done correctly. Having open code would most definitely lead to more verification.

Credentials: I am a doctoral student in the sciences.

Re:That's all wrong (1)

insufflate10mg (1711356) | more than 4 years ago | (#31072450)

The scientists aren't being asked to release every piece of code in their repository, just the code they used to reach the conclusions they published.

Not possible. (1)

MindlessAutomata (1282944) | more than 4 years ago | (#31072174)

Many scientists get their code from companies or individuals that license it to them, much like most other software. They're not in the position to release the code for many experiments...!

Re:Not possible. (1)

insufflate10mg (1711356) | more than 4 years ago | (#31072476)

What if the code they use has errors that affect the outcome of their experiments? What should be done? Let it slide?

one error will invalidate a computer program?!?!? (2, Insightful)

Anonymous Coward | more than 4 years ago | (#31072196)

As it is written, the editorial is saying that if there is any error at all in a scientific computer program, the science is usually invalid. What a lot of bull hunky! If this were true, then scientific computing would be impossible, especially with regards to programs that run on Windows.

Scientists have been doing great science with software for decades. The editorial is full of it.

Not that it would be bad for scientists to make their software open source. And not that it would be bad for scientists to benefit from some extra QA.

Re:one error will invalidate a computer program?!? (1)

alan_dershowitz (586542) | more than 4 years ago | (#31072538)

That statement was kind of breathless, but the study he was citing focused on bugs that specifically affected the accuracy of the output, and found that they were a common occurrence. I agree with the author: if you are going to use a computer program to get results, you need to publish the code; otherwise your methods are packaged in a black box. A lot of people don't want to do this because scientific code is not usually written by people knowledgeable in how to write reliable, verifiable code. It's usually a pieced-together means to an end. Not that there's anything wrong with that, IF it can be made available for verification. I HATE reading studies that, for example, constantly refer to a dataset and then never give you the dataset. I guess unlike many people I don't naturally trust the authors to be perfect.

Science isn't set up for "political research" (1)

Anonymous Coward | more than 4 years ago | (#31072200)

and by that I mean climate science.

Science suffers from methodological "flaws", which are really just the rational self-interest of the people involved. One of them is that scientists do not tend to disclose data. This has been defended and explained several times by scientists under the pro-climate banner: effectively, if they DID publish, then a) people would misuse the data, but more commonly and much more importantly (after all, most science isn't controversial) b) other scientists would use their data and programs and publish papers on the basis of them. A complex data set is hence the same as JOB INSURANCE. I kid you not, look up the statements yourself.

Another is that advanced science with multiple obscure data sets needs advanced statistical knowledge, which by its very nature requires significant professional judgement. This is also obvious from having read the debates, e.g. the 700-page hacked analysis document.

Now, in most science, none of these problems are really a problem. Firstly, the science is rarely _very urgent_. The scientists can therefore sit in an ivory tower and debate for a decade or so before the research "leaks" into the outside world, or is even required or wanted by the outside world. People who disagree can say "is that really the case though? look at this little piece over here in the corner which looks strange" and everyone can gather around for tea. Life-critical science is always checked by multiple independent people and in small scales as much as possible - like drugs research, for example. If data and models diffuse slowly, and models are subject to judgement, this doesn't have much of an impact in the long run, and can be worked out amicably and in preservation of everyone's dignity and efforts wholly within the "scientist sphere". Getting personally attached to a cause is also meaningless, because few causes provide enough job security or money to even risk allegations of misconduct.

I feel it does represent a problem in the climate case though. One reason is that the science will impact _everyone_ before the data is fully comprehensive. There is no 'scientist sphere' or 'trial runs'; all conclusions are implemented as soon as they are out. Secondly, the conclusions couldn't even be checked by the nonscientists whom they would affect. Thirdly, the extreme passion and personal stakes in a strong climate movement make me very skeptical that the professional judgement of statistical analysis has been exercised dispassionately and objectively (if that is even possible). Scientists who say "is this really the case?" are pilloried. Pro-climate people will say "that's not true; if anyone could disprove climate science they would be heroes!". Anti-climate people would respond that "much like proving takes a ton of work and hundreds of articles, so would disproving, and in the meantime their lives would be hell".

As a result, I am convinced that the scientific community and method we have is totally unsuited to research something as complex as climate science and make a conclusion within a few years. I don't want to change my life and reduce my consumption on the basis of what might well be bullshit - so it's either very painful enforcement against the will and good conscience of a lot of people, or, the 'data' for a catastrophe would only be conclusively found when it happens.

Then give legal liability shield too (0, Troll)

orzetto (545509) | more than 4 years ago | (#31072220)

The reason many researchers, especially climate scientists, are not so happy about divulging their models and data is that they can be sued by crackpots, as it has already happened. Even if they are proven right, a lawsuit is an expensive business. I can already imagine hordes of Exxon sockpuppets suing any random climate scientist they don't like.

Granting immunity from lawsuit should make them more willing to share data. Anyway, if something really bad is found in the research, the researcher will have their reputation tarnished, which in the environment is bad enough to ruin a career.

By the way: I am a researcher, and I attached the source code of my models in the PDF version of my PhD dissertation.

On what basis? (1)

tjstork (137384) | more than 4 years ago | (#31072262)

The reason many researchers, especially climate scientists, are not so happy about divulging their models and data is that they can be sued by crackpots, as it has already happened.

On what basis of damages can a researcher be sued?

Re:On what basis? (1)

orzetto (545509) | more than 4 years ago | (#31072790)

Libel, for instance [desmogblog.com].

Re:Then give legal liability shield too (1)

azaris (699901) | more than 4 years ago | (#31072440)

The reason many researchers, especially climate scientists, are not so happy about divulging their models and data is that they can be sued by crackpots, as it has already happened.

If it has already happened, what additional harm can come from disclosure? In the US you can sue anybody at any time for any reason whatsoever.

Re:Then give legal liability shield too (1)

insufflate10mg (1711356) | more than 4 years ago | (#31072514)

Off-topic, would you mind explaining the point of your signature?

Re:Then give legal liability shield too (1)

LordLucless (582312) | more than 4 years ago | (#31072602)

At a guess, it's a comment on the relative impact of terrorism and road fatalities, especially in view of the legislative changes rammed through on the back of the former.

It will never happen. (0)

Anonymous Coward | more than 4 years ago | (#31072232)

As an example: Releasing automotive code falls clearly in the interest of public safety. Do you really think any company will release source code? If WOZ had the source for his Prius, the current problem might have had a happy ending.

When people have confidence in their code, they release sources. When they are afraid of what might be hiding in their code, they lock it up.

Very few have confidence, whether in industry or science.

I concur (4, Interesting)

dargaud (518470) | more than 4 years ago | (#31072260)

As a software engineer who has spent 20 years coding in research labs, I can say with certainty that the code written by many, if not most, scientists is utter garbage. As an example, a colleague of mine was approached recently to debug a piece of code: "Oh, it's going to be easy, it was written by one of our postdocs on his last day here...". 600 lines of code in main(), no functions, no comments. He's been at it for two months.

I'm perfectly OK with the fact that their job is science and not coding, but would they go to the satellite assembly guys and start gluing parts at random?

Re:I concur (0)

Anonymous Coward | more than 4 years ago | (#31072800)

the code written by many, if not most, scientists is utter garbage.

That is an understatement. I can vouch for it too. And the reason is perfectly clear: most scientific courses do NOT include any serious CS notions; quite the opposite. Professors usually teach the shoddy programming habits they were themselves taught decades ago, and thus perpetuate computer illiteracy. Many, if not most, still write Fortran (and Fortran 77 or DEC-specific at that) in full spaghetti style and without any non-naive algorithms. Try telling them about pointers and balanced search trees or even hash tables, or structured code (not even OOP!), and see their faces.
So, while it is perfectly understandable that, say, physicists can't spend 5 years learning CS, at the very least they should be made aware that it takes trained people to write sane code, that they must hand the job to specialists, and that they should spend their valuable time doing what they're skilled at. And the same can be said about numerical analysis, btw: throwing off-the-shelf Monte Carlo or Molecular Dynamics at anything cannot make up for a lack of mathematical skills.

Observations... (4, Informative)

kakapo (88299) | more than 4 years ago | (#31072268)

As it happens, my students and I are about to release a fairly specialized code - we discussed license terms, and eventually settled on the BSD (and explicitly avoided the GPL), which requires "citation" but otherwise leaves anyone free to use it.

That said, writing a scientific code can involve a good deal of work, but the "payoff" usually comes in the form of results and conclusions, rather than the code itself. In those circumstances, there is a sound argument for delaying any code release until you have published the results you hoped to obtain when you initiated the project, even if these form a sequence of papers (rather than insisting on code release with the first published results).

Thirdly, in many cases scientists will share code with colleagues when asked politely, even if it is not in the public domain.

Fourthly, I fairly regularly spot minor errors in numerical calculations performed by other groups (either because I do have access to the source, or because I can't reproduce their results) -- in almost all cases these do not have an impact on their conclusions, so while the "error count" can be fairly high, the number of "wrong" results coming from bad code is overestimated by this accounting.

Code isn't good enough. (2, Interesting)

FlyingBishop (1293238) | more than 4 years ago | (#31072290)

Back in college, I did some computer vision research. Most people provided open source code for anyone to use. However, aside from the code being of questionable quality, it was mostly written in Matlab with C handlers for optimization.

In order to properly test all of the software out there you would need:

1. A license for every version of Matlab.
2. Windows
3. Linux
4. Octave

I had our school's Matlab, but none of the code we found was written on that version. Some was Linux, some Windows (the machine I had was a Windows box with Matlab); consequently we had to play with Cygwin...

I mean, basically, you need to distribute a straight-up VM if you want your results to be reproducible. (which naturally rules out Windows or Matlab or anything else proprietary being at the core.)
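Short of shipping a whole VM, the least a group can do is record enough provenance to tell whether someone else is even running the same experiment on the same data. A minimal stdlib-only sketch (the function name and fields are my own invention, not from any real package):

```python
import hashlib
import platform
import sys

def run_provenance(data_path):
    """Collect the bare minimum needed to compare two runs:
    interpreter version, OS/platform, and a checksum of the input data."""
    with open(data_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return {
        "python": sys.version,
        "platform": platform.platform(),
        "input_sha256": digest,
    }

# Dump this dict (e.g. as JSON) next to every set of published results,
# so a failed replication can at least rule out "different inputs".
```

It won't make Matlab-version hell go away, but it makes mismatches detectable instead of invisible.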

all (1, Insightful)

rossdee (243626) | more than 4 years ago | (#31072296)

So if scientists use MS Excel for part of their data analysis, MS should release the source code of Excel to prove that there are no bugs in it (that may favour one conclusion over another).
Sounds fair to me.

And if MS doesn't comply, then all scientists have to switch to OO.org?

social.... science. (0)

Anonymous Coward | more than 4 years ago | (#31072312)

Sure it's nice for peers to review, but to make it a mandatory thing? I thought science was based on rules with the assumption there are no rules.

I wonder what Newton, Einstein, Kepler, or Goddard would have thought if they were demanded to follow this type of review. Really. Peer review has become the Java of the research world if you know what I mean (the solution for everything)--there should be multiple forums for informal discussion/verification, formal proposals/verification.

Also true for CS research (2, Interesting)

DoofusOfDeath (636671) | more than 4 years ago | (#31072344)

I'm working on my dissertation proposal, and I'd like to be able to re-run the benchmarks that are shown in some of the papers I'm referencing. But most of the source code for those papers has disappeared into the aether. Without their code, it's impossible for me to rerun the old benchmark programs on modern computers so that I and others can determine whether or not my research has uncovered a better way of doing things. This is very far from the idealized notion of the scientific method, and significantly calls into question many of the things that we think we know based on published research.

Not a good idea (5, Insightful)

petes_PoV (912422) | more than 4 years ago | (#31072364)

The point about reproducible experiments is not to provide your peers with the exact same equipment you used - then they'd get (probably / hopefully) the exact same results. The idea is to provide them with enough information so that they can design their own experiments to *measure the same things* and then to analyze their results to confirm or disprove your conclusions.

If all scientists run their results through the same analytical software, using the same code as the first researcher, they are not providing confirmation, they are merely cloning the results. That doesn't give the original results either the confidence that they've been independently validated, or that they have been refuted.

What you end up with is no one having any confidence in the results - as they have only ever been produced in one way - and arguments that descend into a slanging match between individuals and groups of vested interests who try to "prove" that the same results show they are right and everyone else is wrong.

Re:Not a good idea (0)

Anonymous Coward | more than 4 years ago | (#31072824)

I have to agree: the best way to see if a program is calculating incorrectly is to have a second program doing the same calculation a different way.
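This works even within a single paper. A small Python illustration (my own example, not from any climate code): compute a sample variance with two algorithmically unrelated methods and demand they agree.

```python
def variance_two_pass(xs):
    # Method 1: two passes - first the mean, then squared deviations.
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def variance_welford(xs):
    # Method 2: single-pass streaming update (Welford's algorithm).
    n, mean, m2 = 0, 0.0, 0.0
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / (n - 1)

readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
# Two independent routes to the same quantity: a disagreement here
# would flag a bug in one implementation or the other.
assert abs(variance_two_pass(readings) - variance_welford(readings)) < 1e-12
```

Cheap, and it catches whole classes of transcription and accumulation bugs that re-running one program never would.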

Recent example Keith Baggerly vs Duke Clin. Trials (1)

bloosqr (33593) | more than 4 years ago | (#31072378)

If you ever get a chance, take a look at some of Baggerly's (MD Anderson / bioinformatics/stats) analyses of the rather embarrassing mistakes made in developing the genomic biomarkers used for a clinical trial at Duke. He has been giving talks about this at stats conferences (and pharmas); it's one of the best talks I've heard in recent years. What it boils down to is that the analysis (and input) programs used by Duke contained a series of fundamental mistakes, causing the results to be incorrect and leading to incorrect conclusions, which unfortunately led to a series of clinical trials that certainly should not have happened. For Slashdot readers, one of the many egregious mistakes here: the analysis program's instructions call for a header line, but the input the Duke researchers used did not include one, causing a shift in the results relative to their input. My understanding is Nature Medicine refused to publish Baggerly's initial correspondence with full details as it was "too negative", so he published in a stats journal, which then got the critical coverage to shut the trial down.
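For readers who want to see how small that mistake is, here is a toy Python reconstruction (purely illustrative - I haven't seen Duke's actual code): a loader that expects a header line, fed a file without one, silently drops the first measurement and shifts every pairing by one.

```python
import io

def load_values(f, expects_header=True):
    """Toy loader in the spirit of the bug: optionally skip a header line."""
    lines = f.read().splitlines()
    if expects_header:
        lines = lines[1:]  # assume line 1 is a header and discard it
    return [float(s) for s in lines]

# Hypothetical input file with NO header, just measurements:
data = "0.9\n0.1\n0.5\n0.7\n"

good = load_values(io.StringIO(data), expects_header=False)   # [0.9, 0.1, 0.5, 0.7]
shifted = load_values(io.StringIO(data), expects_header=True)  # [0.1, 0.5, 0.7]
# The first measurement is silently gone, and every remaining value
# now pairs with the wrong sample in any positional downstream join.
```

No exception, no warning - just quietly wrong biomarker assignments, which is why it survived all the way into trials.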

Here are some random links

Here is the original Potti genomics article:
http://www.nature.com/nm/journal/v12/n11/abs/nm1491.html [nature.com]

Here is one of the Baggerly Nature Medicine letters describing what is wrong in summarized form:

http://www.nature.com/nm/journal/v13/n11/full/nm1107-1276b.html [nature.com]

here is the halt of the trials :

http://cancerletter.com/tcl-blog/copy108_of_whats-going-on-with-nih [cancerletter.com]

http://cancerletter.com/tcl-blog/copy111_of_whats-going-on-with-nih [cancerletter.com]

engineering or science (0)

Anonymous Coward | more than 4 years ago | (#31072446)

For example, interface inconsistencies between software modules which pass data from one part of a program to another occurred at the rate of one in every seven interfaces on average in the programming language Fortran, and one in every 37 interfaces in the language C.

Ah, a classic example of the difference between s/w engineering and s/w development. Problem is, it's hard to get an engineer to think like a scientist and vice versa.

Well known problem... (1)

Wdi (142463) | more than 4 years ago | (#31072500)

The code quality of many well-known scientific software packages is abysmal.

In chemistry, you should at least expect that the outcome of descriptor computations on a set of molecules is independent of the order of atoms and bonds in a molecule, and the order of file records.

Well, this is disturbingly often not the case, as we discovered in a recent study.
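One cheap sanity check anyone can run: feed the same molecule in a few shuffled input orders and demand identical output. A Python sketch with a made-up "descriptor" (radius of gyration of 2-D points, standing in for a real molecular descriptor; the function and data are mine, not from the study):

```python
import itertools
import math

def gyration_radius(coords):
    """Toy 'descriptor': RMS distance of points from their centroid.
    Any correct descriptor must not depend on the input ordering."""
    n = len(coords)
    cx = sum(x for x, _ in coords) / n
    cy = sum(y for _, y in coords) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in coords) / n)

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]

# Evaluate the descriptor on every ordering of the same "atoms";
# rounding guards against harmless float summation-order noise.
values = {round(gyration_radius(list(p)), 12)
          for p in itertools.permutations(pts)}
# A single value across all orderings is the invariance we expect.
```

If `values` ever has more than one element for a real descriptor, you've found exactly the class of bug described above.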

In an attempt to raise awareness of this problem, we have launched a public Web-accessible computational result verification service (http://www.xemistry.com/cv). A poster explaining this app and some background, including sample test results, can be found at http://www.xemistry.com/Presentations/verifier_panel_2009.pdf.

Unfortunately, the worst application we have encountered so far appears to be a standard tool used for adding chemical data to Wikipedia, systematically poisoning it with incorrect data.

This should probably be tagged RANDU (0)

Anonymous Coward | more than 4 years ago | (#31072586)

After the flawed pseudorandom number algorithm whose use may have invalidated quite a few statistical simulations.
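For anyone who hasn't met RANDU: it takes a few lines to reproduce, and its fatal flaw is just as short to demonstrate (a sketch; the recurrence and the plane identity are the standard textbook ones):

```python
def randu(seed):
    """IBM's RANDU: x_{n+1} = 65539 * x_n mod 2**31 (seed should be odd)."""
    x = seed
    while True:
        x = (65539 * x) % 2**31
        yield x

g = randu(1)
xs = [next(g) for _ in range(1000)]

# Because 65539 = 2**16 + 3, every output satisfies
#   x_{n+2} = 6*x_{n+1} - 9*x_n  (mod 2**31),
# so consecutive triples fall on just 15 planes in the unit cube --
# correlation strong enough to bias Monte Carlo simulations.
flawed = all((xs[i + 2] - 6 * xs[i + 1] + 9 * xs[i]) % 2**31 == 0
             for i in range(len(xs) - 2))
print(flawed)  # True
```

Any simulation whose result depends on fine-grained 3-D uniformity of "random" triples would have been quietly compromised.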

Not that simple (3, Interesting)

khayman80 (824400) | more than 4 years ago | (#31072694)

I'm finishing a program that inverts GRACE data to reveal fluctuations in gravity such as those caused by melting glaciers. This program will eventually be released as open source software under the GPLv3. It's largely built on open source libraries like the GNU Scientific Library, but snippets of proprietary code from JPL found their way into the program years ago, and I'm currently trying to untangle them. The program can't be made open source until I succeed because of an NDA that I had to sign in order to work at JPL.

It's impossible to say how long it will take to banish the proprietary code. While working on this project, my research is at a standstill. There's very little academic incentive to waste time on this idealistic goal when I could be increasing my publication count.

Annoyingly, the data itself doesn't belong to me. Again, I had to sign an NDA to receive it. So I can't release the data. This situation is common to scientists in many different fields.

Incidentally, Harry's README file is typical of my experiences with scientific software. Fragile, unportable, uncommented spaghetti code is common because scientists aren't professional programmers. Of course, this doesn't invalidate the results of that code because it's tested primarily through independent verification, not unit tests. Scientists describe their algorithms in peer-reviewed papers, which are then re-implemented (often from scratch) by other scientists. Open source code practices would certainly improve science, but he's wrong to imply that a single bug could have a significant impact on our understanding of the greenhouse effect.

Maybe it's my Berkeley roots: we release source (1)

PeterM from Berkeley (15510) | more than 4 years ago | (#31072816)

All the software my team and I write is version controlled (CVS and SVN) and releasable (some of it with export restrictions). I admit that our software engineering is not the best, but we also have a rule that code cannot be committed until an extensive test suite has been run against it (for our main scientific application; 'helper' tools are not so tightly controlled).

Our scientific colleagues, provided they can satisfy the export control restrictions, can get source code and poke around all they want. Some have even contributed back valuable changes, and most have contributed back valuable feedback.

In fact, we look down upon colleagues who do not release source code. Are you really doing serious science if others cannot delve into your methods?


The problem is with government funding standards (1, Interesting)

Anonymous Coward | more than 4 years ago | (#31072822)

NIH funding standards promote commercialization of publicly funded software. This appears to have been implemented before the modern internet, and the idea may have been that a commercial product would make the code more available, and perhaps fix some of the quality issues with code cobbled together by "non-programmers". The result is that companies like Accelrys own a huge amount of software developed under public funding. Now the public has to pay to use software that it paid to develop, and it is impossible for other scientific researchers to extend that publicly funded effort.

I want to see an NIH version of SourceForge, with a mandate that all government-funded software development be stored there. Unlike SourceForge, it could allow delayed public release so that researchers have time to publish their work.

Peer Review / publication process (2, Insightful)

Wardish (699865) | more than 4 years ago | (#31072826)

As part of publication and peer review, all data and the provenance of that data, as well as any additional formulas, algorithms, and the exact code used to process the data, should be placed online in a neutral holding area.

The neutral area needs to be independent, and it must record any updates and changes while preserving the original content in the process.

If your data and code (readable and compilable by other researchers) aren't available, then peer review and reproduction of results are a sham. If you can't look inside the black box, you can't trust it.
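One lightweight way to preserve provenance along these lines is to publish a cryptographic fingerprint of the exact data and code bundle that produced a result, so later readers can verify nothing has changed. A minimal sketch (the file names are purely illustrative):

```python
import hashlib

def fingerprint(paths):
    """Return a SHA-256 digest over the concatenated contents of the
    given files, giving a single verifiable ID for a data+code bundle."""
    h = hashlib.sha256()
    for p in sorted(paths):  # sort so the path-list order doesn't matter
        with open(p, "rb") as f:
            h.update(f.read())
    return h.hexdigest()
```

The holding area would store this digest alongside the archived files; any later edit to the data or code changes the digest, so the original content is tamper-evident.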
