Slashdot: News for Nerds


Mozilla Plan Seeks To Debug Scientific Code

Soulskill posted about 10 months ago | from the unit-tests-are-for-undergrads dept.


ananyo writes "An offshoot of Mozilla is aiming to discover whether a review process could improve the quality of researcher-built software that is used in myriad fields today, ranging from ecology and biology to social science. In an experiment being run by the Mozilla Science Lab, software engineers have reviewed selected pieces of code from published papers in computational biology. The reviewers looked at snippets of code up to 200 lines long that were included in the papers and written in widely used programming languages, such as R, Python and Perl. The Mozilla engineers have discussed their findings with the papers’ authors, who can now choose what, if anything, to do with the markups — including whether to permit disclosure of the results. But some researchers say that having software reviewers looking over their shoulder might backfire. 'One worry I have is that, with reviews like this, scientists will be even more discouraged from publishing their code,' says biostatistician Roger Peng at the Johns Hopkins Bloomberg School of Public Health in Baltimore, Maryland. 'We need to get more code out there, not improve how it looks.'"


115 comments

Send In The Codes (0, Interesting)

Anonymous Coward | about 10 months ago | (#44944645)

Isn't it rich?
Are we a pair?
Me here at last on the ground,
You in mid-air.
Send in the codes.

Isn't it bliss?
Don't you approve?
One who keeps tearing around,
One who can't move.
Where are the codes?
Send in the codes.

Just when I'd stopped
Opening doors,
Finally knowing
The one that I wanted was yours,
Making my entrance again
With my usual flair,
Sure of my lines,
No one is there.

Don't you love farce?
My fault, I fear.
I thought that you'd want what I want -
Sorry, my dear.
And where are the codes?
Quick, send in the codes.
Don't bother, they're here.

Isn't it rich?
Isn't it queer?
Losing my timing this late
In my career?
And where are the codes?
There ought to be codes.
Well, maybe next year . . .

Wrong objective. (5, Insightful)

smart_ass (322852) | about 10 months ago | (#44944655)

I don't know the actual objective ... but if the concern is "'We need to get more code out there, not improve how it looks.'" ... the objective is bad.

Shouldn't this be about catching subtle logic / calculation flaws that lead to incorrect conclusions?

Agree ... if this is about indenting and which method of commenting ... then yeah ... bad idea.

But this has the possibility of being so much more. I would see it as free editing by qualified people. Seems like a deal.

Re:Wrong objective. (1)

cheater512 (783349) | about 10 months ago | (#44944723)

Exactly. If the code they are writing looks like bad PHP from 10 years ago then it needs to be exposed.

What is needed is more *good quality* code being published.

Re:Wrong objective. (4, Insightful)

mwvdlee (775178) | about 10 months ago | (#44945291)

I think that's exactly the opposite of the point the GP was trying to make.

If it looks like bad PHP from 10 years ago but contains no bugs, then that is completely okay.
If it looks like old COBOL strung together with GO TO's and it works, it's okay.
If it looks like perfect C++ code but contains bugs, the bugs need to be exposed, especially if the research results are based on the output of the code.

Re:Wrong objective. (3, Informative)

Macchendra (2919537) | about 10 months ago | (#44946517)

It is easier to find bugs in code where all of the objects, variables, methods, etc. are named according to their actual purpose. It is easier for other researchers to integrate their own ideas if the code is self documenting. It is easier to integrate with other software if the interfaces are cleanly defined. It is easier to verify the results of intermediate steps if there is proper encapsulation. Also, proper encapsulation reduces the chances of unintended side-effects when data is modified outside of scope.
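The encapsulation point can be made concrete with a minimal, hypothetical sketch (the class and names below are invented for illustration, not from any reviewed paper): state hidden behind a small interface can't be clobbered from outside scope, and each intermediate step is easy to verify in isolation.

```python
# Illustrative only: an encapsulated accumulator whose intermediate state
# cannot be modified from outside, unlike a bare module-level list.

class RunningMean:
    """Accumulates values; exposes only add() and mean()."""
    def __init__(self):
        self._total = 0.0
        self._count = 0

    def add(self, value):
        self._total += value
        self._count += 1

    def mean(self):
        return self._total / self._count

m = RunningMean()
for v in (1.0, 2.0, 3.0):
    m.add(v)
assert m.mean() == 2.0  # intermediate result verified without touching globals
```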

Re:Wrong objective. (1)

mwvdlee (775178) | about 10 months ago | (#44946687)

All of which are great if code is to be maintained, which this type of code rarely is.
None of which affects whether the code actually works.

Re:Wrong objective. (1)

Macchendra (2919537) | about 10 months ago | (#44946823)

Making bugs visible does affect whether the code actually works. So does making the components testable.

Re:Wrong objective. (2)

MiniMike (234881) | about 10 months ago | (#44946963)

All of which are great if code is to be maintained, which this type of code rarely is.

Not always true, probably not by a long shot. I'm maintaining code written over a span of time beginning in the 1980's (not by me) and last updated yesterday (and again as soon as I'm done here...). Some written very well, some quite the opposite. Not often is scientific code used for just one project, if it's of any significant utility.

Re:Wrong objective. (2)

swillden (191260) | about 10 months ago | (#44947009)

All of which are great if code is to be maintained, which this type of code rarely is.

Or if it is re-used, which is one of the potential benefits of publishing it alongside the paper.

Also, since the purpose of research papers is to transmit ideas, clear, readable code serves readers much better than functional but opaque code... and that assumes the code is actually functional. Ugly code tends to be buggier, precisely because it's harder to understand.

Re:Wrong objective. (3, Insightful)

ebno-10db (1459097) | about 10 months ago | (#44947469)

If it looks like bad PHP from 10 years ago but contains no bugs, then that is completely okay.
If it looks like old COBOL strung together with GO TO's and it works, it's okay.
If it looks like perfect C++ code but contains bugs, the bugs need to be exposed, especially if the research results are based on the output of the code.

None of the above. It's scientific code. It looks like bad Fortran (or even worse, FORTRAN) from 20 years ago, which is ok, since Fortran 90 is fine for number crunching.

In all seriousness, my experience is that "Ph.D. types" (for want of a better term) write some of the most amateurish code I've ever seen. I've worked with people whose knowledge and ability I can only envy, and who are anything but ivory tower types, but write code like it was BASIC from a kindergartener (ok, today's kindergarteners probably write better code than in my day). Silly things like magic numbers instead of properly defined constants (and used in multiple places no less!), cut-and-paste instead of creating functions, hideous control structures for even simple things. Ironically, this is despite the fact that number crunching code generally has a simple code structure and simple data structures. I think bad code is part of the culture or something. The downside is that it makes it more likely to have bugs, and very difficult to modify.

Realistically, this is because they're judged on their results and not their code. To many people here, the code is the end product, but to others it's a means to an end. Better scrutiny of it though would lead to more reliable results. It should be mandatory to release the entire program within, say, 1 year of publication. As for it being obfuscated, intentionally or otherwise, I don't think there's much you can do about that.
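The "magic numbers in multiple places" complaint is easy to illustrate with an invented example (the decay code and constants below are illustrative, not from any paper): naming each number once means a typo can't hide in one of several copies.

```python
import math

# Before review: unexplained constants, repeated wherever they're needed.
def fraction_remaining_bad(t):
    return math.exp(-0.693147 * t / 5730.0)

# After review: each number named once and changeable in one place.
CARBON14_HALF_LIFE_YEARS = 5730.0
LN2 = math.log(2)

def fraction_remaining(t_years):
    return math.exp(-LN2 * t_years / CARBON14_HALF_LIFE_YEARS)

# Sanity check: after exactly one half-life, half the sample remains.
assert abs(fraction_remaining(5730.0) - 0.5) < 1e-9
```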

Re:Wrong objective. (4, Informative)

Anonymous Coward | about 10 months ago | (#44944775)

The problem is most papers do not publish the code, only the results. This causes dozens of problems: if you want to run their code on a different instance you can't; if you want to run it on different hardware you can't; if you want to compare it with yours you only sort of can, since you have to either reimplement their code or run yours in a different environment than theirs, which makes comparisons difficult. Oh, and it makes verifying the results even harder, but it isn't like many people try to verify anything.

On the one hand catching bugs can help find a conclusion was wrong sooner than it would happen otherwise. On the other hand it may make it less likely that authors will put their code out there. Anyhow, I think it's a good idea and worth a shot. Who knows, maybe it'll end up helping a lot.

Re: Wrong objective. (3, Insightful)

icebike (68054) | about 10 months ago | (#44945023)

Well, running the ORIGINAL author's code isn't that important.

What's important is the analysis that the code was supposed to do.

Describing that in mathematical terms and letting anyone try to replicate the research is better than handing the original code forward. That's just passing another potential source of error forward.

Most of the (few) research projects I've been called to help with coding on are strictly package runners. Only one had anything approaching custom software, and it was a mess.

Re: Wrong objective. (4, Insightful)

ralphbecket (225429) | about 10 months ago | (#44945131)

I have to disagree. Before I go to a heap of effort reproducing your experiment, I want to check that the analysis you ran was the one you described in your paper. After I've convinced myself that you haven't made a mistake here, I may then go and try your experiment on new data, hopefully thereby confirming or invalidating your claims. Indeed, by giving me access to your code you can't then claim that I have misunderstood you if I do obtain an invalidating result.

Re: Wrong objective. (5, Interesting)

old man moss (863461) | about 10 months ago | (#44946061)

Yes, totally agree. As someone who has tried to reproduce other people's results (in the field of image processing) with mixed success, I can say it is incredibly time consuming trying to compare techniques which appear to be described accurately in journals but omit "minor" details of implementation which actually turn out to be critical. I have also had results of my own which seemed odd and were ultimately due to coding errors which inadvertently improved the result. Given the opportunity, I would have published all my academic code.

Re: Wrong objective. (1)

Shavano (2541114) | about 10 months ago | (#44947099)

You had the opportunity. You could have put your code, and notes on how to use it, in an appendix to your papers.

Re: Wrong objective. (1)

Impy the Impiuos Imp (442658) | about 10 months ago | (#44947549)

I agree. Code is math, and thus part of the experiment and analysis, not just an interpretation. "Duplicate it yourself" stands against the very idea of review and reproduction.

While there is tremendous utility in an independent reconstruction of an algorithm (I have numerous times built a separate chunk of code to calculate something in a completely different way, to test against the real algorithm/code, in practice they debug each other) the actual code needs to be there for review.

They may have a desire to keep it secret for exclusivity reasons of one type or another (fame, future additional research, money), but that can't justify secrecy in normal publication.

Re: Wrong objective. (0)

Anonymous Coward | about 10 months ago | (#44947921)

Don't forget magic numbers!

"The algorithm is weighted by two parameters A1 and A2"

And nowhere in the paper does it say what values of A1 and A2 were used :)

I love academia.

Re:Wrong objective. (4, Insightful)

dcollins (135727) | about 10 months ago | (#44944947)

Yeah, it seems like the real objective should be to get more code read and verified as part of the scientific process. (Just "getting more code out there" and expecting it to go unread would be pretty empty.)

One problem is that the publish-or-perish process has gotten sufficiently corrupt that many results are irreproducible, PhD students are warned against trying to reproduce results, and everyone involved has lost the expectation that their work will be experimentally double-checked.

Re:Wrong objective. (4, Interesting)

Anonymous Coward | about 10 months ago | (#44945553)

As a PhD student I am actively encouraged to reproduce results; mostly this has been possible, but I know of at least one paper which has been withdrawn because my supervisor queried their results after we failed to reproduce them (I'll be charitable and say it was an honest mistake on their part).

I guess whether you are encouraged to check others' work depends on your university and subject, but in certain areas it does happen.

Re:Wrong objective. (0)

Anonymous Coward | about 10 months ago | (#44946701)

I would say that when I was in academia, I was never discouraged from reproducing results, but I would have had to sacrifice my own time and money for it. That is kind of hard when your average grad student is already overworked and underpaid.

Looking over the shoulder (2)

glennrrr (592457) | about 10 months ago | (#44946283)

I remember when I was in graduate school looking over a member of my group's shoulder and realizing he thought that the ^ operator in C meant raise to the power of instead of being the bitwise XOR operator. Scientists are often pretty indifferent programmers.
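For what it's worth, the same trap exists outside C. In Python, for example, `^` is also bitwise XOR, and exponentiation is spelled `**`:

```python
# ^ is bitwise XOR in Python (as in C), not exponentiation.
assert 2 ^ 3 == 1     # 0b10 XOR 0b11 == 0b01
assert 2 ** 3 == 8    # exponentiation uses **
assert pow(2, 3) == 8 # or the built-in pow()
```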

Re:Looking over the shoulder (2)

biodata (1981610) | about 10 months ago | (#44947147)

this^1000

Re:Looking over the shoulder (1)

ebno-10db (1459097) | about 10 months ago | (#44947511)

In all fairness that's an easy mistake to make, because ^ means exponentiation in other languages. It's an historical stupidity, like the fact that log() is the natural log, not log10().
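The log() convention carries over into most languages' standard math libraries; a quick Python check:

```python
import math

# math.log is the natural logarithm, following the C library convention.
assert abs(math.log(math.e) - 1.0) < 1e-12
# Base-10 needs log10 (or an explicit base argument).
assert abs(math.log10(1000.0) - 3.0) < 1e-12
assert abs(math.log(1000.0, 10) - 3.0) < 1e-9
```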

Re:Looking over the shoulder (1)

Shavano (2541114) | about 10 months ago | (#44947841)

Frequently. It's not supposed to be their main area of expertise and they often learn just enough to solve their immediate problem. And why should they learn more? So occasionally they make blunders like that, but a professional computer programmer wouldn't know what problem to code or what analysis needs to be done in the first place. That's what the scientists are good at.

Re:Wrong objective. (1)

Shavano (2541114) | about 10 months ago | (#44947741)

Ph.D. dissertations require original research. However, assigned classwork for Doctor's and Master's students would be improved if it involved replication and re-analysis of recent research in the field to study methods of data collection and analysis. This would make replication and reexamination of recent research a routine part of academia. The benefits for the students would be seeing how other researchers do their work and practice at methods of analysis and occasionally the satisfaction of showing that the original work was wrong. Also, the demonstration that if they publish bad work, there's a likelihood that it will be discovered by other researchers who will refute their findings.

Re:Wrong objective. (0)

Anonymous Coward | about 10 months ago | (#44945017)

It sounds like he wants people to publish code more often, because it isn't common yet. Fear of scrutiny, less openness. It's a meeting of different types of specialists, and he's scared. Some people understand peer review, but not how to deal with criticism.

Re:Wrong objective. (1)

K. S. Kyosuke (729550) | about 10 months ago | (#44945823)

Not to mention that the idea of not publishing code is at stark odds with the goal of scientific publication, which is reproducibility: as things depend more and more on the processing SW, papers and datasets aren't enough. You need the code that was used to generate the results; otherwise it's irreproducible.

Re:Wrong objective. (1)

Shavano (2541114) | about 10 months ago | (#44947063)

I don't know the actual objective ... but if the concern is "'We need to get more code out there, not improve how it looks.'" ... the objective is bad.

Shouldn't this be about catching subtle logic / calculation flaws that lead to incorrect conclusions?

Agree ... if this is about indenting and which method of commenting ... then yeah ... bad idea.

But this has the possibility of being so much more. I would see it as free editing by qualified people. Seems like a deal.

That's one of two worthy objectives. The other is to make the code more suitable for use by other researchers.

Provide a tool then BUTT OUT (0, Interesting)

Anonymous Coward | about 10 months ago | (#44944663)

Yes Mozilla. BUTT OUT!!! Your coders are not scientists. Provide a code review tool like FindBugs and perhaps offer to assist pre-publication, but don't start spreading your "way of doing things", which puts off your own users. Scientists have enough to deal with.

Re:Provide a tool then BUTT OUT (1)

phantomfive (622387) | about 10 months ago | (#44944801)

Believe it or not, there actually are at least some scientists in the Mozilla Science Lab. Crazy, right?

Re:Provide a tool then BUTT OUT (0)

Anonymous Coward | about 10 months ago | (#44945043)

Different AC, but I still think an analysis tool would be of more use.

Re:Provide a tool then BUTT OUT (1)

ebno-10db (1459097) | about 10 months ago | (#44947609)

Yes Mozilla. BUTT OUT!!! Your coders are not scientists. ... Scientists have enough to deal with

Scientists have enough to deal with ... like buggy code? RTFA. It causes real problems, and I have no use for the "we're specialists, you couldn't possibly help us" attitude (often it's espoused to hide problems).

Would you trust a chemist who didn't know the proper practices for working in a chem lab? If not, why should you trust someone doing computational chemistry problems who doesn't know how to code? It's too easy to fall for the "how hard could this be" syndrome. For example, there was the time Richard Feynman spent a sabbatical working in a biology lab and trashed an important experiment due to his ignorance of the proper methods (a mistake which, unlike many other people, he freely admitted to).

Hell Yes! (5, Insightful)

Garridan (597129) | about 10 months ago | (#44944673)

Where do I sign up? If I could get a "code reviewed by third party" stamp on my papers, I'd feel a lot better about publishing the code and the results derived from it. Maybe mathematicians are weird like that -- I face stigma for using a computer, so anything I can do to make it look more trustworthy is awesome.

Re:Hell Yes! (5, Insightful)

JanneM (7445) | about 10 months ago | (#44944733)

Problem is, at least in this trial they're reviewing already published code, when it's too late to gain much benefit from the review on the part of the original writer. A research project is normally time-limited, after all; by the time the paper and data are public, the project is often done and people have moved on.

There's nobody with the time or inclination to, for instance, create and release a new improved version of the code at that point. And unless there are errors which lead to truly significant changes in the analysis, nobody would be willing to publish any kind of amended analysis either.

Re:Hell Yes! (1)

Anonymous Coward | about 10 months ago | (#44944851)

There is a reason that models have to be validated. If you choose validation cases well, a code that passes them will almost certainly be a good model. Beyond that, you do the best you really can, and that's that.

Otherwise, here, I've got 40k lines of code here, anyone want to check it over for me? This is free of charge, right?
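The validation idea is worth making concrete: pick a case with a known analytic answer and assert that the code reproduces it. A minimal sketch (the integrator and tolerance below are illustrative, not from any particular model):

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoid rule with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Validation case: the integral of sin(x) over [0, pi] is exactly 2.
assert abs(trapezoid(math.sin, 0.0, math.pi, 1000) - 2.0) < 1e-5
```

A handful of such checks against known limits won't prove a 40k-line model correct, but a model that fails them is certainly wrong, which is the cheap half of the bargain.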

Re:Hell Yes! (0)

Anonymous Coward | about 10 months ago | (#44946861)

And unless there are errors which lead to truly significant changes in the analysis, nobody would be willing to publish any kind of amended analysis either.

With emphasis on "significant changes." I once found a bug in code I'd used in a published paper, but it changed the results of that paper by ~1%, where the error bars were already 15-20%. There was not much to gain by publishing anew to point that out, especially since the original paper still had the correct math; the bug was an implementation error. However, the bug fix was much more relevant to the new situation I was testing when I found it.

You are so wrong... (0)

Anonymous Coward | about 10 months ago | (#44946957)

That's not how research works. The researcher is continuing research in the same narrow area of a specific field. That is where their knowledge and expertise is. And their new research is based on, and a continuation of, their old research. So, yes, finding problems in published code is _VERY_ important and useful. It also keeps _other_ researchers from using that as a basis for their research.

Re:Hell Yes! (2)

PsyberS (1356021) | about 10 months ago | (#44945469)

Where do I sign up? If I could get a "code reviewed by third party" stamp on my papers, I'd feel a lot better about publishing the code and the results derived from it.

Believe it or not, some computer science programming language conferences are doing *just that*.

http://cs.brown.edu/~sk/Memos/Conference-Artifact-Evaluation/ [brown.edu]
http://ecoop13-aec.cs.brown.edu/ [brown.edu]
http://splashcon.org/2013/cfp/665 [splashcon.org]

What is Mozilla? (1)

phantomfive (622387) | about 10 months ago | (#44944683)

When did Mozilla get a Science Lab? Here I always thought all the Mozilla Foundation did was make a decent browser, and now I find they have a science lab. What other things does Mozilla do?

Re:What is Mozilla? (1)

Anonymous Coward | about 10 months ago | (#44944735)

A tiddlywinks ballroom, two vending machines and a build-a-squirrel online project. Apparently they have made some attempt at an internet browser too.

they do Seamonkey, a better browser than Firefox (1)

raymorris (2726007) | about 10 months ago | (#44944795)

What else do they do, you ask? They support Seamonkey, Firefox's older brother. Firefox began as a stripped-down, lightweight, minimalist version of Seamonkey. Though Firefox is no longer lightweight, Seamonkey is still more capable in some respects. The suite includes an email client and WYSIWYG editor, but I just like the browser.

While Firefox is controlled by the Mozilla Foundation, Seamonkey is community driven now, with hosting and other support from the foundation.

Not technical tho (1)

mjwalshe (1680392) | about 10 months ago | (#44945761)

Mozilla would appear to be mostly commercial programmers, so I'm not sure that having them look at the code would give any value.

any review may find off-by-one, etc. (2)

raymorris (2726007) | about 10 months ago | (#44946571)

Having ANY second programmer look at the code may well find off-by-one or fence post errors and the like.
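A fencepost example of the kind a second reader catches in seconds (the function names are invented for illustration):

```python
# How many sample points are on a grid from a to b with spacing h?
def n_points_wrong(a, b, h):
    return round((b - a) / h)        # counts the intervals...

def n_points(a, b, h):
    return round((b - a) / h) + 1    # ...but points = intervals + 1

assert n_points_wrong(0.0, 1.0, 0.25) == 4
assert n_points(0.0, 1.0, 0.25) == 5   # 0, 0.25, 0.5, 0.75, 1.0
```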

Re:Not technical tho (0)

Anonymous Coward | about 10 months ago | (#44946945)

That's true... the only people who can code worth a damn can't seem to find a job doing it.

Oh, wait... that's just kind of stupid.

Re:What is Mozilla? (1)

sg_oneill (159032) | about 10 months ago | (#44945141)

Mozilla is a bit like Apache: it's a broad tent of vaguely related projects, not just Firefox.

Re:What is Mozilla? (1)

jones_supa (887896) | about 10 months ago | (#44945477)

And of course let's not forget Thunderbird. A very good e-mail client in my opinion.

Re:What is Mozilla? (1)

jopsen (885607) | about 10 months ago | (#44945611)

Mozilla also does Webmaker, education, and let's not forget Firefox OS...

Re:What is Mozilla? (-1)

Anonymous Coward | about 10 months ago | (#44946117)

Wait? you thought Mozilla made a decent browser.

HAHAHAHAHA.

no.

Re:What is Mozilla? (0)

BitZtream (692029) | about 10 months ago | (#44946509)

When did Mozilla ever know jack shit about proper code reviews?

Everything they do seems to indicate just the opposite: they fracking suck at writing clean, bug-free code, and suck just as much at reviewing it.

They seem to be projecting. Trying to act like they are good at something they suck at while claiming others aren't good at it.

Re:What is Mozilla? (1)

ebno-10db (1459097) | about 10 months ago | (#44947779)

they fracking suck at writing clean bug free code, and suck just as much at reviewing it

Then how come the browser I'm using right now works pretty well?

Software architecture (1)

Anonymous Coward | about 10 months ago | (#44944687)

The overall structure of most of the code in HEP [1] is nasty. It's too late for the likes of ROOT [2]: input from software engineers at the early stages of code design could be very useful.

1. https://en.wikipedia.org/wiki/Particle_physics
2. https://en.wikipedia.org/wiki/Root.cern

Mozilla needs to improve their own code. (0, Flamebait)

Animats (122034) | about 10 months ago | (#44944709)

Mozilla barely has control of their own code base. The number of open bugs keeps increasing. Attempts to multi-thread the browser failed. The frantic release schedule results in things like the broken Firefox 23, where panels in add-ons just disappeared off screen. They have legacy code back to Netscape 1, and it's crushing them. Firefox market share is declining steadily. Not good.

Re:Mozilla needs to improve their own code. (0)

Anonymous Coward | about 10 months ago | (#44944731)

This project has nothing to do with the people working on Firefox.

Re:Mozilla needs to improve their own code. (0)

Anonymous Coward | about 10 months ago | (#44944793)

My first thought also, and, I'm sure, the same for any moderate+ user of Firefox. In some ways it's very, very good. (I actually moved back to it from Chrome since, e.g., Chrome has no off-line capability, so if you have to restart with a bunch of tabs it just reloads EVERYTHING. It's almost like it doesn't even have a disk cache.) On the other hand... it takes several plug-ins to manage memory usage and, as mentioned, there are years-old bugs open for everything from annoyances to outright problems.

Crappy tax dodgers review science papers now? (0)

Joining Yet Again (2992179) | about 10 months ago | (#44944711)

See subject line. I don't know what the hell qualifies Mozilla to review scientific code. For one thing, scientific code in academic papers is proof-of-concept - it's designed to show how to implement something according to the description in the paper, not engineered for general deployment.

The "we need more people" counterargument is bollocks, however -- there are enough people in computational biology doing utterly pointless things.

Perhaps Mozilla's looking for another way to justify its on-going tax avoision status, of course.

Re:Crappy tax dodgers review science papers now? (0)

Anonymous Coward | about 10 months ago | (#44946837)

For one thing, scientific code in academic papers is proof-of-concept - it's designed to show how to implement something according to the description in the paper, not engineered for general deployment.

So if the paper says "The function getRandomNumber() returns a random number", but the code says "int getRandomNumber() { return 4; }", is that fine?
Wouldn't it be better if the bugs were removed, so we can check whether the conclusions in the paper still hold?
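That joke function is exactly the kind of bug a reviewer's smoke test exposes instantly; a sketch of such a check (in Python for brevity):

```python
# The buggy "random" source from the joke above.
def getRandomNumber():
    return 4  # constant, despite what the paper claims

# A reviewer's smoke test: a random source should vary over many draws.
samples = {getRandomNumber() for _ in range(100)}
assert len(samples) == 1  # never varies: the bug the prose description hides
```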

Don't forget spreadsheets (4, Informative)

MrEricSir (398214) | about 10 months ago | (#44944713)

As we've seen recently, bad decisions can be made from errors in spreadsheets. We need these published so they can be double-checked as well.

Re:Don't forget spreadsheets (-1)

Anonymous Coward | about 10 months ago | (#44944761)

Now, now, we can't just go on and hang out the dirty linen...

After the last few versions of Firefox, I wonder if these guys "reviewing" anyone else's code should be treated as a red flag.
Label: "Warning. This code has been reviewed by Mozilla Labs."

Re:Don't forget spreadsheets (1)

Anonymous Coward | about 10 months ago | (#44944863)

They should be publishing their code because the basic precept behind peer-reviewed publishing is that results can be reproduced. Most of the time they are not, but computational scientists need to be constantly reminded that they are performing experiments; not publishing the code is exactly the same as a synthetic chemist not including an experimental section (the procedure for the synthesis).

Re:Don't forget spreadsheets (0)

Anonymous Coward | about 10 months ago | (#44944963)

What should be published is sufficient information for someone with programming experience to reproduce your results. I can write that I convolved two functions, but you don't need to see the code that I used to do the convolution. If I made some approximation or used an algorithm that may fall apart in some limits, that is worth mentioning.
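The trouble is that "I convolved two functions" hides choices (boundary handling, normalization, indexing) that differ between implementations, which is exactly where reproduction attempts go wrong. Even a textbook discrete convolution is short enough to show; a pure-Python sketch:

```python
def convolve(x, y):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out

# Check against a hand-computed case: (1 + 2z)(3 + 4z) = 3 + 10z + 8z^2.
assert convolve([1, 2], [3, 4]) == [3.0, 10.0, 8.0]
```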

Re: Don't forget spreadsheets (0)

Anonymous Coward | about 10 months ago | (#44945029)

We don't need to see your code?

Doveryai, no proveryai, comrade academician.

Re:Don't forget spreadsheets (1)

swillden (191260) | about 10 months ago | (#44947061)

If I made some approximation or used an algorithm that may fall apart in some limits, that is worth mentioning.

Uh, huh. And what if you don't realize that your code has subtle failings that may have significantly altered your results? Anyone trying to reproduce your results but doing it right will fail, but be unable to explain why their results differed. Without your code peer review of your work is both harder and less valuable.

Unless deterring review is the researcher's intent, of course.

Re:Don't forget spreadsheets (1)

dkf (304284) | about 10 months ago | (#44947217)

I can write that I convolved two functions, but you don't need to see the code that I used to do the convolution.

So you used a standard library for doing the convolution, cited that library correctly, and showed how you called the library? That would be very good academic programming, and good paper-writing too. Of course, the flip side also holds: if you don't show your methods properly, or don't cite others' work that you use or reference, you're a bad academic. If you do it all yourself when much of it isn't your research focus, you're just wasting your time (and encouraging others to ignore you).

Re:Don't forget spreadsheets (2)

VortexCortex (1117377) | about 10 months ago | (#44944979)

bad decisions can be made from errors in spreadsheets.

Oh, If only you knew...

We need these published so they can be double-checked as well.

Well, I wouldn't go so far as publishing my findings, but now I always double-check spread sheets when I'm not sure if it is or isn't a ladyboy.

Re:Don't forget spreadsheets (0)

Anonymous Coward | about 10 months ago | (#44946947)

Hear, hear.
When I was in grad school, the majority of programming effort that I observed was actually in spreadsheet macros and Access dashboards.
The only exceptions to this rule appeared to be when the experiment itself required programming in a lower-level language (e.g. high-density, high-speed information visualization experiments). It's so bad that one class was amazed that our group used "an actual relational database" (Access, so no) in our project on data aggregation and visualization.

Worst part of that experience? Three different students wanted to know what a "primary key" was. *sigh*
and, yeah, this was a grad course in Comp Sci, so it's not just bio/chem or social sciences that are plagued with this mentality.

Spreadsheet macro programming is like the Dark Side of the Force, it will be with you always.
So, clean it up.

Re:Don't forget spreadsheets (1)

ebno-10db (1459097) | about 10 months ago | (#44947855)

As we've seen recently, bad decisions can be made from errors in spreadsheets.

For that problem, let's just get rid of spreadsheets (at least as they're implemented in most programs). Copy-and-paste is the standard way to do the same computation in several places; how much further could you get from good practice? Reviewing the "code" requires peering at every cell. Etc., etc., etc. Lastly, the people who use them are often idiots who have no idea what they're doing. At least if you made them use a programming language, they'd never get it to run. That way they couldn't pretend that they made meaningful calculations.
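The copy-and-paste objection can be shown in miniature (the data and tax rate below are invented): in code, the "formula" exists exactly once, so it can't silently drift between rows the way a pasted spreadsheet cell can.

```python
# One function replaces the same formula pasted into every spreadsheet row.
def total_with_tax(price, rate):
    return price * (1.0 + rate)

rows = [(100.0, 0.25), (250.0, 0.25), (400.0, 0.25)]  # (price, tax rate)
totals = [total_with_tax(p, r) for p, r in rows]
assert totals == [125.0, 312.5, 500.0]  # every row provably uses one formula
```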

Hypocritical (0)

Khyber (864651) | about 10 months ago | (#44944809)

Mozilla better work on de-bloating its own code first.

peer review should include code review... (0)

Anonymous Coward | about 10 months ago | (#44944821)

Actually, wasn't the latter partially inspired by the former?

Get used to it (1)

flyingfsck (986395) | about 10 months ago | (#44944835)

If you want to code, then you've got to get used to code reviews. They are the only way to improve quality, and a scientist who doesn't want to improve quality should not be a scientist.

Re:Get used to it (1)

VortexCortex (1117377) | about 10 months ago | (#44944995)

Correction: a scientist that doesn't want to improve source quality shouldn't be a codemonkey...

Re:Get used to it (0)

John Allsup (987) | about 10 months ago | (#44945987)

Nor should such a scientist rely on the results of computer code in his research. What you rely on in proper research, you should be an expert in. Scientists who use code should be codemonkeys, but not all scientists should use code -- pen, paper and a well-drilled mind are far more powerful, properly mastered and harnessed.

Re:Get used to it (0)

Anonymous Coward | about 10 months ago | (#44946923)

For many scientists, the computer is just an extension of pen and paper. In the work I've done, after simplifying it as much as possible on paper, you end up with a nonlinear PDE. A few simple, asymptotic examples can be worked out with pen and paper, but applying it to real world geometries requires a computer. Nothing would be gained by endlessly doing arithmetic with pen and paper for years to solve something that is straightforward on a computer in a few seconds and can be verified and validated.
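(The commenter's actual PDE is nonlinear and unspecified, so here's a minimal stand-in with the simplest case, the 1D heat equation u_t = u_xx, stepped with an explicit finite-difference scheme. Doing these updates by hand for years would gain nothing over letting the machine iterate.)

```python
def heat_step(u, r):
    """One explicit Euler step; r = dt/dx^2 must be <= 0.5 for stability."""
    return [u[0]] + [
        u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
        for i in range(1, len(u) - 1)
    ] + [u[-1]]

# Initial spike in the middle of a rod with cold (zero) ends.
u = [0.0] * 21
u[10] = 1.0
for _ in range(200):
    u = heat_step(u, r=0.25)
# The spike diffuses: the peak drops while heat spreads outward
# symmetrically and leaks through the fixed boundaries.
```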

Re:Get used to it (1)

BitZtream (692029) | about 10 months ago | (#44946541)

Correction: a scientist that doesn't want to improve source quality isn't a scientist.

Some can argue that they don't have the time or budget to do so, but flat-out not wanting to is a failure of the process itself. It's not someone you want to trust to make predictions from data.

Oh dear (1)

mjwalshe (1680392) | about 10 months ago | (#44945895)

You obviously haven't worked with people who are world leaders in their field. They are not going to take advice on code from some commercial web dev.

Though back in the day I did make one guy's code a bit more user-friendly (his original comment was "I don't need any prompts to remind me what I need to type"), as we had scaled up to 1:1 models and a single run of the rig could cost £20k in materials.

The Horror (3, Interesting)

Vegemite (609048) | about 10 months ago | (#44944911)

You must be joking. Many scientific papers out there have results based on prototype or proof-of-concept software written by naive grad students for their advisors. These are largely uncommented hacks with little, if any, sanity checking. To sell these prototypes commercially, I have had to clean up after some of these grads. I take great sadistic pleasure in throwing out two years of effort and rewriting it all from scratch in a couple of weeks.

If we knew what we were doing... (2)

dargaud (518470) | about 10 months ago | (#44944931)

...it wouldn't be called research, would it? Seriously, many scientific projects start with a vague idea and no funds. You do a tabletop experiment, connect it to a 15-year-old computer, then grow from there. In some projects I got no more than a quarter page of specifications for what ended up as 30 thousand lines of code. Yes, I write scientific code, and no, it's not always pretty and refactored and all that. Also, there's never any money.

Too heavy mozilla drives mac users to chrome (1)

hereshalkidiki (3221071) | about 10 months ago | (#44944983)

I've been a faithful Mozilla user for years. However, on MAC, due to the slowness of the browser and the high RAM consumption, I permanently switched to Chrome. So maybe they should run an experiment on how to keep their MAC users, because until now they've been great at that. When I went to buy a VPN from http://vpnarea.com [vpnarea.com] I was surprised to find out that they had an extension for Chrome but not for Mozilla.

I agree, but FYI: (1)

TheSeatOfMyPants (2645007) | about 10 months ago | (#44945139)

MAC [wikipedia.org] (all-caps) - Machine Access Code, a hexadecimal address used to identify individual pieces of hardware on a network
Mac [wikipedia.org] - marketing name for the longstanding "Macintosh" line of computers by Apple

I've used Firefox since it first came out, but it's so damned bloated with unneeded 'extras' that I only stick with it because it's the one browser that allows extensions like AdBlock Plus to block outgoing server requests, not just hide the results. I had defected over to Opera for several months, but when they decided to become a Chrome clone, I gave up on it altogether.

Re:I agree, but FYI: (1)

_merlin (160982) | about 10 months ago | (#44945235)

I've used Firefox since it first came out, but it's so damned bloated with unneeded 'extras' that I only stick with it because it's the one browser that allows extensions like AdBlock Plus to block outgoing server requests, not just hide the results.

FWIW Safari allows extensions to block the requests before they're made as well, although the exact mechanism may be different.

Re:I agree, but FYI: (0)

Anonymous Coward | about 10 months ago | (#44946967)

MAC [wikipedia.org] (all-caps) - Media Access Control, a hexadecimal address used to identify individual pieces of hardware on a network

FTFY! At least read the page you're linking to!

Saying it's a hexadecimal address is nonsense. It's just bits. It's most often written in hexadecimal, but that doesn't make the MAC address itself hexadecimal. After all, an IP address isn't decimal, is it?
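(FWIW, the point is easy to demonstrate in a few lines of Python -- the address bytes below are made up. The same six bytes render equally well in hex, in decimal, or as one 48-bit integer; hex is merely the convention, as dotted decimal is for IPv4.)

```python
# Six arbitrary bytes standing in for a MAC address.
mac = bytes([0x00, 0x1a, 0x2b, 0x3c, 0x4d, 0x5e])

as_hex = ":".join(f"{b:02x}" for b in mac)   # conventional rendering
as_dec = ".".join(str(b) for b in mac)       # same bits, decimal per byte
as_int = int.from_bytes(mac, "big")          # same bits, one 48-bit number
```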

Re:Too heavy mozilla drives mac users to chrome (1)

smash (1351) | about 10 months ago | (#44945217)

Safari was good enough to have me ditch both Mozilla AND Chrome. I've had no real issue with Safari since 4.0... certainly nothing big enough to justify installing another browser to secure and maintain.

Science is supposed to be scary (0)

Anonymous Coward | about 10 months ago | (#44945263)

Peer review and all. But put your stupid egos aside and concentrate on what you're supposedly trying to achieve. That's science, buddy.

Absolutely necessary ... (1)

Anonymous Coward | about 10 months ago | (#44945357)

Most of my colleagues at the university are terrible coders, and I am often not even sure how much I trust their results. Even if it does scare people, there has to be more awareness about code review in the scientific field than there is today.

Quality (0)

Anonymous Coward | about 10 months ago | (#44945385)

Quality doesn't really say much. I assume they mean efficiency, readability, re-usability... things professional coders are confronted with daily.
Roger Peng should stop bitching; good code is written by professional coders, and we don't look down on you (in public) for writing bad code along with research papers. This is an outreach from professional coders to academics, and I find it quite arrogant to warn that this 'might backfire'.

Good intentions, bad implementation (1)

OneSmartFellow (716217) | about 10 months ago | (#44945437)

Having seen some code written by an esteemed Bio-Chemist, I agree that experienced programmers should be reviewing their code, but then, you'd expect a true scientist to have an expert review his stuff anyway.

My experience was a real eye opener. Between the buffer overruns, and logic holes, I am amazed the crap ran at all. The fact that it compiled was a bit of a mystery until I realized that it was possible to ignore compile errors.

Re:Good intentions, bad implementation (2, Insightful)

Anonymous Coward | about 10 months ago | (#44947179)

This is a logical fallacy that many 'smart' people fall into: "I am smart (in this case usually PhDs or people on their way to one), so this XYZ thing should be no sweat." They seem to forget that they spent 10-15 years becoming very good at whatever they do. Becoming a master of it. Yet somehow they also believe they can apply this mastery to other things. In some very narrow cases you can. But many times you cannot. Or, even worse, they assume no one else can understand what they are doing or will 'get it wrong'.

The right thing to do is to find another master in that other field. Even that is dangerous: you will also see many out there who follow in the footsteps of these 'know-it-all' masters, yelling the word 'science' at anyone who disagrees -- not because the disagreement is wrong (maybe it is), but because they do not understand it.

In this case writing code is *easy*, writing good code takes work. Even those who are masters at it make mistakes. We call them bugs. Even when you are good at it you still work at making it correct, even if you do it just because you have 'been there'. There are whole books out there on anti-patterns, patterns, development style, code philosophy, etc. From my POV it usually takes someone about 2 years to become somewhat 'ok' at programming. Somewhere in the 5-10 year mark they become masters. Then that is if they do it every day.

only 200 lines (0)

Anonymous Coward | about 10 months ago | (#44945539)

Sorry, but almost no meaningful review can come of a 200-line sample of some program.

This is the equivalent of saying "we want to review one paragraph of your paper."

You might find a few typos from copy/paste, but good luck catching whole-program issues.

Re:only 200 lines (0)

Anonymous Coward | about 10 months ago | (#44945847)

You're clearly not a programmer.

I could write a 200 page book about the things that can go wrong in 200 lines of code.

200 lines can be filled with duplicate code, logic holes, syntax errors, unnecessarily complicated nested ifs, never-ending loops, doing easy things the complicated way, unreadable code, ...
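(To make one of those concrete, here's a contrived example: a five-line "mean" with exactly the kind of logic hole a reviewer catches, next to the fix. The function and data are made up for illustration.)

```python
def mean_buggy(xs):
    # Logic holes a review would flag: the loop bound skips the last
    # element (off-by-one), and empty input divides by zero.
    total = 0.0
    for i in range(len(xs) - 1):   # last value never counted
        total += xs[i]
    return total / len(xs)

def mean_fixed(xs):
    if not xs:
        raise ValueError("mean of empty sequence")
    return sum(xs) / len(xs)

wrong = mean_buggy([1.0, 2.0, 3.0])   # silently biased low
right = mean_fixed([1.0, 2.0, 3.0])
```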

Re:only 200 lines (0)

Anonymous Coward | about 10 months ago | (#44947007)

I could write a 200 page book about the things that can go wrong in 200 lines of code.

Only if you omit a lot, use a very small font and/or use very large pieces of paper. Otherwise I think it would be impossible to only use 200 pages.

Egoless programming (2, Interesting)

Anonymous Coward | about 10 months ago | (#44945947)

Back in the late 70s middle ages of comp sci...
There was this thing called "egoless programming" being taught. The idea being that we have to inculcate in developers the idea that your code is not necessarily a reflection of your personal worth, and that it deserves to be poked at and prodded, and that you should not take personal offense by it.

Yeah, it's a child of the 60s kind of thing, but it does work.

This is a huge challenge in the biomedical research field, because to be successful, you need personality traits like a strong ego (yes, *I* am brilliant, and my idea is the best, and you should fund it, and not that other bozo).

Re:Egoless programming (1)

John Allsup (987) | about 10 months ago | (#44945973)

That modern research rewards egoism is one of the most dangerous, worrying and disillusioning features of modern research.  The best thinkers are sure to be suffocated in the face of masses of intellectual university graduates chasing research money and the dream of being regarded as one of those 'best thinkers'.

Re:Egoless programming (1)

ebno-10db (1459097) | about 10 months ago | (#44947933)

The idea being that we have to inculcate in developers the idea that your code is not necessarily a reflection of your personal worth, and that it deserves to be poked at and prodded, and that you should not take personal offense by it.

Wusses and namby-pambies. I take the opposite approach. Three or more bugs found in your code results in summary execution, with your corpse hung from the flagpole as a reminder to others.

We need more code out there (1)

John Allsup (987) | about 10 months ago | (#44945959)

and to improve how it looks, and lose the shame that we instinctively feel in the face of criticism.  No-one codes perfectly, so there is always room for useful criticism and progress, and we need to get that awareness of coding issues out as well, not just code alone.

Re:We need more code out there (0)

Anonymous Coward | about 10 months ago | (#44946129)

I resent that.

I code perfectly, so fuck you.

Science is so passe! (0)

mark_reh (2015546) | about 10 months ago | (#44946081)

Faith is where it's at! Looking at "science" journals is like looking at internet pron- it's a one way ticket to H-E-double hockeysticks! You need some proper churchin'!

researcher vs. software developer (5, Informative)

Anonymous Coward | about 10 months ago | (#44946213)

People doing scientific research and software developers are really doing very different things when they write code. For software developers or software engineers, the code is the end goal. They are building a product that they are going to give to others. It should be intuitive to use, robust, produce clear error messages, and be free of bugs and crashes. The code is the product. For someone doing scientific or engineering research, the end goal is testing an idea or running an experiment. The code is a means to an end, not the end itself; it needs only to support the researcher, it only needs to run once, and it only needs to be bug-free in the cases that are being explored. The product is a graph or chart or sentence describing the results that is put into a paper that gets published; the code itself is just a tool.

When I got my Ph.D. in the 1990s, I didn't understand this, and it brought me a lot of grief when I went to a research lab and interacted with software developers and managers, who didn't understand this either. The grief comes about because of the different approaches used during the development of each type of code. Software developers describe their process variously as a waterfall model, agile development model, etc. These processes describe a roadmap, with milestones, and a set of activities that visualize the project at its end and lead towards robust software development. The process a researcher uses is related to the scientific method: based on the question, they formulate a hypothesis, create an experiment, test it, observe the results, and then ask more questions. They do not always know how things will turn out, and they build their path as they go along. Very often, the equivalent "roadmap" in a researcher's mind is incomplete and is developed during the process, because this is part of what is being explored.

In my organization, this creates tremendous conflict between software developers, who want a careful, process-driven model to produce robust code, and researchers, who are seeking to answer more basic questions and explore unknown territory in a way that has a great deal of uncertainty and cannot always easily deliver specific milestones or the schedule clarity that is often desired.

It is worse when the research results in a useful algorithm; of course, the researcher often wants to make it available to the world so that others can use it. This is more of a grey area; if the researcher knows how to do software engineering, they may go through the process to create a more robust product, but this takes effort and time. The fact that Mozilla wants to help debug scientific code is a very good thing; it often needs more serious debugging and re-architecting than other software that is openly available.

I wish more people understood this difference.

so... (0)

Anonymous Coward | about 10 months ago | (#44946553)

...peer review is now bad for science?

Or is it a safety thing? The little researcher written code I have seen is so horrible that it shouldn't be inflicted upon anybody else. Could easily cause heart attacks and strokes in many a programmer if too much of that gets out. :-)

Especially genetic analysis (0)

Anonymous Coward | about 10 months ago | (#44946585)

I've had the rare privilege of reviewing some DNA analysis toolkits in Java. The complete morass of logic-free debris (supposedly "replaced by the new version", written by the same monkey who'd abandoned the old project over how embarrassing all the failures were) was coupled with a complete lack of any kind of error checking, bounds checking, or milestones, so days or weeks of analysis that failed at the very end could be pulled in, the final broken analysis step patched, and the whole thing re-run.

But of course, it was commercial and closed source, so the complete inability to string ATGC together in predictable order from years of data sampling was blamed on everything else, and doubtless millions of dollars of experimental research based on *bad code* was burned by companies excited to have the genetic data they wanted. Be very, very scared of what mistakes by gene sequencing companies can lead to, because a lot of their data is just plain wrong.

The Other Edge of the Sword (4, Interesting)

fygment (444210) | about 10 months ago | (#44947333)

Roger Peng's comment shows a typical, superficial understanding of programming. Ironically, he would be the first to condemn a computer scientist/coder who ventured into biostatistics with a superficial knowledge of biology. I believe he would feel that anyone can program, but not anyone can do biostatistics. And I deeply disagree. Tools have been provided so that _any_ scientist can code. That does not mean that they understand coding or computer science.

I have personally experienced that, especially in the softer sciences like biology, economics, meteorology, etc., the scientists have absolutely no desire to learn any computer science: coding methodology, testing, complexity, algorithms, etc. The result is kludgy, inefficient code, heavily dependent on pre-packaged modules, that produces results that are often a guess: the code produces output, but with no understanding of what the various packaged routines are doing or whether they are appropriate for the task. For example, someone using the default settings of a principal component analysis package without understanding that the package expects the user to have pre-processed the data; the output looks fine, but it is wrong. It is the same as someone approaching engineering without some understanding of thermodynamics and, as a result, wasting their time trying to build a perpetual motion machine.
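(The PCA pitfall above fits in a dozen lines of plain Python -- the data here are hypothetical, and the 2x2 closed form stands in for whatever packaged routine is being misused. If the inputs aren't rescaled to comparable units first, the "principal component" is just the variable with the biggest numbers.)

```python
import math

def first_pc(xs, ys):
    """Leading principal-component direction of 2-D data (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    # Angle of the 2x2 covariance matrix's leading eigenvector.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return math.cos(theta), math.sin(theta)

# Perfectly correlated variables, but x is recorded in millimetres
# while y stays in its natural small units.
ys = [0.1, 0.2, 0.3, 0.4, 0.5]
x_mm = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]

vx, vy = first_pc(x_mm, ys)
# Without rescaling, the component is essentially the x axis alone
# (vx ~ 1, vy ~ 1e-4): y has effectively vanished from the analysis.
```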

Been doing this for 15 years (1)

Anonymous Coward | about 10 months ago | (#44947621)

This has been for the brother-in-law, an MD/PhD at a local school - he sits on several review boards.

The biggie is not the code, but the data set. I like to design data sets to test code rather than do code reviews.

I have also done some code reviews when the b-in-law was not certain, and have found 'bogus' code twice.

Another (anecdotal) point - all the problems found were with life-science students. NONE/ZERO/NADA problems with code done by physical sciences or engineering people. Unless you want to count some of the ugliest Python code ever seen...
