And another thing! Why should a project be done when it gets published?! What kind of software project would make one release and call it good forever? Nonsense.


In an ideal world you would be right, but that's not how the funding structures for academic research are set up.

Think of a research lab as a company that gets paid per prototype and then has to market the concept for the next prototype in an infinite loop. If you can't package up what you're doing into a sequence of small prototypes then you're not getting paid.


The requirement for publishing in a scientific journal should be open-sourcing all the code used.

You can't expect the results to be reproducible 20 years later otherwise.


20-year-old code that has not been maintained most likely would not run on a modern system anyway, unless it is extremely simple.


Research labs operate on a very slow tech upgrade cycle anyway; since code is handed down from assistant to post-doc to grad student, complete rewrites would take up a significant fraction of a person's time at any given lab, and so codebases are often as long-lived as the labs in which they live. We're talking decades-old FORTRAN here. Running twenty-year-old software is a barrier for some labs, but not all.


My experience is the opposite, that a lot of labs are running R code that won't work 6 months later, and no one actually recorded the package version numbers that were used.
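
Even just dumping sessionInfo() next to the results would have gone a long way. A minimal sketch, assuming base R only:

  # record the attached package versions alongside the analysis output
  writeLines(capture.output(sessionInfo()), "session-info.txt")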


TBF, the experience I have is with physics and mechanical/civil engineering labs, where there historically hasn't been much of a reliance on R. And in any case said experience is several years out of date.

Speaking of, I haven't played with R - what are its standard methods for handling dependencies? I'm particularly enamored of the pip and npm way of doing it, where you create a version-controlled artifact (requirements.txt and package.json, respectively) that defines your dependencies. Does R not have a similar system, or do people just not use it?
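
For concreteness, the pip version of that workflow is roughly this (the version numbers below are just illustrative):

  # capture the exact versions installed in the current environment
  pip freeze > requirements.txt

  # requirements.txt is then just a pinned list, e.g.
  #   numpy==1.21.0
  #   pandas==1.3.5

  # which anyone can restore from later with
  pip install -r requirements.txt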


R isn't fantastic for handling dependencies. If your code is bundled up as a package then you can specify version numbers for your dependencies, but I don't know of any equivalent to `pip freeze` to actually list these. Installing anything other than the latest version of a package is a bit of a pain, and setting up environments for separate projects is pretty much unheard of.
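
A crude approximation is possible with base R, though it dumps everything installed rather than what a given project actually depends on - a sketch, not a standard tool:

  # pip-freeze-style dump of every installed package and its version
  ip <- installed.packages()[, c("Package", "Version")]
  writeLines(paste(ip[, "Package"], ip[, "Version"], sep = "=="),
             "r-requirements.txt")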

I'm a bit bitter about the whole "writing reproducible code in R" thing, as I'm currently wasting a lot of time trying to get R code I wrote at the start of my PhD to run again now that I'm writing up.



It can always be ported. Without the code you're stuck.


One where you don't get sustained funding to maintain it. In compbio, even major resources known to everyone in the domain only have funding from one two-year grant to the next.


I don't really understand what you're suggesting should be done. Data mining?

Also, labs keep the data around, and sometimes use it again for novel post hoc analyses, even years later. In fact, that's what I'm doing now!


Who pays for that?



