And another thing! Why should a project be done when it gets published?! What kind of software project would make one release and call it good forever? Nonsense.


In an ideal world you would be right, but that's not how the funding structures for academic research are set up.

Think of a research lab as a company that gets paid per prototype and then has to market the concept for the next prototype in an infinite loop. If you can't package up what you're doing into a sequence of small prototypes then you're not getting paid.


The requirement for publishing in a scientific journal should be open-sourcing all the code used.

You can't expect the results to be reproducible 20 years later otherwise.


20-year-old code that has not been maintained most likely would not run on a modern system anyway, unless it is extremely simple.


Research labs operate on a very slow tech upgrade cycle anyway; since code is handed down from assistant to post-doc to grad student, complete rewrites would take up a significant fraction of a person's time at any given lab, and so codebases are often as long-lived as the labs in which they live. We're talking decades-old FORTRAN here. Running twenty-year-old software is a barrier for some labs, but not all.


My experience is the opposite, that a lot of labs are running R code that won't work 6 months later, and no one actually recorded the package version numbers that were used.
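
Even just dumping sessionInfo() next to the results would have gone a long way. A minimal sketch, assuming base R only:

  # record the attached package versions alongside the analysis output
  writeLines(capture.output(sessionInfo()), "session-info.txt")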


TBF, the experience I have is with physics and mechanical/civil engineering labs, where there historically hasn't been much of a reliance on R. And in any case said experience is several years out of date.

Speaking of, I haven't played with R - what are its standard methods for handling dependencies? I'm particularly enamored of the pip and npm way of doing it, where you create a version-controlled artifact (requirements.txt and package.json, respectively) that defines your dependencies. Does R not have a similar system, or do people just not use it?
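
For concreteness, the pip version of that workflow is roughly this (the version numbers below are just illustrative):

  # capture the exact versions installed in the current environment
  pip freeze > requirements.txt

  # requirements.txt is then just a pinned list, e.g.
  #   numpy==1.21.0
  #   pandas==1.3.5

  # which anyone can restore from later with
  pip install -r requirements.txt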


R isn't fantastic for handling dependencies. If your code is bundled up as a package then you can specify version numbers for your dependencies, but I don't know of any equivalent to `pip freeze` to actually list these. Installing anything other than the latest version of a package is a bit of a pain, and setting up environments for separate projects is pretty much unheard of.
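
A crude approximation is possible with base R, though it dumps everything installed rather than what a given project actually depends on - a sketch, not a standard tool:

  # pip-freeze-style dump of every installed package and its version
  ip <- installed.packages()[, c("Package", "Version")]
  writeLines(paste(ip[, "Package"], ip[, "Version"], sep = "=="),
             "r-requirements.txt")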

I'm a bit bitter about the whole "writing reproducible code in R" thing, as I'm currently wasting a lot of time trying to get R code I wrote at the start of my PhD to run again now that I'm writing up.



It can always be ported. Without the code you're stuck.


One where you don't get sustained funding to maintain it. In compbio, even major resources known to everyone in the domain only have funding from one two-year grant to the next.


I don't really understand what you're suggesting should be done. Data mining?

Also, labs keep the data around, and sometimes use it again for novel post hoc analyses, even years later. In fact, that's what I'm doing now!


Who pays for that?



