Hacker News

>I am interested to know the distinction between "production-ready" and "science-ready" code.

In general, scientists don't care how long the code takes to run or how many resources it uses. It is not a big deal to run a script for an extra hour, or to use up a node of a supercomputer. Extravagant solutions or added packages to make the code run smoother or faster only waste time. Speed and elegance really matter only when you know the code is going to be distributed to the community.

Basically, scientists only care whether the result is true: whether the output is sensible, defensible, reliable, and reproducible. It would be considered a dick move to criticize someone's code if the code was proven to produce the correct result.



> It would be considered a dick move to criticize someone's code if the code was proven to produce the correct result.

Formal proof is much, much harder than making code understandable and reviewable. It can be done, but it is not easy, and it can yield surprising results:

https://en.wikipedia.org/wiki/CompCert

http://envisage-project.eu/proving-android-java-and-python-s...


Do you know how you could get to the state that "the code was proven to produce the correct result"?

If not by unit tests, code review or formal logic, then what?


Not all scientific code is amenable to unit testing. In my experience from a PhD in condensed matter physics, the main issue was that how important equations and quantities “should” behave in isolation was often unknown or undocumented, so very often each such component could only be tested as part of a system with known properties.

You can then use unit testing for low-level infrastructure (e.g. checking that your ODE solver works as expected), but do the high-level testing via scientific validation. The first line of defense is to check that you don’t break any laws of physics, e.g. that energy and electric charge are conserved in your end results. Even small implementation mistakes can violate these.
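To illustrate both ideas (this sketch is mine, not the commenter's; the solver and test problems are hypothetical examples): a minimal Python script that unit-tests an RK4 integrator against a problem with a known analytic solution, then runs a physics-level check that the energy of a harmonic oscillator stays conserved.

```python
import math

def rk4_step(f, t, y, h):
    # One classical 4th-order Runge-Kutta step for the ODE y' = f(t, y).
    k1 = f(t, y)
    k2 = f(t + h/2, [yi + h/2*ki for yi, ki in zip(y, k1)])
    k3 = f(t + h/2, [yi + h/2*ki for yi, ki in zip(y, k2)])
    k4 = f(t + h,   [yi + h*ki   for yi, ki in zip(y, k3)])
    return [yi + h/6*(a + 2*b + 2*c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def integrate(f, y0, t0, t1, n):
    # Integrate from t0 to t1 in n fixed steps; return the final state.
    h = (t1 - t0) / n
    y, t = list(y0), t0
    for _ in range(n):
        y = rk4_step(f, t, y, h)
        t += h
    return y

# Low-level unit test: exponential decay y' = -y has the exact
# solution y(t) = exp(-t), so the solver can be checked directly.
y = integrate(lambda t, y: [-y[0]], [1.0], 0.0, 1.0, 100)
assert abs(y[0] - math.exp(-1.0)) < 1e-8

# High-level physics check: for a unit harmonic oscillator
# (x' = v, v' = -x), the energy E = (x^2 + v^2)/2 is conserved by
# the true dynamics; a large drift in E would signal a bug.
x, v = integrate(lambda t, y: [y[1], -y[0]], [1.0, 0.0], 0.0, 10.0, 1000)
assert abs((x*x + v*v)/2 - 0.5) < 1e-8
```

The second check is the weaker but more general one: it does not need an analytic solution, only a conserved quantity, which is exactly the kind of invariant described above.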

Then you search for related existing publications of a theoretical or numerical nature, trying to reproduce their results; the more existing research your code can reproduce, the more certain you can be that it is at least consistent with known science. If this fails, you have something to guide your debugging; or if you’re very lucky, something interesting to write a paper about :).

The final validation step is of course to validate against experiments. This is not well suited for debugging, though, since you can’t easily tell whether a mismatch is due to a software bug, experimental noise, neglected effects in the mathematical model, etc.


>If not by unit tests, code review or formal logic, then what?

Cross referencing independent experiments and external datasets.

Science doesn't work like software. The code can be perfect and still not give results that reflect reality; it can be perfectly logical and still not reflect reality. Most scientists I know go in with the expectation that "the code is wrong" and that its results must be validated against at least one other source.



