
So, I'm wondering about the methodologies used to develop climate models. They're all done in software now, so each one is essentially a large software project, something I (and many other forum members) have a lot of experience with. State-of-the-art climate models are reported to have at least a couple of million lines of code. That number implies an absolutely staggering complexity, arguably too large to have any chance of ensuring correctness.
For comparison, the code for a critical system like the F-35 is of similar complexity, but that program is projected to cost more than $1 trillion over its lifetime (I don't think anything near that was spent on climate modelling), AND it's inherently more testable. Whereas my uninformed suspicion is that the code for climate models is:
1. Developed by underpaid and overworked grad students and postdocs who have every incentive to overlook potential errors in the code (modern academia has the most corrupting incentives) as long as they're reasonably certain that no one will call them out on it.
2. Developed without nearly enough rigor or methodology across the whole software lifecycle. Scientists are basically not software engineers.
3. Tested against the authors' biases. I.e., they run the model, and if it predicts something surprising (to them), they go back to the code and look for bugs; but if it predicts something that confirms their biases (like "+4C by 2100"), they declare mission accomplished, as the model is clearly working now.
3a. They test against past data, for which the correct answers are already known, to see if the model reproduces them. That's probably the best they can do, but of course there's a huge risk of overfitting (https://en.wikipedia.org/wiki/Overfitting). You can build a model that fits both the past data and your biases at the same time.
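To make the overfitting worry in 3a concrete, here is a minimal toy sketch in Python. It has nothing to do with real climate code: the "truth" is an assumed straight line plus noise, and every name and number in it (truth, past_x, future_x, the noise level) is made up for illustration. It compares a two-parameter line against a polynomial with one parameter per data point, first on the "past" data and then on held-out "future" points.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a predictor function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_interpolant(xs, ys):
    """Lagrange polynomial through every point (one free parameter per point)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            basis = 1.0
            for k, xk in enumerate(xs):
                if k != i:
                    basis *= (x - xk) / (xi - xk)
            total += yi * basis
        return total
    return predict

def rmse(predict, xs, ys):
    """Root-mean-square error of a predictor on a data set."""
    return (sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5

def truth(x):
    # Assumed ground truth for this toy only.
    return 2.0 * x + 1.0

rng = random.Random(0)                        # fixed seed for repeatability
past_x = [i / 10 for i in range(10)]          # the data the model is tuned on
future_x = [1.2, 1.3, 1.4, 1.5]               # held out, never used for fitting
past_y = [truth(x) + rng.gauss(0, 0.1) for x in past_x]
future_y = [truth(x) + rng.gauss(0, 0.1) for x in future_x]

line = fit_line(past_x, past_y)
wiggly = fit_interpolant(past_x, past_y)

print(f"past   RMSE: line={rmse(line, past_x, past_y):.3g}, "
      f"interpolant={rmse(wiggly, past_x, past_y):.3g}")
print(f"future RMSE: line={rmse(line, future_x, future_y):.3g}, "
      f"interpolant={rmse(wiggly, future_x, future_y):.3g}")
```

The interpolant matches the past data exactly and still extrapolates far worse than the plain line, which is the failure mode I'm worried about. The usual defence in ordinary software and statistics is to score the model only on data that played no part in tuning it; whether and how climate modellers do that is part of what I'm asking.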
On a broader level, since it's acknowledged that Earth's climate is a chaotic system [1] (https://en.wikipedia.org/wiki/Chaos_theory), isn't modelling it also an activity ruled by chaos? I.e., one where a tiny change to the model's structure, constants, or inputs will affect the model's outputs in a major way? Doesn't this make the problem of climate modelling hopeless?
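For anyone who wants to poke at the chaos question, here is a small Python sketch using the logistic map, a standard textbook chaotic system. It is not a climate model, and the parameter r=3.9, the starting value 0.2, and the 1e-10 perturbation are arbitrary choices of mine. It runs two orbits from almost identical starting points and compares them step by step (loosely, "weather") and by their long-run average (loosely, "climate").

```python
def logistic_orbit(x0, steps, r=3.9):
    """Iterate x -> r * x * (1 - x) and return the whole orbit, x0 included."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

STEPS = 100_000
a = logistic_orbit(0.2, STEPS)
b = logistic_orbit(0.2 + 1e-10, STEPS)        # tiny change to the input

# Largest step-by-step disagreement, early on and a bit later.
early_gap = max(abs(p - q) for p, q in zip(a[:10], b[:10]))
late_gap = max(abs(p - q) for p, q in zip(a[:200], b[:200]))

# Long-run averages of the two orbits.
mean_a = sum(a) / len(a)
mean_b = sum(b) / len(b)

print(f"max gap over first 10 values:  {early_gap:.3g}")
print(f"max gap over first 200 values: {late_gap:.3g}")
print(f"long-run means: {mean_a:.4f} vs {mean_b:.4f}")
```

In this toy, the individual values become completely different within a couple of hundred steps, yet the long-run averages of the two orbits stay close. So sensitivity of the trajectory does not automatically mean sensitivity of the statistics. Whether the real climate, or a real climate model, behaves like that is exactly what I can't judge.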
I'd love to have someone knowledgeable address these doubts of mine.
[1] Or is it just the weather that's chaotic? Can we even know whether the climate is a chaotic or non-chaotic system?