Thank you for reading, and thank you for your thought-provoking comment. Much appreciated.

Your characterization of Popper is spot on, I think. And you're absolutely right that DS is inherently inductive. It doesn't get much more inductive than regression, and a great deal of data science is basically regression on roller skates.

And yes, I'd say a lot of DS is pseudo-science, though not so much because its hypotheses are not falsifiable, but more because data scientists don't really try to falsify them. In fact, the "Data Speak" perspective is so ingrained that they accept the "conclusions" of their data (analysis) more or less as truth and don't even get as far as expressing them as hypotheses. (Many, but not all.)

I'm not sure the solution is to try and go deductive, though. I think falsification (Popper's attempt to replace inductivism with something deductive) is where he is weakest. I'm not even sure he was that convinced of it himself, especially in his later writing.

It doesn't really solve the problem; it just replaces the contingency of inductive demonstration with the contingency of the deductive not yet proven false. And falsification is harder than Popper makes out, as Duhem showed pre-Popper and Quine reiterated a little after Popper (check out my earlier falsification article).

Popper was a pretty committed frequentist, and he was influenced enough by positivism (even in opposition) that he was quite down on induction. I think re-readings of Hume (on induction) and the rise of Bayesianism give a third way, which is to take the contingency on the chin, drop any aspiration towards logical or deductive certainty, and work with how data influence plausibility instead.

We're still very much in the Popperian paradigm of conjecture and contest between theories, but where Popper only really accepts refutation, I think Bayesianism and a bit of largess with respect to induction allow for confirmation as a legitimate process to raise and lower the (relative) plausibility of hypotheses.
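To make "raise and lower the (relative) plausibility" a little more concrete, here is a minimal sketch of Bayesian updating over two rival conjectures. Everything in it is made up for illustration: the coin, the two hypotheses ("fair" and "biased"), the even starting plausibilities, the sequence of flips, and the helper name `update`.

```python
# Illustrative only: the hypotheses, priors and observations are invented.

def update(plausibility, likelihood):
    """One application of Bayes' rule over a set of rival hypotheses."""
    weighted = {h: plausibility[h] * likelihood[h] for h in plausibility}
    total = sum(weighted.values())
    return {h: w / total for h, w in weighted.items()}

# Two rival conjectures about a coin: probability of heads under each.
p_heads = {"fair": 0.5, "biased": 0.7}

# Start out indifferent between them.
plausibility = {"fair": 0.5, "biased": 0.5}

history = [plausibility]
for flip in ["H", "H", "T", "H"]:
    # How probable is this observation under each conjecture?
    likelihood = {h: p if flip == "H" else 1 - p for h, p in p_heads.items()}
    plausibility = update(plausibility, likelihood)
    history.append(plausibility)
    print(flip, {h: round(v, 3) for h, v in plausibility.items()})
```

Each head nudges "biased" up and each tail nudges it back down. Neither conjecture is ever refuted or proven; they just trade relative plausibility as the data come in.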

We're never better than our best conjecture anyway, so we may as well try and formalize how we figure out what that is, but also how we make decisions when we can't.

--

Mathematical modelling for business and the business of mathematical modelling. See stochastic.dk/articles for a categorized list of all my articles on Medium.
