Statistical bootstrapping has been on our minds recently. It initially seemed like a good way to analyze the results of campaigns powered by pipe, the Automattic machine learning pipeline. However, as my colleague Demet Dagdelen and I have discovered, bootstrapping isn’t as simple as it seems. Indeed, we spent the equivalent of several workdays discussing the topic and becoming more familiar with the limitations of bootstrapping.
In addition to our internal discussions, I spent some time outside work studying the topic and published a post on common bootstrapping pitfalls. A few months ago, I also gave a conference talk on bootstrapping, which you can watch below. A summary of the talk is available on my blog, along with links to my slides and simulation code.