Here's the logical structure of what you will be taught in terms of #statistics as a masters student in pretty much any #science field.
If MY DATA is a sample from two random number generators of a PARTICULAR TYPE, and MY TEST has a small p-value, then MY FAVORITE EXPLANATION FOR THE DIFFERENCES IS TRUE.
This is, quite simply, a logical fallacy. The first thing wrong is that your data IS NOT a sample from a random number generator of that particular type, so the premise is false and the rest of the argument collapses.
Having hypothesized a mechanism for how MY DATA comes about, and explained that in terms of a simplified model with CERTAIN PARAMETERS and collected data from these processes, does the mechanism and the data imply different parameters for my different experimental conditions? Given the results of those analyses, do the models consistently correctly predict future data?
This is the logic of quantitative model building and #Bayesian statistics.
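The workflow above, asking whether the model and data imply different parameters across conditions, can be sketched with a minimal conjugate example. All counts here are hypothetical, chosen only to illustrate the mechanics of comparing a parameter between two experimental conditions:

```python
import random

random.seed(0)

# Hypothetical success counts for two experimental conditions (illustrative only)
successes_a, trials_a = 45, 100
successes_b, trials_b = 30, 100

def posterior_draws(successes, trials, n=10000):
    # Beta(1,1) prior + binomial likelihood -> Beta(successes+1, failures+1) posterior
    return [random.betavariate(successes + 1, trials - successes + 1) for _ in range(n)]

draws_a = posterior_draws(successes_a, trials_a)
draws_b = posterior_draws(successes_b, trials_b)

# Posterior probability that condition A's rate exceeds condition B's
p_a_greater = sum(a > b for a, b in zip(draws_a, draws_b)) / len(draws_a)
```

The point is the shape of the question: not "is the p-value small?", but "given the model, how probable is it that the parameters differ between conditions?"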
Here it is, people. A PhD student describing in detail what they've come to realize are the scientifically bankrupt methodologies their high-powered, successful, well-funded lab PI demands the lab members use. Everything this person says is basically commonplace in today's labs #science #openscience #statistics #bayesian
This week, PyMC version v5.13.0 was released. PyMC is one of the main #Python 🐍 libraries for 𝐁𝐚𝐲𝐞𝐬𝐢𝐚𝐧 statistics ❤️. It provides a framework for probabilistic programming, enabling users to build #Bayesian models with a simple Python API and fit them using 𝐌𝐚𝐫𝐤𝐨𝐯 𝐂𝐡𝐚𝐢𝐧 𝐌𝐨𝐧𝐭𝐞 𝐂𝐚𝐫𝐥𝐨 (MCMC) methods 🚀.
The new release includes new features, bug fixes 🐞, and documentation improvements 📖. More details in the release notes 📝 👇 #DataScience #machinelearning #statistics
The MCMC sampling is simultaneously finished and unfinished before you wake your computer monitor and look at the progress bar. It's Schrödinger's MCMC.
You'll be working with another reviewer to read and run the code and make sure it meets a basic checklist, which usually only takes a few hours; beyond that, focus on whatever you'd like. Both of these are collaborative review processes where the goal is to help these packages be usable, well documented, and maintainable for the overall health of free scientific software.
It's fun, I promise! Happy to answer questions, and boosts welcome.
Edit: feel free to volunteer as a reply here, DM me, or comment on those issues! Anyone is welcome! Some experience with the language is required, but other than that I can coach you through the rest.
Following my previous posts on Bayesian Statistics, if you are looking for a resource to get started with, I recommend watching this great workshop by Angelika Stefan at R-Ladies Amsterdam meetup 👇🏼
The workshop focuses on the foundations of Bayesian statistics and covers topics such as:
✅ Parameter estimation
✅ Prior and posterior distributions
✅ Likelihood
(1/3) Modeling Short Time Series with Prior Knowledge in PyMC 🚀
Yesterday, I shared an article by Tim Radtke about forecasting with insufficient time series data using a Bayesian approach in R. Here is the Python version 🧵👇🏼
(2/3) The TLDR: when you need to model a short time series (less than one seasonal cycle) and have some knowledge or assumptions about the expected behavior of the series, for example from a similar series (i.e., similar products or geos), you can translate those assumptions into the model's prior distributions and use them to build a forecasting model.
(1/2) Modeling Short Time Series with Prior Knowledge
When modeling time series data, you may find yourself with insufficient data, typically defined as less than one seasonal cycle. With so little history, it is hard to tell whether patterns are driven by seasonality or by other causes, such as one-time events or outliers.
(2/2) If you have some prior knowledge about the series (e.g., learned from similar products or geos), you should consider using the Bayesian approach.
The article below by Tim Radtke provides an example of how to incorporate prior assumptions into a time series forecasting model when data is insufficient.
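The core idea, borrowing strength from a similar series via the prior, can be sketched with a conjugate normal update in plain Python. All numbers here are hypothetical, chosen only to illustrate the mechanics:

```python
# Conjugate normal update: combine a prior from a similar series with few observations.
# Every number below is illustrative, not taken from the article.

prior_mean, prior_var = 100.0, 25.0   # e.g., a level learned from a similar product
obs = [112.0, 108.0, 115.0]           # only three observations of the new series
obs_var = 100.0                       # assumed known observation noise

n = len(obs)
obs_mean = sum(obs) / n

# Posterior for the level: a precision-weighted average of prior and data
post_precision = 1.0 / prior_var + n / obs_var
post_mean = (prior_mean / prior_var + n * obs_mean / obs_var) / post_precision
post_var = 1.0 / post_precision
```

With only three data points, the posterior mean sits between the prior and the data, and the posterior variance is smaller than either source alone would give; as more observations arrive, the data dominate the prior automatically.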
We have a new paper on polarisation with an #ABM of naïve Bayesian agents. It ends a decade of thinking about #testimony from a #Bayesian perspective, so I thought I'd summarise that decade in a thread.
The Issue: Much of what we believe we 'know', we know through the testimony of others. Intuitively, how much I adjust my beliefs in response to you saying "it is snowing" should depend on how reliable/accurate you are (i.e., the likelihoods associated with your report) 1/9
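A minimal sketch of that intuition as a Bayes update, where the witness's reliability enters as the likelihood of the report under each state of the world. The numbers are illustrative, not from the paper:

```python
# Bayes update on testimony: weight a report by the reporter's reliability.
# "Reliability" here means P(report | state); all values are hypothetical.

prior_snow = 0.1          # my prior belief that it is snowing
p_report_if_snow = 0.9    # a reliable witness says "snowing" when it snows
p_report_if_dry = 0.1     # ...and rarely says so when it doesn't

# Posterior belief after hearing "it is snowing"
numerator = p_report_if_snow * prior_snow
posterior_snow = numerator / (numerator + p_report_if_dry * (1 - prior_snow))
```

A reliable witness moves a sceptical prior of 0.1 up to 0.5 here; a witness with `p_report_if_snow == p_report_if_dry` would move it not at all.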
(1/6) This time of the year ☃️... Statistical Rethinking 2024 ❤️❤️❤️
This has become a tradition. As in previous Decembers, the 2024 edition of the Statistical Rethinking course was announced this week. If you are looking to learn Bayesian statistics, I highly recommend checking it out.
Nice work by @turion integrating live Bayesian learning into a Functional Reactive Programming app. That's the power of embedded probabilistic programming languages like Monad-Bayes:
Bayesian cross-validation by parallel Markov Chain Monte Carlo by Alex Cooper, Aki Vehtari, Catherine Forbes, Lauren Kennedy, and Dan Simpson. http://arxiv.org/abs/2310.07002
✅ fast, general, parallel brute-force Bayesian cross-validation with GPUs
✅ constant-memory streaming estimates and convergence diagnostics
✅ assessing convergence (Rhat) and accuracy (MCSE) of results aggregated from parallel computations
In my Bayesian Data Analysis course this week I explained the Metropolis algorithm and next week I'll explain HMC and NUTS. For demonstrating these, Chi Feng's MCMC interactive demos https://chi-feng.github.io/mcmc-demo/ have been super useful!
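For readers who want the Metropolis algorithm in code rather than animation, here is a minimal random-walk Metropolis sampler in plain Python, targeting a standard normal. This is a teaching sketch, not the course's own material:

```python
import math
import random

random.seed(42)

def log_target(x):
    # Log-density of a standard normal, up to an additive constant
    return -0.5 * x * x

def metropolis(n_samples, step=1.0, x0=0.0):
    """Random-walk Metropolis: propose x' = x + N(0, step),
    accept with probability min(1, p(x') / p(x))."""
    samples = []
    x = x0
    for _ in range(n_samples):
        proposal = x + random.gauss(0.0, step)
        log_accept = log_target(proposal) - log_target(x)
        if math.log(random.random()) < log_accept:
            x = proposal    # accept the proposal
        samples.append(x)   # on rejection, the current state repeats
    return samples

draws = metropolis(20000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
```

The same accept/reject skeleton underlies HMC and NUTS; they differ in how the proposal is generated (gradient-informed trajectories instead of a blind random walk), which is exactly what the interactive demos make visible.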
New paper: Past, Present, and Future of Software for Bayesian Inference by Erik Štrumbelj, Alexandre Bouchard-Côté, Jukka Corander, Andrew Gelman, Håvard Rue, Lawrence Murray, Henri Pesonen, Martyn Plummer, and me.
This review aims to summarize the most popular software for Bayesian inference and provide a useful map for readers navigating the world of Bayesian computation.