For model calibration (esp via logistic regression), does anyone know of a statistical investigation of the properties of the resulting calibrated predictions?
IOW, if we use predictions from one model as inputs to another model, do we know the probability distribution of the final predictions?
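One rough way to look at the question empirically is to simulate it. The sketch below is not an answer from the literature: it assumes a hypothetical first-stage model whose raw score `s` is miscalibrated (the true probability is `sigmoid(2 * s)`), fits a logistic-regression calibrator (Platt scaling) with a hand-rolled Newton solver, and then bootstraps the calibration set to approximate the sampling distribution of the calibrated prediction at one fixed score. The function names, the simulated data-generating process, and the sample sizes are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow on extreme values.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_platt(scores, labels, iters=15):
    """Fit p = sigmoid(a * score + b) by Newton's method (1-D logistic regression)."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for s, y in zip(scores, labels):
            p = sigmoid(a * s + b)
            w = p * (1.0 - p)
            g_a += (y - p) * s   # gradient of the log-likelihood w.r.t. a
            g_b += y - p         # gradient w.r.t. b
            h_aa += w * s * s    # entries of the (negated) Hessian
            h_ab += w * s
            h_bb += w
        det = h_aa * h_bb - h_ab * h_ab
        if det < 1e-12:
            break
        # Newton step: solve the 2x2 system H * delta = g.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b


def simulate(n, rng):
    """Hypothetical first-stage scores; the true P(y=1) is sigmoid(2 * score)."""
    scores = [rng.gauss(0.0, 1.0) for _ in range(n)]
    labels = [1 if rng.random() < sigmoid(2.0 * s) else 0 for s in scores]
    return scores, labels


def bootstrap_calibrated(scores, labels, s0, n_boot, rng):
    """Refit the calibrator on resampled data; return sorted predictions at s0."""
    n = len(scores)
    preds = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        a, b = fit_platt([scores[i] for i in idx], [labels[i] for i in idx])
        preds.append(sigmoid(a * s0 + b))
    return sorted(preds)


rng = random.Random(0)
scores, labels = simulate(500, rng)
a_hat, b_hat = fit_platt(scores, labels)
preds = bootstrap_calibrated(scores, labels, s0=1.0, n_boot=100, rng=rng)
lo, hi = preds[2], preds[97]  # rough 95% percentile interval from 100 resamples
print(f"a={a_hat:.2f} b={b_hat:.2f} calibrated p(s=1.0) in [{lo:.3f}, {hi:.3f}]")
```

This only captures uncertainty from fitting the calibrator. It treats the first-stage scores as fixed, so uncertainty in the first model itself is not propagated, which is arguably the harder part of the original question.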
I don't think many of my stats folks are here, but FYI - I am registered for this year's Joint Statistical Meetings! Hope to see a bunch of friends there
So I'm probably going to be nerd sniped into developing a Jupyter notebook to examine the question: how well are mid-income families (2 adults and 2 kids) doing relative to how well their parents were doing 30 years earlier? I'm going to use a Dirichlet prior over the weights on a 5-item, CPI-based expense index. The missing part is paired nominal earnings of people and their parents... Anyone know a dataset? #statistics #data #economics @economics@a.gup.pe
This NBER paper used anonymized tax records and actually matched children to their actual parents... For kids born in the 80s to parents who had median incomes, MORE THAN HALF of them had LOWER incomes than their parents, after adjusting by all-items CPI.
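The Dirichlet-weighted expense index idea can be sketched in a few lines of standard-library Python. Everything numeric here is a loudly labeled placeholder: the five category names, the prior concentrations, the 30-year price ratios, and the two nominal incomes are invented for illustration and are not CPI or tax-record data. The index is assumed to be a simple weighted sum of category price ratios.

```python
import random

# ASSUMPTIONS: category names and prior concentrations are illustrative only.
CATEGORIES = ["housing", "food", "transport", "healthcare", "childcare"]
ALPHA = [4.0, 2.0, 2.0, 1.0, 1.0]

# PLACEHOLDER 30-year price ratios per category. NOT real CPI figures.
PRICE_RATIO = [3.0, 2.0, 1.8, 3.5, 4.0]

# PLACEHOLDER nominal incomes for one parent/child pair. NOT real data.
PARENT_NOMINAL = 40_000.0
CHILD_NOMINAL = 90_000.0


def sample_dirichlet(alpha, rng):
    """Draw one weight vector from Dirichlet(alpha) via normalized gammas."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


def real_income_ratio_draws(n_draws, rng):
    """Prior draws of (child real income) / (parent real income)."""
    draws = []
    for _ in range(n_draws):
        weights = sample_dirichlet(ALPHA, rng)
        # Expense index: weighted average of the category price ratios.
        deflator = sum(w * r for w, r in zip(weights, PRICE_RATIO))
        draws.append((CHILD_NOMINAL / PARENT_NOMINAL) / deflator)
    return draws


rng = random.Random(1)
draws = sorted(real_income_ratio_draws(5000, rng))
share_worse = sum(d < 1.0 for d in draws) / len(draws)
print(f"median real ratio: {draws[len(draws) // 2]:.2f}")
print(f"share of weightings where the child is worse off: {share_worse:.2f}")
```

The point of the prior over weights is that "better or worse off" becomes a distribution over plausible household budgets rather than a single number tied to the all-items basket.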
Here's the logical structure of what you will be taught in terms of #statistics as a master's student in pretty much any #science field.
If MY DATA is a sample from two random number generators of a PARTICULAR TYPE, and MY TEST has a small p-value, then MY FAVORITE EXPLANATION FOR THE DIFFERENCES IS TRUE.
This is, quite simply, a logical fallacy. The first thing wrong is that your data IS NOT a sample from a random number generator of that particular type. So we can ignore the rest logically.
So the very basic stuff you are taught in master's-level #biostats or #socialsciences stats or whatever is built on two layers of logical fallacy.
So that's why poor stats practices are absolutely more common than good stats practices. That's why the kind of thing posted on the Reddit post I mentioned in my recent posts is actually everywhere in science. Students are literally taught to do things wrong in the textbooks.
So I promise you: if you name a prominent journal in your field where people publish data-based analyses with standard statistical results, I will find multiple papers published in the last 3 months where the logical fallacies implied by the paper's analysis absolutely demolish the scientific merit of the conclusions, and the correct conclusion from the paper will be "hunh, interesting data, but we learn virtually nothing reliable from the analysis."
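One concrete way the first premise in the thread above can fail is sketched below. This is an illustration under assumptions, not the poster's own analysis: both groups are drawn from the same generator, so every "significant" result is a false positive, and a standard two-sample test (Welch statistic with a normal critical value of 1.96, a large-sample approximation) is applied. When the data really are independent normal draws, the false-positive rate sits near the nominal 5%. When they are autocorrelated (an AR(1) process with an assumed coefficient of 0.8), the same test rejects far more often.

```python
import math
import random


def iid_sample(n, rng):
    """Independent standard-normal draws: the textbook assumption."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def ar1_sample(n, rng, phi=0.8):
    """Autocorrelated draws (AR(1)); the independence assumption is violated."""
    x = rng.gauss(0.0, 1.0) / math.sqrt(1.0 - phi * phi)  # stationary start
    out = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def welch_stat(a, b):
    """Welch two-sample statistic, computed as if observations were independent."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))


def false_positive_rate(sampler, n, n_sims, rng, crit=1.96):
    """Share of simulations where two samples from the SAME generator 'differ'."""
    hits = 0
    for _ in range(n_sims):
        if abs(welch_stat(sampler(n, rng), sampler(n, rng))) > crit:
            hits += 1
    return hits / n_sims


rng = random.Random(42)
rate_iid = false_positive_rate(iid_sample, n=100, n_sims=500, rng=rng)
rate_ar1 = false_positive_rate(ar1_sample, n=100, n_sims=500, rng=rng)
print(f"false-positive rate, independent data:    {rate_iid:.2f}")
print(f"false-positive rate, autocorrelated data: {rate_ar1:.2f}")
```

The small p-values in the second case say nothing about any favorite explanation. They only say that the data did not come from the generator the test assumed.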
useR! 2024, the global R user conference, will be taking place in Salzburg, Austria (as well as virtually) in July 2024. We have a full lineup of giants in the field of data science. Thank you, Maëlle Salmon, for being a part of the conference!
Maëlle Salmon, with a PhD in statistics, is a Research Software Engineer and blogger.
The plotting, statistical, and data selection tools in the mapdata.py data explorer (https://pypi.org/project/mapdata/) can be used even if you don't have any map data. Just add dummy latitude and longitude values to the data table. Zeroes will do. The map and the dummy columns can both be hidden, and you can then explore the data table with the other available tools.
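The dummy-coordinate trick can be done with a short script before opening the file. This is a minimal sketch that assumes the data table is a CSV file; the function name and the column names `latitude` and `longitude` are arbitrary choices for illustration, not names the tool is known to require. It uses only the standard `csv` module and makes no calls into mapdata itself.

```python
import csv


def add_dummy_coords(src_path, dst_path, lat_col="latitude", lon_col="longitude"):
    """Copy a CSV, appending two coordinate columns filled with zeroes.

    Returns the number of data rows written.
    """
    rows_written = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        # Keep the original columns in order and append the dummy ones.
        fieldnames = list(reader.fieldnames or []) + [lat_col, lon_col]
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            row[lat_col] = 0
            row[lon_col] = 0
            writer.writerow(row)
            rows_written += 1
    return rows_written
```

For example, `add_dummy_coords("measurements.csv", "measurements_with_coords.csv")` (both file names hypothetical) writes a copy that can then be loaded with the two new columns designated as the coordinates.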
useR! 2024, the global R user conference, will be taking place in Salzburg, Austria (as well as virtually) in July 2024. We have a full lineup of giants in the field of data science. Thank you, Kurt Hornik!
Professor of Statistics & Mathematics, Chair, Department of Finance, Accounting and Statistics, Wirtschaftsuniversität Wien
useR! 2024, the global R user conference, will be taking place in Salzburg, Austria (as well as virtually) in July 2024. We have a full lineup of giants in the field of data science. Thank you, Dr. @kellybodwin, for being a part of the conference!
Kelly Bodwin is an Associate Professor of Statistics and Data Science at Cal Poly in San Luis Obispo, CA. https://events.linuxfoundation.org/user/
A new online service from the UK’s Office for National Statistics (ONS), the Explore Local Statistics service, collates 57 local measurements, across topics ranging from health and school results to smoking and income levels.
You simply input a postcode, and you can explore, download and map all kinds of statistics that tell you something about what living there is like.
Can anyone who understands statistics on COVID infection rates explain why the severity rating, which used to run from 1-10, is now calculated on a 1-20 scale?
@CuriosityCat
It's automated and she wouldn't see responses, nor can we see what she's responding to on there (unless you have an X account, which I no longer have), which can be frustrating.
Bill Comeau is the same. Grr @MoriartyLab @auscandoc @DenisCOVIDinfoguy @moriartylab
Sponsorship opportunities are now available for useR! 2024.
useR! 2024 is the 2024 edition of the annual R user conference and will be taking place in Salzburg, Austria (as well as online) in July 2024.