Do you want to speak at posit::conf(2024)? Now is the time to start thinking about your talk because the call for proposals has just opened! https://posit.co/blog/speak-at-posit-conf-2024/
We're broadly interested in talks about any aspect of data science, and don't worry if you've never given a talk before as all accepted talks get speaker training from our wonderful partner Articulation.
Many thanks to @mariatta, @lorenipsum, the rest of the @pycon organizing team and @ThePSF staff, and everyone else who made PyCon US in Pittsburgh possible and awesome. See you again in 2025!
I've given several internal versions of this workshop at Amazon and I daresay it's been very well received. The power of these new data wrangling libraries is honestly staggering. We use them all the time at work. You should too.
20 bucks gets you in the door. All proceeds to Ukraine aid orgs. #rstats #pydata
I have been thinking a bit about how to detect supply chain attacks against popular open source projects such as scikit-learn.
If you have practical experience with https://reproducible-builds.org/ in particular in the #Python / #PyData ecosystem, I would be curious about any feedback to the plan I suggest for scikit-learn in the following issue.
Feel free to reply on mastodon first, if you have questions.
We can't replace them, but we welcome anyone looking for a friendly, inclusive community to join us at the Data Science Learning Community (@DSLC) https://DSLC.io
We want to apply to the Google Season of Docs for #PUDL but have never worked with an outside technical writer before. Does anybody have someone to recommend? It's a #Python project focused on producing open data describing the US energy system.
Join #PyData #Pittsburgh for a casual gathering of the local, national, and international PyData community on the sidelines of #PyCon US 2024! Meet up with fellow #DataScience, #MachineLearning, and scientific computing enthusiasts when the world's largest Python conference comes to town.
Opportunity Scholars at posit::conf(2024). The application deadline is approaching fast: March 22nd. If you're a strong candidate or know someone who is, please act quickly.
Opportunity Scholars receive free tickets, a workshop, support for travel and accommodation, plus lots of swag.
Do the one thing I really need Python for via {reticulate} by just sending it the exact dataframe it needs and sending the results back to R for post-processing
Hadn’t occurred to me until recently, but I am really, REALLY liking it.
posit::conf(2024) virtual tickets are now available!
Join us on August 12-14—from all over the world—to live stream the incredible talks and keynotes that will be taking place in Seattle.
We understand that not everyone will be able to make the trip to Seattle this year, so we’re excited to offer a fully virtual option for everyone.
REGISTER: https://posit.co/conference/
cloudpickle is a library used by PySpark, Dask, Ray and joblib / loky (among others) to make it possible to call dynamically or locally defined functions, closures and lambdas on remote Python worker processes.
This is typically necessary for running code in parallel on a distributed computing cluster from an interactive developer environment such as a Jupyter or Databricks notebook.
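A minimal sketch of the idea, assuming cloudpickle is installed. The helper `make_scaler` and the values used are made up for illustration, and the "remote worker" is simulated by deserializing in the same process. The standard `pickle` module refuses the locally defined lambda, while `cloudpickle.dumps` serializes it by value so plain `pickle.loads` can rebuild it:

```python
import pickle

import cloudpickle  # third-party: pip install cloudpickle


def make_scaler(factor):
    # Returns a closure over `factor`; it has no importable name,
    # so the standard pickle module cannot serialize it by reference.
    return lambda x: x * factor


triple = make_scaler(3)

# The standard library pickler fails on locally defined lambdas.
try:
    pickle.dumps(triple)
    stdlib_ok = True
except (pickle.PicklingError, AttributeError):
    stdlib_ok = False

# cloudpickle serializes the function body and its closure by value.
payload = cloudpickle.dumps(triple)

# A worker would receive `payload` as bytes; plain pickle can load it.
restored = pickle.loads(payload)
result = restored(14)
print(stdlib_ok, result)
```

In a real cluster the bytes in `payload` would travel over the network to a worker process, which is roughly what the libraries named above arrange on your behalf.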
The R4DS Online Learning Community has thousands of members, hundreds of whom are active on our Slack every week. You might be wondering: Why not charge those learners? Why is the Community funded through donations?
Registration is now open for our April meetup: 🐲 LLMOps & ML for Drilling Performance and Python & Dungeons, this month at the Repsol offices
I ran a quick Gradient Boosted Trees vs Neural Nets check using scikit-learn's dev branch, which makes it more convenient to work with tabular datasets that mix numerical and categorical features (e.g. the Adult Census dataset).
Let's start with the GBRT model. It's now possible to reproduce the SOTA number on this dataset in a few lines of code and about 2 s (CV included) on my laptop.