ramikrispin, to python
@ramikrispin@mstdn.social

(1/5) Happy Saturday! ☀️
Here are a few steps you can take to reduce your Python 🐍 image size 👇🏼

TL;DR: use a slim base image and a multi-stage build

ramikrispin, to datascience
@ramikrispin@mstdn.social

DevOps for Data Science - New Book 🚀

Always happy to see new MLOps books! DevOps for Data Science is a new book by Alex K Gold. As the name implies, the book focuses on DevOps topics for data scientists, including the following:
✅ Command line
✅ Working with Linux systems
✅ Docker
✅ Scaling resources
✅ Network, domains, DNS, SSL, etc.
✅ Authentication

ramikrispin, to datascience
@ramikrispin@mstdn.social

Building a GPT-2 from scratch 🚀

Andrej Karpathy released a tutorial today for reproducing GPT-2 from scratch. OpenAI released GPT-2 in 2019, and the version reproduced in the tutorial is the 124M-parameter model. This four-hour tutorial covers setting up the GPT-2 network and then training and optimizing its parameters.

It looks like a really cool tutorial; I hope to get the bandwidth to watch it in the coming weeks!

📽️ https://www.youtube.com/watch?v=l8pRSuU81PU

ramikrispin, to ArtificialIntelligence
@ramikrispin@mstdn.social

(1/2) Congratulations to my friend Lior and his co-author Meysam on the release of their new book, Mastering NLP from Foundations to LLMs 🎉

I met Lior a few years ago at a conference, and since then, I have been following his work in the field of NLP ❤️.

4nn4_clickt, to llm
@4nn4_clickt@mstdn.social

How can AI help museums index their collections? Sebastian Ruff of the Stadtmuseum Berlin talks about his experience with tools for automatic keyword generation. His conclusion: working with them costs time at first and the tools have their limits, but they have potential. Important at the start: complete thesauri with authority data and a gap analysis of the data holdings ☝️
https://www.kultur-b-digital.de/digitale-kultur/impulse/ki-im-stadtmuseum-interview-mit-sebastian-ruff/

@museum

leanpub, to datascience
@leanpub@mastodon.social

The Hundred-Page Machine Learning Book (PDF + EPUB + extra PDF formats) by Andriy Burkov is on sale on Leanpub! Its suggested price is $40.00; get it for $14.00 with this coupon: https://leanpub.com/sh/F07t1Azi

physaliacourses2, to datascience

🔍 Want to make your data analysis fully reproducible with R?

🚀 Join us this October for the 3rd edition of our course with @paocorrales and @eliocamp!

Don't miss out! 🔗 https://physalia-courses.org/courses-workshops/r-reproducibility/

#rstats #datascience #reproducibility

ramikrispin, to python
@ramikrispin@mstdn.social

Posit recently released a new Shiny extension for VS Code, supporting both Shiny for R and Python 🚀

More details in the release post 👇🏼
https://shiny.posit.co/blog/posts/shiny-vscode-1.0.0/

Extension 🔗: https://marketplace.visualstudio.com/items?itemName=Posit.shiny

#rstats #python #shiny #DataScience

leanpub, to datascience
@leanpub@mastodon.social

The Hundred-Page Machine Learning Book (PDF + EPUB + extra PDF formats) by Andriy Burkov is on sale on Leanpub! Its suggested price is $40.00; get it for $14.00 with this coupon: https://leanpub.com/sh/RIsQReL4

stevensanderson, to datascience
@stevensanderson@mstdn.social

Learn how to split strings and get the first element in R using base R, stringi, and stringr. Check out my latest post for examples and tips. Give it a try and share your experiences!

#R

Post: https://www.spsanderson.com/steveondata/posts/2024-06-05/


rpodcast, to datascience
@rpodcast@podcastindex.social

Episode 167 of the @rstats @rweekly Highlights Podcast is a full (not partial) match with great R content! https://serve.podhome.fm/episodepage/r-weekly-highlights/issue-2024-w23-highlights

🛠️ Compa-tibble functions @grusonh
🏫 R tutorial worksheets with Quarto @nrennie

We're loving the ways we can add modern features to this show. Once you grab a new podcast app from https://newpodcastapps.com, you can see them in their full glory!

h/t @mike_thomas @jonmcalder 🙏

OpenDataLu, to datascience

Are you a data scientist working for the public sector? The latest best-practice guide from the Ministry for Digitalisation is for you!

https://mindigital.gouvernement.lu/fr/publications/guide-manuel/guide-data-scientists-fr.html

Feel free to publish the results of your analyses on data.public.lu if you can, or to include data already available as open data in your analyses.

#dataScience #openData #Luxembourg #researchLuxembourg

sebkrantz, to datascience
@sebkrantz@fosstodon.org

In the development version of {collapse} [v2.0.15, available via install.packages("collapse", repos = "https://fastverse.r-universe.dev")], the pivot() function has received a FUN argument to support aggregation, including a number of hard-coded internal functions that do this "on the fly". Initial benchmarks show that this significantly outperforms other pivot table options in R. More at https://sebkrantz.github.io/collapse/reference/pivot.html (feel free to test and give feedback).
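
For readers unfamiliar with the idea, here is a rough sketch of what "pivot with an aggregation function" means: rows that land in the same output cell are combined by a function instead of raising a conflict. This is a plain-Python illustration only, not the {collapse} API; the `pivot` helper, its argument names, and the sample data are all hypothetical.

```python
from collections import defaultdict


def pivot(rows, index, names, values, fun=sum):
    """Reshape long-format rows to wide, aggregating duplicate cells with `fun`.

    Hypothetical helper for illustration; not the {collapse} pivot() signature.
    """
    # Collect every value that falls into the same (row key, column key) cell.
    cells = defaultdict(list)
    for row in rows:
        cells[(row[index], row[names])].append(row[values])

    # Reduce each cell's values to a single number with the aggregation function.
    table = defaultdict(dict)
    for (idx, name), vals in cells.items():
        table[idx][name] = fun(vals)
    return dict(table)


# Made-up long-format data: two rows share the (north, 2023) cell.
rows = [
    {"region": "north", "year": 2023, "sales": 10},
    {"region": "north", "year": 2023, "sales": 5},
    {"region": "north", "year": 2024, "sales": 7},
    {"region": "south", "year": 2023, "sales": 3},
]

wide = pivot(rows, index="region", names="year", values="sales", fun=sum)
print(wide)  # {'north': {2023: 15, 2024: 7}, 'south': {2023: 3}}
```

Passing a different callable (for example `max` or `len`) changes how duplicate cells are combined without touching the reshaping logic.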

ramikrispin, to machinelearning
@ramikrispin@mstdn.social

(1/2) I am excited to present at the useR!2024 conference on July 2nd!

I am going to run a virtual workshop about deploying and monitoring data and ML pipelines using free and open-source tools. This includes setting up pipelines using GitHub Actions, Docker 🐳, R, and Quarto 🚀.

When 📆: July 2nd at 10 AM PST

moorejh, to LLMs
@moorejh@mastodon.online

Our KRAGEN paper is out! This method combines LLMs & RAG with Graph of Thoughts for asking complex questions of a knowledge graph or any vector DB https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btae353/7687047

Posit, to python
@Posit@fosstodon.org

posit::conf(2024) virtual tickets are now available!
Join us on August 12-14—from all over the world—to live stream the incredible talks and keynotes that will be taking place in Seattle.

We understand that not everyone will be able to make the trip to Seattle this year, so we’re excited to offer a fully virtual offering for everyone as an alternate option.
REGISTER: https://posit.co/conference/

#posit #rstats #python #pydata #DataScience

news, to ai
@news@mastodon.toptechtidbits.com

AI-Weekly for Tuesday, June 4, 2024 - Issue 115
https://ai-weekly.ai/newsletter-06-04-2024/

The Week's News in Artificial Intelligence
A Mind Vault Solutions, Ltd. Publication

Subscribers: 22,694 Opt-In Subscribers were sent this issue via email.

leanpub, to python
@leanpub@mastodon.social

The course Dirty Data Dojo: Cleaning Data (Excel & Python) by Lee Baker is on sale on Leanpub! Its suggested price is $119.00; get it for $49.50 with this coupon: https://leanpub.com/sh/fwmKxsmd

ramikrispin, to opensource
@ramikrispin@mstdn.social

I am excited to present at the AI_dev Europe conference in Paris on June 19!

I am going to run a workshop about the deployment and monitoring of ML pipelines with free and open-source tools. This includes using tools such as GitHub Actions and Pages, Docker, Python, Quarto, etc.

More details are available on the conference website👇🏼
https://events.linuxfoundation.org/ai-dev-europe/

Thanks to the Linux Foundation and the conference organizers for the invite!

FelipeSMBarros, to python
@FelipeSMBarros@mastodon.social

🚀 Announcement: New Version of the crossfire Python Module!

The new version of the crossfire Python module, developed by me and @cuducos, is available!

✨ What's new:

Bug fixed: now compatible with Google Colab!
Extra feature: a parameter that unpacks nested data to make analysis easier.
Ideal for data journalists and analysts! Sign up for the Fogo Cruzado API and access the data directly in Python.

Map of the Recife region showing points that mark the locations of shootings and their causes, such as "Attacks on civilians", "Police action", among others.

leanpub, to datascience
@leanpub@mastodon.social

The Hundred-Page Machine Learning Book (PDF + EPUB + extra PDF formats) by Andriy Burkov is on sale on Leanpub! Its suggested price is $40.00; get it for $14.00 with this coupon: https://leanpub.com/sh/HEQaRVfD

stevensanderson, to datascience
@stevensanderson@mstdn.social

🚀 TidyDensity's New AIC Functions! 🚀

The TidyDensity package now includes new functions to calculate the Akaike Information Criterion (AIC) for various distributions, streamlining model quality assessment. Use functions like util_negative_binomial_aic() to automate AIC calculations, ensuring precise model evaluation.

Happy coding!

Post: https://www.spsanderson.com/steveondata/posts/2024-05-31/
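
For context, AIC is defined as 2k - 2 ln(L), where k is the number of fitted parameters and L is the maximized likelihood; lower values indicate a better trade-off between fit and complexity. Below is a minimal Python sketch of that formula for a Poisson fit. It is an illustration only, not how TidyDensity computes it; the function names and the sample counts are made up.

```python
import math


def poisson_log_likelihood(counts, rate):
    """Log-likelihood of independent Poisson counts for a given rate."""
    return sum(x * math.log(rate) - rate - math.lgamma(x + 1) for x in counts)


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Made-up sample data for illustration.
counts = [2, 3, 1, 4, 2, 0, 3]

# The maximum-likelihood estimate of a Poisson rate is the sample mean.
rate_hat = sum(counts) / len(counts)

ll = poisson_log_likelihood(counts, rate_hat)
print("AIC:", aic(ll, n_params=1))
```

To compare candidate distributions, compute the AIC for each fit on the same data and prefer the smaller value.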

RConsortium, to HR
@RConsortium@fosstodon.org

🐘✨ Great news from Marcela Victoria Soto at R4HR in Buenos Aires! She recently shared updates about their dynamic activities: "Data analysis is crucial for agile decision-making in companies." Join them on June 1, 2024, for the "Data Visualization in HR" event. Perfect for Spanish-speaking R users interested in HR analytics. 📅👥 Read more: https://www.r-consortium.org/blog/2024/05/30/r4hr-in-buenos-aires-leveraging-r-for-dynamic-hr-solutions

telescoper.blog, to ai
@telescoper.blog@telescoper.blog

Before I head off on a trip to various parts of not-Barcelona, I thought I’d share a somewhat provocative paper by David Hogg and Soledad Villar. In my capacity as journal editor over the past few years I’ve noticed that there has been a phenomenal increase in astrophysics papers discussing applications of various forms of Machine Learning (ML). This paper looks into issues around the use of ML not just in astrophysics but elsewhere in the natural sciences.

The abstract reads:

Machine learning (ML) methods are having a huge impact across all of the sciences. However, ML has a strong ontology – in which only the data exist – and a strong epistemology – in which a model is considered good if it performs well on held-out training data. These philosophies are in strong conflict with both standard practices and key philosophies in the natural sciences. Here, we identify some locations for ML in the natural sciences at which the ontology and epistemology are valuable. For example, when an expressive machine learning model is used in a causal inference to represent the effects of confounders, such as foregrounds, backgrounds, or instrument calibration parameters, the model capacity and loose philosophy of ML can make the results more trustworthy. We also show that there are contexts in which the introduction of ML introduces strong, unwanted statistical biases. For one, when ML models are used to emulate physical (or first-principles) simulations, they introduce strong confirmation biases. For another, when expressive regressions are used to label datasets, those labels cannot be used in downstream joint or ensemble analyses without taking on uncontrolled biases. The question in the title is being asked of all of the natural sciences; that is, we are calling on the scientific communities to take a step back and consider the role and value of ML in their fields; the (partial) answers we give here come from the particular perspective of physics.

arXiv:2405.18095

P.S. The answer to the question posed in the title is probably “yes”.

https://telescoper.blog/2024/05/30/is-machine-learning-good-or-bad-for-the-natural-sciences/

#AI #ArtificialIntelligence #arXiv240518095 #Astrophysics #Cosmology #DataScience #deepLearning #MachineLearning
