Before I head off on a trip to various parts of not-Barcelona, I thought I’d share a somewhat provocative paper by David Hogg and Soledad Villar. In my capacity as journal editor over the past few years I’ve noticed a phenomenal increase in astrophysics papers discussing applications of various forms of Machine Learning (ML). This paper looks into issues around the use of ML not just in astrophysics but elsewhere in the natural sciences.
The abstract reads:
Machine learning (ML) methods are having a huge impact across all of the sciences. However, ML has a strong ontology – in which only the data exist – and a strong epistemology – in which a model is considered good if it performs well on held-out training data. These philosophies are in strong conflict with both standard practices and key philosophies in the natural sciences. Here, we identify some locations for ML in the natural sciences at which the ontology and epistemology are valuable. For example, when an expressive machine learning model is used in a causal inference to represent the effects of confounders, such as foregrounds, backgrounds, or instrument calibration parameters, the model capacity and loose philosophy of ML can make the results more trustworthy. We also show that there are contexts in which the introduction of ML introduces strong, unwanted statistical biases. For one, when ML models are used to emulate physical (or first-principles) simulations, they introduce strong confirmation biases. For another, when expressive regressions are used to label datasets, those labels cannot be used in downstream joint or ensemble analyses without taking on uncontrolled biases. The question in the title is being asked of all of the natural sciences; that is, we are calling on the scientific communities to take a step back and consider the role and value of ML in their fields; the (partial) answers we give here come from the particular perspective of physics.
arXiv:2405.18095
P.S. The answer to the question posed in the title is probably “yes”.
The topic: language models and open hardware robotics. If you're interested in discovering a side of this other than Skynet and the money-printing machine,
freeCodeCamp released a new course today on fine-tuning LLMs. The course, by Krish Naik, covers tuning methods such as QLoRA, LoRA, and quantization, using models such as Llama 2, Gradient, and Google Gemma.
“The Protein Universe Atlas is a groundbreaking resource for exploring the diversity of proteins. Its user-friendly web interface empowers researchers, biocurators, and students in navigating the “dark matter” to explore proteins of unknown function.”
🥁 That’s what the committee said about this work, one of the #SIBRemarkableOutputs 2023 👏
So… Big Tech is allowed to blatantly steal the work and styles, and with them the job opportunities, of thousands of artists and writers without being reprimanded, but it takes similarity to the voice of a famous actor to spark public outrage about AI. 🤔
MLX is Apple's framework for machine learning on Apple silicon. The MLX examples repository provides a set of examples for using the framework, including:
✅ Text models such as a transformer language model, Llama, Mistral, and Phi-2
✅ Image models such as Stable Diffusion
✅ Audio and speech recognition with OpenAI's Whisper
✅ Support for some Hugging Face models
MIT launched the 2024 edition of the Introduction to Deep Learning course by Prof. Alexander Amini and Prof. Ava Amini. The course started at the end of April and will run until June, with lectures published weekly. The syllabus changes from year to year, reflecting the rapid changes in this field.
Stanford University released a new course last week on Deep Generative Models. The course, by Prof. Stefano Ermon, focuses on the models behind GenAI.
(1/2) Google released a new foundation model for time series forecasting 🚀
TimesFM (Time Series Foundation Model) is a pre-trained foundation model for time series forecasting, developed by the Google Research team. It joins the recent trend of leveraging foundation models for time series forecasting, which includes Salesforce's Moirai and Amazon's Chronos.
Version 0.12.0 of the skforecast Python library for time series forecasting with regression models was released this week. The release includes new features, updates for existing ones, and bug fixes. 🧵👇🏼
This summer there will be four courses 😯:
Computational Neuroscience, NeuroAI, Deep Learning, and Computational Tools for Climate.
Mentors will hold a one-hour meeting every week with a small cohort of students, where they will talk with them and help them progress in their journey in industry and academia.
I work 4/5 time on language models (LLMs, sometimes called AIs) and 2/5 time on open hardware robotics, AMA (jlai.lu)
Hello!...