Posit, to random
@Posit@fosstodon.org

The sparklyr package and friends have been getting some important updates in the past few months!

sparklyr is a package that allows you to interact with Spark using familiar R interfaces, such as dplyr, broom, and DBI. You can also gain access to Spark's distributed Machine Learning libraries, Structured Streaming, and ML Pipelines from R.

Read more in the blog post: https://blogs.rstudio.com/ai/posts/2024-04-22-sparklyr-updates/

#RStats #Databricks #Spark

slink, to OpenAI
@slink@fosstodon.org

TIL: #atlassian (#Jira, #Confluence) uses 22 Third-Party Sub-processors "[...] to process personal data on behalf of Atlassian customers[...]" among which are #Databricks for "analytics product features" and #OpenAI for "intelligence product features". https://www.atlassian.com/legal/sub-processors #infosec #cloud #saas #ai

theaiml, to opensource

After months of work and $10 million, Databricks has unveiled DBRX - the world's most potent publicly available open-source large language model.

DBRX outperforms open models like Meta's Llama 2 across benchmarks, even nearing the abilities of OpenAI's closed GPT-4. Novel architectural tweaks like a "mixture of experts" boosted DBRX's training efficiency by 30-50%.

#databricks #opensource #openai #grok #gemini #llm #model #meta #llama #anthropic #claude #chatgpt #top #ai #training #public

Posit, (edited) to random
@Posit@fosstodon.org

Did someone say enhancements for #Databricks users in Posit Workbench?

The recent release of Posit Workbench includes two specific enhancements for Databricks users:

🔑 Delegated Databricks credentials

🖥️ New Databricks UI within #RStudio

With this release, users can log in to a given Databricks workspace when they start an RStudio or #VSCode session and interact directly with the clusters in that Workspace from their preferred environment.

📖 Learn more: https://posit.co/blog/posit-databricks-webinar-dec2023-q-a/

#RStats

Posit, to python
@Posit@fosstodon.org

We’re excited to announce the latest release of Posit Workbench and RStudio IDE 2023.12.0!

• GitHub Copilot now generally available
• #Quarto and R Markdown improvements
• New #Databricks Pane and Databricks ODBC Driver

Learn more in the blog post: https://posit.co/blog/rstudio-2023-12-0-whats-new/

#RStats #Python

Posit, to random
@Posit@fosstodon.org

The new version of pysparklyr is on CRAN! 🎉

Pysparklyr is the new extension to sparklyr that allows you to interact with #Spark & Databricks Connect. The new version has big user-facing updates that make working with #Databricks and #RStats together even easier.

Read more: https://posit.co/blog/pysparklyr-for-interacting-with-spark-databricks-connect/

Posit, to random
@Posit@fosstodon.org

We are thrilled to announce that the latest version of sparklyr is on CRAN. sparklyr is the popular and powerful #RStats interface for #Apache #Spark, including Spark clusters hosted in #Databricks.

Thanks to the new Spark Connect protocol, you can access Spark’s powerful distributed computing features from RStudio Desktop, a Posit Workbench instance, or any running R terminal or process.

Learn more in the blog post: https://posit.co/blog/databricks-clusters-in-rstudio-with-sparklyr/

kellogh, to azure
@kellogh@hachyderm.io

out of curiosity, has anyone seen data corruption issues with either Azure DevOps Repos (Git only) or #Databricks' Git provider? Specifically interested in cases during merge conflicts, where a file was changed and the change isn't reflected in Git history. Boosts appreciated #AzDO #azure #azuredevops #git #github

richard_dick,

@kellogh yup, I’m pretty sure that’s what’s going on here

kellogh,
@kellogh@hachyderm.io

@richard_dick can i get more info? DM if needed…

kellogh, to ai
@kellogh@hachyderm.io

The #databricks #AI assistant is so bad. Seems like they're using a worthless #LLM and also doing a really bad job of stuffing the context

bradfordc,

@kellogh it’s using an OpenAI model AFAIK. in my experience it’s produced more relevant output than vanilla gpt-3.5-turbo for PySpark and SQL generation. some use cases definitely work better than others. what type of tasks have you tried it for? curious what the issues were

kellogh,
@kellogh@hachyderm.io

@bradfordc

  • it annoys me that it doesn’t know about previous messages in the chat so I have to repeat myself

  • explain error wouldn’t even rephrase info that was plainly visible in the error message, like column names. IIRC it was an AnalysisException in pyspark

To be fair it does do a great job with fixing code

kellogh, to random
@kellogh@hachyderm.io

i've got mixed experiences with #databricks. i love the product, but their customer support is almost antagonistic

jbu,

@kellogh I find the product to be a 5d cube of incompatibility. Like if Lego never standardised on a bump size

kellogh,
@kellogh@hachyderm.io

@jbu that’s a good way to put it. The older core stuff is good, but the later stuff is a sprawling horror

changelog, to linux
@changelog@changelog.social

🚀 New episode of The Changelog!

This week we’re taking you to the hallway track of The #Linux Foundation’s #OSSummit North America 2023 in Vancouver, Canada 🇨🇦

This episode features three conversations about #opensource #AI:

1️⃣ Beyang Liu (Co-founder and CTO at #Sourcegraph)
2️⃣ @dennyglee (Developer Advocate at #Databricks)
3️⃣ Stella Biderman (Head of Research at #EleutherAI)

🎧 https://changelog.fm/541
