@nicksspirit giving a lightning talk ⚡ on "5 Things You Don't Know About #DuckDB" at #PyConUS. A great talk packed with cool features about this OLAP tool
The world of big data, databases, and R is rapidly evolving with an explosion of tools and packages. We're delighted to announce two workshops at posit::conf(2024) tailored for working with large datasets:
• Big Data in R with Arrow, led by Nic Crane and Steph Hazlitt
• Databases with R, led by @kirill
We’re thrilled to announce dplyr powered by DuckDB: duckplyr 🎉
A collaboration between the dplyr project team at Posit, cynkra, and DuckDB, duckplyr is a powerful new option that marries the user-friendly dplyr syntax with the execution capabilities of DuckDB.
For day 7 ("hazards") of the #30DayChartChallenge we use #DuckDB to read in and wrangle the U.S. mass shootings Google Sheet data curated by Mother Jones, looking at the distribution of the number of fatalities per incident from 1982 to 2023.
DuckDB released a new R package - duckplyr, which enables running dplyr functions using the DuckDB engine on the backend ❤️. The package translates dplyr code into DuckDB operations, which will enable dplyr users to work with large datasets at higher performance.
Motivated by the recent blog (https://duckdb.org/2024/04/02/duckplyr) I finally took {duckplyr} for a spin and 🤯. Staggeringly quick (though I only tried the example from the post). Definitely going to kick the tyres some more. {dplyr} on the front with #duckdb at the rear is the perfect example of the cool user experience you can create with #RStats (not forgetting to mention the years of experimenting / hard work from all involved). Big 😃 for me right now.
What I failed to mention in the Bonus Drop post last night is that I have an in-progress e-book on "Cooking With #DuckDB" up & posting new chapters regularly.
I'm messing around with #oauth2 on google. I want to do some of my own picture investigations.
I was finally able to retrieve 'mediaItems'. Now on to ingesting them into a little database... I think I'm going to go with #duckdb. Just because I need some experience with it.
Might as well use #sqlalchemy so I can be reminded how much I dislike it.
@jamesog thanks for sharing this! I’m going to have to play with it. I have some complex #JSON cases that might really benefit. I also see promise in cases where today I might translate #XLSX to #CSV and then import into #SQLite. Why do all that if I can query the original directly? Awesome!