#FluidX3D #CFD v2.17 is out! Some huge #GPU/#CPU hardware has been announced at #Computex, so I've made my code ready. Until now I've been using 32-bit indexing, which overflows for >2³² grid cells in a domain, equivalent to 225 GB VRAM. Now my #OpenCL code will at runtime automatically compile with 64-bit indexing when more cells are used. 🖖🧐
Also, I've added a new raytracing-based field visualization. Thank you @python for the idea! 💡
Release notes 👉 https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.17
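The 32-bit overflow described above can be sketched in a few lines. This is only an illustration, not FluidX3D's actual code: the function names, the `x + (y + z*ny)*nx` memory layout, and the `-D INDEX_T=...` build option are all assumptions invented for this example.

```python
UINT32_CELLS = 2**32  # number of distinct values an unsigned 32-bit index can hold

def linear_index(x, y, z, nx, ny, bits=64):
    """Linear cell index, wrapped to `bits` bits like an unsigned integer would be."""
    return (x + (y + z * ny) * nx) % (1 << bits)

def index_build_option(nx, ny, nz):
    """Choose the index type at runtime (INDEX_T is a hypothetical macro name)."""
    n_cells = nx * ny * nz
    return "-D INDEX_T=ulong" if n_cells > UINT32_CELLS else "-D INDEX_T=uint"

n = 2048  # a 2048^3 grid has 2^33 cells, too many for 32-bit indices
wrapped = linear_index(0, 0, 1024, n, n, bits=32)  # wraps around and collides with cell (0, 0, 0)
full = linear_index(0, 0, 1024, n, n, bits=64)     # the intended index

print(wrapped, full)
print(index_build_option(2048, 2048, 1024))  # exactly 2^32 cells still fit
print(index_build_option(2048, 2048, 2048))  # 2^33 cells need 64-bit indices
```

Selecting the type through a build option at runtime means small domains keep the cheaper 32-bit arithmetic, and only oversized domains pay for 64-bit indices.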
#OpenCL has a compiler flag -cl-fp32-correctly-rounded-divide-sqrt. If you don't pass this, then divisions and square roots are incorrectly rounded. Shouldn't this be the other way around? How many other flags do I need to pass in order for arithmetic to be correct?
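To make "incorrectly rounded" concrete: as far as I understand the OpenCL C specification, single-precision division and sqrt may be off by a few ulp by default, and the flag asks for correct rounding where the device supports it. The sketch below emulates float32 in plain Python and compares a correctly rounded division with a reciprocal-then-multiply shortcut. That shortcut is an assumption chosen purely for illustration, not a statement about what any particular driver does.

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def f32_bits(x):
    """Raw bit pattern of a float32, used to count ulps between two values."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def div_correct(a, b):
    """Correctly rounded float32 division (divide in double, round once)."""
    return f32(f32(a) / f32(b))

def div_reciprocal(a, b):
    """Illustrative shortcut: round 1/b to float32 first, then multiply."""
    r = f32(1.0 / f32(b))
    return f32(f32(a) * r)

a, b = 5.0, 3.0
exact = div_correct(a, b)
approx = div_reciprocal(a, b)
ulps = abs(f32_bits(approx) - f32_bits(exact))
print(exact, approx, ulps)  # the two results differ by one float32 ulp
```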
@bashbaug Used the intercept layers again today and I was wondering if injecting captured buffers/images is something which is either supported (and I haven't found how to do it yet) or something planned.
Like when I'm comparing between vendors with rusticl, it would be helpful if I could just replace image/buffer outputs with the content from a different capture to quickly verify if the first difference is actually causing the bug I'm seeing or if it's something else.
If anyone else encounters a similar problem in the future, buffer (and image!) injection is now implemented in the OpenCL Intercept Layer 🎉.
You can take a buffer or image from one device or driver, inject it as a kernel input for a different device or driver, and see how it affects the results.
We released an updated version of the OpenCL Intercept Layer yesterday, just in time for #IWOCL!
This release supports the latest OpenCL extensions, includes a bunch of performance improvements, and adds a bunch of new features, including the ability to capture an OpenCL kernel and replay it outside of an application for easier debugging.
How realistic can a #CFD simulation be? Here is a 1 billion cell #FluidX3D simulation of an impacting raindrop, fully raytraced in 8K. FluidX3D contains state-of-the-art volume-of-fluid and surface tension models for highly accurate free surface simulations. Combined with my own #OpenCL #raytracing engine, results are rendered on-the-fly at resolutions as large as remaining #GPU VRAM can hold. 🖖😋💧📺 https://youtu.be/MmLNQIW_Sic
FluidX3D is on #GitHub: https://github.com/ProjectPhysX/FluidX3D
#HPC #CUDA #OpenCL #LAPACK
If you had to do a lot of linear least-squares solves, with potentially rank-deficient matrices, what would you use on a GPU? On CPUs, LAPACK's DGELSY does work, but most GPU libraries seem to not implement routines for rank-deficient matrices.
This is wild: #FluidX3D can "SLI" together 🔵 #Intel Arc A770 + 🟢 #Nvidia Titan Xp, pooling 12GB+12GB of their VRAM for one large 450M cell #CFD simulation. Top half on A770, bottom half on Titan Xp. They seamlessly communicate over PCIe. Performance is ~1.7x of what either #GPU could do on its own. 🖖😋🖥🔥 #OpenCL shows its true power here - one implementation works on literally all GPUs at full performance, even at the same time. Happy #SimulationFriday! https://youtu.be/PscbxGVs52o
Anyway, any OpenCL applications you want to see working on Rusticl and which aren't atm? Or in general? It's slowly getting to the state where things "just work".
@VileLasagna has a blog post on the relative speed of different #GPU compute frameworks on the same hardware and driver.
Tl;dr: on an #Nvidia card, with Nvidia drivers, #CUDA is the slowest, by far. Fastest is our old stalwart #OpenCL - almost twice as fast when used only for compute. #Vulkan is good, and the least affected by using the card for your desktop at the same time. Read it - it's good.
Passively participating in #Genuary2024, Day 8: Chaotic System. In 2012/13 I designed an award-winning audioreactive brand identity system for Leeds College Of Music based on the DeJong strange attractor with tens and hundreds of millions of particles per frame. This massive, almost year-long project consisted of a Mac/PC desktop app (written in Clojure, OpenCL & OpenGL) for exploring the attractor, creating presets and scheduling render jobs for super hi-res print assets (which would take hours to render and were the biggest image sizes I ever had to deal with, up to 3x3 meters @ 150 dpi). I also had to develop an entire AWS-based ad-hoc render farm and asset & user management system for the school to generate personalized video assets, allowing each student to upload their own music, handle audio FFT analysis and beat detection/mapping (all in Clojure) and to create individual sound-responsive clips for their in-school digital signage system and for sharing on social media... Most key aspects were handled via various old thi.ng libraries (e.g. https://thi.ng/simplecl for OpenCL interop). The server app also handled transcoding to dozens of video formats (via ffmpeg) and semi-automatic provisioning of EC2 machines for render/transcoding jobs...
An example video is below (music: Heyoka, Blue Towel)
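For readers unfamiliar with the attractor mentioned above, here is a minimal sketch of the commonly cited Peter de Jong map, iterated on the CPU. It is not the project's Clojure/OpenCL code: the function name and the parameter values are illustrative assumptions, and a real renderer would accumulate many millions of points into a density image rather than a list.

```python
import math

def de_jong_orbit(a, b, c, d, n, x=0.0, y=0.0):
    """Iterate the de Jong map n times from (x, y) and return the visited points."""
    points = []
    for _ in range(n):
        # Both coordinates are updated from the previous (x, y) pair.
        x, y = (math.sin(a * y) - math.cos(b * x),
                math.sin(c * x) - math.cos(d * y))
        points.append((x, y))
    return points

# Illustrative parameter set; each choice of (a, b, c, d) gives a different shape.
points = de_jong_orbit(1.4, -2.3, 2.4, -2.1, 10_000)
print(len(points), points[0])
```

Every point stays inside the square [-2, 2] x [-2, 2], because each coordinate is a sine minus a cosine, which makes mapping points onto a fixed-size image straightforward.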
@gabmus
I use #rocm 5.7 to run #opencl, google's #jax (for pymc), and #pytorch on two vega cards (Vega 64 and Radeon pro WX9100) on arch and ubuntu. They all run OK, but correct setup needs some googling around, and jax needs exporting some #xla flags. Situation is much, much better than 2 years ago, though. @oblomov
#FluidX3D has passed 2000 Stars! It is the most popular #CFD software on #GitHub now! 🖖😊⭐️ https://github.com/ProjectPhysX/FluidX3D
Feeling blessed that my work is useful to so many people across the globe, with users in 75 countries already! 🌍
42% EU, 30% Americas, 25% Asia, 3% Oceania+Africa
The red lightning bolt continues: #FluidX3D has passed 3000 Stargazers on #GitHub - from 82 countries! 🖖🥳⭐
Releasing this software for free really has turned out win-win: I've received so much valuable feedback, and answered with as many bug fixes and updates, with many more to come. I am enabling cutting-edge #CFD simulations for everyone, with very little hardware resources, on literally every computer that has a #GPU, regardless of vendor.
👉 https://github.com/ProjectPhysX/FluidX3D #SimulationFriday #OpenCL