tomayac, Be sure to check out Deepti Gandluri's and Austin Eng's talk on #WebAssembly and #WebGPU enhancements for faster Web AI: https://youtu.be/VYJZGa9m34w?si=_PPuJQJ9Zor5AyID. Running Al inference directly on client machines reduces latency, improves privacy by keeping all data on the client, and saves server costs. To accelerate these workloads, #Wasm and WebGPU are evolving to incorporate new low-level primitives.
#GoogleIO
Add comment