
shwestrick

@shwestrick@discuss.systems

incoming NYU CS faculty Fall 2024 :: post-doc CSD at CMU :: programming languages :: parallel algorithms :: music :: lead dev of the MaPLe compiler (https://github.com/mpllang/mpl)


adrian, to random

I know how dumb this sounds, but I think it’s overwhelming how many more 64-bit numbers there are than 32-bit numbers. it’s, like, a lot.

shwestrick,

@adrian it’s also fun to think about how small 2^64 is. If my napkin math is right, the biggest supercomputers today (exascale) can enumerate every 64-bit integer in less than a minute.
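Spelling out that napkin math (taking "exascale" to mean roughly 10^18 simple operations per second, which is my assumption):

\[ 2^{64} \approx 1.84 \times 10^{19}, \qquad \frac{1.84 \times 10^{19}}{10^{18}\,\mathrm{ops/s}} \approx 18.4\,\mathrm{s} \]

comfortably under a minute, even with generous slack.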

shwestrick, to random

Very excited to announce that I will be starting in September as an assistant professor of computer science at NYU!

Can't wait to move to New York this summer!

shwestrick,

@boarders definitely, I’d enjoy that! Let’s try to remember for a few months from now.

shwestrick, to random

a little late to the game, but:

I am on the 2024 job market for tenure-track positions!

site:
https://www.cs.cmu.edu/~swestric/
research statement:
https://www.cs.cmu.edu/~swestric/other/westrick-research-statement.pdf

My research focuses on parallel programming. Specifically: I want to make it simpler and safer to develop parallel software.

A common theme in my work is provable efficiency. I want to raise the level of abstraction at which programmers are able to consistently achieve high performance. To do so, we need high-level abstractions that provide the programmer with guarantees on both safety and performance. I work on designing, analyzing, and implementing these abstractions.

For example, my PhD focused on improving the performance of parallel functional languages. The key was identifying a memory property called disentanglement, which enables provably efficient parallel GC with almost no additional synchronization.
https://www.cs.cmu.edu/~swestric/20/popl-disentangled.pdf
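To give the flavor, here's a hypothetical snippet in MPL-style Parallel ML (the data and functions are made up for illustration; ForkJoin.par is MPL's fork-join primitive) showing a disentangled computation:

(* Both branches may read 'input', which was allocated before the fork.
   Each branch allocates only its own result list and never acquires a
   pointer to the other's in-flight allocations. That independence is
   disentanglement, and it lets each task's heap be collected without
   synchronizing with the other. *)
val input = List.tabulate (1000000, fn i => i)

val (squares, cubes) =
  ForkJoin.par
    (fn _ => List.map (fn i => i * i) input,
     fn _ => List.map (fn i => i * i * i) input)

(* after the join, the parent can freely use both results *)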

My dissertation on this topic received the John C. Reynolds Doctoral Dissertation Award from ACM SIGPLAN.
https://sigplan.org/Awards/Dissertation/

In another line of work, we're tackling the granularity control problem. We've developed new language implementation techniques for extremely fine-grained parallel programming, with provable guarantees on both work (low overhead) and span (high parallelism).
https://www.cs.cmu.edu/~swestric/24/popl24-par-manage.pdf

We implemented all of this work by developing MPL, a new compiler and run-time system for a parallel functional language. We use MPL at Carnegie Mellon University to help teach parallel programming to over 500 students each year.
Within just a few weeks, students with no prior parallelism experience are able to write parallel programs that perform well.
https://github.com/mpllang/mpl
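As a taste of what that looks like, a minimal sketch in MPL's Parallel ML (fib here is just an illustration, not course material; ForkJoin.par runs both functions, potentially in parallel, and returns both results):

(* naive parallel Fibonacci: forks at every recursive call *)
fun fib n =
  if n < 2 then n
  else
    let
      val (a, b) = ForkJoin.par (fn _ => fib (n - 1), fn _ => fib (n - 2))
    in
      a + b
    end

(Written this naively, the cost of all those pars adds up quickly; that's exactly the granularity control problem mentioned above.)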

Our experiments show that MPL can outperform memory-managed languages such as Java and Go, and can even compete with low-level, hand-optimized code written in unsafe languages such as C/C++.
https://github.com/MPLLang/parallel-ml-bench

My work has made a lot of progress here, but there's still so much more to do! Please consider hiring me :)

shwestrick, to random

Excited to announce! Accepted at POPL, and preprint available:

Automatic Parallelism Management
by Sam Westrick, Matthew Fluet, Mike Rainey, and Umut A. Acar

https://www.cs.cmu.edu/~swestric/24/popl24-par-manage.pdf

We present a fork-join parallel language where the programmer liberally expresses all opportunities for parallelism, without worrying about the cost of spawning threads/tasks.

In other words, we tackle the granularity control problem. Spawning a task/thread is expensive! If you're not careful, the cost of spawning a parallel task will outweigh its benefits.

Programmers battle this by inserting constant thresholds into their code (e.g., split the problem into N/1000 chunks of size 1000, and spawn only one task per chunk).

This is tedious.

Worse, sometimes the threshold can't be tuned statically: in map(f, S), the right threshold depends on f.
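Concretely, the manual workaround looks something like this sketch in MPL-style Parallel ML (grain, sum, and the splitting strategy are all illustrative, not from the paper):

val grain = 1000  (* hand-tuned constant threshold *)

(* sum of f(lo) + ... + f(hi - 1), forking only above the threshold *)
fun sum f (lo, hi) =
  if hi - lo <= grain then
    let
      (* below the threshold: a plain sequential loop, no spawning *)
      fun loop (i, acc) = if i >= hi then acc else loop (i + 1, acc + f i)
    in
      loop (lo, 0)
    end
  else
    let
      val mid = lo + (hi - lo) div 2
      val (a, b) =
        ForkJoin.par (fn _ => sum f (lo, mid), fn _ => sum f (mid, hi))
    in
      a + b
    end

The right grain depends on how expensive f is, which is exactly why a single static constant can't work in general.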

We address this by compiling par into a form with nearly zero cost. Programmers can then par liberally, without worrying much about overhead. The key is a new calling convention that embeds a "potentially parallel" task into the call stack using only two stack slots(!).

The scheduler then dynamically chooses whether or not to promote a potentially parallel task into an actual parallel thread. We give an algorithm which does this provably efficiently, essentially by applying Heartbeat Scheduling (https://www.chargueraud.org/research/2018/heartbeat/heartbeat.pdf), with some refinements.
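As a toy model of the promotion rule (an illustrative sketch of heartbeat-style scheduling, not MPL's actual runtime; the names and the interval are made up):

val heartbeat = 500  (* promotion interval, in units of work *)

(* A processor tracks work done since its last heartbeat, plus its
   latent (potentially parallel) tasks, oldest first. Every 'heartbeat'
   units of work it promotes at most one latent task, so if promotion
   costs C, spawning overhead is bounded by C/heartbeat per unit of
   work. *)
type 'a state = {sinceBeat: int, latent: 'a list}

fun doWork (promote: 'a -> unit) units ({sinceBeat, latent}: 'a state) =
  let
    val total = sinceBeat + units
  in
    if total < heartbeat then
      {sinceBeat = total, latent = latent}
    else
      (case latent of
         [] => {sinceBeat = total - heartbeat, latent = []}
       | oldest :: rest =>
           (promote oldest; {sinceBeat = total - heartbeat, latent = rest}))
  end

Promoting at most one task per heartbeat is what keeps the spawning overhead a constant fraction of the work, and promoting the oldest task first is (as I read the heartbeat paper) what preserves the span bound.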

Really excited about this paper. We're able to significantly reduce the burden of manually tuning "how much parallelism" to create.

