My old classification for the high-level reasons you'd make a distributed system:
Your system is fundamentally distributed (e.g. a multiplayer game, or chat system)
You want to increase reliability
You want to increase performance (beyond what one computer can do)
These are very different! Sometimes one technique can help with more than one of these, but more often they're in conflict with each other, and you do different things to achieve each goal.
@shachaf I see the reliability of distributed systems as an empirical claim, and I'm not sure "let's make this system more reliable by distributing it" has panned out more often than not for the industry ;)
After ~4 years, I actually handled poisoned locks for the first time! I have a sequence number + 2 copies thing for lock-free snapshots, but writers must serialise. The write mutex doesn't really protect any invariant, except that we mustn't interleave write sequences. Interruption is fine, as long as the interrupted writer never resumes.
@tobinbaker Only lock-free for readers. I think the kids call that "left-right"… although 3 copies is nicer for readers, under low write load?
For full lock-freedom, I tried single-copy and RDCSS (actually wait-free for readers)… If I had to do it again, I'd probably go for something simpler with generation counters and type-stable records. Manage an explicit free list, and the max # of records scales with the # of concurrent writers. Only lock-free, but a lot fewer atomics, and the space overhead is reasonable.
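For concreteness, the sequence-number + 2-copies scheme from the earlier post might look roughly like this C11 sketch (hypothetical `state_t` and names; writers still serialise through the external mutex, as described):

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical payload type, for illustration. */
typedef struct { uint64_t a, b; } state_t;

static state_t slots[2];                  /* two copies of the data */
static _Atomic uint64_t version;          /* slots[version & 1] is live */

/* Writer: callers serialise through an external mutex, as in the post. */
void write_state(const state_t *next) {
    uint64_t v = atomic_load_explicit(&version, memory_order_relaxed);
    slots[(v + 1) & 1] = *next;           /* fill the spare copy */
    atomic_store_explicit(&version, v + 1, memory_order_release);
}

/* Reader: lock-free; retries only if writers published during the read.
 * (The plain slot read is formally a data race; a real implementation
 * would read the fields through relaxed atomics.) */
state_t read_state(void) {
    for (;;) {
        uint64_t v = atomic_load_explicit(&version, memory_order_acquire);
        state_t s = slots[v & 1];         /* may tear if 2+ writes race us */
        atomic_thread_fence(memory_order_acquire);
        if (atomic_load_explicit(&version, memory_order_relaxed) == v)
            return s;
    }
}
```

With only two copies, a reader can be torn by the second of two back-to-back writes; the version re-check catches that and retries. A third copy just makes the retry rarer, which is why it's nicer for readers under low write load.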
Say I'm manually profiling my code, e.g. recording how much time a function takes. Right now I'm storing all samples to build statistics (e.g. percentiles) on exit, but that means O(n) memory usage for samples. Are there (necessarily approximate) O(1) memory alternatives?
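One bounded-memory option is a fixed-size random reservoir (a sketch with my own sizes and names): it keeps a uniform sample of the stream, from which any percentile can be estimated. Alternatives with tighter guarantees include the P² algorithm, t-digests, and HDR histograms.

```c
#include <stdlib.h>

/* Fixed-size reservoir (Vitter's Algorithm R): O(1) memory in the
 * stream length, uniform sample of everything seen so far. */
#define RESERVOIR_SIZE 1024

typedef struct {
    double sample[RESERVOIR_SIZE];
    unsigned long long seen;       /* total observations so far */
} reservoir_t;

void reservoir_add(reservoir_t *r, double x) {
    if (r->seen < RESERVOIR_SIZE) {
        r->sample[r->seen++] = x;  /* still filling: keep everything */
        return;
    }
    r->seen++;
    /* Replace a random slot with probability RESERVOIR_SIZE / seen,
     * so every observation is equally likely to be in the sample.
     * (rand() is a stand-in; use a wider RNG for long streams.) */
    unsigned long long j = (unsigned long long)rand() % r->seen;
    if (j < RESERVOIR_SIZE)
        r->sample[j] = x;
}

static int cmp_double(const void *a, const void *b) {
    double d = *(const double *)a - *(const double *)b;
    return (d > 0) - (d < 0);
}

/* Approximate p-th percentile (p in [0,100]); sorts the sample in place. */
double reservoir_percentile(reservoir_t *r, double p) {
    size_t n = r->seen < RESERVOIR_SIZE ? (size_t)r->seen : RESERVOIR_SIZE;
    qsort(r->sample, n, sizeof(double), cmp_double);
    return r->sample[(size_t)(p / 100.0 * (n - 1))];
}
```

The error is statistical rather than deterministic, but for timing percentiles a ~1K-entry reservoir is usually plenty, and the memory bound is fixed at compile time.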
Not sure if this is a gcc bug or some weird corner of UB or what...
But I have a packed struct containing a uint32 as the first field. I'm running on ARMv7-M so 32-bit unaligned loads are allowed (but not 64-bit).
This struct is being read directly via casting from a network RX buffer that is likely not aligned to any particular byte boundary. It's a) packed and b) has 32-bit fields in it.
So silly me assumed that gcc would generate either bytewise reads (assuming no alignment at all) or a ldr instruction (accepting that 32-bit unaligned loads are OK).
But for some reason at -O3 it generates a 64-bit read with ldrd, which promptly hard faults. I have no idea why it's doing that given that I was just __builtin_bswap32'ing a single 32-bit field.
Was able to work around the issue with memcpy, but seriously WTF? If I'm using a packed struct I'm explicitly telling the compiler not to make any assumptions about alignment because I'm directly serializing the data from somewhere. Where did it magically get the idea that my packed 32-bit field had 64-bit alignment?
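The memcpy workaround looks something like this (a minimal sketch, not the original code): memcpy carries no alignment assumptions, so the compiler must emit something legal for any address.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from a possibly-unaligned buffer position.
 * The value comes back in buffer byte order; bswap separately if
 * it's network order. */
static inline uint32_t read_u32_unaligned(const void *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

On targets where 32-bit unaligned loads are legal, gcc typically lowers a fixed 4-byte memcpy to a single ldr rather than an actual library call, so there's usually no cost over the cast.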
@azonenberg@whitequark AFAICT, the address arithmetic to construct a pointer way past the end of the struct is UB. Working with uintptr_t is OK I think (until provenance).
@azonenberg@whitequark The problem re UB is that getPathStart works off a field's address. Any pointer derived from that address must be in that field, or just one past the end of the field.
@azonenberg@whitequark@steve A pointer to uint32_t must be aligned… IMO memcpy is the way to go (or the go way, with byte loads + shift/add in software, and hope the compiler recognizes the 32-bit load).
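The byte-loads-plus-shift version might look like this (a sketch; `load_be32` is a hypothetical name):

```c
#include <stdint.h>

/* Big-endian (network order) 32-bit read via byte loads + shifts.
 * No alignment or host-endianness assumptions; gcc/clang often
 * recognise the idiom and emit a single load + byte swap where
 * the target allows it. */
static inline uint32_t load_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}
```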
There's a bit of stuff in this article phrased in terms of changes over time, e.g. compute capability has grown and we no longer need big data. But it seems closer to reality that big data was never required, and continues to not be required. (looking forward to the same style of post happening in a few years vis-a-vis microservices)
@dotstdy@ltratt >10 years ago, $WORK handled enough transactions to observe a couple collisions in random 63-bit ids every day (where each id represents a different [haha] transaction that exchanges a tiny but real amount of money between 2+ companies). I don't think it would have fit on a laptop back then… and SSDs weren't exactly mass market yet.
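Rough birthday-bound arithmetic (my own back-of-the-envelope, not from the post) gives a sense of the scale that implies:

```latex
% Expected collisions among n uniform draws from d = 2^{63} values:
E[\text{collisions}] \approx \frac{n^2}{2d}
% Seeing ~2 collisions per day therefore means
\frac{n^2}{2 \cdot 2^{63}} \approx 2
\quad\Rightarrow\quad
n \approx 2^{32.5} \approx 6 \times 10^{9} \ \text{ids per day}
```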
I think the only reason we didn't have to run everything with colocated CPU and storage like Hadoop is that someone had had the foresight to negotiate a fixed-fee license for Vertica, without any limit on the storage footprint.
@wingo Re Joe Marshall's stack hack, I'm pretty sure Common Larceny compiled to CLR. When I implemented it for delimited continuations in CL, I ended up with only 2x code blowup: one instance for no capture at all, and a fully broken up version of the ANF-ed steps as the jump target when restoring any stack frame. That gave me near-parity for performance without capture, and avoided a quadratic size blow up.
Work is both performance and liability^Wcorrectness oriented, and I noticed a common pattern is that we'll generate commands with a fully deterministic program (i.e., a function), reify the command stream, and act on the commands. The IO monad is real!
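A toy sketch of that pattern (all names hypothetical): the planner is a pure function of its inputs, so the reified command list can be logged, diffed, or replayed before anything acts on it.

```c
#include <stddef.h>

/* Hypothetical command set, purely for illustration. */
typedef enum { CMD_DEBIT, CMD_CREDIT } cmd_kind_t;

typedef struct {
    cmd_kind_t kind;
    int        account;
    long       amount_cents;
} cmd_t;

/* Step 1: a deterministic planner that only *describes* effects. */
size_t plan_transfer(cmd_t out[2], int from, int to, long cents) {
    out[0] = (cmd_t){ CMD_DEBIT,  from, cents };
    out[1] = (cmd_t){ CMD_CREDIT, to,   cents };
    return 2;
}

/* Step 2: a separate interpreter acts on the reified commands. */
void execute(const cmd_t *cmds, size_t n, long balances[]) {
    for (size_t i = 0; i < n; i++) {
        long d = cmds[i].kind == CMD_DEBIT ? -cmds[i].amount_cents
                                           :  cmds[i].amount_cents;
        balances[cmds[i].account] += d;
    }
}
```

Because the planner never touches the world, tests can assert directly on the command list instead of scraping side effects after the fact.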
@pervognsen Yup, auditability/testability is great. I remember Googlers trying to test against or parse logging output. I think they really wanted a command list.
@pervognsen The moment you expose a dependency graph as a service, people like me will send batches of 200K tasks because there's no other pipelining API (that's how I ended up on a call with Azure).
@pervognsen and I think you want to schedule jobs, where jobs consist of n independent tasks. The abstraction isn't perfect, but it lets you scale to millions of tasks.
For people who've been around much longer, have there been any retrospectives on Rust's decision to allow panics to unwind rather than abort? I've mostly come to terms with it in a practical sense, but it's something that really "infects" the language and library ecosystem at a deep level: e.g. fn(&mut T) isn't "the same" as fn(T) -> T, and it's especially troublesome if you're writing unsafe library code and dynamically calling code through closures or traits that could potentially panic.
Hey software license knowledgeable friends. We recently put code out for a paper that is BSD licensed.
What would happen if some other company forked it and made a bunch of changes/ improvements?
Would the fork still carry the EA copyright in its license? And it'd have to stay BSD, right?
Ty, random curiosity :) https://github.com/electronicarts/fastnoise/blob/main/LICENSE.txt
@demofox No patent… yet ;) IME, big co lawyers prefer ASLv2 over BSD because the former comes with a patent grant for using the licensed software… which could be considered important given the domain.
Ok so the internet is the epitome of cache invalidation problems (f5 and dns), and the challenge of naming things (urls). Are there significant off by one errors? :P