danderson,
@danderson@hachyderm.io avatar

dangit, my inner data structure has a structural fault because of rust ownership semantics.

Conceptually, the inner struct is a binary tree where inner nodes can carry a value, and leaves can carry a value or a child tree. If you need a leaf to hold both a child and a value, you store the child and move the value to the child's root node.

Conceptually again, lookups walk down this tree-of-trees looking for the node representing the lookup key, and nearest self-or-parent value is the result.

danderson,
@danderson@hachyderm.io avatar

But to make lookups fast, "empty" nodes under a value node hold a reference to their nearest parent value (again conceptually), and the tree is laid out such that you can look up the leaf nodes in a single memory access.

So, as you traverse down, you do leaf lookups only, and the answer you get is one of: nothing here; a value; a reference to a value somewhere up-tree; or a child tree.

danderson,
@danderson@hachyderm.io avatar

To be able to answer the "nearest parent" query a lookup is asking, as you're walking down the tree-of-trees, whenever you see a child tree and continue down, you also remember which value would be the answer if there was no child tree. That way you can traverse downwards a single time, with no backtracking, and if you end up in a sub-tree where you get a "nothing here" answer, the last value you saw in another tree on the way down is your answer.

danderson,
@danderson@hachyderm.io avatar

In C, this is all straightforward to implement: it's all raw pointers, so you just sprinkle the pointers into the various trees where they need to be and that's all fine.

In rust, I implemented the tree nodes as an enum with the 4 alternatives, which is fine... Except for what happens at leaves with children.

Say you had a tree with a value somewhere in the middle, such that a leaf node is a reference to that value. (again reference conceptually, in rust specifically it's an Indirect(usize)

danderson,
@danderson@hachyderm.io avatar

But then as you're inserting a new value, you discover that this leaf needs to be a child tree. That's fine, the rule from earlier is that if you need a child tree at a place that isn't empty, you attach the child tree and move the previous value into the child's root node.

And this is where the rust police kicks down the door, because in the scenario above, the child's root node now references a node in its parent tree, which you can't express cleanly in safe rust.

danderson,
@danderson@hachyderm.io avatar

It's possible to express what's happening here, but fundamentally the C-ish representation of this data structure demands a bunch of mutable pointers that span multiple data structures, forming a graph that basic Rust ownership cannot express. You have to start reaching for Rcs and Cells and other things to transpose the algorithm directly, and I'm trying to avoid that.

danderson,
@danderson@hachyderm.io avatar

The way I implemented this in Go was: leaf nodes in the inner trees can have both a value and a child. So, in rust terms, the leaf node looks something like:

enum Value {
Direct(V),
Indirect(OtherNodeID),
}

struct Leaf {
value: Option<Value>,
child: Option<Box<Table>>
}

Which is fine, that works, but also it doubles the memory footprint of nodes, something I was specifically trying to avoid in the rust impl.

danderson,
@danderson@hachyderm.io avatar

So... Need to ponder some more. This data structure is already quite memory hungry. It's the core tradeoff it makes - memory cost scales down poorly, but for large numbers of values it amortizes decently and in return all operations are very fast indeed.

If I have to double the node size (as I did in Go), the overall type ends up being 1.5-2x more expensive in memory than it needs to be, and that really sucks.

danderson,
@danderson@hachyderm.io avatar

Fundamentally, if I'm sticking to the algorithms as defined, I have 3 options: leaf nodes become 2x larger (carry both a value and a child, both optional), lookups become slower (need to carry enough state to resolve non-ptr cross-tree references), or segfaults (use unsafe and raw ptrs as references, which the struct can hide and still obey the rules of unsafety, but I may have a skill issue).

Or, figure out if a tweak to the algorithms could save me some trouble...

piecritic,
@piecritic@hachyderm.io avatar

@danderson can’t the leaf also be an enum

djm,
@djm@cybervillains.com avatar

@danderson Does RefCell help here?

danderson,
@danderson@hachyderm.io avatar

@djm It's an available option, yeah. I believe the access patterns would behave themselves under runtime borrow checking.

However, a RefCell would also ends up weighing 2 pointers in this instance, so ends up in that solution bucket still (and I think within that solution bucket I can do an implementation that works under static borrow checking, though not 100% certain yet).

4raylee,
@4raylee@mathstodon.xyz avatar

@danderson another approach I’ve used is to give up on using pointers for representing edges in graph algorithms, and using indices into a backing array instead. This can work well as long as you never need to return a language-native pointer to an element. No idea of that’s an easy option here — I haven’t read the algorithm.

danderson,
@danderson@hachyderm.io avatar

@4raylee So, that's what I'm doing in part, yeah. The struct is a tree of trees, and within the inner trees, I use "index of the actual value in the backing array" as the indirection mechanism. But sometimes, it seems, I need to be able to indirect between inner trees, and so would have to also keep track of the path I've taken through the tree to resolve them (or have parent ptrs, but that introduces more headaches with rust's ownership semantics)

dr2chase,
@dr2chase@ohai.social avatar

@danderson if, extremely hypothetically, Go gave you the option to say that a struct was "bit packed", and by that I mean that you could (say) stuff a pointer-to-4-aligned and two booleans into the space of a single pointer, but you could never, ever, take the address of any of its fields, and they would all race together -- is that worthwhile? Less than zero promises, I'm just curious whether that would work well for you or not. There's lots of reasons this will never happen, of course.

danderson,
@danderson@hachyderm.io avatar

@dr2chase For this particular struct, what I need is a union of two pointers. The C implementation uses the pointer's LSB to record the referent's type (terminal value, or child tree).

Given such an unsafe.TwoPointers that the GC knew how to traverse (to keep the ptr alive), the Go implementation could use a third less memory than it does.

I'm not holding my breath though :)

dr2chase,
@dr2chase@ohai.social avatar

@danderson That's actually a type system problem, not a GC problem. The GC is amusingly omnivorous in its consumption of pointers. This would require implementation of tagged unions, which in a "hack" form would embed the tag in the unused pointer bits (assuming that any are unused) and otherwise in an extra field. HOWEVER, access to the pointer + field would require double-wide aligned stores to avoid creating new tearing stores (we want to get rid of the ones we have, not make new ones).

danderson,
@danderson@hachyderm.io avatar

@dr2chase Yeah, I assumed tagged unions would be the blocker here. FWIW, for the cases where I've personally wanted a union of two pointers (which is rare), I would be fine to require that the referents have alignment > 1, to guarantee the presence of a usable tag bit. But that really starts being a magical library type with bespoke semantics, likely not worth the downside.

dr2chase,
@dr2chase@ohai.social avatar

@danderson
No language change necessary, and I will be darned to Gopher heck for this: https://go.dev/play/p/BirpA56Oi1l

danderson,
@danderson@hachyderm.io avatar

@dr2chase Huh, how does that not cause the values to get collected? Oh, if we assume sizeof both types is at least a couple bytes, that's still a pointer into the struct, even if not to the struct, and so it works out?...

dr2chase,
@dr2chase@ohai.social avatar

@danderson The Go GC recognizes interior pointers, yes. I'm surprised nobody else has done this, it's horrible, but I've seen much much worse.

danderson,
@danderson@hachyderm.io avatar

@dr2chase If we accept that, past a certain level of optimizing, crimes are inevitable... This isn't the worst, imo. You even get to hide all the unsafe twiddling inside a nice friendly type and eat the sin for your callers!

irenes,
@irenes@mastodon.social avatar

@danderson yeah, we hear that. like, we do think those ownership rules exist for a reason, lifetime bugs in the C versions of things like this can be very subtle, but... yeah

if you can't remove the up-pointers, you pretty much have to turn them into weak pointers, we don't see another way

danderson,
@danderson@hachyderm.io avatar

@irenes The obvious other way is to replace my current Indirect(usize) values (which tell you what other index of an inner array to look at for the actual value) with a naked *Value, and start using unsafe blocks to manage that.

All the unsafety would be hidden way inside the type, and that's the most efficient straight-line way to achieve equivalent performance to the C version, and it's possible to maintain the invariants... But I'm trying to avoid unsafe because noob.

irenes,
@irenes@mastodon.social avatar

@danderson we very much think it's appropriate to have guard rails no matter how experienced you are.

danderson,
@danderson@hachyderm.io avatar

@irenes Oh certainly, and there could be some type wrapping going on to keep the unsafety very constrained and surrounded by asserts.

... but, data structures that need to do cursed things to go fast is one of the places where judicious unsafety is most common, for a reason :)

creachadair,
@creachadair@mastodon.social avatar

@danderson @irenes The morally defensible case for an unsafe treatment is when you have knowledge that cannot be expressed to the compiler, but which you've already ensured some other way (e.g., by constraining the input higher up in the call tree). It's still unsatisfying, of course. And I always worry about the next joker who tries to cross the street because they saw me do it. 💀

danderson,
@danderson@hachyderm.io avatar

@creachadair @irenes Yup, and in this case, the straight translation from C of the data structure is one of those cases where the borrow checker cannot verify that the language invariants will be respected, but as the implementer of the structure I believe I can enforce those invariants from the caller's perspective, while doing unsafe things within.

Famous last words, of course.

danderson,
@danderson@hachyderm.io avatar

@creachadair @irenes Although rust does have a decent amount of tooling to help with writing correct unsafe code, such as miri (which is very cool: it's an interpreter for rust's MIR intermediate representation, which tracks a bunch of extra state and so can detect quite a lot of invariant violations and invocations of undefined behavior). So, it's still less dangerous than C, arguably.

But I'm still trying very hard not to reach for unsafe until I'm more grown up :)

creachadair,
@creachadair@mastodon.social avatar

@danderson @irenes Yeah. And Rust lets you express to the compiler many things you'd otherwise have to shoulder the burden for yourself. As you say, there's value in seeing how far you can push it, especially at the beginnings of things.

irenes,
@irenes@mastodon.social avatar

@creachadair @danderson we do also think there's value in digging deep to find the compiler-enforced ways to do these things. :) practice it on the projects that don't matter, so you'll have the knowledge for the ones that do

danderson,
@danderson@hachyderm.io avatar

@irenes @creachadair Definitely, although again from what I can tell so far with rust, complex data structures that want to go fast is a prime location where you end up wanting specific data layout and access patterns, and also safe rust's fundamental constraints cannot express them.

A lot of them exist in the gray area intersection of all correct programs, and correct programs that rust's semantics can express. Alas.

danderson,
@danderson@hachyderm.io avatar

@irenes @creachadair And, afaict, rust as a language community considers that totally fine, and gives you two choices: stop wanting the cool data structure that goes brrr (e.g. because another rust-compatible one works just as well for your needs), or drop into unsafe rust, where the correct program you want can be expressed , but the rules become more complex and fuzzy, and many guardrails evaporate.

An expert rust programmer would use unsafe here, IMO, and it would be the right choice.

danderson,
@danderson@hachyderm.io avatar

@irenes @creachadair Unfortunately for me, I'm definitely not confident enough yet to wield unsafe rust.

Fortunately for me, this is not something I need for anything, it's just a learning project, which I picked specifically because it's a good mix of somewhat complex types and algorithms that are forcing me to see and use a lot of the language :)

natevw,
@natevw@toot.cafe avatar

@danderson @irenes @creachadair So I was maybe coming at it from a different angle, but found https://cliffle.com/p/dangerust/ really helpful. My understanding of unsafe Rust is basically similar to silencing a linter warning or (frankly) writing code in C itself — especially if you're writing data structures your job might very well be to craft a few careful tricky parts into a safer (or in Rust's case outright "safe") abstraction.

danderson,
@danderson@hachyderm.io avatar

@natevw @irenes @creachadair Yup, I enjoyed that series, and the framing that unsafe {} is about telling the compiler who is responsible for maintaining some language invariants (you or the compiler).

I still prefer to avoid it as much as possible, because I still need all the help I can get with the language invariants while I'm learning rust.

willglynn,
@willglynn@hachyderm.io avatar

@danderson You can't express it with a &amp; reference to the parent, but you are free to express it with a ReferenceToValueInParentNode(usize) struct or enum variant. Perhaps this is another reason you need a type to represent the state of the tree traversal — if you captured that state, you could use it to interpret that kind of reference.

danderson,
@danderson@hachyderm.io avatar

@willglynn Yeah, adding one more variant to the enum is on my list of options here, but I don't love it because it forces lookups to carry more state and branches, and extremely fast/cheap lookups is one of the key benefits of this type.

But, short of embracing unsafe and stashing raw pointers as the indirection mechanism (and take on all the risk of enforcing the invariants), I don't see many other options.

willglynn,
@willglynn@hachyderm.io avatar

@danderson I think I'd have to see it to make better suggestions.

What makes this a branch in Rust and not a branch in C? Are they all pointers to values in C, some of which are owned and others borrowed, depending on data structure semantics invisible to the Rust compiler?

danderson,
@danderson@hachyderm.io avatar

@willglynn In C, nodes that carry values and "indirects" are both just pointers to the value, and the overall tree-of-trees data structure is responsible for not screwing up ownership and cleanup of this distributed state. It also uses pointer tagging to distinguish between pointer-to-value and pointer-to-child. So, the struct is just a sea of tagged pointers, and hopefully the programmer implemented stuff correctly 😁

And yes, the ownership in C is 🤷 who knows good luck don't segfault

willglynn,
@willglynn@hachyderm.io avatar

@danderson 🤔 This sounds like the data structure does not need to distinguish between pointer-to-value-that-I-own and pointer-to-value-that-I-reference. Value pointers are value pointers, direct or indirect.

If that's the case, can you just… clone() and store the value twice instead of storing explicit indirection?

danderson,
@danderson@hachyderm.io avatar

@willglynn But in C, the lookup state ends up being the input key (provided by caller), a pointer to the current inner table you're traversing, and a pointer of the last potential-result value you saw while traversing. Resolving non-pointer indirections across tables would require also remembering the path through the tree of tables, or putting a weak ptr to the parent in inner trees, and do some more derefs and memory accesses to record the right val.

All doable, just not quite as fast as C :(

willglynn,
@willglynn@hachyderm.io avatar

@danderson I think I need to see this to provide better suggestions.

Those two pointers sound like &, but If you need either/both to be mutable, then we're in the land of mutable aliasing problems. In that case I think this is a type notionally representing those two references but possibly with fewer &muts inside: maybe two &muts for one case, maybe one &mut and two indexes for another.

Do the problematic value pointers exist in the data structure in C, or are they part of the access state?

danderson,
@danderson@hachyderm.io avatar

@willglynn Yeah sorry to be cryptic, this is a somewhat intricate data structure and hard to talk about in short posts :)

The data structure is the ART (Allotment Routing Table), from https://cseweb.ucsd.edu//~varghese/TEACH/cs228/artlookup.pdf . The Go implementation I wrote is https://github.com/tailscale/tailscale/tree/main/net/art , stride_table.go is the "inner" tree type and table.go is the outer one. The go version implements path compression from the paper (necessary for ipv6 performance), but not element consolidation (~= pointer tagging).

danderson,
@danderson@hachyderm.io avatar

@willglynn The rust code isn't published anywhere, because it's less half written and also currently structurally broken, per this thread :)

As my next step I'm going to just disregard performance and memory utilization and get some version of this damn thing written and published, so that it's easier for other people to think about. But it'll take me a little bit :)

willglynn,
@willglynn@hachyderm.io avatar

@danderson Neat! I understand now that the need for the parent route pointer is due to the way this data structure handles longest-prefix matching.

One possible simplification for an MVP would be to require that values (route pointers) be Copy. This lets you use primitive numeric values (like the Go test suite), and it lets you use raw pointers (like the C library), and it lets you use &amp;'a T in Rust.

Box::leak() is safe and typed_arena isn't crazy. Start with T: Copy, relax later? 🤷‍♂️

danderson,
@danderson@hachyderm.io avatar

@willglynn Hmm, yeah, I started with requiring Copy iirc, then got annoyed at the constraint and dropped it 😁

In a perfect world when storing larger structs (which is the typical thing for route tables), it would be ideal to be able to get a &mut back out for in-place editing of route entries... But maybe it's okay to require a read-modify-write for that...

I don't see how the &'a T would work though, doesn't that make the caller's life hell with ownership? Is delete still possible?...

willglynn,
@willglynn@hachyderm.io avatar

@danderson You're promising that the referenced values live longer than the data structure. This is likely impractical for production code, but it's probably viable for the purposes of making tests that pass.

&amp;T is Copy even if T isn't, and you can make &amp;T from a T as long as you're okay with making that lifetime promise. Deletion is possible, but depending on how you got the &amp;T, freeing the T may not be.

danderson,
@danderson@hachyderm.io avatar

@willglynn Okay, so I didn't miss anything on the tradeoff of &T, thanks for the details!

Still pondering what to do here. I think a small algorithm tweak might actually unblock me for now: when doing a lookup, if a leaf is a child table, also record the route in the parent node of that leaf, and don't transfer indirect refs to the child tree on creation.

I think that preserves the semantics, at the cost of more work in lookups. Not great, but maybe tolerable for 1st impl...

willglynn,
@willglynn@hachyderm.io avatar

@danderson My read of the paper is that the data structure doesn't distinguish between a value being inserted once for 0.0.0.0/7, or a value being inserted once for 0.0.0.0/8 and again for 1.0.0.0/8. I think that means you can make that modification? I haven't thought through how that might affect deletion…

willglynn,
@willglynn@hachyderm.io avatar

@danderson There's a potential usage where the ART stores indexes into a table of routing entries, with the access to and lifecycle of those entries managed by the caller. (This is not fundamentally different than C ART storing raw pointers and telling you to figure it out.) This would work fine with a T: Copy bound.

willglynn,
@willglynn@hachyderm.io avatar

@danderson Oh! This is also not unlike TCAM. Hardware longest-prefix matching doesn't return a routing entry, it returns an index into a table of nexthops.

danderson,
@danderson@hachyderm.io avatar

@willglynn Hmm, interesting! Yeah mimicking that was on my list of desires, albeit for a different reason: in the inner trees, there can be at most 511 distinct entries (if you insert every single possible prefix), so the tree's backing array could be compacted to a 2b u16 instead of my current Entry type which is a pointer + union tag at least.

I skipped that initially because I couldn't think of a nice way to implement that backing table without insert/delete getting costlier.

danderson,
@danderson@hachyderm.io avatar

@willglynn But I don't have a good intuition for the amortized cost of growing a Vec on insert and having to track holes on delete.

Extending that idea to only store nexthop indices throughout might change that calculus too, and make the inner trees cheaper enough that the Vec management overhead is inconsequential.

... But for all these questions I really need a first implementation so that I can benchmark :)

willglynn,
@willglynn@hachyderm.io avatar

@danderson I think the fact that this can be externalized is a good reason to leave it out for an MVP. Start with T: Copy, get tests passing using integers, comfort oneself with the knowledge that integers could refer to something bigger 😆

danderson,
@danderson@hachyderm.io avatar

@willglynn Heh, yeah. Although you identified a problem up-thread: inserting some pairs of nested routes with the same value has the wrong semantics on delete, so needs to be forbidden in the API contract. Fortunately at least it's possible to detect the violation on insert and panic.

danderson,
@danderson@hachyderm.io avatar

@willglynn Oh right I remember now, I started down this road last week, and this implementation requires not only Copy but also Eq and PartialEq, and that created a lot of noise all over the code for the huge type bound :/ Unless I missed a way to define a bound alias, but I think I found an RFC about that which was rejected...

willglynn,
@willglynn@hachyderm.io avatar

@danderson Hmm… I think you're right, pointer equality carries important semantics here that T: Copy+PartialEq equality doesn't. Maybe this really does need to hide internal (unbounded) T storage.

danderson,
@danderson@hachyderm.io avatar

@willglynn Yeah, or the caller must only store unique values, and provide the indirection themselves outside the route table if they want to point several routes at one nexthop struct. But of course that continues to make the API less and less usable...

willglynn,
@willglynn@hachyderm.io avatar

@danderson The stride tables rely on compact, readily-duplicated value references. References need to be unique per insert to provide proper deletion – or at least, unique within the table at that point in time. Given these invariants, I don't think this can be a caller-defined type.

Table operations frequently compare and duplicate references, but they only follow the references at the very end, at most once per call. Maybe they're NonZeroUsize referring to an internal Vec&lt;T&gt;?

danderson,
@danderson@hachyderm.io avatar

@willglynn Yeah that was my thinking, although given the maximum number of entries a NonZeroU16 would be even cheaper.

Annoying part with an internal Vec is handling deletes, you end up with holes in your Vec and so have to scan on insert to find a reusable slot.

otoh rust is good at autovectorization and there are tradeoffs w/ branching where you only do the search if the alternative is growing the backing array...

willglynn,
@willglynn@hachyderm.io avatar

@danderson I was thinking a single backing store for all T in the table. You could still parameterize the internal reference type (a trait implemented for various numeric types?) but the low level tree code need not ever know a specific T value type.

danderson,
@danderson@hachyderm.io avatar

@willglynn oh, right, yes if the outer table is storing the Ts then NonZeroUsize is right, and the internal table has an API contract that prefixes are inserted with unique values still - but the outer table enforces that, not the caller.

  • All
  • Subscribed
  • Moderated
  • Favorites
  • random
  • ngwrru68w68
  • rosin
  • GTA5RPClips
  • osvaldo12
  • love
  • Youngstown
  • slotface
  • khanakhh
  • everett
  • kavyap
  • mdbf
  • DreamBathrooms
  • thenastyranch
  • magazineikmin
  • megavids
  • InstantRegret
  • normalnudes
  • tacticalgear
  • cubers
  • ethstaker
  • modclub
  • cisconetworking
  • Durango
  • anitta
  • Leos
  • tester
  • provamag3
  • JUstTest
  • All magazines