I am starting a new project that is intended to be designed as a (#p2p) protocol eventually with implementations in multiple languages. I know #Python well, but I have been learning #Rust and think I'll need to write at least some of the perf-sensitive components in Rust. Do I prototype it in Python and then rewrite in Rust later, or try and power through and write it in Rust now? #RustLang
edit: added hashtags fwiw
@jonny as someone who uses both professionally, i guess Python would be a slightly better choice, especially if you're more familiar with it. However, you may find this blog post from last week interesting, where the author compares some modern Python features with their rust counterparts:
@guenther
that's a really great read and resource, thank you for sharing :) surprisingly (to me at least) I already do most of those things (except things like using NewType, I like that a lot) without intending to mimic Rust. I guess that's just helpful bleed-over from learning it.
I guess this complicates language choice a little more lol, since what other lessons would I learn by using rust? but yes I eventually would like to write a unified implementation and use pyo3
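For anyone who hasn't used it, here is a minimal sketch of the `NewType` pattern mentioned in the reply above. The names `PeerId`, `ContentHash`, and `lookup` are hypothetical, made up for illustration, and not taken from the blog post or the project.

```python
from typing import NewType

# Hypothetical identifiers: both are plain strings at runtime, but a static
# type checker such as mypy treats them as distinct, incompatible types.
PeerId = NewType("PeerId", str)
ContentHash = NewType("ContentHash", str)


def lookup(peer: PeerId, content: ContentHash) -> str:
    """Describe a lookup; swapping the two arguments is flagged by a type checker."""
    return f"{peer} -> {content}"


peer = PeerId("peer-1")
digest = ContentHash("abc123")
result = lookup(peer, digest)
print(result)
```

This is roughly the Python analogue of a Rust newtype struct, with the caveat that the check happens only in the type checker, not at runtime.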
@jonny If this project is primarily an exercise for you to learn rust, then go for it. I think there are two main arguments against rust:
You may get side-tracked by learning the language and fighting unfamiliar tooling, slowing down progress on the protocol/implementation
Rust is slow to compile, so using Python may allow for faster prototyping. That said, Rust tends to be easier to refactor (thanks to rust-analyzer), which may or may not cancel out the compilation cost.
@jonny avoid pre-optimizing. Until you have evidence some component is the bottleneck, most optimization effort is wasted. The bare minimum is algorithm/data structure. Then measure. Then revisit implementation details. Algorithm? Data structure? Language?
If you decompose the problem into pieces that aren’t hopelessly entangled with each other, then reimplementing in a new language is low risk.
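A rough sketch of the "then measure" step using the standard-library profiler. The functions `hash_block`, `traverse`, and `workload` are hypothetical stand-ins for the hashing and graph traversal discussed in this thread, not real project code.

```python
import cProfile
import hashlib
import io
import pstats


def hash_block(data: bytes) -> str:
    """Hypothetical hashing component, kept behind a small function boundary."""
    return hashlib.sha256(data).hexdigest()


def traverse(graph: dict, start: str) -> list:
    """Hypothetical graph component: iterative depth-first traversal."""
    seen, stack, order = set(), [start], []
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        stack.extend(graph.get(node, []))
    return order


def workload() -> None:
    """Toy workload that exercises both components."""
    graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    for _ in range(1000):
        for node in traverse(graph, "a"):
            hash_block(node.encode())


# Profile the workload and collect per-function timings into a string.
profiler = cProfile.Profile()
profiler.runcall(workload)
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats()
report = stream.getvalue()
print(report)
```

Because each component sits behind its own function, whichever one the report shows to dominate can be reimplemented later without touching the rest.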
distributed messaging + linked data -> ActivityPub
and
BitTorrent + git -> IPFS
then I suppose you could equivalently say "p2p the fediverse" and also "socialize IPFS."
or, trying to make a social, graph-based system for information ranging from snippets of text to massive heterogeneous datasets. p2p that actually has metadata in-protocol, designed for privacy and mutability (not append-only) and specifically trying to focus on backwards compatibility (to eg. HTTP servers, BitTorrent, IPFS, existing semweb/LD).
so, lotsa stuff, but I'm at the point now of trying to formalize and iterate rather than imagine, and I think I have figured out some axiomatic concepts that bring the complexity of the protocol way down so the complexity of its use can explode.
don't quote me until I have running code in hand tho lmao
@jonny Sounds interesting! Have you thought much about how the protocol would be used? I mean, I can guess vaguely where this is headed, but I find having a specific practical need and user experience in mind can be important framing for this sort of work.
so intended use at multiple scales and occasions - using the circumstance of academia to build it both in terms of resources but also the dire need for systems to organize and archive heterogeneous data. so starting out by seeding a cluster of labs at UCLA to use it for managing their data from acquisition to archive, using that to work out kinks in building schemas, data ingestion, indexing, and transferring.
then building that further into a communication system by extending scoped permissions into social graphs, building means of having object types be presented in flexible media (eg. using a chat-like, microblogging, document, notebook, or graph-like interface to create and manipulate subgraphs of the same underlying graph).
then building governance systems for a kind of "federated p2p" where you can explicitly model relationships of permission, mirroring, etc. where eg. a cluster of people might agree to all rehost some specified set of each other's graphs, and that might look like idk within academia departments, institutions, or societies setting up archives, or it might look like people making a chatroom or forum together, and so on. in the context of piracy that might look something like creating a tracker-like thing that uses an agreed upon set of schema to index a particular kind of thing.
so seeding a system using the data needs of academia to get some initial batch of peers with some overhead of donated storage with the intention of making information systems in the public good.
@ngaylinn
currently drafting a high-level set of components and libraries for the system, dev roadmap, comparisons to existing systems, and etc. to make it concrete, that infrastructure piece basically has the idea in longform prose and pseudocode
@jonny Neat! Sounds ambitious and open-ended, but also much needed and timely. I'll definitely be following with interest.
For now, I'm ignorant enough about scientific datasets and how academia works that I have no feedback. If you get to the point of considering user experience, frontend, or serving issues, though, do let me know! I might also be able to scare up a few KG experts from my network, but they'd be more likely to take an interest if there was a promising seed established.
@jonny if you write it in pure Python now it can be used anywhere there's a Python interpreter, and then you can use optional deps and the rust ffi/conditional imports to provide the go-fast button.
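A minimal sketch of the conditional-import idea described above. The module name `protocore_rs` is hypothetical (an imagined optional extension built with pyo3), so on a machine without it the pure-Python fallback is what runs.

```python
import hashlib

try:
    # Hypothetical optional Rust extension; assumed to expose digest(bytes) -> str.
    from protocore_rs import digest as _digest
    BACKEND = "rust"
except ImportError:
    # Pure-Python fallback that works anywhere there is an interpreter.
    def _digest(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    BACKEND = "python"


def digest(data: bytes) -> str:
    """Public entry point; callers never need to know which backend is active."""
    return _digest(data)


print(BACKEND, digest(b""))
```

The rest of the package only ever calls `digest`, so installing the optional dependency would switch on the fast path without any other code changing.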
@jonny I vaguely recall activityPub having a language-independent testing/validation framework. something like that might be more impactful for future implementations than the choice of language for reference implementation?
@ansuz
i was planning on doing TDD writing the spec in parallel, but I have never seen a language independent testing/validation framework so I will take a look at that and see how that works, good idea.
@jonny "language independent" was a bit ambiguous - I think Christine wrote it in some lisp, but that it operates over the network layer? Interoperable is probably the better word
@ansuz
ohhh that makes sense. yes will definitely be writing a spec checking tool :) I shall take a look at how Christine wrote it but I also can only read lisp at a gist level (even tho I find it v interesting)
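One possible shape for a spec checking tool, sketched here as language-neutral test vectors rather than the network-level approach described above (this is a swapped-in technique, not how the ActivityPub test suite works). The JSON layout and the names `VECTORS_JSON` and `check_implementation` are assumptions for illustration only.

```python
import hashlib
import json

# Hypothetical test vectors: plain JSON, so an implementation in any language
# can load the same file and check itself against it.
VECTORS_JSON = """
[
  {"name": "empty", "input_hex": "",
   "sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"},
  {"name": "abc", "input_hex": "616263",
   "sha256": "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"}
]
"""


def check_implementation(digest_fn, vectors_json: str) -> list:
    """Return the names of vectors for which digest_fn gives the wrong answer."""
    failures = []
    for vector in json.loads(vectors_json):
        data = bytes.fromhex(vector["input_hex"])
        if digest_fn(data) != vector["sha256"]:
            failures.append(vector["name"])
    return failures


# The implementation under test; here just the standard library.
failures = check_implementation(
    lambda data: hashlib.sha256(data).hexdigest(), VECTORS_JSON
)
print("failures:", failures)
```

A Rust or other-language implementation could consume the same vectors file, which keeps the spec checks independent of whichever language the reference implementation ends up in.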
I want to write the protocol and the implementation in parallel so I can iterate as I go, and that would definitely be easier for me in Python, but I also feel like I might get led down wrong paths with kludges that are python-idiomatic but don't translate well.
I'm also aware that like anything moderately complex I'll need to completely rewrite it after the first draft, and maybe that would be a time to hop languages?
I also am just really excited to try writing Rust, almost finished the book
@jonny you should pick whichever you need to use to get a basic MVP out the door. That way users can start using it and tell you everything you did wrong.
@sqncs
ok that is smart. I was mostly weighing in my head "what if the python version is too slow to be useful" because it will involve doing a shitload of hashing and graph traversal, but I suppose that's a very pre-optimizing way of thinking yno
@jonny I'd be surprised if there wasn't already a python wrapper for something like that. But yeah if I were in your position I'd feel like I was over optimizing too early in the development.
@sqncs
yes definitely, I have just found async and multiprocessing to be a bit of a headache with the GIL, but there are ways of writing fast Python code for sure, one of the things I mean by not wanting to get misled by python idioms.
@jonny my very ADHD-brain thought is "use whatever will help you sustain your interest."
in my experience, trying a new language on a project can be a double-edged sword. on the one hand, it can make both the project and the language more interesting by giving me two sources of novelty at once, but on the other hand it can also be a source of frustration if I end up banging my head against the language while trying to make progress on the project
@joe_no_body
ack this is almost exactly my line of thought rn. I am leaning towards having a fun thing to play with rather than learning a language because I have been wanting to start this for so long and am tired of not having a prototype to point to!
@jonny you could also do what I do sometimes: start the project in Python, write a bunch of code, feel like it would be actually better in a different language, start over, write a bunch of code, decide actually Python was better, but not that last attempt, start over, write a bunch of code, etc.