tim | TMI: Exhausting all the possibilities

I closed #2735, which had to do with how

let _ = e;

and

e;

can have different semantics in Rust. As you might guess, "_" effectively means "I'm never going to use this, so just give it a name I can't refer to." (Left-hand sides of let decls, by the way, can be any irrefutable pattern.) So why would they be different?

e might have a struct type (structs are the new classes) that has a destructor. So in the first case, the code generator looks at the decl and says "we're binding e to something, so run the destructor when that something goes out of scope". So if you actually had:

let _ = e;
error!("Hello");

e's destructor would run, presumably having observable effects, after "Hello" got printed out. On the other hand, with just e; the code generator sees that you're not doing anything with the result of e;, and runs the destructor immediately after evaluating e.

I changed it by adding a special case that checks the pattern on the left-hand side for "_"-ness. It feels a little odd to have this special case, but weirder to have these two forms behave differently.
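For comparison, here is a small sketch of how the three forms behave in present-day Rust, which (as far as I can tell) treats `let _ = e;` as not binding anything at all. It assumes a current compiler, not the one discussed in this post. `Noisy`, `make`, and `wildcard_vs_named` are made-up names for illustration, and a shared log vector stands in for the error! call.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type for illustration: records when its destructor runs.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

fn make(name: &'static str, log: &Rc<RefCell<Vec<String>>>) -> Noisy {
    Noisy { name, log: Rc::clone(log) }
}

fn wildcard_vs_named() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _ = make("wildcard", &log);   // not bound: destroyed at the end of this statement
        let _named = make("named", &log); // bound: destroyed at the end of the block
        make("statement", &log);          // plain expression statement: destroyed right away
        log.borrow_mut().push("Hello".to_string());
    }
    // Copy the log out so the RefCell borrow ends before `log` itself is dropped.
    let out = log.borrow().clone();
    out
}

fn main() {
    for line in wildcard_vs_named() {
        println!("{line}");
    }
}
```

In this sketch only the `_named` value outlives "Hello"; the wildcard form and the bare statement behave the same way.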

Mainly, though, I'm still working on removing non-exhaustive matches from the Rust codebase. I started compiling a list of all the non-exhaustive matches I removed, and classifying them roughly. The hope is that this will be evidence that we can actually use to figure out what, if any, language features to add to make it easier and safer to express the knowledge that's in your head (but not communicated to the compiler) when you wrote a match check before. (match is the new alt. Oh, syntax changes.)


juli

A common idiom in C++, which may not be anything at all like relevant in Rust, is to use scoped constructor/destructor behavior to acquire a lock within a block of code. I don't recall whether Rust has inbuilt synchronization, but you may find people doing things like that, if there's not some better mechanism on offer. Locks aren't the only things, but by far the most common. It seems like special-casing on _ seems wrong — presumably you have some defined semantics around when constructors and destructors run, right?

tim

Is RAII the name of the idiom you're talking about? This is something I've heard about a lot but not really used in action when programming in C++ (which isn't surprising since I've done very little C++).

Rust has message-passing/actor-style concurrency, so the idea is that while library writers might use locks to implement some concurrency abstractions, ordinary programmers aren't going to use locks explicitly. Or at least that's the hope. However, we do use the same kind of idiom to do things like making sure that files get closed even if the current task gets interrupted by a user-thrown exception.

It seems like special-casing on _ seems wrong — presumably you have some defined semantics around when constructors and destructors run, right?

Semantics? Ha! No, really, though, the lack of formal semantics is proving to be a problem in some ways, such as not knowing what to do in this case. Niko pointed out that if we treat let _ = ... specially, we should also treat let (a,_) = ... specially if the second component of the tuple has a destructor, and so on. So my solution probably isn't quite right, but I don't know what the right thing is yet, partly because it's confusing to ask what's right in the absence of a semantics.

It's RAII, I guess. I've always thought of it as different, but it turns out that I'm wrong about that. The reason I think of it as different is that the thing which handles the scoping is not itself used. That is, you really do want your scoped-lock object to be named _. It has no meaningful operations. (For some reason in my big C++ codebase I decided to add some operations on it, presumably because I'm some kind of jerk.) With automatic reference-counting using the RAII model, obviously the smart pointer (or whatever) is something the programmer uses directly. In one case you're sort of faking syntax the language lacks (using scope for mutual exclusion), and in the other case you're actually using a resource which happens to be bounded by the scope in a meaningful way to fake semantics the language lacks (a reference-counted, garbage collected, etc., heap.)
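The scoped-lock case is where the choice of name matters most. Here is a rough sketch, assuming present-day Rust and its standard `std::sync::Mutex`, neither of which is what this thread was discussing at the time. The two function names are made up for illustration. As far as I know, recent compilers reject the wildcard form by default through the `let_underscore_lock` lint, so the sketch allows it explicitly.

```rust
use std::sync::Mutex;

// Named binding: the guard lives until the end of the function body.
fn locked_with_named_guard(m: &Mutex<i32>) -> bool {
    let _guard = m.lock().unwrap();
    // A second attempt to lock fails while `_guard` is still alive.
    let still_locked = m.try_lock().is_err();
    still_locked
}

// Wildcard: nothing is bound, so the guard is released at the end of the
// `let` statement. The lint is allowed here only to demonstrate the behaviour.
#[allow(let_underscore_lock)]
fn locked_with_wildcard(m: &Mutex<i32>) -> bool {
    let _ = m.lock().unwrap();
    // The lock is already free again, so this attempt succeeds.
    let still_locked = m.try_lock().is_err();
    still_locked
}

fn main() {
    let m = Mutex::new(0);
    println!("named guard still holds the lock: {}", locked_with_named_guard(&m));
    println!("wildcard still holds the lock: {}", locked_with_wildcard(&m));
}
```

So in current Rust a scoped lock has to be given a name such as `_guard`; a bare `_` gives up the lock immediately.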

luinied

...I meant to reply to this when it was posted, but then I had travel and somehow forgot? Anyway: it seems to me that you might define the scope of _ to always be trivial - that is, anything that happens at the end of its scope happens right after everything that happens at the beginning of its scope. Which makes sense to me, because _ is the variable you're swearing you don't actually ever want to use, and it seems like it would give consistent behavior in all the cases you're thinking of. Does this sound like a sensible solution to you?

This doesn't make it clear how to handle something like:

let (a, (_, b), _) = ...;

though. (Supposing that the wildcarded things both have destructors.) Which I know I didn't mention in my OP, but special-casing let _ = ... logically leads to the question of how to handle things with nested _s.

Maybe I'm missing something, but I think you'd:

1. evaluate the right-hand side,
2. bind its .1 component to a and its .2.2 component to b (this isn't something that can somehow have side-effects, is it?), and then
3. run the destructors on the .2.1 and .3 components (in that order if destructors otherwise fire queue style, in the opposite order if they fire stack style), because these parts were "bound" to _, which gives them a trivial scope by virtue of _ being the programmer's declaration that they don't care about whatever this would be bound to.
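The three steps above can be sketched in present-day Rust, which (to my understanding) behaves this way when the right-hand side is a freshly built tuple. This is an illustration under that assumption, not a statement of what the compiler did at the time of this thread. `Noisy`, `mk`, `Log`, and `nested_wildcards` are hypothetical names.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

// Hypothetical helper type: appends to a shared log when destroyed.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

fn mk(name: &'static str, log: &Log) -> Noisy {
    Noisy { name, log: Rc::clone(log) }
}

fn nested_wildcards() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        // Step 1: evaluate the right-hand side into a temporary tuple.
        // Step 2: move the first component into `a` and the inner second one into `b`.
        // Step 3: the parts matched by `_` are never moved out, so they are
        //         destroyed together with the temporary when the statement ends.
        let (a, (_, b), _) = (mk("a", &log), (mk("x", &log), mk("b", &log)), mk("y", &log));
        log.borrow_mut().push(format!("using {} and {}", a.name, b.name));
    } // `a` and `b` are destroyed here, at the end of their scope.
    // Copy the log out so the RefCell borrow ends before `log` itself is dropped.
    let out = log.borrow().clone();
    out
}

fn main() {
    for line in nested_wildcards() {
        println!("{line}");
    }
}
```

One caveat: if the right-hand side is an existing variable instead of a temporary, my understanding is that `_` leaves those parts where they are, and they are destroyed when that variable goes out of scope.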

You are going to go against someone's intuition no matter how you do this. Functional programmers would expect cases like the one you first posted to work the same way, while C++ programmers aren't used to _ being special and would expect it to behave exactly like any other variable. I obviously have my own bias in siding with the functional programmers, but if that's the way you go with Rust, I think that declaring, in the part of the language definition that talks about _, that it effectively has a trivial, immediately-closed scope is a sensible way to do it. (But, again, I don't know Rust like you do, so maybe there are still ambiguities I'm not seeing.)

See https://github.com/mozilla/rust/issues/3181 for ongoing discussion of this! It's not entirely within my depth since I haven't really written a lot of code that used destructors heavily.


Tim's journal

Taking metaphors too far since 1995
