TMI: More typechecker refactoring
Feb. 25th, 2013 08:12 pm
failures:
[run-pass] /Users/tchevalier/rust3/src/test/run-pass/unreachable-code-1.rs
[run-pass] /Users/tchevalier/rust3/src/test/run-pass/unreachable-code.rs
[run-pass] /Users/tchevalier/rust3/src/test/run-pass/weird-exprs.rs
There was something satisfying about that. Little weird-exprs tests doing their job.
As for why the unreachable-code tests failed: trans (the part of rustc that translates Rust to LLVM) tries to use type information to avoid generating code for unreachable Rust expressions. This is all pretty ad hoc, but since we have the _|_ type in the type system, why not use it?
The expression in this case looks like return + 1. In the old typechecker, if you looked up the type for this instance of 1, it would have been int or uint. But return always returns and arguments to + in Rust are always evaluated left to right, so we know that the 1 is unreachable here. So the new typechecker says 1 has type _|_ in this context.
That caused the code to translate int literals to complain, since it's not expecting an int to have type _|_. Why would anyone write code like this, you ask? They probably wouldn't, but macros could.)
So I'll just make trans more consistent in when it avoids generating code for unreachable expressions. Arguably it might be better to make trans not approximate what code is unreachable and just treat _|_-typed things as if they are reachable and generate code for them, but that's a change for another day.
(no subject)
Date: 2013-02-26 05:38 am (UTC)(no subject)
Date: 2013-02-26 07:40 am (UTC)We care because currently, the Rust typechecker looks for unreachable expressions -- ones that would have to be preceded by evaluating a value of type _|_ in the same context -- and warns about them. Though this should really be done outside the typechecker anyway.
The term comes from the theory of lattices and partial orders, and I guess Wikipedia is as good for the background on that as anything, though you don't have to understand all the math to understand how it works in programming languages (as usual).
(no subject)
Date: 2013-02-26 07:52 am (UTC)(no subject)
Date: 2013-02-26 10:54 am (UTC)I think all you need to know about partial orders is that ⊥ is below everything in the subtyping relation (which is a partial order). And ⊥ is below everything because it's uninhabited; the empty set is a subset of every set.
(I guess you mean the type of expressions (statements?) whose evaluation never returns? Values should be already evaluated, and you'll never actually get a value of type ⊥.)
(no subject)
Date: 2013-02-26 07:21 pm (UTC)(no subject)
Date: 2013-02-26 11:05 am (UTC)You've mentioned possibly wanting to add some kind of refinement types to Rust, so I just want to point out: If you have refinement types, you'll probably want to use them to write matches that aren't trivially exhaustive (but are exhaustive given the refinement type), like
(ML syntax, sorry.) And you'd also want to not generate any code for the nil arm if it's given:
case (xs : nonemptylist) of cons(h,t) ⇒ e1 | nil ⇒ e2because e2 will be unreachable. (Depending on exactly how you arrange the type system, you might not even be able to generate code from e2.)
(no subject)
Date: 2013-02-26 07:24 pm (UTC)If you're curious, Niko proposed this take on datasort refinements back in August, and we haven't discussed it much since then but I personally want it or something like it to be in some Rust version greater than 1.0.
(no subject)
Date: 2013-02-27 12:37 am (UTC)Maybe more importantly, you can't use that kind of refinement to express different sublanguages of ASTs—say you're writing a compiler, and you have some AST constructors that correspond to syntactic sugar ("derived forms"). Some pass is going to get rid of all the derived forms, and you want to write the subsequent passes without bothering with the cases for the derived forms, because you know they're impossible. If your refinements can look only at the head constructor, you can rule out derived forms at the top level of the case, but the type system doesn't know that the derived forms don't occur further down the AST. You can express that something is a binop, but you can't express that it's binops all the way down.
I was going to say that full datasort refinements shouldn't be much harder, but then I remembered that then you would need intersection types. (What? not everyone has those already?)
(no subject)
Date: 2013-02-27 01:06 am (UTC)Hmm, I think you're right about that, though I need to think about it more. And that would be too bad, since what you're describing about derived forms is exactly one of the sorts of things I'd like to use refinements for in the Rust compiler. More study needed, obviously.
As far as things like even-length lists (or vectors) or red-black tree invariants, I'm not sure that expressing these is a priority, because we're trying to focus on expressing patterns that occur frequently in systems programs. Obviously, you might use a red-black tree when writing an OS, but I'm not sure how urgent it would be, for people really writing systems code, to verify its correctness through types (rather than testing). In Rust we're focusing on trying to capture patterns that already arise in C and C++ programs and make them checkable by the compiler -- while integrating more ambitious static property checking on top of Rust would be an interesting research project for somebody, I'm not sure how much of a priority it is for us. I personally want refinements to eliminate "fail, the impossible happened" cases in the compiler, which could be addressed with subsets of enums (even if it's not exactly clear right now how to carve out a system that's just powerful enough).