tim | TMI: Fun with boxed classes

I've been writing posts but not posting them for the past couple days, so now I'm catching up. This one is from Tuesday, May 8.

Tuesday featured a marathon meeting session in the morning, after I continued working on heap-allocated classes. I started out being kind of puzzled about how to annotate classes in the AST to indicate whether they're allocated on the stack or the heap, because they already have region parameter annotations.

<spoiler> On IRC two days after I first wrote this, we decided to defer implementing heap-allocated classes, and as per suggestions from Niko and Patrick, I filed an RFC to allow static class methods, which could accomplish the same thing as @class without needing special syntax. But for posterity, I'll write about my thought process anyway.</spoiler>

I also got puzzled about whether @class cat { ... defines a different type than class cat { ... does. (In the end, that was one of the reasons why we decided to propose static methods instead of implementing classes as stated in the spec.) I did end up adding an annotation to an item_class to say whether the class is on the heap or the stack, as well as a different annotation to a ty_class to say whether the type represents a heap class or a stack class. But the whole thing smelled kind of bad to me.
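For the curious, here's roughly what the static-method alternative looks like. This is only a sketch in present-day Rust syntax (not the 2012 syntax), and the Cat fields and method names are made up for illustration: there's one ordinary type, and the stack-or-heap choice comes down to which static constructor you call.

```rust
use std::rc::Rc;

// Hypothetical example class; the fields are invented for illustration.
#[derive(Debug, PartialEq)]
struct Cat {
    name: String,
    meows: u32,
}

impl Cat {
    // Static method returning a plain value that lives in the caller's
    // storage (for example, on the stack).
    fn new(name: &str) -> Cat {
        Cat { name: name.to_string(), meows: 0 }
    }

    // Static method returning a reference-counted heap box around the
    // very same type.
    fn new_boxed(name: &str) -> Rc<Cat> {
        Rc::new(Cat::new(name))
    }
}
```

Either way there's only one class type, Cat; the boxed form is just a reference-counted pointer wrapped around it, so the question of whether the two declarations define different types doesn't come up.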

Separately, I started outlining a rustc hacking guide, which is still minimal. The inspiration is the GHC Commentary.

Once I figured out how to add a keyword (the first addition to the hacking guide), I was able to get the code for @class compiling. But then I realized that @class shouldn't and can't be a keyword, since the lexer splits off the @. (What we discussed in the meeting today, making keywords tokens, would solve that.) So, it won't be a keyword; there's now just one funny-looking case in parser::parse_item.

Turning the crank, and moving on to trans (the code that actually generates LLVM code from Rust code), I got a dreaded LLVM assertion failure:

"Assertion failed: (Ty && "Invalid GetElementPtrInst indices for type!"), function checkGEPType, file /Users/TimChevalier/rust/src/llvm/include/llvm/Instructions.h, line 703."

(I'm reminded of my dream of ways of checking within rustc's middle end that we're generating well-typed LLVM code, so that we can get more informative error messages than this when there's an internal compiler error -- but for now, that's just a dream.)

This doesn't tell me a lot except that my changes to trans were generating bad code. Actually, it tells me a little more, in that what went wrong was that I was calling GEP on something of the wrong type. (GEP, short for "getelementptr", is the LLVM instruction that computes the address of a field or element inside an aggregate; it does the pointer arithmetic but doesn't itself load anything. The charmingly titled "The Often Misunderstood GEP Instruction" explains more.) It's still not that helpful, since I don't know exactly what type it was being called on. So I have to fire up gdb and get a backtrace, so I can see exactly who called checkGEPType in a way that caused it to go wrong. (The Rust RTS should have the innate ability to print out a backtrace on failure, but it's not working right now for whatever reason.) And for whatever mysterious reason, I need to type "return" a bunch of times to get a useful backtrace (maybe something to do with the stack growth stuff). (All of this has the interesting side effect of causing gdb to hang on exit.)
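To make the GEP part a bit more concrete, here's a rough analogue in present-day Rust (not LLVM IR; the Pair struct and the function names are invented for illustration). The address computation and the load are separate steps: the first function is the GEP-like part, and only the second one touches memory.

```rust
// Hypothetical struct; repr(C) keeps the fields in declaration order.
#[repr(C)]
struct Pair {
    first: u32,
    second: u64,
}

// Roughly what a GEP does: start from the struct's address and compute
// the address of one field. Nothing is read from memory here.
fn second_field_ptr(base: &Pair) -> *const u64 {
    std::ptr::addr_of!(base.second)
}

// The load is a separate step, like a separate LLVM load instruction.
fn load_second(base: &Pair) -> u64 {
    let field = second_field_ptr(base);
    // Safe to read: `field` points into `base`, which is borrowed and alive.
    unsafe { *field }
}
```

In Rust, asking for a field that the type doesn't have is a compile-time error; when you're building LLVM instructions by hand, the same kind of mistake only surfaces as the assertion above.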

In this case, though, the problem was pretty obvious: middle::trans::base::trans_class_ctor was doing the wrong thing when allocating memory for the self in a boxed class. I'd figured for a while that I would probably have to allocate memory at the call site rather than inside the constructor code itself. And now is the time. It's just easier that way because, for boxed classes, the caller can now take care of the allocation and the reference-counting operations that go with it. (Rust has the interesting property that it manages most of its memory by reference-counting, relying on a cycle collector to, well, collect cycles.) So if trans_class_ctor gets the memory passed in from its caller, it doesn't even have to know whether the ctor is for an @class or a stack-allocated class.
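Here's a minimal sketch of that caller-allocates arrangement, written in present-day Rust rather than as the actual trans code. All of the names are hypothetical, and a Default value stands in for freshly allocated, not-yet-initialized memory: the constructor body only fills in the slot it's handed, and each call site decides where that slot lives.

```rust
use std::rc::Rc;

// Hypothetical class. Default stands in for "allocated but not yet
// initialized" memory, which is an assumption of this sketch.
#[derive(Default, Debug, PartialEq)]
struct Cat {
    name: String,
    meows: u32,
}

// Stand-in for the constructor body: it initializes `this` and never
// allocates, so it can't tell a stack slot from a heap box.
fn cat_ctor(this: &mut Cat, name: &str) {
    this.name = name.to_string();
    this.meows = 0;
}

// Call site for a stack-allocated class: the slot is in the caller's frame.
fn make_stack_cat(name: &str) -> Cat {
    let mut slot = Cat::default();
    cat_ctor(&mut slot, name);
    slot
}

// Call site for a boxed class: the caller allocates the reference-counted
// box, then runs the very same constructor on its contents.
fn make_boxed_cat(name: &str) -> Rc<Cat> {
    let mut slot = Rc::new(Cat::default());
    // A freshly created Rc has exactly one owner, so get_mut succeeds.
    cat_ctor(Rc::get_mut(&mut slot).expect("fresh Rc is uniquely owned"), name);
    slot
}
```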

Then I got an inexhaustive-case failure in base::copy_val_no_check, because I hadn't updated ty::type_is_boxed to say that @classes are boxed. Well, they are. Now I'm getting another LLVM assertion failure:

"Assertion failed: (getOperand(0)->getType() == getOperand(1)->getType() && "Both operands to ICmp instruction are not of the same type!"), function ICmpInst, file /Users/TimChevalier/rust/src/llvm/include/llvm/Instructions.h, line 958."

Another backtrace... And debugging-by-printf, because the logic for class ctors isn't working. A while ago I added a fairly ugly hack to trans that fills in the return type for a class constructor, which involves passing around the set of all class constructor IDs. (The return type isn't written explicitly: within a class, a constructor is written like new(arg1 ... argn) { ... }, since the return type is always the enclosing class type.) After I moved some of the code for translating ctors to the call sites, the class constructor set wasn't getting extended properly.
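A toy version of that hack, to show why a missing entry in the set matters. This is a sketch in present-day Rust with invented names (NodeId, Ty, ret_ty), not the real trans data structures: the return type of an item is whatever was declared, unless its ID is in the constructor set, in which case it's the enclosing class type.

```rust
use std::collections::HashSet;

// Hypothetical stand-ins for compiler-internal types.
type NodeId = u32;

#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Nil,
    Class(String),
}

// If `id` is a known class constructor, its return type is the enclosing
// class; otherwise fall back to the declared return type.
fn ret_ty(ctors: &HashSet<NodeId>, id: NodeId, declared: Ty, enclosing_class: &Ty) -> Ty {
    if ctors.contains(&id) {
        enclosing_class.clone()
    } else {
        declared
    }
}
```

If the constructor's ID never makes it into the set, the lookup quietly falls back to the declared type, and later code ends up working with a type it didn't expect.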

The rest was just figuring out how to allocate memory for self.
