This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 contain all messages, not just those from before July 7th, 2015.
> From: Jim Blandy <jimb@xxxxxxxxxx>

> Yes, the EXTRACT issues aren't critical.  But the thread-related
> problems with GCPRO that I don't see how to solve are those
> created by the user's compiler rearranging code that operates
> directly on heap references.  The compiler is free to make
> copies of heap references in registers where a copying GC can't
> find them to update them.

We agree about the problem caused by C semantics and, in broad
strokes, about the compiler-taming tricks needed to solve it.  My
solution differs from yours (and JNI's) in that it doesn't require
reference objects to be explicitly heap allocated and freed.  Loosely
speaking, they are instead stack allocated.  (Yes, I suspect you are
thinking of making a tiny stack on the heap to allocate the local
mn_refs of a given call, but keep reading.)

In the GCPRO system I'm proposing, all parameters are passed by
reference (by a `scheme_value *'); all return values are returned
through output parameters (again a `scheme_value *'); and local
assignment is done with a macro that can impose a write barrier rather
than with C's assignment operator.  In those regards, it very much
resembles the mn_refs idea.  (Though changing the value of an mn_ref
seems something one is less likely to do in your system.)

One difference in our proposals concerns the lifetimes of variables.
Local mn_refs seem to live until some outermost call returns, unless
code explicitly creates and then destroys a new mn_call.  My approach
controls variable lifetimes with GCPRO-style calls, giving them
lifetimes that coincide with C stack frame lifetimes.

Note that I haven't made any proposal about what you call `global
mn_refs'.  I am planning on having an interface for allocating arrays
of GC roots to which C data structures can refer.

A simple example:

    /* scm_cons1 (result, arena, a)
     *
     * Return a new pair of the form ( *a . () )
     *
     * Equivalent to (lambda (a) (cons a '()))
     */
    void
    scm_cons1 (scheme_value * result,
               scheme_instance arena,
               scheme_value * a)
    {
      struct cons1_locals
      {
        SCM_FRAME;
        scheme_value nil;
      } l;

      SCM_PROTECT_FRAME (l);

      SCHEME_MAKE_NIL (&l.nil, arena);
      SCHEME_CONS (result, arena, a, &l.nil);

      SCM_UNPROTECT_FRAME (l);
    }

The parameter, `*a', is protected by the caller.  `l.nil' is protected
because SCM_PROTECT_FRAME has made it visible to GC and because its
address, not its value, is passed to the primitives `scm_make_nil' and
`scm_cons'.  The value stored in `l.nil' is protected by `scm_cons1'
before `scm_make_nil' returns.  The value stored in `*result' is
protected by the caller of `scm_cons1' before `scm_cons' returns to
`scm_cons1'.

(If interprocedural optimization is allowed to screw this up, I'd like
to know exactly how and why....)

> The general view is like this: the GCPRO'd variables are inescapably a
> data structure that is shared between the mutator thread that owns the
> stack frame and some other collecting thread out there.  But there's
> no opportunity for the API implementation to do the needed
> synchronization.

Yes there is.  The only way the GCPROtected variables are ever
modified is in the primitives provided by the FFI.  This includes
assignment between two locals:

    SCM_LSET (&l.a, arena, &l.b);   /* l.a = l.b */

and, as tb pointed out, other C operators on scheme_values are also
prohibited:

    scm_is_nil (arena, &l.a)   /* rather than l.a == scm_nil */

A function using the FFI has no reason to ever land a raw
`scheme_value' in a register or compiler-created temporary variable.

> The only way I can see to save GCPRO is to forbid collection except
> when every thread is at a "safe point".  In other words, you
> reintroduce the restriction that "collection may only happen in calls
> that do allocation", by saying that *every* thread must be in a
> specially designated call.

Not at all.
For example, in `scm_cons1' above, GC can safely happen at any point
at all during its execution, from prologue to epilogue (even if, for
some strange reason, nil is newly heap allocated by scm_make_nil).

The biggest issue in choosing between the two approaches, as far as I
know, is the question of efficiency.  The approach above has a few
advantages in that regard, I think:

a) Assuming that you plan to build little stacks on the heap to
   allocate the local mn_refs for a given call, the allocation
   overheads are probably close to a wash.  I might get some advantage
   in allocation times by not having to do a separate overflow check
   and by getting the space for them when the C stack frame is
   allocated.  I get some advantage by allocating a bunch of variables
   at once with GCPRO.  I get some advantage by not having to
   separately stack allocate room for `mn_ref *' values.  I get some
   disadvantage (speed-wise, not precision-wise) from the greater
   number of GCUNPRO calls.

b) In a single-threaded environment, I can inline some primitives and
   (at least my hunch is) get much better code.  For example, SCM_LSET
   can come out to just an ordinary C assignment (`='); SCHEME_IS_NIL
   can come out to an == check on a local variable that may very well
   be in a handy register.

c) You may have a good answer for this, but I don't see it in your
   post to the list.  Don't local mn_refs leak like a sieve?  For
   example, `mn_cdr' returns a new `mn_ref *', right?  It's not freed
   until some outermost call, the one associated with the `mn_call *'
   I got, returns.  So now what if I'm traversing a long list with K
   elements?  Won't that allocate K local mn_refs which aren't freed
   until I return?  Won't they, until then, be protecting the values
   they refer to?

-t
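
Point (c) can be made concrete with a toy model of the frame-scoped
side of the comparison.  This is only a sketch under stated
assumptions: every name in it (toy_value, toy_frame,
toy_protect_frame, toy_cdr, toy_length and so on) is hypothetical and
is not part of the proposed FFI or of any real collector.  It models
nothing but the bookkeeping: a traversal that overwrites one protected
cursor through by-reference primitives should keep the number of
protected slots constant, whatever the length of the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names belong to the real FFI. */
struct toy_pair;
typedef struct toy_pair *toy_value;        /* NULL stands in for '() */
struct toy_pair { toy_value car, cdr; };

/* One record per C stack frame that holds protected locals. */
struct toy_frame {
    struct toy_frame *prev;
    toy_value *slots;
    size_t n;
};

/* Shadow stack that a collector would walk to find (and update) roots. */
static struct toy_frame *toy_frames = NULL;

static void toy_protect_frame(struct toy_frame *f, toy_value *slots, size_t n)
{
    for (size_t i = 0; i < n; i++)
        slots[i] = NULL;                   /* slots must be valid before GC sees them */
    f->prev = toy_frames;
    f->slots = slots;
    f->n = n;
    toy_frames = f;
}

static void toy_unprotect_frame(struct toy_frame *f)
{
    assert(toy_frames == f);               /* frames are released in LIFO order */
    toy_frames = f->prev;
}

/* Total number of slots currently visible to the (imaginary) collector. */
size_t toy_protected_slot_count(void)
{
    size_t total = 0;
    for (struct toy_frame *f = toy_frames; f != NULL; f = f->prev)
        total += f->n;
    return total;
}

/* Primitives take and return values by reference only. */
static void toy_lset(toy_value *dst, toy_value *src) { *dst = *src; }
static void toy_cdr(toy_value *result, toy_value *p) { *result = (*p)->cdr; }
static int  toy_is_nil(toy_value *v)                 { return *v == NULL; }

/* Length of a proper list; *peak receives the largest slot count observed. */
size_t toy_length(toy_value *list, size_t *peak)
{
    struct toy_frame f;
    toy_value cursor[1];
    size_t len = 0;

    toy_protect_frame(&f, cursor, 1);
    toy_lset(&cursor[0], list);
    *peak = toy_protected_slot_count();

    while (!toy_is_nil(&cursor[0])) {
        toy_cdr(&cursor[0], &cursor[0]);   /* overwrite the cursor in place */
        len++;
        if (toy_protected_slot_count() > *peak)
            *peak = toy_protected_slot_count();
    }

    toy_unprotect_frame(&f);
    return len;
}
```

Called on a list of K pairs, toy_length reports K while the peak stays
at the single slot of its own frame, because toy_cdr writes its result
back into the slot that already holds the cursor instead of handing
out a fresh reference.  The argument list itself is assumed to be
protected by the caller, as in the message above.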