Re: Couple things...



On Tue, 23 Dec 2003 10:22:12 +0100, Michael Sperber <sperber@xxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:

"felix" == felix  <felix@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> writes:

felix> It's absolutely unnecessary to specify which C-level forms are macros,
felix> or which are functions. Leave that to the implementors, and allow all
felix> the forms to be macros instead.

That certainly was our intention.  Do we overspecify anything by
saying "Note that most functionality on the C side is implemented by
macros."?

Well, it's _all_ functionality, so I don't see a reason for being
vague.

felix> Defining bindings from C is allowed, and the SRFI-document
felix> specifically points out the C init-code may run before Scheme
felix> init-code. Yet, SCHEME_DEFINE_EXPORTED_BINDING may GC, even
felix> before Scheme init-code has run?  Weird.

Sure.  Could you specify where this is causing problems for you?

I'm probably misunderstanding this, but do you mean that "Scheme init-code"
includes setting up the garbage collector? If it does, then no GC
can run until it has been initialized, right? Specifically, is
the Scheme runtime-system already set up before the bindings are defined
from C? Or does the binding-registration from C init the Scheme world
for me?


felix> I find it a bit tricky to exactly specify what may GC and what
felix> not.

Yeah, me too. :-)

Exactly, in fact it's getting so tricky, that it's likely that not
all cases can be covered.


felix> For example: mutations (a la "SCHEME_RECORD_SET") may very well
felix> allocate storage (if the write-barrier involves allocating something
felix> on the heap, that describes the mutated slot). The life-time of data
felix> on the heap may be extremely short - what happens if GC or finalizers
felix> run in a different OS-level thread? The authors would do good by not
felix> assuming every Scheme implementation does it like S48 or PLT.

It certainly wasn't our intention---we looked at a lot more Scheme
implementations than just those two.  However, even writing up the
current draft was difficult enough.  I'm happy to hear suggestions on
how to improve it.

And I'm not trying to be rude here, I want to add. I appreciate already that
someone has the courage to think about a (somewhat) portable FFI,
even considering that I think the current draft isn't any good. ;-)
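
To make the point about mutation concrete, here is a minimal sketch of a
write barrier that may allocate. It is not taken from the SRFI draft or
from any particular Scheme: `object', `record_set' and the remembered-set
layout are hypothetical names chosen for illustration. The store itself is
trivial, but recording the mutated slot can grow a table on the heap.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical heap object with a fixed number of slots. */
typedef struct object { struct object *slots[4]; } object;

/* Hypothetical remembered set: addresses of slots that were mutated. */
static object ***remembered = NULL;
static size_t remembered_len = 0, remembered_cap = 0;
static size_t barrier_allocations = 0;   /* how often the barrier allocated */

/* Store `value' into rec->slots[index], recording the slot first.
   Returns 0 if the remembered set could not be grown. */
int record_set(object *rec, size_t index, object *value)
{
    assert(index < 4);
    if (remembered_len == remembered_cap) {
        size_t cap = remembered_cap ? remembered_cap * 2 : 2;
        object ***grown = realloc(remembered, cap * sizeof *grown);
        if (grown == NULL)
            return 0;
        remembered = grown;
        remembered_cap = cap;
        barrier_allocations++;           /* the mutation allocated storage */
    }
    remembered[remembered_len++] = &rec->slots[index];
    rec->slots[index] = value;
    return 1;
}

size_t barrier_allocation_count(void) { return barrier_allocations; }
```

In a runtime built this way, a plain slot assignment is exactly the kind
of operation that might trigger a collection.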


felix> Alternative approaches would be:

felix> 1) Selectively switch GC on/off in sections of C code (just like
felix>    critical sections, really).

Is this really practical in all conceivable environments?

It's safer at least. But not completely satisfying, I admit.
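
A minimal sketch of what alternative 1 could look like, assuming a
runtime that can defer collections. All names (`gc_disable', `gc_enable',
`gc_request') are hypothetical and not part of the draft; the collector
itself is reduced to a counter.

```c
#include <assert.h>

/* Hypothetical runtime state, for illustration only. */
static int gc_disable_depth = 0;  /* > 0 means collections are deferred */
static int gc_pending = 0;        /* a collection was requested meanwhile */
static int gc_runs = 0;           /* stands in for real collector work */

static void gc_collect(void) { gc_runs++; }

/* Enter a section of C code in which no GC may happen; sections nest. */
void gc_disable(void) { gc_disable_depth++; }

/* Leave the section; a deferred collection runs once the outermost
   section has ended. */
void gc_enable(void)
{
    assert(gc_disable_depth > 0);
    if (--gc_disable_depth == 0 && gc_pending) {
        gc_pending = 0;
        gc_collect();
    }
}

/* Called by the allocator whenever it would like to collect. */
void gc_request(void)
{
    if (gc_disable_depth > 0)
        gc_pending = 1;           /* defer until gc_enable() */
    else
        gc_collect();
}

int gc_run_count(void) { return gc_runs; }
```

The nesting counter is what makes this behave like a critical section:
C code can call helpers that bracket their own work without re-enabling
GC too early.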


felix> 2) Allocate *once* a complete chunk that will be able to hold all
felix>    data needed subsequently without triggering a GC.

That certainly seems impractical to me if you need to limit space usage.

Yes, it's not overly convenient, especially if one takes into account
that different objects may be allocated in different memory
arenas. But the problem is that we have to constrain GC, if all
possible scenarios have to be taken care of.
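
Alternative 2 can be sketched as a simple bump allocator over one
preallocated chunk. The names (`chunk_t', `chunk_reserve', `chunk_alloc')
are hypothetical, malloc() stands in for the Scheme heap, and the
16-byte rounding is an assumption about what alignment is sufficient.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical preallocated chunk. */
typedef struct {
    unsigned char *base;
    size_t size;
    size_t used;
} chunk_t;

/* The single point where storage is obtained (and where a GC could run
   in a real system). Returns 0 on failure. */
int chunk_reserve(chunk_t *c, size_t total_bytes)
{
    assert(c != NULL);
    c->base = malloc(total_bytes);
    c->size = c->base ? total_bytes : 0;
    c->used = 0;
    return c->base != NULL;
}

/* Carve nbytes out of the chunk. Never calls the allocator, so it can
   never trigger a collection; returns NULL when the chunk is exhausted. */
void *chunk_alloc(chunk_t *c, size_t nbytes)
{
    size_t rounded = (nbytes + 15u) & ~(size_t)15u;  /* assumed alignment */
    void *p;
    if (rounded < nbytes || rounded > c->size - c->used)
        return NULL;
    p = c->base + c->used;
    c->used += rounded;
    return p;
}

void chunk_release(chunk_t *c)
{
    free(c->base);
    c->base = NULL;
    c->size = c->used = 0;
}
```

The space concern raised above shows up directly: the caller has to know
the total size up front, and whatever is not used is wasted until the
chunk is released.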

felix> SCHEME_CALL: "For example, suppose Scheme procedure s0 captures
felix> continuation a and then calls C procedure c0, which in turn calls
felix> Scheme procedure s1. Procedure s1 can safely call the continuation a,
felix> because that is a downward use. When a is called Scheme will remove
felix> the portion of the C stack used by the call to c0."

felix> How do you know that? Why do you specify this? Does this mean a is a
felix> special kind of continuation, one that uses longjmp()? What if
felix> continuations are explicit (in a CPS manner)?

Those aren't continuations in the sense of the SRFI.  (How do you
tell from looking at one?)

Hm. I have to think more about this... I just feel uncomfortable
with the fact that `a' has to perform some explicit action that might
interact badly with the underlying execution model.
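
For reference, this is roughly what the quoted scenario looks like if
continuations are implemented with setjmp()/longjmp(), which is only one
possible execution model and an assumption here. The functions s0, c0 and
s1 are plain C stand-ins for the procedures named in the draft.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf continuation_a;          /* stands in for continuation a */
static volatile int passed_value = 0;   /* value handed to a */
static int c0_returned_normally = 0;

/* "Scheme" procedure s1: invokes a, a downward use. */
static void s1(void)
{
    passed_value = 42;
    longjmp(continuation_a, 1);         /* discards the frames of s1 and c0 */
}

/* C procedure c0: calls back into "Scheme". */
static void c0(void)
{
    s1();
    c0_returned_normally = 1;           /* never reached */
}

/* "Scheme" procedure s0: captures a, then calls c0. */
int s0(void)
{
    if (setjmp(continuation_a) == 0) {
        c0();
        return -1;                      /* only if a was never invoked */
    }
    return passed_value;                /* control arrives here through a */
}

int c0_completed(void) { return c0_returned_normally; }
```

The explicit action in question is the longjmp(): c0 never gets a chance
to run any cleanup code after its call to s1.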

It *is* the stated intention of this SRFI to be
Scheme-implementation-agnostic.  However, of course our take on the
matter is limited by what we know.  So I suggest that, whenever you
say that we're being overly Scheme-48-specific, you make a concrete
suggestion on how to be more general.  We've put significant thought
into most of the issues you mention, so any lack of generality above
that probably reflects more a limit of our abilities than a limit of
our willingness to improve things.

felix> Why have countless macros that access and create Scheme data? Some
felix> basic forms for defining code callable from Scheme (and vice versa)
felix> would be more than enough, together with a simple system of specifying
felix> Scheme->C->Scheme type mappings. This would also remove the GC-related
felix> problems (mostly).

I'll be glad to see a concrete writeup of this idea. :-)


My idea is simpler, safer, more portable and does away with most GC
and character-representation issues (as far as I'm able to understand
Unicode-related problems): define blocks of code externally, with a
properly defined set of arguments/results and their types. The foreign
bindings could be defined in a "binding"-language, perhaps even by taking
ideas from SRFI-7:

;;; foo-n-bar.spec

(define-foreign (foo (x int) (y (string ascii)))
 (callback)
 (values (r1 float32) (r2 (vector int32)))
 (language c)
 (file "foo.c") )

(define-foreign (bar) (values (x (char latin-1)))
 (language c++)
 (code "x = yomama.foobar();") )

This (perhaps processed with an external tool that creates the proper
makefiles) would generate something that can be linked statically or
dynamically with a Scheme system:

(define b (load-foreign-bindings "foo-n-bar"))

(define foo (foreign-binding-ref b "foo"))
(define bar (foreign-binding-ref b "bar"))

Or with some syntactic sugar:

(define-foreign-bindings "foo-n-bar"
 (foo "foo")
 (my-bar "bar") )

(define-values (x y) (foo 99 "X"))   ; y is declared as (string ascii)
(write (my-bar))                     ; bar is declared without arguments

Inside the foreign code, no GC may occur, unless the `callback' clause
is given (how exactly the callback takes place is another issue).
Arguments and return values can be transformed to one (of several)
representations that the foreign language can handle. Several target
languages could be supported (Java, C, C++, Objective-C). Multiple
return values would be handled with named output parameters.

I admit that the representation-transformation may incur performance
costs, but in exchange we get something much safer and easier to use.
We can reduce the number of needed accessors if the arguments are
available as natively known types. For handling Scheme data directly
(which in my experience is not needed very often), we can pass the
arguments as C unions (or something that the target language
understands easily).

Arguments may or may not be mutable (this could be specified with
additional type-qualifiers). Arguments may or may not share data with
Scheme values; this could either be optionally specified, or the
implementation may choose to pass the data unconverted if it matches the
argument type for the foreign code. There is a lot of room for blowing
up the binding language and its supported types, but at first it could
just support basic things, with future SRFIs extending the language to
support more facilities.
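
As a rough illustration of the foreign side, this is what the body for
`foo' might look like in foo.c under the proposal. Nothing here is a
defined interface: `foo_body', `int32_vector' and the calling convention
are hypothetical, and the glue code that converts Scheme values to and
from these C types is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C view of a (vector int32) result: the glue code supplies
   the buffer and its capacity, the body reports how many slots it filled. */
typedef struct {
    int32_t *data;
    size_t capacity;
    size_t length;
} int32_vector;

/* Hypothetical body for `foo'. It sees native C types only; the two
   return values r1 and r2 are named output parameters. */
void foo_body(int x, const char *y, float *r1, int32_vector *r2)
{
    size_t i;
    assert(y != NULL && r1 != NULL && r2 != NULL);
    *r1 = (float)x * 0.5f;
    /* Copy the character codes of y into the result vector. */
    for (i = 0; y[i] != '\0' && i < r2->capacity; i++)
        r2->data[i] = (int32_t)(unsigned char)y[i];
    r2->length = i;
}
```

No accessor macros and no Scheme object representation appear in the
body, so nothing in it can allocate on the Scheme heap or trigger a GC.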


cheers,
felix