
Comparing Pika-style and JNI-style

This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 contain all messages, not just those from before July 7th, 2015.



Jim Blandy <jimb@xxxxxxxxxx> writes:
> Well, if SRFI-50 turned out not to be what I was hoping, and I didn't
> come to my senses quickly enough, I was going to turn <minor/minor.h>,
> into a .texi file, start a SRFI from that, and see what people said.

In light of that, I'm curious to know how people generally feel about
the Pika vs. JNI issue.  If there's a near consensus on one or the
other, then that could save a lot of trouble.


Here are links to the specs:
Pika:  http://arch.quackerhead.com/~lord/releases/pika
       http://regexps.srparish.net/src/pika/
Minor: http://svn.red-bean.com/repos/minor/trunk/include/minor/minor.h


Here's how I see it:

Commonalities:
- Both work by having C code manipulate only references to Scheme
  values, not Scheme values themselves.
- Both impose few restrictions on the representation of Scheme objects.
- Both allow GC to occur at any time.
- Both can be implemented in a way that interacts nicely with threads.


In Pika:
- Leaks are impossible, since references are stack-allocated.
- References are freed upon exit from the lexical block that owns
  them --- finer-grained than JNI-style.
- Probably less overhead than JNI-style.

But:
- Forgetting an UNGCPRO corrupts the GC's data structures, and may
  fail only intermittently.  Irregular exits (return; goto; break;
  continue) require attention.  longjmp is even harder.
- Functions may only return Scheme values by reference; they may not
  provide them as their (syntactic) return values.  Instead of writing
  "f (g (x))", you must write:

    g (&frame.x, &frame.temp);
    f (&frame.temp, &frame.temp2);

  In other words, you must write your code as a linear series of
  operations which work by side-effects.
- The API functions all expect pointers to t_scm_word values, which
  discourages people from passing the values around directly.  It can
  still be done --- e.g. "frame.x = frame.y;" --- and will usually
  work, but doing so is a bug.
- Variable declarations are cluttered with enclosing structs and GCPRO
  / UNGCPRO calls.
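
To make the points above concrete, here is a toy model of the
stack-allocated style.  It is only a sketch: every pk_ / PK_ name is
hypothetical and made up for illustration, not taken from Pika, and
"values" are plain longs rather than Scheme objects.  It shows a frame
registered on a root chain that lives on the C stack, operations that
deliver results by reference, and the UNGCPRO that every exit path has
to reach.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every name below is hypothetical, not Pika's API. */
typedef long pk_word;                 /* stand-in for a Scheme value */

struct pk_link {
  struct pk_link *prev;               /* next-outer protected frame */
  pk_word *slots;                     /* words a collector would scan */
  size_t n_slots;
};

static struct pk_link *pk_roots = NULL;   /* chain threaded through the C stack */

#define PK_GCPRO(link, frame)                                   \
  do {                                                          \
    (link).prev = pk_roots;                                     \
    (link).slots = (pk_word *) &(frame);                        \
    (link).n_slots = sizeof (frame) / sizeof (pk_word);         \
    pk_roots = &(link);                                         \
  } while (0)

#define PK_UNGCPRO(link) do { pk_roots = (link).prev; } while (0)

/* Number of frames currently registered with the toy collector. */
static size_t
pk_root_depth (void)
{
  size_t n = 0;
  for (struct pk_link *l = pk_roots; l != NULL; l = l->prev)
    n++;
  return n;
}

/* "API" functions take input and output by reference, never by value. */
static void pk_add1 (const pk_word *in, pk_word *out) { *out = *in + 1; }
static void pk_twice (const pk_word *in, pk_word *out) { *out = *in * 2; }

/* twice (add1 (x)), written as a linear series of side-effecting steps. */
static void
pk_twice_add1 (const pk_word *x, pk_word *result)
{
  struct { pk_word temp; } frame = { 0 };
  struct pk_link link;

  PK_GCPRO (link, frame);
  pk_add1 (x, &frame.temp);
  pk_twice (&frame.temp, result);
  PK_UNGCPRO (link);                  /* every exit path must reach this */
}
```

Calling pk_twice_add1 (&x, &r) with x set to 20 stores 42 in r and
leaves the root chain empty again.  If an early "return" were added
between PK_GCPRO and PK_UNGCPRO, pk_roots would be left pointing into a
dead stack frame, which is the intermittent corruption described above.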


In JNI-style:
- Functions can return references directly, so code need not be
  linearized.  You can write "f (call, g (call, x))" --- provided you
  know that "call" will return soon enough, since g's return value is
  not freed until then.
- Local references are freed automatically when the Scheme->C call to
  which they belong returns.  Leaks due to unfreed local references
  (which will probably be the most common sort of error) have a
  bounded and often (though not always) short lifetime.
- No GC data structures live on the C stack, so careless control flow
  and longjmps will not corrupt the GC's data structures.
- The "explicit free" model is familiar to C programmers.
- Variables are declared normally, and their values used directly.
- Since mn_ref is an incomplete type, it can't be dereferenced, so 
  people can't be sloppy and operate on the heap values directly.

But:
- The "explicit free" model is still error-prone.  The fact that leaks
  are bounded by their owning call's lifetime may not always help.
- Probably more overhead than Pika-style.
- Code will be cluttered with explicit-free crap.
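
The same computation in a toy model of the JNI style might look like
the sketch below.  Again, every jn_ / JN_ name is hypothetical and not
taken from <minor/minor.h>; the reference struct is visible here only
to keep the example in one file, whereas the real design keeps the type
incomplete.  The sketch shows references returned directly, a temporary
that nobody frees explicitly, and the call's return reclaiming whatever
is left.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every name below is hypothetical, not Minor's API. */
enum { JN_MAX_LOCALS = 16 };

typedef struct jn_ref { long value; int live; } jn_ref;

/* One Scheme->C call; it owns every local reference created under it. */
typedef struct jn_call { jn_ref locals[JN_MAX_LOCALS]; } jn_call;

static void
jn_call_begin (jn_call *c)
{
  for (size_t i = 0; i < JN_MAX_LOCALS; i++)
    c->locals[i].live = 0;
}

/* Returning from the call frees whatever local references remain. */
static void
jn_call_end (jn_call *c)
{
  for (size_t i = 0; i < JN_MAX_LOCALS; i++)
    c->locals[i].live = 0;
}

static jn_ref *
jn_new_local (jn_call *c, long value)
{
  for (size_t i = 0; i < JN_MAX_LOCALS; i++)
    if (!c->locals[i].live)
      {
        c->locals[i].value = value;
        c->locals[i].live = 1;
        return &c->locals[i];
      }
  return NULL;                        /* table full in this toy model */
}

/* Explicit free: releases one local reference before the call ends. */
static void
jn_free_local (jn_call *c, jn_ref *r)
{
  (void) c;
  r->live = 0;
}

static size_t
jn_live_locals (const jn_call *c)
{
  size_t n = 0;
  for (size_t i = 0; i < JN_MAX_LOCALS; i++)
    n += c->locals[i].live ? 1 : 0;
  return n;
}

/* Operations return a fresh local reference as their C return value. */
static jn_ref *jn_add1 (jn_call *c, jn_ref *x)
{ return jn_new_local (c, x->value + 1); }

static jn_ref *jn_twice (jn_call *c, jn_ref *x)
{ return jn_new_local (c, x->value * 2); }
```

With x referring to 20, "jn_twice (&c, jn_add1 (&c, x))" yields a
reference to 42 and leaves three live locals: x, the unnamed temporary
from jn_add1, and the result.  The temporary is the bounded leak
discussed above; it goes away only when jn_call_end runs.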


Is this fair?  What have I missed?  What do people think?

It would be nice to see sample code in each style.  C implementations
of "cadr" and "assq" would be nice.  As far as I know, error checking
is similar under both interfaces, so that can be left out.


    mn_ref *
    cadr (mn_call *c, mn_ref *obj)
    {
      return mn_car (c, mn_cdr (c, obj));
    }


    mn_ref *
    assq (mn_call *c, mn_ref *key, mn_ref *alist)
    {
      while (mn_pair_p (c, alist))
        {
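          mn_ref *pair = mn_car (c, alist);
          mn_ref *pair_key = mn_car (c, pair);
          int found = mn_ref_eq (c, key, pair_key);

          /* pair_key is not needed past the comparison; free it on both
             paths so that a successful lookup does not leak it.  */
          mn_free_local_ref (c, pair_key);
          if (found)
            return pair;

          mn_free_local_ref (c, pair);
          alist = mn_to_cdr (c, alist);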
        }

      return mn_false (c);
    }