
Re: More JNI vs. Pika comparison



Tom Lord <lord@xxxxxxx> writes:
>     > It seems to me similar problems will occur working with any
>     > third-party tool that presumes it is sufficient to let people pass
>     > around pointers to data of their own definition.
> 
>     > So, in the end, it looks to me as if Pika will need to provide a
>     > JNI-style interface anyway, in addition to the C compound-statement-
>     > bound interface, which would still be the preferred interface for C
>     > code written against Pika interfaces.
> 
> I think that has to be read as "JNI-style" in only the broadest sense
> of the term -- a need for an interface to create locations whose
> lifetime is explicitly managed.  Narrower "JNI-style" features that
> are _not_ necessary include:
> 
> ~ reference counting for locations
> ~ "linear" functions
> ~ attachment of locations to a "call" structure whose lifetime
>   trumps the reference count of attached locations

Where did reference counting come from?  I don't think I've ever
mentioned it.  The Minor interface doesn't include any, nor does the
actual JNI, as far as I know.  Is there some use case where simply
duplicating references won't work just as well?

The linear functions are just an attempt to make the "explicit free"
discipline less troublesome.  Distinguishing local and global
references, and associating the former with calls, serves the same
purpose.  I agree that neither makes it non-troublesome.

I see the idea of explicitly freed references with dynamic lifetimes
as the essential idea in the JNI model.

> What about your parser example?  You exhibit code like this:
> 
>   /* The type of Bison semantic values.  */
>   #define YYSTYPE mn_ref *
> 
>   [....]
> 
> 
>   list: '(' list_data ')' { $$ = $2; } ;
> 
>   list_data:
>       datum list_data { $$ = mn_to_cons (c, $1, $2); }
>     | datum '.' datum { $$ = mn_to_cons (c, $1, $3); }
>     |                 { $$ = mn_null (c); }
>     ;
> 
> 
> It's worth noting first that that's pretty fragile code in two ways:
> 
> First, actions such as the one in:
> 
>       datum list_data { $$ = mn_to_cons (c, $1, $2); }
> 
> are destructive of $1.   A simple modification to:
> 
>       datum list_data { 
>                         $$ = mn_to_cons (c, $1, $2); 
>                         log_obj_added_to_list (c, $1);
>                       }
> 
> with the intention of logging the list element, not the new list spine
> pair, is incorrect.

Right; the object is freed too early.  But all explicit-free models
have this problem.  Your interface will, too, won't it?  (You've
provided example code for this below, so that's probably the better
place to answer the question.)
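
To make the hazard concrete, here is a minimal sketch of a consuming
("linear") cons.  Everything in it is an assumption made for
illustration: toy_obj, toy_ref, toy_int, toy_free_ref, toy_to_cons and
toy_is_live are hypothetical names, not Minor's or Pika's actual API.
Freed references are only marked dead, and objects are never
reclaimed, so that the liveness flag can be inspected safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical heap object; a real collector would own these.
   The toy never reclaims them. */
typedef struct toy_obj {
    int value;
    struct toy_obj *car, *cdr;
} toy_obj;

/* Hypothetical explicitly freed reference to an object. */
typedef struct toy_ref {
    bool live;      /* false once the reference has been freed */
    toy_obj *obj;
} toy_ref;

toy_ref *toy_make_ref(toy_obj *obj)
{
    toy_ref *r = malloc(sizeof *r);
    if (r == NULL)
        abort();
    r->live = true;
    r->obj = obj;
    return r;
}

toy_ref *toy_int(int value)
{
    toy_obj *o = calloc(1, sizeof *o);
    if (o == NULL)
        abort();
    o->value = value;
    return toy_make_ref(o);
}

/* "Free" a reference.  The toy only marks it dead, so that a later
   misuse can be detected instead of being undefined behaviour. */
void toy_free_ref(toy_ref *r)
{
    assert(r->live);
    r->live = false;
}

/* Consuming cons: builds the pair, then frees both argument
   references.  The objects they referred to stay reachable
   through the new pair. */
toy_ref *toy_to_cons(toy_ref *a, toy_ref *d)
{
    assert(a->live && d->live);
    toy_obj *pair = calloc(1, sizeof *pair);
    if (pair == NULL)
        abort();
    pair->car = a->obj;
    pair->cdr = d->obj;
    toy_free_ref(a);
    toy_free_ref(d);
    return toy_make_ref(pair);
}

bool toy_is_live(const toy_ref *r)
{
    return r->live;
}
```

After p = toy_to_cons (a, d), only p is live; passing a to a logging
function at that point is the "freed too early" bug described above.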

> Second, all intermediate values constructed in this parse but _not_
> stored in a reference that will be destructively updated (such as $2
> in the action above) are GC protected for the lifetime of the parse.
> To have a parse that protected a number of locations bounded by the
> depth of the value stack, one would need to write something like:
> 
> 
>       datum list_data { 
>                         $$ = mn_to_cons (c, $1, $2); 
>                         mn_unref (c, $2);
>                       }
> 
> In other words, on two counts at least, the enticing simplicity 
> of the exhibited code is at least a little bit misleading.

mn_to_cons is defined to free both its arguments.  Given that, there
shouldn't be any reference leak in the code as written, right?  (Not
that I expect you to rush off and check minor.h every time I post
code...)

("Linear" isn't a great term for this any more: in the sense that
"mn_to_car" is linear, "mn_to_cons" is a Y-shaped thing.  Using "to"
as my linearity marker in function names isn't great either; one would
like to use it to mark type-conversion functions.  So I've got to
revise all that.)

I certainly may have missed something, but my intention wasn't to
mislead:

- As far as I know, the problems of linearity are shared by all
  explicit-free models.

- And as far as I know, the number of references used by the posted
  code is proportional to the depth of the parse stack by the end of
  each action.

So I think you are right to be enticed by that enticing simplicity.  :)


> I suppose that the brute force Pika solution would look something
> like:
> 
>       datum list_data { 
>                         $$ = scm_allocate_location (instance);
>                         scm_cons ($$, instance, $1, $2);
>                         scm_location_unref (instance, $1);
>                         scm_location_unref (instance, $2);
>                       }
> 
> which, although four times as verbose as your original code (twice as
> verbose as the more robust form of your code), is not fragile wrt
> "linear" operations and is accurate wrt GC.

I can't refer to $1 and $2 after those *_unref calls, right?  The
purpose of the linear versions of functions like mn_cons is simply to
reduce clutter from the 'free' calls (while leaving them visible), and
to put them someplace they could be optimized.  But they're still
there; I think our code is essentially the same.
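
A rough sketch of that equivalence, under loudly stated assumptions:
toy_ref, toy_live_refs, toy_cons, toy_unref, toy_to_cons and the two
list builders are hypothetical stand-ins, not the real Minor or Pika
functions.  The linear cons is written as the plain cons followed by
the two explicit frees, and a counter tracks how many references are
live.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical heap object; never reclaimed here (the collector's job). */
typedef struct toy_obj {
    struct toy_obj *car, *cdr;
} toy_obj;

/* Hypothetical explicitly freed reference. */
typedef struct toy_ref {
    toy_obj *obj;   /* NULL stands for the empty list */
} toy_ref;

long toy_live_refs = 0;   /* references allocated and not yet freed */

toy_ref *toy_ref_new(toy_obj *obj)
{
    toy_ref *r = malloc(sizeof *r);
    if (r == NULL)
        abort();
    r->obj = obj;
    toy_live_refs++;
    return r;
}

void toy_unref(toy_ref *r)
{
    assert(r != NULL);
    free(r);
    toy_live_refs--;
}

toy_ref *toy_null(void)
{
    return toy_ref_new(NULL);
}

/* Plain cons: leaves both argument references live. */
toy_ref *toy_cons(toy_ref *a, toy_ref *d)
{
    toy_obj *pair = malloc(sizeof *pair);
    if (pair == NULL)
        abort();
    pair->car = a->obj;
    pair->cdr = d->obj;
    return toy_ref_new(pair);
}

/* Linear cons: the same operation with the two frees folded in. */
toy_ref *toy_to_cons(toy_ref *a, toy_ref *d)
{
    toy_ref *pair = toy_cons(a, d);
    toy_unref(a);
    toy_unref(d);
    return pair;
}

/* Build an n-element list with explicit frees after each cons. */
toy_ref *toy_list_explicit(int n)
{
    toy_ref *list = toy_null();
    for (int i = 0; i < n; i++) {
        toy_ref *datum = toy_null();          /* placeholder element */
        toy_ref *next = toy_cons(datum, list);
        toy_unref(datum);
        toy_unref(list);
        list = next;
    }
    return list;
}

/* Build the same list with the linear cons. */
toy_ref *toy_list_linear(int n)
{
    toy_ref *list = toy_null();
    for (int i = 0; i < n; i++)
        list = toy_to_cons(toy_null(), list);
    return list;
}
```

In this toy, both builders end with exactly one live reference (the
finished list), whatever n is; the frees are the same, only their
location in the source differs.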