Re: when GC is permitted
> From: Eric Knauel <knauel@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
> On Thu 15 Jan 2004 00:16, Tom Lord <lord@xxxxxxx> writes:
>> I sampled some of the C code in a version of SCSH that I have on hand
>> (0.5.2 -- sorry, a download for a more recent version was taking _way_
>> too long so I'll risk being embarrassed that everything has changed
>> since then).
> Actually, that changed completely in the 0.6-series, it's almost
> exactly the FFI Scheme48 is using. That's why migration of the
> existing bindings for scsh *0.6-series* and Scheme48 is easy.
Ok, then -- the motivation of the authors is now somewhat clearer to
me. I have 0.6.5 now. I'm looking at the revamped posix_regexp_match
(in regex1.c) and notice that:
~ it doesn't GC protect its parameters as required by the SRFI
(BTW: this appears to _not_ be a bug in the context of s48 because
of assumptions the code makes about what can and can't cause
collection. However, a simplistic conversion of this code to the
analogous draft-FFI functions would, indeed, have a bug in this
~ it assumes that STRING_LENGTH returns an integer (SRFI says long)
~ it uses s48_raise_range_error which the SRFI doesn't provide
~ it contains the code:
There is no _enter_fixnum in the draft and, properly,
there is no number-constructing function in the draft
which is not in the "(may GC)" category. Yet that code
is not GC-safe if s48_enter_fixnum is replaced by a
possibly GC-causing function.
~ an instance of comparing to S48_FALSE using !=, an instance
of comparing to S48_TRUE using !=, and two instances of comparing
to S48_TRUE using ==
~ general assumption that s48_extract_string is not in
the "may GC" class
Of course the draft agrees with that but I point it out here
to emphasize that the draft is fragile in this sense. If the
primary motivation is to be able to publish a few 10K LOC from
SCSH under a SRFI FFI then either the draft _can_not_ change
extract_string to "may GC" or all of that code must be reviewed
~ more use of error signalling functions not provided by the draft
~ this code which is incorrect under the current interpretation
of the draft (because it is incompatible with copy collection):
s48_cons (sch_result_cstime, S48_NULL))
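The copy-collection hazard in the last item can be illustrated with a toy model. The sketch below is not the Scheme48 or draft-FFI API: `toy_cons`, `collect`, `roots`, `stale_read`, and `protected_read` are all hypothetical names, and the collector is a deliberately hostile one that moves every live object on every allocation. It assumes, as the nested s48_cons calls appear to, that a heap value sits in a plain C variable while a later allocation runs.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_CELLS 8
#define MAX_ROOTS  8

/* Toy heap object: a payload, a link, and a forwarding pointer for the GC. */
typedef struct cell {
    long value;
    struct cell *next;
    struct cell *forward;
} cell;

static cell space_a[HEAP_CELLS], space_b[HEAP_CELLS];
static cell *active = space_a, *idle = space_b;
static int next_free = 0;

/* Addresses of C variables the collector must update when objects move. */
static cell **roots[MAX_ROOTS];
static int n_roots = 0;

/* Copy one object (and whatever it links to) into the idle semispace. */
static cell *copy(cell *c) {
    if (c == NULL) return NULL;
    if (c->forward != NULL) return c->forward;
    assert(next_free < HEAP_CELLS);
    cell *n = &idle[next_free++];
    n->value = c->value;
    n->forward = NULL;
    c->forward = n;
    n->next = copy(c->next);
    return n;
}

/* Copying collection: move everything reachable from the roots, poison
   the old semispace, then swap the two semispaces. */
static void collect(void) {
    next_free = 0;
    for (int i = 0; i < n_roots; i++) *roots[i] = copy(*roots[i]);
    for (int i = 0; i < HEAP_CELLS; i++) {
        active[i].value = -1;   /* poison: stale pointers now read -1 */
        active[i].next = NULL;
        active[i].forward = NULL;
    }
    cell *t = active; active = idle; idle = t;
}

/* Allocate a cell. Collects on every call to model "may GC" at its worst;
   the allocator protects its own argument, as a real one would. */
static cell *toy_cons(long value, cell *next) {
    assert(n_roots < MAX_ROOTS);
    roots[n_roots++] = &next;
    collect();
    n_roots--;
    assert(next_free < HEAP_CELLS);
    cell *n = &active[next_free++];
    n->value = value;
    n->next = next;
    n->forward = NULL;
    return n;
}

/* Unprotected: 'a' is held only in a C local across a second allocation. */
long stale_read(void) {
    cell *a = toy_cons(1, NULL);
    cell *b = toy_cons(2, NULL);   /* moves/reclaims a's cell; 'a' is stale */
    (void)b;
    return a->value;               /* reads the poisoned cell */
}

/* Protected: 'a' is registered as a root, so the collector updates it. */
long protected_read(void) {
    cell *a = toy_cons(1, NULL);
    assert(n_roots < MAX_ROOTS);
    roots[n_roots++] = &a;
    cell *b = toy_cons(2, NULL);
    n_roots--;
    (void)b;
    return a->value;
}
```

In this model `stale_read()` returns the poison value -1 while `protected_read()` returns 1. With a non-moving collector both would return 1, which is why code of this shape can look correct under one implementation and break under another.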
> > > - most of scsh
> > > - bindings for ODBC (also for scsh)
> > > - bindings for NIS and LDAP (also for scsh)
> > I'd appreciate it if you could say more about this: quantities of
> > code, filenames and distributions containing them, and what you think
> > the effort of migration from native-scsh to draft-ffi would involve.
Thank you for replying to that, btw.
> The scsh CVS repository at sourceforge.net contains ODBC and LDAP
> bindings in the modules scsh-ldap and the directory
> scsh/scsh/odbc. The LDAP bindings are almost complete and about
> 1200 LOC C-code and 1100 LOC Scheme-code (about 300 LOC automatically
> generated). The ODBC bindings consist of about 3000 LOC C-code
> (partially tricky) and about 2000 LOC Scheme-code.
> Currently, I'm busy cleaning up the ODBC bindings and changing them to
> use the SRFI 34/35 exception system. Building the c-stub as a shared
> module that can be dlopen()'ed by scsh and Scheme48 is also on my
> I'm very confident that migrating those bindings to the SRFI-FFI is
> not much work. Checking whether the GC annotations are (still)
> correct and a few search/replace-operations should be enough.
(1200 - 300) + 3000 * trickiness_bonus ~= 7000
I'm confident too that migrating s48 bindings to the draft is not, in
some sense, much work. That isn't my point.
I have two points, actually:
1) The kinds of bugs I found in syscalls1.c and regexp1.c are
a big deal in at least three respects:
a) They suggest that to the degree rapidly releasing this code under
the draft FFI is a priority for the authors, the draft is
constrained _by_this_code_ to not change in what would otherwise
be some fairly minor ways. (For example, that _extract_string
might GC.) In other words, the degree of value the authors
place on getting this particular code out easily is the same
degree they face a conflict of interest when it comes to modifying
b) These bugs include some that _will_ be bugs under the draft FFI
such as the pervasive assumption that enter_fixnum can not GC
and the occasional vestige of C == and != comparisons to
certain "constants". The nested calls to s48_cons are another
c) The style of the code in posix_regexp_match -- in particular that
it is written with very strong assumptions (stronger than the
draft's in fact) about when GC can occur -- suggests to me that
(i) the proposed FFI is fairly hard to use and (ii) it's very
fragile and constraining of implementors. The trickiness that
(in s48, not in the draft) permits parameters to go unprotected
in posix_regexp_match is an example of why the proposed interface
is hard to use well. That this same code becomes wrong under
the fairly minor differences between the s48 ffi and the draft
illustrates how fragile the draft is.
2) I don't mean to diminish the work that has gone into this stuff but
we seem to be talking about, what, 20K LOC all told?
That's 20K LOC that, to be correct under the draft, will have to be
reviewed for the kinds of errors I found in syscalls1.c and
Meanwhile -- what happens if (a) the draft is finalized; (b) a
bunch of implementors provide it; (c) by hook or by crook a
certain amount of the SCSH code winds up being widely used?
Then we have a superficially credible Scheme FFI contradicted only
by the discussions on this list. Will it then be considered a
success if a few months later instead of 20K LOC depending on it
we have, scattered in various projects, 200K LOC depending on it?