This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.
On 7/28/05, Tom Emerson <tree@xxxxxxxxxxxxx> wrote:
>
> I'm not missing his point, actually. The stand-off markup may be
> generated by someone else, say the data provider (in the case of data
> acquired from the LDC or ELDA) and hence I do not have any Scheme
> serialized data, rather character offsets into a UTF-8 scheme.

Do either of those actually supply UTF-32 files along with data files
holding codepoint offsets? UTF-8 is by far the most common storage
format for Unicode, and required by most network protocols.

Regardless, this has nothing to do with strings. This involves
seeking to a byte position in a file, and extracting (and optionally
converting to the internal encoding) a chunk of text.

--
Alex
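
A minimal sketch of the file-level operation described above, in Python, assuming the stand-off markup supplies byte offsets into a UTF-8 file (offsets counted in codepoints would need a different approach). The names `extract_span` and `demo`, and their parameters, are hypothetical and chosen only for illustration.

```python
import os
import tempfile


def extract_span(path, byte_start, byte_length, encoding="utf-8"):
    """Seek to a byte offset in a file, read a span, and decode it.

    Assumes byte_start and byte_start + byte_length both fall on
    character boundaries; otherwise decode() raises UnicodeDecodeError.
    """
    with open(path, "rb") as f:       # binary mode: offsets are bytes
        f.seek(byte_start)            # jump straight to the span
        raw = f.read(byte_length)     # pull out only the bytes needed
    return raw.decode(encoding)       # convert to the internal string type


def demo():
    """Write a small UTF-8 file and extract one span by byte offset."""
    prefix = "naïve "
    target = "日本語"
    data = (prefix + target + " text").encode("utf-8")

    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
        path = tmp.name
    try:
        # Offsets are computed in bytes, not characters.
        start = len(prefix.encode("utf-8"))
        length = len(target.encode("utf-8"))
        return extract_span(path, start, length)
    finally:
        os.remove(path)
```

Nothing here indexes into an in-memory string: the offsets are resolved against the file, and only the extracted chunk is decoded.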