This page is part of the web mail archives of SRFI 56 from before July 7th, 2015. The new archives for SRFI 56 contain all messages, not just those from before July 7th, 2015.
Alex Shinn wrote:
Ideally, as Bear mentioned earlier, I like to think of the byte-level operations as the only primitives on top of which character-level operations are defined, but that is an implementation detail.
Yes, but you don't want to force every Scheme implementor to manage this char<->byte mapping in the Scheme run-time, rather than being able to use existing C/C++/Java APIs, which don't work the way you want them to.
"Complicated" should not prevent us from adding language features, and I don't see this as any more complex than having additional primitive port types.
Byte<->Char conversion is complicated. Not conceptually, but there are big tables and a good chunk of code if you want to support many languages. Most operating systems and "core libraries" these days can do the translation. You really don't want to implement this code in your Scheme runtime; instead you want to build on existing libraries and APIs. Existing APIs (Java, C++, C) distinguish byte I/O from character I/O, generally using different types. They may not support easy on-the-fly switching between binary mode and character mode. So the proposed model means Scheme run-times have to open ports in binary mode and do their own byte<->char conversion. That is not a nice thing to ask of Scheme implementors.
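As an illustration of the layering these APIs use (a sketch in Python, not tied to any particular Scheme implementation): the character stream is a separate, decoding wrapper over the byte stream, and the encoding must be chosen explicitly.

```python
import io

# Byte layer: raw bytes, no encoding involved.
raw = io.BytesIO("héllo\n".encode("utf-8"))

# Character layer: a decoding wrapper around the byte stream.
# The encoding has to be supplied explicitly (here UTF-8);
# the byte stream itself knows nothing about characters.
text = io.TextIOWrapper(raw, encoding="utf-8")

print(text.read())  # yields characters, not bytes
```

Note that once the wrapper is in place, mixing in raw byte reads on the same underlying stream is exactly the kind of mode switch these APIs make awkward.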
It makes no sense to mix character and binary I/O on the same port. Anyone who tries it is in a state of sin.

I work very often with binary file formats, including Scheme libraries for handling ELF, TIFF, and the gettext .mo format among others. Every one of these mixes binary and character data.
I did not say character data - I said character I/O. It is perfectly feasible to read/write character and string data from/to a binary stream - but then you have to define how they are encoded, or do the mapping before/after you write/read them. If you're in a Japanese locale, and write a string to an ELF file, what happens? What happens when I call (newline) in a Windows environment - should it write "\n" or "\r\n"?

Apparently almost everyone who has ever designed a binary format is a sinner :)
Most of these formats don't support general characters. Of course you can have general characters encoded in an ELF section, but ELF views that as just binary data. ELF does know about labels and section names, but there is no support for multiple encodings or wide characters.
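A small Python sketch of why the container can't decide this for you (the encodings are picked arbitrarily for illustration): the same characters produce different byte sequences under different encodings, and a binary format like an ELF section records only the bytes.

```python
s = "résumé"

# The same string, encoded two different ways:
utf8 = s.encode("utf-8")      # 8 bytes: each é becomes two bytes
latin1 = s.encode("latin-1")  # 6 bytes: each é is one byte

# The binary container just sees bytes; the byte sequences differ,
# so the reader must know which encoding the writer used.
assert utf8 != latin1
assert utf8.decode("utf-8") == latin1.decode("latin-1") == s
```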
3) Extract character data in binary ports as binary first then convert with utility procedures to character/string.
Yes, conceptually that is what should be going on. But if you want to be able to do binary I/O on an arbitrary port (that was opened in default mode), then that constrains the implementation unacceptably. Existing code that implements ports may have to be extensively rewritten.

--
--Per Bothner
per@xxxxxxxxxxx http://per.bothner.com/
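For concreteness, a minimal Python sketch of approach (3) above - do binary I/O first, then convert the character data with utility procedures. The two-byte length-prefixed record format here is made up purely for illustration:

```python
import struct

# Writing side: convert characters to bytes explicitly, then do binary I/O.
name = "héllo"
payload = name.encode("utf-8")                     # chars -> bytes
blob = struct.pack(">H", len(payload)) + payload   # hypothetical record format

# Reading side: binary I/O first, then a utility decode step.
(n,) = struct.unpack_from(">H", blob, 0)
decoded = blob[2:2 + n].decode("utf-8")            # bytes -> chars
assert decoded == name
```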