This page is part of the web mail archives of SRFI 83 from before July 7th, 2015. The new archives for SRFI 83 contain all messages, not just those from before July 7th, 2015.
[I apologize - this message is somewhat off topic.]

Lauri Alanko wrote:
> On Tue, Jan 24, 2006 at 11:51:34AM -0800, Per Bothner wrote:
> > What would using symbols and s-exp gain? What kind of operations would it make easier?
>
> There are two different issues here: how should paths or URIs be represented at run-time, and what kind of notation should be used for giving literal values for them in code. As you are speaking about "operations", I assume you mean the former here. To me it is obvious: _all_ common operations on URIs are easier if you have a structured representation instead of a flat string. Maybe the most common operation is resolving a relative URI against a base URI. A purely string-based implementation is a huge mess that involves searching for slashes from right to left (but remembering that consecutive slashes count as a single one),
Actually, two slashes define the "authority" part.
> detecting ".." and "." segments and whatnot... it's the sort of thing you expect to see only in C code.
It's not *that* complicated. And note that the specification is written in terms of string operations, so making sure that a "structured" implementation gives the correct results may actually be more difficult.
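For concreteness, here is a minimal Python sketch (not part of the original exchange) of the segment-list approach to resolving a relative path against a base path. The names `remove_dot_segments` and `resolve_path` are hypothetical, and the sketch assumes an absolute base path and looks only at the path component: authority, query, and fragment handling are left out. The standard library's `urljoin` is imported only as a point of comparison.

```python
from urllib.parse import urljoin  # used only to compare results


def remove_dot_segments(path):
    """Drop "." and ".." segments from an absolute, slash-separated path."""
    segments = path.split("/")
    out = []
    for i, seg in enumerate(segments):
        is_last = i == len(segments) - 1
        if seg == ".":
            if is_last:
                out.append("")  # keep the trailing slash
        elif seg == "..":
            if len(out) > 1:    # never pop the leading "" of an absolute path
                out.pop()
            if is_last:
                out.append("")  # keep the trailing slash
        else:
            out.append(seg)
    return "/".join(out)


def resolve_path(base_path, rel_path):
    """Resolve rel_path against base_path (paths only, base must be absolute)."""
    if rel_path == "":
        return base_path
    if rel_path.startswith("/"):
        return remove_dot_segments(rel_path)
    # Replace everything after the last slash of the base with the reference.
    merged = base_path.rsplit("/", 1)[0] + "/" + rel_path
    return remove_dot_segments(merged)


if __name__ == "__main__":
    print(resolve_path("/b/c/d;p", "../g"))
    print(urljoin("http://a/b/c/d;p?q", "../g"))
```

With the base path `/b/c/d;p`, `resolve_path` gives `/b/g` for `../g`, which agrees with the path of what `urljoin` returns for the full URI. Checking a sketch like this against a string-based reference is exactly the kind of verification the paragraph above has in mind.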
> Any sane implementation will first parse the URI into its constituents and form a list of path segments, and then operate on that list. It would be just silly to constantly parse and unparse the URIs at every operation, so it's better to have a distinct internal representation for them. And indeed, this is why many languages do have special types or classes for representing URIs.
I don't disagree, though "parsing and unparsing for every operation" is unlikely to be performance critical. Moreover, the string form may actually be faster on modern computers, because a string is more compact and has good locality. (Remember that to a first approximation on modern computers instructions take no time - it is cache misses that are expensive.)
> > What about "path names" (as used in file operations): Should they be structured objects or strings?
>
> Definitely objects. Nowadays PLT Scheme has built-in support for path objects, but before that I used to use a simple library: ... Here relative-path calculates the relative path from "from" to "to". Would you like to do this kind of stuff using _strings_?
No - I want this to be hidden in my implementation, using appropriate library procedures. My actual preference is an abstract opaque "path" type with operations that can map to and from URI strings. So whether the internal representation uses URI strings or lists should be an implementation issue.
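As an illustration of that preference (a Python sketch under stated assumptions, not anything from Kawa or PLT Scheme), here is an opaque path type whose only external form is a URI-style string. The class name `Path` and its methods `from_uri_path`, `to_uri_path`, `child`, and `parent` are all hypothetical. The sketch happens to store a tuple of decoded segments, but callers cannot tell; it handles absolute paths only and drops empty segments, so a trailing slash is not preserved.

```python
from urllib.parse import quote, unquote


class Path:
    """Opaque path value; the internal representation is private."""

    def __init__(self, segments):
        # Assumption for this sketch: a tuple of decoded segment names.
        self._segments = tuple(segments)

    @classmethod
    def from_uri_path(cls, text):
        # Split on "/" first, then decode, so an escaped "%2F" stays in a name.
        return cls(unquote(s) for s in text.split("/") if s)

    def to_uri_path(self):
        # Escape every reserved character inside a segment, including "/".
        return "/" + "/".join(quote(s, safe="") for s in self._segments)

    def child(self, name):
        return Path(self._segments + (name,))

    def parent(self):
        return Path(self._segments[:-1])

    def __eq__(self, other):
        return isinstance(other, Path) and self._segments == other._segments

    def __hash__(self):
        return hash(self._segments)


if __name__ == "__main__":
    p = Path.from_uri_path("/home/per/notes.txt")
    print(p.parent().to_uri_path())
    print(p.parent().child("a/b").to_uri_path())
```

Swapping the tuple for a stored URI string would change only the method bodies, which is the sense in which the representation is an implementation issue.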
> I just find it sad that underneath all these high-level conveniences, the operating system still uses strings for paths in the system call interface. As a result, '/' is an utterly magical character that cannot appear in any file's name.
I agree, though I'm not sure how one would fix that, given that one does want a displayable and printable external representation. The RFC solution allows you to escape special characters, which means you've merely exchanged a reserved '/' for a reserved '%'.
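The trade-off can be seen in a few lines of Python (an illustrative sketch using the standard library's `quote` and `unquote`; the wrapper names `encode_segment` and `decode_segment` are hypothetical):

```python
from urllib.parse import quote, unquote


def encode_segment(name):
    """Percent-encode one path segment, escaping "/" as well."""
    return quote(name, safe="")


def decode_segment(text):
    """Inverse of encode_segment."""
    return unquote(text)


if __name__ == "__main__":
    for name in ["a/b", "100%", "%2F"]:
        print(name, "->", encode_segment(name))
```

A name containing '/' becomes `a%2Fb`, so the slash is no longer magical, but a name containing '%' must itself be written as `%25`: the reserved character has moved rather than disappeared.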
> > There are good reasons to prefer strings (standard, universal, and familiar, as listed above). At least it makes sense to read and print pathnames using URI syntax.
>
> Certainly it should be possible, but hardly the default.
Ignoring path-name literals (which I think are less frequent), you still have to get pathnames from the user or the system. S-expressions as external syntax would still have to be validated, plus I don't think it would be the choice for user interfaces.
> XML's surface syntax is also standard, universal and familiar. Would you suggest that XML data in Scheme code be therefore expressed with strings: "<foo>bar<baz/></foo>" instead of, say, Xexprs: (foo "bar" (baz))?
The latter, with one caveat: In Kawa, XML data are represented with special types, and I think this is needed to best match the XML data model. (Namespaces are one factor.) What happens in Kawa is that (foo "bar" (baz)) *evaluates* to XML data, but it isn't XML data in itself. (It depends what you're trying to do whether this distinction is worthwhile, of course.)
--
	--Per Bothner
per@xxxxxxxxxxx   http://per.bothner.com/