Library Files Utilities
Derick Eddington
This SRFI is currently in "draft" status. To see an explanation of each status that a SRFI can hold, see here. To provide input on this SRFI, please mail to <srfi minus 104 at srfi dot schemers dot org>. See instructions here to subscribe to the list. You can access previous messages via the archive of the mailing list.
This SRFI implements SRFI 103: Library Files as a library. It may be used as the means for Scheme implementations to support SRFI 103, in which case the dynamically configurable aspects of this SRFI are the configuration of SRFI 103, which makes that configuration dynamically reconfigurable and inspectable by users. If this SRFI is not used by Scheme implementations as the means to support SRFI 103, it is still useful as a base for software that manages library files and for users working with library files. A reference implementation is provided.
SRFI 103: Library Files only defines a standard for naming and finding library files. To assist in working with library files as defined by SRFI 103, and to assist Scheme implementations in supporting SRFI 103, this SRFI provides a library API of procedures and parameters for working with and configuring all the aspects of SRFI 103. E.g., a Scheme implementation can use this SRFI as its primary means of importing external libraries. Or, e.g., a library manager application can use this SRFI to do transcoding of path names when exchanging library files between different file systems, or to work with library files in other ways.
Implementations of this SRFI as an R6RS-like library must be named (srfi :104 library-files-utilities), and there must also be an alias named (srfi :104), following SRFI 97: SRFI Libraries.
This specification refers to many aspects of SRFI 103: Library Files, and familiarity with it is assumed.
The name to use as the implementation-specific component of the file name extension. It must be a non-empty string which is the non-encoded name. Its encoded form is used to add or recognize the component. The four special characters of SRFI 103, #\0 through #\9, and any characters for which encode-char? returns true are encoded. This parameter must be initialized to the name of the host implementation.
If this SRFI is used as the means for the host implementation to support SRFI 103, changing the value of this parameter will dynamically reconfigure the implementation name used by SRFI 103. If this SRFI is not used as the means for the host implementation to support SRFI 103, changing the value of this parameter will not affect the implementation name used by SRFI 103.
The character used by the host platform to separate components in paths. It is used to join or split components of paths. It is encoded in components of relative library file paths. It must be a character, and it must not be #\%, #\., #\^, or the same as the value of environment-variable-separator. This parameter must be initialized to the host platform's path separator character.
If this SRFI is used as the means for the host implementation to support SRFI 103, changing the value of this parameter will dynamically reconfigure the path separator used by SRFI 103. If this SRFI is not used as the means for the host implementation to support SRFI 103, changing the value of this parameter will not affect the path separator used by SRFI 103.
The character used by the host platform to separate paths in the SCHEME_LIBRARY_SEARCH_PATHS environment variable. It must be a character, and it must not be the same as the value of path-separator. This parameter must be initialized to the host platform's environment variable separator character.
If this SRFI is used as the means for the host implementation to support SRFI 103, changing the value of this parameter will dynamically reconfigure the environment variable separator used by SRFI 103. If this SRFI is not used as the means for the host implementation to support SRFI 103, changing the value of this parameter will not affect the environment variable separator used by SRFI 103. Note that SRFI 103 uses its environment variable separator only once when initializing its search paths.
The list of names of directories to search for library files, in order of precedence. It must be a list, possibly empty, of paths which must be independent, i.e., one cannot be a prefix of another. This parameter must be initialized to the host implementation's search paths.
If this SRFI is used as the means for the host implementation to support SRFI 103, changing the value of this parameter will dynamically reconfigure the search paths used by SRFI 103. If this SRFI is not used as the means for the host implementation to support SRFI 103, changing the value of this parameter will not affect the search paths used by SRFI 103.
OK:    (search-paths '("." "asdf/fdsa" "/foo/bar/blah" "/foo/bar/zab"))
ERROR: (search-paths '("/foo/bar" "/foo/bar/zab"))
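The independence requirement shown by the OK and ERROR cases can be checked mechanically. The following is a minimal sketch, not part of this SRFI: the names split-at-char, list-prefix?, and independent-paths? are hypothetical, and it assumes that "prefix" is meant component-wise (so "/foo/bar" conflicts with "/foo/bar/zab" but not with "/foo/barn"), which the text above does not state explicitly. A path listed twice counts as a prefix of itself and is rejected.

```scheme
(import (rnrs))

;; Split STR at every SEP character, keeping empty pieces so that a leading
;; separator (an absolute path) stays distinguishable from a relative path.
(define (split-at-char str sep)
  (let loop ((chars (reverse (string->list str))) (piece '()) (pieces '()))
    (cond ((null? chars) (cons (list->string piece) pieces))
          ((char=? (car chars) sep)
           (loop (cdr chars) '() (cons (list->string piece) pieces)))
          (else (loop (cdr chars) (cons (car chars) piece) pieces)))))

;; True if the list of strings XS is a prefix of the list of strings YS.
(define (list-prefix? xs ys)
  (or (null? xs)
      (and (pair? ys)
           (string=? (car xs) (car ys))
           (list-prefix? (cdr xs) (cdr ys)))))

;; True if no path in PATHS is a component-wise prefix of another one.
;; Assumption: component-wise comparison; this is one reading of "prefix".
(define (independent-paths? paths sep)
  (let ((split (map (lambda (p) (split-at-char p sep)) paths)))
    (let outer ((rest split))
      (or (null? rest)
          (and (for-all (lambda (other)
                          (not (or (list-prefix? (car rest) other)
                                   (list-prefix? other (car rest)))))
                        (cdr rest))
               (outer (cdr rest)))))))

(independent-paths? '("." "asdf/fdsa" "/foo/bar/blah" "/foo/bar/zab") #\/) ; => #t
(independent-paths? '("/foo/bar" "/foo/bar/zab") #\/)                      ; => #f
```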
The procedure which lists directories. It is used when finding library files. It must be a procedure which takes a path naming a directory and returns a list, possibly empty, of strings which are the relative names of the entities in the directory. The procedure must return #F if the directory does not exist, so that the search may continue. The procedure may raise an exception if there is any problem listing the directory (e.g., "not a directory" or "permission denied"), or it may return #F to ignore the problem and allow the search to continue. This parameter must be initialized to a procedure which lists directories as described.
If this SRFI is used as the means for the host implementation to support SRFI 103, changing the value of this parameter will dynamically reconfigure the means of listing directories used by SRFI 103. If this SRFI is not used as the means for the host implementation to support SRFI 103, changing the value of this parameter will not affect the means of listing directories used by SRFI 103.
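As an illustration of the "return #F to ignore the problem" option for the directory-listing procedure, the sketch below wraps a lister that raises exceptions so that every failure becomes #f. All names here (make-lenient-lister, fake-lister, fake-directories, list-directory) are hypothetical, and the lister is backed by an association list rather than a real file system, because R6RS itself provides no directory-listing procedure.

```scheme
(import (rnrs))

;; Wrap RAW-LISTER, a hypothetical procedure that returns a list of entry
;; names or raises an exception, so that any exception becomes #f.
(define (make-lenient-lister raw-lister)
  (lambda (dir)
    (guard (ex (#t #f))
      (raw-lister dir))))

;; A stand-in for a file system: directory path -> list of entry names.
(define fake-directories
  '(("/s/p/a" . ("foo"))
    ("/s/p/a/foo" . ("bar.sls" "zab.sls"))))

;; Raises an exception for unknown directories, as a strict lister might.
(define (fake-lister dir)
  (cond ((assoc dir fake-directories) => cdr)
        (else (error 'fake-lister "no such directory" dir))))

(define list-directory (make-lenient-lister fake-lister))

(list-directory "/s/p/a/foo")   ; => ("bar.sls" "zab.sls")
(list-directory "/nowhere")     ; => #f
```

A procedure built this way could be installed as the value of the directory-listing parameter; whether swallowing every exception is desirable depends on how much diagnostic information the host implementation wants to surface.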
The predicate which determines what additional characters to encode in components of relative library file paths. It must be a procedure which takes a character and returns true or #F; true means the character will be encoded, #F means it will not be. It cannot determine whether the special characters #\%, #\., #\^, and the value of path-separator are encoded because they are always encoded. If an appropriate set of characters to encode for the host platform's file system can be determined, this parameter must be initialized to a procedure which indicates to encode these. If this cannot be determined, its default value is a procedure which always returns #F.
If this SRFI is used as the means for the host implementation to support SRFI 103, changing the value of this parameter will dynamically reconfigure the additional characters encoded by SRFI 103. If this SRFI is not used as the means for the host implementation to support SRFI 103, changing the value of this parameter will not affect the additional characters encoded by SRFI 103.
Returns a list, possibly empty, of strings which are the paths from the current value of the SCHEME_LIBRARY_SEARCH_PATHS environment variable at the time the procedure is called, in the same order they occurred in the environment variable. The character used to separate paths in the environment variable is the current value of environment-variable-separator. Empty elements (e.g., caused by "a/b::c/d", ":a/b", etc.) are filtered out. If the environment variable is not defined, '() is returned.
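The splitting and filtering just described can be sketched in plain R6RS. This is an illustration under stated assumptions, not the reference implementation: split-search-paths is a hypothetical name, and it takes the variable's value and the separator as arguments instead of reading the environment and the environment-variable-separator parameter. A real implementation would presumably obtain the value through SRFI 98, whose get-environment-variable returns #f for an undefined variable; the sketch maps that #f to '().

```scheme
(import (rnrs))

;; Split VALUE at every SEP character and drop empty elements.
;; VALUE may be #f (variable not defined), in which case '() is returned.
(define (split-search-paths value sep)
  (if (not value)
      '()
      (let loop ((chars (reverse (string->list value))) (piece '()) (paths '()))
        ;; Add the piece collected so far to PATHS unless it is empty.
        (define (flush)
          (if (null? piece) paths (cons (list->string piece) paths)))
        (cond ((null? chars) (flush))
              ((char=? (car chars) sep) (loop (cdr chars) '() (flush)))
              (else (loop (cdr chars) (cons (car chars) piece) paths))))))

(split-search-paths "a/b::c/d" #\:)   ; => ("a/b" "c/d")
(split-search-paths ":a/b" #\:)       ; => ("a/b")
(split-search-paths #f #\:)           ; => ()
```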
Given a datum representing a <library name> (as defined by R6RS 7.1), return a string which is a relative library file path which represents the library name and can be used as the name of a file containing a library with the given library name. The sequence of symbols in the library name is encoded to make the leading path components. If the second argument is true, a last path component with prefix "^main^" is implicitly appended. If the third argument is true, the file name extension of the last path component is prepended with the encoded form of the current value of implementation-name. When encoding the library name's symbols and the implementation name, the current value of encode-char? determines what characters are encoded in addition to the characters which are always encoded. The current value of path-separator is used to construct the returned path.
(library-name->path '(foo) #F #F) => "foo.sls"
(library-name->path '(foo) #T #F) => "foo/^main^.sls"
(library-name->path '(foo) #F #T) => "foo.acme.sls"
(library-name->path '(foo) #T #T) => "foo/^main^.acme.sls"
(library-name->path '(foo bar zab (1)) #F #F) => "foo/bar/zab.1.sls"
(parameterize ((implementation-name "Δ")
               (path-separator #\\)
               (encode-char? (lambda (c) (not (char<=? #\a c #\z)))))
  (library-name->path '(foo ♥ λ bar (1 2 3)) #T #T))
=> "foo\\%E2%99%A5\\%CE%BB\\bar\\^main^.1.2.3.%CE%94.sls"
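Judging from the examples above, an encoded character appears as its UTF-8 octets, each written as "%" followed by two uppercase hexadecimal digits (♥ becomes %E2%99%A5). The following sketch reproduces only that step; the names octet->hex, percent-encode-char, and encode-string are hypothetical, the exact encoding rules are those of SRFI 103 and are assumed here rather than quoted, and the characters that are always encoded (such as the path separator) are left to the caller's predicate.

```scheme
(import (rnrs))

;; Two uppercase hexadecimal digits for the octet N (0 to 255).
(define (octet->hex n)
  (let ((digits (number->string n 16)))
    (string-upcase (if (< n 16) (string-append "0" digits) digits))))

;; Percent-encode the character C as its UTF-8 octets.
(define (percent-encode-char c)
  (apply string-append
         (map (lambda (octet) (string-append "%" (octet->hex octet)))
              (bytevector->u8-list (string->utf8 (string c))))))

;; Encode every character of STR for which ENCODE? returns true.
(define (encode-string str encode?)
  (apply string-append
         (map (lambda (c) (if (encode? c) (percent-encode-char c) (string c)))
              (string->list str))))

;; The predicate from the example above: encode everything outside a-z.
(define (not-lowercase-ascii? c) (not (char<=? #\a c #\z)))

(encode-string "♥" not-lowercase-ascii?)     ; => "%E2%99%A5"
(encode-string "λ" not-lowercase-ascii?)     ; => "%CE%BB"
(encode-string "foo" not-lowercase-ascii?)   ; => "foo"
```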
Given a path, if the path is a valid library file path then return an association list of information about the path, else return #F. This can be used as a predicate to recognize library file paths versus other paths. One association is always present: key 'library and value being the library name decoded from the path. If the path begins with one of the search paths, an association is present with key 'search-path and value being the search path as a string. If the last path component begins with the implicit file name prefix "^main^", an association is present with key 'implicit and value being #T. If the file name extension is implementation-specific, an association is present with key 'implementation and value being the decoded implementation-specific component as a string. Encoded characters are decoded regardless of the value of encode-char?. The current value of path-separator is used to recognize separate path components.
(library-file-path-info "foo.sls") => ((library . (foo)))
(parameterize ((search-paths '("/ab/cd/ef")))
  (library-file-path-info "/ab/cd/ef/foo/bar/zab.1.2.sls"))
=> ((library . (foo bar zab (1 2))) (search-path . "/ab/cd/ef"))
(library-file-path-info "foo/^main^.sls") => ((library . (foo)) (implicit . #T))
(library-file-path-info "foo.acme.sls") => ((library . (foo)) (implementation . "acme"))
(parameterize ((search-paths '("/ab/cd/ef")))
  (library-file-path-info "/ab/cd/ef/foo/bar/^main^.1.2.3.acme.sls"))
=> ((library . (foo bar (1 2 3))) (search-path . "/ab/cd/ef") (implicit . #T) (implementation . "acme"))
(parameterize ((encode-char? (lambda (c) #F)))
  (library-file-path-info "%E2%99%A5/%CE%BB.%CE%94.sls"))
=> ((library . (♥ λ)) (implementation . "Δ"))
(let ((info (library-file-path-info "♥/λ/^main^.7.Δ.sls")))
  (parameterize ((implementation-name (cond ((assq 'implementation info) => cdr)
                                            (else "ignored")))
                 (path-separator #\\)
                 (encode-char? (lambda (c) #T)))
    (library-name->path (cdr (assq 'library info))
                        (assq 'implicit info)
                        (assq 'implementation info))))
=> "%E2%99%A5\\%CE%BB\\^main^.7.%CE%94.sls"
(library-file-path-info "foo.png") => #F
(library-file-path-info "foo.1.+2.3.sls") => #F
(library-file-path-info "^main^.sls") => #F
Given a datum representing a <library reference> (as defined by R6RS 7.1), find the files in the search paths whose paths match the library reference, and return an association list describing the matching paths and their ordering. Each association represents a search path which contains at least one match. No association is present for a search path which does not contain a match. The key of each association is the search path the association represents. The associations are ordered the same as their keys are in the search-paths parameter. The value of each association is a list of length one or two which represents the one or two possible directories in the association's search path which contain matches. A directory containing implicit file names is one of the possibilities, and a directory containing non-implicit file names is the other possibility. If both directories exist and contain matches, the directory containing implicit file names is ordered first. Each element of this list is a non-empty ordered list of matching paths from the corresponding directory, and these paths are relative to the association's search path, and they are ordered as described in the Ordering section of SRFI 103. If no matches are found, '() is returned.
/s/p/a/
  foo/
    bar.1.0.acme.sls
    bar.1.2.other.sls
    bar.1.2.sls
    bar.1.acme.sls
    bar.1.sls
    bar.2.acme.sls
    bar.2.sls
    bar.acme.sls
    bar.other.sls
    bar.png
    bar.sls
    zab.sls
    bar/
      ^main^.1.9.acme.sls
      ^main^.sls
      blah.sls
s/p/c/
  foo/
    bar.1.1.sls
    bar.3.sls
    bar.other.sls
    bar/
      ^main^.2.sls
      ^main^.other.sls
spb/
  foo/
    blah.sls
    zab.sls
    bar/
      ^main^.0.7.acme.sls
      ^main^.0.9.sls
      ^main^.1.0.sls
      ^main^.1.2.acme.sls
      ^main^.other.sls
      ^main^.png
      zab.sls
spd/
  foo/
    it.sls
    bar/
      thing.sls
(parameterize ((search-paths '("spd" "s/p/c" "spb" "/s/p/a"))
               (implementation-name "acme"))
  (find-library-file-paths '(foo bar (1))))
=>
(("s/p/c"
  ("foo/bar.1.1.sls"))
 ("spb"
  ("foo/bar/^main^.1.2.acme.sls" "foo/bar/^main^.1.0.sls"))
 ("/s/p/a"
  ("foo/bar/^main^.sls" "foo/bar/^main^.1.9.acme.sls")
  ("foo/bar.acme.sls" "foo/bar.sls" "foo/bar.1.2.sls"
   "foo/bar.1.0.acme.sls" "foo/bar.1.acme.sls" "foo/bar.1.sls")))
Given a data structure returned by find-library-file-paths, join the relative paths with the search paths they are under and return a flat list of these joined paths, preserving the ordering in library-file-paths.
(join-and-flatten
 '(("s/p/c"
    ("foo/bar.1.1.sls"))
   ("spb"
    ("foo/bar/^main^.1.2.acme.sls" "foo/bar/^main^.1.0.sls"))
   ("/s/p/a"
    ("foo/bar/^main^.sls" "foo/bar/^main^.1.9.acme.sls")
    ("foo/bar.acme.sls" "foo/bar.sls" "foo/bar.1.2.sls"
     "foo/bar.1.0.acme.sls" "foo/bar.1.acme.sls" "foo/bar.1.sls"))))
=>
("s/p/c/foo/bar.1.1.sls"
 "spb/foo/bar/^main^.1.2.acme.sls"
 "spb/foo/bar/^main^.1.0.sls"
 "/s/p/a/foo/bar/^main^.sls"
 "/s/p/a/foo/bar/^main^.1.9.acme.sls"
 "/s/p/a/foo/bar.acme.sls"
 "/s/p/a/foo/bar.sls"
 "/s/p/a/foo/bar.1.2.sls"
 "/s/p/a/foo/bar.1.0.acme.sls"
 "/s/p/a/foo/bar.1.acme.sls"
 "/s/p/a/foo/bar.1.sls")
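The joining and flattening shown above amounts to two nested appends over the association list. The sketch below is one way to write it in plain R6RS and is not taken from the reference implementation: join-path and join-and-flatten* are hypothetical names, and the separator is passed explicitly instead of being read from the path-separator parameter.

```scheme
(import (rnrs))

;; Join a search path and a relative path with the separator character SEP.
(define (join-path search-path relative sep)
  (string-append search-path (string sep) relative))

;; FOUND has the shape ((search-path (relative-path ...) ...) ...), i.e. the
;; shape returned by find-library-file-paths. Ordering is preserved.
(define (join-and-flatten* found sep)
  (apply append
         (map (lambda (entry)
                (let ((search-path (car entry)))
                  (map (lambda (relative)
                         (join-path search-path relative sep))
                       ;; Merge the one or two per-directory lists.
                       (apply append (cdr entry)))))
              found)))

(join-and-flatten*
 '(("s/p/c" ("foo/bar.1.1.sls"))
   ("/s/p/a" ("foo/bar/^main^.sls") ("foo/bar.sls" "foo/bar.1.sls")))
 #\/)
;; => ("s/p/c/foo/bar.1.1.sls" "/s/p/a/foo/bar/^main^.sls"
;;     "/s/p/a/foo/bar.sls" "/s/p/a/foo/bar.1.sls")
```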
The reference implementation is provided as an R6RS library. It requires a directory listing procedure, a number of R6RS bindings, SRFI 39: Parameter Objects, and SRFI 98: An Interface to Access Environment Variables. It can be used by Scheme implementations as a built-in library, e.g., in a boot image. It can also be used as an externally-imported library.
For use as an externally-imported library, the reference implementation uses implementation-specific library files in order to initialize the parameters of this SRFI. Files are provided for Ikarus, Larceny, PLT, and Ypsilon, and these files should make it clear how other implementations can be supported. If this SRFI is built into a Scheme implementation, the implementation-specific libraries are not needed and the main library can be easily adapted to not use them.
The test program is provided as an R6RS program. It requires, in addition to an implementation of this SRFI, SRFI 39: Parameter Objects, and SRFI 78: Lightweight Testing.
The reference implementation and tests.
(Section which points out things to be resolved. This will not appear in the final SRFI.)
Are the initialization helper libraries all correct and as complete as they should be?
Should find-library-file-paths return a joined and flattened list, and get rid of join-and-flatten, instead? library-file-path-info can be used on the paths to know their search path, etc.
How should absolute paths which aren't prefixed with a search path be interpreted?:
(parameterize ((search-paths '()))
  (library-file-path-info "/a/b/c/foo.sls"))
=> #F or ((library . (a b c foo))) ???
(parameterize ((path-separator #\\) (search-paths '()))
  (library-file-path-info "C:\\a\\b\\c\\foo.sls"))
=> ???
Relative paths are OK because it makes sense to use all the components:
(parameterize ((search-paths '())) (library-file-path-info "a/b/c/foo.sls")) => ((library . (a b c foo)))
What should be done about this absolute vs. relative path bug?:
(parameterize ((search-paths '("/a/b/c")))
  (library-file-path-info "a/b/c/foo.sls"))
=> ((library . (foo)))  ;; should be ((library . (a b c foo)))
(parameterize ((search-paths '("a/b/c")))
  (library-file-path-info "/a/b/c/foo.sls"))
=> ((library . (foo)))  ;; should it be #F,
                        ;; or should it be ((library . (a b c foo)))?
A substring and string=? could easily be done instead, but then this would not work:
(parameterize ((search-paths '("/a/b/c")))
  (library-file-path-info "//a///b////c/////foo.sls"))
=> ((library . (foo)) (search-path . "/a/b/c"))
(parameterize ((search-paths '("//a///b////c")))
  (library-file-path-info "/a/b/c/foo.sls"))
=> ((library . (foo)) (search-path . "//a///b////c"))
Some sort of smarter path splitting and joining is needed. A first "/" could be considered both a component and a separator, but then what would be the ramifications to non-POSIX path support? We want to support the ability to use different path types at run-time (via parameterization). Make some path type parameter..? Incorporate path normalization..? I wish a suitable path manipulation SRFI existed...
TODO: Anything else?
I thank everyone who helped with SRFI 103: Library Files. I thank all those who participated during the draft period of this SRFI. I thank David Van Horn for editing this SRFI and for suggesting it be separated from SRFI 103.
Copyright (C) Derick Eddington (2009). All Rights Reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.