This page is part of the web mail archives of SRFI 83 from before July 7th, 2015. The new archives for SRFI 83 contain all messages, not just those from before July 7th, 2015.
Already there has been a great deal of contention over several items in this SRFI. Although I know that this is a very difficult issue to tackle, and my initial reaction was reservation about joining the discussion, I think there are several ways to significantly improve the design of the system. Some of them I'm sure the authors will disagree with, but I'll put them forth anyway. My alternative proposal and some examples of it are available on the web at <http://mumble.net/~campbell/proposals/module.text> 1. Source storage and naming. We have seen several different view- points about how source should be stored and modules named. Some want a convenient mechanism for listing many modules in a single file, i.e. a module->file surjection; others want a bijective file<->module mapping; and there are still others, such as myself, who at times want certain modules to be able to be spread throughout multiple files, i.e. a file->module surjection. There is another position of wanting independence of the file system and module system, which I agree with as being useful. Furthermore, there are strong reasons to prefer being able to identify top-level forms in a module's body by being aligned at column 0 of lines in the file. All of these points are useful, and I think none of them really takes any significant precedence over the others. I suggest, though, that there is a way to support all of these desirable properties of the module system. What I am about to offer also has several more properties as well: it provides easy static analysis of a software system's organization at a high level, and it could allow for easier transition to R6RS. My suggestion is this, based on Scheme48's module system: that module descriptions be in a simple language distinct from Scheme, which would be in general placed in a file separate from those of the source code of modules' implementations. There would be one module descriptions file for a collection of related modules that would describe the high-level organization of the components of the system. In each module definition, there might be a (FILES pathname ...) clause specifying that there is implementation source in the supplied pathnames; there might be a (CODE command or definition ...) clause specifying embedded source; and there might be yet other clausess for more exotic forms of source storage. The module definition would also contain IMPORT(S?) clauses, clauses specifying the interface that the module implements (on which I shall elaborate below), FOR-SYNTAX clauses specifying the macro environment (akin to SRFI 83's current (FOR ... EXPAND), but arbitrarily nestable), and possibly more clauses. This structure has a number of advantages: a. Those who wish to embed multiple libraries in a single file can do so simply by using embedded CODE clauses in a module description file. b. Those who wish to have a file<->module bijection can simply use a single FILES clause for each of their modules. c. Those who wish to support multiple files in a single module can simply use multiple FILES clauses, or FILES clauses with multiple pathnames. d. The program's high-level organization is very clearly specified and laid out, making it easy to examine how all modules are interrelated -- easy not just for a human but also for a machine, simplifying, for instance, cross-referencing tools or editor features. e. This organization is conducive to static processing without delving into all the Scheme code. For example, one could load a single module description file for one software bundle, without requiring that all of the source be touched until the modules be actually needed: each module in the set of module descriptions could be loaded on demand (and not even the whole set at once would be needed, only the individually necessary ones). In this way, many, many modules can be available in a running image, with plenty of information about them supplied (e.g., for browsing) without requiring that they all be loaded, bloating the image. f. The organization I propose more clearly separates the module language from the Scheme language, which in the present proposal is not so clear, though the separation is certainly still equally present. The meaning of program terms at 'the top level' is relegated to a small remark in the issues section noting that no such semantics is specified to only very briefly explain this separation. g. By specifying module descriptions separately from module source, reader extensions are *much* more easily isolable into the module description and pose no problems of needing to read a module's source before all the information necessary to do so is available. h. In my experience, I have found that the Scheme48 module system is very conducive to rapid development for a number of reasons, some of which I shall get into later, but chief among which here is the ability to load a module definition into an image and then to incrementally load definitions for that module's source into the image without having to reload the whole module for each tiny change: the system can determine what environment the new code should be evaluated in and acts accordingly. This is very beneficial to developing many initially small modules very rapidly and conveniently. 2. 'Language' module. I have always been puzzled by this peculiarity in MzScheme's module system. The only reason I have heard for it is that it allows for control of REQUIRE in module bodies by magical mystery macros in PLT, but I am very equivocal toward the inclusion of such limited-use extensibility in the standard. Now, clearly, if there were no language specifying the available names in the first place, there could be no IMPORT name bound in module bodies to import other names for actual usage, but this becomes irrelevant if the method of specifying module imports is moved to the level of a module description language explained above.(*) The imported modules can be uniformly listed without requiring some special position for a 'language' module. (*) The module language could still, in fact, be extensible in this way. In Scheme48, for instance, the module language is implemented as a set of macros that expand to Scheme procedure calls operating on module objects. Macros are also supported in the module language, so custom module system macros could be defined that control OPEN clauses. Also, module language environments, which are no different from regular Scheme environments except in what bindings are available, can be controlled as well to a similar effect. 3. Syntactic tower. For over a decade, I believe, Scheme48's module system has supported a simple mechanism, with a simple implementation, for specifying the environment of macro definitions in modules: i. The OPEN (IMPORT) clauses specify the names bound at the regular lexical environment of the module's body. ii. The OPEN clauses within one level of (FOR-SYNTAX <clause> ...) specify the lexical environment for the right-hand sides of macro bindings. iii. Within another FOR-SYNTAX level, OPEN clauses specify the lexical environment of right-hand sides of macro bindings within the right-hand sides of macro bindings. iv. ...and so on, arbitrarily deeply nested. (FOR-SYNTAX clauses can also contain other clauses; they could, for instance, contain FILES clauses for definitions that should be visible in the macro environment to be used in macros defined in the module.) Also, every module is instantiated once. After being compiled, it need not be compiled again; after being loaded, it need not be loaded again. (Actually, in the current implementation, modules are always immediately loaded after compilation, but this is not strictly necessary.) There is no strange tower of modules being loaded once for some other module using it at compilation time and being loaded (and compiled?) again if the same module uses it at run-time, and perhaps yet again if another module uses the second one at compilation time and run-time both. (I still don't understand the exact details of how this is all interrelated.) This is all very difficult to follow, and it makes it especially hard to work in a traditional image-based Lisp development environment if a module does not mean a single thing but rather a tower of possible instantiations of a module template. The ability to very quickly work in many different modules at once, editing & re-evaluating definitions incrementally, in a running system, has proven utterly invaluable in my experience to rapid development of complex systems. 4. Indirect exports. I think it would be much simpler if a macro's indirect exports were listed within the macro's definition. In some macros, this would not even be necessary; a SYNTAX-RULES compiler, for example, can determine quite easily what auxiliary names are needed by the macro. In instances where the list is statically undecidable, it is most convenient to list it along with the macro's definition, and it does not pollute the specification of the interface, since the list is really an implementation detail of the macro. So I propose that the INDIRECT-EXPORT form be eliminated in favour of adding an indirect export list to macro transformers where it is necessary; e.g.: (define-syntax foo (explicit-renaming-transformer (lambda (form rename compare) `(,(rename 'BAR) ,(caddr form) ,(cadr form))) (BAR))) (define-syntax baz (syntactic-closure-transformer (lambda (form usage-env transformer-env) (close-syntax `(QUUX ,(close-syntax (cadr form) usage-env) ZOT) transformer-env)) (QUUX ZOT))) 5. The issue of separating interface from implementation is clearly omitted as noted in the issues section. This is a very glaring omission which is easy to correct and whose inclusion would dramatically improve the usefulness and applicability of the module system, as well as expanding the room for extension, such as to parametric modules (discussed below). I propose that a top-level module language form be introduced for defining named interfaces, and that the LIBRARY form or equivalent be modified to have a single interface listed, not many EXPORTs scattered throughout the body. I have found that the separate specification of interfaces from implementations is very useful in designing complex systems for many reasons: a. It conduces extra thought to be put to the design of modular components, which may potentially be reused in more general ways or reimplemented easily preserving the interface. b. It simplifies cross-referencing and static analysis of programs by providing the static information of what names are made available by what modules, without, as noted earlier, needing to delve into the modules' bodies. This kind of static analysis can be very useful in the transition to the R6RS module system, because a simple compiler can be used whose input is the R6RS module language and whose output is module definitions for particular Scheme systems that have not yet adopted the R6RS module system. c. Interface definitions are very convenient places for locating API documentation. This could be on-line or off-line, and either embedded as (optional) strings in the interface definition or written as comments in the region of the interface definition. This better separates the interface documentation from internal comments, where previously the two would often be confounded. d. Interfaces can also be used to specify optional static types, for debugging or optimization convenience; Scheme48 allows this, for example. (This is somewhat of an extension of (a) and (c).) e. As interface definitions are separated from the modules that implement them, they can be redefined separately, too. In Scheme48, redefining an interface has the effect of making all new names specified by it available in all modules that opened modules that implement the redefined interface. This is very helpful when developing modules whose interfaces are in a state of flux and the developer has no inclination to reload all of his modules just to make a single newly exported name available in some already loaded module. f. Although I do not propose to include parametric modules in the base module system of R6RS, interface specifications make the concept a much more feasible and simple extension to the module system: the specification of interfaces separately from their implementations allows parametric modules to request parameters with particular interfaces, regardless of the implementation, which would not be possible, or which would be at least extremely cumbersome and repetitive, if all of the interfaces were implicitly specified within the modules that implemented them. Standard ML's functors are an example of this kind of parametric module which have shown themselves to be very useful in practice.