by Alex Shinn
This SRFI is currently in final status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-166@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.
A library of procedures for formatting Scheme objects to text in various ways, and for easily concatenating, composing and extending these formatters efficiently without resorting to capturing and manipulating intermediate strings.
This SRFI is an updated version of SRFI 159, primarily with the difference that state variables are hygienic.
Further differences from SRFI 159:
written-shared, pretty-shared
as-italic, as-color, as-true-color, on-color background variants, and pretty-with-color
ambiguous-is-wide? state variable and string-terminal-width/wide utility
substring/width state var for width-aware substring operations, with substring-terminal-width(/wide) utilities
substring/preserve state var used in trimming, with substring-terminal-preserve utility
pretty-environment state variable
as-unicode renamed to terminal-aware
upcased and downcased
There are several approaches to text formatting. Concatenating
strings to display is not acceptable, since it doesn't scale to very
large output. The simplest realistic idea, and what people resort to
in typical portable Scheme, is to interleave display and write and
manual loops, but this is both extremely verbose and doesn't compose
well. A simple concept such as padding space can't be achieved
directly without somehow capturing intermediate output.
The traditional approach in other languages is to use templates - typically strings, though in theory any object could be used and indeed Emacs's mode-line format templates allow arbitrary sexps. Templates can use either escape sequences (as in C's printf and Common Lisp's format) or pattern matching (as in Visual Basic's Format, Perl6's form, and SQL date formats). The primary disadvantages of templates are the relative difficulty (usually impossibility) of extending them, their opaqueness, and the unreadability that arises with complex formats. Templates are not without their advantages, but they are already addressed by other libraries such as SRFI 28 and SRFI 48.
Another important aspect of formatting is state. Common Lisp format provides a "fresh-line" format spec which outputs a newline only if the output stream is not already at the beginning of a line. C++ iostreams allow changing the radix and floating-point precision for numeric output, not just for a single value but as a persistent setting for all future output. Custom formatters which could manipulate their own state would allow for many new possibilities.
This SRFI takes a combinator approach to solving both problems. Formatters are defined, which are called to produce their output as needed, composed with other formatters, and refer to and update arbitrary state. The primary goal of this SRFI is to have a maximally expressive and extensible formatting library. The next most important goal is scalability — to be able to handle arbitrarily large output and not build intermediate results except where necessary. The third goal is brevity and ease of use.
Base: show each each-in-list displayed written written-shared written-simply escaped maybe-escaped numeric numeric/comma numeric/si numeric/fitted nl fl space-to tab-to nothing joined joined/prefix joined/suffix joined/last joined/dot joined/range padded padded/right padded/both trimmed trimmed/right trimmed/both trimmed/lazy fitted fitted/right fitted/both fn with with! forked call-with-output make-state-variable port row col width output writer string-width substring/width substring/preserve pad-char ellipsis radix precision decimal-sep decimal-align sign-rule comma-rule comma-sep word-separator? ambiguous-is-wide?
Pretty: pretty pretty-shared pretty-simply pretty-with-color
Columnar: columnar tabular wrapped wrapped/list wrapped/char justified from-file line-numbers
Unicode: terminal-aware string-terminal-width string-terminal-width/wide substring-terminal-width substring-terminal-width/wide substring-terminal-preserve upcased downcased
Color: as-red as-blue as-green as-cyan as-yellow as-magenta as-white as-black as-bold as-italic as-underline as-color as-true-color on-red on-blue on-green on-cyan on-yellow on-magenta on-white on-black on-color on-true-color
We introduce two new types, formatters, which are disjoint from any
type except possibly procedures, and state variables, which are
distinct from any type except possibly SRFI 39 parameters. These are
in fact identical to the SRFI 165 computations and computation
environment variables, respectively, though knowledge of SRFI 165 is
not required to use this SRFI.
In the prototypes below the following naming conventions imply type restrictions:
displayed
The naming of formatters and mappers is generally chosen such that they read as adjectives or adverbs describing how the objects they act on are formatted. This provides a natural reading of the code, and allows for a simple mapping between standard operations and their formatting counterparts:
write: written
display: displayed
string-pad: padded
string-trim: trimmed
string-join: joined
The SRFI is divided into a core implementation and four utility libraries, which could be defined portably in terms of the core but are provided as convenience extensions. The libraries are as follows:
(srfi 166)          ; composite of all of the following
(srfi 166 base)     ; all bindings not in one of the following
(srfi 166 pretty)   ; all bindings in Pretty Printing
(srfi 166 columnar) ; all bindings in Columnar Formatting
(srfi 166 unicode)  ; all bindings in Unicode
(srfi 166 color)    ; all bindings in Formatting with Color
(show output-dest fmt ...)
The entry point for all formatting. Applies the fmt formatters in
sequence, accumulating the output to output-dest. As with SRFI
28 format, output-dest can be an output port, #t to indicate the
current output port, or #f to accumulate the output into a string
and return that as the result of show.
Each fmt should be a formatter as discussed below. As a
convenience, non-formatter arguments are also allowed and are
formatted as if wrapped with displayed, described below, so that
(show #f "π = " (with ((precision 2)) (acos -1)) nl)
would return the string "π = 3.14\n".
As mentioned, formatters are an opaque type and cannot directly be
applied outside of show. Custom formatters are built on the
existing formatters, and as first-class objects may be named or
computed dynamically, so that:
(let ((~.2f (lambda (x) (with ((precision 2)) x))))
  (show #f "π = " (~.2f (acos -1)) nl))
produces the same result. For typical uses you only need to combine the existing high-level formatters described in the succeeding sections, but see the section Higher Order Formatters and State for control flow and state-manipulation primitives.
The return value of show is the accumulated string if
output-dest is #f and unspecified otherwise.
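The three kinds of output-dest can be illustrated with a minimal sketch, assuming an R7RS implementation that provides this SRFI as (srfi 166). The variable names are illustrative only.

```scheme
(import (scheme base) (srfi 166))

;; #f: accumulate the output into a string and return it.
(define as-string (show #f "x = " 42 nl))    ; => "x = 42\n"

;; An output port: write to it; the return value is unspecified.
(define port (open-output-string))
(show port "x = " 42 nl)
(define from-port (get-output-string port))  ; => "x = 42\n"

;; #t: write to the current output port.
(show #t "x = " 42 nl)
```

Non-formatter arguments such as the string and the number here are handled as if wrapped with displayed.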
(displayed obj)
If obj is a formatter, returns obj as is.
Otherwise, outputs obj using display semantics.
Specifically, strings are output as if by write-string and
characters are written as if by write-char. Other objects
are output as with written (including nested strings and chars
inside obj). This is the default behavior for top-level formats
in show, each and most other high-level formatters.
It is an error if obj is a procedure which is not a formatter.
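The difference between top-level and nested strings and characters can be seen in a short sketch, assuming the (srfi 166) library is available. The definitions are illustrative names, not part of this SRFI.

```scheme
(import (scheme base) (srfi 166))

;; Top-level strings and characters use display semantics.
(define top-level (show #f "say " #\a))          ; => "say a"

;; Inside another object they are output as with written.
(define nested (show #f (list "say" #\a)))       ; => "(\"say\" #\\a)"

;; written can be requested explicitly for a top-level string.
(define quoted (show #f (written "say")))        ; => "\"say\""
```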
(written obj)
Outputs obj using write semantics. Uses the current numeric
formatting settings to the extent that the written result can still
be passed to read, possibly with loss of precision. Specifically,
the current radix is used if set to any of 2, 8, 10 or 16, and the
fixed-point precision is used if specified and the radix is 10.
(show #f (written (cons 0 1))) => "(0 . 1)"
(show #f 1.5 " " (with ((precision 0)) 1.5)) => "1.5 2"
(show #f 1/7 " " (with ((precision 3)) 1/7) " " (with ((precision 20)) 1/7)) => "1/7 0.143 0.14285714285714285714"
Implementations should allow arbitrary precision for exact rational numbers. For example:
(show #f (with ((precision 50)) 1/3)) => "0.33333333333333333333333333333333333333333333333333"
As a less obvious example, using string-segment from SRFI 152, the
following code returns the first 100 Fibonacci numbers:
(map string->number (string-segment (show #f (with ((precision 2500)) (/ 1000 (- #e1e50 #e1e25 1)))) 25))
If you don't know the type of an object and want to print it out
for debugging purposes, you should always wrap it with written
or one of its variants, in case the object is itself a formatter.
Note that, for debugging, a convenient idiom is to wrap the
object(s) in a quasiquote list:
(define (add x y) (show #t `(add x: ,x y: ,y) nl) (+ x y))
(written-shared obj)
Like written, but using datum labels for shared structures among all
pairs and vectors, analogous to write-shared.
(written-simply obj)
Like written, but doesn't handle shared structures, analogous to
write-simply. Infinite loops can still be avoided if used inside a
formatter that truncates data (see trimmed/lazy below).
(escaped str [quote-ch esc-ch renamer])
Outputs the string str, escaping any quote or escape characters.
If esc-ch, which defaults to #\\, is #f, escapes only the quote-ch,
which defaults to #\", by doubling it, as in SQL strings and CSV
values. If renamer is provided, it should be a procedure of one
character which maps that character to its escape value,
e.g. #\newline => #\n, or #f if there is no escape value.
(show #f (escaped "hi, bob!")) => "hi, bob!"
(show #f (escaped "hi, \"bob!\"")) => "hi, \"bob!\""
(maybe-escaped str pred [quote-ch esc-ch renamer])
Like escaped, but first checks if any quoting is required (by
the existence of either any quote or escape characters, or any
character matching pred), and if so outputs the string in
quotes and with escapes. Otherwise outputs the string as is. This
is useful for quoting symbols and CSV output, etc.
(show #f (maybe-escaped "foo" char-whitespace? #\")) => "foo"
(show #f (maybe-escaped "foo bar" char-whitespace? #\")) => "\"foo bar\""
(show #f (maybe-escaped "foo\"bar\"baz" char-whitespace? #\")) => "\"foo\"bar\"baz\""
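The CSV use case mentioned above might be sketched as follows, assuming the (srfi 166) library is available. The helpers csv-field and csv-row are hypothetical names introduced for illustration. Passing #f as esc-ch selects the quote-doubling behavior described under escaped.

```scheme
(import (scheme base) (srfi 166))

;; Hypothetical helper: quote a field only if it contains a comma, a
;; newline or a quote character. esc-ch is #f, so an embedded quote
;; is escaped by doubling it.
(define (csv-field str)
  (maybe-escaped str
                 (lambda (ch) (or (eqv? ch #\,) (eqv? ch #\newline)))
                 #\"
                 #f))

;; Hypothetical helper: one comma-separated row ending in a newline.
(define (csv-row fields)
  (each (joined csv-field fields ",") nl))

(define row (show #f (csv-row '("plain" "a,b" "say \"hi\""))))
;; Following the descriptions above, row contains:
;;   plain,"a,b","say ""hi"""
;; followed by a newline.
```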
(numeric num [radix precision sign-rule comma-rule comma-sep decimal-sep])
Formats a single number num. You can optionally specify any radix from 2 to 36 (even if num isn't an integer). precision forces a fixed-point format.
A sign-rule of #t indicates to output a plus sign (+) for
positive integers. However, if sign-rule is a pair of two strings, it
means to wrap negative numbers with the two strings. For example,
("(" . ")") prints negative numbers in parentheses, financial
style: -1.99 => (1.99).
comma-rule is an integer specifying the number of digits between commas, or a list of integers representing the number of digits between each successive comma, with the first being the least significant digits and the last repeating.
comma-sep is the character to use for commas, defaulting to #\,.
decimal-sep is the character to use for decimals, defaulting
to #\., or to #\, (European style) if comma-sep is already #\..
These parameters may seem unwieldy, but they can also take their defaults from state variables, described below, if any are omitted.
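As a sketch of taking the defaults from state variables, the following sets comma-rule, comma-sep, sign-rule and precision with the with form instead of passing positional arguments. It assumes the (srfi 166) library is available, and the definition names are illustrative.

```scheme
(import (scheme base) (srfi 166))

;; Group digits in threes via the comma-rule state variable.
(define grouped
  (show #f (with ((comma-rule 3)) (numeric 1234567))))
;; => "1,234,567"

;; The same, with a period as the digit separator.
(define grouped-eu
  (show #f (with ((comma-rule 3) (comma-sep #\.)) (numeric 1234567))))
;; => "1.234.567"

;; Financial style: negative numbers wrapped in parentheses.
;; An exact rational is used so that no float rounding is involved.
(define financial
  (show #f (with ((sign-rule '("(" . ")")) (precision 2))
             (numeric -199/100))))
;; => "(1.99)"
```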
(numeric/comma num [comma-rule radix precision sign-rule])
Shortcut for numeric to print with commas.
(show #f (numeric/comma 123456789)) => "123,456,789"
(show #f (numeric/comma 123456789 2)) => "1,23,45,67,89"
(show #f (numeric/comma 123456789 '(3 2))) => "12,34,56,789"
(numeric/si num [base separator])
Abbreviates num with an SI suffix as in the -h or --si option to many GNU commands. The base defaults to 1000, using suffix names k, M, G, etc. If the base is 1024, the suffixes are Ki, Mi, Gi, etc. (note the capital "Ki" in this case). It is an error to specify a base other than 1000 or 1024. If separator is provided, it is inserted after the number, before any suffix, for example to allow a space.
(show #f (numeric/si 608)) => "608"
(show #f (numeric/si 608) "B") => "608B"
(show #f (numeric/si 608 1000 " ") "B") => "608 B"
(show #f (numeric/si 3986)) => "4k"
(show #f (numeric/si 3986 1024) "B") => "3.9KiB"
(show #f (numeric/si 1.23e-6) "m") => "1.2µm"
(show #f (numeric/si 1.23e-6 1000 " ") "m") => "1.2 µm"
See https://en.wikipedia.org/wiki/Metric_prefix for the complete list of abbreviations.
(numeric/fitted width n . args)
Like numeric, but if the result doesn't fit in width using the
current precision, outputs a string of hashes instead of showing an
incorrectly truncated number. For example:
(show #f (with ((precision 2)) (numeric/fitted 4 1.25))) => "1.25"
(show #f (with ((precision 2)) (numeric/fitted 4 12.345))) => "#.##"
(show #f (with ((precision 0)) (numeric/fitted 2 123.45))) => "##"
nl
Outputs a newline.
(show #f nl) => "\n"
fl
Short for "fresh line," outputs a newline only if we're not already at the start of a line.
(show #f fl) => ""
(show #f "hi" fl) => "hi\n"
(show #f "hi" nl fl) => "hi\n"
(space-to column)
Outputs spaces up to the given column. If the current column is already >= column, does nothing. The character used for spacing is the current value of pad-char, described below, which defaults to space. Columns are zero-based.
(show #f "a" (space-to 5) "b") => "a    b"
(show #f "a" (space-to 0) "b") => "ab"
(tab-to [tab-width])
Outputs spaces up to the next tab stop, using tab stops of width
tab-width, which defaults to 8. If already on a tab stop,
does nothing. If you want to ensure you always tab at least one
space, you can use (each " " (tab-to width)). Columns are
zero-based.
(show #f (tab-to 5) "b") => "b"
(show #f "a" (tab-to 5) "b") => "a    b"
(show #f "abcdefghi" (tab-to 5) "b") => "abcdefghi b"
nothing
Outputs nothing (useful in combinators and as a default noop in conditionals).
(show #f "a" nothing "b") => "ab"
(each fmt ...)
Applies each fmt in sequence, as in the top-level of show.
(show #f (each "a" "b")) => "ab"
(each-in-list list-of-fmts)
Equivalent to (apply each list-of-fmts) but may be more efficient.
(joined mapper list [sep])
Formats each element elt of list with (mapper elt), inserting sep
in between. sep defaults to the empty string, but can be any format
or string.
(show #f (joined displayed '(a b c) ", ")) => "a, b, c"
(joined/prefix mapper list [sep])
(joined/suffix mapper list [sep])
Like joined, but inserts sep before/after every element.
(show #f (joined/prefix displayed '(usr local bin) "/")) => "/usr/local/bin"
(show #f (joined/suffix displayed '(1 2 3) nl)) => "1\n2\n3\n"
(joined/last mapper last-mapper list [sep])
Like joined, but the last element of the list is formatted with
last-mapper instead.
(show #f (joined/last displayed (lambda (last) (each "and " last)) '(lions tigers bears) ", ")) => "lions, tigers, and bears"
(joined/dot mapper dot-mapper list [sep])
Like joined, but if the list is a dotted list, then formats the
dotted value with dot-mapper instead.
(show #f "(" (joined/dot displayed (lambda (dot) (each ". " dot)) '(1 2 . 3) " ") ")") => "(1 2 . 3)"
(joined/range mapper start [end sep])
Like joined, but counts from start (inclusive) to end
(exclusive), formatting each integer in the range with mapper. If
end is #f or unspecified, produces an infinite stream of output.
(show #f (joined/range displayed 0 5 " ")) => "0 1 2 3 4"
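The mapper argument of the joined family need not be displayed or written. A minimal sketch with a custom mapper, assuming the (srfi 166) library is available, follows. The key-value procedure is a hypothetical helper.

```scheme
(import (scheme base) (srfi 166))

;; Hypothetical mapper: formats an association pair as key=value,
;; with the value shown using write semantics.
(define (key-value pair)
  (each (car pair) "=" (written (cdr pair))))

(define fields
  (show #f (joined key-value '((name . "ann") (age . 30)) ", ")))
;; => "name=\"ann\", age=30"
```

Because the mapper returns a formatter, no intermediate string is built for each element.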
Output width is measured with the string-width state variable, and
trimming is then done by calling the value of the substring/width
state variable on the desired left and right widths. The default
values for these are string-length and substring, indicating the
widths are equivalent to string indexes. However, they may have
different semantics, as in terminal-aware discussed below, and
other extensions could be imagined, such as enforcing trimming on
word boundaries.
(padded width fmt ...)
(padded/right width fmt ...)
(padded/both width fmt ...)
Analogs of SRFI-13 string-pad, these add extra space to the
left, right or both sides of the output generated by the fmts
to pad it to width. If width is exceeded, they have no effect.
padded/both will include one more extra space on the right
side of the output if the difference is odd.
padded/right is guaranteed not to accumulate any intermediate
data.
The padding can be controlled with the pad-char state
variable described below, defaulting to space.
Note these are column-oriented padders, so won't necessarily work with multi-line output (padding doesn't seem a likely operation for multi-line output).
(show #f (padded 5 "abc")) => "  abc"
(show #f (padded/right 5 "abc")) => "abc  "
(show #f (padded/both 5 "abc")) => " abc "
(trimmed width fmt ...)
(trimmed/right width fmt ...)
(trimmed/both width fmt ...)
Analogs of SRFI-13 string-trim, these truncate the output of the
fmts to force it in under width columns. trimmed truncates on the
left, trimmed/right on the right, and trimmed/both truncates on both
the left and right, truncating 1 more on the right if the excess
width is odd.
If width is not exceeded, they are equivalent to each.
If a truncation ellipsis is set, then when any truncation occurs,
trimmed and trimmed/right will prepend and append the ellipsis,
respectively. trimmed/both will both prepend and append. The length
of the ellipsis will be considered when truncating the original
string, so that the total width will never be longer than width. It
is an error if width is less than the length of ellipsis, or double
the length for /both.
If the state variable substring/preserve is not #f, then this is
called on the left and/or right portions of the output which have
been excluded by trim, and the result of this is output in place of
the trimmed text. The default is #f (do nothing), but it can be
useful to override this to preserve control sequences which are
zero-width regardless but affect the state of the output stream. In
particular substring-terminal-preserve as enabled in terminal-aware
preserves ANSI control sequences and bidirectional overrides.
For example, consider
(show #f (with ((ellipsis "…")) (trimmed/both 5 "abcdef"))) => "…bcd…"
Here we note that the output width of 6 exceeds the requested width of 5 and trimming must be done. Since the ellipsis width is 1, we must split the trimming to remove one character from the left and two from the right. Thus, the output is computed as:
(let ((str "abcdef"))
  (each (substring/preserve (substring/width str -1 1))
        ellipsis
        (substring/width str 1 4)
        ellipsis
        (substring/preserve (substring/width str 4 6))))
Additional examples:
(show #f (trimmed 5 "abcde")) => "abcde"
(show #f (trimmed 5 "abcdef")) => "bcdef"
(show #f (trimmed/right 5 "abcdef")) => "abcde"
(show #f (trimmed/both 5 "abcdef")) => "abcde"
(show #f (trimmed/both 4 "abcdef")) => "bcde"
(show #f (with ((ellipsis "...")) (trimmed 5 "abcdef"))) => "...ef"
(show #f (with ((ellipsis "...")) (trimmed/right 5 "abcdef"))) => "ab..."
(show #f (with ((ellipsis "_")) (trimmed/both 5 "abcdef"))) => "_bcd_"
(show #f (trimmed/right 2 "日本語")) => "日本"
(show #f (terminal-aware (trimmed/right 2 "日本語"))) => "日"
(trimmed/lazy width fmt ...)
A variant of trimmed which generates each fmt in left-to-right
order, and truncates and terminates immediately if more than width
characters are generated. It does not output ellipsis.
Thus this is safe to use with an infinite amount of output,
e.g. from written-simply on an infinite list.
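A small sketch of bounding unbounded output, assuming the (srfi 166) library is available, follows. The infinite stream here comes from joined/range with an end of #f rather than from a cyclic list. The name prefix is illustrative.

```scheme
(import (scheme base) (srfi 166))

;; joined/range with an end of #f never terminates on its own.
;; trimmed/lazy stops generating output once 10 columns are exceeded.
(define prefix
  (show #f (trimmed/lazy 10 (joined/range displayed 0 #f " "))))

;; prefix is at most 10 characters long.
(show #t prefix nl)
```

Using trimmed or trimmed/right here instead would not terminate, since they are not described as stopping generation early.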
(fitted width fmt ...)
(fitted/right width fmt ...)
(fitted/both width fmt ...)
A combination of padded and trimmed, ensures the output
width is exactly width, truncating if it goes over and padding if
it goes under.
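Since fitted combines padded and trimmed, its results can be read off from the examples for those formatters. A minimal sketch, assuming the (srfi 166) library is available and the default pad-char of space and no ellipsis, follows. The definition names are illustrative.

```scheme
(import (scheme base) (srfi 166))

;; Short input is padded: on the left for fitted, on the right for
;; fitted/right, and on both sides for fitted/both.
(define short-left  (show #f (fitted 5 "abc")))        ; => "  abc"
(define short-right (show #f (fitted/right 5 "abc")))  ; => "abc  "
(define short-both  (show #f (fitted/both 5 "abc")))   ; => " abc "

;; Long input is truncated on the same side that would be padded.
(define long-left  (show #f (fitted 5 "abcdefgh")))        ; => "defgh"
(define long-right (show #f (fitted/right 5 "abcdefgh")))  ; => "abcde"
```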
(pretty obj)
Pretty-prints obj. The result should be identical to written
except possibly for differences in whitespace to make the output
resemble formatted source code. Implementations should print
vectors and data lists (lists that don't begin with a (nested)
symbol) in a tabular format when possible to reduce vertical space.
As with written, cyclic structure must be detected and
represented with datum labels.
(pretty-shared obj)
The same as pretty but using datum labels for shared structures
among all pairs and vectors, analogous to write-shared.
(pretty-simply obj)
The same as pretty, but without using any datum labels.
(pretty-with-color obj)
Equivalent to pretty, but may optionally include ANSI control
sequences (as in Formatting with Color below) to provide syntax
highlighting. In such a case, the raw output may not be directly
parseable with read.
The bindings in this section are provided by the (srfi 166 columnar) library.
Although tab-to, space-to and padding/trimming can be used to
manually align columns to produce table-like output, these can be
tedious to use. The optional extensions in this section make this
easier.
(columnar column ...)
Formats each column side-by-side, i.e. as though each were formatted separately and then the individual lines concatenated together. The current line width (from the width state variable) is divided evenly among the columns (setting their state variables accordingly), and all but the last column are right-padded. For example
(show #t (columnar (displayed "abc\ndef\n") (displayed "123\n456\n")))
outputs
abc     123
def     456
assuming a 16-char width (the left side gets half the width, or 8 spaces, and is left aligned). Note that we explicitly use
displayed instead of the strings directly. This is because
columnar treats raw strings as literals inserted into the
given location on every line, to be used as borders, for example:
(show #t (columnar "/* " (displayed "abc\ndef\n") " | " (displayed "123\n456\n") " */"))
would output
/* abc | 123 */
/* def | 456 */
Padding ensures alignment only under the assumption that no columns are wider than their allocated width. You can use wrapping or trimming to enforce the underlying width.
You may also prefix any column with any of the symbols 'left, 'right or 'center to control the justification. The symbol 'infinite can be used to indicate the column generates an infinite stream of output.
You can further prefix any column with a width modifier. Any
positive integer is treated as a fixed width, ignoring the available
width. Any real number between 0 and 1 indicates a fraction of the
available width (after subtracting out any fixed widths). Columns
with unspecified width divide up the remaining width evenly. If the
extra space does not divide evenly, it is allocated column-wise left
to right, e.g. if the width of 78 is divided among 5 columns, the
column widths become 16, 16, 16, 15, 15 in order. Note that if explicit
widths are used, columnar may not take up the full available width.
The value of the col state variable is reset to 0 at the start of each line for each formatter (i.e. they each format as though they were the only column).
Note that columnar builds its output incrementally,
interleaving calls to the column formatters until each has produced
a line, then concatenating that line together and outputting it.
When a formatter has been exhausted, it contributes empty lines
until all non-infinite columns are exhausted, at which point the
output is complete.
This is important because as noted above, some columns may produce
an infinite stream of output, and in general you may want to format
data larger than can fit into memory. Thus columnar would be
suitable for line numbering a file of arbitrary size, or
implementing the Unix yes(1) command, etc.
The degenerate case of no columns produces a single blank line.
(tabular column ...)
Equivalent to columnar except that each column is padded at
least to the minimum width required on any of its lines. Thus
(show #t (tabular "|" (each "a\nbc\ndef\n") "|" (each "123\n45\n6\n") "|"))
outputs
|a  |123|
|bc |45 |
|def|6  |
This makes it easier to generate tables without knowing widths in advance. However, because it requires generating the entire output in advance to determine the correct column widths, tabular cannot format a table larger than would fit in memory.
Note that since tabular computes explicit widths for all
columns, it will use the most compact width for unspecified columns
and not necessarily consume the full available width.
(wrapped fmt ...)
Behaves like each, except text is accumulated and lines are
wrapped to fit in the current width as in the Unix fmt(1)
command. Specifically, words are tokenized by splitting on all
characters which satisfy the predicate in the state variable
word-separator?, which defaults to char-whitespace?.
Words are grouped into lines, separated by a space, and line
breaks are introduced to minimize the sum of the cubes of the
trailing whitespace on every line while ensuring no line exceeds
width (as measured with the string-width state variable).
The last line is not appended with a newline, so that in the trivial case of a single line this is equivalent to each (but reducing whitespace).
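A minimal sketch of wrapping to a narrow width, assuming the (srfi 166) library is available, follows. The sample text and the longest-line helper are hypothetical and only serve to check the width guarantee described above. The exact line breaks chosen are not shown since they depend on the minimization.

```scheme
(import (scheme base) (srfi 166))

(define text
  "The quick brown fox jumps over the lazy dog near the river bank")

;; Wrap to 20 columns by overriding the width state variable.
(define wrapped-text
  (show #f (with ((width 20)) (wrapped text))))

;; Hypothetical helper: the length of the longest line in str.
(define (longest-line str)
  (let lp ((i 0) (cur 0) (best 0))
    (cond ((= i (string-length str)) (max cur best))
          ((eqv? (string-ref str i) #\newline)
           (lp (+ i 1) 0 (max cur best)))
          (else (lp (+ i 1) (+ cur 1) best)))))

(show #t wrapped-text nl)
(show #t "longest line: " (longest-line wrapped-text) nl)
```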
(wrapped/list list-of-strings)
Like wrapped, but taking a pre-tokenized list of strings.
(wrapped/char fmt ...)
Like wrapped, but splits simply on individual characters
as the current width is reached on each line. Thus there is
nothing to optimize and this formatter doesn't buffer output, as we
only need look ahead one character at a time to check its width.
(justified fmt ...)
Like wrapped except the lines are full-justified.
(define func
  '(define (fold kons knil ls)
     (let lp ((ls ls) (acc knil))
       (if (null? ls)
           acc
           (lp (cdr ls)
               (kons (car ls) acc))))))
(define doc
  (string-append
   "The fundamental list iterator. Applies KONS to each "
   "element of LS and the result of the previous application, "
   "beginning with KNIL. With KONS as CONS and KNIL as '(), "
   "equivalent to REVERSE."))
(show #t (columnar (pretty func) " ; " (justified doc)))
outputs
(define (fold kons knil ls)          ; The   fundamental   list   iterator.
  (let lp ((ls ls) (acc knil))       ; Applies  KONS  to  each  element  of
    (if (null? ls)                   ; LS  and  the  result of the previous
        acc                          ; application,  beginning  with  KNIL.
        (lp (cdr ls)                 ; With  KONS  as CONS and KNIL as '(),
            (kons (car ls) acc)))))  ; equivalent to REVERSE.
(from-file pathname)
Displays the contents of the file pathname one line at a time, so
that in typical formatters such as columnar only constant
memory is consumed, making this suitable for formatting files of
arbitrary size.
(line-numbers [start])
A convenience utility, just formats an infinite stream of numbers (in the current radix) beginning with start, which defaults to 1.
The Unix nl(1) utility could be implemented as:
(show #t (columnar 4 'right 'infinite (line-numbers) " " (from-file "read-line.scm")))
which might output:
   1 
   2 (define (read-line . o)
   3   (let ((port (if (pair? o) (car o) (current-input-port))))
   4     (let lp ((res '()))
   5       (let ((c (read-char port)))
   6         (if (or (eof-object? c) (eqv? c #\newline))
   7             (list->string (reverse res))
   8             (lp (cons c res)))))))
The bindings in this section are provided by the (srfi 166 color) library.
(as-red fmt ...)
(as-blue fmt ...)
(as-green fmt ...)
(as-cyan fmt ...)
(as-yellow fmt ...)
(as-magenta fmt ...)
(as-white fmt ...)
(as-black fmt ...)
(as-bold fmt ...)
(as-italic fmt ...)
(as-underline fmt ...)
Outputs the formatters fmt ... colored (or bold, italicized or underlined) with ANSI control sequences, for use when formatting to a terminal.
(on-red fmt ...)
(on-blue fmt ...)
(on-green fmt ...)
(on-cyan fmt ...)
(on-yellow fmt ...)
(on-magenta fmt ...)
(on-white fmt ...)
(on-black fmt ...)
Outputs the formatters fmt ... with ANSI control sequences to set the background color, for use when formatting to a terminal.
(as-color red green blue fmt ...)
Each of red, green, blue should be an exact integer in the range
[0, 5], representing the corresponding components of an RGB color
model. Outputs the formatters colored accordingly using 8-bit color
ANSI control sequences.
(as-true-color red green blue fmt ...)
The 24-bit True Color equivalent of as-color, taking a range of
[0, 255] for each component, for use with terminals supporting this.
(on-color red green blue fmt ...)
The equivalent of as-color, setting the background color.
(on-true-color red green blue fmt ...)
The equivalent of as-true-color, setting the background color.
It is up to the caller to ensure that the terminals support these ANSI control sequences.
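The color formatters nest like any other formatters. A minimal sketch, assuming the (srfi 166) library is available, follows. The error-line procedure is a hypothetical helper. The exact control sequences emitted are not shown here, as they are not spelled out in this section.

```scheme
(import (scheme base) (srfi 166))

;; Hypothetical helper: a bold red "error:" label followed by an
;; uncolored message and a newline.
(define (error-line msg)
  (each (as-bold (as-red "error:")) " " msg nl))

;; The rendered string contains ANSI control sequences around the
;; label, so it is longer than the visible text.
(define rendered (show #f (error-line "file not found")))

;; Writing it to a terminal that supports the sequences shows color.
(show #t (error-line "file not found"))
```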
The bindings in this section are provided by the (srfi 166 unicode) library.
(terminal-aware fmt ...)
Equivalent to
(fn (ambiguous-is-wide?)
  (with ((string-width (if ambiguous-is-wide?
                           string-terminal-width/wide
                           string-terminal-width))
         (substring/width (if ambiguous-is-wide?
                              substring-terminal-width/wide
                              substring-terminal-width))
         (substring/preserve substring-terminal-preserve))
    fmt ...))
Padding, trimming and tabbing, etc. will generally not do the right thing in the presence of zero-width and double-width Unicode characters, or ANSI control sequences. This formatter overrides the string-width and substring/width state variables used in column tracking to do the right thing in such cases, considering Unicode double or full width characters as 2 characters wide (as they typically are in fixed-width terminals), while treating combining and non-spacing characters as 0 characters wide.
;; 3 characters padded to 5
(show #f (with ((pad-char #\〜)) (padded/both 5 "日本語"))) => "〜日本語〜"
;; the 3 characters have a terminal width of 6 so are not padded
(show #f (terminal-aware (with ((pad-char #\〜)) (padded/both 5 "日本語")))) => "日本語"
(string-terminal-width str)
A utility function which returns the integer number of columns str would require in a terminal, according to the following rules:
Implementations should support the properties from at least the current Unicode specification at the time of writing of this SRFI, 12.0.0.
(string-terminal-width/wide str [start end])
Equivalent to string-terminal-width, except that ambiguous
characters are counted as 2 columns, as they are in certain
Japanese environments (notably kterm, fonts such as MS Gothic and
the system fonts of many Japanese feature phones).
(substring-terminal-width str from to)
Returns the substring of str, starting from the first
index where the total string width exceeds from width,
inclusive, to the first index exceeding to width, exclusive,
using the notion of width as defined in string-terminal-width.
Note this naturally groups grapheme clusters together by excluding
leading modifiers and including trailing modifiers. If str includes
only single-width characters, this definition is equivalent to
substring. A negative from can be used to effectively include all
leading zero-width characters.
(substring-terminal-width "ａｂｃ" 0 6) => "ａｂｃ"
(substring-terminal-width "ａｂｃ" 0 4) => "ａｂ"
(substring-terminal-width "ａｂｃ" 2 6) => "ｂｃ"
(substring-terminal-width "ａｂｃ" 1 4) => "ａｂ"
(substring-terminal-width "ａｂｃ" 1 5) => "ａｂ"
(substring-terminal-width "ａｂｃ" 2 4) => "ｂ"
(substring-terminal-width "ａｂｃ" 2 3) => ""
(substring-terminal-width "ａｂｃ" -1 2) => "ａ"
(substring-terminal-width/wide str from to)
Equivalent to substring-terminal-width, except that
ambiguous characters are counted as 2 columns.
(substring-terminal-preserve str)
Returns only the substring sequences of str which would have non-local implications for rendering the text in a terminal. Specifically, preserves ANSI color control sequences, as well as the directional formatting characters described in the Unicode Bidirectional Algorithm.
(upcased fmt ...)
(downcased fmt ...)
Runs the formatters fmt ..., but with all output translated as if
first passed to string-upcase or string-downcase, respectively.
Note these should also work correctly when combined with the ANSI
control sequences from formatting with color, which include ASCII
letters in the control sequences.
Note there should be no internal buffering, which may have an effect on context-sensitive casing. For example, if an implementation correctly supports "ς" as the proper downcased form of a Greek sigma "Σ" at the end of a word, it may assume that the end of a string is in fact the end of a word, even if later succeeded by a word constituent character.
(show #f (upcased "abc")) => "ABC"
(show #f (downcased "ΜΈΛΟΣ")) => "μέλος"
(show #f (downcased "ΜΈΛΟΣ" "Μ")) => unspecified
Formatters up to this point have been simple accumulators of output,
with no control flow or handling of state. Both of these are
provided by fn and with, for getting and setting
state, respectively.
(fn ((id state-var) ...) expr ... fmt)
Short for "function," this is the analog to lambda. Returns a
formatter which on application evaluates each expr and
fmt in left-to-right order, in a lexical
environment extended with each identifier id bound to the current
value of the state variable evaluated by state-var. The
result of the fmt is then applied as a formatter.
As a convenience, any (id state-var) list may be abbreviated
as simply id, indicating id is bound to the state
variable of the same identifier. Note this would then shadow the state
variable in any nested functions.
(show #f "column: " (fn (col) col)) => "column: 8"
(show #f "column: " (fn ((col1 col)) (each col1 ", " (fn ((col2 col)) col2)))) => "column: 8, 11"
The trivial case of no state variables is often useful to allow for lazy applications of formatters, needed for conditional formatting and loops. For example:
(show #t (let lp ((ls ls))
           (if (pair? ls)
               (each (car ls) (lp (cdr ls)))
               nothing)))
would eagerly create a formatter concatenating every element of ls before starting to accumulate any output, whereas
(show #t (let lp ((ls ls))
           (if (pair? ls)
               (each (car ls) (fn () (lp (cdr ls))))
               nothing)))
would lazily apply the formatters one at a time.
(with ((state-var value) ...) fmt ...)
Conceptually the formatting equivalent of parameterize,
temporarily altering state variables. Applies each of the
formatters fmt with each state-var bound to the corresponding
value. The resulting state is then updated to restore each
state-var to its original value.
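As a minimal sketch of the restore behavior, assuming an implementation that provides this SRFI under the library name (srfi 166), the following changes pad-char for the first padded formatter only; the second one sees the default value again.

```scheme
(import (scheme base) (srfi 166))

;; pad-char is #\* only inside the `with` form; afterwards the
;; previous value (a space by default) is back in effect.
(define result
  (show #f
        (with ((pad-char #\*)) (padded 5 "ab"))
        (padded 5 "ab")))
;; result => "***ab   ab"
```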
(with! (state-var value) ...)
Similar to with but does not restore the original values,
changing the value of each state-var for any remaining formatters
in a sequence.
As the current formatting state can be captured or reentered with
continuations, with! should be used with caution, and
may produce unexpected output in some cases.
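For contrast with the scoped form, here is a small sketch (again assuming the library is available as (srfi 166)) in which with! changes pad-char partway through a sequence and the change persists for the formatters that follow.

```scheme
(import (scheme base) (srfi 166))

;; Before with! the default pad-char (a space) applies; after it,
;; every remaining formatter in the sequence pads with #\-.
(define result
  (show #f
        (padded 4 "a")
        (with! (pad-char #\-))
        (padded 4 "b")
        (padded 4 "c")))
;; result => "   a---b---c"
```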
(forked fmt1 fmt2)
Calls fmt1 on (a conceptual copy of) the current state, then fmt2 on the same original state as though fmt1 had not been called.
(call-with-output formatter mapper)
A utility which calls formatter on a copy of the current state (as with
forked), accumulating the results into a string. Then calls
the formatter resulting from (mapper result-string)
on the original state.
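The following sketch illustrates one possible use of call-with-output: capturing a formatter's output in order to decorate it. The helper name bracketed-with-length is hypothetical, and the (srfi 166) library name is an assumption about the implementation.

```scheme
(import (scheme base) (srfi 166))

;; The first formatter is rendered to a string on a copy of the
;; state; the mapper then builds a new formatter from that string.
(define (bracketed-with-length fmt)   ; hypothetical helper name
  (call-with-output
   fmt
   (lambda (str)
     (each "[" str "] is " (string-length str) " chars"))))

(define result (show #f (bracketed-with-length (each "abc" 123))))
;; result => "[abc123] is 6 chars"
```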
(make-state-variable name default [immutable])
Returns a new state variable suitable for use in
fn and with, etc. The name
should be a string and is strictly for debugging purposes.
default is the default value when referenced in fn
if the value has not been set. If immutable is true,
the state variable can only be dynamically bound with with,
and not set with with!.
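As an illustration, a user-defined state variable might drive a simple indentation formatter. The names indent-level and indented below are hypothetical and not part of this SRFI; the sketch assumes the library is available as (srfi 166).

```scheme
(import (scheme base) (srfi 166))

;; indent-level and indented are hypothetical names for illustration.
(define indent-level (make-state-variable "indent-level" 0))

;; Reads the current indent-level and emits two spaces per level
;; before the wrapped formatter.
(define (indented fmt)
  (fn ((n indent-level))
    (each (make-string (* 2 n) #\space) fmt)))

(define result
  (show #f
        (indented "a") nl
        (with ((indent-level 2)) (indented "b"))))
;; result => "a\n    b"
```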
The following state variables have predefined meanings with the
formatters in this SRFI. These are all exported by the
(srfi 166 base) library.
port
The textual port output is written to. This can be overridden to
capture intermediate output. If any output is made to port by
parallel computations and/or side-effecting Scheme procedures
during the dynamic extent of a call to show, then
the values of row and col are unspecified.
row
The current row of output, starting at 0 regardless of what may previously have been written to port.
col
The current column of output, used for padding and spacing, etc., starting at 0 regardless of what may previously have been written to port.
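A short sketch of how row and col evolve, assuming the (srfi 166) library name: both are read with fn after a newline and three further characters have been output.

```scheme
(import (scheme base) (srfi 166))

;; After "ab", a newline, and "cde", the formatter is on the second
;; row (index 1) with three characters already on that row.
(define result
  (show #f "ab" nl "cde"
        (fn (row col) (each " row=" row " col=" col))))
;; result => "ab\ncde row=1 col=3"
```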
width
The current line width, used for columnar, wrapping and pretty-printing. The default is implementation-defined.
output
The underlying standard formatter for writing a single string. The
default value outputs the string to port while tracking
the current row and col. This can be
overridden both to capture intermediate output and perform
transformations on strings before outputting, but should generally
wrap the existing output to preserve expected behavior.
You should not write to port except via output.
The default output procedure is exported as output-default.
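The sketch below shows one way to wrap the existing output procedure so that strings are transformed before being written, while row and col tracking is still handled by the original. The formatter name redacted is hypothetical, and the example assumes that the value of output is a procedure mapping a string to a formatter, as described above, and that the library is available as (srfi 166).

```scheme
(import (scheme base) (scheme char) (srfi 166))

;; redacted is a hypothetical formatter: it replaces every digit
;; with #\# and then delegates to whatever output was in effect.
(define (redacted . fmts)
  (fn ((orig output))
    (with ((output (lambda (str)
                     (orig (string-map
                            (lambda (c) (if (char-numeric? c) #\# c))
                            str)))))
      (each-in-list fmts))))

(define result (show #f (redacted "card 1234-5678") " ok 42"))
;; result => "card ####-#### ok 42"
```

Because the override is installed with with, text formatted outside of redacted is left untouched.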
writer
The mapper for automatic formatting of non-string/char values in
top-level show, each and other formatters.
The default value is implementation-defined, but should format in
sexp notation. One could override this to format other programming
languages.
string-width
A procedure taking up to three args: (string [start end]),
where start and end are optional, defaulting to 0 and
(string-length string) respectively. Returns the
length in columns of that string within the given range. The
default treats each character as a width of 1, returning (- end start).
substring/width
A procedure taking three args: (string from to),
which returns the substring of string whose width is
between from and to. The default value is
substring, where from and to
correspond directly to indexes. This should generally be updated
in conjunction with string-width.
substring/preserve
A procedure taking three args with the same semantics as
substring/width: (string from to), which
returns any control characters or sequences which have non-local
implications and thus should not be removed by a
trimmed operation. The default value is
#f, indicating nothing needs to be preserved, but can
be overridden as in substring-terminal-preserve.
pad-char
The character used by space-to, tab-to and other padding
formatters.
(define (print-table-of-contents alist)
  (define (print-line x)
    (each (car x) (space-to 72) (padded 3 (cdr x))))
  (show #t (with ((pad-char #\.))
             (joined/suffix print-line alist nl))))

(print-table-of-contents
 '(("An Unexpected Party" . 29)
   ("Roast Mutton" . 60)
   ("A Short Rest" . 87)
   ("Over Hill and Under Hill" . 100)
   ("Riddles in the Dark" . 115)))
would output
An Unexpected Party.....................................................29
Roast Mutton............................................................60
A Short Rest............................................................87
Over Hill and Under Hill...............................................100
Riddles in the Dark....................................................115
ellipsis
The string used when truncating as described in trimmed,
defaulting to the empty string.
radix
The radix for numeric output, defaulting to 10, as used in
numeric and written.
precision
The precision for numeric output, as described in
numeric and written. The precision
specifies the number of digits written after the decimal point. If
the numeric value to be written out requires more digits to
represent it than precision, the written representation is chosen
which is closest to the numeric value and representable with the
specified precision. If the numeric value falls on the midpoint of
two such representations, it is implementation-dependent which
representation is chosen.
When the numeric value is an inexact floating-point number, there is
more than one interpretation of this "rounding". One is to take
the effective value the floating-point number represents (e.g. if we
use binary floating-point numbers, we take the value of
(* sign mantissa (expt 2 exponent))), and
compare it to the two closest numeric representations of the given
precision. Another way is to obtain the default notation of the
floating-point number and apply rounding to it. The former (we call
it effective rounding) is consistent with most floating-point number
operations, but may lead to a less intuitive result than the latter (we
call it notational rounding). For example, 5.015 can't be represented
exactly in binary floating-point numbers. With IEEE 754 floating-point
numbers, the floating-point number closest to 5.015 is smaller than
exact 5.015, i.e. (< 5.015 5015/1000) => #t. With
effective rounding with precision 2, it should result in "5.01".
However, users who look at the notation may be confused by "5.015"
not being rounded up as they usually expect. With notational rounding
the implementation chooses "5.02" (if it also adopts the
round-half-to-infinity or round-half-up rule). It is up to the
implementation to choose which interpretation to adopt.
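The two interpretations can be sketched in portable R7RS with exact arithmetic. This is only an illustration, not how any particular implementation of this SRFI rounds. The procedure names are hypothetical, and the sketch assumes IEEE 754 doubles, exact rational support, that number->string prints 5.015 as "5.015", and that string->number accepts the #e prefix on decimal notation.

```scheme
(import (scheme base))

;; Effective rounding: take the exact binary value of the flonum,
;; scale it, and round to the nearest integer.
(define (effective-round x precision)
  (let ((scale (expt 10 precision)))
    (/ (round (* (exact x) scale)) scale)))

;; Notational rounding: read the printed decimal notation as an
;; exact number, then round half up (valid for non-negative x).
(define (notational-round x precision)
  (let* ((scale (expt 10 precision))
         (notation (string->number
                    (string-append "#e" (number->string x)))))
    (/ (floor (+ (* notation scale) 1/2)) scale)))

(define effective (effective-round 5.015 2))   ; => 501/100, i.e. 5.01
(define notational (notational-round 5.015 2)) ; => 251/50, i.e. 5.02
```

The difference arises because (exact 5.015) is slightly below 5015/1000, so the scaled value falls just under the midpoint 501.5, whereas the printed notation sits exactly on it.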
decimal-sep
The decimal separator for floating point output, default ".".
decimal-align
Specifies an alignment for the decimal place when formatting numbers, useful for outputting tables of numbers.
(define (print-angles x)
  (joined numeric (list x (sin x) (cos x) (tan x)) " "))

(show #t (with ((decimal-align 5) (precision 3))
           (joined/suffix print-angles (iota 5) nl)))
would output
   0.000    0.000    1.000    0.000
   1.000    0.842    0.540    1.557
   2.000    0.909   -0.416   -2.185
   3.000    0.141   -0.990   -0.142
   4.000   -0.757   -0.654    1.158
sign-rule
comma-rule
comma-sep
Additional vars used for formatting as described in formatting numbers.
word-separator?
A character predicate used to tokenize words for
wrapped and justify. Defaults to
char-whitespace?. More flexibility is
available with wrapped/list.
ambiguous-is-wide?
Used to choose between string-terminal-width and
string-terminal-width/wide when formatting with
terminal-aware. The default value is implementation-defined.
A reasonable approach might be to check if the TERM
environment variable is kterm when writing to a terminal.
One could also check if the recipient of an email were using
a .jp email address; however, frequent users of the ambiguous
characters in such environments are likely to have changed
their fonts.
pretty-environment
An environment which may optionally be used for hints in the pretty
printing formatters, defaulting to (interaction-environment).
A sample implementation in portable R7RS will be available at https://github.com/ashinn/chibi-scheme/blob/master/lib/srfi/166.sld, and included files, depending on SRFI 1, 69, 117, 130, 165. It is mostly the same as in SRFI 159. Two alternative implementations are also available, one by Adam Nelson at https://github.com/ar-nelson/schemepunk/tree/show, and one by Marc Nieper-Wißkirchen at https://gitlab.com/nieper/show.
Note columnar and trimmed/lazy
rely on first-class continuations; however, an implementation written
in CPS-style would not require this.
The author would like to thank everyone who provided feedback for SRFI 159 and SRFI 166, in particular Marc Nieper-Wißkirchen for detailed feedback and bug reports and work on an alternate implementation, Adam Nelson for his implementation, Jim Rees who provided many bug fixes early on, John Cowan for early editorial comments, and Arthur Gleckler for his hard work and super-human patience in waiting for me to get my act together.
Alex Shinn, John Cowan, Arthur Gleckler, Revised^7 Report on the Algorithmic Language Scheme https://small.r7rs.org/attachment/r7rs.pdf
Guy L. Steele Jr., Common Lisp Hyperspec http://www.lispworks.com/documentation/HyperSpec/Front/
Scott G. Miller, SRFI 28 - Basic Format Strings https://srfi.schemers.org/srfi-28/
Ken Dickey, SRFI 48 - Intermediate Format Strings https://srfi.schemers.org/srfi-48/
Alex Shinn, SRFI 159 - Combinator Formatting https://srfi.schemers.org/srfi-159/
C++ iomanip https://www.cplusplus.com/reference/iomanip/
Damian Conway, Perl6 Exegesis 7 - formatting https://www.perl.com/pub/2004/02/27/exegesis7.html/
Alex Shinn, fmt - Combinator Formatting http://synthcode.com/scheme/fmt/
Mark Davis et al., Unicode® Standard Annex #9 - Unicode Bidirectional Algorithm https://www.unicode.org/reports/tr9/
Ken Lunde, Unicode® Standard Annex #11 - East Asian Width https://www.unicode.org/reports/tr11/
Copyright (C) Alex Shinn 2020. All Rights Reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.