267: Raw String Syntax

by Peter McGoron

Status

This SRFI is currently in draft status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-267@srfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

Raw strings are a lexical syntax for strings that do not interpret escapes inside of them and are useful in cases where the string data has a lot of characters such as \ or " that would otherwise have to be escaped. This SRFI proposes a raw string syntax that allows for a customized delimiter to enclose the character data. Importantly, for any string, there exists a delimiter such that the raw string using that delimiter can represent the string verbatim. The raw strings in this SRFI do not do any special whitespace handling.

Rationale

Many programming languages have raw string syntax: to name a few, Rust, C++, Python, Go, C#, and Zig. (For a more complete list of languages referenced while writing this proposal, see this wiki page.) Scheme implementations with raw strings include:

There is interest in adding standard raw strings to Scheme. Daphne Preston-Kendal proposed a syntax for strings similar to raw strings using #" in another document. The matter of raw string syntax in R7RS-Large was discussed in a WG2 meeting on November 21st, 2025, with no consensus on the features or syntax. Raw strings have also been discussed on the issue tracker for the R7RS-Large process.

This SRFI proposes the use of raw strings based on C++'s syntax. C++'s raw string syntax uses customized delimiters, where the start and end of the string are user-specified. In this SRFI, raw strings start with the sequence #"X" for any sequence of characters X that does not contain ", and are terminated by "X" for the same X.

The C++-style raw string syntax was chosen because:

This SRFI does not handle leading, trailing, or indentation whitespace in any special way: they are all preserved. This is the least surprising option, and further string processing can be done by the programmer. This is in contrast to some languages, like C#, that have special whitespace handling for raw strings.

This proposal does not include any support for interpolation. (String interpolation in Scheme is the subject of SRFI 109.) Some “raw” string syntaxes allow for interpolation, or have different types of escape sequences. Interpolation makes string processing much more complicated and is not extensible, while also not making the strings truly “raw.” Interpolation of strings in Scheme is better accomplished by macros that can inspect strings, such as syntax-case macros.

This SRFI also includes procedures to read and write raw strings to ports, which are useful when writing human-readable Scheme data, and when the raw string syntax is used in a non-Scheme data context.

Equivalent syntax for string-notated bytevectors in SRFI 207 or for vertical-bar identifiers is not included.

Specification

All text written in small text is non-normative.

Syntax

A raw string representing a string S is a sequence of characters #"X"S"X", where X is some sequence of characters not containing ", and the sequence of characters "X" appears exactly twice in the sequence: right after the first # and as a suffix of the sequence. It cannot appear anywhere else in the sequence.

The grammar of raw strings is not context-free and therefore cannot be described in the context-free EBNF used in the Reports. The following grammar informally describes the creation of a raw string literal for any valid delimiter X:

⟨raw string (X)⟩ ⩴ #" X " ⟨raw string internal (X)⟩ " X "

⟨raw string internal (X)⟩ ⩴ Any sequence of zero or more characters that does not contain (" X ") as a contiguous subsequence, and does not contain (" X) as its suffix

⟨valid delimiter⟩ ⩴ Any sequence of zero or more characters not including "

⟨raw string⟩ ⩴ every string produced from ⟨raw string (X)⟩ for each X produced from ⟨valid delimiter⟩

If the suffix clause of ⟨raw string internal (X)⟩ were omitted, then #""""" (where the encoded string is ") would be a valid production, which is ambiguous for a parser with limited lookahead: it cannot tell whether the whole sequence is one raw string, or the raw string #"""" followed by the start of a new string. The suffix clause forces the second interpretation. The string " therefore has to be encoded with a nonempty delimiter, as in #"-"""-".

Although valid delimiters look like strings when used, they do not interpret escape sequences inside of them.

The formal grammar of Scheme is modified so that the ⟨string⟩ production becomes

⟨string⟩ ⩴ " ⟨string element⟩* " | ⟨raw string⟩

When a raw string is read, the result is a string whose contents are the characters that make up ⟨raw string internal (X)⟩.

Per the grammar, raw strings are allowed wherever a regular Scheme string is allowed. For example, raw strings are allowed in the include and include-library-declarations forms described in the R7RS.

Procedures

These procedures are exported from (srfi 267) in implementations of the R7RS. The SRFI 261 and SRFI 97 library name for this library is raw-strings.

It is an error to call:

Brackets denote optional arguments.

(raw-string-read-error? obj) — procedure

Predicate. See read-raw-string.

(raw-string-write-error? obj) — procedure

Predicate. See write-raw-string.

(read-raw-string [input-port]) — procedure

Reads a raw string from the port and returns it as a string. The port defaults to (current-input-port). If there is no raw string at that input position, an error satisfying raw-string-read-error? is raised.

If an error is raised, the position of the port may be unknown.

(read-raw-string-after-prefix [input-port]) — procedure

Like read-raw-string, except that the reading starts from after #".
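
As a non-normative illustration, the scan that such a reader performs can be sketched in portable R7RS. The names sketch-read-after-prefix, read-delimiter, and list-prefix? are hypothetical and are not part of this SRFI, and the sketch signals a generic error rather than a condition satisfying raw-string-read-error?. It reads the delimiter X up to the next ", then consumes characters until the text read so far first ends with "X".

```scheme
(import (scheme base))

;; Hypothetical helper: does the list LST begin with the elements of PREFIX?
(define (list-prefix? prefix lst)
  (cond ((null? prefix) #t)
        ((null? lst) #f)
        ((char=? (car prefix) (car lst))
         (list-prefix? (cdr prefix) (cdr lst)))
        (else #f)))

;; Hypothetical helper: read the delimiter X, which is every character
;; up to (but not including) the next double quote.
(define (read-delimiter port)
  (let loop ((chars '()))
    (let ((c (read-char port)))
      (cond ((eof-object? c)
             (error "end of input inside raw string delimiter"))
            ((char=? c #\") (list->string (reverse chars)))
            (else (loop (cons c chars)))))))

;; Hypothetical stand-in for read-raw-string-after-prefix.
;; Assumes the two characters #" have already been consumed from PORT.
(define (sketch-read-after-prefix port)
  (let* ((delimiter (read-delimiter port))
         (terminator (string-append "\"" delimiter "\""))
         (rev-term (reverse (string->list terminator)))
         (term-len (string-length terminator)))
    ;; REV-SEEN holds the characters read so far, most recent first.
    (let loop ((rev-seen '()))
      (if (list-prefix? rev-term rev-seen)
          ;; The text read so far ends with "X": drop it and return the rest.
          (list->string (reverse (list-tail rev-seen term-len)))
          (let ((c (read-char port)))
            (if (eof-object? c)
                (error "end of input inside raw string")
                (loop (cons c rev-seen))))))))
```

Because the scan stops at the first occurrence of the terminator, reading the four characters that follow the prefix in #""""" returns the empty string and leaves one " unread, which is the interpretation forced by the suffix clause of the grammar.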

(can-delimit? string1 string2) — procedure

Returns true if string2 would be produced from ⟨valid delimiter⟩ and if string1 would be produced from ⟨raw string internal (string2)⟩.
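
The two clauses of the grammar translate directly into a predicate. The following non-normative sketch assumes only R7RS (scheme base); the names sketch-can-delimit?, contains-substring?, and ends-with? are hypothetical and are not exported by this SRFI.

```scheme
(import (scheme base))

;; Hypothetical helper: does TEXT contain PATTERN as a contiguous substring?
(define (contains-substring? text pattern)
  (let ((n (string-length text))
        (m (string-length pattern)))
    (let loop ((i 0))
      (cond ((> (+ i m) n) #f)
            ((string=? (substring text i (+ i m)) pattern) #t)
            (else (loop (+ i 1)))))))

;; Hypothetical helper: does TEXT end with SUFFIX?
(define (ends-with? text suffix)
  (let ((n (string-length text))
        (m (string-length suffix)))
    (and (>= n m)
         (string=? (substring text (- n m) n) suffix))))

;; Hypothetical stand-in for can-delimit?.
(define (sketch-can-delimit? content delimiter)
  (let ((quote-x (string-append "\"" delimiter)))   ; the sequence "X
    (and
     ;; valid delimiter: no double quote inside X
     (not (contains-substring? delimiter "\""))
     ;; raw string internal: "X" never occurs inside the content
     (not (contains-substring? content (string-append quote-x "\"")))
     ;; raw string internal: the content does not end with "X
     (not (ends-with? content quote-x)))))
```

For instance, (sketch-can-delimit? "\"" "") is false because the content ends with ", while (sketch-can-delimit? "\"" "-") is true.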

(generate-delimiter string) — procedure

Returns a delimiter for which (can-delimit? string (generate-delimiter string)) is true.

Users should note that this procedure may have bad space or time behavior on pathological inputs, depending on the implementation strategy.
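
One simple strategy that runs in time linear in the length of the string is sketched below. It is an illustration only and is not the implementation in this SRFI's repository; the names sketch-generate-delimiter, longest-run, and has-char? are hypothetical. If the string contains no ", the empty delimiter works. Otherwise the sketch uses a run of - that is one longer than the longest run of - in the string, so neither "X" nor "X can occur in it.

```scheme
(import (scheme base))

;; Hypothetical helper: length of the longest run of consecutive
;; FILL characters in TEXT.
(define (longest-run text fill)
  (let loop ((i 0) (current 0) (best 0))
    (cond ((= i (string-length text)) best)
          ((char=? (string-ref text i) fill)
           (loop (+ i 1) (+ current 1) (max best (+ current 1))))
          (else (loop (+ i 1) 0 best)))))

;; Hypothetical helper: does TEXT contain the character CH?
(define (has-char? text ch)
  (let loop ((i 0))
    (cond ((= i (string-length text)) #f)
          ((char=? (string-ref text i) ch) #t)
          (else (loop (+ i 1))))))

;; Hypothetical stand-in for generate-delimiter.
(define (sketch-generate-delimiter content)
  (if (has-char? content #\")
      ;; A run of dashes longer than any run in CONTENT cannot occur in it.
      (make-string (+ 1 (longest-run content #\-)) #\-)
      ;; Without a double quote, neither "" nor a trailing " can occur.
      ""))
```

The result is not always the shortest possible delimiter, which this SRFI does not require.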

(write-raw-string string1 string2 [output-port]) — procedure

Writes string1 as a raw string to the port. The port defaults to (current-output-port). The string2 will be used as the delimiter for the raw string. If (not (can-delimit? string1 string2)) is true, then an error satisfying raw-string-write-error? is raised.

This procedure returns an unspecified value.

Without the error check, this is equivalent to (write-string (string-append "#\"" string2 "\"" string1 "\"" string2 "\"") output-port).
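
A non-normative sketch of the whole procedure, including the check, follows. The names sketch-write-raw-string, delimiter-fits?, and contains-substring? are hypothetical, the port argument is required here for brevity, and a generic error is signalled instead of one satisfying raw-string-write-error?. The check relies on the observation that both clauses of ⟨raw string internal (X)⟩ hold exactly when "X" does not occur in the content followed by a single ".

```scheme
(import (scheme base))

;; Hypothetical helper: does TEXT contain PATTERN as a contiguous substring?
(define (contains-substring? text pattern)
  (let ((n (string-length text))
        (m (string-length pattern)))
    (let loop ((i 0))
      (cond ((> (+ i m) n) #f)
            ((string=? (substring text i (+ i m)) pattern) #t)
            (else (loop (+ i 1)))))))

;; Hypothetical check: X contains no double quote, and "X" does not
;; occur in CONTENT followed by one double quote. Appending the quote
;; also catches content that ends with "X.
(define (delimiter-fits? content delimiter)
  (and (not (contains-substring? delimiter "\""))
       (not (contains-substring? (string-append content "\"")
                                 (string-append "\"" delimiter "\"")))))

;; Hypothetical stand-in for write-raw-string with a mandatory port.
(define (sketch-write-raw-string content delimiter port)
  (unless (delimiter-fits? content delimiter)
    (error "delimiter cannot enclose this string" content delimiter))
  (write-string (string-append "#\"" delimiter "\""
                               content
                               "\"" delimiter "\"")
                port))
```

Writing the one-character string " with the delimiter - to a string port produces #"-"""-", matching the encoding given in the Syntax section.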

Examples

This section is non-normative.
#"""" ; → ""
#""\begin{document}"" ; → "\\begin{document}"
#"--")")")"-""--" ; → ")\")\")\"-\""
#""a"" ; → "a"
#""\"" ; → "\\"
#"-"""-" ; → "\""
#"-" " "-" ; → " \" "
#"-"#""a"""-" ; → "#\"\"a\"\""
#"-"ends with \""-" ; → "ends with \\\""
#""multiline
string"" ; → "multiline\nstring"
#""
    no whitespace stripping"" ; → "\n    no whitespace stripping"
#""{"first_name" : "John",
"last_name" : "Doe"}"" ; → "{\"first_name\" : \"John\",\n\"last_name\" : \"Doe\"}"
#""\(?(\d{3})\D{0,3}(\d{3})\D{0,3}(\d{4})""
  ; → "\\(?(\\d{3})\\D{0,3}(\\d{3})\\D{0,3}(\\d{4})"
  ; Example from SRFI 264

The following example shows how a raw string can be used to embed other syntaxes as a string. The syntax inside of the string is SRFI 119 wisp syntax.

#"wisp-EOS"
define : hello-name name
  string-append "Hello," name "!"
"wisp-EOS"
; ⇒ "\ndefine : hello-name name\n  string-append \"Hello,\" name \"!\"\n"

One use of raw strings is in “docstring” documentation. The following example is from Daphne Preston-Kendal:

(define (parse-url url-string)
  #""Given a URL as a string, returns a Parsed-URL record with the
components of that URL.

(parse-url "https://example.org/~smith/?record")
=> #<Parsed-URL protocol: "https" domain: "example.org"
                path: "/~smith/" query: "?record">""
  ...)

Implementation

A portable implementation of the lexical syntax is impossible in general, although some implementations, such as CHICKEN, Guile, Racket, and Sagittarius, allow the reader to be modified. A portable implementation of the procedures is possible.

The repository of this SRFI has an R7RS implementation of the procedures (requires SRFI 113 and SRFI 217), and some code to make the raw string syntax work in CHICKEN 6. It also contains a patch for the Chez 10.3.0 reader that implements the reader syntax portion of this SRFI.

Acknowledgments

The original C++ proposal was written by Beman Dawes. The last C++ proposal was written by Beman Dawes and Lawrence Crowl. The raw string syntax was ratified in C++11.

The choice of #" was copied from Daphne Preston-Kendal. John Cowan suggested double quotes in place of balanced parentheses for raw strings and also gave some editorial advice.

Thanks to the members of WG2 and others who discussed the issue. The idea of adding procedures for reading and writing raw strings came from WG2 meetings.

Thanks to Alex Shinn for writing the bulk of generate-delimiter.

Copyright

© 2025–2026 Peter McGoron, 2026 Alex Shinn (portions of the implementation)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler