Title

SRFI 168: Generic Tuple Store Database

Author

Amirouche Boubekki

Status

This SRFI is currently in final status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-168@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

This library is a generic approach to the database abstractions known as triplestore and quadstore. Generic Tuple Store Database implements n-tuple ordered sets and associated primitives for working with them in the context of data management.

Rationale

The industry standard for durable data storage, namely the Relational Database Management System (RDBMS), does not blend nicely into Scheme. In particular, the SQL programming language is very difficult to embed in Scheme without fallback to string interpolations. This could explain the poor support of RDBMS in Scheme implementations.

This SRFI proposes a powerful database abstraction for storing and querying data that integrates well with the existing Scheme landscape by reusing SRFI 9, SRFI 128, SRFI 146, and SRFI 158, and relying on SRFI 167, the Ordered Key Value Store SRFI. This SRFI comes with a memory storage engine that supports transactions. It can easily be adapted to implement durability relying on one of the various existing ordered key-value store libraries available in the wild.

This SRFI does not overlap with existing SRFIs.

Specification

The following specification defines two disjoint types:

Also, an implementation must rely on SRFI 158 generators to implement streams, SRFI 146 hash mappings to implement bindings and SRFI 173 hooks.

Generic Tuple Store Database

The database abstraction described in this section is an ordered set of tuples with n objects. A given store will always contain tuples of the same length.

(nstore engine prefix items)

Return an nstore object. ENGINE is an engine record instance as return by make-default-engine described in SRFI 167. DATABASE is an okvs record instance from SRFI 167. ITEMS describes the names given to a tuple’s items. It should be a list of symbols.

In the following, ITEMS is always a list of Scheme objects. What is accepted in the ITEMS list depends on the implementation of the packing procedures of the engine.

(nstore? object)

Return a #t if OBJECT is an nstore object and #f otherwise.

(nstore-ask? transaction nstore items)

Return #t if ITEMS is present in the store associated with TRANSACTION. Otherwise, return #f.

(nstore-add! transaction nstore items)

Add ITEMS to the store associated with TRANSACTION. If ITEMS is already in the associated store, do nothing. Return value is unspecified.

(nstore-delete! transaction nstore items)

Delete ITEMS from the store associated with TRANSACTION. Do nothing if ITEMS is not in the store. Return value is unspecified.

(nstore-var name)

Return an object of a disjoint type nstore-var associated with the symbol NAME.

(nstore-var? obj) → boolean

Return #t if OBJ is a variable as returned by nstore-var procedure. Otherwise, return #f.

(nstore-var-name variable) → symbol

Return the symbol name associated with VARIABLE. If VARIABLE is not a variable in the sense of var?, the returned value is unspecified.

(nstore-select transaction nstore pattern [config]) → generator

Return a generator of bindings where variables (in the sense of nstore-var?) of PATTERN are bound against one or more matching tuples from the store associated with TRANSACTION. The implementation must return a SRFI 158 generator of SRFI 146 hash mappings.

A tuple matches a pattern if the tuple has the same items that are not variables as the pattern. The variable will add a key-value pair in a binding that will associate the pattern name with the corresponding value in the tuple.

For example, given the following pattern:

(list "P4X432" "blog/url" (nstore-var 'blog/title))

And given the following tuple in the database:

(list "P4X432" "blog/url" "https://hyper.dev")

nstore-select will generate a binding that satisfies the following:

(assume (hashmap->alist binding) '((blog/title . "https://hyper.dev")))

Note: The generator is valid as long as the transaction is running. If the transaction is commited the generator will not be able to proceed.

The returned generator is called the seed generator because it doesn’t rely on an existing generator of bindings.

CONFIG is an optional association list that allows to configure the returned generator. It takes the following options:

key description
'offset specify how many bindings must be skipped.
'limit indicates the maximum number of bindings to return.

Note: The generator order must be stable to allow pagination.

Note: In the case where there is single variable in PATTERN the generator must be ordered.

(nstore-where transaction nstore pattern) → procedure → generator

Return a procedure that takes a generator of bindings as argument and returns a generator of bindings where variables of PATTERN are bound to one or more matching tuples from the store associated with TRANSACTION.

Note: The generator returned by nstore-where is flat. It is NOT a generator of generators.

Let's pick the previous example and imagine that we have a generator that contains a binding that satisfies the following:

(assume (hashmap->alist binding) '((blog/url . "https://hyper.dev")))
Given another pattern passed to nstore-where:
(list (nstore-var 'blog/url) "url/valid" (nstore-var 'url/valid))

nstore-where will bind the pattern using the binding that nstore-select generated as described above, which will result in the following pattern:

(list "https://hyper.dev" "url/valid" (nstore-var 'url/valid))

Then that pattern is used to look up matching tuples in the database similarly to how nstore-select generated the first binding. If there is a matching tuple, it generates a new binding. For instance, if there is a tuple like the following:

(list "https://hyper.dev" 'url/valid #t)

In that case, the original binding will generate one new binding that associates 'url/valid to #t

Otherwise there is no matching tuple, the original binding is discarded possibly making the generator returned by nstore-where empty.

Note: the generator is valid as long as the transaction is running. If the transaction is commited the generator will not be able to proceed.

(nstore-query <from> <where> ...) → generator syntax

nstore-query will chain the generators passing what is generated by the select clause to the where clauses in order.

Here the definition of nstore-query

    (define-syntax nstore-query
      (syntax-rules ()
        ((_ value) value)
        ((_ value f rest ...)
         (nstore-query (f value) rest ...))))

Hooks

SRFI 173 hooks are procedures that are executed at specific times during the life of an nstore. They allow one to sneak into the mutation procedures to do validation, keep indices synchronized or carry out other operations.

(nstore-hook-on-add nstore)

This procedure returns the hook object associated with nstore-add procedure.

The hook is run before the operation is executed. The procedures associated with the hook will take a SRFI 167 transaction object along with the tuple that is added. The arity of the hook is two.

(nstore-hook-on-delete nstore)

This procedure returns the hook object associated with nstore-delete! procedure.

The hook is run before the operation is executed. The procedures associated with the hook will take a SRFI 167 transaction object along with the tuple that is deleted. The arity of the hook is two.

Example

Here is an example use:

(import (scheme base))
(import (scheme generator))
(import (scheme mapping hash))

(import (okvs))
(import (nstore))


(define (make-triplestore prefix)

    (nstore engine prefix '(subject predicate object)))


(define triplestore (make-triplestore '(0)))

(define database (okvs "/data" #t))

(define (add-blog-post! transaction title body keywords)
  (nstore-add! transaction triplestore title 'post/body body)
  (let loop ((keywords keywords))
    (unless (null? keywords)
      (nstore-add! transaction triplestore title 'post/keyword (car keywords))
      (loop (cdr keywords)))))

(okvs-in-transaction database
                     (lambda (transaction)
                       (add-blog-post! transaction
                                       "Hello, world!"
                                       "First post."
                                       '(scheme))))

(okvs-in-transaction database
                     (lambda (transaction)
                       (add-blog-post! transaction
                                       "okvs for the win"
                                       "With okvs one can build powerful abstractions."
                                       '(okvs scheme database))))

(okvs-in-transaction database
                     (lambda (transaction)
                       (add-blog-post! transaction
                                       "Easy on-disk persistence"
                                       "nstore is a database abstraction."
                                       '(nstore scheme database))))

(okvs-in-transaction database
                     (lambda (transaction)
                       (add-blog-post! transaction
                                       "hoply"
                                       "hoply is an implementation in python."
                                       '(nstore python database))))

(okvs-in-transaction database
                     (lambda (transaction)
                       (generator-map->list
                        (lambda (binding) (hashmap-ref binding 'post/title))
                        (nstore-query
                         (nstore-select transaction triplestore
                                      (list (nstore-var 'post/title)
                                            'post/keyword
                                            'scheme))
                         (nstore-where transaction triplestore
                                       (list (nstore-var 'post/title)
                                             'post/keyword
                                             'database))))))

;; => '("Easy on-disk persistence" "okvs for the win")

Implementation

Here is the sample implementation.

Acknowledgements

Credit goes first to Cognitect for creating the Datomic database which inspired this work. StackOverflow user zhoraster helped pin the mathematics behind the generic implementation of tuple stores and Mike Earnest provided an algorithm to compute the minimal set of tuple items permutations that allows efficient querying. The author would like to thank Arthur A. Gleckler and Marc Nieper-Wißkirchen for getting together SRFI 146 and Shiro Kawai, John Cowan, and Thomas Gilray for working on SRFI 158.

Copyright

Copyright (C) Amirouche Boubekki (2019).

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler