Overhauling the library infrastructure (2015 March 05)

For Vicare releases in the 0.4 series I have overhauled the library infrastructure and reorganised the code. I have been working on this for some time, and for weeks I kept putting off the end of the restructuring; now I have whipped myself into finalising at least some of it. There are still some design decisions to be made, so more entries will follow on this subject. This section is about what’s currently in the master branch, not the latest Vicare release.

Some of the changes are backwards incompatible. Here is a quick list:

Now some topic discussion.

No more caching of compiled libraries

Since the days of the Ikarus code base, there has been support for caching compiled libraries; other Scheme implementations have this feature (Guile, Mosh, Sagittarius, Ypsilon). It worked like this: we install source libraries in a system directory, for example:

/usr/local/lib/vicare/posix.sls

and import them as usual:

(import (prefix (vicare posix) px.))

the first time we import a library: Vicare loads the source file and stores a compiled version in the cache directory, for example:

~/.vicare/precompiled/usr/local/lib/vicare/posix.fasl

the second time we import a library: Vicare loads the compiled file from the cache directory. Seems fine, but:

Back when I started getting the hang of preparing packages for Ikarus, I started including a Makefile rule to precompile all the installed source libraries in the cache; then I started to install both the source and binary files in system directories, side by side.

I nuked all of this. Starting with Vicare releases in the 0.4 series, caching of compiled libraries is no longer supported. It is possible to reintroduce it using the API exported by the library (vicare libraries), but Vicare itself does not offer it anymore.

How library loading works now

The library (vicare libraries) is defined by Vicare’s boot image and exports an API to allow the customisation of the library loading process. Here is an overview:

  1. The import clause/syntax used in the source code instructs Vicare to search for a library whose R6RS name matches the given R6RS library reference.
  2. To import a library we must first intern it. Libraries are interned by the function find-library-by-reference, which first searches the library in the internal collection of already–interned libraries, then searches for a matching library in some external repository.
  3. If none of the libraries already interned matches the given reference: find-library-by-reference makes use of the parameter current-library-loader. The function default-library-loader is the default value of the parameter current-library-loader.
  4. The function default-library-loader makes use of: the library locator referenced by the parameter current-library-locator; the source library loader referenced by the parameter current-source-library-loader; the binary library loader referenced by the parameter current-binary-library-loader.
    • The parameter current-library-locator is usually initialised to one of the functions:
      run-time-library-locator
      compile-time-library-locator
      source-library-locator

      either by default, or by direct selection with the command line option --library-locator. The library locators search the file system for a library file pathname matching a specified R6RS library reference.

    • The function default-source-library-loader is the default value for the parameter current-source-library-loader. Given a textual input port: it reads from it a library symbolic expression; it verifies that its version reference conforms to the library reference; using the expander procedure referenced by current-library-expander: loads and interns all its dependency libraries; expands it; compiles it; interns it.
    • The function default-binary-library-loader is the default value for the parameter current-binary-library-loader. Given a binary input port: it reads from it a serialised library; verifies that its library name conforms to the library reference; interns it along with all its dependency libraries.
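The flow in the overview above can be sketched with a toy model in portable R6RS Scheme. This is only an illustration of the control flow, not Vicare's implementation: every name starting with toy- is hypothetical, the locator is a stub that never touches the file system, and plain procedures stand in for the parameters.

```scheme
#!r6rs
(import (rnrs))

;; Toy collection of interned libraries, keyed by library name.
(define interned-libraries (make-hashtable equal-hash equal?))

;; True if STR ends with SUFFIX.
(define (has-suffix? suffix str)
  (let ((n (string-length str))
        (k (string-length suffix)))
    (and (>= n k)
         (string=? suffix (substring str (- n k) n)))))

;; Hypothetical locator stub: maps a library reference to a pathname.
;; A real locator would search the file system.
(define (toy-locator libref)
  (if (equal? libref '(vicare posix))
      "/usr/local/lib/vicare-scheme/vicare/posix.fasl"
      "./lib/vicare/this.sls"))

;; Hypothetical loaders: they only build a record of what they would do.
(define (toy-source-loader libref pathname)
  (list 'expanded-and-compiled libref pathname))

(define (toy-binary-loader libref pathname)
  (list 'deserialised libref pathname))

;; Stand-in for the library loader: ask the locator for a pathname,
;; then dispatch to the binary or source loader.
(define (toy-library-loader libref)
  (let ((pathname (toy-locator libref)))
    (if (has-suffix? ".fasl" pathname)
        (toy-binary-loader libref pathname)
        (toy-source-loader libref pathname))))

;; Search the interned collection first; load and intern on a miss.
(define (toy-find-library-by-reference libref)
  (or (hashtable-ref interned-libraries libref #f)
      (let ((lib (toy-library-loader libref)))
        (hashtable-set! interned-libraries libref lib)
        lib)))

(write (toy-find-library-by-reference '(vicare posix)))
(newline)
```

Asking twice for the same reference returns the very same object, because the second request is served from the interned collection without calling the loader again.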

Every parameter can be used to introduce some customisation in the process. The library locators need some discussion.

Run–time library file locator

The run–time library locator is the default; it can be selected explicitly with the command line option --library-locator run-time or by setting the parameter current-library-locator to run-time-library-locator; it is meant to be used by an installation of Vicare to run applications. The run–time locator scans the search path for compiled libraries in search of a matching binary file; if a matching compiled library is not found: it scans the search path for source libraries in search of a matching source file.
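The search order just described (compiled libraries first, source libraries as a fallback) can be sketched as follows. This is a simplified model in portable R6RS Scheme, not Vicare's locator: the names first-match, toy-run-time-locate and fake-exists? are hypothetical, the directory /home/me/lib is made up for the example, and the existence test is passed as an argument so the sketch can run without real files (file-exists? from (rnrs) could be passed instead).

```scheme
#!r6rs
(import (rnrs))

;; Return the first pathname DIR/STEM+EXTENSION accepted by EXISTS?,
;; or #f if no directory in DIRECTORIES holds such a file.
(define (first-match directories stem extension exists?)
  (let loop ((dirs directories))
    (if (null? dirs)
        #f
        (let ((pathname (string-append (car dirs) "/" stem extension)))
          (if (exists? pathname)
              pathname
              (loop (cdr dirs)))))))

;; Scan the binary search path first; fall back to the source search path.
(define (toy-run-time-locate stem binary-path source-path exists?)
  (or (first-match binary-path stem ".fasl" exists?)
      (first-match source-path stem ".sls"  exists?)))

;; Fake file system, so that the example needs no real files.
(define (fake-exists? pathname)
  (member pathname
          '("/usr/local/lib/vicare-scheme/vicare/posix.fasl"
            "/home/me/lib/vicare/this.sls")))

(write (toy-run-time-locate "vicare/posix"
                            '("/usr/local/lib/vicare-scheme")
                            '("/home/me/lib")
                            fake-exists?))
(newline)
```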

The reference scenario for the run–time library locator is this:

  1. We install the package Vicare Scheme, compiling bundled libraries and putting them in some system directory; the libraries might be installed with pathnames like:
    /usr/local/lib/vicare-scheme/vicare/posix.fasl
    
  2. We install additional packages, compiling distributed libraries and putting them in some system directory; the libraries might be installed with pathnames like:
    /usr/local/lib/vicare-scheme/vicare/something.fasl
    
  3. We configure the library binary search path to make sure that it includes the system directory:
    (library-binary-search-path)
    ⇒ (... "/usr/local/lib/vicare-scheme" ...)
    
  4. We configure the library binary file scanner parameter:
    (current-library-binary-search-path-scanner
       default-library-binary-search-path-scanner)
    

    which will scan the search path returned by (library-binary-search-path).

  5. We compose a Scheme program demo.sps which imports the libraries:
    (import (vicare)
      (prefix (vicare posix) px.)
      (vicare something))
    

    and we execute it selecting the run–time library locator:

    $ vicare --library-locator run-time --r6rs-script demo.sps
    

    the command line option --library-locator will put run-time-library-locator in the parameter current-library-locator. The run–time locator is the default, so we can just do:

    $ vicare --r6rs-script demo.sps
    

Compile–time library file locator

The compile–time library locator must be selected explicitly with the command line option --library-locator compile-time or by setting the parameter current-library-locator to compile-time-library-locator; it is meant to be used from the build directory of a package while compiling libraries for development or future installation. The compile–time locator does the following:

  1. Ask the file system search path scanner for the next library source file matching a given library reference.
  2. If a matching source library is found: look for an already compiled library file in the compiled-libraries-build-directory:
    1. If no compiled file exists, or if it exists but is older than the source file: accept the source file as matching.
    2. If a compiled file exists and it is newer than the source file: accept the compiled file as matching.
    3. Return to the caller the matching file pathname.
    4. If the caller rejects the binary file pathname: return to the caller the source file pathname.
    5. If the caller rejects the source file: loop to 1.
  3. If no source file exists: loop to 1.

When using this locator, we instruct the compiler to put compiled libraries in the build directory.
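The timestamp comparison at the heart of the steps above can be sketched in portable R6RS Scheme. The procedure names are hypothetical and modification times are plain numbers; the treatment of equal timestamps (the source file wins) is an assumption of this sketch, since the description only covers the older and newer cases.

```scheme
#!r6rs
(import (rnrs))

;; SOURCE-MTIME is a number.  BINARY-MTIME is a number, or #f when no
;; compiled file exists in the build directory.
;; Assumption: on equal timestamps the source file is preferred.
(define (toy-choose-candidate source-mtime binary-mtime)
  (if (and binary-mtime (> binary-mtime source-mtime))
      'binary
      'source))

;; Pathnames in the order they would be offered to the caller: when the
;; compiled file is fresh it comes first, with the source file as the
;; fallback if the caller rejects it; otherwise only the source file.
(define (toy-candidates source-pathname binary-pathname
                        source-mtime binary-mtime)
  (if (eq? 'binary (toy-choose-candidate source-mtime binary-mtime))
      (list binary-pathname source-pathname)
      (list source-pathname)))

(write (toy-candidates "lib/vicare/that.sls" "lib/vicare/that.fasl" 100 200))
(newline)
```

If the caller rejects every candidate in the list, the real locator moves on to the next matching source file in the search path; that outer loop is left out here.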

The reference scenario for the compile–time library locator is this:

  1. We install the package Vicare Scheme, compiling bundled libraries and putting them in some system directory; the libraries might be installed with pathnames like:
    /usr/local/lib/vicare-scheme/vicare/posix.fasl
    
  2. We install additional packages, compiling distributed libraries and putting them in some system directory; the libraries might be installed with pathnames like:
    /usr/local/lib/vicare-scheme/vicare/something.fasl
    
  3. We unpack the distribution tarball of a package providing even more libraries. We have the source libraries under:
    $(srcdir)/lib/vicare/this.sls
    $(srcdir)/lib/vicare/that.sls
    

    we want to compile them under the build directory:

    $(builddir)/lib/vicare/this.fasl
    $(builddir)/lib/vicare/that.fasl
    

    and then install them in a system directory:

    /usr/local/lib/vicare-scheme/vicare/this.fasl
    /usr/local/lib/vicare-scheme/vicare/that.fasl
    

    This is the gist: in the package’s building infrastructure (for example a Makefile managed by the GNU Autotools) we need to write appropriate invocations of vicare to build the libraries locally and pick the appropriate source libraries and compiled libraries.

  4. It may be that the libraries in the source tree need to load installed libraries and also have local dependencies:
    (library (vicare this)
      (export)
      (import (vicare)
        (vicare that))
      ---)
    
    (library (vicare that)
      (export)
      (import (vicare)
        (prefix (vicare posix) px.)
        (vicare something))
      ---)
    
  5. It may be that an older version of the package is already installed, so there already exist installed binary libraries:
    /usr/local/lib/vicare-scheme/vicare/this.fasl
    /usr/local/lib/vicare-scheme/vicare/that.fasl
    

    we want the libraries under $(builddir)/lib to take precedence over the libraries under /usr/local/lib/vicare-scheme. It may be that there exist installed source libraries:

    /usr/local/lib/vicare-scheme/vicare/this.sls
    /usr/local/lib/vicare-scheme/vicare/that.sls
    

    we want the libraries under $(srcdir)/lib to take precedence over the libraries under /usr/local/lib/vicare-scheme.

At the Scheme level we want the following:

Assuming Makefiles generated by GNU Automake: to achieve the desired result, we have two options:

  1. For every library to be compiled locally, we write in the Makefile an explicit dependency rule:
    lib/vicare/that.fasl: lib/vicare/that.sls
            VICARE_SOURCE_PATH=; export VICARE_SOURCE_PATH;    \
            vicare --library-locator compile-time              \
               --library-path    /usr/local/lib/vicare-scheme  \
               --source-path     $(srcdir)/lib                 \
               --build-directory $(builddir)/lib               \
               -o $@ -c $<
    
    lib/vicare/this.fasl: lib/vicare/this.sls lib/vicare/that.fasl
            VICARE_SOURCE_PATH=; export VICARE_SOURCE_PATH;    \
            vicare --library-locator compile-time              \
               --library-path    /usr/local/lib/vicare-scheme  \
               --source-path     $(srcdir)/lib                 \
               --build-directory $(builddir)/lib               \
               -o $@ -c $<
    

    this is the solution to prefer, because it allows parallel builds.

  2. We write a script compile-all.sps that imports at least the local libraries that are leaves in the local package dependency tree:
    (import (only (vicare that))
            (only (vicare this)))
    

    we write a single Makefile rule that compiles in the build directory all the dependencies of the script:

    .PHONY: vfasl
    
    vfasl:
            VICARE_SOURCE_PATH=; export VICARE_SOURCE_PATH;    \
            vicare --library-locator compile-time              \
               --library-path    /usr/local/lib/vicare-scheme  \
               --source-path     $(srcdir)/lib                 \
               --build-directory $(builddir)/lib               \
               --compile-dependencies compile-all.sps
    

Source library file locator

The source library locator must be selected explicitly with the command line option --library-locator source or by setting the parameter current-library-locator to source-library-locator; it is meant to be used to search for source libraries first and then for compiled ones.

The reference scenario for the source library locator is this:

  1. We install the package Vicare Scheme, compiling bundled libraries and putting them in some system directory; the libraries might be installed with pathnames like:
    /usr/local/lib/vicare-scheme/vicare/posix.fasl
    
  2. We install additional packages, compiling distributed libraries and putting them in some system directory; the libraries might be installed with pathnames like:
    /usr/local/lib/vicare-scheme/vicare/something.fasl
    
  3. We check out the source tree of a package repository to develop even more libraries. We have the source libraries under:
    $(srcdir)/lib/vicare/this.sls
    $(srcdir)/lib/vicare/that.sls
    

    we want to compile them under the build directory:

    $(builddir)/lib/vicare/this.fasl
    $(builddir)/lib/vicare/that.fasl
    

    and then install them in a system directory:

    /usr/local/lib/vicare-scheme/vicare/this.fasl
    /usr/local/lib/vicare-scheme/vicare/that.fasl
    

    In the package’s building infrastructure (for example a Makefile managed by the GNU Autotools) we need to write appropriate invocations of vicare to build the libraries locally and pick the appropriate source libraries and compiled libraries.

    This is the gist: we want to automatically generate an include Makefile holding the compilation and installation recipes correctly describing the dependencies among libraries. For this we need to load all the source libraries in the package’s source tree.

  4. It may be that the libraries in the source tree need to load installed libraries and also have local dependencies:
    (library (vicare this)
      (export)
      (import (vicare)
        (vicare that))
      ---)
    
    (library (vicare that)
      (export)
      (import (vicare)
        (prefix (vicare posix) px.)
        (vicare something))
      ---)
    
  5. It may be that an older version of the package is already installed, so there already exist installed binary libraries:
    /usr/local/lib/vicare-scheme/vicare/this.fasl
    /usr/local/lib/vicare-scheme/vicare/that.fasl
    

    we want these installed libraries to be ignored. It may be that there exist installed source libraries:

    /usr/local/lib/vicare-scheme/vicare/this.sls
    /usr/local/lib/vicare-scheme/vicare/that.sls
    

    we want the libraries under $(srcdir)/lib to take precedence over the libraries under /usr/local/lib/vicare-scheme.

At the Scheme level we want the following:

To achieve the desired result, in the Makefile we write rules as follows:

.PHONY: dependencies

DEPSCRIPT = $(srcdir)/scripts/build-makefile-rules.sps

dependencies:
        VICARE_SOURCE_PATH=; export VICARE_SOURCE_PATH; \
        vicare --library-locator source                 \
          --library-path /usr/local/lib/vicare-scheme   \
          --source-path  $(srcdir)/lib                  \
          --r6rs-script  $(DEPSCRIPT) --                \
          $(slsdir)/libraries.scm >$(slsdir)/dependencies.make

where the executed Scheme script generates the Makefile rules automatically. The package Vicare Scheme comes with a script build-makefile-rules.sps that does exactly that to generate the dependencies among libraries; the extension packages will use the same script with the same Makefile rule.
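To give an idea of what such generated rules look like, here is a minimal sketch in portable R6RS Scheme that turns a library name and its list of local dependencies into a Makefile dependency line. It is not the actual build-makefile-rules.sps: the procedure names and the lib/ pathname layout are assumptions made for this example, and the dependencies are given by hand instead of being discovered by loading the libraries.

```scheme
#!r6rs
(import (rnrs))

;; (vicare this) -> "lib/vicare/this"
(define (libname->stem libname)
  (fold-left (lambda (acc sym)
               (string-append acc "/" (symbol->string sym)))
             "lib"
             libname))

;; Build "TARGET.fasl: TARGET.sls DEP.fasl ..." for a library and the
;; libraries from the same package it depends upon.
(define (toy-make-rule libname local-deps)
  (let ((stem (libname->stem libname)))
    (fold-left (lambda (acc dep)
                 (string-append acc " " (libname->stem dep) ".fasl"))
               (string-append stem ".fasl: " stem ".sls")
               local-deps)))

(display (toy-make-rule '(vicare this) '((vicare that))))
(newline)
;; prints: lib/vicare/this.fasl: lib/vicare/this.sls lib/vicare/that.fasl
```

Only dependencies local to the package appear as prerequisites; installed libraries such as (vicare posix) are found through the library path and need no rule.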

Native shared libraries? Not easy

The binary output from a compiler that I like most is the one produced by C compilers: standalone binary programs, with libraries statically linked in; native shared libraries whose loading, linking and sharing is performed by the operating system. It would be awesome to have standalone binary programs and native shared libraries in Vicare, but, alas, it is not possible; at least not without a major overhaul of the internal architecture.

Machine code generated by Vicare is stored in proper Scheme objects: the code objects; they are subject to garbage collection and are handled like all the other objects; the only true difference is that they are allocated in memory pages with execute permissions.

Code objects have a field referencing a vector, the relocation vector, containing references to other Scheme objects used by the machine code (for example constant Scheme objects hard–coded in the source); such references between objects are built: at boot image load–time; at binary library load–time; at source library compilation–time. We can inspect the situation with the following code:

#!vicare
(import (vicare)
  (vicare system $codes))

(define V
  '#(1 2 3))

(define (mutate)
  (vector-set! V 0 #\a))

(define reloc-vector
  ($code-reloc-vector ($closure-code mutate)))

(print-gensym #t)

(write reloc-vector)
(newline)

(flush-output-port (current-output-port))
-| #(470 43 #<code> 272 #{vector-set! |sVCU2mUD/G&4T9<T|} 48 #{V |j1$J<JXhK2N2Kivt|})

We see that the last item in the relocation vector is a gensym with pretty name V; this gensym holds the current value of the variable V: it is a location gensym (or loc gensym) in Vicare’s jargon. The value stored in this loc gensym is:

(write (symbol-value (vector-ref reloc-vector 6)))
(newline)

(flush-output-port (current-output-port))
-| #(1 2 3)

If code objects are just ordinary Scheme objects: it is easy to handle such mutable references; code generated at run–time (by calling the core primitive eval or other compiler primitives) is automatically handled in the correct way. Putting machine code with these mutable references in a native shared object and having it interact transparently with other Scheme objects requires a significant change of Vicare’s run–time machinery.
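The role of such mutable references can be mimicked with a toy model in portable R6RS Scheme. Nothing here reflects Vicare's actual data structures: a one-slot vector plays the part of a loc gensym, a closure plays the part of a code object, and all the names are hypothetical.

```scheme
#!r6rs
(import (rnrs))

;; A one-slot vector stands in for a loc gensym holding a variable's value.
(define (make-loc value)           (vector value))
(define (loc-value loc)            (vector-ref loc 0))
(define (loc-value-set! loc value) (vector-set! loc 0 value))

;; Toy "code object": it captures a relocation vector and reaches the
;; variable's value through the loc stored in it, never a copy of the value.
(define (make-toy-code reloc-vector)
  (lambda ()
    (loc-value (vector-ref reloc-vector 0))))

(define loc-V (make-loc 'old))
(define code  (make-toy-code (vector loc-V)))

(display (code)) (newline)      ; prints: old
(loc-value-set! loc-V 'new)     ; mutate the variable through its loc
(display (code)) (newline)      ; prints: new
```

Because the code reaches the value through an ordinary, garbage-collected object, a mutation is visible to every piece of code sharing that loc; this is the kind of link that machine code living in a native shared object would somehow have to preserve.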