Random stuff (2015 May 19)

Officially I am still doing the review of Vicare’s expander code (see Expander code review and apology (2015 April 18)), but I needed a break; so I have refactored the compiler into smaller libraries and merged the compiler review branch into master. In addition, I did some random stuff.

Everything discussed here is in the head of the master branch, which is an unstable, development branch.

Dynamically loadable Scheme libraries

Let’s take, as reference, the scenario in which we compile and install both libraries and programs. When we import a library with import, the library is associated with the program and it is loaded whenever the program is run. This is somewhat like linking a host’s shared object to a C language program at compile–time.

It is also possible to dynamically load a Scheme library, so that the program itself contains the logic needed to load (or not) an external library. This is somewhat like loading a host’s shared object with dlopen() from a C language program at run–time. The code of this feature is really small, so it is in the boot image (yes, the already huge boot image, which is around 20 MiB on my 64-bit platform).

This is how it works; we prepare a program:

(import (vicare)
  (prefix (vicare libraries) libs.))

(define-values (pregexp-match)
  (let ((lib (libs.library-dynamic-load-and-intern
                 '(vicare pregexp))))
    (values (libs.library-dynamic-retrieve lib 'pregexp-match))))

(pretty-print (pregexp-match "[a-z]+" "ciao123"))
(flush-output-port (current-output-port))

compile it:

$ vicare -c demo.sps -o demo

and run it (I am on gnu+Linux and I use the binfmt_misc support to run Vicare programs):

$ ./demo
("ciao")

The function library-dynamic-load-and-intern loads a Scheme library using the usual search path, and the function library-dynamic-retrieve retrieves the Scheme object bound to an exported syntactic binding. library-dynamic-load-and-intern is like dlopen() and library-dynamic-retrieve is like dlsym().

Only global variables exported by the library can be accessed this way: it makes no sense to access macro transformers when the program is already running; this api is not a substitute for eval and environment.
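For readers who think in C, the analogy with dlopen() and dlsym() can be made concrete with a minimal sketch. The helper names load_unary_math_function and cosine_or_default are hypothetical, and the file name libm.so.6 is an assumption that holds on typical gnu+Linux systems; the point is only that the program itself decides whether to load the library and what to do when loading fails.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Type of the functions we expect to find in the loaded library. */
typedef double unary_math_fn (double);

/* Hypothetical helper: open the shared object FILE_NAME and return the
   address of SYMBOL_NAME, or NULL when either step fails.  The handle
   is never closed, so the returned pointer stays valid. */
static unary_math_fn *
load_unary_math_function (const char *file_name, const char *symbol_name)
{
  unary_math_fn *function = NULL;
  void *handle = dlopen(file_name, RTLD_NOW);
  if (NULL == handle)
    return NULL;
  /* POSIX idiom to store the result of dlsym() in a function pointer. */
  *(void **) (&function) = dlsym(handle, symbol_name);
  return function;
}

/* The "load (or not)" logic lives in the program: use "cos" from the
   math library when it can be loaded, otherwise return FALLBACK.
   The name "libm.so.6" is an assumption about the host system. */
static double
cosine_or_default (double x, double fallback)
{
  unary_math_fn *cosine = load_unary_math_function("libm.so.6", "cos");
  return (NULL != cosine) ? cosine(x) : fallback;
}
```

On a host where libm.so.6 can be loaded, cosine_or_default(0.0, -1.0) evaluates to 1.0; elsewhere it returns the fallback. Older C libraries may require linking with -ldl.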

New program form

The r6rs document specifies a compliant top–level program as:

… a delimited piece of text, typically a file, that has the following form:

?import-form ?top-level-body

so it only specifies that the text is “delimited”. Vicare uses this freedom to accept two formats of top–level programs:

  1. The standalone delimited sequence:
    (import ?import-spec ...) ?body ...
    
  2. A program form with the following syntax:
    (program ?program-name
      ?config-form ...
      (import ?import-spec ...)
      . ?program-body)
    

where ?program-name is meant to be a descriptive list of symbols (currently unused) and the ?config-form clauses allow additional configuration and behaviour specification.

New host’s shared object loading

There are two ways to interface Vicare with foreign libraries: one is to use the ffi and the other is to write a C language library specifically designed to adapt the foreign interface to Vicare’s C language api. The second solution is more flexible and it is used by extensions like Vicare/CURL and Vicare/SQLite.

Adapting foreign libraries this way needs special handling.

Until now there was a really ugly api for this, which I will not describe. Now there is a proper clause in program and library forms:

(program ?program-name
  (foreign-library ?shared-object-id)
  (import  ?import-spec ...)
  . ?program-body)

(library ?library-name
  (foreign-library ?shared-object-id)
  (export  ?export-spec ...)
  (import  ?import-spec ...)
  . ?library-body)

The ?shared-object-id form must be a string representing the identifier of a host’s shared object. There can be any number of foreign-library clauses; all of them must come before export for libraries and before import for programs.

The identifier is used to build the file name of a shared object; for example the identifier vicare-curl is used to build the following file names:

libvicare-curl.so

On Unix–like systems, including gnu+Linux.

libvicare-curl.dylib

On Darwin systems.

vicare-curl.dll

On Cygwin systems.
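The mapping from identifier to file name can be sketched in C as follows. This is only an illustration of the naming rule listed above, not Vicare’s actual code; the enumeration host_kind and the function shared_object_file_name are hypothetical names.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical classification of the host system. */
enum host_kind { HOST_UNIX, HOST_DARWIN, HOST_CYGWIN };

/* Build in BUFFER the file name of the shared object identified by ID.
   Return 0 on success, -1 if the name does not fit in SIZE bytes. */
static int
shared_object_file_name (char *buffer, size_t size,
                         const char *id, enum host_kind host)
{
  const char *prefix = "lib";
  const char *suffix = ".so";    /* Unix-like systems */
  int length;

  switch (host) {
  case HOST_DARWIN:
    suffix = ".dylib";
    break;
  case HOST_CYGWIN:
    prefix = "";                 /* no "lib" prefix on Cygwin */
    suffix = ".dll";
    break;
  default:
    break;
  }
  length = snprintf(buffer, size, "%s%s%s", prefix, id, suffix);
  return (length >= 0 && (size_t) length < size) ? 0 : -1;
}
```

With a large enough buffer, the identifier vicare-curl and HOST_UNIX yield libvicare-curl.so, while HOST_CYGWIN yields vicare-curl.dll.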

Let’s take a look at the mechanism for retrieving pointers to foreign functions.

The core macro foreign-call, exported by the library (vicare), is expanded to the core language syntax foreign-call, which in turn is compiled to code invoking a C function from the operating system’s process image; the first argument to foreign-call is a string naming the C function.

Whenever the code:

(foreign-call "function_name" ?arg ...)

is compiled, the C pointer referencing the entry point of the named function is retrieved with a C language call:

dlsym(RTLD_DEFAULT, "function_name");

so all the public functions from the running Vicare executable are available; also available are all the functions from host’s shared libraries loaded with dlopen() using the flags RTLD_NOW | RTLD_GLOBAL.
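A minimal C sketch of this lookup follows; it is not taken from Vicare’s sources, and the function names are hypothetical. It assumes a dynamically linked executable; on glibc the macro RTLD_DEFAULT is visible only when _GNU_SOURCE is defined before the first system header, hence the guards at the top.

```c
#ifndef _GNU_SOURCE
#  define _GNU_SOURCE 1
#endif
#include <dlfcn.h>
#include <stddef.h>

/* Fallback for the case in which a system header was included before
   _GNU_SOURCE was defined; this is the value used by glibc. */
#ifndef RTLD_DEFAULT
#  define RTLD_DEFAULT ((void *) 0)
#endif

typedef size_t string_length_fn (const char *);

/* Search the process image for FUNCTION_NAME; NULL if not found. */
static void *
entry_point_by_name (const char *function_name)
{
  return dlsym(RTLD_DEFAULT, function_name);
}

/* Load FILE_NAME so that its public functions become visible to later
   lookups through RTLD_DEFAULT.  Return 0 on success, -1 on failure. */
static int
make_library_globally_visible (const char *file_name)
{
  return (NULL != dlopen(file_name, RTLD_NOW | RTLD_GLOBAL)) ? 0 : -1;
}

/* Example: call the C library's strlen() through a pointer retrieved
   by name.  Return 0 if the function cannot be found. */
static size_t
call_strlen_by_name (const char *text)
{
  string_length_fn *function = NULL;
  *(void **) (&function) = entry_point_by_name("strlen");
  return (NULL != function) ? function(text) : 0;
}
```

Here strlen is found because the C library is already part of the process image; a function from another shared object would be found only after a successful call to make_library_globally_visible() on that object.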

This api for calling C language functions is meant to be used to interface with functions specifically written to be called from Scheme code; this api cannot be used to directly call a generic C language function from, say, libz.so or libgmp.so.

Start–up time

Recently there was a thread on Reddit with a subdiscussion about start–up times for different Scheme implementations. Ahem… Vicare is not the fastest Scheme implementation around, and I know it. Anyway, why shy away?

The “Hello World!” program under Vicare is this:

(import (vicare))
(display "Hello World!\n")
(flush-output-port (current-output-port))

I compile it:

$ vicare -c demo.sps -o demo

and run it:

$ ./demo
Hello World!

after running it (so that the Linux kernel loads the boot image from the file system and caches it in memory), I time it; a typical execution is this:

$ /usr/bin/time -p ./demo
Hello World!
real 0.13
user 0.08
sys 0.05

on my:

$ cat /proc/cpuinfo | head --lines 9
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 58
model name      : Intel(R) Core(TM) i5-3337U CPU @ 1.80GHz
stepping        : 9
microcode       : 0x15
cpu MHz         : 2682.000
cache size      : 3072 KB

There is no point in trying hard to heat up the host so that the cpu does its best, and stuff like that.