Unicon FFI

Well met,

Let 2017 be the year of uniffi, the Unicon Foreign Function Interface.

Ok, Unicon already has a Foreign Function Interface. loadfunc and similar C function interfacing has been in Unicon since its inception, dating back to at least Icon version 8.10, March of 1993. There were two C interfaces documented for that release: the outbound callout, and the inbound icon_call.

Sadly, the inbound code in Unicon for icon_call is no longer available, but read on for a possible future, perhaps better alternative.

The outbound interface callout is still in Unicon version 13, but it requires a special build of the entire compiler/runtime system to replace an internal stub function called extcall, in src/runtime/extcall.r, which by default just returns an error code, 216. Anyone is free to dig into this interface; it is actually fairly well documented by Ralph Griswold in IPD217, http://www2.cs.arizona.edu/icon/ftp/doc/ipd217.pdf

It’s old, and usable, but all the recent activity has been focused on loadfunc. A small layer of code was added in version 9 (the base Icon used for Unicon core; much has changed in Unicon since then) to load C function entry points at runtime from dynamic shared object libraries. And loadfunc was born. Foreign functions can be loaded into Unicon at runtime without need of the special builds that extcall and callout require.

There are a lot of loadfunc examples peppered throughout the Unicon Programming document set. It opens up doors to C libraries, which are numerous and ubiquitous.

One issue with loadfunc is that the functions called have to comply with a Unicon calling convention. Routines are passed an argc argv style Unicon frame, using a count of passed in descriptors. These descriptors need to be manually converted to C native data, passed on to other C routines, and then converted back to Unicon data types for returning results. There are copious examples of managing this protocol, and support macros in ipl/cfuncs/icall.h that make this all pretty easy. But, it is still an extra layer of burden placed on a Unicon programmer aiming to use an existing C library solution to a problem, or for a speed boost.

And now a step up.

libffi

libffi is a foreign function interface library, that manages the call frame setup for all kinds of different calling conventions. 32bit, 64bit and many different operating systems are all supported. This layer was put to use to alleviate the need to use loadfunc for many/most/all C functions that a Unicon programmer may want to call. Once loaded (the experimental native(...) function is not built into Unicon, so it uses loadfunc to bootstrap), all a Unicon programmer needs to do is call native:

dlHandle := addLibrary("libraryName")

result := native("function", returnType, arguments,...)
more := native("otherFunction", returnType, arguments,...)
...

And that’s it. Under the covers the native function finds an entry point (usually after a supporting call to addLibrary, which takes the name of a Dynamic Shared Object module (a DLL)), marshals the Unicon arguments for use by C, and dispatches a call/return sequence. Results from C are converted to the specified Unicon returnType and passed back to Unicon. Almost all of this becomes invisible to the Unicon programmer. All you need to do is call native with a function link name and arguments. Almost all C native data types are supported.

And that is a wrinkle. C call frames need to know the exact type of each argument, and what type to return (including nothing, termed void). For many types, native can just convert to reasonable C types: Integer to int, Real to double, String to char *, etc., using the handy macros built into icall.h. Sometimes this is wrong. C has more than one floating point type, most commonly the 32 bit float and the 64 bit double. There are also distinctions for 8bit, 16bit, 32bit and 64bit integers, in both signed and unsigned forms. Unicon just has Integer and Real.

native allows for type overrides in the function call, using two element lists.

result := native("function", TYPEFLOAT, [x, TYPEFLOAT], [y, TYPEFLOAT])

The Real values from Unicon are demoted to C float data, and the returned float is promoted back to a Unicon Real.

These type specifications can be freely mixed:

result := native("mixed", TYPEINT, [x, TYPEFLOAT], [y, TYPEDOUBLE])

That assumes that mixed has a C prototype of int mixed(float x, double y) and makes the proper arrangements for the function call, returning an Integer result back to Unicon.

Note

Please note that this experiment is at a very early stage, and some of the type constant names, and argument lists may change before this ever gets accepted into Unicon proper; if it ever gets accepted.

libharu

This entire exercise started with a desire to integrate PDF generation in Unicon by leveraging libharu, the PDF writer library. There are many tens of functions in libharu and each one would have required a small loadfunc call convention wrapper, written in C to accommodate. That led to an initial version of native() that took on the task of preparing a C call frame using inline assembler, which works, but is limited to x86_64 System V call conventions. See C Native for that blurb.

After finishing a trial of C Native, libffi was discovered. It does the same job as C Native and far more: there is a single interface, no burden to write umpteen dozen small pieces of assembler to support the various platforms that Unicon is currently built to run on, and it is well supported by a team of experts in the area of foreign function calls.

Here is what the libharu integration example looks like:

#
# haru.icn, demonstrate a newer C FFI
#
$include "natives.inc"

$define HPDF_COMP_ALL 15
$define HPDF_PAGE_MODE_USE_OUTLINE 1
$define HPDF_PAGE_SIZE_LETTER 0
$define HPDF_PAGE_PORTRAIT 0
 
procedure main()
    local dlHandle, pdf, page1, rc, savefile := "harutest.pdf"

    # will be RTLD_LAZY | RTLD_GLOBAL (so add to the search path)
    addLibrary := loadfunc("./uniffi.so", "addLibrary")

    # allow arbitrary C functions, marshalled by libffi
    native := loadfunc("./uniffi.so", "ffi")

    # add libhpdf to the dlsym search path, the handle is irrelevant
    dlHandle := addLibrary("libhpdf.so")

    pdf := native("HPDF_New", TYPESTAR, 0, 0)

    rc := native("HPDF_SetCompressionMode", TYPEINT, pdf, HPDF_COMP_ALL)
    rc := native("HPDF_SetPageMode", TYPEINT, pdf,
                 HPDF_PAGE_MODE_USE_OUTLINE)

$ifdef PROTECTED
    rc := native("HPDF_SetPassword", TYPEINT, pdf, "owner", "user")
    savefile := "harutest-pass.pdf"
$endif

    page1 := native("HPDF_AddPage", TYPESTAR, pdf)

    rc := native("HPDF_Page_SetHeight", TYPEINT, page1,
                 [220.0, TYPEFLOAT]);
    rc := native("HPDF_Page_SetWidth", TYPEINT, page1,
                 [200.0, TYPEFLOAT]);

    #/* A part of libharu pie chart sample, Red*/
    rc := native("HPDF_Page_SetRGBFill", TYPEINT, page1,
                 [1.0, TYPEFLOAT], [0.0, TYPEFLOAT], [0.0, TYPEFLOAT]);
    rc := native("HPDF_Page_MoveTo", TYPEINT, page1,
                 [100.0, TYPEFLOAT], [100.0, TYPEFLOAT]);
    rc := native("HPDF_Page_LineTo", TYPEINT, page1,
                 [100.0,  TYPEFLOAT],[180.0, TYPEFLOAT]);
    rc := native("HPDF_Page_Arc", TYPEINT, page1,
                 [100.0, TYPEFLOAT], [100.0, TYPEFLOAT],
                 [80.0, TYPEFLOAT], [0.0, TYPEFLOAT],
                 [360 * 0.45, TYPEFLOAT]);
    
    #pos := native("HPDF_Page_GetCurrentPos (page);

    rc := native("HPDF_Page_LineTo", TYPEINT, page1,
                 [100.0, TYPEFLOAT], [100.0, TYPEFLOAT]);
    rc := native("HPDF_Page_Fill", TYPEINT, page1); 

    rc := native("HPDF_SaveToFile", TYPEINT, pdf, savefile);
    native("HPDF_Free", TYPEVOID, pdf);
end

../programs/uniffi/haru.icn

Fairly short, and sweet.

This sample barely scratches the surface of libharu features (simply drawing a partial arc, filled in red). What it highlights is that the calls occurred with no extra C source required.

This is where the excitement might start to build. Unicon programmers can focus on Unicon, leaving C to the C folk.

Here is a small GnuCOBOL program that was used during testing:

      *>
      *> Demonstrate Unicon native call of COBOL modules
      *>
       identification division.
       program-id. cobolnative.

       data division.
       working-storage section.
       linkage section.
       01 one usage binary-long.
       01 two usage binary-long.

       procedure division using by value one two.
       display "GnuCOBOL got " one ", " two
       compute return-code = one + two
       goback.
       end program cobolnative.

../programs/uniffi/cobolnative.cob

The Unicon caller:

#
# cobffi.icn, test calling COBOL without wrapper with libffi
#
$include "natives.inc"

procedure main()
    # will be RTLD_LAZY | RTLD_GLOBAL (so add to the search path)
    addLibrary := loadfunc("./uniffi.so", "addLibrary")

    # allow arbitrary C functions, marshalled by libffi
    native := loadfunc("./uniffi.so", "ffi")

    # add the testing functions to the dlsym search path,
    #  the handle is somewhat irrelevant, but won't be soonish
    dlHandle := addLibrary("./cobolnative.so")

    # initialize GnuCOBOL
    native("cob_init", TYPEVOID)

    # pass two integers, get back a sum
    ans := native("cobolnative", TYPEINT, 40, 2)
    write("Unicon: called sample and got ", ans)

    # rundown the libcob runtime
    native("cob_tidy", TYPEVOID)
end

../programs/uniffi/cobffi.icn

And a sample run:

prompt$ cobc -m -Wno-unfinished cobolnative.cob
prompt$ unicon -s cobffi.icn -x
GnuCOBOL got +0000000040, +0000000002
Unicon: called sample and got 42

libffi makes calling GnuCOBOL modules from Unicon a complete breeze.

Next steps

I plan on pestering Clinton and Jafar, and whoever else will listen, to help polish this up, and hopefully get it added to the Unicon build system proper. It currently lacks some features: not all datatypes are properly supported, and there needs to be some deep discussion about how indirect data references (C pointers) should be handled (they cannot be allowed to change immutable Unicon data, so an interstitial layer will need to be worked out).

I’d be honoured to continue this with a formal Unicon Technical Report, and will do so if that’s what it takes to advance this flag.

On the other side of the coin...

C calling Unicon

The unicon -C native compile sequence is pretty handy. It creates a native executable by generating C source code and compiling that intermediate into a native binary. The one point lacking is that it assumes a main is generated from the Unicon side, and does all the linking steps assuming that point of view. I’d like to extend unicon -C with a new compile time option (something like --no-main or --object, or -c meaning compile/don’t link, although the -c idea was deemed to conflict with its current meaning of generate ucode) to produce object code, ready for linking to other programs.

Initial trials for this have been proven (in a hack sort of way) by editing the C code generated by unicon -C to rename main to somecode, and then removing the link phase from the gcc invocation, so that gcc -c simply generates an object file. That code was then linked to a GnuCOBOL test program, and Unicon was called, data passed in, results returned.

The hack even went as far as returning a pointer to the Unicon global variable structure that is part of native executables, but that part would not be part of any production level release. First a shared memory space sequence would be worked out, instead of pointers into Unicon space (which can be garbage collected and moved at any time, outside normal control of a developer).

Unicon object files (meaning .o files, not class objects) will alleviate some of the need to resurrect icon_call to allow C programs to call Unicon programs. Unicon will then be able to take part in all forms of mixed language programming. Shareable libraries could be created that will allow foreign languages to enjoy direct benefit from Unicon language features without knowing anything about Unicon source code, though one of the goals will be to demonstrate how easy that code is to read and write.

The first round of experiments relied on statically linking to the Unicon runtime system, but another phase may provide for a libunicon.so that could be dynamically linked into these callable Unicon modules. This would make for very small, easy to manage Unicon application level link libraries (or singleton object files).

Continuing this experiment has been given the nod by Clinton Jeffery, but there are many details to work out, and it won’t be part of Unicon until the entire sequence is ready at a level of quality expected by Unicon developers. There will be copious amounts of documentation available during the design, development and implementation stages.

There are lots of things to discuss, and many possibilities await.

You can follow along in the SourceForge Discussion pages at

https://sourceforge.net/p/unicon/discussion/contributions/

Have good, make well, happiest of 2017s
