.. index:: blog; 2017/01/02

.. Modified: 2017-01-07/02:09-0500

Unicon FFI
==========

Well met,

Let 2017 be the year of the ``uniffi``, the Unicon Foreign Function
Interface.

Ok, Unicon already has a Foreign Function Interface. `loadfunc` and similar
C function interfacing has been in Unicon since its inception, dating back to
at least Icon version 8.10, March of 1993. There were two C interfaces
documented for that release: outbound, `callout`, and inbound, ``icon_call``.
Sadly, the inbound code in Unicon for ``icon_call`` is no longer available,
*but read on for a possible future, perhaps better alternative*.

The outbound interface `callout` is still in Unicon version 13, but requires
a special build of the entire compiler/runtime system to replace an internal
stub function called ``extcall``, in :file:`src/runtime/extcall.r`, which by
default just returns error code 216. Anyone is free to dig into this
interface, actually fairly well documented by `ralph` in IPD217,
http://www2.cs.arizona.edu/icon/ftp/doc/ipd217.pdf

It's old, and usable, but all the recent activity has been focused on
`loadfunc`.

A small layer of code was added in version 9 (the base Icon used for the
Unicon core; much has changed in Unicon since then) to load C function entry
points at runtime from dynamic shared object libraries. And `loadfunc` was
born.

Foreign functions could/can be loaded into Unicon at runtime without need of
the special builds that ``extcall`` and `callout` require. There are a lot of
`loadfunc` examples peppered throughout the Unicon Programming document set.
It opens up doors to C libraries, which are numerous and ubiquitous.

One issue with `loadfunc` is that the functions called have to comply with a
Unicon calling convention. Routines are passed an ``argc, argv`` style Unicon
frame: a count of the passed-in descriptors, and the descriptors themselves.
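
The shape of such a wrapper can be sketched in plain C. This is only an
illustration: the ``descriptor`` struct below is a simplified stand-in, not
the real definition from :file:`icall.h`, and ``gcd`` and ``gcd_wrapper`` are
hypothetical names. The status codes (0 for success, a positive runtime error
number otherwise) are assumed to mirror the convention that :file:`icall.h`
describes.

.. sourcecode:: c

    /* Simplified stand-in for a Unicon descriptor; the real one in icall.h
       has a different layout and is accessed through macros. */
    typedef enum { D_NULL, D_INTEGER } dkind;

    typedef struct {
        dkind kind;   /* which kind of value this slot holds */
        long  ival;   /* payload, valid when kind == D_INTEGER */
    } descriptor;

    /* Assumed runtime error number for "integer expected". */
    #define ERR_INTEGER_EXPECTED 101

    /* The existing C routine we actually want to reach. */
    static long gcd(long a, long b)
    {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    /* loadfunc-style entry point: argv[1..argc] hold the arguments,
       argv[0] receives the result. */
    int gcd_wrapper(int argc, descriptor *argv)
    {
        if (argc < 2)
            return ERR_INTEGER_EXPECTED;
        if (argv[1].kind != D_INTEGER || argv[2].kind != D_INTEGER)
            return ERR_INTEGER_EXPECTED;

        /* Unicon to C, make the call, then C back to Unicon. */
        long result = gcd(argv[1].ival, argv[2].ival);
        argv[0].kind = D_INTEGER;
        argv[0].ival = result;
        return 0;
    }

Every C function to be reached this way needs its own small wrapper along
these lines, which is the overhead in question.
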
These descriptors need to be manually converted to C native data, passed on
to other C routines, and then converted back to Unicon data types for
returning results. There are copious examples of managing this protocol, and
support macros in :file:`ipl/cfuncs/icall.h` that make this all pretty easy.
But, it is still an extra layer of burden placed on a Unicon programmer
aiming to use an existing C library solution to a problem, or for a speed
boost.

And now a step up.

libffi
------

``libffi`` is a foreign function interface library that manages the call
frame setup for all kinds of different calling conventions. 32 bit, 64 bit
and many different operating systems are all supported.

This layer was put to use to alleviate the need to use `loadfunc` for
many/most/all C functions that a Unicon programmer may want to call.

Once loaded (the experimental ``native(...)`` function is not built into
Unicon, so it uses `loadfunc` to bootstrap), all a Unicon programmer needs to
do is call ``native``:

.. sourcecode:: unicon

    dlHandle := addLibrary("libraryName")
    result := native("function", returnType, arguments,...)
    more := native("otherFunction", returnType, arguments,...)
    ...

And that's it. Under the covers the ``native`` function finds an entry point
(usually after a supporting call to ``addLibrary``, which takes the name of a
Dynamic Shared Object module archive (a DLL)), marshals the :t:`Unicon`
arguments for use by :t:`C`, and dispatches a call/return sequence. Results
from :t:`C` are converted to the specified Unicon ``returnType`` and passed
back to :t:`Unicon`.

Almost all of this becomes invisible to the Unicon programmer. All you need
to do is call ``native`` with a function link name and arguments. Almost all
:t:`C` native data types are supported.

And that is a wrinkle. :t:`C` call frames need to know the exact type of each
argument, and what type to return (including nothing, termed ``void``). For
many types, ``native`` can just convert to reasonable :t:`C` types.
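
To make the typing requirement concrete, here is a minimal sketch in plain C
of tagging each marshalled value with an exact C type. It is not the
``uniffi`` source: ``ctype_tag``, ``c_arg``, ``marshal_real``,
``marshal_real_as`` and ``c_arg_size`` are hypothetical names, and a real
implementation would hand the tagged values on to ``libffi`` rather than stop
here.

.. sourcecode:: c

    #include <stddef.h>

    /* Hypothetical tags for the exact C type of one argument. */
    typedef enum { TAG_INT, TAG_FLOAT, TAG_DOUBLE } ctype_tag;

    /* One marshalled argument: the tag records which union member is live. */
    typedef struct {
        ctype_tag tag;
        union { int i; float f; double d; } as;
    } c_arg;

    /* Convert a Real-like value to the requested C type. */
    c_arg marshal_real_as(double value, ctype_tag tag)
    {
        c_arg arg;
        arg.tag = tag;
        switch (tag) {
        case TAG_INT:                    /* truncates toward zero */
            arg.as.i = (int)value;
            break;
        case TAG_FLOAT:                  /* demotes, may lose precision */
            arg.as.f = (float)value;
            break;
        default:                         /* anything else travels as double */
            arg.tag = TAG_DOUBLE;
            arg.as.d = value;
            break;
        }
        return arg;
    }

    /* Default mapping: a Real travels as a C double. */
    c_arg marshal_real(double value)
    {
        return marshal_real_as(value, TAG_DOUBLE);
    }

    /* Bytes the value occupies as a C object of its tagged type. */
    size_t c_arg_size(const c_arg *arg)
    {
        switch (arg->tag) {
        case TAG_INT:   return sizeof(int);
        case TAG_FLOAT: return sizeof(float);
        default:        return sizeof(double);
        }
    }

The point is only that ``1.5`` stored as a ``float`` and ``1.5`` stored as a
``double`` are different representations, usually of different sizes, so the
call frame has to be told which one the callee expects.
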
By default, Integer goes to ``int``, Real to ``double``, String to
``char *``, etc., using the handy macros built into :file:`icall.h`.

Sometimes this is wrong. :t:`C` has more than one floating point type, most
commonly 32 bit ``float`` and 64 bit ``double`` (there is also
``long double``). There are also distinctions for 8 bit, 16 bit, 32 bit and
64 bit integers, in both signed and unsigned forms. :t:`Unicon` just has
Integer and Real. ``native`` allows for type overrides in the function call,
using two element lists.

.. sourcecode:: unicon

    result := native("function", TYPEFLOAT, [x, TYPEFLOAT], [y, TYPEFLOAT])

The real values from :t:`Unicon` are demoted to :t:`C` ``float`` data, and
the returning type is promoted from ``float`` to an acceptable :t:`Unicon`
`real` form.

These type specifications can be freely mixed:

.. sourcecode:: unicon

    result := native("mixed", TYPEINT, [x, TYPEFLOAT], [y, TYPEDOUBLE])

That assumes that ``mixed`` has a :t:`C` prototype of
``int mixed(float x, double y)`` and makes the proper arrangements for the
function call, returning an Integer result back to Unicon.

.. note::

    Please note that this experiment is at a very early stage, and some of
    the type constant names and argument lists may change before this ever
    gets accepted into Unicon proper; if it ever gets accepted.

libharu
.......

This entire exercise started with a desire to integrate PDF generation in
Unicon by leveraging ``libharu``, the PDF writer library. There are many tens
of functions in ``libharu`` and each one would have required a small
``loadfunc`` call convention wrapper, written in :t:`C`, to accommodate. That
led to an initial version of ``native()`` that took on the task of preparing
a :t:`C` call frame using inline assembler, which works, but is limited to
x86_64 System V call conventions. See `cnative` for that blurb. After
finishing a trial of `cnative`, ``libffi`` was discovered.
It does the same job and far more than `cnative`; there is a single
interface, no burden to write umpteen dozen small pieces of assembler to
support the various platforms that Unicon is currently built to run on, and
it is well supported by a team of experts in the area of foreign function
calls.

Here is what the ``libharu`` integration example looks like:

.. literalinclude:: ../programs/uniffi/haru.icn
   :language: unicon
   :start-after: ##+

.. only:: html

    .. rst-class:: rightalign

    :download:`../programs/uniffi/haru.icn`

Fairly short, and sweet. This sample barely scratches the surface of
``libharu`` features (simply drawing a partial arc, filled in red). What it
highlights is that the calls occurred with no extra :t:`C` source required.
This is where the excitement might start to build. :t:`Unicon` programmers
can focus on :t:`Unicon`, leaving :t:`C` to the :t:`C` folk.

Here is a small GnuCOBOL program that was used during testing:

.. literalinclude:: ../programs/uniffi/cobolnative.cob
   :language: cobol
   :start-after: *>+<*

.. only:: html

    .. rst-class:: rightalign

    :download:`../programs/uniffi/cobolnative.cob`

The Unicon caller:

.. literalinclude:: ../programs/uniffi/cobffi.icn
   :language: unicon
   :start-after: ##+

.. only:: html

    .. rst-class:: rightalign

    :download:`../programs/uniffi/cobffi.icn`

And a sample run:

.. command-output:: cobc -m -Wno-unfinished cobolnative.cob
   :cwd: ../programs/uniffi/

.. command-output:: unicon -s cobffi.icn -x
   :cwd: ../programs/uniffi/

``libffi`` makes calling GnuCOBOL modules from Unicon a complete breeze.

Next steps
..........

I plan on pestering Clinton and Jafar, and whoever else will listen, to help
polish this up, and hopefully get it added to the Unicon build system proper.
It currently lacks some features; not all datatypes are properly supported,
and there needs to be some deep discussion about how indirect data references
(:t:`C` pointers) should be handled (they cannot be allowed to change
immutable Unicon data, so an interstitial layer will need to be worked out).

I'd be honoured to continue this with a formal Unicon Technical Report, and
will do so if that's what it takes to advance this flag.

On the other side of the coin...

C calling Unicon
----------------

The ``unicon -C`` native compile sequence is pretty handy. It creates a
native executable by generating C source code and compiling that intermediate
into a native binary. The one point lacking is that it assumes a ``main`` is
generated from the Unicon side, and does all the linking steps assuming that
point of view.

I'd like to extend ``unicon -C`` with a new compile time option to produce
object code, ready for linking to other programs: something like
``--no-main`` or ``--object``, or ``-c`` meaning compile/don't link (but the
``-c`` idea was deemed to conflict with the current meaning of *generate
ucode*).

Initial trials for this have been proven (in a hack sort of way) by changing
the generated :t:`C` code output by ``unicon -C`` to rename *main* to
*somecode*, and then removing the link phase from the invocation of ``gcc``,
to simply generate an object file with ``gcc -c``. That code was then linked
to a GnuCOBOL test program, and :t:`Unicon` was called, data passed in,
results returned.

The hack even went as far as returning a pointer to the :t:`Unicon` `global`
variable structure that is part of native executables, but that part would
not be part of any production level release. First a shared memory space
sequence would be worked out, instead of pointers into :t:`Unicon` space
(which can be garbage collected and moved at any time, outside normal control
of a developer).

:t:`Unicon` object files (meaning ``.o`` files, not class objects) will
alleviate some of the need to resurrect ``icon_call`` to allow :t:`C`
programs to call :t:`Unicon` programs. :t:`Unicon` will then be able to take
part in all forms of mixed language programming. Shareable libraries could be
created that will allow foreign languages to enjoy direct benefit from
:t:`Unicon` language features without knowing anything about :t:`Unicon`
source code. *Though one of the goals will be to demonstrate how easy that
code is to read and write*.

The first round of experiments relied on statically linking to the
:t:`Unicon` runtime system, but another phase may provide for a
:file:`libunicon.so` that could be dynamically linked into these callable
:t:`Unicon` modules. This would make for very small, easy to manage
:t:`Unicon` application level link libraries (or singleton object files).

Continuing this experiment has been given the nod by `clint`, but there are
many details to work out, and it won't be part of :t:`Unicon` until the
entire sequence is ready at a level of quality expected by :t:`Unicon`
developers. There will be copious amounts of documentation available during
the design, development and implementation stages.

There are lots of things to discuss, and many possibilities await. You can
follow along in the SourceForge Discussion pages at

https://sourceforge.net/p/unicon/discussion/contributions/

*Have good, make well, happiest of 2017s*

.. post:: Jan 02 2017
   :tags: uniffi
   :category: extension
   :author: Brian Tiffin
   :location: on.ca
   :language: en