Unicon FFI
Well met,
Let 2017 be the year of the uniffi. Unicon Foreign Function Interface.
Ok, Unicon already has a Foreign Function Interface; loadfunc and similar C
function interfacing have been in Unicon since its inception, dating back to
at least Icon version 8.10, March of 1993. There were two C interfaces
documented for that release: outbound, callout, and inbound, icon_call.
Sadly, the inbound code in Unicon for icon_call is no longer available,
but read on for a possible future, perhaps better alternative.
The outbound interface callout is still in Unicon version 13, but requires a
special build of the entire compiler/runtime system to replace an internal
stub function called extcall, in src/runtime/extcall.r, which by
default just returns error code 216. Anyone is free to dig into this
interface, actually fairly well documented by Ralph Griswold in IPD217,
http://www2.cs.arizona.edu/icon/ftp/doc/ipd217.pdf
It’s old, and usable, but all the recent activity has been focused on
loadfunc. A small layer of code was added in version 9 (the base Icon used
for the Unicon core; much has changed in Unicon since then) to load C function
entry points at runtime, from dynamic shared object libraries. And loadfunc
was born. Foreign functions could/can be loaded into Unicon at runtime
without need of the special builds that extcall and callout require.
There are a lot of loadfunc examples peppered throughout the Unicon Programming document set. It opens up doors to C libraries, which are numerous and ubiquitous.
One issue with loadfunc is that the functions called have to comply with a
Unicon calling convention. Routines are passed an argc, argv style Unicon
frame, using a count of passed in descriptors. These descriptors need to be
manually converted to C native data, passed on to other C routines, and then
converted back to Unicon data types for returning results. There are copious
examples of managing this protocol, and support macros in
ipl/cfuncs/icall.h that make this all pretty easy. But, it is still
an extra layer of burden placed on a Unicon programmer aiming to use an
existing C library solution to a problem, or for a speed boost.
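To make the shape of that protocol a little more concrete, here is a minimal C sketch. The value type and the wrapped_add name are hypothetical stand-ins invented for this illustration; real loadfunc code works on Unicon descriptors through the macros in ipl/cfuncs/icall.h, which are not reproduced here. The sketch also assumes the convention where slot 0 of the frame receives the result and slots 1 through argc hold the arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Unicon descriptor: a type tag plus a value. */
enum tag { T_NULL, T_INTEGER, T_REAL };

struct value {
    enum tag tag;
    union { long i; double r; } v;
};

/* The plain C routine we actually want to reach. */
static long add_longs(long a, long b)
{
    return a + b;
}

/*
 * Wrapper in the argc/argv shape (assumed layout): argv[1..argc] hold the
 * arguments and argv[0] receives the result. Returns 0 on success and -1
 * to signal failure.
 */
int wrapped_add(int argc, struct value *argv)
{
    assert(argv != NULL);

    /* 1. check the count and the types of the incoming values */
    if (argc < 2)
        return -1;
    if (argv[1].tag != T_INTEGER || argv[2].tag != T_INTEGER)
        return -1;

    /* 2. convert to native C data and call the real routine */
    long sum = add_longs(argv[1].v.i, argv[2].v.i);

    /* 3. convert the C result back into the return slot */
    argv[0].tag = T_INTEGER;
    argv[0].v.i = sum;
    return 0;
}
```

Three steps of bookkeeping for one line of real work; multiply that by every function in a library and the burden becomes clear.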
And now a step up.
libffi
libffi is a foreign function interface library that manages the call
frame setup for all kinds of different calling conventions. 32bit, 64bit and
many different operating systems are all supported. This layer was put to use
to alleviate the need to write loadfunc wrappers for many/most/all C functions
that a Unicon programmer may want to call. Once loaded (the experimental
native(...) function is not built into Unicon, so it uses loadfunc to
bootstrap), all a Unicon programmer needs to do is call native:
dlHandle := addLibrary("libraryName")
result := native("function", returnType, arguments,...)
more := native("otherFunction", returnType, arguments,...)
...
And that’s it. Under the covers the native function finds an entry point
(usually after a supporting call to addLibrary, which takes the name of a
Dynamic Shared Object module archive (a DLL)), marshals the Unicon
arguments for use by C, and dispatches a call/return sequence.
Results from C are converted to the specified Unicon returnType and
passed back to Unicon. Almost all of this becomes invisible to the Unicon
programmer. All you need to do is call native with a function link name
and arguments. Almost all C native data types are supported.
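A rough sketch of the find, marshal and dispatch sequence, in plain C. This is not the uniffi source: the static symbols table is a stand-in for the search that dlsym performs over loaded libraries, and add2, half, find_entry and the two call_ helpers are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic function pointer; it must be cast back to the real prototype
   before it is called. */
typedef void (*generic_fn)(void);

/* Two hypothetical C routines standing in for functions in a loaded library. */
static int add2(int a, int b) { return a + b; }
static double half(double x) { return x / 2.0; }

/* A static table standing in for the lookup that dlsym would perform. */
struct symbol { const char *name; generic_fn entry; };

static const struct symbol symbols[] = {
    { "add2", (generic_fn)add2 },
    { "half", (generic_fn)half },
};

/* Step 1: find an entry point by link name, NULL when unknown. */
generic_fn find_entry(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof symbols / sizeof symbols[0]; i++) {
        if (strcmp(symbols[i].name, name) == 0)
            return symbols[i].entry;
    }
    return NULL;
}

/* Steps 2 and 3 for one fixed shape, int f(int, int): cast and call. */
int call_int_int_int(generic_fn fn, int a, int b)
{
    assert(fn != NULL);
    return ((int (*)(int, int))fn)(a, b);
}

/* The same steps for double f(double); every new shape needs its own caller. */
double call_double_double(generic_fn fn, double x)
{
    assert(fn != NULL);
    return ((double (*)(double))fn)(x);
}
```

Writing one caller per signature does not scale, and that is the gap libffi fills: it builds the call frame from a run time description of the argument and return types, so a single native entry point can serve any prototype.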
And that is a wrinkle. C call frames need to know the exact type of each
argument, and what type to return (including nothing, termed void). For
many types, native can just convert to reasonable C types: Integer
to int, Real to double, String to char *, etc, using the handy
macros built into icall.h. Sometimes this is wrong. C has (at least) two
commonly used floating point types, 32 bit float and 64 bit double. There are
also distinctions for 8bit, 16bit, 32bit, 64bit integers, in both signed and
unsigned forms. Unicon just has Integer and Real.
native allows for type overrides in the function call, using two element
lists:
result := native("function", TYPEFLOAT, [x, TYPEFLOAT], [y, TYPEFLOAT])
The real values from Unicon are demoted to C float data, and the
returning value is promoted from float to an acceptable Unicon Real
number.
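The demotion and promotion can be sketched in a few lines of C. The enum values, the carg structure and the marshal_real and promote_real names are hypothetical, chosen only to echo the constants used in the Unicon calls; they are not taken from natives.inc or the uniffi source. A Unicon Real is represented here by a C double.

```c
#include <assert.h>

/* Hypothetical type tags, named after the constants used in the Unicon calls. */
enum ctype { TYPEDOUBLE, TYPEFLOAT };

/* Storage for one marshalled argument or return value. */
struct carg {
    enum ctype type;
    union { float f; double d; } u;
};

/* Marshal a Unicon Real (held here as a C double) into the requested C type. */
struct carg marshal_real(double real, enum ctype want)
{
    struct carg out;

    assert(want == TYPEDOUBLE || want == TYPEFLOAT);
    out.type = want;
    if (want == TYPEFLOAT)
        out.u.f = (float)real;   /* demotion: may lose precision */
    else
        out.u.d = real;          /* already the right width */
    return out;
}

/* Promote a marshalled value back to double, as done for a float return. */
double promote_real(struct carg in)
{
    return in.type == TYPEFLOAT ? (double)in.u.f : in.u.d;
}
```

Values such as 200.0 survive the round trip through float exactly, while 0.1 does not, so the TYPEFLOAT override is best kept for functions whose prototypes really do take or return float.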
These type specifications can be freely mixed:
result := native("mixed", TYPEINT, [x, TYPEFLOAT], [y, TYPEDOUBLE])
That assumes that mixed has a C prototype of int mixed(float x, double y)
and makes the proper arrangements for the function call, returning an
Integer result back to Unicon.
Note
Please note that this experiment is at a very early stage, and some of the type constant names, and argument lists may change before this ever gets accepted into Unicon proper; if it ever gets accepted.
libharu
This entire exercise started with a desire to integrate PDF generation in
Unicon by leveraging libharu, the PDF writer library. There are many tens
of functions in libharu and each one would have required a small
loadfunc call convention wrapper, written in C, to accommodate. That
led to an initial version of native() that took on the task of preparing a
C call frame using inline assembler, which works, but is limited to
x86_64 System V call conventions. See C Native for that blurb.
After finishing a trial of C Native, libffi was discovered. It does the
same job and far more than C Native; there is a single interface, no burden to
write umpteen dozen small pieces of assembler to support the various platforms
that Unicon is currently built to run on, and it is well supported by a team of
experts in the area of foreign function calls.
Here is what the libharu integration example looks like:
#
# haru.icn, demonstrate a newer C FFI
#
$include "natives.inc"
$define HPDF_COMP_ALL 15
$define HPDF_PAGE_MODE_USE_OUTLINE 1
$define HPDF_PAGE_SIZE_LETTER 0
$define HPDF_PAGE_PORTRAIT 0
procedure main()
    local dlHandle, pdf, page1, rc, savefile := "harutest.pdf"
    # will be RTLD_LAZY | RTLD_GLOBAL (so add to the search path)
    addLibrary := loadfunc("./uniffi.so", "addLibrary")
    # allow arbitrary C functions, marshalled by libffi
    native := loadfunc("./uniffi.so", "ffi")
    # add libhpdf to the dlsym search path, the handle is irrelevant
    dlHandle := addLibrary("libhpdf.so")
    pdf := native("HPDF_New", TYPESTAR, 0, 0)
    rc := native("HPDF_SetCompressionMode", TYPEINT, pdf, HPDF_COMP_ALL)
    rc := native("HPDF_SetPageMode", TYPEINT, pdf,
                 HPDF_PAGE_MODE_USE_OUTLINE)
$ifdef PROTECTED
    rc := native("HPDF_SetPassword", TYPEINT, pdf, "owner", "user")
    savefile := "harutest-pass.pdf"
$endif
    page1 := native("HPDF_AddPage", TYPESTAR, pdf)
    rc := native("HPDF_Page_SetHeight", TYPEINT, page1,
                 [220.0, TYPEFLOAT]);
    rc := native("HPDF_Page_SetWidth", TYPEINT, page1,
                 [200.0, TYPEFLOAT]);
    #/* A part of libharu pie chart sample, Red*/
    rc := native("HPDF_Page_SetRGBFill", TYPEINT, page1,
                 [1.0, TYPEFLOAT], [0.0, TYPEFLOAT], [0.0, TYPEFLOAT]);
    rc := native("HPDF_Page_MoveTo", TYPEINT, page1,
                 [100.0, TYPEFLOAT], [100.0, TYPEFLOAT]);
    rc := native("HPDF_Page_LineTo", TYPEINT, page1,
                 [100.0, TYPEFLOAT],[180.0, TYPEFLOAT]);
    rc := native("HPDF_Page_Arc", TYPEINT, page1,
                 [100.0, TYPEFLOAT], [100.0, TYPEFLOAT],
                 [80.0, TYPEFLOAT], [0.0, TYPEFLOAT],
                 [360 * 0.45, TYPEFLOAT]);
    #pos := native("HPDF_Page_GetCurrentPos (page);
    rc := native("HPDF_Page_LineTo", TYPEINT, page1,
                 [100.0, TYPEFLOAT], [100.0, TYPEFLOAT]);
    rc := native("HPDF_Page_Fill", TYPEINT, page1);
    rc := native("HPDF_SaveToFile", TYPEINT, pdf, savefile);
    native("HPDF_Free", TYPEVOID, pdf);
end
Fairly short, and sweet.
This sample barely scratches the surface of libharu features (simply
drawing a partial arc, filled in red). What it highlights is that the calls
occurred with no extra C source required.
This is where the excitement might start to build. Unicon programmers can focus on Unicon, leaving C to the C folk.
Here is a small GnuCOBOL program that was used during testing:
      *>
      *> Demonstrate Unicon native call of COBOL modules
      *>
       identification division.
       program-id. cobolnative.
       data division.
       working-storage section.
       linkage section.
       01 one usage binary-long.
       01 two usage binary-long.
       procedure division using by value one two.
           display "GnuCOBOL got " one ", " two
           compute return-code = one + two
           goback.
       end program cobolnative.
../programs/uniffi/cobolnative.cob
The Unicon caller:
#
# cobffi.icn, test calling COBOL without wrapper with libffi
#
$include "natives.inc"
procedure main()
    # will be RTLD_LAZY | RTLD_GLOBAL (so add to the search path)
    addLibrary := loadfunc("./uniffi.so", "addLibrary")
    # allow arbitrary C functions, marshalled by libffi
    native := loadfunc("./uniffi.so", "ffi")
    # add the testing functions to the dlsym search path,
    # the handle is somewhat irrelevant, but won't be soonish
    dlHandle := addLibrary("./cobolnative.so")
    # initialize GnuCOBOL
    native("cob_init", TYPEVOID)
    # pass two integers, get back a sum
    ans := native("cobolnative", TYPEINT, 40, 2)
    write("Unicon: called sample and got ", ans)
    # rundown the libcob runtime
    native("cob_tidy", TYPEVOID)
end
And a sample run:
prompt$ cobc -m -Wno-unfinished cobolnative.cob
prompt$ unicon -s cobffi.icn -x
GnuCOBOL got +0000000040, +0000000002
Unicon: called sample and got 42
libffi makes calling GnuCOBOL modules from Unicon a complete breeze.
Next steps
I plan on pestering Clinton and Jafar, and whoever else will listen, to help polish this up, and hopefully get it added to the Unicon build system proper. It currently lacks some features; not all datatypes are properly supported and there needs to be some deep discussion about how indirect data references (C pointers) should be handled (they cannot be allowed to change immutable Unicon data, so an interstitial layer will need to be worked out).
I’d be honoured to continue this with a formal Unicon Technical Report, and will do so if that’s what it takes to advance this flag.
On the other side of the coin...
C calling Unicon
The unicon -C native compile sequence is pretty handy. It creates a
native executable by generating C source code and compiling that intermediate
into a native binary. The one point lacking is that it assumes a main is
generated from the Unicon side, and does all the linking steps assuming that
point of view. I’d like to extend unicon -C with a new compile time
option (something like --no-main or --object; the idea of -c, meaning
compile/don’t link, was deemed to conflict with its current meaning of
generate ucode) to produce object code, ready for linking to other programs.
Initial trials for this have been proven (in a hack sort of way) by changing
the generated C code output by unicon -C to rename main to somecode, and then
removing the link phase from the gcc invocation that is used, to simply
generate an object file with gcc -c. That code was then linked to a GnuCOBOL
test program, and Unicon was called, data passed in, results returned.
The hack even went as far as returning a pointer to the Unicon global variable structure that is part of native executables, but that part would not be part of any production level release. First a shared memory space sequence would be worked out, instead of pointers into Unicon space (which can be garbage collected and moved at any time, outside normal control of a developer).
Unicon object files (meaning .o files, not class objects) will
alleviate some of the need to resurrect icon_call to allow C programs
to call Unicon programs. Unicon will then be able to take part in
all forms of mixed language programming. Shareable libraries could be created
that will allow foreign languages to enjoy direct benefit from Unicon
language features without knowing anything about Unicon source code.
Though one of the goals will be to demonstrate how easy that code is to read
and write.
The first round of experiments relied on statically linking to the Unicon runtime system, but another phase may provide for a libunicon.so that could be dynamically linked into these callable Unicon modules. This would make for very small, easy to manage Unicon application level link libraries (or singleton object files).
Continuing this experiment has been given the nod by Clinton Jeffery, but there are many details to work out, and it won’t be part of Unicon until the entire sequence is ready at a level of quality expected by Unicon developers. There will be copious amounts of documentation available during the design, development and implementation stages.
There are lots of things to discuss, and many possibilities await.
You can follow along in the SourceForge Discussion pages at
https://sourceforge.net/p/unicon/discussion/contributions/
Have good, make well, happiest of 2017s