-
Calling a Haskell function in R - a float expansion example
2016-08-11
Source(latest update : 2016-08-23 13:32:08)
In the previous article, I wrote a R function returning the binary expansion of a real number \(u \in [0,1]\). In the present article, I will:
- write a similar function in Haskell;
- write this function in a way compatible with R, inside a module;
- compile this module in a dynamic linker suitable for R (
dll
with Windows,so
with Linux); - call the function from R through the dynamic linker.
The creation of a Haskell function compatible with R is allowed by the Foreign Function Interface (FFI), in other words the
Foreign
module.I learnt how to do such things with the help of this blog post by Neil Mitchell.
Binary (and more) expansion in Haskell
Let’s go to Haskell. The
floatExpansion
function below is obtained by a small modification of thefloatToDigits
function of theNumeric
package. It returns the expansion of a real number \(u \in [0,1]\) in a given integer base.import Numeric let floatExpansion :: RealFloat a => Integer -> a -> [Int]; floatExpansion base u = replicate (- snd expansion) 0 ++ fst expansion where expansion = floatToDigits base u floatExpansion 2 0.125 ## [0,0,1]
First dynamic linker: string output
Firstly, I show how to make this function compatible with R when its output is a string instead of a list. It is easy to convert a list to a string in Haskell:
show [0, 0, 1] ## "[0,0,1]"
To get the output as a vector in R, more work is needed, and I will do it in the next section.
Make the function compatible with R
To make the function compatible with R, there are two rules:
- Every argument must be a pointer (
Ptr
) to a C compatible type:CInt
,CDouble
orCString
. - The result must be
IO ()
.
A value of type
Ptr
represents a pointer to an object. This type is provided by theForeign.Ptr
module, which is imported via theForeign
module. The typesCInt
,CDouble
andCString
are provided by theForeign.C
module.We end up with this module:
-- FloatExpansionString.hs {-# LANGUAGE ForeignFunctionInterface #-} module FloatExpansion where import Foreign import Foreign.C import Numeric foreign export ccall floatExpansionR :: Ptr CInt -> Ptr CDouble -> Ptr CString -> IO () floatExpansionR :: Ptr CInt -> Ptr CDouble -> Ptr CString -> IO () floatExpansionR base u result = do base <- peek base u <- peek u expansion <- newCString $ show $ floatExpansion (toInteger base) u poke result $ expansion floatExpansion :: RealFloat a => Integer -> a -> [Int] floatExpansion base u = replicate (- snd expansion) 0 ++ fst expansion where expansion = floatToDigits base u
Compilation
We need the following C file to do the compilation, as explained in the GHC users guide.
// StartEnd.c #include <Rts.h> void HsStart() { int argc = 1; char* argv[] = {"ghcDll", NULL}; // argv must end with NULL // Initialize Haskell runtime char** args = argv; hs_init(&argc, &args); } void HsEnd() { hs_exit(); }
Then we compile the library with the command:
ghc -shared -fPIC -dynamic -lHSrts-ghc7.10.3 FloatExpansionString.hs StartEnd.c -o FloatExpansionString.so
This creates the dynamic linker
FloatExpansionString.so
.Call in R
We firstly load the library with:
dyn.load("FloatExpansionString.so") .C("HsStart") ## list()
And we invoke the function with the help of the
.C
function, as follows:.C("floatExpansionR", base=2L, x=0.125, result="")$result ## [1] "[0,0,1]"
It works. But it would be better to have a vector as output.
Second dynamic linker: vector output
To get the output as a vector, the additional modules we need are:
Foreign.R
,Foreign.R.Types
andData.Vector.SEXP
. They are provided by theinline-r
package. The[Int]
type of the output list of thefloatExpansion
function must be converted to[Int32]
. We write a simple functionintToInt32
to help us to do the conversion. It works with the help of theData.Int
module which is imported via theForeign
module.We end up with this module:
-- FloatExpansion.hs {-# LANGUAGE ForeignFunctionInterface #-} {-# LANGUAGE DataKinds #-} module FloatExpansion where import Foreign import Foreign.C import Foreign.R (SEXP) import qualified Foreign.R.Type as R import qualified Data.Vector.SEXP as DV import Numeric foreign export ccall floatExpansionR :: Ptr CInt -> Ptr CDouble -> Ptr (SEXP s R.Int) -> IO () floatExpansionR :: Ptr CInt -> Ptr CDouble -> Ptr (SEXP s R.Int) -> IO () floatExpansionR base u result = do base <- peek base u <- peek u let expansion = map intToInt32 $ floatExpansion (toInteger base) u poke result $ DV.toSEXP $ DV.fromList expansion intToInt32 :: Int -> Int32 intToInt32 i = fromIntegral (i :: Int) :: Int32 floatExpansion :: RealFloat a => Integer -> a -> [Int] floatExpansion base u = replicate (- snd expansion) 0 ++ fst expansion where expansion = floatToDigits base u
We compile the library as before. And we load it in R as before:
dyn.load("FloatExpansion.so") .C("HsStart") ## list()
And we invoke the function with the help of the
.C
function, as follows:.C("floatExpansionR", base=2L, x=0.125, result=list(0L))$result ## [[1]] ## [1] 0 0 1
In fact, the output is a list with one element, the desired vector.
Let’s write a user-friendly function:
floatExpand <- function(x, base=2L){ .C("floatExpansionR", base=as.integer(base), x=as.double(x), result=list(integer(1)))$result[[1]] }
Let’s compare it with my R function
num2dyadic
:u <- runif(5000) system.time(lapply(u, floatExpand)) ## user system elapsed ## 0.146 0.003 0.148 system.time(lapply(u, num2dyadic)) ## user system elapsed ## 0.743 0.000 0.743
It is faster. And I have checked that the two functions always return the same results.
Moreover the “RHaskell” function allows more than the binary expansion, for example the ternary expansion:
floatExpand(1/3+1/27, base=3) ## [1] 1 0 1
Quite nice, isn’t it ?