thev.net

Invert the Inversion of Control

Introduction

The term Inversion of Control refers to handing over the control of your program to an external entity, be it the OS, a framework, or any other component in your system, often in the form of setting callback functions, which will be called by the external entity at runtime. In this tutorial, we’ll look at how inversion of control affects programming practice, and how to best deal with it.

> {-# LANGUAGE GADTs, RankNTypes #-}
> import Control.Monad.Trans.Free
> import Control.Monad.Cont
> import Control.Concurrent (MVar, newMVar, modifyMVar_, modifyMVar)
> import Control.Monad.State
> import qualified Graphics.UI.GLFW as GLFW
> import qualified Graphics.Rendering.OpenGL as GL
> import Graphics.Rendering.OpenGL (($=))
> import qualified Graphics.UI.GLUT as GLUT
> import Data.IORef
> import Data.Either
> import Prelude hiding (lines)

The content of this page can be copied and pasted into a literal haskell (.lhs) file and loaded into GHCi, but you need to write your own main function in order to compile and run.

A graphics application

In 2007, I gave a sample program of using the GLFW OpenGL library when I wrote its Haskell binding. This program allows a user to draw lines by clicking and moving the mouse while holding the mouse button. Lines are represented as a list of points, where each consecutive pair represents a single line.

> type Lines = [(GL.GLint, GL.GLint)]
> 
> drawLines :: Lines -> IO ()
> drawLines lines = do
>   GL.clear [GL.ColorBuffer]
>   GL.color $ GL.Color3 1 0 (0::GL.GLdouble)
>   GL.renderPrimitive GL.Lines $ mapM_ 
>     (\ (x, y) -> GL.vertex (GL.Vertex3 (fromIntegral x) 
>                                        (fromIntegral y) 
>                                        (0::GL.GLdouble))) lines

The program handles user interaction by setting callback functions to the GLFW framework. Its application logic can be described informally as:

  1. Wait for a button press event, and then add a line segment to the list of lines, with both the start and end points set to the current mouse position;
  2. While mouse is moving, update the end point of the line just drawn with the current mouse position;
  3. Wait for a button release event, and then stop updating the lines already drawn.

Despite a very procedural nature, the above steps cannot be directly coded into programs. This is because a graphics application has to handle many other things besides application logic, just to name a few:

  • When the application window is minimized and then restored, the screen has to be re-drawn;
  • When the user wants to close the application window, the application has to free up resources and quit.

Often we say such applications are event driven or reactive, because they must respond to user input as well as system events. Further more they must also be rewritten in way set by the conventions of the graphics framework we use, GLFW in this case mandates an inversion of control. The following is the actual code that implements our application logic:

> waitForPress lines dirty = do
>     GLFW.mousePosCallback    $= \_   -> return ()
>     GLFW.mouseButtonCallback $= \b s -> 
>         when (b == GLFW.ButtonLeft && s == GLFW.Press) $ do
>             GL.Position x y <- GL.get GLFW.mousePos
>             modifyIORef lines (((x,y):) . ((x,y):))
>             waitForRelease lines dirty
> 
> waitForRelease lines dirty = do 
>     GLFW.mousePosCallback $= \(GL.Position x y) -> do
>         modifyIORef lines (((x,y):) . tail)
>         writeIORef dirty True
>     GLFW.mouseButtonCallback $= \b s ->
>         when (b == GLFW.ButtonLeft && s == GLFW.Release) $
>           waitForPress lines dirty

It is easy to see that the above code implements a 2-state automata: waitForPress transits to waitForRelease when we detect a mouse press event, and waitForRelease transits back to waitForPress when we detect a mouse release event. In the waitForRlease state, we also have to setup mousePosCallback to in order to move the end point of the line. Some extra effort went into marking a dirty bit so that we know when the screen needs to be be re-drawn, and when it needs not to.

After setting up initial callbacks, th main body of the applications runs in a loop that repeatedly waits for next events, draw screen when the dirty bit is set, and quits when window is closed:

> linesApp = do
>   -- three state variables
>   lines <- newIORef []
>   dirty <- newIORef True
>   quit  <- newIORef False
>   -- refresh and close callback will set the dirty and quit state.
>   GLFW.windowSizeCallback    $= set2DViewport
>   GLFW.windowRefreshCallback $= writeIORef dirty True
>   GLFW.windowCloseCallback   $= (writeIORef quit True >> return True)
>   -- by default start with waitForPress
>   waitForPress lines dirty
>   let loop = do
>         -- wait for next user or system event
>         GLFW.waitEvents
>         -- redraw screen if dirty
>         d <- readIORef dirty
>         when d $ readIORef lines >>= drawLines >> GLFW.swapBuffers
>         writeIORef dirty False
>         -- check if we need to quit the loop
>         q <- readIORef quit
>         unless q loop
>   loop
> 
> set2DViewport size@(GL.Size w h) = do
>   GL.viewport   $= (GL.Position 0 0, size)
>   GL.matrixMode $= GL.Projection
>   GL.loadIdentity
>   GL.ortho2D 0 (realToFrac w) (realToFrac h) 0

Many event driven programs are written this way. They implement an automata that in response to user inputs makes transition from one state to another, sets up callbacks accordingly, and in the mean time, updates global states (the lines and dirty IORefs in our case). The main program runs in a loop: receive user input, draw the screen, and repeat.

Reinversion of Control

It all worked. So what is the problem?

The first problem is that inversion of control is a b*tch to program with. Turning complex program logics into a big interwoven automata is not only error prone, but also difficult to debug and hard to understand. Luckily, fearless functional programmers have more than one magic tricks at hand, we can actually invert the inversion of control back into a normal control flow by using continuations.

Almost a year ago, Dan “sigfpe” Piponi gave an example of an animation program written using GLUT, where he used continuation transformer ContT to turn the callback riddled program back into a normal one. He called it “reinversion of control”.

GLUT is actually a more rigid framework than GLFW. It insists everything must be done in callbacks, including screen rendering. But with continuation, it is easy to turn everything up-side-down, which means we can still write very procedural looking program in a beloved monad, while counting on CPS (Continuation Pass Style) transformations to turn it into an event-driven one. Here is the animation function that Dan gave:

> imperative :: IO ()
> imperative = flip runContT return $ do
>   -- this monad is of type ContT () IO ()
>   forever $ do
>     forM_ [-1, -0.992 .. 1.0] $ \y -> do
>       render $ display y
>       yield
>     liftIO $ print "Bounce!"
>     forM_ [-1, -0.992 .. 1.0] $ \y -> do
>       render $ display (-y)
>       yield
>     liftIO $ print "Bounce!"
>     yield
>   where
>     yield :: ContT () IO ()
>     yield = ContT $ \f -> GLUT.idleCallback $= Just (f ())
>     render f = liftIO $ GLUT.displayCallback $= f
> 
> display :: GL.GLdouble -> IO ()
> display y = do
>   GL.clear [GL.ColorBuffer]
>   GL.renderPrimitive GL.LineStrip $ do
>     GL.vertex (GL.Vertex2 (-1) (-y))
>     GL.vertex (GL.Vertex2 1 y)
>     GLUT.swapBuffers
>     GLUT.postRedisplay Nothing 

The function imperative repeatedly renders an animation based on the changing value of y, and everytime it finishes rendering a single frame, it calls yield. The actual trick is in yield, where the rest of program is passed as to a continuation function in its argument f, and this continuation function does nothing but setting the idleCallback to call f. This means the evaluation of imperative will actually return after it encounters the first yield. It is only when the system actually calls the idle callback that the rest of program after yield (but within the continuation monad) would resume, until it hits the next yield.

A very neat trick indeed. Continuations do wonders, and Lisp folks have certainly enjoyed them for a long time. Giving the uprise of asynchronous programming for Web applications (the latest being nodejs, which does all I/O asynchronously), techniques such as continuations are being re-discovered. But I digress.

Handling User Input

The story would have ended here if we were only doing an animation program, but event driven applications have more than idle events to worry about. In a response to Dan’s blog, Anders Leino gave a gist on handling user inputs in a similar fashion to yield:

> imperative' = flip runContT return $ do
>   forever $ do
>     -- ...
>     liftIO $ print "yielding for input now"
>     (key, _, _, _) <- yieldInput
>     liftIO $ putStrLn $ "user pressed: " ++ show key
>     -- ...
> 
> yieldInput = do
>   ContT $ \k -> do
>     GLUT.keyboardMouseCallback $= Just (
>       \key state mod pos -> do
>         GLUT.keyboardMouseCallback $= Nothing -- unhook
>         k (key, state, mod, pos)
>         )

So in the middle of the animation loop, it will actually pause to wait for a key event from the user. With these techniques, we can now try to rewrite our line drawing application using ContT. Here is what it might look like under GLFW:

> interaction lines dirty = forever $ do
>    watch buttonPress
>    (GL.Position x y) <- liftIO $ GL.get GLFW.mousePos
>    liftIO $ modifyIORef lines (((x,y):) . ((x,y):)) 
>    liftIO $ writeIORef dirty True
>    repeatUntil buttonRelease $ do
>      (GL.Position x y) <- liftIO $ GL.get GLFW.mousePos
>      liftIO $ modifyIORef lines (((x,y):) . tail) 
>      liftIO $ writeIORef dirty True

The watch function is like yieldInput, except that it takes an extra predicate function, and will only return when the predicate is satisfied. The repeatUntil function will repeat a sub-procedure until its predicate is no longer satisfied. Comparing this to our earlier code written in callbacks, we immediately realize how closely this implementation matches up to the informal application logic. Very readable and easy to maintain.

Escaping the IO Monad

Again, the story would have ended here if we just implement watch and yieldInput and be done about it. But looking back, we may find ourselves (at least I do) not satisfied with IORefs: they represent some kind of global states, through which we exchange information between the user interaction and the main loop. Global states and overuse of IORefs would hurt modularity because they don’t compose well, and most importantly, we all know the solution to this problem – monads.

Besides, many great libraries are already providing re-usable components in the form of monads or monad transformers. It’ll be great if a graphics application can just make use of them by composing monads or by stacking monad transformers together. So the question is: can we replace the IO in ContT () IO () with an arbitrary monad m?

It turns out that modifying the type of yield is rather difficult. We can’t just change its type like below:

-- This definition has a type error!
yield :: Monad m => ContT () m ()
yield = ContT $ \f -> GLUT.idleCallback $= Just (f ())

Since we want yield to have a more general type Monad m => ContT () m (), the variable f needs to be of type () -> m (), but idleCallback can only be set to an IO function:

type IdleCallback = IO ()
idleCallback :: SettableStateVar (Maybe IdleCallback)

So immediately we have a type error in yield. Can we somehow map m () it into an IO () type? There would be some hope if m is not just any monad, but a transformer over IO. So how about giving yield the following type:

yield :: MonadTrans t => ContT () (t IO) ()

But even this is difficult. Michael Snoyman explained how to use the monad-control library to tackle a similar problem about a year ago. Here is a snippet of what he wrote:

type RunInBase m base = forall b. m b -> base (m b)

class MonadIO m => MonadControlIO m where
  liftControlIO :: (RunInBase m IO -> IO a) -> m a

… where base is the underlying monad, and m is a stack build on top of it. RunInBase is a function that takes a value of the entire stack pops out that base, and puts it on the outside.

This actually looks very similar to what we wanted. If m is of MonadIO class, and base is IO, then we can effectively turn any MonadIO instances into an IO monad and back with the help of liftControlIO!

If this sounds too good to be true, you may start to wonder where is the catch. Michael further explained the cautions of using monad-control towards the end of his article. Edward Z. Yang later criticised the tricky semantics of monad-control, and it raised quite a bit of discussions. I won’t go into the details here, but it’s fair to say that monad-control makes no guarantee that all side effects of a monad stack will be kept except for its IO effect, as stated in its package documentation.

But even without this kind of tricky business to watch out for, liftControlIO is unsuited to setting callbacks in an arbitrary monad for a more obvious reason: the callback function may not be invoked immediately, so any side effect of running it will not be carried through at setting the callback! Wikipedia delivers the following punch line about inversion of control:

“The Hollywood Principle: Don’t call us, we’ll call you”

So we are kind of stuck. We were trying to get around the types of callback functions, but we cannot. What if we take a step back, can we get around the callback business altogether?

Mixing Events and Threads

Truth be told, callbacks are not the only way to program event-driven applications, we can use a concurrency monad for them. Threads can be forked to listen to event channels, and we only need to setup callback functions to deliver events to these channels. For this purpose GHC already gives us light-weight threads, but since we want to escape the IO monad, we’ll roll our own concurrency monad with co-operative threads (also called co-routines), which is not a difficult job as demonstrated by Koen Claessen more than a decade ago. The following code is loosely based on the paper Combining Events And Threads For Scalable Network Services, by Peng Li and Steve Zdancewic, in PLDI, 2007.

> type TaskT e m a = ContT (Trace m e) m a
> 
> exit    :: Monad m => TaskT e m ()
> yield   :: Monad m => TaskT e m ()
> fork    :: Monad m => TaskT e m a -> TaskT e m ()
> watch   :: Monad m => (e -> Maybe a) -> TaskT e m a
> signal  :: Monad m => e -> TaskT e m ()

The type TaskT e m is a monad over event type e and base monad m. It is effectively equivalent to a ContT over monad m whose continuation returns a Trace type (defined below). The operators for the concurrent task monad are: exit the entire monad, yield to other tasks, fork a new task, watch for an event before continuing, signal an event to all tasks that are on watch.

> data Trace m e where
>   EXIT   :: Trace m e
>   RET    :: Trace m e                                           
>   YIELD  :: m (Trace m e) -> Trace m e                         
>   FORK   :: m (Trace m e) -> m (Trace m e) -> Trace m e        
>   WATCH  :: (e -> Maybe v) -> (v -> m (Trace m e)) -> Trace m e
>   SIGNAL :: e -> m (Trace m e) -> Trace m e                    
> 
> runTrace :: Monad m => m (Trace m e) -> m ()
> runTrace prog = loop [prog] []
>   where
>     -- loop runs a simple round robin scheduler.
>     loop [] _ = return ()
>     loop (m:ms) ss = m >>= run
>       where
>         run EXIT         = return ()
>         run RET          = loop ms ss
>         run (YIELD t)    = loop (ms ++ [t]) ss
>         run (FORK t1 t2) = loop (t1:t2:ms) ss
>         run (WATCH f g)  = loop ms (WATCH f g : ss)
>         run (SIGNAL e t) = loop (ms' ++ [t] ++ ms) ss'
>           where (ms', ss') = partitionEithers evs
>                 evs = [ maybe (Right x) (Left . g) (f e) 
>                       | x@(WATCH f g) <- ss ]

The Trace type represents the control flow of executing a task monad. Its constructors corresponds to task operators. Function runTrace execute a task trace with a simple round-robin scheduler. Its helper function loop takes two arguments, an active task queue that are run in FIFO order, and an suspended task queue which will be resumed upon next event.

We can translate a task monad to its trace with little effort.

> runTask :: Monad m => TaskT e m a -> m ()
> runTask = runTrace . taskToTrace
> 
> taskToTrace :: Monad m => TaskT e m a -> m (Trace m e)
> taskToTrace = (`runContT` (\_ -> return RET))
> 
> exit     = ContT $ \_ -> return EXIT 
> yield    = ContT $ return . YIELD . ($())
> fork p   = ContT $ return . FORK (taskToTrace p) . ($())
> watch f  = ContT $ return . WATCH f 
> signal e = ContT $ return . SIGNAL e . ($())

Function taskToTrace converts a task monad to its trace by running it with a continuation function that returns RET. The other task operators are coded using continuation functions that return their respective Trace counterparts. Function runTask excutes a task monad in its base monad.

In less than 50 lines of code, we have implemented TaskT, a monad transformer that can be used to program event-driven applications with co-operative threads.

EDIT: As later pointed out by Gabriel “tekmo” Gonzalez on reddit, the only thing that the continuation transfomer does here is to get a syntax tree represented by Trace. Alternatively, we could have used a more direct approach by defining TaskT as a free monad transformer FreeT, or simply adding a Pure constructor to Trace and make it an instance of Monad. Interested readers may give these a try.

To use it with GLFW, we also need some helper functions to set up callbacks to deliver events to a task monad:

> data Event 
>   = MouseButtonEvent GLFW.MouseButton GLFW.KeyButtonState
>   | MousePosEvent GL.Position
>   | WindowRefreshEvent 
>   | WindowCloseEvent 
>   | WindowSizeEvent GL.Size
>   -- ... other event types not shown here for brevity
> 
> registerTaskCallbacks :: IO (M ())
> registerTaskCallbacks = do
>   q <- newMVar []
>   let enqueue x = modifyMVar_ q $ \l -> return (x:l)
>       dequeue = modifyMVar q $ \l -> return ([], l)
>   GLFW.windowCloseCallback   $= (enqueue WindowCloseEvent >> return True)
>   GLFW.windowSizeCallback    $= enqueue . WindowSizeEvent
>   GLFW.windowRefreshCallback $= enqueue WindowRefreshEvent
>   GLFW.mouseButtonCallback   $= \ b s -> enqueue (MouseButtonEvent b s)
>   GLFW.mousePosCallback      $= enqueue . MousePosEvent
>   -- ...
>   return (liftIO (GLFW.waitEvents >> dequeue) >>= mapM_ signal . reverse)

Function registerTaskCallbacks maintains an internal event queue to temporarily hold events received from the system. All callbacks are setup to put events into this queue. Finally it returns a task that will wait for next available events before signaling them to other tasks.

We also need a few event predicates to use with watch.

> buttonPress, buttonRelease :: Event -> Maybe GLFW.MouseButton 
> buttonPress   (MouseButtonEvent b GLFW.Press)   = Just b
> buttonPress   _                                 = Nothing
> buttonRelease (MouseButtonEvent b GLFW.Release) = Just b
> buttonRelease _                                 = Nothing
> 
> onRefresh, onClose :: Event -> Maybe ()
> onRefresh WindowRefreshEvent = Just ()
> onRefresh _                  = Nothing
> onClose   WindowCloseEvent   = Just ()
> onClose   _                  = Nothing
> onSize   (WindowSizeEvent s) = Just s
> onSize    _                  = Nothing

And a helper function that loops over a task until an event predicate is satisfied.

> repeatUntil f m = loop
>   where loop = m >> watch Just >>= maybe loop return . f

Now we are ready to program the full application of line drawing. Instead of using IORefs, we store lines and dirty in a state monad.

> data S = S { lines :: Lines, dirty :: Bool }
> type M a = TaskT Event (StateT S IO) a
> 
> getLines      = fmap lines get
> modifyLines f = modify $ \x -> x { lines = f (lines x), dirty = True }
> getDirty      = fmap dirty get
> putDirty y    = modify $ \x -> x { dirty = y }
> 
> lineTask :: IO ()
> lineTask = (`evalStateT` (S [] False)) . runTask $ do
>   -- here the monad is of type M ()
>   waitForEvents <- liftIO registerTaskCallbacks
>   fork $ forever $ watch onSize >>= liftIO . set2DViewport
>   fork $ forever $ watch onRefresh >> putDirty True
>   fork $ forever $ watch onClose >> exit
>   fork $ forever $ interaction
>   forever $ do
>     waitForEvents
>     d <- getDirty
>     when d $ getLines >>= liftIO . drawLines >> liftIO GLFW.swapBuffers
>     putDirty False
>     yield              -- give other tasks chance to run
>   where
>     interaction = do
>       watch buttonPress
>       (GL.Position x y) <- liftIO $ GL.get GLFW.mousePos
>       modifyLines (((x,y):) . ((x,y):)) 
>       repeatUntil buttonRelease $ do
>         (GL.Position x y) <- liftIO $ GL.get GLFW.mousePos
>         modifyLines (((x,y):) . tail) 

The main program lineTask runs the task/state monad combo with an initial state. What the monad does can be summarized as: as:

  1. Fork tasks to watch for window size, refresh and close events, and do the right thing at handling them.
  2. Fork a task to handle user interaction, where we implement our application logic.
  3. Enter a main loop that will wait until all events are delivered, then draw the screen if the dirty bit is set, and repeat.

The interaction task is basically the same as we looked at before except for the state monad operations. We have successfully turned a callback and automata based program into a concurrent program with task monad, and at the same time mitigated the IO restriction to allow a more flexible composition of monad transformers.

Conclusion

Inversion of control is not necessarily a bad thing, and in many occasions, it cannot be avoided. With the right set of tools, functional programmers not only want to get the job done, but are also concerned with the level of abstraction, modularity and re-usability of the code they write. For event-driven applications, a hybrid model of event and thread gets us the best of both worlds: smooth data flow and natural control flow. And with co-operative threads, we even get to use our favorite monads!

I’ve put up the new code in this tutorial as two separate packages on HackageDB: monad-task and GLFW-task. The monad-task package introduces TaskT as a new type instead of just a type synonym to ContT, as well as a class called MonadTask that in the same vein as MonadIO encompasses the set of task operators. The GLFW-task package includes a set of utility functions to work with TaskT in GLFW, as well as the source code for the improved line drawing application discussed here.

As an epilogue, it must be pointed out that making TaskT work with GLUT is not as easy as with GLFW, because screen rendering in GLUT must be done in a callback instead of the main loop. This requires an extra layer of ContT either on top of or under TaskT. I’ll leave it as an exercise for daring souls to adapt Dan Piponi’s animation program to TaskT, and maybe even to write a GLUT-task package.

Comments powered by Disqus