Luminescent Dreams

Immutable Lists and Function Pointers in C

January 01, 0001

Immutable Lists API

Lists, lists, lists…

I have written a lot of Clojure code recently. With Clojure inheriting from Lisp, and List standing for List Processor, I have thought quite a lot about lists recently.

I have also written a huge amount of C code recently. I took a piece of code that I wrote five years ago, noted that I would need to do a lot of work to add the new feature, so I decided to overhaul it and rewrite it The Correct Way. In this case, that means using MySQL’s prepared statement API instead of string hackery. Trust me when I say that string hackery sucks, especially since I know about SQL injection and had to protect against that, too.

I found that I had a lot of isomorphisms in the code, so like the abstraction astronaut that I aspire to become, I started templating the code. No, not the horrific C++ templates, but something more Lispy: creating functions that I could pass around easily. On and on I went, and then I started struggling with parameters to all of these various functions that I needed in order to build up and populate my queries.

“Lists!”, I said!

“No!”, said C.

C said it loudly, so I ultimately used a list data structure that a co-worker wrote for the internal library we use. It weighs a lot and does not allow for certain forms of elegance that I wanted.

So, independently, I played with the concept quite a lot and ultimately wrote my own list library.

Ah. Yes. Another one. And here I’d sworn to never do it again.

List Representations

I have a list of data. I need to store it, pass it around, maybe even modify it. I have some options.

First… a list object. I have implemented several of them over the years, starting all the way back in my data structures classes. Singly-linked lists, doubly-linked lists, circular lists, and on and on. Then later I even did some optimizations and I took up the Standard Template Library, because using it was less painful than implementing Yet Another List.

I want a list object when I plan to work with a lot of data, and I really want the object if the list itself will change a lot. But, list objects fall outside of this because they end up so thoroughly customized and optimized. Further, they do not lend themselves well to language-level idiom.n C The rest of this article will explain what I mean by that by showing what I have tried to accomplish.

Given that, I can enumerate three different common list representations:

counted lists
terminated lists
vararg lists

You know about these. You have a terminated list every time you declare a string:

const char my_str[] = "this list has a zero character terminating it"

However, your list could look more like this:

const char *my_strings[5];
my_strings[0] = "String 0";
my_strings[1] = "String 1";
my_strings[2] = "String 2";
my_strings[3] = NULL;

In this case, the list has a NULL pointer at the end of it. If you know this, you can iterate over the list like so:

ssize_t i;
for (i = 0; my_strings[i] != NULL; i++)
    printf("[%d] %s\n", i, my_strings[i];

But, what if you want to embed a NULL in the middle of the list?

const char *name[3];
name[0] = "Savanni";
name[1] = NULL;
name[2] = "D'Gerinel";

At this point, you have to count the string. You know that the string contains three values and that some of them may contain NULL. With that in mind, your downstream code will need to understand that it may encounter NULL pointers, but you will also need to explicitely tell it how many values to look at.

You will also encounter this if you deal with binary data. Technically, all data in a computer is binary, but in this case, I mean a string of data in which each cell could contain any value, including 0.

size_t img_size = 1024;
const unsigned char *img = <1024 bytes of an image>

Now for the really big huge catch.

For one thing, I have never seen any functions for list manipulation, except sort(), in glibc.

For another, functions written for one list representation will not work for another.

Let’s table this for a short time and explore a use case near and dear to my heart.

Anonymous Lists

In Clojure, nobody bats an eyelash if I do this (assuming print-names has already been defined):

(print-names "Savanni" "D'Gerinel")

In C, I have a few options:

print_name("Savanni", "D'Gerinel");
print_name((char *[]){"Savanni", "D'Gerinel"});

char * lst[] = {"Savanni", "D'Gerinel"};
print_name(lst);

Assuming I got the code right, print_name((char *[])…) and print_name(lst) are the same. This shows an example of declaring a list inline, anonymously. I can use this trick in other cases, too, such as declaring a structure. For instance:

struct timespec rem;
nanosleep(&(const struct timespec){5, 0},
          &rem);

In this case, I anonymously declared the timespec that nanosleep needs. Sometimes declaring a data structure that exists solely to become a parameter to a function is a total pain, so being able to do this anonymously adds elegance to my code. Nobody ever taught me this, nor have I seen it in example code or tutorials anywhere. I just discovered it one day digging through some of the code at my office.

DANGER: From here on out we leave behind the realms of Type Safety. We begin to play firmly in the realm of void *, and static type-checking becomes a dream of the past. You have been warned.

Returning to print_name, though, I have a problem. I did not create a terminate list, nor did I actually tell print_name how many elements the list contains. This creates disasters in C, so I have to do one or the other. Since I want the most generalized code I can possibly create, I have opted for a counted list:

print_name(2, (int []){"Savanni", "D'Gerinel"});

The other way I want to create such a list would be like so:

print_name(2, "Savanni", "D'Gerinel");

The point is to be able to call print_name() with as many different varieties of lists as possible. The problem, though, is that the two examples here require separate implementations, one accepting a counted list, and one accepting C’s vararg system.

void print_name_cnt (ssize_t cnt, int *lst);
void print_name_var (ssize_t cnt, ...);

While this would be just fine for one function, I was doing stuff like this for many different functions, and it all became very unweildy very quickly.

Enter my solution

The solution lay not in making the functions flexible, but in flexibly constructing the lists. This is something object oriented programmers probably knew about. I needed a single coherent list implementation and functions for building that list. Along with such, I needed a library of list operations, because what is the point of having such a list abstraction unless I have functions to go along with it.

Enter, my Immutable Lists API.

Through some trial and error, I developed the basic implementation:

typedef struct im_list_s {
  size_t cnt;
  const void *data[0];
} im_list_t;

In my implementation, I assume that the size of the list Will Not Change. The “im_” in “im_list_s” stands for “immutable”. Further, the entire list gets allocated with a single malloc() operation, so I can use free() to deallocate it no matter which list operation I use to call it.

Note: I do not provide ironclad guarantees of immutability. The length of the original list cannot change, but you could always hack cnt. You could change the pointers that the list points to. Be sure you understand what you are doing in that case, because I cannot protect you.

Now, the basic construction functions:

im_list_t * im_list_create_vararg (ssize_t cnt, va_list vap);
im_list_t * im_list_create_cnt (ssize_t cnt, ...);
im_list_t * im_list_create_term (void *term, ...);

Given these three functions, I can build a list in three different ways:

im_list_create_cnt(2, "Savanni", "D'Gerinel");
im_list_create_vararg (2, vap); // in this case, vap was built by some other function accepting a vararg
im_list_create_term (NULL, "Savanni", "D'Gerinel", NULL);

Given these three, I need only a single implementation for print_name:

void print_name (im_list_t *lst);

print_name(im_list_create_cnt(2, "Savanni", "D'Gerinel"));

There’s Always A Catch

This is all well and good, but what about Memory Management?

These lists that I am creating cannot be made on the stack. I have no choice but to allocate some heap space for them because I never know, until the list gets created at run time, how much space the list will need. As always, I must deallocate any allocated heap space. If I delegate that to the calling function, though, I lose all of the elegance of inline construction. I may as well go with a heavier solution.

im_list_t *lst = im_list_create_cnt(2, "Savanni", "D'Gerinel");
print_name(lst);
free(lst);

No good.

So, in my library, I decided that lists would be allocated, but managed in static variables.

im_list_create_cnt keeps a static variable internally, builds the list into that static, and returns it. If that variable already has a value, it will deallocate the old value first. This means that you cannot be totally haphazard in when you create lists – you must be done using one list before you can create another – but it does allow you the flexibility of creating a list anonymously as I wanted to do without leaking memory.

However, sometimes I do want to keep that list, so I created im_list_dup to duplicate any list. The original list remains under the control of im_list_create_cnt, but I now have a new copy that I can safely use as I desire.

im_list_t *lst = im_list_dup(im_list_create_cnt(2, "Savanni", "D'Gerinel"));
print_name(im_list_create(2, "Jim", "Thorpe"));
print_name(lst);
free(lst);

Conclusion

I am not satisfied with such a small number of operations. These operations form the basis for a list, and they certainly provide the most interesting part of my list solution, but they are not enough to be fully useful. In my code, I have already implemented a small set of additional helpful operations, and I will add more as I use the list in real projects. This article, however, describes how we can actually accomplish a little elegance in C, extending the language without going all the way to macro hackery.

Be sure to examine all of the code in more detail. I have already included an example to show my use cases and full documentation of the code proper.

Have fun, and tell me about other awesomely elegant hacks! C could really use elegant dictionary and set data structures, giving it a nice rounding of the structures that make Clojure so powerful.

Immutable Lists and Function Parameters in C by Savanni D'Gerinel is licensed under a Creative Commons Attribution-NonCommercial-SharAlike 3.0 Unported License. You can link to it, copy it, redistribute it, and modify it, but don't sell it or the modifications and don't take my name from it.

Memoru Amelia

January 01, 0001

Memoru Amelia

Mental Health Care and Transgender Populations

January 01, 0001

**Content Warning: open discussion of mental health care, suicide, and religion**

Here’s why psychiatric hospitals are so bad at helping transgender patients

Amelia Perry, Kwama Taylor, and Luna Inanis Umbra behind a transgender pride flag overlay

Amelia Perry was family to me. Four months later I get by only by forgetting that she’s gone.

Every day, on social media, I see people telling me and others like me that being transgender is a mental illness or that we need to “get help”. Fascinatingly, many of us get a lot of help if we can afford it. Many more would get help if they had the money to afford it, but transgender people outside of tech tend to be close to destitute, so I have no idea where they are expected to “get help”.

But let us for a moment say that money is no object and that we all can get help. We still cannot get care because care for mental health has been slashed so far that we have less space for psychiatric patients than we did before the Civil War. And, worse, practitioners are not being trained to deal with us, though our needs very modest and easy to honor, at least for human beings capable of listening and respecting other human beings. Further, medical practitioners are not only able to get away with refusing to treat us for ordinary medical problems, the Republican Party under Trump is pushing to ensure that it is legal everywhere to deny us service.

The people who tell me to “get help”, don’t actually want me to get help.

They want me to shut up and die.

The Catholic Church is Morally Bankrupt

The Catholic church controls so many hopsitals that 1/6th of patients end up in a Catholic hospital.

The Church has rejected us.

...Catholic hospitals that must follow a set of ethical and religious directives set by the U.S. Conference of Catholic Bishops. Indeed, one in six medical patients in the United States receive care at a Catholic hospital. The conference last year posted an open letter rejecting the legitimacy of transgender identities and urging parents to not accept their trans kids. The pop has denounced trans-affirming approaches as ideological colonization — Summary of the stand from the Catholic Conference of Bishops

The Catholic Church has invaded or blessed the invasions of native populations on every populated continent in the world. They have conducted wholesale torture over the course of a millenia. They have hidden the ways in which priests abuse and rape parishioners. They accuse us of “ideological colonization”, but we have never murdered anyone, much less entire populations, to push our ideology. We have simply tried to protect our people from harm and live normal lives in a society in which we face violent hatred.

Here are some examples of medical malpractice against transgender people, and these are simply a small number that made it into the news:

I also know at least one transgender woman who went into the hospital with broken bones after a car crash and reports that the staff stopped treaoting her as soon as they found out she was trans. They discharged her with pain medications, and she now lives with permanent pain.

If your “right” to practice your faith ends the life of human beings you are honor-bound to serve, you have committed murder, no matter the tools you used or the legality of the situation.

You who want me to shut up and die can fuck right off.

FUCK OFF.

Remember Amelia.

Please email me if you know the original creator of the resistance fist logo and the transgender variant. I would like to provide proper credit.

Nix Development Environments

January 01, 0001

nix-shell, the command that creates a subshell after evaluating any nix expression, has a lot of uses. I found it very useful in my devops work when I had multiple environments to administer, but had to use different tools for each. The shell provides excellent help in isolating my required tools to the environments involved.

A trick, though, lay in learning how to acquire those tools when the tools were not available in the nixos channel. I figured it out, and so here is the example for one of the environments I was administering. Note that I include both Linux and Darwin builds, because I wanted to offer the nix environment to my replacement at the company.

Packer – 0.10.1
Terraform – 0.7.4
Ansible 2
Python 2.7

We were deploying in Amazon AWS. I used Packer to build the custom images that we were deploying. Autoscaling works a lot better if it has a complete image that only has to be started (the Crops, i.e., the systems that can be replaced almost instantly and thus do get replaced regularly). I love Terrafrom because I was able to describe everything I was doing in AWS using the tool. Ansible is present for those systems that get reconfigured regularly (primarily the Cattle machines, things that can be rebuilt from just the devops scripts, but that I do not want to terminate). Python 2.7 is present to support Ansible, though it is sometimes convenient to have at hand.

Neither Packer nor Terraform were available in my Nix channel, so I had to build derivations for those. The process is non-obvious until it is done. Here are my scripts for them. At the time I wrote these scripts, I was running NixOS 16.03, however I still use the same scripts after having upgraded to NixOS 16.09.

nix-deps/packer.nix

{ pkgs ? import <nixpkgs> {},
  stdenv ? pkgs.stdenv }:

let
  # suggestion from @clever of #nixos
  package =
         if stdenv.system == "x86_64-linux" then "packer_0.10.1_linux_amd64.zip"
    else if stdenv.system == "x86_64-darwin" then "packer_0.10.1_darwin_amd64.zip"
    else abort "unsupported platform";
  checksum =
         if stdenv.system == "x86_64-linux" then "7d51fc5db19d02bbf32278a8116830fae33a3f9bd4440a58d23ad7c863e92e28"
    else if stdenv.system == "x86_64-darwin" then "fac621bf1fb43f0cbbe52481c8dfda2948895ad52e022e46f00bc75c07a4f181"
    else abort "unsupported platform";
in
stdenv.mkDerivation rec {
  name = "packer-${version}";
  version = "0.10.1";

  buildCommand = ''
  mkdir -p $out/bin
  unzip $src
  mv packer $out/bin/packer
  echo Installed packer to $out/bin/packer
  '';

  src = pkgs.fetchurl {
    url = "https://releases.hashicorp.com/packer/0.10.1/${package}";
    sha256 = checksum;
    name = package;
  };

  buildInputs = [ pkgs.unzip ];
}

nix-deps/terraform.nix

{ pkgs ? import <nixpkgs> {},
  stdenv ? pkgs.stdenv }:

let
  # suggestion from @clever of #nixos
  package =
         if stdenv.system == "x86_64-linux" then "terraform_0.7.4_linux_amd64.zip"
    else if stdenv.system == "x86_64-darwin" then "terraform_0.7.4_darwin_amd64.zip"
    else abort "unsupported platform";
  checksum =
         if stdenv.system == "x86_64-linux" then "8950ab77430d0ec04dc315f0d2d0433421221357b112d44aa33ed53cbf5838f6"
    else if stdenv.system == "x86_64-darwin" then "21c8ecc161628ecab88f45eba6b5ca1fbf3eb897e8bc951b0fbac4c0ad77fb04"
    else abort "unsupported platform";
in
stdenv.mkDerivation rec {
  name = "terraform-${version}";
  version = "0.7.4";

  buildCommand = ''
  mkdir -p $out/bin
  unzip $src
  mv terraform $out/bin/terraform
  echo Installed terraform to $out/bin/terraform
  '';

  src = pkgs.fetchurl {
    url = "https://releases.hashicorp.com/terraform/0.7.4/${package}";
    sha256 = checksum;
    name = package;
  };

  buildInputs = [ pkgs.unzip ];
}

The structure of each script is relatively straightforward.

declare that pkgs and stdenv are both required, as well as how to get them if they are absent
based on the OS, declare what package I want to download and the relevant checksum
declare the name and version of the derivation
create the custom build command

In many cases, the default build commands works perfectly, but that only works for projects that have to be built with autoconfig or with Stack (and possibly some other languages). Both Terraform and Packer are binaries, and so it is necessary for me to specify the build for the derivation.

In this case, the build is simply to unzip the downloaded package (specified in $src) and copy the executable into the destination (which has a root at $out). It is vital that the executable end up in the bin/ directory. I am not sure of the mandated directory structure of a derivation, but I know that derivations that did not include the bin/ directory would fail. I assume that they failed because there was no executable to add to the path.
specify precisely how to get the source package. In this case, through the fetchurl tool.
specify additional build inputs. These have to be somewhere in the nix namespace. pkgs.unzip refers to nixpkgs.unzip in the standard channel.

Both of the files above must go in a subdirectory. I named the subdirectory nix-deps/. Some subtle interaction will cause an infinite recursion if the two files are included in the root directory of your project.

With those present, it is time to build the nix-shell command:

shell.nix

let
  pkgs = import <nixpkgs> {};
  stdenv = pkgs.stdenv;
  terraform = import nix-deps/terraform.nix {};
  packer = import nix-deps/packer.nix {};

in stdenv.mkDerivation {
  name = "v2-devops";

  buildInputs = [ pkgs.ansible2
                  terraform
                  packer
                  pkgs.python
                  pkgs.python27Packages.alembic
                  pkgs.python27Packages.boto
                  pkgs.python27Packages.psycopg2
                  pkgs.awscli
                ];
}

The only difficult part here was for me to figure out how to import my Terraform and Packer derivations. I handle that with the import nix-deps/<package>.nix {} lines. The result of each import statement is a derivation, and so it is valid to include in buildInputs.

buildInputs again just lists the packages that must be included in this derivation. So, I included all of the packages that I use directly.

Thus, from the root directory of my devops folder, I can simply run nix-shell and have exactly the version of Terraform, Packer, Ansible, and Python that I want. This also means that I can have completely different versions for a different devops repository (I was actually administering three different clouds, all with different standards). And, possibly best of all, if I could convince my co-workers to use Nix (the tool, not the operating system), they would have had a trivial way to set up their development environments, also.

Processing SqlValues in Haskell

January 01, 0001

Let us assume that, whether you are just learning Haskell for the first time or have long since mastered Haskell the language, you suddenly need to hook up to a database for the first time.

Don’t laugh. Many applications work extremely well without a database.

My applications, however, almost always have a relational database somewhere in their guts. I work with a lot of relational data. I have to learn Database.HDBC early, and many of you will, too.

You should know that I have actually practiced Haskell several times over the last six years. I probably have about one year of active development scattered across that time. This time I was finally able to comprehend some concepts that really kicked my ass in previous attempts. In that time, libraries have changed and Database.HDBC has emerged as the primary interface to SQL.

Putting data into the database is really easy. Even getting data back out is easy, up to a point. You can find basic instructions on connecting for the first time, putting data into the database, and running queries, over in Chapter 21 of Real World Haskell. My article does not cover that. Instead, it covers what to do with the data that HDBC hands back to you.

I have not written much about it, but I am building up a general Fitness Application that covers workouts of the form “x repetions, y sets”. This application is currently distinct from my Cycling application, but I may merge them some time in the future.

The application

I am slowly working my way through the 100 Pushup Challenge. I have not purchased the Android app or anything and have been keeping logs of my progress in a normal text-based journal. But, I want to do some data analysis so that I can see trend lines and whether and how quickly I am making progress.

Several different workout types all follow this data structure. Whether I am doing the 100 Pushup Challenge or the 200 Situp Challenge or any of the other challenges, all of the workouts share identical data structures. I call this a SetRepWorkout (a workout consisting of multiple sets of repetitions with rest time in between) and modelled it as such:

data WorkoutType = Pushups | Situps deriving (Show, Read, Eq)
data SetRep = CSet { reps :: Integer, rest :: Integer, next :: SetRep }
            | FSet { reps Integer }
data SetRepWorkout = SetRepWorkout { uuid :: Data.UUID.UUID,
                                     date :: Data.Time.Calendar.Day,
                                     workout_type :: WorkoutType,
                                     description :: String,
                                     sets :: SetRep }

In theory, I could have represented things like so:

data SetRep = CSet { reps :: Integer, rest :: Maybe Integer }
            | FSet { reps :: Integer }
data SetRepWorkout = SetRepWorkout { uuid :: Data.UUID.UUID,
                                     date :: Data.Time.Calendar.Day,
                                     workout_type :: WorkoutType,
                                     description :: String,
                                     sets :: [SetRep] }

In fact, you will see later that I use this representation when storing the data. Arguably, this could have worked out better, but I used the original recursive definition. Even more, I probably could have left out the rest time as not informative.

Now, given the representation, we have to work on the ever-annoying functions to move the data into and out of the database.

When I started, I investigated the Persistent database architecture. I will likely investigate it again later, but I discovered that Persistent does not handle recursive data structures. When I told Persistent to save a SetRep structure, it saved the first CSet just fine, but everything in the next field ended up serialized into the next column in the database. I find this representation unacceptable as it forces all data queries to load the data into the objects, making ad-hoc queries impossible.

Changing the representation to non-recusive would have fixed the problem, but I would not have created nearly so much fodder to explain. And, ultimately, when doing complex data modelling we will all find a case in which the ORM breaks down and we must use SQL instead.

For the final application, though, I do not rule out the possibility of changing the representation. I may even drop the rest time by dropping the SetRep structure completely and making sets a list of Integers. If I make this change, Persistent becomes more viable, but still has some problems. I will discuss those down in my Conclusions.

The really easy way

Most documentation online covers running queries and retrieving data, but it rarely focuses on retrieval beyond the basic fetchRow or fetchAllRows functions. Once you have called these functions (or fetchAllRowsAL or fetchAllRowsMap), you have SqlValues, not values relevant to your domain model. Consider this:

conn <- connectSqlite3 "example.sqlite"
stmt <- prepare conn "SELECT * FROM Workout WHERE uuid=?"
execute stmt [toSql "9d4f98cd-63eb-4fad-8d9d-070f191b72db"]
rows <- fetchAllRowsAL' stmt

This looks distressingly procedural, but I could have rearranged it into a »= pipe to make it feel less so. When we examine rows, in this case we see:

[[("uuid",SqlByteString "9d4f98cd-63eb-4fad-8d9d-070f191b72db"),
  ("day",SqlByteString "2012-04-18"),
  ("workout_type",SqlByteString "Pushups"),
  ("description",SqlByteString "Week 3, Day 3, Column 2")]]

Now, remember that I called fetchAllRowsAL'. This will fetch all of the rows (hence the outermost list that has only a single element), and return each row as an association list of (column name, value). Further, this particular function fetches all data strictly, so I don’t have to worry about laziness issues when I finalize the stmt.

So, extract just the interesting workout with

let row = head rows

Then we need to convert from [(String, SqlValue)] to Workout:

let uuid = fromJust $ fromString $ fromSql $ fromJust $ lookup "uuid" row
    day = parseTime defaultTimeLocale "%Y-%m-%d" $ fromSql $ fromJust $ lookup "day" row
    workout_type = workouttypeFromSql $ fromJust $ lookup "workout_type" row
    description = fromSql $ fromJust $ lookup "description" row
SetRepWorkout uuid day workout_type description (FSet 0)

This works, but it catches no errors. fromJust will throw an exception if it gets Nothing as an input. fromString :: String -> Maybe UUID return Nothing if it receives a string in the wrong format. So, two classes of errors:

column not present
invalid data

I feel comfortable with allowing a column not present error to crash the application. The application expects a particular structure for the database and does not get it, so I consider it a programming error. But data in the database can get corrupted, and I do not want the program to crash because of an invalid data format. I would flag the error and do something with it in a higher level of the application.

A more robust (and almost as easy) way

It turns out that a more robust implementation is almost as easy to implement, but figuring that out took me seven hours of experimenting.

Notice that HDBC contains safeFromSql :: Convertible SqlValue a => SqlValue -> ConvertResult a. This leads naturally to the definition of Convertible and to ConvertResult.

The code does not include an instance for Convertible SqlValue UUID, but such an instance would help. If you had such an instance, you could say this:

let uuid = safefromSql $ fromJust $ lookup "uuid" row

(Note: I left a fromJust. lookup will return Nothing if the “uuid” column does not exist, and I consider this a programming error. So, I actually want fromJust there to blow up the program in that event.)

In fact, with proper instances, you could replace much of the above code like this:

let uuid = safeFromSql $ fromJust $ lookup "uuid" row
    day = safeFromSql $ fromJust $ lookup "day" row
    workout_type = safeFromSql $ fromJust $ lookup "workout_type" row
    description = safeFromSql $ fromJust $ lookup "description" row

Now, each of the fields will contain Left (ConvertError ...) instead of exploding when the code encounters invalid data for that field. You cannot feed that directly into a SetRepWorkoutand unravelling would be tedious, butConvertResult bis just a type alias forEither ConvertError b`, and so it functions in the Either Monad. So the entire function above can look like this:

rowToWorkout :: [(String, SqlValue)] -> ConvertResult SetRepWorkout
rowToWorkout row = do
    uuid <- safeFromSql $ fromJust $ lookup "uuid" row
    day <- safeFromSql $ fromJust $ lookup "day" row
    workout_type <- safeFromSql $ fromJust $ lookup "workout_type" row
    description <- safeFromSql $ fromJust $ lookup "description" row
    return $ SetRepWorkout uuid day workout_type desciption (FSet 0)

To make this work, you must ensure that each of the above conversions has a Convertible a b instance.

As it turns out, HDBC already provides Convertible SqlValue Day and Convertible SqlValue String. I simply had to provide Convertible SqlValue UUID and Convertible SqlValue WorkoutType. Here I provide those implementations, complete with the compiler flags necessary to even make them possible. I put the compile flags at the top of the file, but you could put them on the command line or in your Cabal file.

{-# LANGUAGE TypeSynonymInstances, FlexibleInstances, MultiParamTypeClasses #-}

import Data.ByteString.UTF8 a BUTF8 (toString)
import Data.UUID
import Data.Convertible.Base


instance Convertible SqlValue UUID where
    safeConvert (Sqlstring a) = case fromString a of
        Just b -> Right b
        Nothing -> Left $ ConvertError (show a) "SqlValue" "UUID" "Could not parse UUID"
    safeConvert (SqlByteString a) = safeConvert $ SqlString $ BUTF8.toString a
    safeConvert a = Left $ ConvertError (show a) "SqlValue" "UUID" "No conversion available"

instance Convertible SqlValue WorkoutType where
    safeConvert (SqlString a) = case a of
        "Pushups" -> Right Pushups
        "Situps" -> Right Situps
        _ -> Left $ ConvertError (show a) "SqlValue" "WorkoutType" "Unrecognized value"
    safeConvert (SqlByteString a) = safeConvert $ SqlString $ BUTF8.toString a
    safeConvert a = Left $ ConvertError (show a) "SqlValue" "WorkoutType" "No conversion available"

Given these implementations, and the rowToWorkout function, the selectWorkoutByID function is not bad, though it could probably use some more refinement.

selectWorkoutByID :: IConnection a => UUID -> a -> IO (Either String SetRepWorkout)
selectWorkoutByID w_id conn = do
    stmt <- prepare conn "SELECT * FROM Workout WHERE uuid=?"
    execute stmt [toSql $ show w_id]
    workout <- fetchAllRowsAL' stmt >>= mapM (return . rowToWorkout)
    return $ case length workout of
        0 -> Left "No workout found"
        1 -> case (head workout) of
            Right w -> Right w
            Left err -> Left $ show err
        _ -> Left "Multiple hits for a single workout Id"

Conclusions

You will need this information. For one reason or another, you will one day encounter a case in which an ORM cannot function, and then you will need to go to SQL. The needs of the database absolutely should not dictate your domain model. Persistent annoys me because, in an effort to stick with “Don’t Repeat Yourself”, it actually forces database knowledge into the domain model, which I feel is a violation of the Onion Architecture. Ultimately, though, this is a simple application and such a violation may not be so bad.

This case, however, I used for practice and for teaching. The program is so simple that I really can explore the implications of the Onion Architecture and force in a level of architectural purity. I would like you to closely examine the code and consider for yourself whether such a setup makes sense in your application. Perhaps, when I learn TemplateHaskell, I will even develop a set of templates to describe this code and allow the compiler to figure it out for me.

More concerning to me, though, is that I am not sure how Persistent can be made to follow even First Normal Form when dealing with a composite data structure without modelling the data in some very odd ways. Now, I would accept this if I were going to a key-value store or a document database, but I’m going to a relational database and I want the option of running queries based on proper relations. In almost every relational database I build, I demand at least Second Normal Form, frequently also Third Normal Form, and possibly also some insights from even higher normal forms.

Model <-> Database code is annoying, tedious, and error-prone. I recently read an article by Patrick Loi, Dependency Injection Ceremony, and really hooked on the image of the Devil on one shoulder, the Angel on the other shoulder, and the Angel saying “You should be listening to that other guy”. Read the article, as it is hilarous and informative. It also is relevant here.

Martin Fowler has some objections to ORM Hate, and he is right in that rolling your own ORM almost never makes sense. On the other hand, the ORM always (no, really, ALWAYS) violates a good Onion Architecture because it forces you to define database representation in the middle of your domain model. Ew. Gross.

Your solution is going to vary. For now, due to the extreme pain of a project at work in which I definitely did not properly isolate the domain model from the database, I will practice quite a lot with the purest architecture I can get. After I have experienced the benefits and drawbacks of this purity, I will revisit the compromises I want to make on my projects.

I have the full set of code for this fitness project (which will slowly grow into an application) on Gitlab. If you have comments or feedback, please email them directly to me at savanni@alyra.org.

Processing SqlValues in Haskell by Savanni D'Gerinel is licensed under a Creative Commons Attribution-NonCommercial-SharAlike 3.0 Unported License. You can link to it, copy it, redistribute it, and modify it, but don't sell it or the modifications and don't take my name from it.