Wanderson Ferreira

March 21, 2021

Flexible Software - Day 1: Writing DSL

Continuing on Flexible Software: if you want to tackle difficult problems in your domain in an easy way, write your own language.

Writing your own language is a very bold statement, and many of us might be caught by surprise when I say that we are creating our own little language every day at work. The real mystery here is the definition of language: what are Domain Specific Languages (DSLs)?

Many materials present DSLs as very complex, exemplified by great DSLs like AWK, designed in the late 70s for text processing inside the dungeons of Bell Labs by bright scientists. You will quickly run into terms like parsing, lexing, abstract syntax tree, and formal grammar, and software like yacc and lex.

No! We don't have time to learn all of this at work, and yet there is nothing better than AWK for that particular domain (I had the pleasure of working with AWK in college for my Geophysics major... very impressive). We want to leverage our domain knowledge and spend our days talking and writing in terms of our domain.

"Build up the language to the vocabulary of your domain, and you won't have to think about the language any more, you will just be thinking about your problem" - Nate and Christoph, Functional Design in Clojure.

I believe this is a better definition of DSL for the work environment - instead of "build your own programming language", we have "build your own vocabulary". Let's look at some code now and see this in action.

The domain will be loans in a financial company and the Product Manager comes along with one new task for this week:

"A logged user must be able to fill a form and receive a loan offer
after she clicks on GET ME SOME MONEY. These are the steps to compute a loan offer:

1. verify that the user is eligible
2. verify that every monthly entry in the user's bank statement is greater than 10k
3. compute the loan with a 1% interest rate per year."

Great, let's do this:

(defn compute-loan-offer
  [data]
  (let [user-id (:user-id data)
        user (get-user-from-db! user-id)
        user-stmt-id (-> user :financials :bank :statement)
        bank-stmt (get-user-bank-statement! user-stmt-id)
        ;; true only if every entry is above 10k
        valid-bank-stmt? (reduce
                          (fn [valid? stmt-entry]
                            (if (> (:value stmt-entry) 10000)
                              true
                              (reduced false)))
                          false
                          bank-stmt)
        new-data (cond-> data
                   (> (:age user) 18)
                   (assoc :user-eligible true)

                   valid-bank-stmt?
                   (assoc :user-bank-stmt :valid))]
    (if (and (:user-eligible new-data)
             (:user-bank-stmt new-data))
      {:status 200
       :body {:message (get-loan-offer
                        (:desired-amount new-data)
                        (:desired-terms new-data))}}
      {:status 200
       :body {:message "Unfortunately, you are not qualified for any loan today"}})))

(The formatting of this code is not idiomatic; it was adjusted to fit the Hey.world layout.)

This should work, but now you need to know Clojure, because otherwise you don't know exactly what reduce, cond->, reduced, ->, and assoc even mean. This code is not so bad, because at least there are some helper functions communicating with the database and so on... This could be better, even without our DSL, but it is a good example to make my point, and I wish all the code I've written and read in the past were "bad" like this one.
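
As one small illustration of "could be better, even without our DSL", the reduce/reduced loop can probably be expressed with every?. This is only a sketch: the function name valid-bank-stmt? and the sample entries are assumptions for illustration. The explicit (seq ...) check is there because every? returns true for an empty collection, while the original loop returns false in that case.

```clojure
;; Sketch: same check as the reduce/reduced loop, written with every?.
;; The 10000 threshold and the :value key come from the code above;
;; the function name is an assumption for this example.
(defn valid-bank-stmt?
  [bank-stmt]
  (and (boolean (seq bank-stmt))          ; an empty statement is not valid
       (every? #(> (:value %) 10000)      ; every entry must be above 10k
               bank-stmt)))

(valid-bank-stmt? [{:value 12000} {:value 15000}])
(valid-bank-stmt? [{:value 12000} {:value 9000}])
```

The reader still needs to know every?, so this does not replace the point of the article; it only shortens the helper.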

Let's create a new solution to the same problem:

;;; data model

(defn user-id
  [data]
  (:user-id data))

(defn desired-amount
  [data]
  (:desired-amount data))

(defn desired-terms
  [data]
  (:desired-terms data))


;;; user model

(defn user-statement-id
  [user]
  (-> user :financials :bank :statement))

(defn user-age
  [user]
  (:age user))

(defn user-eligible?
  [user]
  (> (user-age user) 18))


;;; statement model

(def valid-statement-threshold 10000)

(defn statement-entry-value
  [statement-entry]
  (:value statement-entry))

(defn valid-statement-entry?
  [statement-entry]
  (> (statement-entry-value statement-entry)
     valid-statement-threshold))

(defn valid-statement?
  [statement]
  (reduce
   (fn [valid? entry]
     (if (valid-statement-entry? entry)
       true
       (reduced false)))
   false
   statement))


;;; response model

(defn respond-successfully
  [loan]
  {:status 200
   :body {:message loan}})


(defn respond-with-failure
  []
  {:status 200
   :body {:message "Unfortunately, you are not qualified for any loan today"}})


;;; public function

(defn compute-loan-offer
  [data]
  (let [user (get-user-from-db! (user-id data))
        bank-statement (get-user-bank-statement!
                        (user-statement-id user))]
    (if (and (valid-statement? bank-statement)
             (user-eligible? user))
      (respond-successfully
       (get-loan-offer (desired-amount data)
                       (desired-terms data)))
      (respond-with-failure))))


Looks like a lot more code to do the same thing, but my point is about the flexibility and clarity here:

  1. You know beforehand that you will deal with 4 different data models
  2. Access functions in data models show you what fields are required
  3. You know data shapes
  4. If the shape of the data changes, you only change the access functions
  5. If more data comes along, you don't care
  6. If a new requirement deals with the same models but with a different shape, you promptly raise your hand: there is a problem in the air
  7. Public functions have a nice little vocabulary that requires only English skills to be read and understood.
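
To make point 4 concrete, here is a minimal sketch of a shape change. The new shape (an :accounts vector with :kind and :statement-id keys) is entirely hypothetical; the idea is only that the accessor absorbs the change while callers such as the public function keep calling user-statement-id as before.

```clojure
;; Hypothetical new shape: bank data moved into a vector of accounts.
;; None of these keys come from the original code.
(def example-user
  {:age 30
   :accounts [{:kind :card :statement-id "stmt-9"}
              {:kind :bank :statement-id "stmt-1"}]})

;; Before the change this accessor was:
;;   (-> user :financials :bank :statement)
;; After the change, only its body is rewritten.
(defn user-statement-id
  [user]
  (->> (:accounts user)
       (filter #(= :bank (:kind %)))   ; keep only bank accounts
       first
       :statement-id))

;; Untouched by the shape change.
(defn user-age [user] (:age user))
(defn user-eligible? [user] (> (user-age user) 18))

(user-statement-id example-user)
```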

Also notice something interesting about the first solution: we added the :user-eligible and :user-bank-stmt flags to the data we received from the request, while in the second version these questions are asked directly of the user and bank-statement models. This could trigger a nice discussion: who is responsible for this information? Which model? How does the business think about this? From a computational perspective it does not matter, but it might matter a lot for future improvements and, mainly, for the team to keep improving its understanding of the business domain.

This sometimes seems like redundant work, but in the long run the payoff is great. However, as programmers we tend to "generalize" too much in the name of "code re-use" or whatever. The following question might already be on your mind: "Nice, can I use these models in the whole application?"

I doubt it, because a user might not be the same for everyone. A sales team might have a definition of user very different from the data science team's. This is just fine: add the necessary user model explicitly in the proper place and be happy (even though some fields might be repeated).
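
A rough sketch of what that could look like, with one user model per context. The namespaces sales.user and data-science.user and every field except :user-id and :age are hypothetical; both namespaces are shown in one listing only for brevity.

```clojure
;; Hypothetical user model for the sales context.
(ns sales.user)

(defn user-id [user] (:user-id user))
(defn user-name [user] (:name user))
(defn user-account-manager [user] (:account-manager user))

;; Hypothetical user model for the data science context.
(ns data-science.user)

;; Repeated on purpose: each context owns its own vocabulary.
(defn user-id [user] (:user-id user))
(defn user-age [user] (:age user))
(defn user-monthly-income
  [user]
  (get-in user [:features :monthly-income]))
```

The user-id accessor is duplicated, and that is the trade-off being accepted: each team can change its own model without asking the other.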

We tend to abstract too much. Get real!

I would also like to point out something about Clojure. I find it very hard to explain to people why Clojure is great and why I like it, and I realized the difficulty comes from programmers expecting me to describe something magical, like a shiny new feature that solves all their problems. The truth is that Clojure is very boring, and its main feature is not getting in your way while you build your own language.

Happy hacking!