September 15, 2015

Generate Random Data in Clojure

I often need to generate fake data that looks sort of like real data. This comes in handy, for example, when building a screen mockup for a webapp or when writing and testing code without access to a customer's database.

There are 2 Clojure libraries I've used that make this really easy and fun: data.generators and test.check.

Here's a quick example of how to use both to generate a new password. Say we need a password that is 15 characters long, and must contain at least 1 lowercase letter, 1 uppercase letter, 1 digit, and 1 of the following: !, $, %, ^, or &.
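Before generating anything, it helps to have a way to check the result. The rules above can be sketched as a plain predicate using only clojure.core. The name valid-pwd? is made up for this post and is not part of either library:

```clojure
;; Hypothetical helper: true when s satisfies the password rules above.
(defn valid-pwd?
  [s]
  (boolean
   (and (= 15 (count s))
        (re-find #"[a-z]" s)      ; at least 1 lowercase letter
        (re-find #"[A-Z]" s)      ; at least 1 uppercase letter
        (re-find #"[0-9]" s)      ; at least 1 digit
        (re-find #"[!$%^&]" s)))) ; at least 1 allowed special character

;; "aB3$" followed by 11 a's is 15 characters long and covers every class.
(valid-pwd? (str "aB3$" (apply str (repeat 11 \a))))
;; Same length, but no special character.
(valid-pwd? (str "aB3" (apply str (repeat 12 \a))))
```

The first call should return true and the second false.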

Here's an example using data.generators:

(require '[clojure.data.generators :as gen])

;; gen/uniform takes an inclusive lower bound and an exclusive upper bound.
(defn lowercase-char []
  (char (gen/uniform 97 123)))   ; a-z

(defn uppercase-char []
  (char (gen/uniform 65 91)))    ; A-Z

(defn digit-char []
  (char (gen/uniform 48 58)))    ; 0-9

(defn special-char
  "Generate one of the allowed special characters: ! $ % ^ &"
  []
  (gen/rand-nth [\! \$ \% \^ \&]))

(defn gen-pwd
  "Generate a random password (15 characters unless :len is given)."
  [& [{:keys [len]}]]
  (let [length  (or len 15)
        ;; one guaranteed character from each required class
        lower   (gen/reps 1 lowercase-char)
        upper   (gen/reps 1 uppercase-char)
        special (gen/reps 1 special-char)
        digit   (gen/reps 1 digit-char)
        ;; fill the rest with characters from any class
        remain  (gen/reps (- length 4)
                          #(gen/one-of lowercase-char uppercase-char
                                       special-char digit-char))]
    (apply str (gen/shuffle (concat lower upper special digit remain)))))

One note about data.generators: when combining generators with something like gen/reps, you sometimes need to pass a function instead of the result of calling a generator. For example, the following evaluates gen/one-of once and then repeats that single value 5 times (which is probably not what you want):

(gen/reps 5 (gen/one-of gen/char gen/int))
;;=> (\湪 \湪 \湪 \湪 \湪)

This is probably what you want:

(gen/reps 5 #(gen/one-of gen/char gen/int))
;;=> (\볬 -688280928 970998496 1182876593 \?)
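The function form also matters if you want repeatable data. As far as I know, data.generators draws all of its randomness from the dynamic var gen/*rnd*, a java.util.Random with a fixed default seed, so rebinding it is one way to get reproducible (or deliberately varied) output. A small sketch, assuming that var behaves as described; sample-digits is a name made up here:

```clojure
(require '[clojure.data.generators :as gen])

;; Hypothetical helper: n random digits from a given seed. doall realizes the
;; lazy seq from gen/reps while the binding of gen/*rnd* is still in effect.
(defn sample-digits [seed n]
  (binding [gen/*rnd* (java.util.Random. seed)]
    (doall (gen/reps n #(gen/uniform 0 10)))))

;; The same seed should give the same digits on every call.
(= (sample-digits 7 5) (sample-digits 7 5))
```

For something like a real password you would want the opposite: bind gen/*rnd* to a freshly seeded source instead of relying on the default.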

Here's an example using clojure.test.check.generators:

(require '[clojure.test.check.generators :as gen])

;; gen/choose is inclusive on both ends.
(defn char-upper []
  (gen/fmap char (gen/choose 65 90)))

(defn char-lower []
  (gen/fmap char (gen/choose 97 122)))

(defn char-digit []
  (gen/fmap char (gen/choose 48 57)))

(defn char-special []
  (gen/elements [\! \$ \% \^ \&]))

(defn gen-pwd [& [{:keys [len]}]]
  (let [len (or len 15)]
    ;; Each character is drawn with equal weight from the four classes.
    ;; This does not by itself guarantee at least one of each class.
    (apply str
           (gen/generate
            (gen/vector (gen/frequency [[25 (char-upper)]
                                        [25 (char-lower)]
                                        [25 (char-special)]
                                        [25 (char-digit)]])
                        len)))))
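Unlike data.generators, a test.check generator is a value that describes how to produce data, not a function you call directly, so you need something like gen/sample or gen/generate to get values out of it. A quick sketch for poking at generators in the REPL (lower-gen is just an illustrative name):

```clojure
(require '[clojure.test.check.generators :as gen])

;; Illustrative generator: one lowercase ASCII letter (gen/choose is inclusive).
(def lower-gen (gen/fmap char (gen/choose 97 122)))

;; gen/sample returns a seq of generated values; here, 5 of them.
(gen/sample lower-gen 5)

;; gen/generate returns a single generated value; here, a vector of 15 letters.
(gen/generate (gen/vector lower-gen 15))
```

The output is random, so expect different letters on each run.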

As it currently stands, I think test.check.generators has a slight edge over data.generators because it can be used in both Clojure and ClojureScript.

clojure.test.check.generators also has a few extra functions such as frequency and such-that that come in handy.
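For instance, such-that can enforce the "at least one of each" password rule on top of a frequency-based generator. A rough sketch: pwd-gen and has-all-classes? are names made up here, and the retry limit of 100 is an arbitrary choice.

```clojure
(require '[clojure.test.check.generators :as gen])

(def lower   (gen/fmap char (gen/choose 97 122)))
(def upper   (gen/fmap char (gen/choose 65 90)))
(def digit   (gen/fmap char (gen/choose 48 57)))
(def special (gen/elements [\! \$ \% \^ \&]))

;; Hypothetical predicate: does s contain every required character class?
(defn has-all-classes? [s]
  (boolean (and (re-find #"[a-z]" s)
                (re-find #"[A-Z]" s)
                (re-find #"[0-9]" s)
                (re-find #"[!$%^&]" s))))

;; 15 characters with equal weights per class, joined into a string, then
;; filtered so that only strings covering all four classes are produced.
(def pwd-gen
  (gen/such-that has-all-classes?
                 (gen/fmap #(apply str %)
                           (gen/vector (gen/frequency [[25 lower]
                                                       [25 upper]
                                                       [25 digit]
                                                       [25 special]])
                                       15))
                 100))

(gen/generate pwd-gen)
```

With 15 characters most candidates already pass the predicate, so such-that rarely needs to retry. For a predicate that rejects most candidates, constructing the value directly is usually a better fit than filtering.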

I haven't used test.check for property-based testing yet, but I hope to have time to dig into that soon.

The Clojure Toolbox site lists 2 other libraries, faker and re-rand, which both look really useful as well.

Happy fake data generating!

Tags: clojure boot-clj tech