Dark theme | Light theme

April 1, 2020

Clojure Goodness: Splitting Strings

In Clojure we can use the clojure.string/split function to split a string, based on a regular expression, into a vector with string values. Optionally we can also specify a limit on the maximum number of returned string values we want. If we want to split a string based on the newline characters we can use the function clojure.string/split-lines that returns a vector where each element is a line from the original multi-line string.

The following example shows several usages of the split and split-lines functions:

(ns mrhaki.string.split
  (:require [clojure.string :as str]
            [clojure.test :refer [is are]]))

;; Sample string to split.
(def issue "CLJ-90210: Subject")

;; Split on - and : to get vector with string values.
(is (= ["CLJ" "90210" "Subject"] (str/split issue #"-|: ")))

;; The split function accepts a third argument that is
;; a limit on the number of splits that are returned.
(is (= ["CLJ" "90210" "Subject"] 
       (str/split issue #"-|: " 0)
       (str/split issue #"-|: " 3)))

(is (= [issue] (str/split issue #"-|: " 1)))
(is (= ["CLJ" "90210: Subject"] (str/split issue #"-|: " 2)))

;; Multiline sample string to split per line and
;; the split each line.
(def itinerary "loc:LVG time:15h-16h activity:Binge-watching
loc:DNR time:18h-19h activity:Eating
loc:MBR time:23h-7h activity:Sleeping")

;; Using split-line function we get a vector
;; where each line is an element.
;; Then for each line we split on : and \s+ and 
;; convert it to a map.
;; E.g. first line is 
;; {"loc" "LVG" "time" "15h-16h" "activity" "Binge-watching"}
(def agenda (map #(apply hash-map (str/split % #":|\s+"))
                 (str/split-lines itinerary)))

(is (= "LVG" ((first agenda) "loc")))
(is (= "15h-16h" ((first agenda) "time")))
(is (= "Binge-watching" ((first agenda) "activity")))

(is (= "DNR" ((nth agenda 1) "loc")))
(is (= "18h-19h" ((nth agenda 1) "time")))
(is (= "Eating" ((nth agenda 1) "activity")))

(are [value m key] (= value ((last m) key))
                   "MBR" agenda "loc"
                   "23h-7h" agenda "time"
                   "Sleeping" agenda "activity")

Written with Clojure 1.10.1.