The reader should have read something about monads in order not to get lost, to keep some contact with the theory. But we shall cover the essential, omitting altogether the "true monads", such as IO from the discussion. The idea of the monad in the context we want to treat, is the following: this is an abstraction of a computation, which replaces the piece of data. In "normal" functional programming we have pieces of data, and we apply functions to them, this is all. Expressions are built from nested applications of functions to some internal data. And we call it pompously "the trivial monad". Now, the remaining of the discussion is a bit limited and incomplete. But, the programs included here are to be read and understood, not just to put them in your personal library.
Moreover, if the function is non-deterministic, a singular, partial solution gets again converted into a list. We must flatten-out this proliferation of "embeddings", and generate always one flat list of potential solutions.
The easiest problem is the instruction: pick any element of the list l. The answer is obviously... l itself. Any questions?
Let's pass to a more difficult problem, how to return a list without one, chosen arbitrarily, element. So, if l=[1,2,3,4], the non-det answer will be a list of lists
[[2,3,4],[1,3,4],[1,2,4],[1,2,3]] -- or any other orderLet's begin with a model in which the non-determinism is a kind of primitive. Suppose that we have at our disposal a mythical language, which has an operator of logical alternative, OR. We may formulate the answer to the original question in the following manner: either remove the head of the list, or keep the head, but remove an element from the tail:
remove (x:xq) = xq OR x:(remove xq)
remove [] = FAIL
where FAIL means that this is impossible. No answer. Now, recall what we have said: the system should yield all answers. If there are none, it should return an empty list. On the other hand, OR means, the answer at the left or the answer at the right, so, both of them. But this is not trivial. At the left we have one concrete answer. We may put it into a list, and replace OR by (++), but at the right we have an abomination: the form (x:) adds an element to one concrete list, one possible answer, not to all of them.
So, the conclusion of this experience may be boiled down to the following observation:
If we need to apply a "normal" function to a non-det value, we map the function through it (the list of all solutions).We code thus
remove (x:xq) = [xq] ++ map (x :) (remove xq) remove [] = []It is silly to write [xq]++ .... We have thus finally, with a bit of real, but also of some dubious optimization
remove [_] = [[]] remove (x:xq) = xq : map (x :) (remove xq)Why this "optimization", which make the expression remove [] bomb (launch an exception) instead of failing (returning [])?
In fact there is no 'real' reason, just a reminder of some useful truths:
Another example, the permutations. We begin with the following sub-problem: how to insert an item non-deterministically, in any place, of a list, e.g., to insert 99 in [1,2,3,4], returning [99,1,2,3,4], [1,99,2,3,4], [1,2,99,3,4], [1,2,3,99,4], [1,2,3,4,99]]. The algorthm goes as follows: either insert it at the beginning, or keep the head, and insert it into the tail, restoring the head after.
ndinsert x l@(y:yq) = (x:l) OR y : ndinsert x yq
ndinsert x [] = [x]
Now, we know the song:
ndinsert x [] = [[x]] -- Yes! Don't forget additional [] ndinsert x l@(y:yq) = (x:l) : map (y :) (ndinsert x yq)Now, the permutation problem itself. The idea is: detach the head, find a permutation of the tail; put the detached head somewhere in the permuted list. Non-deterministically:
permut [] = []
permut (x:xq) = ndinsert x (permut xq)
... and we notice another structural problem. This time we apply a non-det function ndinsert to a non-det argument, produced by permut.This last, makes a list of answers, so ndinsert must be mapped. But this does not suffice, since for each concrete instance of the response, ndinsert produces a new non-det collection. With just a map we would obtain a list of list of answers, not a list of answers. So, it is necessary to flatten the resulting list, using concat.
permut [] = [[]] permut (x:xq) = concat (map (ndinsert x) (permut xq))For permut [0,1,2,3] Haskell responds joyfully:
[[0,1,2,3],[1,0,2,3],[1,2,0,3],[1,2,3,0],[0,2,1,3],[2,0,1,3], [2,1,0,3],[2,1,3,0],[0,2,3,1],[2,0,3,1],[2,3,0,1],[2,3,1,0], [0,1,3,2],[1,0,3,2],[1,3,0,2],[1,3,2,0],[0,3,1,2],[3,0,1,2], [3,1,0,2],[3,1,2,0],[0,3,2,1],[3,0,2,1],[3,2,0,1],[3,2,1,0]]and that's it. Exercice. Make another variant of the permutation algorithm, where first one removes an element non-deterministically from the list, and keeps this element separately. Then, generate a permutation of a truncated list, and re-insert the stored element at the front.
subset (x:xq) = (subset xq) OR x:(subset xq)
So, this is easy. Concatenate the two alternatives, which after a mild optimisation, and completion of the empty case gives
powerset [] = [[]] powerset (x:xq) = l ++ map (x:) l where l = powerset xqExercice. Without testing this function on a computer, find out in which order the algorithm generates the powerset of [0 .. 3].
comb 0 l = []
comb m (x:xq) = (x : comb (m-1) xq) OR (comb m xq)
comb _ _ = FAIL
so,
comb 0 _ = [[]] comb m (x:xq) = map (x:) (comb (m-1) xq) ++ comb m xq comb _ _ = []Any questions? The translation here is quite straightforward.
7
6 1
5 2
5 1 1
4 3
4 2 1
4 1 1 1
3 3 1
3 2 2
3 2 1 1
3 1 1 1 1
2 1 1 1 1 1
1 1 1 1 1 1 1
The number of partition algorithms is quite considerable... We shall analyze just one, which encourages the use of the non-deterministic reasoning, and recursivity (of course). Since the partition is ordered (this is the meaning of "order does not count"; all orderings are equivalent, so choose one representant), we may begin with the choice of the first element of the list, say, M, subtract it from N, and arbitrarily generate the partition of the remaining value N-M. There is one constraint which must be respected: further partitions must limit the choice of the first element to be inferior or equal to M. We might thus introduce an auxiliary function pmax, which takes a second argument, the maximum value of the first component:
partit n = pmax n n
and now recur. We introduce a trivial non-deterministic function choice a b which "generates" a number between a and b:
pmax n mx = let m = choice 1 mx
k = n-m
b = min k m
in m : pmax k b
The minimum function is necessary in order to rationalize the choice - either the maximum value imposed, or just the remaining number, if it is smaller. We see that there is no OR pseudo-operator here. The non-determinism is dictated by choice, and passing from the conceptual "any", to implemented in Haskell "all", we shall map the further decisions over the values of the list representing the choice. Now we must be careful, since there will be two maps. One is related to (m :) since pmax is non-deterministic. The second is over the choice. In this latter case there is a non-deterministic embedding, so we shouldn't forget about the flattening of the list. With some experience, the coding is straightforward:
partit n = pmax n n pmax 0 _ = [[]] pmax n mx = let seg m = map (m :) (pmax k (min k m)) where k=n-m in concat (map seg [mx,mx-1 .. 1])(where the choice list has been arranged in the decreasing order, in order to get from (partit 7) the result: [[7], [6,1], [5,2], [5,1,1], [4,3], [4,2,1], [4,1,1,1], [3,3,1], [3,2,2], [3,2,1,1], [3,1,1,1,1], [2,2,2,1], [2,2,1,1,1], [2,1,1,1,1,1], [1,1,1,1,1,1,1]]).
This will end the combinatorial examples. They may help you in solving some AI problems using the Prolog-style reasoning, but coded in a "classical", deterministic language, without tricks ad hoc. Of course, the laziness of Haskell is quite important, if the solution collection is long; in a strict language it is too easy to overflow the memory. And, needless to underline, the blind "generate-and-test" programming is usually so inefficient, that it is not useful for serious AI tasks.
When the last queen is placed, we store the solution as correct, and we continue anyow with the same backtracking as above, in order to generate another solution. So, conceptually everything is clear. But now, take into account that while programming we cannot "remove a queen", this would be a side-effect, requiring several global variables to store temporarily the last configuration, plus the data permitting to restart the process. We are constructing a logical/functional algorithm, where we only can generate another configuration, with all the book-keeping kept local.
First thing to do is to establish the representation of the data used. It would be too inefficient to keep in the memory the full "matrix": a list of 8 lists of 8 elements (rows of columns, or columns of rows). It suffices to generate a permutation of [1 .. 8] with the interpretation: the value of the i-th element is the column of the queen placed on the i-th row. Or, more confortable convention: the i-th element of the list l is the column of the queen placed on the (length(l)-i+1)th row. I.e., the list l=[a,b,c,d] is constructed as follows: first, we put d on the first row. Then, c on the second, b on the third, and a on the fourth. In such a way, adding new elements at the head of l follows the same conventions.
Let begin with a construction of a predicate which tests whether a new (next) queen placed on the column n is safe wrt. all queens already placed, and specified by the list l
safe n l | elem n l = False -- same column | otherwise = let sf (x:q) m | (x-m/=n && x+m/=n) = sf q (m+1) | otherwise = False -- (same diagonal) sf _ _ = True -- all rows tested in sf l 1The cooking of the non-deterministic algorithm goes as follows. Begin. Choose a row. Verify whether the queen is safe; if yes, put another queen, otherwise fail. If the ordinal of the to-be-placed queen is bigger than 8, stop. So, symbolically:
queens = qput 1 [] where
qput 9 l = l -- done.
qput k l = let n = choice 1 8
in if safe n l then qput (k+1) (n:l)
else FAIL
The failures will truncate all 'illegal' branches of the non-deterministic process. Now comes the coding, a bit different from what we have seen, but not too much.
queens = qput 1 [[]] where qput 9 ll = ll qput k ll = qput (k+1) (concat (map addq ll)) addq l = [n:l | n <- [1 .. 8], safe n l]It is easy to check that the length of queens is equal to 92, all the symmetry variants are considered different. The function addq adds non-determinitically one queen to one current solution (the same as (n:l) with non-deterministic n).
The function qput is, as previously, just an iterator. Note the extrapolating recursion style: we do not simplify the argument while recurring, but we progress, and finally we break the recursion by a "hard-break", k=9. Of course, generalizing the algorithm to, say, 15, is trivial, but be aware that there are 2 279 184 solutions. Ok, here is the generalization, and a test:
queens m = qput 1 [[]] where qput k ll | k==m+1 = ll | otherwise = qput (k+1) (concmap addq ll) addq l = [n:l | n <- [1 .. m], safe n l] concmap f [] = [] concmap f (x:xq) = f x ++ concmap f xq queens 7 [[6,4,2,7,5,3,1],[5,2,6,3,7,4,1],[4,7,3,6,2,5,1],[3,5,7,2,4,6,1], [6,3,5,7,1,4,2],[7,5,3,1,6,4,2],[6,3,7,4,1,5,2],[6,4,7,1,3,5,2], [6,3,1,4,7,5,2],[5,1,4,7,3,6,2],[4,6,1,3,5,7,2],[4,7,5,2,6,1,3], [5,7,2,4,6,1,3],[1,6,4,2,7,5,3],[7,4,1,5,2,6,3],[5,1,6,4,2,7,3], [6,2,5,1,4,7,3],[5,7,2,6,3,1,4],[7,3,6,2,5,1,4],[6,1,3,5,7,2,4], [2,7,5,3,1,6,4],[1,5,2,6,3,7,4],[3,1,6,2,5,7,4],[2,6,3,7,4,1,5], [3,7,2,4,6,1,5],[1,4,7,3,6,2,5],[7,2,4,6,1,3,5],[3,1,6,4,2,7,5], [4,1,3,6,2,7,5],[4,2,7,5,3,1,6],[3,7,4,1,5,2,6],[2,5,7,4,1,3,6], [2,4,1,7,5,3,6],[2,5,1,4,7,3,6],[1,3,5,7,2,4,6],[2,5,3,1,7,4,6], [5,3,1,6,4,2,7],[4,1,5,2,6,3,7],[3,6,2,5,1,4,7],[2,4,6,1,3,5,7] ]The functional concmap is an extra bonus for you. We have seen already a few times the construct (concat (map ... )), and it is sufficiently frequent, that it would be useful to have it as a basic brick.
The presented framework is not the only one possible. It realizes what is usually called the "data backtracking". But, there is another view of the logical non-determinism, which could be called "control backtracking", based on alternative continuations. Stay tuned, one day we shall describe it in this collection of notes.