Page 1 of 1

my-union (Touretzky / exercise 8.52.)

Posted: Tue May 12, 2015 12:17 pm
by chuchana
Not really homework, but because of the kind of question I guess this is probably the right place:

I am going through Common Lisp: A Gentle Introduction to Symbolic Computation. In exercise 8.52., I do not understand why the solution in the book uses a helper function.

The book's solution:

Code: Select all

(defun my-union (x y)
      (append x (union-recursively x y)))
(defun union-recursively (x y)
      (cond ((null y) nil)
            ((member (first y) x)
             (union-recursively x (rest y)))
            (t (cons (first y)
                     (union-recursively
                        x
                        (rest y))))))
My solution:

Code: Select all

(defun my-union (x y)
  (cond ((null x) y)
        ((member (car x) y)
         (my-union (cdr x) y))
        (t (cons (car x)
                 (my-union (cdr x) y)))))
Mine seems to behave correctly. Does the book's solution have any advantages?

I found Lisp function: union while looking for an explanation and saw that my solution could not deal with non-list inputs, but the one from the book cannot either, and the Hyperspec says that the arguments of union should be proper lists.

Re: my-union (Touretzky / exercise 8.52.)

Posted: Sat May 16, 2015 1:55 am
by edgar-rft
The difference is subtle and not visible on the first view, but APPEND makes copies of all of its arguments, except of its last argument. For the last argument the elements are copied by the recursive function from the y-list. The result is a list that is completely independent from the structure of the original input lists. This sounds complicated, but I will show what that means by a few simple examples.

David Touretzky's and your code, both seem to produce the same result, here the list (1 2 3 4) stored in the z variable:

Code: Select all

(defparameter x (list 1 2 3))
(defparameter y (list 2 3 4))
(defparameter z (my-union x y))

x => (1 2 3)
y => (2 3 4)
z => (1 2 3 4)
The invisible difference is that with David Touretzky's code the z-list contains copies of the elements in the original y-list, while with your code the z-list contains a direct reference (a pointer) to the original y-list:

Code: Select all

(defun my-union (x y)
  (cond ((null x) y)  ; <- y returns a pointer to the first cons cell of the y-list
        ((member (car x) y)
         (my-union (cdr x) y))
        (t (cons (car x)
                 (my-union (cdr x) y)))))
This difference becomes visible as soon as you try to modify elements in the z-list that come from the y-list.

With David Touretzky's code:

Code: Select all

(setf (fourth z) 5)

x => (1 2 3)
y => (2 3 4)
z => (1 2 3 5)  ; <- fourth element modified (desired effect)
With your code:

Code: Select all

(setf (fourth z) 5)

x => (1 2 3)
y => (2 3 5)    ; <- third element modified (unwanted effect)
z => (1 2 3 5)  ; <- fourth element modified (desired effect)
The y-list gets modified because the z-list contains a pointer to the y-list and not independent copies of the elements in the y-list. The last three elements in the z-list are the elements of the y-list, not copies of the elements. Changing one of these elements in one list also changes the element in the other list:

Code: Select all

(setf (first y) 0)

x => (1 2 3)
y => (0 3 5)    ; <- first element modified (desired effect)
z => (1 0 3 5)  ; <- second element modified (unwanted effect)
You can avoid this unwanted behaviour by using COPY-LIST like this:

Code: Select all

(defun my-union (x y)
  (cond ((null x) (copy-list y))
        ((member (car x) y)
         (my-union (cdr x) y))
        (t (cons (car x)
                 (my-union (cdr x) y)))))
Now your function works like David Touretzky's code.

I assume that David Touretzky did not want to discuss the subtle differences how "copy by value" and "copy by reference" work with Common Lisp lists because teaching the basics is already complicated enough and APPEND was already introduced in the book.

Practical Common Lisp, chapter 12: List Processing explains the difference in more detail but if you're still fighting with the basics save these links for later.

As a rule of thumb one can say that Common Lisp generally passes arguments by reference (pointers) and does not make copies from complex data structures unless explicitely told to do so by functions like COPY-LIST. The behaviour of APPEND is a very rare exception.

BTW, I've learned Common Lisp from the Touretzky book, too. :D

- edgar

Re: my-union (Touretzky / exercise 8.52.)

Posted: Wed May 20, 2015 12:27 am
by chuchana
Thank you for your detailed explanation! I started with Practical Common Lisp, but I went to this book when I was not sure about something and I stayed with it because of the exercises.

- pascal

Re: my-union (Touretzky / exercise 8.52.)

Posted: Wed May 20, 2015 10:04 am
by edgar-rft
I had worked with several other programming languages before and started learning Lisp with David Tourtzky's Gentle Introduction, but found it "too basic" (what is not true, but I didn't realized it at that time). Then I read Peter Seibel's Practical Common Lisp, David Lamkin's Sucessful Lisp and several other books but still didn't see the full picture. The breakthrough came approx. two years later when I re-read Gentle Introduction and finally understood why it's not basic...

- edgar

Re: my-union (Touretzky / exercise 8.52.)

Posted: Sun May 31, 2015 11:07 am
by sylwester
It's common for function to share structure so I wouln't say your solution is wrong.

The main differences perhaps is that the elements in the result are not in the same order. Sets are not usually ordered so I can't see how that would be wrong either. You can also make a tail recursive solution like this:

Code: Select all

(defun my-union (x y)
  (labels ((aux (x acc)
             (cond ((null x) acc)
                   ((member (car x) y)
                    (my-union (cdr x) acc))
                   (t (my-union (cdr x) 
                                (cons (car x) acc))))))
    (aux x y)))

(my-union '(1 2 3 4 5) '(4 5 6 7 8 9 10))
;==> (3 2 1 4 5 6 7 8 9 10)