Is this legal Lisp?

Discussion of Common Lisp
Post Reply
lispamour
Posts: 18
Joined: Wed Jun 02, 2010 12:29 am

Is this legal Lisp?

Post by lispamour » Mon Jun 14, 2010 2:12 pm

The following comes again from Genlt introduction to Symbolic Computation p. 257, Exercise 4. Given a list of symbols (A T G C), representing a strand of DNA, count the number of DNA bases in the strand in a function called COUNT-BASES. The catch is that the DNA strand could be either a single strand or a double strand (since DNA is double-stranded, except during replication). Thus the function could be called as

Code: Select all

(COUNT-BASES '((G C) (A T) (T A) (T A) (C G)))
which would return

Code: Select all

((A 3) (T 3) (G 2) (C 2))
or it could be called as

Code: Select all

(COUNT-BASES '(A G T A C T C T))
which would return

Code: Select all

((A 2) (T 3) (G 1) (C 2))
I wrote it up as

Code: Select all

(defun count-bases (dna)
  (let ((num-a 0)
        (num-t 0)
        (num-g 0)
        (num-c 0))
    (defun count-nucleotides (x)
      (cond ((eq x 'a)  (incf num-a))
            ((eq x 't)  (incf num-t))
            ((eq x 'g)  (incf num-g))
            ((eq x 'c)  (incf num-c))))
    (dolist (base-pair dna
                       (list (list 'a num-a) (list 't num-t) (list 'g num-g) (list 'c num-c)))
      (cond ((listp base-pair)   (count-nucleotides (car base-pair))
                                 (count-nucleotides (cadr base-pair)))
             (t                  (count-nucleotides base-pair))))))
I defined the function COUNT-NUCLEOTIDES within the body of the function COUNT-BASES in order to take advantage of the lexical environment of the LET block; the alternative would have been to either use as global variables NUM-A. NUM-T, NUM-G, NUM-C; or to define a

Code: Select all

(defstruct base-count
   (num-a 0)
   (num-t 0)
   (num-g 0)
   (num-c 0))
and pass the structure by value along with the variable X to update the count of the components in each iteration. For simplicity's sake I chose to define the sub-function COUNT-NUCLEOTIDES within COUNT-BASES.

The function works properly under both Clozure and LispWorks, though I get a warning from LispWorks when I compile it:

Code: Select all

The following function is undefined:
COUNT-NUCLEOTIDES which is referenced by COUNT-BASES
Is the warning legitimate? Are nested function definitions allowed in Common Lisp?

Moreover, is it good Lisp style to do something like this?

ramarren
Posts: 613
Joined: Sun Jun 29, 2008 4:02 am
Location: Warsaw, Poland
Contact:

Re: Is this legal Lisp?

Post by ramarren » Mon Jun 14, 2010 3:03 pm

lispamour wrote:Are nested function definitions allowed in Common Lisp?
There are, but they are not doing what you most likely think they are doing. What is happening when done like that is that the inner function is defined as global function when the outer one is run. Which is why Lispworks is complaining, since when the outer function is compiled they inner one is not yet defined, and will not be until the outer one is run for the first time.

While allowed, this is a very bad style and it stops many compiler optimizations. The way to define a local function is through FLET or LABELS special operators.

Also, your COND form in this would be better expressed as CASE. Or, even better, as symbol keyed hash-table (or even for this sizes an alist), since those value have identical behaviour.

Code: Select all

(defun count-bases (dna)
  (let ((counts (list (list 'a 0) (list 't 0) (list 'g 0) (list 'c 0))))
    (dolist (base-pair dna counts)
      (labels ((count-nucleotide (x)
                 (incf (second (assoc x counts)))))
        (cond ((listp base-pair)
               (mapc #'count-nucleotide base-pair))
              (t (count-nucleotide base-pair)))))))

Suroy
Posts: 46
Joined: Sat Dec 19, 2009 11:20 am

Re: Is this legal Lisp?

Post by Suroy » Mon Jun 14, 2010 4:14 pm

I couldnt resist :twisted:
My totally unhelpful code for this (it uses other libraries, but i just wanted to show that you dont have to stick with vanilla lisp, make your own 'cl' library named something else which contains extremely convenient functions for everything. For this, i like to use the conduit-packages library which will not only import symbols but will also export them, so you could export a subset of the :cl or any otherpackage)

Code: Select all

(defun count-bases (dna)
  (let ((counts (hash (a 0) (t 0) (g 0) (c 0))))
    (iter
     (for base in dna)
     (mapc #(incf (@ % counts)) base))
    counts))

lispamour
Posts: 18
Joined: Wed Jun 02, 2010 12:29 am

Re: Is this legal Lisp?

Post by lispamour » Tue Jun 15, 2010 1:14 pm

Ramarren wrote:
lispamour wrote:Are nested function definitions allowed in Common Lisp?
There are, but they are not doing what you most likely think they are doing. What is happening when done like that is that the inner function is defined as global function when the outer one is run. Which is why Lispworks is complaining, since when the outer function is compiled they inner one is not yet defined, and will not be until the outer one is run for the first time.

While allowed, this is a very bad style and it stops many compiler optimizations. The way to define a local function is through FLET or LABELS special operators.
Thank you for that very complete and lucid explanation. Your code for COUNT-BASES was great: very succinct and idiomatic Lisp. The book GISC did cover association lists, and I'm a little chagrined that I did not think of it as it presents the solution in a clear and compact manner. I'm starting to see how experienced Lispers find and factor out certain patterns in organizing their code.
Suroy wrote:For this, i like to use the conduit-packages library which will not only import symbols but will also export them, so you could export a subset of the :cl or any otherpackage)
Thank you also for this example. I'll have to come back to this a little later, after I'm a little more experienced, and have learned how to use code from other Lisp packages. It's remarkable how malleable the Lisp language can be, that Lisp programmers can invent and re-use syntactic forms to apply to common types of problems.

death
Posts: 17
Joined: Sat Jun 28, 2008 1:44 am

Re: Is this legal Lisp?

Post by death » Tue Jun 15, 2010 4:57 pm

I don't know why proposed solutions used an alist/hash-table. I think lispamour got it basically right. My proposed solution:

Code: Select all

(defun count-bases (tree)
  (let ((as 0) (ts 0) (gs 0) (cs 0))
    (labels ((rec (x)
               (etypecase x
                 ((eql a) (incf as))
                 ((eql t) (incf ts))
                 ((eql g) (incf gs))
                 ((eql c) (incf cs))
                 (null)
                 (cons (rec (car x))
                       (rec (cdr x))))))
      (rec tree))
    `((a ,as) (t ,ts) (g ,gs) (c ,cs))))

ramarren
Posts: 613
Joined: Sun Jun 29, 2008 4:02 am
Location: Warsaw, Poland
Contact:

Re: Is this legal Lisp?

Post by ramarren » Tue Jun 15, 2010 10:34 pm

death wrote:I don't know why proposed solutions used an alist/hash-table.
Because if you doing the exact same operation on multiple variables then you almost always was a datastructure. It might not matter in this example, where the operation is trivial, but it is an exercise anyway, and data structures are a core computer science concept, so it is also good to practice them.

death
Posts: 17
Joined: Sat Jun 28, 2008 1:44 am

Re: Is this legal Lisp?

Post by death » Wed Jun 16, 2010 7:41 am

Ramarren wrote:Because if you doing the exact same operation on multiple variables then you almost always was a datastructure.
If you're doing the exact same operation on multiple variables/values you want an operator, not a data structure. A data structure might make sense when there is a connection between different data. It is true that COUNT-BASES can be thought of as doing rule-based dispatch. For example, matching the type of the object X in my code is the rule, and the action depends on satisfaction of the rule. In another model, X is considered a member of {A, T, C, G} and the action is to increment the corresponding count cell. If there was an intention to generalize COUNT-BASES beyond these cases, or if there were many of them, I'd see the value in defining a more sophisticated data structure. In programming it is also important to recognize when a solution might be an overkill, and to choose a different one: knowing when to use a data structure is important as well.

ramarren
Posts: 613
Joined: Sun Jun 29, 2008 4:02 am
Location: Warsaw, Poland
Contact:

Re: Is this legal Lisp?

Post by ramarren » Wed Jun 16, 2010 8:48 am

death wrote:If you're doing the exact same operation on multiple variables/values you want an operator, not a data structure.
And you would still have to apply the operator to every variable manually, which is missing my point. If you are using the same operator on multiple variables then it is almost always better, except in cases of some microoptimizations which obviously are not relevant to an exercise of this kind, to map the operator over a datastructure rather than write out the mapping by hand.
death wrote:For example, matching the type of the object X in my code is the rule, and the action depends on satisfaction of the rule.
But it doesn't really depend so, since in cases where X is a symbol you are performing the exact same action, just on a different place which has one-on-one relationship with the symbol in question. That is a perfect place to use a key-value map of some sort. It is also a simpler solution in the sense that there is a smaller number of entities to consider, and it pushes an, in this case admittedly small, amount of accidental complexity into built-in functionality of the language.

death
Posts: 17
Joined: Sat Jun 28, 2008 1:44 am

Re: Is this legal Lisp?

Post by death » Wed Jun 16, 2010 9:39 am

If generality and avoiding accidental complexity is the aim, we can decompose the problem like this:

Mapping the structure:

Code: Select all

(defun map-tree (function tree)
  (typecase tree
    (null)
    (cons (map-tree function (car tree))
          (map-tree function (cdr tree)))
    (t (funcall function tree))))
Counting objects:

Code: Select all

(defun histogram (structure &key (test #'eql) (key #'identity) (mapper #'map-tree))
  (let ((histogram (make-hash-table :test test)))
    (funcall mapper
             (lambda (x)
               (incf (gethash (funcall key x) histogram 0)))
             structure)
    histogram))
Converting to the output structure:

Code: Select all

(defun histogram-list (histogram)
  (loop for object being the hash-key of histogram
        using (hash-value count)
        collect (list object count)))
Now COUNT-BASES can be defined:

Code: Select all

(defun count-bases (dna)
  (histogram-list (histogram dna)))

Post Reply