performance tips

Discussion of Common Lisp
Post Reply
habadoh
Posts: 3
Joined: Sat Nov 22, 2008 5:03 am

performance tips

Post by habadoh » Sat Nov 22, 2008 5:08 am

Hey,

Are there any general performance tips, concerning list manipulation, that you can give me? I'm asking this because I'm doing a depth-first search in my application, and to generate the successores I do some list manipulations (nth, substitute, removes, etc). The problem is it's taking more time doing the search than I think it should...

Thanks!

findinglisp
Posts: 447
Joined: Sat Jun 28, 2008 7:49 am
Location: Austin, TX
Contact:

Re: performance tips

Post by findinglisp » Sat Nov 22, 2008 8:22 am

Without seeing your code, it's hard to know what might be taking the time.

In general, when you want things to go fast, prefer the most specific function for the data type that you'll be working on. For instance, sometimes there is a function that operates on a general sequence and one which operates on a list, string, or general array. If your algorithm uses lists, then use the function that operates specifically on a list (use NTH, not ELT). Secondly, if you know that it won't cause problems, operate on the data structure destructively (use DELETE not REMOVE). The non-destructive functions generate a copy of the list each time and if you immediately discard the old copy anyway, you're simply generating garbage that the GC has to deal with. Obviously, you have to know what type of data you're working on and you'll have to know if you can destructively modify it.

Finally, review your code to see how many passes you're making over the data structure. If things are really taking longer than you expect, make sure that you aren't accidentally "walking the tree" more than once for a silly reason. Sometimes this is not obvious. Invoking NTH on a list, for instance, causes it to traverse the cdrs of each cons until it reaches the nth element. Ditto with functions like SUBSTITUTE which basically have to walk the list searching for the object to be replaced. This all takes linear time. If you can rewrite your algorithm such that you naturally visit every node anyway and minimize the number of other traversals, things will run faster.

Also, use the TIME function to benchmark your code. It will print the runtime of your code along with some GC statistics. You'd invoke it as (TIME (MYFUNC PARAM1 PARAM2)).
Cheers, Dave
Slowly but surely the world is finding Lisp. http://www.findinglisp.com/blog/

tayssir
Posts: 35
Joined: Tue Jul 15, 2008 2:33 pm

Re: performance tips

Post by tayssir » Sat Nov 22, 2008 10:14 am

* Know where your performance hotspots are, rather than guess. Look up whatever profiler(s) are available for your lisp platform.

* Once you find it out, the solution may involve a more appropriate data structure or approach. For example, lists are really convenient in Common Lisp, but perhaps arrays or hashtables are more appropriate. Because to get something from the middle of the list, you'll have to go through the elements one-by-one, which is not a problem with certain other data structures.

(That's one thing that makes Clojure interesting; the author claims you can use things like arrays and tables with the ease of lists.)

* A cache might spare you from expensive recomputations. There's a simple "memoize" function floating around the internets, where you simply give it a function name, and it'll turn the function into an automagically caching version.

habadoh
Posts: 3
Joined: Sat Nov 22, 2008 5:03 am

Re: performance tips

Post by habadoh » Sat Nov 22, 2008 11:41 am

First of all thanks for the answers! They were very helpful.

My knowledge of lisp isn't very vast, but let me give you an example of a function I'm using, and maybe you can tell me how I can improve it.

Code: Select all

; Returns a list of values that exist in both l1 and l2. e.g: (inters '((1 2)(2 3)) '((1 4) (2 3)))      ----->     (1 2 3)
(defun inters (l1 l2)
   (let ((res nil)
         (clause nil))
     (dolist (i l1 (unless (null clause)(juntar res clause)))
       (unless (null clause) (setf res (juntar res clause)))
       (dolist (k l2)
         (dolist (j i)
           (if (member j k)
               (setf clause (append clause (list j)))))))))

(defun do-stuff (tab lst-casas lst)
  (if (not (eq (fourth (first lst-casas)) 0))
      (list (append (sort lst #'< :key #'fifth) lst-casas) (second tab) (third tab))
      (do-stuff tab (rest lst-casas)
		  (append lst (list (append (first lst-casas) (list (length (inters (nth (second (first lst-casas)) (second tab)) (nth (third (first lst-casas)) (third tab)))))))))))
I could explain you what this function does in the context of my app, but its probably too lengthy for me to post it here (but I will if you want me to). So I'm just hoping you can find some inappropriate (or unefficient) code in the above functions.
Btw: on normal situations lst-casas has +-500 elements.

Thanks!

tayssir
Posts: 35
Joined: Tue Jul 15, 2008 2:33 pm

Re: performance tips

Post by tayssir » Sat Nov 22, 2008 1:41 pm

Hmm, I can see that this program must be doing a lot of extra work. As I understand, LAS-CASAS contains a bunch of indices into parts of TAB. You collect what a bunch of things from TAB have in common, until there's some sort of stop code in LAS-CASAS.

One small thing you can do is see that your lisp will already have a handy INTERSECTION function for you:

Code: Select all

CL-USER> (intersection '(1 2 2 3) '(1 4 2 3))
(3 2 2 1)
CL-USER> (remove-duplicates (intersection '(1 2 2 3) '(1 4 2 3)))
(3 2 1)
Also, you're constantly stepping through TAB to pick out some item -- if there's 500 items, getting the last item will take 500 steps. Now, if those parts of TAB were instead vectors (not lists), it'll take merely 1 step to pick out any item you want.


Further, all those APPENDs are probably taking lots of time. I don't know your application, but surely there must be some way to cut down on those:

Code: Select all

;; This steps completely through the first list (of 6 items), which is slower than you might expect.
CL-USER> (append (list 1 2 3 4 5 6)
                 (list 7))
(1 2 3 4 5 6 7)
A more readable style is to use things which describe what you're doing -- and not only do you gain clarity, but you may also gain speed. For instance, SECOND and THIRD don't really tell people your intention -- the meaning of what you're doing. There may be alternatives which are not only clearer to the reader, but also can be optimized behind-the-scenes to be faster.

Post Reply