Integer doesn't grow in size when left-shifted?

Discussion of Common Lisp
Post Reply
wvxvw
Posts: 127
Joined: Sat Mar 26, 2011 6:23 am

Integer doesn't grow in size when left-shifted?

Post by wvxvw » Sat Mar 26, 2011 6:48 am

Hello.
In attempt to teach myself the language, I'm writing a library to read and write AMF format. The documentation is here: http://opensource.adobe.com/wiki/downlo ... _05_08.pdf
In a nutshell, this format is a "native" for Flash object encoding, which Flash can use for client-server communication, saving persistent data on disc and serializing objects when communicating from one Flash to another. Several implementations exist in other languages (FluorineFX in C#, BlazeDS in Java, Zend_Amf in PHP and pyamf in Python, and there may be more). I'm trying to write one in CL.

Below is my attempt to rewrite a function that reads UI29 records (that is an unsigned variable length integer of maximum 29 significant bits). The algorithm is quite simple: read 8 bits, if first is 1, then take the rest 7 bits and combine them with the result, however, the bits in result must be shifted 7 spaces to the left, so that the new 7 bits will exactly follow the rest of the already written bits. Repeat the process either until you encounter first 0 bit, or until 4 whole octets are read.
The code in comment is in AS3, but I think I can rewrite it in Java or C# if that will make it easier to understand. Something peculiar to AS3 - >>> is the right unsigned shift (i.e. it doesn't preserve the sign in the first bit, instead it moves the first bit with the rest). Assignment, or any variant of assignment returns the new assigned value, that is (a = 42) == 42.

Code: Select all

;private function encodeUI29(bytes:ByteArray):uint
;{
;	var byte:uint;
;	var total:uint;
;	var bytesRead:int;
;	var shift:int = 7;
;	
;	do
;	{
;		byte = bytes.readUnsignedByte();
;		bytesRead++;
;		total |= int(byte & 0x7F);
;		shift += int(bytesRead == 4);
;		trace("total:", total, "byte", byte, "shift", shift);
;	}
;	while ((bytesRead < 4) && (byte >>> 7 > 0) && ((total <<= shift) > 0));
;	
;	return total;
;}
; ---- calling encodeUI29() ---
; -----------------------------
; bytes.writeByte(0x80 | 0x12);
; bytes.writeByte(0x80 | 0x34);
; bytes.writeByte(0x80 | 0x56);
; bytes.writeByte(0x80 | 0x01);
; bytes.position = 0;
; this.encodeUI29(bytes);
; ---- results: --------------
; total: 18 byte 146 shift 7
; total: 2356 byte 180 shift 7
; total: 301654 byte 214 shift 7
; total: 38611713 byte 129 shift 8


(defun encode-ui29 (bytes)
	(let ((total 0) (shift -7))
		(loop
			for the-byte in bytes
			for bytes-read from 0
		
			do (setf total (logior total (logand the-byte #x7F)))
			do (setf shift (- shift (if (= bytes-read 4) 1 0)))
			do (format t "total: ~a, the-byte: ~a, shift: ~a~%" total the-byte shift)

			when (and 	(< bytes-read 4) 
					(= 0 (ash the-byte 7)) 
					(> (setf total (ash total shift)) 0))
			return total)))

(encode-ui29 (list (logior #x80 #x12) (logior #x80 #x34) (logior #x80 #x56) (logior #x80 #x01)))

; ---- results ---------------
; total: 18, the-byte: 146, shift: -7
; total: 54, the-byte: 180, shift: -7
; total: 118, the-byte: 214, shift: -7
; total: 119, the-byte: 129, shift: -8
What happens is that it looks like the "total" variable won't grow when it's bits are shifted to the right (as if it had fixed number of bits of 8).

Now, my questions are: does my code actually do what I believe it does? :) Are the functions and data types correct for what I'm trying to do? (I saw different manuals suggesting that I use boole and bit-prefixed functions - I don't know what is the difference, and I can't figure out how would I cast a list of bits to integer. I'm less bothered with efficiency, but if I'm doing something outrageous, please tell! :)
Thanks.

ramarren
Posts: 613
Joined: Sun Jun 29, 2008 4:02 am
Location: Warsaw, Poland
Contact:

Re: Integer doesn't grow in size when left-shifted?

Post by ramarren » Sat Mar 26, 2011 1:01 pm

I don't really understand what the code is doing in some places. I mean, I understand what the instructions are doing. But for example, as I am assuming that && operator is short-circuiting, isn't the `shift += int(bytesRead == 4);` instruction redundant, since shift would ever be 8 when bytesRead are 4, in which case it would never get used anyway? Why is shift not a constant?

Code: Select all

(defun encode-ui29 (bytes)
  (loop for byte in bytes
        for bytes-read from 0 below 4
        for byte-flag = (logbitp 7 byte)
        for byte-part = (logand byte #x7f)
        for total = byte-part
          then (+ byte-part (ash total 7))
        do (format t "total:~a byte:~a byte-part:~a~&" total byte byte-part)
        while (and byte-flag (plusp total))
        finally (return total)))
This code seems to give the results as in the example in the comment, and should work the same, but I haven't been doing serious bit fiddling in a long time, so maybe I am missing something.

As a note, the key to using LOOP facility (personally I prefer iterate, but the principle is the same) is to avoid direct side effects and instead use LOOP iteration constructs.

Oh, and the reason why your total number didn't change was because `(= 0 (ash the-byte 7))` was always false, and hence the final, side effecting, condition was never evaluated. The iteration was terminated by the list running out, and the return form was ignored as well.

wvxvw
Posts: 127
Joined: Sat Mar 26, 2011 6:23 am

Re: Integer doesn't grow in size when left-shifted?

Post by wvxvw » Sat Mar 26, 2011 1:30 pm

Oh, my bad. Yeah, the ((total <<= shift) > 0) should've been the first condition, not the last... Sorry, I was copying and pasting to get it posted here and get this messed up. So, no, shift shouldn't be a constant, it should increment if the number of iterations reaches 4 (because we don't care if the first bit of the last byte is set or not, we cant read more bytes anyway). Oh, and thank you for the explanation!
Yet another question: this same function may be written as a recursion. In other languages this wouldn't be a good approach performance-wise, are there any implications of doing it recursively in CL? In general, if the same problem may be solved using loop or recursion, which should I prefer?
One last question if I may... is there a debugger? I mean, something that I could set a breakpoint and then look at the values as the code executes (and stops at breakpoint)? OK, that's probably a question for another thread, sorry :)

ramarren
Posts: 613
Joined: Sun Jun 29, 2008 4:02 am
Location: Warsaw, Poland
Contact:

Re: Integer doesn't grow in size when left-shifted?

Post by ramarren » Sat Mar 26, 2011 1:49 pm

wvxvw wrote:So, no, shift shouldn't be a constant, it should increment if the number of iterations reaches 4 (because we don't care if the first bit of the last byte is set or not, we cant read more bytes anyway).
But in that case, shouldn't then the fourth byte not be truncated by `& 0x7F`? Also, if the actual shift by eight is done, then the result doesn't agree with the example you gave in the comment.

Code: Select all

(defun encode-ui29 (bytes)
  (iter (for byte in bytes)
        (for bytes-read from 1 to 4)
        (for byte-flag = (logbitp 7 byte))
        (for byte-part = (logand byte #x7f))
        (for total first byte-part
             then (if (= bytes-read 4)
                      (+ byte (ash total 8))
                      (+ byte-part (ash total 7))))
        (format t "total:~a byte:~a byte-part:~a~&" total byte byte-part)
        (while (and byte-flag (plusp total)))
        (finally (return total))))

CL-USER> (encode-ui29 (list (logior #x80 #x12) (logior #x80 #x34) (logior #x80 #x56) (logior #x80 #x01)))
total:18 byte:146 byte-part:18
total:2356 byte:180 byte-part:52
total:301654 byte:214 byte-part:86
total:77223553 byte:129 byte-part:1
77223553
wvxvw wrote:In general, if the same problem may be solved using loop or recursion, which should I prefer?
Unlike some other Lisps, Common Lisp does not require support for tail recursion for implementations, and iteration is generally preferable. Note that there are other iteration mechanism than looping, like mapping and reduction.
wvxvw wrote:One last question if I may... is there a debugger? I mean, something that I could set a breakpoint and then look at the values as the code executes (and stops at breakpoint)?
The ANSI standard requires the presence of a debugger, but how exactly it is implemented depends on the implementation. Note that Common Lisp is an image based language, which means that normally you work in an interactive environment at most times.

What are you learning the language from? Practical Common Lisp is a recommended book for those who already know programming.

wvxvw
Posts: 127
Joined: Sat Mar 26, 2011 6:23 am

Re: Integer doesn't grow in size when left-shifted?

Post by wvxvw » Sat Mar 26, 2011 2:36 pm

Regarding the first, hm... I think that was what I did originally, but then I compared it to the result I'm getting from Flash and I changed it to what it is now. I will need to make sure who's wrong here, chances are it's me :)
Re' debugger - thanks again, I was looking in a completely wrong place, well, who knew.
I think I've eventually came across some of the pages from Practical Common Lisp translated, but I was trying to follow SBCL (that's the variant that I'm using) manual and some (pretty random) stuff I could find through Google. That other site you linked me to (Hyperspecs), yes, it looks quite familiar, although, I'm afraid, I had few problems trying examples from there. But I wouldn't insist it really happened, at any rate, it could be me miscopying or misunderstanding something. I also tried to read Successful Lisp, Simplified Common Lisp Reference and some articles by this man http://www.ai.sri.com/pkarp/ which actually proved to be useful.

wvxvw
Posts: 127
Joined: Sat Mar 26, 2011 6:23 am

Re: Integer doesn't grow in size when left-shifted?

Post by wvxvw » Tue Mar 29, 2011 1:36 pm

Just for the record, here's how it ended:

Code: Select all

;package tests
;{
;	import flash.display.Sprite;
;	import flash.utils.ByteArray;
;	
;	public class TestAmfEncoding extends Sprite
;	{
;		public function TestAmfEncoding()
;		{
;			super();
;			this.testUI29();
;		}
;		
;		private function testUI29():void
;		{
;			var bytes:ByteArray = new ByteArray();
;			
;			bytes.writeByte(0x04);
;			bytes.writeByte(0x80 | 0x12);
;			bytes.writeByte(0x80 | 0x34);
;			bytes.writeByte(0x80 | 0x56);
;			bytes.writeByte(0x80 | 0x01);
;			bytes.position = 1;
;			
;			var result:uint = this.encodeUI29(bytes);
;			trace("loop\t", this.toBitString(result));
;			bytes.position = 0;
;			result  = bytes.readObject();
;			trace("native\t", this.toBitString(result));
;			
;			bytes.position = 1;
;			result = this.encodeUInt29(bytes);
;			trace("plain\t", this.toBitString(result));
;		}
;		
;		private function toBitString(value:uint):String
;		{
;			var pattern:String = "0x00000000";
;			var result:String = value.toString(16);
;			return pattern.substring(0, 10 - result.length) + result;
;		}
;		
;		private function encodeUI29(bytes:ByteArray):uint
;		{
;			var byte:uint;
;			var total:uint;
;			var shift:int = 7;
;			var isLastByte:Boolean;
;			
;			do
;			{
;				byte = bytes.readUnsignedByte();
;				// This would mean that we had read at least 2 whole bytes ( 14 bits )
;				// + 1 bit, which means we've been in the loop 3 times
;				// this is the 4th time, so we need to read 1 extra bit
;				isLastByte = total > 0x3FFF;
;				shift += int(isLastByte);
;				total <<= shift;
;				total |= int(byte & (0x7F | (int(isLastByte) << 7)));
;			}
;			while (!isLastByte && (byte >>> 7 > 0));
;			
;			return total;
;		}
;		
;		private function encodeUInt29(bytes:ByteArray):uint
;		{
;			var b:uint = bytes.readUnsignedByte();
;			var sum:uint;
;			// 1 byte
;			if (!(b >>> 7)) return b;
;			sum = b & 0x7F;
;			sum <<= 7;
;			b = bytes.readUnsignedByte();
;			// 2 bytes
;			if (!(b >>> 7))
;			{
;				sum |= int(b & 0x7F);
;				return sum;
;			}
;			sum |= int(b & 0x7F);
;			sum <<= 7;
;			b = bytes.readUnsignedByte();
;			// 3 bytes
;			if (!(b >>> 7))
;			{
;				sum |= int(b & 0x7F);
;				return sum;
;			}
;			// 4 bytes
;			sum |= (b & 0x7F);
;			sum <<= 8;
;			b = bytes.readUnsignedByte();
;			return sum | b;
;		}
;	}
;}

(defun encode-ui29 (bytes)
	(loop	for byte in bytes
		with total = 0
		for is-last-byte = (> total #x3FFF)
		for shift = (if is-last-byte 8 7)
		do (setf total 
			(+ (ash total shift) 
				(logand byte (if is-last-byte #xFF #x7F))))
		when (or is-last-byte (= (ash byte 7) 0))
			return total))

(format t "result: ~x~%"
 	(encode-ui29 (list (logior #x80 #x12) (logior #x80 #x34) 
		(logior #x80 #x56) (logior #x80 #x01))))
I'm still not entirely happy with how I did it in the end, but at least it's working and does what it's supposed to (or so it seems).

PS. Thanks again, especially for the link to the book. And I'm reading about "iterate", looks interesting!

Post Reply