0x1337c0de / Re: Questions on General Software Design.
« on: March 07, 2008, 08:47:52 PM »
oh and could someone explain bit significance and byte ordering like little endian and so on...
Endianness is a property of the CPU architecture. It can refer to bits or bytes, although it's rare to see it applied to bits these days. In the days of minicomputers like the PDP-8, bits were numbered from the left: bit 0 was the most-significant bit and the highest-numbered bit was the least-significant bit (the one on the right). In a 16-bit word numbered this way, bit 15 had the value 2^0 and bit 0 had the value 2^15. (Left and right were as seen on the register panel and the toggle switches below it.) The PDP-8 was actually a 12-bit machine (bits 0 thru 11) but had extensions on the memory address that added 3 more bits to the most-significant end of the address. The PDP-8A control panel had switches on the front panel for toggling in programs. This style of front panel was later imitated on the MITS Altair 8800, which had an Intel 8080 microprocessor in it.
In the case of the microprocessors, bit 0 was on the right with the value 2^0, and bit 7 (for 8-bit registers) or bit 15 (for 16-bit addresses) was on the left. The significance of the bit positions was the same, but the bit numbering was reversed. I don't think the PDP-8 had any bit-addressing instructions, but if you referred to bit 7 of a 12-bit PDP-8 word its value was 2^4 or 16 (2^8 or 256 if you count within a 16-bit word), vs. 2^7 or 128 in 8080s or any microprocessors that came later.
Little-endian bit numbering, counting from right to left, makes more sense because the bit number (offset) matches the power of 2 the bit represents: offset 0 is 2^0, offset n is 2^n.
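To make the two numbering schemes concrete, here is a small Python sketch. The function names are made up for illustration, and the word widths (12 and 16 bits) are just the ones mentioned above.

```python
def bit_value_lsb0(n):
    """Value of bit n when bit 0 is the least-significant bit (on the right)."""
    return 1 << n


def bit_value_msb0(n, width):
    """Value of bit n when bit 0 is the most-significant bit of a width-bit word."""
    return 1 << (width - 1 - n)


print(bit_value_lsb0(7))      # 128  (bit 7, numbered from the right)
print(bit_value_msb0(7, 12))  # 16   (bit 7 of a 12-bit word, numbered from the left)
print(bit_value_msb0(7, 16))  # 256  (bit 7 of a 16-bit word, numbered from the left)
```

With right-to-left numbering the word width never enters the calculation, which is the practical advantage of that scheme.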
Offset refers to the position of a byte in a file or memory block. The first byte has offset zero (it is the zeroth byte in the block), followed by offsets 1, 2, 3, etc., up to offset n-1 for a list of n items. More generally, an offset gives the position of an item in a list; the list could contain bytes, words, double-words, or even structures of bytes. Programmers conventionally give the first item in a list offset zero. Some older computer languages allow counting to begin at 1 instead of 0.
Endianness in byte order refers to the way the bytes of a multi-byte value are stored in memory or transmitted serially. If ABCD represents the 4 bytes of a 32-bit value, with A the most-significant byte, then on little-endian machines they are stored in memory as DCBA:
Offset Byte
0 D
1 C
2 B
3 A
On big-endian machines they are stored as:
Offset Byte
0 A
1 B
2 C
3 D
Big-endian storage is convenient when sending bytes serially, since you can walk straight down the list: the bytes are transmitted exactly as they are stored in memory and they "read" left to right. This is why big-endian was chosen as network byte order in the early days of the Internet. (The first networked computers were big-endian.)
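The two 4-byte layouts above can be reproduced in Python. This is only a sketch: the value 0x0A0B0C0D is an arbitrary stand-in for the bytes A, B, C, D, and it uses the standard int.to_bytes method and the struct module, where the "!" prefix selects network byte order.

```python
import struct

value = 0x0A0B0C0D  # A=0x0A is the most-significant byte, D=0x0D the least

little = value.to_bytes(4, "little")  # layout on a little-endian machine
big = value.to_bytes(4, "big")        # layout on a big-endian machine
network = struct.pack("!I", value)    # "!" = network byte order (big-endian)

print("Offset  Little  Big")
for offset in range(4):
    print(f"{offset}       {little[offset]:02X}      {big[offset]:02X}")

print(network == big)  # True
```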
Endianness doesn't apply to byte strings, however. Strings are always stored in memory from lowest to highest address, first character to last character, regardless of the endianness of the computer. It's only when you start talking about multi-byte values that it gets interesting.
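A quick way to see the difference between a string and a multi-byte value, sketched in Python with the struct module (the string "ABCD" is just an example):

```python
import struct

text = b"ABCD"

# Packing a 4-byte string gives the same bytes whichever byte order is requested.
as_little = struct.pack("<4s", text)
as_big = struct.pack(">4s", text)
print(as_little == as_big == text)  # True

# Reading the same 4 bytes as one 32-bit integer does depend on byte order.
print(hex(int.from_bytes(text, "little")))  # 0x44434241
print(hex(int.from_bytes(text, "big")))     # 0x41424344
```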
Take a 16-bit example. If AB and CD represent the bytes of two 16-bit integers (A and C being the high bytes), a little-endian machine will store them as:
Offset Byte
0 B
1 A
2 D
3 C
A big-endian machine will store them as:
Offset Byte
0 A
1 B
2 C
3 D
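The 16-bit case can be checked the same way. In this sketch the values 0x0A0B and 0x0C0D are arbitrary stand-ins for AB and CD, and "<2H" / ">2H" are struct format strings for two unsigned 16-bit integers in little-endian and big-endian order.

```python
import struct

first, second = 0x0A0B, 0x0C0D  # stand-ins for the 16-bit values AB and CD

little = struct.pack("<2H", first, second)
big = struct.pack(">2H", first, second)

print(little.hex(" "))  # 0b 0a 0d 0c
print(big.hex(" "))     # 0a 0b 0c 0d
```

Note that only the bytes within each 16-bit value are swapped; the two values themselves stay in the same order.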
Storing or transmitting data between a big-endian machine and a little-endian machine must take this into account: both sides must know the layout of the data (which bytes belong to which multi-byte value) in order to reorder it correctly.
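Here is a rough Python sketch of why the layout has to be known. The helper swap_fields is hypothetical (not from any library); it takes big-endian data plus a struct format describing the fields and re-encodes it as little-endian. The same 4 bytes come out differently depending on whether they are two 16-bit integers, one 32-bit integer, or a 4-byte string.

```python
import struct


def swap_fields(data, fmt):
    """Re-encode big-endian data as little-endian.

    fmt is a struct format without a byte-order prefix, e.g. "2H" or "I".
    (Hypothetical helper, for illustration only.)
    """
    fields = struct.unpack(">" + fmt, data)
    return struct.pack("<" + fmt, *fields)


wire = bytes([0x0A, 0x0B, 0x0C, 0x0D])  # bytes as received, big-endian

print(swap_fields(wire, "2H").hex(" "))  # 0b 0a 0d 0c  (two 16-bit integers)
print(swap_fields(wire, "I").hex(" "))   # 0d 0c 0b 0a  (one 32-bit integer)
print(swap_fields(wire, "4s").hex(" "))  # 0a 0b 0c 0d  (4-byte string, unchanged)
```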