Since this post was originally a reply to a question, I'll show the question here too.
Post by: hardcore100110
> This is a sub from a huffman encoder. Buffer$ is a string * 10000. My question is I do not know what the code following the get statement is doing. It seems to me that using the MID$ command is the best way to examine the contents of Buffer$.
>
> I guess A& is a memory address and the math is done to advance the position correctly? Address = 1:Endaddress=0
>
> GET #1, , Buffer$
>
> A& = SADD(Buffer$)
> A& = A& - 65536 * (A& < 0)
> BufferSeg = VARSEG(Buffer$) + (A& \ 16)
> Address = (A& MOD 16)
> EndAddress = Address + BufferLength
> DEF SEG = BufferSeg
Memory addresses in real mode (the processor mode in which QB, and thus QB programs, execute) are expressed as overlapping segments and offsets. Both are 16-bit integers, since every processor up to the 286 was 16 bits internally (or less -- but those are sufficiently obsolete by now that we can forget about them). Each segment starts 16 bytes after the previous one, but you can specify an offset of up to 65535 into any given segment. Basically, memory is one long string of bytes, and the actual address of the byte being referred to is calculated on-the-fly by the processor as (segment * 16 + offset). Since device-mapped memory starts at segment A000h, which is absolute address A0000h, there are exactly 655360 bytes of ordinary memory below it. This is where the "640K of memory" thing comes from. It is fairly conventional to write segment:offset, in hexadecimal, when referring to a spot in memory. For instance, VGA graphics memory starts at A000:0000, which is device-mapped onto the video card.
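As a rough illustration of the (segment * 16 + offset) calculation, here is a small sketch. The variable names are made up for the example, and long integers are used because A000h does not fit in a signed 16-bit INTEGER.

```basic
' Hypothetical names; LONGs (&) keep values above 32767 positive.
Segment& = &HA000&
Offset& = 0
Linear& = Segment& * 16 + Offset&

PRINT HEX$(Linear&)          ' linear address in hexadecimal
PRINT Linear& \ 1024         ' the same address expressed in kilobytes
```

This should print A0000 and 640, which is the 640K boundary mentioned above.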
VARSEG and VARPTR return the segment and offset of a variable, respectively. Strings in QB, however, are reached through a descriptor that records the length of the string, and for a string variable it is the descriptor that these functions locate. Thus, VARSEG(stringvariable$):VARPTR(stringvariable$) points at the length of the string, not at its characters. Because they wanted to leave room for expansion of this descriptor, Microsoft made a separate function -- SADD -- that returns the offset of the start of the string data. Later versions (the QuickBASIC Extended environment that shipped with BASIC PDS 7.x) take advantage of this small abstraction and add another function -- SSEG -- that returns the segment of the start of the string. In these versions, the descriptor not only contains the length of the string, but it also contains a far pointer (i.e., an offset accompanied by a segment) to the string, or an EMS page number, since they support storing strings in EMS.
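To see the difference between the two pointers, here is a minimal sketch. It assumes the QB 4.x layout in which the descriptor begins with a 2-byte length stored low byte first. That layout detail is an assumption for illustration and is not guaranteed for other versions. Desc& and DescLen are made-up names.

```basic
Buffer$ = "HELLO"

' VARPTR can come back negative; turn it into an unsigned offset.
Desc& = VARPTR(Buffer$)
IF Desc& < 0 THEN Desc& = Desc& + 65536

DEF SEG = VARSEG(Buffer$)
' Assumed layout: bytes 0-1 of the descriptor hold the length.
DescLen = PEEK(Desc&) + 256 * PEEK(Desc& + 1)
DEF SEG                      ' restore the default segment

PRINT "Length read from descriptor:"; DescLen
PRINT "LEN(Buffer$):"; LEN(Buffer$)
PRINT "VARPTR:"; VARPTR(Buffer$); " SADD:"; SADD(Buffer$)
```

If the assumption holds, the first two lines report the same length, while VARPTR and SADD report different offsets.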
In QB, whenever you use a function that returns a string, it creates a new descriptor and allocates space for both the descriptor and the string from the heap. This means that there is a significant setup overhead when you call, for example, the MID$() function. This overhead is even higher when the string is stored in EMS. PEEK and POKE have virtually no overhead, however: the code for them is produced inline with the surrounding code, so no function call is generated. Thus, using PEEK() is far more efficient than using MID$().
The author of that code snippet was obviously programming for efficiency rather than readability; the code was meant to run quickly, not to be read. What he is doing in this particular snippet is retrieving the offset of the string data with SADD, converting it from a signed 16-bit value to an unsigned one (a true comparison evaluates to -1 in QB, so the second line adds 65536 to a negative A&), and then splitting the result into a segment and a small offset that PEEK() can use to index the memory.
Since PEEK() and POKE take just offsets, not segments, QB needed a way to give the user access to the other segments in memory, and this is what DEF SEG does: it stores a segment value somewhere in memory, or in a register, and each later PEEK() or POKE applies its offset to the segment specified by the most recent DEF SEG.
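For comparison, here is a sketch of both ways of walking the buffer. The setup lines are copied from the question; the short test string, the loop variables I and P, and setting BufferLength from LEN() are assumptions made for the example. Note that no string operations happen between SADD and the PEEK loop, since QB may move string data around when strings are created or changed.

```basic
Buffer$ = "ABC"
BufferLength = LEN(Buffer$)      ' assumed; the question does not show this

' Readable version: one MID$ call (and one temporary string) per byte.
FOR I = 1 TO BufferLength
    PRINT ASC(MID$(Buffer$, I, 1));
NEXT I
PRINT

' PEEK version, using the setup from the question.
A& = SADD(Buffer$)
A& = A& - 65536 * (A& < 0)       ' TRUE is -1, so a negative A& gains 65536
BufferSeg = VARSEG(Buffer$) + (A& \ 16)
Address = (A& MOD 16)
EndAddress = Address + BufferLength
DEF SEG = BufferSeg

FOR P = Address TO EndAddress - 1
    PRINT PEEK(P);               ' reads the byte at BufferSeg:P
NEXT P
PRINT
DEF SEG                          ' restore the default segment
```

Both loops should print the same character codes (65, 66 and 67 here); only the second avoids creating a temporary string for every byte.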
Note that since segments overlap so much, you can usually take any segment:offset pointer and add 1 to the segment while subtracting 16 from the offset, or subtract 1 from the segment while adding 16 to the offset, and still refer to the same byte. The author here wants to minimize the maximum value of the offset, so he picks the highest possible segment that still contains the first byte of the buffer: A& \ 16 is added to the segment, and only A& MOD 16 (a value from 0 to 15) remains as the offset. This also happens to be the easiest way to do it.
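The equivalence is easy to check with plain arithmetic. In this sketch the pointer 1234:5678 and all variable names are made up for the example; it applies the same \ 16 and MOD 16 split as the question's code and compares the linear addresses before and after.

```basic
Segment& = &H1234
Offset& = &H5678

' Move as much of the offset as possible into the segment.
NormSeg& = Segment& + (Offset& \ 16)
NormOff& = Offset& MOD 16

PRINT HEX$(Segment&); ":"; HEX$(Offset&); " = "; HEX$(Segment& * 16 + Offset&)
PRINT HEX$(NormSeg&); ":"; HEX$(NormOff&); " = "; HEX$(NormSeg& * 16 + NormOff&)
```

Both lines should end in the same linear address, 179B8, with the second pointer written as 179B:8.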