The following thread discusses this for [tt]SCREEN 12[/tt]:
thread314-272644
The buffer format for [tt]SCREEN 13[/tt] is substantially simpler: each pixel occupies exactly one byte, so the pixels are all aligned to byte boundaries and are "chained" (all the bits for a pixel are adjacent). I haven't investigated the [tt]GET/PUT[/tt] format for other video modes, but I assume the other 16-color modes use the same format as [tt]SCREEN 12[/tt].
Here is the [tt]SCREEN 13[/tt] format in more detail:
[tt]
Offset  Size  Data     Description
------------------------------------------------------------------
0       2     8x       The width of the captured area multiplied
                       by the number of bits per pixel (in other
                       words, 8 * x).
2       2     y        The height of the captured area, unaltered.
4       x*y   (image)  The captured image, stored one scan line at
                       a time.
[/tt]
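To make the layout above concrete, here is a rough sketch in Python (not QBasic) that builds and decodes a buffer in this format, for example when inspecting one that has been saved to disk. The function names [tt]pack_screen13[/tt] and [tt]unpack_screen13[/tt] are made up for illustration. The sketch assumes the two header words are little-endian 16-bit values, as on the PC, and that scan lines are stored top to bottom.

```python
import struct


def pack_screen13(rows):
    """Build a SCREEN 13 GET-style buffer from rows of palette indices (0-255).

    Hypothetical helper: assumes every row has the same length.
    """
    height = len(rows)
    width = len(rows[0])
    # Offset 0: width * 8 (bits per pixel); offset 2: height.
    # Both are assumed to be little-endian 16-bit words.
    header = struct.pack("<HH", 8 * width, height)
    # Offset 4: one byte per pixel, one scan line after another.
    body = b"".join(bytes(row) for row in rows)
    return header + body


def unpack_screen13(buf):
    """Decode a buffer in the layout above into (width, height, rows)."""
    width_bits, height = struct.unpack_from("<HH", buf, 0)
    width = width_bits // 8  # undo the multiplication by 8 bits per pixel
    data = buf[4:4 + width * height]
    rows = [list(data[r * width:(r + 1) * width]) for r in range(height)]
    return width, height, rows


if __name__ == "__main__":
    buf = pack_screen13([[1, 2, 3], [4, 5, 6]])
    print(list(buf[:4]))         # header bytes
    print(unpack_screen13(buf))  # width, height and pixel rows
```

For a 3x2 image the first header word is 24 (8 * 3) and the whole buffer is 4 + 3 * 2 = 10 bytes long, which matches the table.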