Wow.. well, there isn't a one-paragraph or even a 10-paragraph answer to that... but I'll give you some basics.
Let's start with video. Your television has an approximate resolution of 720x480 (now don't start the flames...)... but that signal is interlaced... in other words, every other line is scanned, then it goes back and does the ones it missed the first time. It writes all the even lines, then all the odd lines on the second pass. Two fields (passes) make one frame. Standard video runs at 30 frames per second (compared to motion pictures in the theater, which are usually 24 frames per second). 30 frames per second means 60 updates per second (2 fields per frame).
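To make that concrete, here's a toy Python sketch of the even/odd split (the frame and field names are mine, just for illustration):

    # Split one frame (a list of scan lines) into its two interlaced fields.
    frame = ["line %d" % n for n in range(480)]  # 480 scan lines

    even_field = frame[0::2]  # first pass: lines 0, 2, 4, ...
    odd_field  = frame[1::2]  # second pass: lines 1, 3, 5, ...

    # 30 frames/sec * 2 fields/frame = 60 field updates per second
    print(len(even_field), len(odd_field))  # 240 and 240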
Meanwhile, computer video is another critter. First off, computer video is not interlaced, meaning that it draws every pixel (picture element) in one pass. Why am I telling you all this? Well, I'm leading into the bandwidth required for video. Here we go.
Let's assume that you have a 640x480 video image, at 16 million colors (that means that each pixel can display one color out of a palette of 16 million colors... this is also called 24-bit color...). This means that each pixel's color information is represented in 3 bytes.
Now, take 640 wide times 480 high = 307,200 pixels, times 3 bytes (representing color) = 921,600 bytes for one screen full of information.
(Note: So far we're talking UNCOMPRESSED here... I'll get to that....) So, at 30 frames per second, take that 921,600 and multiply it by 30: that's 27,648,000 bytes of information for one second of video! And you have to be able to pass that much data EVERY second, or frames get dropped.
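If you want to check my math, here it is as a little Python snippet (uncompressed, as noted):

    width, height = 640, 480
    bytes_per_pixel = 3           # 24-bit color
    frames_per_second = 30

    bytes_per_frame = width * height * bytes_per_pixel
    bytes_per_second = bytes_per_frame * frames_per_second

    print(bytes_per_frame)   # 921600   -- one screen full of information
    print(bytes_per_second)  # 27648000 -- one second of video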
Television, being an analogue signal, can do this, because it "sweeps" the screen... it's not storing that information, just "playing it out", "streaming it" so to speak, as the information comes in.
So..... enter compression. Here's the basic (once again, flames will be directed to /dev/null) idea behind compression. Let's say that you have a black screen. Well, there's not anything to display there, but as far as the computer is concerned, that's 921,600 "0" bytes. Rather than send the number 0 921,600 times, compression just says "I have 921,600 bytes all the same". It will do things like take "close" colors, and say "the eye can't really distinguish between those colors, so we'll reduce the palette accordingly". Similar to JPEG compression, MPEG (Moving Picture Experts Group) does the same thing for moving pictures. For example, if you have the background staying the same, and just someone moving their hand, MPEG compression only has to encode the part that changed between frames. This is why on some videos you see artifacting (blocks, etc.)... the CoDec (Compressor / DeCompressor) that they used to ENCODE the MPEG stream wasn't working well, or they sacrificed some quality for higher compression.
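Here's a toy run-length encoder in Python to show that "I have 921,600 bytes all the same" idea (real codecs like MPEG are far more sophisticated; this is just the flavor):

    def rle_encode(data):
        """Collapse runs of identical values into (count, value) pairs."""
        runs = []
        i = 0
        while i < len(data):
            j = i
            while j < len(data) and data[j] == data[i]:
                j += 1
            runs.append((j - i, data[i]))
            i = j
        return runs

    black_screen = [0] * 921600       # one all-black 640x480, 24-bit frame
    print(rle_encode(black_screen))   # [(921600, 0)] -- one pair, not 921,600 bytes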
Meanwhile, you have audio as well, and the audio needs to be in sync with the video. Microsoft uses "AVI", or "Audio-Video Interleave"... since audio information can also be compressed (roughly 10 to 1 with MP3 files), they put several chunks of video information, then stick in a piece of audio information. The player buffers the audio information during the frame playback, and plays it out during the rendering of the video.
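A crude picture of the interleaving idea (the chunk layout here is made up for illustration, NOT the actual AVI file format):

    # Interleave a few chunks of video, then one chunk of audio, and repeat.
    video_chunks = ["V%d" % n for n in range(9)]   # 9 video frames
    audio_chunks = ["A%d" % n for n in range(3)]   # 3 audio chunks

    stream = []
    for group in range(3):
        stream += video_chunks[group * 3:(group + 1) * 3]  # a few frames of video...
        stream.append(audio_chunks[group])                 # ...then a slice of audio
    print(stream)  # ['V0', 'V1', 'V2', 'A0', 'V3', 'V4', 'V5', 'A1', ...]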
Digital video is a cutting-edge technology, whereas the signal coming into your TV is analogue, and has been around since the 1950s.
It's really just a balancing act of bandwidth and compression... drop below about 20 frames per second and your eye will notice; it will look "jumpy". Crank the compression too high and you'll cause "artifacting"; higher resolutions mean more bandwidth is necessary.
Remember, you're scaling in both X and Y, so if you have a 320x240 video, it's one QUARTER the file size and bandwidth requirements of 640x480.
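Again, easy to verify:

    full  = 640 * 480 * 3 * 30   # bytes/sec at 640x480, uncompressed
    small = 320 * 240 * 3 * 30   # bytes/sec at 320x240, uncompressed
    print(full // small)         # 4 -- halving both X and Y quarters the data rate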
Hope some of this helps.
Just my $0.02
"In order to start solving a problem, one must first identify its owner." --Me
--Greg