I'm sure the high availability paper mentioned above has good info in it, but for those not inclined to search it out, the two main reasons we run two APS machines (as a cluster) are:
1. Redundancy. If we lose a server to a hardware failure, our environment is still running.
2. Capacity. User sessions are split between the two APS machines, allowing for more simultaneous users before you see a slowdown. This is especially true if you are using custom CSP or SDK calls that run a lot of queries against the APS database (InfoStore) to get report IDs, names, admin info, etc.
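If it helps to picture both points, here is a toy Python sketch. It is not how the APS actually balances sessions: I'm assuming simple round-robin purely for illustration, and `ToyCluster`, `logon`, and `fail` are made-up names, not Crystal SDK calls.

```python
class ToyCluster:
    """Toy model only: hands new sessions to live nodes in round-robin order."""

    def __init__(self, nodes):
        self._order = list(nodes)
        self.alive = {name: True for name in self._order}
        self.sessions = {name: 0 for name in self._order}
        self._next = 0  # index of the node to try first on the next logon

    def fail(self, name):
        # Simulate a hardware failure; in this toy model its sessions are simply dropped.
        self.alive[name] = False
        self.sessions[name] = 0

    def logon(self):
        # Try each node once, starting at the round-robin pointer, skipping dead nodes.
        count = len(self._order)
        for i in range(count):
            name = self._order[(self._next + i) % count]
            if self.alive[name]:
                self.sessions[name] += 1
                self._next = (self._next + i + 1) % count
                return name
        raise RuntimeError("no APS node available")
```

With two nodes, four logons land two on each machine (capacity), and after `fail("aps1")` the next logon still succeeds on the surviving node (redundancy).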
-Tim