Andy - I am interested in seeing your results when you have them. Also, I wonder if packet fragmentation might account for the difference in bandwidth?
I did some research through a big stack of Cisco whitepapers on this and got some good info. The most informative paper was titled "Alternatives for High Bandwidth Connections Using Parallel T1/E1 Links." It had some charts comparing pros and cons of various alternatives (hardware IMUX, load balancing, MLPPP, and ATM Inverse Multiplexing). The biggest differences between MLPPP and load balancing are:
MLPPP preserves packet order, whereas the load balancing solution does not. This may not be an issue if your application isn't sensitive to reordering (see the sketch after this list for why the reordering happens).
MLPPP conserves IP addresses (one per virtual interface as opposed to one per physical interface).
MLPPP is more CPU intensive than load balancing and may decrease switching performance.
MLPPP makes it a little easier to manage layer 3 services because you only have to manage the single ML interface. This is probably a bigger issue with larger bundles.
Load balancing supports HSSI and IOS QOS, whereas MLPPP does not.
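To illustrate the reordering point, here's a toy sketch I put together (mine, not from the Cisco paper; the packet sizes and round-robin assignment are made-up illustration values): with per-packet load balancing, small packets on one T1 can finish ahead of a big packet still serializing on the other T1, while MLPPP's sequence numbers let the far end put everything back in order.

# Hypothetical illustration (not Cisco code): per-packet load balancing vs. MLPPP ordering.
T1_BPS = 1_544_000  # nominal T1 line rate

def per_packet_load_balance(packet_sizes, n_links=2):
    """Round-robin packets across links; return sequence numbers in arrival order."""
    next_free = [0.0] * n_links          # time each link finishes its current packet
    arrivals = []
    for seq, size in enumerate(packet_sizes):
        link = seq % n_links             # simple round-robin assignment
        finish = next_free[link] + (size * 8) / T1_BPS
        next_free[link] = finish
        arrivals.append((finish, seq))
    arrivals.sort()                      # receiver sees packets in finish-time order
    return [seq for _, seq in arrivals]

def mlppp(packet_sizes, n_links=2):
    """MLPPP sequences fragments, so the bundle delivers packets in order."""
    return list(range(len(packet_sizes)))

sizes = [1500, 64, 64, 64]               # one big packet followed by small ones
print("load balancing order:", per_packet_load_balance(sizes))  # [1, 3, 0, 2] - reordered
print("MLPPP order:         ", mlppp(sizes))                    # [0, 1, 2, 3]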
On the issue of packet fragmentation, all I could find was the sentence "In many cases fragmentation can increase the efficiency of the NxT1 link." I imagine that's possible if you have a significant variance in packet sizes going across the link (some very small, some large). Of course, packet fragmentation also causes higher CPU utilization, possibly slowing things down. I guess the best tack would be to get some CPU utilization numbers before bringing up the MLPPP and then test it in a couple of different configs to find the best balance of performance and CPU utilization.
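My guess at why fragmentation can help (again, my own back-of-the-envelope math, not something the paper spells out): if a large packet is fragmented, the pieces can be clocked out both T1s of the bundle at the same time, which roughly halves the per-packet serialization delay and keeps both links busy instead of one sitting idle.

# Hypothetical sketch: serializing one large packet on a single T1 vs. splitting it across two.
T1_BPS = 1_544_000

def serialization_time(size_bytes, links=1):
    """Seconds to clock size_bytes onto `links` T1s in parallel."""
    per_link = size_bytes / links        # bytes each link carries
    return (per_link * 8) / T1_BPS

big_packet = 1500  # bytes
print("whole packet on one T1:    %.2f ms" % (serialization_time(big_packet, 1) * 1000))
print("fragmented across two T1s: %.2f ms" % (serialization_time(big_packet, 2) * 1000))
# Roughly half the delay, at the cost of extra fragment headers and the CPU
# needed to fragment and reassemble.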
This has been a very educational day. Stars for both of you.