Currently the algorithm is pretty stupid: if it can't send a message immediately, it waits 120 seconds and then tries transmitting all pending messages again.
This should be improved to take different failure modes into account: transient failures should be rescheduled sooner, while "this account is offline temporarily" should cause a longer retransmit delay for the affected messages.
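A minimal sketch of what failure-sensitive rescheduling might look like. The failure categories, the names FailureKind and next_attempt_time, and every delay except the existing 120 seconds are assumptions made for illustration; none of them exist in the current code.

```python
import enum


class FailureKind(enum.Enum):
    """Hypothetical failure categories; the real set is still undecided."""
    TRANSIENT = "transient"
    ACCOUNT_OFFLINE = "account_offline"
    UNKNOWN = "unknown"


# Retry delays in seconds. Only the 120-second figure reflects current
# behaviour; the other two values are placeholders.
RETRY_DELAYS = {
    FailureKind.TRANSIENT: 10,
    FailureKind.ACCOUNT_OFFLINE: 1800,
    FailureKind.UNKNOWN: 120,
}


def next_attempt_time(now, kind):
    """Return the time (in seconds) at which a failed message is retried."""
    return now + RETRY_DELAYS.get(kind, RETRY_DELAYS[FailureKind.UNKNOWN])
```

With a per-message failure kind, only the affected messages would be pushed back, instead of every pending message sharing one 120-second timer.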
Also, the scheduling isn't really time-based. As with upgraders and batch queues, this is a case where the system just wants to transactionally queue some work for future execution as soon as it's convenient, so it should really be sensitive to factors such as load. Using the scheduler creates another issue: when a collection of stores that have been idle for a long time is brought online, they will all immediately want to do work, even if that work was all originally scheduled with a reasonable delay.
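One possible mitigation for the "everything wakes up at once" problem is to add random jitter to overdue work when a store comes online. This is only a sketch of that idea, not something the scheduler does today; the function name spread_overdue and the 60-second window are assumptions, and it does not address load sensitivity.

```python
import random


def spread_overdue(due_times, now, spread=60.0, rng=random):
    """Return new run times for a store that has just come online.

    Work that is still in the future keeps its time. Work that is
    already overdue is moved to a random point between now and
    now + spread, so many stores waking together do not all run at once.
    """
    return [
        due if due > now else now + rng.uniform(0, spread)
        for due in due_times
    ]
```

For example, spread_overdue([10.0, 50.0, 500.0], now=100.0) would leave the item due at 500.0 alone and move the two overdue items to somewhere between 100.0 and 160.0.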
The scheduler is convenient for the moment, but we should really add more expressive APIs to it so that systems like these can make their constraints known more easily.