I found the blog post you mentioned: http://home.comcast.net/~tom_forsyth/blog.wiki.html#%5B%5BPremultiplied%20alpha%5D%5D
At first glance it seemed like the answer to all my problems, but after reading through it very carefully a few times I don't see how the claims he makes match up with the methods he describes. From what I gather, he is proposing storing all textures in an alpha-premultiplied state. Given the normally accepted alpha blending equation:
S.A * S.RGB + (1 - S.A) * D.RGB
It looks like premultiplying only saves the S.A * S.RGB step, giving you a little extra performance. Maybe I am just missing something obvious, but I don't see how that would fix the problem of colors from alpha = 0 texels bleeding into the partially visible texels around them when they are sampled for scaling operations. I also don't see how it fixes the problem I'm dealing with when compositing partially transparent textures. The post claims to fix both of those problems, though. It seems to me that when compositing multiple partially transparent textures, the only way to handle it is to take into account the alpha of the pixel being drawn over.
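To make the bleeding case concrete, here is a minimal numerical sketch in plain Python (not XNA code). It assumes a 50/50 linear filter between an opaque red texel and an alpha = 0 texel whose stored color happens to be green, drawn over a black destination. The helper names (lerp, premultiply, blend_straight, blend_premultiplied) are made up for illustration, and the premultiplied blend is assumed to be S.RGB + (1 - S.A) * D.RGB.

```python
# Plain-Python arithmetic sketch; every name here is hypothetical.

def lerp(a, b, t):
    """Linear interpolation between two equal-length tuples, per component."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))

def premultiply(rgba):
    """Scale the color channels by alpha, leaving alpha itself unchanged."""
    r, g, b, a = rgba
    return (r * a, g * a, b * a, a)

def blend_straight(src, dst_rgb):
    """S.A * S.RGB + (1 - S.A) * D.RGB"""
    r, g, b, a = src
    return tuple(a * s + (1 - a) * d for s, d in zip((r, g, b), dst_rgb))

def blend_premultiplied(src, dst_rgb):
    """S.RGB + (1 - S.A) * D.RGB, where S.RGB already includes alpha."""
    r, g, b, a = src
    return tuple(s + (1 - a) * d for s, d in zip((r, g, b), dst_rgb))

opaque_red = (1.0, 0.0, 0.0, 1.0)
hidden_green = (0.0, 1.0, 0.0, 0.0)  # alpha = 0, so its color should never show
black = (0.0, 0.0, 0.0)

# Straight alpha: filter the stored texels, then blend.
straight_sample = lerp(opaque_red, hidden_green, 0.5)
straight_result = blend_straight(straight_sample, black)

# Premultiplied alpha: multiply first, then filter, then blend.
premult_sample = lerp(premultiply(opaque_red), premultiply(hidden_green), 0.5)
premult_result = blend_premultiplied(premult_sample, black)

print(straight_sample)  # (0.5, 0.5, 0.0, 0.5)
print(straight_result)  # (0.25, 0.25, 0.0)
print(premult_sample)   # (0.5, 0.0, 0.0, 0.5)
print(premult_result)   # (0.5, 0.0, 0.0)
```

In this sketch the straight-alpha path picks up green from the invisible texel, while the premultiplied path yields half-strength red with no green, because the hidden texel's color is zeroed before filtering. That suggests the difference lies in when the multiply happens relative to the filtering rather than in saving a multiply. In render-state terms the premultiplied blend would presumably correspond to SourceBlend = Blend.One with DestinationBlend = Blend.InverseSourceAlpha, though that mapping is an assumption here and is not tested by the sketch.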
I played around with pixel shaders for a bit, but couldn't figure out an easy way to sample the source texel and destination texel and then blend the two myself. It seems like it should be possible, but all the examples I saw only seemed to work using input textures.
After some thought, I did come up with a way to fix the problem: a two-pass sprite rendering technique based on an equation I worked out that takes into account the alpha of the destination rather than just the source:
S.A * S.RGB + (1 - S.A) * (D.A * D.RGB + (1 - D.A) * S.RGB)
Basically I run the first pass (D.A * D.RGB + (1 - D.A) * S.RGB) with these settings:
renderState.DestinationBlend = Blend.DestinationAlpha;
renderState.SourceBlend = Blend.InverseDestinationAlpha;
renderState.BlendFunction = BlendFunction.Add;
renderState.AlphaDestinationBlend = Blend.DestinationAlpha;
renderState.AlphaSourceBlend = Blend.InverseDestinationAlpha;
renderState.AlphaBlendOperation = BlendFunction.Add;
renderState.AlphaFunction = CompareFunction.Always;
And then I render the second pass (S.A * S.RGB + (1 - S.A) * D.RGB) with these settings:
renderState.DestinationBlend = Blend.InverseSourceAlpha;
renderState.SourceBlend = Blend.SourceAlpha;
renderState.BlendFunction = BlendFunction.Add;
renderState.AlphaDestinationBlend = Blend.InverseSourceAlpha;
renderState.AlphaSourceBlend = Blend.SourceAlpha;
renderState.AlphaBlendOperation = BlendFunction.Add;
renderState.AlphaFunction = CompareFunction.Always;
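As a sanity check on the two passes, the following plain-Python sketch (not XNA code) applies the same blend factors to a single pixel and compares the RGB result with the combined equation. It assumes both passes draw the same source color and that the second pass reads the output of the first as its destination. The function names and the sample colors are hypothetical.

```python
# Plain-Python arithmetic sketch; every name here is hypothetical.

def blend(src, dst, src_factor, dst_factor):
    """Fixed-function style blend: src * src_factor + dst * dst_factor per channel."""
    return tuple(s * src_factor + d * dst_factor for s, d in zip(src, dst))

def two_pass(src, dst):
    """Apply both passes to RGBA tuples, using the same factors for color and alpha."""
    s_a, d_a = src[3], dst[3]
    # Pass 1: source factor = 1 - D.A, destination factor = D.A
    after_first = blend(src, dst, 1 - d_a, d_a)
    # Pass 2: source factor = S.A, destination factor = 1 - S.A
    return blend(src, after_first, s_a, 1 - s_a)

def combined_rgb(src, dst):
    """S.A * S.RGB + (1 - S.A) * (D.A * D.RGB + (1 - D.A) * S.RGB)"""
    s_a, d_a = src[3], dst[3]
    return tuple(
        s_a * s + (1 - s_a) * (d_a * d + (1 - d_a) * s)
        for s, d in zip(src[:3], dst[:3])
    )

src = (1.0, 0.0, 0.0, 0.5)   # half-transparent red sprite
dst = (0.0, 0.0, 1.0, 0.25)  # mostly transparent blue already in the target

result = two_pass(src, dst)
print(result)                  # (0.875, 0.0, 0.125, 0.46875)
print(combined_rgb(src, dst))  # (0.875, 0.0, 0.125)

# Edge cases: an opaque destination reduces to the standard blend,
# and a fully transparent destination leaves only the source color.
print(two_pass(src, (0.0, 0.0, 1.0, 1.0))[:3])  # (0.5, 0.0, 0.5)
print(two_pass(src, (0.0, 0.0, 1.0, 0.0))[:3])  # (1.0, 0.0, 0.0)
```

For these sample values the RGB output of the two passes matches the combined equation. The alpha channel goes through the same factors and ends up at 0.46875.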
This produces the effect I am looking for, but it annoys me that I have to render all the sprites twice. The result is cached on a texture, so it is not much of a performance hit, but I would really like to know if there is some magic to premultiplied alpha that I'm missing, or a simple pixel shader that could accomplish this in a single sprite rendering pass. Any help would be appreciated.