Any comments on perf? I've not profiled it to see which bits are taking
up time yet, but from 1300Hz to 100Hz to draw one tree is a serious hit.
That is a very poor measure of performance you have given me. Drawing nothing at all has a theoretical framerate of infinite frames per second, so comparing one tree to nothing at all does not measure anything. The 100 Hz figure by itself does not mean much either, at least not without knowing your video card. Try rendering a model of similar polygon count and see how the two compare.
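To make the point concrete, it helps to convert frame rates into time per frame, since frame rates do not subtract linearly. The following is only a sketch: it assumes the 1300 Hz figure is the empty scene and the 100 Hz figure is the scene with one tree, and the function name is made up for illustration.

```python
def frame_time_ms(fps: float) -> float:
    """Convert a frame rate in Hz to the time spent on one frame, in milliseconds."""
    return 1000.0 / fps


# Assumption: 1300 Hz was measured with nothing drawn, 100 Hz with one tree.
empty_ms = frame_time_ms(1300.0)
tree_ms = frame_time_ms(100.0)

# The cost of the tree is the difference in frame time, not in frame rate.
tree_cost_ms = tree_ms - empty_ms

print(f"empty scene: {empty_ms:.2f} ms per frame")
print(f"with tree:   {tree_ms:.2f} ms per frame")
print(f"tree cost:   {tree_cost_ms:.2f} ms per frame")
```

Seen this way, the tree costs a bit over 9 ms per frame on that machine. Whether that is good or bad still depends on the video card and on what a plain mesh of similar polygon count costs.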
I tried profiling it with nVIDIA PerfHUD, but the frame profiler seems to be broken - it tells me that render call #1 takes several million seconds to render. A quick run with nProf didn't bring up anything suspicious. It says that TreeDemo.Draw() only takes 1.48% of the total CPU time, but my CPU usage is only ~4% when running the demo, so that is also a poor measurement.
With that said, I am no expert on profiling, nor on billboards. If anything can be optimized, it would be the leaves (the trunk is just a mesh, nothing special about it).
If performance turns out to be a big let-down, you can use level of detail (LOD) to improve it. Generating a high-poly and a low-poly version of the same tree is easy. To get a very low-poly version, set radialSegments to 5 and cutoffLevel to 1 when calling GenerateTree. The two versions should still look alike (apart from the level of detail) as long as they were generated using the same seed.
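A rough sketch of how the switch between the two versions could work is shown below. It is written in Python only to illustrate the logic, not the library's API: the TreeLods class, the pick_lod function and the switch distance of 50 units are all hypothetical, and the two trees are assumed to have been generated up front from the same seed.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class TreeLods:
    """Holds two versions of the same tree, generated from the same seed."""
    high: Any  # full-detail tree
    low: Any   # e.g. generated with radialSegments = 5 and cutoffLevel = 1


def pick_lod(lods: TreeLods, distance: float, switch_distance: float = 50.0) -> Any:
    """Return the low-poly tree beyond switch_distance, the high-poly tree otherwise.

    switch_distance is an arbitrary placeholder; tune it per scene.
    """
    return lods.low if distance > switch_distance else lods.high


# Usage: strings stand in for the real tree objects here.
lods = TreeLods(high="high-poly tree", low="low-poly tree")
print(pick_lod(lods, distance=10.0))
print(pick_lod(lods, distance=200.0))
```

Each frame you would compute the distance from the camera to the tree and draw whichever version pick_lod returns.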