I think your questions show a bit of confusion about how things work "under the hood."
Is a Model of a cube better than indexed primitives (with
VertexPositionTexture)?
There is no difference. Internally, a Model is a vertex buffer and an index buffer, drawn as indexed primitives.
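To make "a vertex buffer and an index buffer" concrete, here is a minimal sketch of a cube laid out as indexed primitives. It is plain Python rather than XNA code, and `build_cube` is a hypothetical helper, not part of any library. Each vertex is a (position, texture coordinate) pair, which is roughly what VertexPositionTexture holds. Winding order for back-face culling is ignored to keep it short.

```python
def build_cube():
    """Return (vertices, indices) for a textured cube as an indexed triangle list."""
    vertices = []  # each entry: ((x, y, z), (u, v))
    indices = []   # three indices per triangle, pointing into `vertices`

    for axis in range(3):          # 0 = X, 1 = Y, 2 = Z
        for sign in (-1, 1):       # negative and positive face on that axis
            a = (axis + 1) % 3     # the two axes that span the face
            b = (axis + 2) % 3
            base = len(vertices)   # index of this face's first vertex

            # Four corners per face, each with its own texture coordinate.
            for u, v in ((0, 0), (1, 0), (1, 1), (0, 1)):
                pos = [0, 0, 0]
                pos[axis] = sign
                pos[a] = 2 * u - 1
                pos[b] = 2 * v - 1
                vertices.append((tuple(pos), (u, v)))

            # Two triangles per face, sharing two of the four vertices.
            indices += [base, base + 1, base + 2,
                        base, base + 2, base + 3]

    return vertices, indices


vertices, indices = build_cube()
print(len(vertices), "vertices,", len(indices) // 3, "triangles")
```

The cube has only 8 distinct corner positions, but 24 vertices, because each face needs its own texture coordinates at every corner. The index buffer is what lets the two triangles of a face share vertices. Whether that data comes from a Model or from arrays you fill in yourself, the device ends up drawing the same kind of thing.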
Is Texturing and drawing with indexed
primitives any more/less efficient than a Mesh.Draw() method?
There is no difference. Internally, the Mesh sorts triangles by material and issues one device draw call per material, because it needs to change the bound texture whenever the material changes.
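The "one draw call per material" idea can be sketched like this. It is an illustration in plain Python, not the actual Mesh implementation; `build_draw_calls` and the dictionary keys are hypothetical names chosen for the example.

```python
def build_draw_calls(triangles):
    """Group triangles by material and describe one draw call per material.

    `triangles` is a list of (material_name, (i0, i1, i2)) tuples.
    Returns (index_buffer, draw_calls).
    """
    # Collect the indices of all triangles that share a material.
    batches = {}
    for material, tri in triangles:
        batches.setdefault(material, []).extend(tri)

    # Lay the batches out back to back in a single index buffer and
    # remember where each one starts and how many triangles it holds.
    index_buffer = []
    draw_calls = []
    for material, batch in batches.items():
        draw_calls.append({
            "material": material,
            "start_index": len(index_buffer),
            "primitive_count": len(batch) // 3,
        })
        index_buffer.extend(batch)

    return index_buffer, draw_calls


triangles = [
    ("grass", (0, 1, 2)),
    ("rock", (2, 3, 0)),
    ("grass", (4, 5, 6)),
]
index_buffer, draw_calls = build_draw_calls(triangles)
for call in draw_calls:
    print(call)
```

Three triangles with two materials give two draw calls. If you draw indexed primitives yourself and bind a texture before each batch, you are doing the same work by hand.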
Should
Texturing be done via HLSL texture methods like texCUBE, or by using a
Material in the Draw code?
Texturing is done by the hardware no matter what. A 2D texture on the BasicEffect material translates to a tex2D() function call in HLSL.
Cube texture look-up is slightly less efficient than 2D texture look-up, because there is a normalizing divide (and a comparison) involved in calculating the actual texture coordinates in the texture unit. Thus, fetching each texel may be a bit slower when using texCUBE than when using tex2D.
On the other hand, texturing with tex2D means that you need to re-bind four or five textures each time you draw the sky box (assuming the back face will never be drawn). Re-binding textures and issuing draw calls both have fixed overhead.
Thus, you may find, when you measure, that on your hardware texCUBE is slower on large displays (1920x1080, etc.), because the per-pixel cost dominates. Meanwhile, you may find on your hardware that tex2D is slower on small displays (848x480, say), because the set-up cost dominates. Or you may find that there is no difference on an empty scene, but there is a difference on a heavily loaded scene. Or vice versa.
It's impossible to tell from just a loose "what if" -- or, more accurately, I can sketch reasons for why either outcome would be true, depending on the specific hardware. Pick a target, and measure.
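For the curious, the "normalizing divide (and a comparison)" amounts to roughly the following. This is a sketch in plain Python of what the texture unit does in hardware. The face names and axis signs follow the table in the OpenGL specification, which I am assuming here purely for illustration; the exact orientation on your hardware and API is a convention and may differ. `cube_lookup` is a hypothetical name.

```python
def cube_lookup(x, y, z):
    """Map a direction vector to (face, u, v), with u and v in [0, 1]."""
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax == 0 and ay == 0 and az == 0:
        raise ValueError("direction must be non-zero")

    # The comparison: find the axis with the largest magnitude.
    if ax >= ay and ax >= az:
        face = "+X" if x > 0 else "-X"
        major = ax
        sc = -z if x > 0 else z
        tc = -y
    elif ay >= az:
        face = "+Y" if y > 0 else "-Y"
        major = ay
        sc = x
        tc = z if y > 0 else -z
    else:
        face = "+Z" if z > 0 else "-Z"
        major = az
        sc = x if z > 0 else -x
        tc = -y

    # The divide: project onto the chosen face, then remap [-1, 1] to [0, 1].
    u = 0.5 * (sc / major + 1.0)
    v = 0.5 * (tc / major + 1.0)
    return face, u, v


print(cube_lookup(1, 0, 0))
print(cube_lookup(2, 1, 0))
```

Once the face and (u, v) are known, the rest is an ordinary 2D fetch from that face. A tex2D look-up skips the comparison and the divide, which is the small per-texel difference described above.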
What are the preloading benefits/drawbacks
when using the Content Pipeline, if any? Or is it just data structure
thing for simplicity?
Once data is loaded into vertex buffers, it's loaded data. How it gets there doesn't matter for rendering. However, from a content creation pipeline point of view, it's always better to let artists (the ones doing the model creation) have as much control as possible, and reduce the number of times they have to call in a programmer to have some change made. That has to do with production efficiency, not rendering efficiency.