Q1: A clear explanation: A normal is a direction vector. The normals stored in a normal map are direction vectors sampled at points on the surface of a high resolution mesh. When you use a normal map on a low res mesh, what you are essentially doing is telling the GPU "When calculating the effect of a light on this surface, ignore the mesh's true normals and use the ones in the map instead."
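To make the "use the map's normal instead" idea concrete, here is a minimal sketch using simple Lambert (cosine) diffuse lighting. The lighting model, the function names and the sample vectors are all assumptions chosen for illustration; a real shader does this per pixel on the GPU.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lambert(normal, light_dir):
    """Diffuse intensity: cosine of the angle between the normal and the
    direction toward the light, clamped so back-facing surfaces get 0."""
    return max(0.0, dot(normalize(normal), normalize(light_dir)))

# Hypothetical values for one pixel on a flat low res face.
mesh_normal = (0.0, 0.0, 1.0)  # the true normal of the low res mesh
map_normal = (0.6, 0.0, 0.8)   # the normal read from the normal map
light_dir = (1.0, 0.0, 0.0)    # direction from the surface toward the light

flat = lambert(mesh_normal, light_dir)      # lit with the true normal
detailed = lambert(map_normal, light_dir)   # lit with the map's normal
```

The geometry is identical in both cases. Only the normal fed into the lighting calculation changes, which is why the low res face can appear to have detail it does not actually have.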
Bump and displacement maps work on a similar idea, but they do not contain a direction vector, they contain a magnitude. Essentially, when you use a bump map you are telling the renderer "Use the true surface normal, but calculate the lighting as if the existing normal were displaced by the magnitude contained in the bump map". Displacement maps are the same as bump maps in that they contain a magnitude; however, you are correct that they are used to actually modify the existing points on the geometry. A displacement map pushes the existing points of a mesh along its existing normals by the magnitude stored in the displacement map.
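The displacement step described above can be sketched in a few lines. This is only an illustration: the `displace` function, the `scale` parameter and the sample data are hypothetical, and the normals are assumed to already be unit length.

```python
def displace(vertices, normals, heights, scale=1.0):
    """Push each vertex along its own (unit length) normal by the
    magnitude sampled from the displacement map for that vertex."""
    moved = []
    for (vx, vy, vz), (nx, ny, nz), h in zip(vertices, normals, heights):
        d = h * scale
        moved.append((vx + nx * d, vy + ny * d, vz + nz * d))
    return moved

# Two vertices on a flat plane whose normals both point straight up (+Y).
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
normals = [(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
heights = [0.0, 0.5]  # hypothetical values sampled from the displacement map

displaced = displace(vertices, normals, heights)
```

Here the first vertex stays where it is and the second is raised by 0.5 along its normal, so the mesh itself changes shape rather than just its shading.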
So the difference really is that a displacement or bump map works with the existing normals of a surface without changing their direction, whereas a normal map replaces the existing surface normals.
Q2: You could use a bump map as a normal or displacement map, because in the end they are all just textures, but it would not give you desirable results. Bump maps and displacement maps are fairly interchangeable; a normal map is something slightly different.
Q3: A normal map, again, stores the normals from a high resolution surface. Basically it works like this. You have your low resolution in-game mesh and a very high resolution, high detail mesh. Both meshes have similar shapes and UV sets, so their texture map layouts look the same. To create the normal map, the high res surface is sampled at the UV coordinates of every pixel in the normal map, and the high res surface's XYZ surface normal is written into the RGB channels of the normal map. So the map contains the normals of the high res mesh. These normals are direction vectors used by shaders to calculate how light is reflected from the surface. As far as the shader is concerned, you are replacing the true surface normals of the low resolution mesh with the normals contained in the normal map.
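As a rough sketch of how an XYZ normal ends up in the RGB channels: normal components run from -1 to 1 while 8-bit colour channels run from 0 to 255, so a remap is needed. The scale-and-offset mapping below is a common convention and is assumed here, as are the function names; the exact encoding depends on the tool that bakes the map.

```python
def encode_normal(n):
    """Map a unit normal with components in [-1, 1] to 8-bit RGB in [0, 255]."""
    return tuple(round((c * 0.5 + 0.5) * 255) for c in n)

def decode_normal(rgb):
    """Inverse mapping, as a shader would do when reading the texture."""
    return tuple(c / 255 * 2.0 - 1.0 for c in rgb)

# A normal pointing straight along +Z.
pixel = encode_normal((0.0, 0.0, 1.0))
restored = decode_normal(pixel)
```

With this mapping a normal pointing straight along +Z becomes a light blue pixel. The decoded value is only approximately the original because of the 8-bit rounding.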
Binormals and tangents are a bit more difficult to explain, but in general you can think of the binormal, tangent and normal vectors as axes which make up a little 3D coordinate system for any arbitrary point on a surface. As a simple example, the normal is perpendicular to the surface, the tangent is perpendicular to the normal, and the binormal is perpendicular to both the normal and the tangent. A simple way to visualize it is as a 3D coordinate system where the normal is the Y axis, the tangent is the X axis and the binormal is the Z axis.
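A small sketch of that little coordinate system, using the same axis assignment as the example above (tangent = X, normal = Y, binormal = Z). Deriving the binormal as a cross product of the other two, and the order of that cross product, are assumptions made for illustration; tools differ on handedness. The `to_surface_space` helper is hypothetical as well.

```python
def cross(a, b):
    """Cross product: a vector perpendicular to both a and b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

tangent = (1.0, 0.0, 0.0)          # X axis, lies along the surface
normal = (0.0, 1.0, 0.0)           # Y axis, perpendicular to the surface
binormal = cross(tangent, normal)  # Z axis, perpendicular to both

def to_surface_space(local, tangent, normal, binormal):
    """Express a vector given in the local (tangent, normal, binormal)
    axes as a combination of those three vectors."""
    x, y, z = local
    return tuple(x * t + y * n + z * b
                 for t, n, b in zip(tangent, normal, binormal))

# "One unit along the local Y axis" is just the normal itself.
up = to_surface_space((0.0, 1.0, 0.0), tangent, normal, binormal)
```

All three dot products between the axes come out as zero, which is just another way of saying they are mutually perpendicular.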