You are right that in the flat panel business supersampling is a way to achieve anti-aliasing, i.e. rendering at a higher resolution than the panel and then downsampling makes the picture "smoother". I probably would not call it "sharpness", but I guess this is just different wording.
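Just to illustrate the flat panel case, here is a toy sketch of the linear variant: render at 2x the panel resolution and average each 2x2 block down to one panel pixel. This assumes a plain box filter; real drivers may use fancier kernels:

```python
import numpy as np

def box_downsample_2x(img):
    """Average each 2x2 block of the supersampled image into one panel pixel."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

panel_w, panel_h = 4, 4
rendered = np.random.rand(panel_h * 2, panel_w * 2)  # stand-in for the 2x render
on_panel = box_downsample_2x(rendered)               # every panel pixel averages 4 samples
print(on_panel.shape)                                # -> (4, 4)
```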
In an HMD, supersampling plays basically the same role, with one important difference: the downsampling is no longer linear, because it also includes the pre-lens warp transformation. This transformation is non-linear and causes some picture regions to be downsampled more than others. The actual distribution of the downsampling "intensity" depends on the optical properties of the lenses and on the FOV. Suffice it to say that for the Rift, Oculus decided to use an SS factor of 1.7 to achieve, at worst, 1:1 pixel mapping over the 90° FOV. This means there are actually parts of the picture which do not get any anti-aliasing treatment (in the center), as they are basically mapped one to one, while the other regions (towards the rim of the FOV) get higher downsampling/anti-aliasing than the original 1.7 SS factor would suggest.
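To make the non-linearity tangible, here is a small numeric sketch. The distortion polynomial and its coefficients are made up (the real ones depend on the actual lens profile); I only picked them so that the per-axis factor needed for 1:1 at the center comes out at 1.3, i.e. roughly 1.7x in total pixels, to echo the number above:

```python
# Made-up barrel-distortion profile: a display pixel at normalized radius r
# samples the render texture at u(r) = r*(1 + k1*r^2 + k2*r^4)/(1 + k1 + k2).
# k1, k2 are hypothetical, chosen only so the per-axis SS needed for 1:1 at
# the center is 1.3, i.e. ~1.7x in total pixel count.
k1, k2 = 0.2, 0.1
ss_axis = 1.0 + k1 + k2                        # per-axis SS giving 1:1 at r = 0
print(f"total SS factor: {ss_axis ** 2:.2f}")  # -> 1.69

def local_ratio(r):
    """Texture pixels consumed per display pixel along the radius at r."""
    du_dr = (1.0 + 3.0 * k1 * r**2 + 5.0 * k2 * r**4) / (1.0 + k1 + k2)
    return ss_axis * du_dr

for r in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"r = {r:4.2f} -> {local_ratio(r):.2f} texture px per display px")
# r = 0.00 -> 1.00 : mapped 1:1, no anti-aliasing happens here
# r = 1.00 -> 2.10 : heavily downsampled, i.e. strongly anti-aliased
```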
This is also the reason why the tools offer the user an option to either increase the SS factor, to apply at least some anti-aliasing to the region around the optical center which is otherwise mapped 1:1, or to decrease it (saving GPU time), at the expense of introducing aliasing artifacts which spread outwards from the center of the FOV.
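Continuing with the same made-up model from above, this shows what moving the SS slider does at both ends of the radius (the values are illustrative only):

```python
k1, k2 = 0.2, 0.1  # same hypothetical lens profile as above

def ratio(r, user_ss):
    """Texture-to-display pixel ratio at radius r for a given user SS slider,
    where 1.0 means the 'recommended' factor that yields 1:1 at the center."""
    base = 1.0 + k1 + k2
    return base * user_ss * (1.0 + 3*k1*r**2 + 5*k2*r**4) / (1.0 + k1 + k2)

for user_ss in (0.8, 1.0, 1.2):
    print(f"x{user_ss}: center {ratio(0.0, user_ss):.2f}, rim {ratio(1.0, user_ss):.2f}")
# x0.8: center 0.80 -> undersampled, aliasing artifacts appear around the center
# x1.0: center 1.00 -> exactly 1:1, AA everywhere but the center
# x1.2: center 1.20 -> even the center now gets some anti-aliasing
```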
As the 8K (and the 5K) use a larger FOV than the OR or the Vive, common sense would suggest that the SS factor should also be higher if one wants to keep the 1:1 pixel mapping in the center: a rectilinear projection spends relatively more render pixels towards the edges as the FOV grows, so for a fixed render target size the center gets proportionally fewer of them. On the other hand, the higher display resolution (and the subpixel arrangement) may help to mitigate it a bit, and it is not clear how the advantage and the disadvantage add up in the final product.
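Here is the geometry behind that common sense, in a simplified planar-projection model which ignores the canted displays and the 8K's real per-eye FOV (which I do not know); the 12 px/deg center density is an arbitrary example value:

```python
import math

def required_width(fov_deg, center_px_per_deg):
    """Render-target width (px) needed so a rectilinear projection of the
    given horizontal FOV keeps the wanted pixel density at the image center."""
    f = center_px_per_deg * 180.0 / math.pi      # focal length in pixels
    return 2.0 * f * math.tan(math.radians(fov_deg) / 2.0)

for fov in (90, 110, 120, 140):
    print(f"{fov:3d} deg -> {required_width(fov, 12):5.0f} px wide")
#  90 deg -> 1375 px
# 120 deg -> 2382 px   (~1.73x more pixels per axis than 90 deg)
# 140 deg -> 3778 px
```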
Pimax never disclosed which SS factor they use for the pre-lens warp transformation (even though they were asked about it many times during the KS campaign and after). That is unfortunate, because this is clearly one of the most important factors in determining whether your GPU can actually handle the headset or not.
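Some back-of-the-envelope arithmetic shows why it matters so much. The SS factors below are pure guesses (which is exactly the problem), and I assume rendering at the native panel resolution and 90 Hz; if the GPU in fact renders for the lower-resolution input signal that the headset upscales, scale the numbers accordingly:

```python
panel = 3840 * 2160 * 2   # both eyes at native panel resolution
for ss in (1.0, 1.5, 2.0):
    print(f"SS {ss}: {panel * ss / 1e6:.0f} Mpx per frame, "
          f"{panel * ss * 90 / 1e9:.1f} Gpx/s at 90 Hz")
# SS 1.0: 17 Mpx per frame, 1.5 Gpx/s at 90 Hz
# SS 2.0: 33 Mpx per frame, 3.0 Gpx/s at 90 Hz
```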
I guess we can only wait and see on this point.
This has already been addressed by others, so I would just say that the HW scaler will work on serial data as it is being received. It can wait for one or two scanlines to be ready before starting its work, to minimize the latency. So technically I was not completely correct, as there are algorithms which can work just fine in this setup, i.e. without the scaler ever having the complete image. I was just trying to point out that such algorithms will probably be rather simple and will only work locally, never globally on the whole image.
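For illustration, here is a sketch of such a local algorithm: a streaming 2x bilinear-ish upscaler which never holds more than two input scanlines. A real HW scaler will surely differ in the details (kernel, precision, edge handling), but the structure, a couple of line buffers and purely local math, is the point:

```python
def _widen_2x(row):
    """Double a row horizontally: each pixel followed by its blend with the next."""
    out = []
    for a, b in zip(row, row[1:] + row[-1:]):
        out.extend((a, (a + b) // 2))
    return out

def upscale_2x_stream(scanlines):
    """Consume scanlines one by one, emitting output rows as soon as the
    needed inputs have arrived; only two widened rows are ever kept."""
    prev = None
    for row in scanlines:
        wide = _widen_2x(row)
        if prev is not None:
            yield [(a + b) // 2 for a, b in zip(prev, wide)]  # interpolated row
        yield wide
        prev = wide
    if prev is not None:
        yield prev  # duplicate the last row to reach 2x height

frame = [[0, 100], [200, 50]]
for out_row in upscale_2x_stream(iter(frame)):
    print(out_row)  # 4 output rows from 2 input rows, emitted on the fly
```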
I must admit I do not expect Brainwarp to help or to add anything to the (figurative) picture. From the original (even though a bit vague) explanation given by Pimax, I concluded that it would bring as many problems as solutions and was not worthwhile. Unfortunately, Pimax never elaborated further on the technology.
Edit: Fixed the wrong explanation of the differences in downsampling in the pre-lens warp transformation. Thanks to @jojon.