Fixation Prediction for 360° Video Streaming to Head-Mounted Displays


We study the problem of predicting the Fields-of-View (FoVs) of viewers watching 360° videos on commodity Head-Mounted Displays (HMDs). Existing solutions either use the viewer’s current orientation to approximate the tiles viewed in the future, or extrapolate the FoVs from historical orientations using dead-reckoning algorithms. In this paper, we develop fixation prediction networks that concurrently leverage sensor- and content-related features to predict the viewer’s future fixation, which sets them apart from the solutions in the literature. The sensor-related features include HMD orientations, while the content-related features include image saliency maps and motion maps. We build a testbed for streaming 360° videos to HMDs, and recruit twenty-five viewers to watch ten 360° videos. We then train and validate two design alternatives of our proposed networks, which allows us to identify the better-performing design and its optimal parameter settings. Trace-driven simulation results show the merits of our proposed fixation prediction networks compared to the existing solutions, including: (i) lower consumed bandwidth, (ii) shorter initial buffering time, and (iii) short running time.


Ching-Ling Fan, Jean Lee, Wen-Chih Lo, Chun-Ying Huang, Kuan-Ta Chen, and Cheng-Hsin Hsu, "Fixation Prediction for 360° Video Streaming to Head-Mounted Displays," in Proceedings of ACM NOSSDAV 2017, June 2017.


@inproceedings{fan17:fixapred,
  author    = {Ching-Ling Fan and Jean Lee and Wen-Chih Lo and Chun-Ying Huang and Kuan-Ta Chen and Cheng-Hsin Hsu},
  title     = {Fixation Prediction for 360° Video Streaming to Head-Mounted Displays},
  booktitle = {Proceedings of ACM NOSSDAV 2017},
  pages     = {67--72},
  year      = {2017}
}