“Gaze-Aware Streaming Solutions for the Next Generation of Mobile VR Experiences” by Lungaro, Sjöberg, Valero, Mittal and Tollmar – ACM SIGGRAPH HISTORY ARCHIVES


Title:

    Gaze-Aware Streaming Solutions for the Next Generation of Mobile VR Experiences

Session/Category Title:   IEEE TVCG Session on Virtual and Augmented Reality



Abstract:


    This paper presents a novel approach to content delivery for video streaming services. It exploits information from connected eye-trackers embedded in the next generation of VR Head Mounted Displays (HMDs). The proposed solution aims to deliver high visual quality, in real time, around the users’ fixation points while lowering the quality everywhere else. The goal of the proposed approach is to substantially reduce the overall bandwidth requirements for supporting VR video experiences while delivering high levels of user-perceived quality. The prerequisites for achieving these results are: (1) mechanisms that can cope with different degrees of latency in the system and (2) solutions that support fast adaptation of video quality in different parts of a frame without requiring a large increase in bitrate. A novel codec configuration, capable of supporting near-instantaneous video quality adaptation in specific portions of a video frame, is presented. The proposed method exploits built-in properties of HEVC encoders; while it introduces a moderate amount of error, these errors are undetectable by users. Fast adaptation is the key to enabling gaze-aware streaming and its reduction in bandwidth. A testbed implementing gaze-aware streaming, together with a prototype HMD with a built-in eye tracker, is presented and was used for testing with real users. The studies quantified the bandwidth savings achievable by the proposed approach and characterized the relationships between Quality of Experience (QoE) and network latency. The results showed that up to 83% less bandwidth is required to deliver high QoE levels to users, as compared to conventional solutions.
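The quality-allocation idea described in the abstract (high visual quality around the fixation point, lower quality everywhere else) can be sketched as a per-tile quantization parameter (QP) assignment. The sketch below is illustrative only, not the authors’ codec configuration: the tile coordinates, the 10° foveal radius, and the QP values 22/42 are assumptions chosen for the example, not values from the paper.

```python
import math

def tile_qp(tile_center, gaze, fovea_deg=10.0, qp_high_quality=22, qp_low_quality=42):
    """Choose an HEVC QP for one tile: a low QP (high quality) for tiles
    near the current fixation point, a high QP (low quality) elsewhere.
    Coordinates are in degrees of visual angle; all thresholds are
    illustrative assumptions, not the paper's parameters."""
    dist = math.hypot(tile_center[0] - gaze[0], tile_center[1] - gaze[1])
    return qp_high_quality if dist <= fovea_deg else qp_low_quality

def frame_qp_map(tile_centers, gaze):
    """Assign a QP to every tile of a frame for the current gaze sample.
    Recomputing this map per gaze update is what fast quality adaptation
    would make effective despite system latency."""
    return [tile_qp(c, gaze) for c in tile_centers]

# Example: with gaze at the origin, a tile 5 degrees away gets the
# high-quality QP and a tile 40 degrees away gets the low-quality QP.
print(frame_qp_map([(5.0, 0.0), (40.0, 0.0)], (0.0, 0.0)))  # [22, 42]
```

In a real system the QP map would feed the encoder's per-tile rate control; the bandwidth saving comes from spending bits only where the fovea can resolve detail.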


