
Stanford University
I am currently working on video streaming with interactive region-of-interest (IROI). The user can control pan/tilt/zoom while watching the video. Alternatively, the system can choose the region-of-interest (ROI) for presentation, which relieves the user of the navigation burden.

High-resolution digital imaging sensors are becoming more widespread. High-spatial-resolution videos can also be stitched from the views of multiple cameras, as implemented by Hewlett-Packard in their video conferencing product Halo [1]. However, the limited resolution of display panels and/or the limited bit-rate available for communication makes it challenging to deliver this high-resolution content to the client. Suppose that a client limited by one of these factors requests the server to stream a high-spatial-resolution video. One approach would be to stream a spatially downsampled version of the entire video scene to suit the client's display window resolution or bit-rate. With this approach, however, the user might not be able to watch a local region-of-interest (ROI) at the highest captured resolution.

We propose a video delivery system that enables virtual pan/tilt/zoom functionality during the streaming session, so that the server can adapt and stream only those regions of the video that are desired at the client's end at that time. Such a system requires a video coding scheme that allows sufficient random access to arbitrary ROIs while keeping the transmission rate as low as possible. We have proposed such a video coding scheme in [2].

We have also developed a user interface that allows the user to select the ROI while watching the video. As shown in the figure below, the display screen at the client's side consists of two areas:
The zoom factor can be controlled with the mouse scroll wheel. For any zoom factor, the ROI can be shifted by holding down the left mouse button and moving the mouse. As shown in the figure below, the location of the ROI is depicted in the thumbnail/overview display area by overlaying a corresponding rectangle on the video. The color and size of the rectangle vary according to the zoom factor.


User Interface
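The relationship between the zoom factor, the ROI, and the rectangle overlaid on the overview can be sketched in a few lines of Python. This is only an illustrative sketch, not the interface code used in the project: the names `Roi`, `roi_for_zoom`, `pan`, and `overview_rectangle` are hypothetical, and the convention that a zoom factor of 1.0 means one captured pixel per display pixel is an assumption. The color of the rectangle is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Roi:
    """Rectangle given by its top-left corner and size (hypothetical helper)."""
    x: float
    y: float
    width: float
    height: float


def roi_for_zoom(center_x, center_y, zoom, frame_w, frame_h, display_w, display_h):
    """Return the ROI, in full-resolution pixels, shown at a given zoom factor.

    Assumption: zoom == 1.0 maps one captured pixel to one display pixel;
    smaller values show a larger part of the scene.
    """
    width = min(display_w / zoom, frame_w)
    height = min(display_h / zoom, frame_h)
    # Clamp the top-left corner so the ROI stays inside the captured frame.
    x = min(max(center_x - width / 2, 0), frame_w - width)
    y = min(max(center_y - height / 2, 0), frame_h - height)
    return Roi(x, y, width, height)


def pan(roi, dx, dy, frame_w, frame_h):
    """Shift the ROI by a mouse drag (dx, dy) given in full-resolution pixels."""
    x = min(max(roi.x + dx, 0), frame_w - roi.width)
    y = min(max(roi.y + dy, 0), frame_h - roi.height)
    return Roi(x, y, roi.width, roi.height)


def overview_rectangle(roi, frame_w, frame_h, overview_w, overview_h):
    """Scale the ROI into overview (thumbnail) coordinates for the overlay."""
    sx = overview_w / frame_w
    sy = overview_h / frame_h
    return Roi(roi.x * sx, roi.y * sy, roi.width * sx, roi.height * sy)
```

In this sketch, zooming in shrinks the ROI in full-resolution coordinates, so the overlaid rectangle in the overview shrinks with it, which matches the behavior described above.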


We are currently designing the system such that it works over practical packet-switched networks like the Internet. Our research involves various optimizations of the coding scheme along with algorithms for packet scheduling and pre-fetching data [3] from the server to ensure good quality of the delivered video along with low latency of interaction.
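To give a feel for what pre-fetching involves, the sketch below extrapolates the ROI position a few frames ahead and computes a region that covers both the current and the predicted ROI. It uses a plain constant-velocity extrapolation chosen for illustration; it is not the prediction method of [3], and the function names `predict_center` and `prefetch_region` are hypothetical.

```python
def predict_center(history, lookahead):
    """Extrapolate the ROI center `lookahead` frames ahead.

    history: list of (frame_index, x, y) samples, oldest first.
    Assumption: the ROI keeps the velocity observed between the
    last two samples (constant-velocity model).
    """
    if not history:
        raise ValueError("need at least one sample")
    if len(history) == 1:
        _, x, y = history[-1]
        return (x, y)
    (t0, x0, y0), (t1, x1, y1) = history[-2], history[-1]
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("frame indices must be strictly increasing")
    vx = (x1 - x0) / dt
    vy = (y1 - y0) / dt
    return (x1 + vx * lookahead, y1 + vy * lookahead)


def prefetch_region(current, predicted, margin=0):
    """Bounding box covering the current and the predicted ROI.

    Both rectangles are (x, y, width, height) tuples; `margin` pads the
    box on every side to absorb prediction error. The result is not
    clamped to the frame.
    """
    cx, cy, cw, ch = current
    px, py, pw, ph = predicted
    left = min(cx, px) - margin
    top = min(cy, py) - margin
    right = max(cx + cw, px + pw) + margin
    bottom = max(cy + ch, py + ph) + margin
    return (left, top, right - left, bottom - top)
```

A larger margin lowers the chance that the user pans outside the pre-fetched data, at the cost of a higher transmission rate; this trade-off is one reason the prediction quality matters.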

We have recently proposed [4,5] to exploit overlaps in ROIs within a peer population to employ application-layer multicasting for an efficient and scalable delivery-mechanism. Notable challenges include adapting the overlay topology on-the-fly to account for changing ROIs, stringent latency constraint due to the interactive nature of the system, and limited bandwidth at the server hosting the IROI video session.
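The idea of exploiting overlapping ROIs can be illustrated with a small sketch. It assumes, purely for illustration, that each frame is partitioned into a regular grid of independently decodable tiles and that ROIs have integer pixel coordinates; the actual coding scheme and overlay protocol of [2,4,5] are not reproduced here, and `tiles_for_roi` and `tile_demand` are hypothetical names.

```python
from collections import Counter


def tiles_for_roi(roi, tile_w, tile_h):
    """Return the set of (column, row) tiles that intersect an ROI.

    roi is (x, y, width, height) in integer pixels. The regular tile
    grid is an assumption made for this sketch.
    """
    x, y, w, h = roi
    if w <= 0 or h <= 0:
        return set()
    first_col, last_col = x // tile_w, (x + w - 1) // tile_w
    first_row, last_row = y // tile_h, (y + h - 1) // tile_h
    return {
        (col, row)
        for col in range(first_col, last_col + 1)
        for row in range(first_row, last_row + 1)
    }


def tile_demand(peer_rois, tile_w, tile_h):
    """Count how many peers currently need each tile.

    peer_rois maps a peer identifier to that peer's ROI.
    """
    demand = Counter()
    for roi in peer_rois.values():
        demand.update(tiles_for_roi(roi, tile_w, tile_h))
    return demand


# Two peers whose ROIs overlap in one 64x64 tile.
demand = tile_demand({"peer_a": (0, 0, 128, 64), "peer_b": (64, 0, 128, 64)}, 64, 64)
shared = {tile for tile, count in demand.items() if count > 1}
```

Tiles requested by several peers at once, such as those in `shared`, are the ones for which relaying data among peers can save server bandwidth. Because ROIs change during the session, such counts would have to be updated continually.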

Demos:


[a] Aditya Mavlankar recently built a demonstrator at Deutsche Telekom Research Laboratories in Berlin, Germany. The demonstrator shows interactive viewing of a soccer game. The view of the entire soccer playfield was obtained by stitching views from multiple cameras. The ROI can be chosen to conveniently focus on a part of the playfield. Also provided is an automatic mode in which the system tracks the ball and chooses the ROI. The automatic mode relieves the user of the navigation burden, although the user can still change the zoom factor. Download video of demo.

[b] Head-tracking for finer selection of ROI in automatic mode: A camera placed under the TV screen can track the user's head. In the automatic mode described in [a] above, the user can shift the ROI to the right or to the left by moving their head. Download video of demo.

References:


[1] Halo: Video Conferencing Product by Hewlett-Packard

[2] Aditya Mavlankar, Pierpaolo Baccichet, David Varodayan, and Bernd Girod, "Optimal Slice Size for Streaming Regions of High Resolution Video with Virtual Pan/Tilt/Zoom Functionality," Proc. of 15th European Signal Processing Conference (EUSIPCO), Poznan, Poland, Sept. 2007 ([paper], [presentation]) (Best Student Paper Award)

[3] Aditya Mavlankar, David Varodayan, and Bernd Girod, "Region-of-Interest Prediction for Interactively Streaming Regions of High Resolution Video," Proc. of 16th IEEE International Packet Video Workshop (PV), Lausanne, Switzerland, Nov. 2007 ([paper], [poster]) (Student Travel Grant Awarded; Sponsored by Vidyo (formerly Layered Media) and Microsoft Research Asia)

[4] Aditya Mavlankar, Jeonghun Noh, Pierpaolo Baccichet, and Bernd Girod, "Peer-to-Peer Multicast Live Video Streaming with Interactive Virtual Pan/Tilt/Zoom Functionality," Proc. of International Conference on Image Processing (ICIP), San Diego, CA, USA, Oct. 2008 ([paper])

[5] Aditya Mavlankar, Jeonghun Noh, Pierpaolo Baccichet, and Bernd Girod, "Optimal Server Bandwidth Allocation for Streaming Multiple Streams via P2P Multicast," Proc. of IEEE International Workshop on Multimedia Signal Processing (MMSP), Cairns, Australia, Oct. 2008 ([paper])





Last modified: Aug 11, 2008.