Video organizes paper

By Kimberly Patch, Technology Research News

With the notion of the paperless office fading into history, researchers from the University of Washington are working to more closely integrate the paper world -- still on the rise -- with the world of electronic data.

The researchers' system uses a computer and overhead video camera to track physical documents on a desk and automatically link them to appropriate electronic documents. The researchers have constructed a pair of prototypes that track paper documents and sort photos.

One advantage of the system is that it doesn't require any special tags, paper or marks, said Jiwon Kim, a researcher at the University of Washington. "Our work... allows the user to keep using the old paper while only adding a single video camera to the environment," she said.

The paper-tracking system allows users to pinpoint the location of a given document within a stack of documents on the desktop. Users can find documents by keyword, by appearance or by how recently a paper was moved. The photo-sorting application allows users to sort digital photographs using printouts of the photos.

The use of paper keeps increasing because the user interfaces of paper and electronic documents complement each other, said Kim. "Paper enables tangible interactions and therefore is more intuitive to use, while electronic documents are convenient for editing, sharing [and] indexing," she said. Currently, however, "these two formats are decoupled from each other, making it hard to take advantage of the conveniences of both media."

The researchers were able to better integrate the two formats by taking advantage of recent breakthroughs in computer vision techniques that allow for fast and reliable object recognition, said Kim.

The researchers' system uses a combination of computer vision techniques to infer the structure of a stack of papers. "Such a video sequence usually differs from regular video in that the scene changes infrequently," said Kim. The user moves paper X from stack A to stack B, and after a pause moves paper Y from stack A to stack C, and so forth, she said. "Therefore, we first split the video into these individual movements... then we interpret each event to figure out which document moved."
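The event-splitting step Kim describes can be illustrated with a minimal sketch. It assumes each video frame has already been reduced to a single motion score, such as the fraction of pixels that changed since the previous frame. The function name, the threshold and the gap length are hypothetical and are not taken from the researchers' system.

```python
from typing import List, Tuple


def split_into_events(
    motion: List[float], threshold: float = 0.1, min_gap: int = 3
) -> List[Tuple[int, int]]:
    """Group frames into movement events.

    motion[i] is an assumed per-frame motion score. An event is a run of
    frames whose score exceeds `threshold`; it ends once `min_gap`
    consecutive quiet frames follow it. Returns (first_frame, last_frame)
    index pairs, both inclusive.
    """
    events: List[Tuple[int, int]] = []
    start = None  # first frame of the event in progress, if any
    quiet = 0     # quiet frames seen since the last moving frame
    for i, score in enumerate(motion):
        if score > threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_gap:
                events.append((start, i - quiet))
                start = None
                quiet = 0
    if start is not None:
        # The video ended while an event was still open.
        events.append((start, len(motion) - 1 - quiet))
    return events


# Two movements separated by a pause, as in the paper X / paper Y example.
scores = [0.0, 0.0, 0.5, 0.6, 0.0, 0.0, 0.0, 0.0, 0.4, 0.3, 0.2, 0.0, 0.0]
print(split_into_events(scores))
```

Because the scene changes infrequently, a simple quiet-period rule like this is enough to separate one movement from the next in the sketch; interpreting which document moved in each event is a separate step.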

After such events are processed, the system reconstructs the structure of the paper stacks and is ready to answer such queries as "Where is my W-2 form?", said Kim.

The user can interact with the system in several ways, said Kim. Users can choose a document from a group of thumbnails on the computer screen or can perform keyword searches on the title or author of the document. The system indicates the location of the document by displaying the desktop stack that contains it, then expanding the stack and highlighting the document in red.

The system can also begin with a desk that is already full of documents, said Kim. "We didn't want to force the user to start with an empty desk," she said. "Instead, the system gradually discovers the paper documents on the desk over time, as the user moves them around."

Users can also browse desktops in remote locations by clicking and dragging on an image of the remote desk, said Kim.

Although the video image resolution is not high enough for a human observer to read the text, making it difficult to distinguish documents with a similar layout, the researchers used an existing feature-based object recognition technique, dubbed Scale-Invariant Feature Transform, that was able to differentiate them, said Kim.

The researchers also developed an algorithm that represents the evolution of the paper stacks as a sequence of graphs.
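A simplified sketch can show the idea of recording the desk as a sequence of states, one per movement event. It is a stand-in, not the researchers' algorithm: ordered lists take the place of the graphs, only the top document of a stack can be moved, and the names Desk, apply_move and locate are hypothetical.

```python
from copy import deepcopy
from typing import Dict, List, Optional, Tuple

# Assumed representation: stack name -> documents listed bottom to top.
Desk = Dict[str, List[str]]


def apply_move(desk: Desk, doc: str, src: str, dst: str) -> Desk:
    """Return a new desk state with `doc` moved from the top of src onto dst."""
    if not desk.get(src) or desk[src][-1] != doc:
        raise ValueError(f"{doc!r} is not on top of stack {src!r}")
    new_desk = deepcopy(desk)  # keep the earlier state unchanged
    new_desk[src].pop()
    new_desk.setdefault(dst, []).append(doc)
    return new_desk


def locate(desk: Desk, doc: str) -> Optional[Tuple[str, int]]:
    """Return (stack name, number of documents lying on top), or None."""
    for name, docs in desk.items():
        if doc in docs:
            return name, len(docs) - 1 - docs.index(doc)
    return None


# One state per event: paper X goes from A to B, then paper Y from A to C.
history: List[Desk] = [{"A": ["W-2 form", "paper Y", "paper X"], "B": [], "C": []}]
history.append(apply_move(history[-1], "paper X", "A", "B"))
history.append(apply_move(history[-1], "paper Y", "A", "C"))

print(locate(history[0], "W-2 form"))
print(locate(history[-1], "W-2 form"))
```

Keeping every intermediate state, rather than only the latest one, is what would make it possible to ask where a document was at an earlier point as well as where it is now.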

The researchers' first prototype application allows users to find physical documents buried in stacks of documents on a desk. "The user can issue queries like 'where is the paper written by John Smith?', or 'this looks like the thumbnail of my tax form. Where is it?'," said Kim.

The researchers' second prototype application allows users to more easily sort digital photographs. "Sorting a large number of digital photographs is not an easy task [because it entails] having to drag-and-drop each file into different folders," said Kim. "In contrast, people are adept at hand-sorting printed photographs into piles on a desk."

The researchers printed out digital photographs on paper and recorded a video of the user sorting them into physical stacks on the desk. The system then automatically organizes the digital photographs into folders on the computer that mirror the physical stacks on the desk.

This way of merging the digital and physical worlds has the potential to prove useful beyond just tracking the locations of physical objects, said Kim. The system could be taken further to allow for querying the history of documents, lifting written annotations from document surfaces, recognizing text, and attaching reminders to documents, she said.

A similar tracking and recognition system could be applied to objects other than documents, said Kim. "Applications may range from finding lost objects [like a] key, pen or wallet, to indexing books or CDs on the shelf, to keeping track of items in supermarkets or warehouses," she said. "In many of these cases computer vision-based tracking could be used in combination with other dedicated technologies like radio frequency ID tags."

Radio frequency ID tags are small computer chips that, when hit with a radio wave of a certain frequency, use the energy from the wave to emit a unique identification number.

The researchers are working on improving the existing system to handle a wider range of user interactions with documents, to optimize the performance of the video analysis engine so the input video can be processed in real-time, and to build other applications, said Kim. "We can imagine supporting queries like 'find me all documents I haven't used for the past three weeks so I can clean them off the desk' or 'find me all documents that look similar to this credit card bill so I can file them together'," she said.
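The first of those imagined queries could be answered from a record of when each document was last moved. The sketch below assumes such a record exists as a mapping from document name to timestamp; the function name and the sample data are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def unused_since(
    last_moved: Dict[str, datetime], now: datetime, idle: timedelta
) -> List[str]:
    """Return documents whose last recorded movement is older than `idle`."""
    cutoff = now - idle
    return sorted(doc for doc, moved in last_moved.items() if moved < cutoff)


# Made-up sample data for illustration only.
last_moved = {
    "credit card bill": datetime(2004, 12, 1),
    "W-2 form": datetime(2005, 1, 15),
}
print(unused_since(last_moved, now=datetime(2005, 1, 19), idle=timedelta(weeks=3)))
```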

The system can be used now for applications similar to the researchers' prototype applications. The system could be ready for practical use on general desktops in three to four years, said Kim.

Kim's research colleagues were Steven M. Seitz and Maneesh Agrawala. The researchers presented their work at User Interface Software and Technology 2004 (UIST '04), held in Santa Fe, New Mexico, October 24 to 27, 2004. The research was funded by the National Science Foundation (NSF) and Intel Corporation.

Timeline:   3-4 years
Funding:   Government, Private
TRN Categories:  Human-Computer Interaction; Databases and Information Retrieval
Story Type:   News
Related Elements:  Technical paper, "Video Based Document Tracking: Unifying Your Physical and Electronic Desktops," presented at User Interface Software and Technology 2004 (UIST '04), Santa Fe, New Mexico, October 24-27, 2004 and posted at


January 12/19, 2005

© Copyright Technology Research News, LLC 2000-2006. All rights reserved.