 Web searches tap databases
 
 By Kimberly Patch, Technology Research News
 
 Although the computer has made it possible 
        to quickly search through documents and databases, sifting through a series 
        of sources -- like a local database, a bunch of text documents, and the 
        Web -- still means using different programs and different searches.
 
 Researchers from Birkbeck, University of London, in England have 
        written software designed to allow users to search for something without 
        having to know where it might reside.
 
 The search method makes it possible to search different types 
        of sources at the same time, said Richard Wheeldon, a researcher at 
        Birkbeck, University of London. "Think of how difficult it is to search 
        a company's intranet, file system and databases at the same time," he 
        said. "With a few alterations to our technology, it could be made incredibly 
        simple."
 
 The key to the method is software, dubbed DbSurfer, that permits 
        free-text searches on the contents of relational databases. Data stored 
        in relational databases is ordinarily accessed using queries structured 
        to match the organization of the database.
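
 To illustrate the contrast, here is a minimal Python sketch using the standard sqlite3 module. The tables, the columns and the free_text_search helper are hypothetical and are not DbSurfer's code. The helper simply scans every text cell, where a real system would presumably consult an index.

```python
import sqlite3

# Hypothetical two-table database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE papers (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES authors(id)
    );
    INSERT INTO authors VALUES (1, 'Vannevar Bush');
    INSERT INTO papers VALUES (1, 'As We May Think', 1);
""")

# Structured query: the caller must already know the schema.
structured = conn.execute(
    "SELECT title FROM papers WHERE author_id = 1"
).fetchall()


def free_text_search(conn, keyword):
    """Return (table, rowid, column) for every text cell containing keyword."""
    needle = keyword.lower()
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    hits = []
    for table in tables:
        cursor = conn.execute(f'SELECT rowid, * FROM "{table}"')
        # Skip the leading rowid column when pairing names with values.
        columns = [d[0] for d in cursor.description][1:]
        for rowid, *values in cursor:
            for column, value in zip(columns, values):
                if isinstance(value, str) and needle in value.lower():
                    hits.append((table, rowid, column))
    return hits


# Free-text search: no table or column names are needed.
free_text = free_text_search(conn, "bush")
```

 The structured query names a table and a column, while the free-text call needs only a keyword and reports where the match was found.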
 
 DbSurfer enables free-text relational database searches through 
        a modified version of the trails method used to organize the links contained 
        in hypertext, said Wheeldon.
 
 A trail is a sequence of connected pages. As far back as 1945, 
        computer pioneer Vannevar Bush wrote about the concept of a web of trails. 
        Hypertext and the World Wide Web take advantage of this concept, but databases 
        do not, according to Wheeldon. "Trails have often been used in hypertext 
        systems, but never in relational database systems," he said.
 
 Relational databases organize information in tables made up of rows 
        and columns. Each row is a record and each column is a field, and individual 
        values reside in the cells where rows and columns meet. Relationships 
        between records are determined by fields that the records have in common.
 
 The researchers' software automatically constructs trails across 
        tables in relational databases, according to Wheeldon. The software treats 
        each database row as a virtual Web page, and builds links according to 
        database settings, he said.
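
 The idea of rows as linked pages can be sketched in Python with the standard sqlite3 module. This sketch assumes the "database settings" are foreign-key declarations and that links run in both directions. Both are assumptions for illustration, and build_row_links is a hypothetical helper rather than part of DbSurfer.

```python
import sqlite3


def build_row_links(conn):
    """Map each (table, rowid) 'page' to the set of pages it links to."""
    links = {}
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        # Each result row: (id, seq, referenced table, from column, to column, ...)
        fks = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        for _id, _seq, ref_table, from_col, to_col, *_rest in fks:
            join = (
                f'SELECT child.rowid, parent.rowid FROM "{table}" AS child '
                f'JOIN "{ref_table}" AS parent '
                f'ON child."{from_col}" = parent."{to_col}"'
            )
            for child_id, parent_id in conn.execute(join):
                src, dst = (table, child_id), (ref_table, parent_id)
                # Assumption: links can be followed in both directions.
                links.setdefault(src, set()).add(dst)
                links.setdefault(dst, set()).add(src)
    return links


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE papers (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES authors(id)
    );
    INSERT INTO authors VALUES (1, 'Vannevar Bush');
    INSERT INTO papers VALUES (1, 'As We May Think', 1);
    INSERT INTO papers VALUES (2, 'Science, the Endless Frontier', 1);
""")
links = build_row_links(conn)
```

 In this sample the author row links to both paper rows, so a trail can run from one paper through the author to the other paper.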
 
 When presented with a free text database query, DbSurfer's navigation 
        engine calculates scores for each database row, and the best scores are 
        used to construct trails. The scheme uses a probabilistic best-first algorithm 
        to select the most relevant trails. A probabilistic best-first algorithm 
        assigns more promising alternatives higher probabilities. The researchers' 
        Best Trail algorithm does this in two ways -- proportionally according 
        to the score assigned to the trail, and decreasing exponentially according 
        to rank. The program presents the results to the user as a navigation search 
        interface.
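
 The two weighting schemes can be illustrated with a short Python sketch. It is not the researchers' Best Trail implementation. The function names, the decay value of 0.5 and the requirement that scores be positive are all assumptions made for illustration.

```python
import random


def score_proportional_probs(scores):
    """Probability of each candidate is proportional to its (positive) score."""
    total = sum(scores)
    return [s / total for s in scores]


def rank_exponential_probs(scores, decay=0.5):
    """Probability shrinks by a factor of `decay` for each step down in rank.

    `decay` is a hypothetical parameter; the article does not give a value.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    weights = [0.0] * len(scores)
    for rank, index in enumerate(order):
        weights[index] = decay ** rank
    total = sum(weights)
    return [w / total for w in weights]


def pick_trail(trails, probs, rng=random):
    """Draw one candidate trail at random according to the given probabilities."""
    return rng.choices(trails, weights=probs, k=1)[0]


# Hypothetical candidate trails and scores.
trails = ["trail A", "trail B", "trail C"]
scores = [1.0, 4.0, 2.0]
by_score = score_proportional_probs(scores)
by_rank = rank_exponential_probs(scores)
chosen = pick_trail(trails, by_rank, random.Random(0))
```

 Either way the best-scoring candidate is the most likely pick, but weaker candidates still have some chance of being explored.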
 
 The researchers have also used the same basic system to search 
        the Web, a group of Java documents, program code, and Usenet newsgroups, 
        said Wheeldon. "Theoretically, it could also be used in virtual environments 
        or as a search application at the operating system level," he said.
 
 The method uses standard keyword searches of data sources, and 
        is easily customized, said Wheeldon. Data is represented in the Internet's 
        extensible markup language (XML), and this means "the look of the pages 
        can be changed in many different ways," Wheeldon said.
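
 As a rough illustration of representing a row in XML, the Python sketch below uses the standard xml.etree.ElementTree module. The "page" and "field" element names and the row_to_xml helper are hypothetical, since the article does not describe DbSurfer's actual XML format.

```python
import xml.etree.ElementTree as ET


def row_to_xml(table, row):
    """Serialize one database row (a dict of column -> value) as an XML string."""
    page = ET.Element("page", {"table": table})
    for column, value in row.items():
        field = ET.SubElement(page, "field", {"name": column})
        field.text = str(value)
    return ET.tostring(page, encoding="unicode")


# Hypothetical row.
xml_text = row_to_xml("papers", {"id": 1, "title": "As We May Think"})
```

 Because the output is plain XML, a separate stylesheet or template could decide how each row is displayed.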
 
 The technical challenge to building the software was being able 
        to construct trails efficiently, said Wheeldon. "Trail construction is 
        now typically performed in a few hundredths of a second," he said.
 
 The current prototype won't scale to very large databases, but 
        this is not a fundamental limitation, said Wheeldon. "Anything more than 
        a few tens of millions of rows and the system will choke [but] this is 
        easily fixed in theory," he said.
 
 The software does have a downside -- it is not secure enough for 
        highly sensitive data, Wheeldon said.
 
 The next steps in developing the system are linking the DbSurfer 
        indexer to a Web robot, optimizing the indexer for common databases, and 
        adding software that will enable the entire index and trail structure 
        to be accessed from within the database interface, according to Wheeldon.
 
 The software could be ready for deployment in less than a year, 
        said Wheeldon.
 
 Wheeldon's research colleagues were Mark Levene and Kevin Keenoy. 
        The research was funded by the UK Engineering and Physical Sciences Research 
        Council (EPSRC).
 
 Timeline:   < 1 year
 Funding:   Government
 TRN Categories:  Databases and Information Retrieval; Internet
 Story Type:   News
 Related Elements:  Technical paper, "Search and Navigation 
        in Relational Databases," posted in the Computing Research Repository 
        (CoRR) database at arxiv.org/abs/cs.DB/0307073.
 
 
 
 
 September 24/October 1, 2003
 