GIR-WG Charter

Administration

Name

Grid Information Retrieval (GIR or GridIR)

Chairs

Greg Newby, PhD
Arctic Region Supercomputing Center
gbnewby at arsc.edu

Nassib Nassar
Etymon Systems, Inc.
nassar at etymon.com

Yangwoo Paul Kim, PhD
Dongguk University
ywkim at dongguk.edu

Secretary

Sousan Karimi
MCNC
sousan at mcnc.org

Mailing List

Subscription details and a user interface are available at http://www.gir-wg.org/wg_list.html.

Description and Objectives

Purpose

The GridIR WG will focus on establishing the requirements, specifications, reference implementations and best practices for supporting Information Retrieval (IR) services on the Grid. Users, applications and portals will need GridIR services to provide documents, document extracts, answers or other data items that satisfy information needs.

Goals

The GridIR WG will focus on the following:

1. Establish the requirements for Grid IR services:

GridIR will be defined as a set of grid services which, together, constitute a complete IR system, including:

  • Harvesters, to gather network-based documents
  • Indexers, to build data- and file-structures for retrieval
  • Index processors, to determine post-indexing term and document weights
  • Query processors, to take user queries and gather results
  • Integrators, for ranking results from different sources
  • Renderers, to take results and organize or present them
  • Many other sub-systems and control systems
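
The way these components might fit together can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a GridIR specification: every function name (harvest, build_index, run_query, integrate, render) is hypothetical, the "sources" are in-memory dictionaries rather than network locations, scoring is a plain term count, and the index processor stage (post-indexing weights) is omitted for brevity.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str


def harvest(sources):
    """Harvester: gather documents (here from in-memory dicts, not a network)."""
    return [Document(doc_id, text)
            for source in sources
            for doc_id, text in source.items()]


def build_index(documents):
    """Indexer: build an inverted index mapping term -> {doc_id: term count}."""
    index = defaultdict(dict)
    for doc in documents:
        for term, count in Counter(doc.text.lower().split()).items():
            index[term][doc.doc_id] = count
    return index


def run_query(index, query):
    """Query processor: score each document by summed counts of the query terms."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    return dict(scores)


def integrate(result_sets):
    """Integrator: merge scores from several collections and rank, best first."""
    merged = Counter()
    for results in result_sets:
        merged.update(results)
    # Ties are broken by document id so the ordering is deterministic.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))


def render(ranked):
    """Renderer: present ranked results as plain text lines."""
    return [f"{rank}. {doc_id} (score {score})"
            for rank, (doc_id, score) in enumerate(ranked, start=1)]


# Two hypothetical collections, each harvested, indexed and queried on its own,
# with the integrator combining the per-collection results.
site_a = {"a1": "grid services for retrieval", "a2": "weather data"}
site_b = {"b1": "grid grid computing"}
results = [run_query(build_index(harvest([site])), "grid retrieval")
           for site in (site_a, site_b)]
lines = render(integrate(results))
```

Keeping each stage as a separate function mirrors the idea of independent services: in a grid setting each stage could run at a different site, with only documents, indexes or result lists passed between them.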

GridIR will also need to impose Grid-specific requirements on the IR service, including:

  • Rapid update schedules for datasets
  • Federation of datasets from multiple sources
  • Enabling local policy for dataset content access, based on Grid security infrastructure
  • Sophisticated localized indexing and query processing appropriate for each dataset
  • Sophisticated post-hoc results ranking
  • Efficient use of computational resources (e.g., multiple harvesters feeding one indexer)
  • Multimedia capabilities (incorporation of special-purpose IR systems into one meta-system)
  • Rapid rendering and context-switching, including data visualization of results and multiple 'views' of data based on different user profiles
  • Consensus-based results generation from multiple retrieval algorithms to select best-of-breed algorithms
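
As one possible reading of the consensus-based requirement above, ranked lists produced by several retrieval algorithms can be fused into a single ranking. The sketch below uses reciprocal rank fusion, a standard technique chosen here purely for illustration; the charter does not name a fusion method, and the function name and the constant k=60 (a conventional default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one consensus ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked highly by many algorithms rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Ties are broken by document id so the ordering is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Three hypothetical algorithms ranking the same query.
fused = reciprocal_rank_fusion([
    ["d1", "d2", "d3"],
    ["d2", "d1", "d4"],
    ["d2", "d3", "d1"],
])
```

Comparing each algorithm's own list against the fused ranking is one simple way to see which algorithms agree most with the consensus, which could inform a best-of-breed selection.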

2. Define a set of GridIR specifications:

The Open Grid Services Architecture (OGSA), along with technologies such as the Web Services Flow Language (WSFL), provides a framework for linking loosely coupled grid services together to form more advanced services. Though these technologies provide the infrastructure, each service description must be created by stakeholder communities to ensure the required functionality. The GridIR WG will develop an overarching IR architecture, detail service-level requirements, establish independent service models, and develop interface specifications for the various independent IR-related services, all with an eye towards tying those services