Orchestration of web services in the NIF project: using the Kepler workflow engine for data fusion

Vadim Astakhov (UCSD), Anita Bandrowski (UCSD), Amarnath Gupta (UCSD), Jeffery Grethe (UCSD), Maryann Martone (UCSD)

We report progress in using the Kepler workflow engine and a service-oriented architecture (SOA) to prototype application integration workflows that combine data and web services developed by the Neuroscience Information Framework (NIF). One prerequisite of the scientific enterprise is searching for effective and useful data and resources, such as reagents, neuroanatomical features, genes, or proteins. Finding relevant resources is becoming a challenge not of scarcity but of overabundance: relevant data can be found anywhere among thousands of neuroscience-relevant information resources created by a range of information providers, including research groups, funding agencies, vendor groups, and public data initiatives.
NIF provides a graphical user interface (GUI) to locate and access ontologically aligned, semantically fused, heterogeneous federated information. NIF has also atomized the functions that serve the user interface and exposed them as services that can be used like “Lego blocks” to query the data or to build entirely new interfaces or tools. Currently, we use Kepler to orchestrate communication among the various NIF services and to provide a transparent layer for data fusion. Kepler combines data and processes into a configurable, structured set of steps that helps implement semi-automated workflows. It provides a development environment with a graphical user interface for designing workflows composed of linked components called Actors, which can be executed under different Models of Computation. In this work, we report on specific workflows that perform data fusion and orchestration of diverse web services. This “Brain data flow” (see figures below) outputs categorized counts of information about brain regions from 150 data sources. Obtaining a similar set of data from the NIF GUI requires manually writing down the result count for each database for each query. Kepler, unencumbered by the current configuration of the user interface, can be asked to pull a different set of data from the result set, in this case the number of results, and place it into a table. This table can then easily be turned into a graphic that helps users see which databases are information-rich for a particular query. In this example, Kepler loops over all brain parts and all databases and retrieves the same set of information for each, producing a massive matrix (http://tinyurl.com/6nkfe9f).
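The loop described above can be pictured with a minimal Python sketch. It is not the Kepler workflow itself and does not use any NIF API: `build_count_matrix`, `richest_sources`, `to_tsv`, and the stub `fake_count_service` are hypothetical names introduced for illustration, and the stub returns made-up numbers where a real workflow would call a NIF web service.

```python
from typing import Callable, Dict, List

# Assumed shape of a count service: (brain region, data source) -> result count.
CountService = Callable[[str, str], int]


def fake_count_service(region: str, source: str) -> int:
    """Placeholder for a web-service call; returns made-up counts, not NIF data."""
    return len(region) * len(source)


def build_count_matrix(
    regions: List[str], sources: List[str], count_service: CountService
) -> Dict[str, Dict[str, int]]:
    """Query every (region, source) pair and collect the counts in a nested dict."""
    return {
        region: {source: count_service(region, source) for source in sources}
        for region in regions
    }


def richest_sources(
    matrix: Dict[str, Dict[str, int]], region: str, top_n: int = 3
) -> List[str]:
    """Return the sources with the highest counts for one region (ties by name)."""
    row = matrix[region]
    return sorted(row, key=lambda s: (-row[s], s))[:top_n]


def to_tsv(matrix: Dict[str, Dict[str, int]], sources: List[str]) -> str:
    """Render the matrix as a tab-separated table with one row per region."""
    lines = ["\t".join(["region", *sources])]
    for region, row in matrix.items():
        lines.append("\t".join([region, *(str(row[s]) for s in sources)]))
    return "\n".join(lines)


if __name__ == "__main__":
    regions = ["hippocampus", "cerebellum"]
    sources = ["SourceA", "SourceB"]  # hypothetical source names
    matrix = build_count_matrix(regions, sources, fake_count_service)
    print(to_tsv(matrix, sources))
```

Passing the service in as a function keeps the looping logic separate from the way counts are obtained, so the stub could be swapped for a real client without touching the table-building code.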

[Figure: Orchestration of web services in the NIF project: using the Kepler workflow engine for data fusion]

The figure above shows a workflow in which individual web services are called, functions transform the results, and the transformed results are fed into other web services. Each component is configurable, and the lines between components connect them and transfer data obtained from one type of service to another. These steps are represented graphically, which makes conceptualizing a workflow relatively simple.
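The idea of configurable components joined by data-carrying connections can be sketched in a few lines of Python. This is only an analogy under stated assumptions: the `Actor` class and `run_pipeline` function are hypothetical and are not Kepler's Actor API, and the "service" step is a stub that returns a fixed, made-up response.

```python
from typing import Any, Callable, List


class Actor:
    """A named processing step (hypothetical; not Kepler's Actor class)."""

    def __init__(self, name: str, fire: Callable[[Any], Any]) -> None:
        self.name = name
        self.fire = fire


def run_pipeline(actors: List[Actor], token: Any) -> Any:
    """Pass a token through the actors in order, like data flowing along links."""
    for actor in actors:
        token = actor.fire(token)
    return token


# Stubbed service call: returns a made-up response instead of contacting a server.
query = Actor("query", lambda term: {"term": term, "results": ["r1", "r2", "r3"]})
# Transformation step: reduce the response to (term, number of results).
count = Actor("count", lambda resp: (resp["term"], len(resp["results"])))
# Output step: format the pair as one tab-separated table row.
fmt = Actor("format", lambda pair: f"{pair[0]}\t{pair[1]}")

if __name__ == "__main__":
    print(run_pipeline([query, count, fmt], "cerebellum"))
```

Each step sees only the output of the previous one, which is what allows a component to be reconfigured or replaced without changing the rest of the chain.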

Preferred presentation format: Poster

Topic: General neuroinformatics
