Orchestration of web services in the NIF project: using the Kepler workflow engine for data fusion
Vadim Astakhov (UCSD), Anita Bandrowski (UCSD), Amarnath Gupta (UCSD), Jeffery Grethe (UCSD), Maryann Martone (UCSD)
The Neuroscience Information Framework (NIF) provides a graphical user interface (GUI) for locating and accessing ontologically aligned, semantically fused information from heterogeneous federated sources. NIF has also decomposed the functions that serve the user interface and exposed them as web services that can be used like “Lego blocks” to query the data or to build entirely new interfaces and tools. Currently, we use Kepler to orchestrate communication among the NIF services and to provide a transparent layer for data fusion. Kepler combines data and processes into a configurable, structured set of steps that implement semi-automated workflows. It provides a development environment with a graphical user interface for designing workflows composed of linked components called Actors, which can be executed under different Models of Computation. In this work, we report on specific workflows that perform data fusion and orchestrate diverse web services. The “Brain data flow” (see figures below) outputs categorized counts of information about brain regions from 150 data sources. Obtaining a similar data set through the NIF GUI requires manually recording the result count for each database and each query. Kepler, unencumbered by the current configuration of the user interface, can be asked to pull a different set of data from the result set, in this case the number of results, and place it into a table. This table can then easily be turned into a graphic that helps users see which databases are information-rich for a particular query. In this example, Kepler loops over all brain parts and all databases, retrieving the same information for each pair and producing a massive matrix (http://tinyurl.com/6nkfe9f).
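The looping step described above can be illustrated with a minimal Python sketch. It is not the Kepler workflow itself, and it does not call any real NIF service: the function names (`build_count_matrix`, `to_tsv`, `stub_count`), the source names, and the counts are all hypothetical placeholders. The sketch only shows the shape of the computation, in which one result count is requested per (brain region, data source) pair and the counts are collected into a table.

```python
from typing import Callable, Dict, List

# Type of the matrix: region name -> source name -> result count.
CountMatrix = Dict[str, Dict[str, int]]


def build_count_matrix(
    regions: List[str],
    sources: List[str],
    count_results: Callable[[str, str], int],
) -> CountMatrix:
    """Ask for one result count per (region, source) pair.

    `count_results` stands in for a call to a query service; it is an
    assumption of this sketch, not an actual NIF or Kepler API.
    """
    return {
        region: {source: count_results(region, source) for source in sources}
        for region in regions
    }


def to_tsv(matrix: CountMatrix, sources: List[str]) -> str:
    """Render the matrix as tab-separated text, one row per region."""
    lines = ["\t".join(["region"] + sources)]
    for region, counts in matrix.items():
        row = [region] + [str(counts[source]) for source in sources]
        lines.append("\t".join(row))
    return "\n".join(lines)


def stub_count(region: str, source: str) -> int:
    """Hypothetical stand-in for a web service call; returns made-up counts."""
    fake_counts = {
        ("hippocampus", "SourceA"): 12,
        ("cerebellum", "SourceA"): 3,
        ("cerebellum", "SourceB"): 7,
    }
    return fake_counts.get((region, source), 0)


if __name__ == "__main__":
    regions = ["hippocampus", "cerebellum"]
    sources = ["SourceA", "SourceB"]
    matrix = build_count_matrix(regions, sources, stub_count)
    print(to_tsv(matrix, sources))
```

Passing the counting function in as a parameter keeps the loop independent of how counts are obtained, which loosely mirrors how a workflow engine separates orchestration from the services it calls.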