sing group http://sing.ei.uvigo.es  
AIBench overview
Written by Daniel Glez-Peña   

Motivation

AIBench was created within the SING research group, which works in the AI/data-mining field and has developed several applications over the last few years to solve real, practical problems. However, these applications were isolated from one another, with marginal code reuse and hard integration work.

In the AI field, and especially in data-mining applications, we always perform the following types of operations:

  • Data loading, from some format/origin (text files, XML, databases, etc.).
  • Data-type design and definition in the programming language to support the in-memory representation of the problem-related data.
  • Data analysis (neural networks, fuzzy systems, genetic algorithms, etc.).
  • Data saving, to some format/destination (text files, XML, databases, etc.).

Whenever our group targeted a real problem, a new application was started (often from scratch), taking chunks of code from past applications where possible and writing new ones, to finally obtain yet another hard-coded, problem-centric application.

If you are thinking that AIBench is a(nother) data-mining toolkit, such as Weka or the recent Java Data Mining API (JSR-73), you are wrong. Let us explain.

The basic difference between our hard-coded applications was the way these chunks of code were combined. For example, sometimes the same neural network was applied to two different problems where the data origin was the only difference, so we only needed to replace the load operation with a new one. With this idea in mind, we found it essential that the combination itself be isolated from the operations. This is where AIBench comes in, as a new framework for developing these kinds of applications.

To achieve this isolation it was necessary to define an operation model, that is, how these chunks of code (and the new ones) should be programmed in order to achieve transparent combination and code reuse. If the operations are well defined (with a well-defined INPUT/OUTPUT), we can achieve that isolation, plus further benefits that are explained in the next sections.

Objectives

The main objectives of AIBench are the following:

  1. Define an operation model, that is, how the operations should be developed. [DONE]
  2. Reduce the GUI-related code, because it is hard to write and is not centred on the problem. [DONE]
  3. Provide a mechanism to plug/unplug operations without recompiling the entire framework. [DONE]
  4. Allow a combination of operations to be saved together with their parameters (similar to an experiment design), so that it can be applied to other data one or more times in batch mode. [IN PROGRESS]

Basic concepts

To show how these objectives were addressed, here is a brief description of some basic concepts.

Operation Model

An AIBench operation is a simple Java class with some annotations that clearly define its INPUT/OUTPUT. For example:

[Figure: abstract view of an operation]

Source code of an operation with 3 ports:

@Operation(description="this operation adds two numbers")
public class Sum {
    private int x, y;

    @Port(direction=Direction.INPUT, name="x param")
    public void setX(int x) {
        this.x = x;
    }

    @Port(direction=Direction.INPUT, name="y param")
    public void setY(int y) {
        this.y = y;
    }

    @Port(direction=Direction.OUTPUT)
    public int sum() {
        return this.x + this.y;
    }
}

With these annotations (plus a plugin descriptor), AIBench knows everything it needs to:

  • Deploy these operations in menus.
  • Generate input dialogs to invoke them.
  • Save the results to give the possibility to forward them to other operations.
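To give a feel for how annotations alone can carry this information, here is a minimal sketch based on standard Java reflection. It is not AIBench's actual implementation: the `Operation`, `Port` and `Direction` declarations are local stand-ins so that the example compiles on its own (runtime retention is an assumption), and `PortInspector` is a hypothetical helper.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Local stand-ins for the AIBench annotations, declared here only so the
// sketch compiles on its own. Runtime retention is an assumption.
enum Direction { INPUT, OUTPUT }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Operation { String description() default ""; }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Port {
    Direction direction();
    String name() default "";
}

@Operation(description = "this operation adds two numbers")
class Sum {
    private int x, y;

    @Port(direction = Direction.INPUT, name = "x param")
    public void setX(int x) { this.x = x; }

    @Port(direction = Direction.INPUT, name = "y param")
    public void setY(int y) { this.y = y; }

    @Port(direction = Direction.OUTPUT)
    public int sum() { return this.x + this.y; }
}

// Hypothetical helper, not part of AIBench.
class PortInspector {
    /** Collects the @Port methods, sorted by method name for a stable order. */
    static List<Method> ports(Class<?> operationClass) {
        List<Method> ports = new ArrayList<>();
        for (Method m : operationClass.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Port.class)) {
                ports.add(m);
            }
        }
        // getDeclaredMethods() guarantees no particular order.
        ports.sort(Comparator.comparing(Method::getName));
        return ports;
    }

    /** Describes each port as "DIRECTION label : type". */
    static List<String> describe(Class<?> operationClass) {
        List<String> lines = new ArrayList<>();
        for (Method m : ports(operationClass)) {
            Port port = m.getAnnotation(Port.class);
            boolean input = port.direction() == Direction.INPUT;
            // An input arrives as the single parameter, an output as the return value.
            Class<?> type = input ? m.getParameterTypes()[0] : m.getReturnType();
            String label = port.name().isEmpty() ? m.getName() : port.name();
            lines.add(port.direction() + " " + label + " : " + type.getSimpleName());
        }
        return lines;
    }

    /** Feeds the values to the INPUT ports, then returns the OUTPUT port's value. */
    static Object invoke(Class<?> operationClass, Object... inputs) {
        try {
            Object instance = operationClass.getDeclaredConstructor().newInstance();
            int next = 0;
            for (Method m : ports(operationClass)) {
                if (m.getAnnotation(Port.class).direction() == Direction.INPUT) {
                    m.invoke(instance, inputs[next++]);
                }
            }
            Object result = null;
            for (Method m : ports(operationClass)) {
                if (m.getAnnotation(Port.class).direction() == Direction.OUTPUT) {
                    result = m.invoke(instance);
                }
            }
            return result;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not invoke " + operationClass, e);
        }
    }
}
```

With this helper, `PortInspector.describe(Sum.class)` lists two `int` inputs and one `int` output, which is the kind of information a framework needs to build a menu entry and an input dialog, and `PortInspector.invoke(Sum.class, 2, 3)` runs the operation without any code written specifically for `Sum`.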

You can define your own data types as the INPUT/OUTPUT of your operations (in-memory representation of data, trained neural networks, results, etc.). Of course, your data types don't need to inherit from or implement anything. These data types are specific to your domain, and AIBench only keeps track of them so that you can forward them from the output of one operation to the input of another. This is achieved by the clipboard mechanism.

You can get more information on how to develop operations in this documentation chapter.
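The idea of forwarding results by type can be illustrated with a small sketch. `SimpleClipboard`, `Dataset` and `TrainedModel` are hypothetical names used only for illustration, and the real AIBench clipboard may work differently; the point is that plain objects can be tracked and matched to an input purely by their class.

```java
import java.util.ArrayList;
import java.util.List;

// Plain domain classes: they extend nothing and implement nothing.
class Dataset {
    final int rows;
    Dataset(int rows) { this.rows = rows; }
}

class TrainedModel {
    final Dataset trainedOn;
    TrainedModel(Dataset trainedOn) { this.trainedOn = trainedOn; }
}

// Hypothetical, simplified clipboard; the real AIBench class may differ.
class SimpleClipboard {
    private final List<Object> items = new ArrayList<>();

    /** Stores the output of an operation. */
    void put(Object result) {
        items.add(result);
    }

    /** Lists the stored items that could be forwarded to an input of the given type. */
    <T> List<T> candidatesFor(Class<T> inputType) {
        List<T> matches = new ArrayList<>();
        for (Object item : items) {
            if (inputType.isInstance(item)) {
                matches.add(inputType.cast(item));
            }
        }
        return matches;
    }
}
```

An operation whose input is a `Dataset` would then be offered only the items returned by `candidatesFor(Dataset.class)`, regardless of which operation produced them.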

Dynamic generation of input dialogs

AIBench generates input dialogs based on the operation's definition. It can infer the type of components (text fields, combo boxes) needed to gather all the INPUT parameters of a given operation. The following input dialog was generated from the operation's definition in the above example.
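As a rough illustration of this kind of inference, the sketch below maps the type of each INPUT parameter to a kind of component. The rules, the `Widget` enum and the `DialogPlanner` class are assumptions made for this example (including the check box for booleans, which the text above does not mention); they are not AIBench's actual rules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical widget kinds; names and mapping rules are assumptions.
enum Widget { TEXT_FIELD, CHECK_BOX, COMBO_BOX }

// Example enum parameter, used to show the combo-box case.
enum Kernel { LINEAR, RBF }

class DialogPlanner {
    /** Chooses a component kind for one INPUT parameter type. */
    static Widget widgetFor(Class<?> type) {
        if (type == boolean.class || type == Boolean.class) {
            return Widget.CHECK_BOX;
        }
        if (type.isPrimitive() || Number.class.isAssignableFrom(type)
                || type == String.class || type == Character.class) {
            // Values the user can simply type in.
            return Widget.TEXT_FIELD;
        }
        // Enums (pick a constant) and domain types (pick an earlier result).
        return Widget.COMBO_BOX;
    }

    /** Plans one component per INPUT parameter, in the given order. */
    static List<Widget> plan(Class<?>... inputTypes) {
        List<Widget> widgets = new ArrayList<>();
        for (Class<?> type : inputTypes) {
            widgets.add(widgetFor(type));
        }
        return widgets;
    }
}
```

Under these assumed rules, an operation with two `int` inputs, such as the Sum example, would get a dialog with two text fields.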

To learn more about the dynamic generation of input dialogs, see this documentation chapter.

Defining Views to your data-types

As explained above, operations can define complex data types as their INPUT/OUTPUT. When an operation generates one of these objects, you can provide a custom "view" to visualize it. To learn how to do this, see this documentation chapter.
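Conceptually, a view is a function from a data type to something displayable. The sketch below shows one way such a lookup could be organised; `ViewRegistry` and `ConfusionMatrix` are hypothetical names, and views are rendered as plain strings here to keep the example short, whereas a real view would be a GUI component.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Example domain type with no framework dependencies.
class ConfusionMatrix {
    final int hits;
    final int misses;
    ConfusionMatrix(int hits, int misses) {
        this.hits = hits;
        this.misses = misses;
    }
}

// Hypothetical registry that maps data types to custom views.
class ViewRegistry {
    private final Map<Class<?>, Function<Object, String>> views = new LinkedHashMap<>();

    /** Registers a custom view for every instance of the given type. */
    <T> void register(Class<T> type, Function<? super T, String> view) {
        views.put(type, value -> view.apply(type.cast(value)));
    }

    /** Renders with the first matching custom view, or falls back to toString(). */
    String render(Object value) {
        for (Map.Entry<Class<?>, Function<Object, String>> entry : views.entrySet()) {
            if (entry.getKey().isInstance(value)) {
                return entry.getValue().apply(value);
            }
        }
        return String.valueOf(value);
    }
}
```

Registering `m -> "accuracy: " + m.hits + "/" + (m.hits + m.misses)` for `ConfusionMatrix` would make every result of that type display with the custom view, while other results keep the default rendering.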

Plugin architecture of AIBench

AIBench is powered by a plugin engine (to learn more about this, see this documentation chapter). The basic plugins that run inside AIBench are:

  • CORE [maintained by the core programmer]: Manages the operations, invoking them and keeping track of their results and invocation history.
  • WORKBENCH [maintained by the core programmer]: Implements a Swing-based GUI and is responsible for the dynamic generation of input dialogs.
  • YOUR PLUGINS [developed by the operations programmer]: Here you put your operations, the classes of your data types and your custom views.

As you can see, there are two roles in AIBench development: the core programmers and the plugin/operation programmers (or AIBench users). This documentation is aimed at the latter.

Example

Here is a snapshot with two operations (Load CSV and Classifier Train) invoked and integrated through the clipboard:

And here is a snapshot of AIBench.

Current and future work

The current work is focused on:

  • Saving a combination of operations with their parameters (similar to an experiment design) and launching it later over other data, one or more times, in batch mode.
  • Experiment Designer. With it we could connect all the operations needed by an experiment before executing anything and, of course, save the design.
  • Virtual operations. These would make it possible to combine two or more operations by connecting some of their INPUTS/OUTPUTS, creating a new virtual operation (similar to combining components in digital electronics). A good place to do this could be the "Experiment Designer".
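Since these features are still planned, the following is only a sketch of the underlying idea, not of any AIBench API: operations with a single input and output are modelled as `java.util.function.Function`, connecting an output to the next input yields a new "virtual" operation, and the saved combination is replayed over several inputs in batch. The `Experiment` class is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical model of a saved combination of operations.
class Experiment<I, O> {
    private final Function<I, O> pipeline;

    private Experiment(Function<I, O> pipeline) {
        this.pipeline = pipeline;
    }

    /** Starts a design from a single operation. */
    static <I, O> Experiment<I, O> of(Function<I, O> operation) {
        return new Experiment<>(operation);
    }

    /** Connects this design's OUTPUT to the INPUT of another operation. */
    <R> Experiment<I, R> then(Function<? super O, ? extends R> next) {
        Function<I, R> composed = input -> next.apply(pipeline.apply(input));
        return new Experiment<>(composed);
    }

    /** Runs the whole design once, like a single virtual operation. */
    O run(I input) {
        return pipeline.apply(input);
    }

    /** Batch mode: applies the same design to every input in turn. */
    List<O> runBatch(List<I> inputs) {
        List<O> results = new ArrayList<>();
        for (I input : inputs) {
            results.add(run(input));
        }
        return results;
    }
}
```

For example, `Experiment.of((String s) -> s.trim()).then(String::length)` behaves as one operation from `String` to `Integer`, and `runBatch` applies it to a list of inputs without repeating the design.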