Proposal for Graph INterface librarY

Much biological attention has been given to a “network” analysis view. This is commonly implemented in terms of the Graphing paradigm where there are Nodes connected by Edges. There are numerous implementations of a graph that all have merit, and are appropriate at different times. For the remainder of this document all interface names and methods will be in italic.

Using the Interface Library

When designing GINY there were many considerations to include in the design. Obviously it needed to be able to support all graph algorithms, and all of the GUI interaction that we became accustomed to. Other important considerations was a clear separation between the Model and View parts of the interface. This allows for graphical and non-graphical applications to use the same interfaces for algorithm development, but does not require that non-graphical applications be burdened with the overhead associated with graphics.

Overview of the Interfaces

Giny.model :

There are four interfaces that make up the giny.model package, GraphRoot, GraphPerspective, Node, and Edge.Of these GraphPerspective will be the most used. GraphPerspective is the class that algorithms should be written to. GraphRoot should be thought of as the entire set of Nodes and Edges. A GraphPerspective is then just a usable set of those Nodes and Edges. A GraphPerspective can be of any size, can include all or none of the objects in the GraphRoot and can contain the same Nodes and Edges as another GraphPerspective. In most implementations, the relationship between a GraphRoot and the many GraphPerspectives that eminate from it will be that the GraphPerspectives hold pointers to the actual objects in the GraphRoot.

Nodes and Edges, because they are in multiple GraphPerspectivesare therefore ignorant of many common properties associated with them. This includes things like degree and neighbors as well as what GraphPerspectives they are in. These methods are available in the GraphPerspective interface, to reflect the independence of each GraphPerspective from each other.

Use Case 1:

A user wants to load in the entire yeast interactome, but only look at the GAL genes. Then they want to expand to the first neighbors. This is done by loading the entire interactome into the RootGraph, and then getting a GraphPerspective that only includes the GAL genes. When the user requests the first-neighbors, an algorithm gets the RootGraph from the GraphPerspective, and calls List nodeList = rootGraph.getNeighbors( node ) for each node in the current Perspective, and then puts those nodes in the GraphPerspective via perspective.restore( node ); .

Use Case 2:

A user wants to compare three different sets of Nodes using a variety of algorithms. This is done by taking three different GraphPerspectives from the RootGraph, and then running the algorithm of interest on each of those GraphPerspectives via: algorithm.run( perspective ); .

Giny.view :

There are three interfaces in the giny.view package , GraphView, NodeView andEdgeView.The view package is dependent on giny.model. Each of the interfaces are designed to provide a viewable element for each model interface. NodeView and EdgeView are views on Node and Edge respectively. GraphView provides a view onto a GraphPerspective. GraphRoot is not viewable. Each implementation will be responsible for handling input events ( mouse and keyboard ) , and then firing events that are specified by GINY. As is obvious from the diagram, each GraphView contains Node/EdgeViews that are independent on Node/EdgeViews in other GraphViews.

For debate, there are a number of issues that need to be discussed. These include the drawing operations such as setPaint, setBorderPaint, setLabel, setLabelPaint, setShape. Please weigh in on which operations are considered essential so that we can include them in the API. It is also desirable to support SWT, because it is much faster than Swing. It should be possible to support both toolkits.

Use Case 1:

A biologist wants to look at one set of genes colored according to expression data, and another set of genes colored according to different expression data. To do this, make two GraphPerspectives that reflect the two sets of genes. Create two GraphViews from those GraphPerspectives. In each GraphView, color accordingly.

Use Case 2:

A biologist wants to look at the same set of genes colored according to two different expression experiments. To do this, make one GraphPerspective, then create two GraphViews from that GraphPerspective. Color each GraphView accordingly.

Support for a Hierarchy of Node Relationships (aka. “MetaNodes” )

The information that we find ourselves dealing with demands a more complex data structure than has previously been available. This data takes the form of a node hierarchy. This diagram attempts to explain the uses for this functionality:

The pink frame is a sample graph of six nodes and six edges. The red frame shows a possible grouping of those nodes. There are three separate groupings shown. Group1 holds C and B, Group2 holds D and E. Group3 holds Group2 and F ( well, maybe, read on for more on this ). The purple frame shows the resulting graph if each group is assigned its own “metanode” (nodes labeled 1-3). Edges are created between each metanode and the nodes that interact with its members (shown as curvy edges). For example, metanode 1 has edges to A and E, because of B; and an edge to D because of C. The yellow frame shows how this graph could be collapsed, such that all nodes belonging to a group are removed from the graph and replaced by its metanode. In this case, metanode 2 is removed because it belongs to metanode 3.

These groupings are useful, but they do not adequetly describe the relationships between the nodes. The green and blue frames are a hierarchechal view of these relationships. In the green frame, it is obvious that metanode 3 is a parent of metanode 2 and node F, yet in the blue frame it is obvious that metanode 3 is a parent of metanode 2, node F, node D, and node E. This makes clear the confusion of the red frame. This type of hierarchy is supported by GINY, and is available via Hparent() and Hchild() methods.

In GINY, there is no “metanode”, since all nodes are peers. The functionality of a “metanode” is available to any Node, by using the addHParent() and addHChild() classes in RootGraph. Obviously, these relationships are available in the GraphPerspectives that contain any nodes that are part of the hierarchy. It is important to note that if a Node is parent of another Node via rootGraph.isHParent( parent, child ), that it does not gaurentee that the relationship is true for a GraphPerspectivewhere only one of those nodes exists.

Use Case 1:

The Biomodule idea first pioneered by the Galitski lab. (If you are not familiar with the Biomodule plugin developed by Iliana Avila-Campillo, then become so.) To implement the current functionality of the Biomodule Plugin in GINY, it will no longer require separate data structures. After adding the interactome to RootGraph, and running the appropriate algorithms on a GraphPerspective, new nodes will be created for each Biomodule, and rootGraph.addHChild( biomodule, node ); will be called for each node that is a member of a Biomodule.

Use Case 2:

A user wishes to reduce the complexity of a graph by collapsing several highly connected Nodes into one. To do this, a new Node is created and added to the current GraphPerspective. The highly connected Nodes can then be iteratred through to make sure each is a Hchild of the new Node. Those Nodes can be removed from the GraphPerspective and all algorithms run on the GraphPerspective will now only see the new Node.

Use Case 3:

A user wishes to have a very modular view of the cell. This means that he only wants to see interactions between Cellular Components. But he also want to be able to iteregate each of those components for what their internal interactions are. To do this, after loading the entire interactome into RootGraph, Nodes are created that will represent each of the Cellular Components of interest. The appropriate Nodes are added to the component Nodes via addHChild(). A GraphPerspective is created that only contains the component Nodes. As the user becomes interested in a component, he can get a GraphPerspective that contains all of the Hchildren of the component. Note that nothing prevents a Node in this example from belonging to multiple components.

Non-Interface Issues:

Class Specific Iterators

Because of the inclusion of generics in Java1.5 ( coming soon ), we do not feel that it is necessary to create convience classes such as "NodeIterator" and "EdgeIterator". However as soon as Java1.5 is released it is suggested that we use the generic support offered, which will eliminate the overhead of constant casting.

Common Graph Routines

There are a number of graph routines that are used commonly, these include BFS, DFS, APSP and other acronymed algorithms. Support for these will not be in the interface library, but they will be supported via implementations that are shipped with GINY.