Vol. 24 no. 2 2008, pages 282–284

BIOINFORMATICS

APPLICATIONS NOTE

doi:10.1093/bioinformatics/btm554

Systems biology

Computing topological parameters of biological networks

Yassen Assenov, Fidel Ramírez, Sven-Eric Schelhorn, Thomas Lengauer and Mario Albrecht*

Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Stuhlsatzenhausweg 85, 66123 Saarbrücken, Germany

Received on June 16, 2007; revised on October 8, 2007; accepted on November 1, 2007

Advance Access publication November 15, 2007

Associate Editor: Martin Bishop

ABSTRACT

Summary: Rapidly increasing amounts of molecular interaction data are being produced by various experimental techniques and computational prediction methods. In order to gain insight into the organization and structure of the resultant large complex networks formed by the interacting molecules, we have developed the versatile Cytoscape plugin NetworkAnalyzer. It computes and displays a comprehensive set of topological parameters, which includes the number of nodes, edges, and connected components, the network diameter, radius, density, centralization, heterogeneity, and clustering coefficient, the characteristic path length, and the distributions of node degrees, neighborhood connectivities, average clustering coefficients, and shortest path lengths. NetworkAnalyzer can be applied to both directed and undirected networks and also contains extra functionality to construct the intersection or union of two networks. It is an interactive and highly customizable application that requires no expert knowledge in graph theory from the user.

Availability: NetworkAnalyzer can be downloaded via the Cytoscape web site: http://www.cytoscape.org

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

1 INTRODUCTION

In recent years, high-throughput experiments have produced large networks of interacting molecules, which are represented as nodes linked by edges in complex graphs (Albrecht et al., 2005; Ramírez et al., 2007; Zhu et al., 2007). In this context, the characterization of biological networks by means of graph-topological properties has become very popular for gaining insight into the global network structure (Albert, 2005; Almaas, 2007; Barabasi and Oltvai, 2004; Dong and Horvath, 2007; Zhu et al., 2007). However, general software libraries for graph analysis such as JUNG (http://jung.sourceforge.net/), LEDA (http://algorithmic-solutions.com/enleda.htm), NetworkX (https://networkx.lanl.gov/) and yFiles (http://www.yworks.com/) are not easily applied by the biological user. Other applications like Pajek (Batagelj and Mrvar, 1998) require expert knowledge in graph theory on the user side. Specialized tools for the analysis of biological networks like CentiBiN (Junker et al., 2006), tYNA/TopNet (Yip et al., 2006; Yu et al., 2004) and VisANT (Hu et al., 2005) calculate only a limited set of topological parameters.

Therefore, we have developed NetworkAnalyzer, a user-friendly Java plugin for Cytoscape, which is an established free open-source software platform for the visualization and analysis of molecular interaction networks (Shannon et al., 2003). An initial release of NetworkAnalyzer was made available in January 2006. In the following, we describe the basic functionality of NetworkAnalyzer and numerous extensions and improvements of the next major release.

2 PROGRAM OVERVIEW

NetworkAnalyzer efficiently computes a large number of topological network parameters for directed and undirected networks loaded into Cytoscape. The user can decide whether directed edges should be treated as undirected for the analysis. The computed simple and complex topology parameters are represented as single values and distributions, respectively. Simple parameters are the number of nodes, edges, self-loops, and connected components, the average number of neighbors, the network diameter, radius, density, centralization, heterogeneity, and clustering coefficient, the number of shortest paths, and the characteristic path length. Complex parameters are distributions of node degrees, neighborhood connectivities, average clustering coefficients, topological coefficients, shortest path lengths, and shared neighbors of two nodes. NetworkAnalyzer utilizes the free Java libraries JFreeChart (http://jfree.org/jfreechart/) and Batik (http://xmlgraphics.apache.org/batik/) to display the distributions as histograms or scatter plots (Fig. 1) and to export them as chart images in the formats JPG/PNG/SVG or as tables in plain text files. Details on the formal definitions of all topological parameters are given in the online help page of the plugin. To ensure the validity of the calculations performed by NetworkAnalyzer, the computed parameters were compared with those obtained from Pajek, TopNet, and the Python graph library NetworkX.
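To make a few of the simple parameters concrete, the following is a minimal Python sketch, not the plugin's Java implementation. It assumes the common textbook definitions (density as the fraction of possible edges present, the clustering coefficient as the mean local clustering coefficient, the characteristic path length as the mean shortest path length over all node pairs) and a connected, undirected network; the plugin's exact definitions are given in its online help. All function names are hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency map; self-loops and duplicate edges are ignored."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def bfs_distances(adj, source):
    """Shortest path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def local_clustering(adj, v):
    """Fraction of pairs of neighbors of v that are themselves linked."""
    nb = adj[v]
    k = len(nb)
    if k < 2:
        return 0.0
    # Each link between two neighbors is seen from both ends, hence // 2.
    links = sum(1 for a in nb for b in adj[a] if b in nb) // 2
    return 2 * links / (k * (k - 1))


def simple_parameters(edges):
    """Assumes a connected, undirected network with at least two nodes."""
    adj = build_adjacency(edges)
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    eccentricity = {}
    total_length = 0
    for v in adj:
        dist = bfs_distances(adj, v)
        eccentricity[v] = max(dist.values())
        total_length += sum(dist.values())
    return {
        "nodes": n,
        "edges": m,
        "avg_neighbors": 2 * m / n,
        "density": 2 * m / (n * (n - 1)),
        "clustering_coefficient": sum(local_clustering(adj, v) for v in adj) / n,
        "diameter": max(eccentricity.values()),
        "radius": min(eccentricity.values()),
        # Mean over all ordered pairs of distinct nodes.
        "characteristic_path_length": total_length / (n * (n - 1)),
    }


# Toy network: a triangle A-B-C with a pendant node D attached to C.
params = simple_parameters([("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")])
print(params)
```

For this toy network the sketch reports 4 nodes, 4 edges, a diameter of 2 and a radius of 1. Running one breadth-first search per node is adequate for small examples; it is not meant to reflect the efficiency of the plugin's algorithms.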

*To whom correspondence should be addressed.

© 2007 The Author(s)

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Downloaded from https://academic.oup.com/bioinformatics/article-abstract/24/2/282/227130 by guest

While the majority of the topological parameters included in NetworkAnalyzer is already well known and frequently used in the literature, our plugin additionally computes some novel network properties. In particular, we have extended the original definition of neighborhood connectivity (Maslov and Sneppen, 2002) to directed networks by introducing three types of related connectivity parameters; see Supplementary Data for more details. NetworkAnalyzer is also capable of enumerating the shared neighbors of all node pairs in a network. As an application, the Supplementary Data describes the use of the shared neighbors distribution to detect bias in the topology of predicted human networks of protein–protein interactions in comparison to experimentally derived networks.

Further unique features of NetworkAnalyzer comprise various visual settings of the obtained diagrams (Fig. 1). The user has the option of switching between a histogram and a scatter plot of the computed distributions and between linear and logarithmic scales for either of the two displayed diagram axes. Gridlines can be enabled or disabled, and a power law can be fitted to resultant distributions. Additionally, the title of the chart diagram, the labels of the axes, and the colors of the scatter points and gridlines can be configured.
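The two distributions discussed above can be illustrated with a short Python sketch for the undirected case. It assumes the usual reading of neighborhood connectivity as the average degree of a node's neighbors, summarized per degree k, and counts node pairs by their number of shared neighbors. The directed variants described in the Supplementary Data are not covered, and all names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Toy undirected network: triangle A-B-C with a pendant node D attached to C.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}


def neighborhood_connectivity(adj):
    """Average degree of each node's neighbors (0.0 for isolated nodes)."""
    return {
        v: sum(len(adj[w]) for w in nb) / len(nb) if nb else 0.0
        for v, nb in adj.items()
    }


def connectivity_by_degree(adj):
    """Mean neighborhood connectivity of all nodes having k neighbors."""
    nc = neighborhood_connectivity(adj)
    buckets = defaultdict(list)
    for v, nb in adj.items():
        buckets[len(nb)].append(nc[v])
    return {k: sum(vals) / len(vals) for k, vals in sorted(buckets.items())}


def shared_neighbors_distribution(adj):
    """Number of node pairs that share exactly k neighbors, for k >= 1."""
    counts = defaultdict(int)
    for u, v in combinations(sorted(adj), 2):
        shared = len(adj[u] & adj[v])
        if shared > 0:
            counts[shared] += 1
    return dict(counts)


print(connectivity_by_degree(adj))
print(shared_neighbors_distribution(adj))
```

In this toy network the per-degree averages decrease as k grows, which is the kind of trend a scatter plot of neighborhood connectivity makes visible. Enumerating all node pairs is quadratic in the number of nodes and is meant only for small examples.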

Topology parameters computed for network nodes are stored as node attributes in the Cytoscape data structure. Thus, users can easily apply the visual mapping settings of Cytoscape to highlight any parameter on the screen (see online tutorial). For example, node size may be made proportional to the clustering coefficient, and node color may be mapped to the node degree. Another useful application of NetworkAnalyzer is the selection of nodes based on any of the calculated attributes. This enables Cytoscape users to examine, for instance, structural perturbations in a network caused by the removal of nodes with high degrees.
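The perturbation experiment mentioned above (select nodes by a computed attribute, remove them, and inspect the result) can be sketched in plain Python. This is an illustration outside Cytoscape, assuming the node attribute is the degree and the effect is measured by the number of connected components; the helper names are hypothetical and do not correspond to the Cytoscape API.

```python
def connected_components(adj):
    """Return the connected components of an undirected network as node sets."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    stack.append(w)
        components.append(component)
    return components


def select_nodes(attribute, threshold):
    """Nodes whose attribute value is at least the threshold."""
    return {v for v, value in attribute.items() if value >= threshold}


def remove_nodes(adj, to_remove):
    """Copy of the network without the given nodes and their incident edges."""
    to_remove = set(to_remove)
    return {v: nb - to_remove for v, nb in adj.items() if v not in to_remove}


# Toy network: hub H linked to A, B and C, plus one edge between A and B.
adj = {
    "H": {"A", "B", "C"},
    "A": {"H", "B"},
    "B": {"H", "A"},
    "C": {"H"},
}

degree = {v: len(nb) for v, nb in adj.items()}  # stands in for a stored node attribute
hubs = select_nodes(degree, threshold=3)
perturbed = remove_nodes(adj, hubs)

print(len(connected_components(adj)), "component(s) before removal")
print(len(connected_components(perturbed)), "component(s) after removing", sorted(hubs))
```

Here removing the single hub splits the network into two components; on real data one would compare further statistics before and after removal.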

In NetworkAnalyzer, the complete set of simple and complex parameters is referred to as network statistics. Once calculated and displayed, the network statistics can be saved into and reloaded from a text file in order to avoid recomputation. The comparison of multiple network topologies can easily be achieved by the parallel inspection of the computed statistics for different networks. Optional user settings can be stored and reloaded. Users can customize the appearance of the results by choosing between two alternative dialog interfaces, the compact one shown in Fig. 1 and an expandable interface. Aside from parameter computations, NetworkAnalyzer offers a useful set of network modifications and supports the construction of the intersection, union, and difference of two networks, the extraction of connected components as new separate networks, and the removal of self-loops.
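The network modifications listed above map naturally onto set operations. The Python sketch below treats an undirected network as a pair (node set, edge set); it is an illustration under stated assumptions rather than the plugin's behavior. In particular, the semantics chosen for the difference (edges found only in the first network, together with their endpoints and the nodes found only in the first network) are an assumption, and the function names are hypothetical.

```python
def normalize(edges):
    """Store each undirected edge as a sorted tuple so (a, b) equals (b, a)."""
    return {tuple(sorted(edge)) for edge in edges}


def union(net_a, net_b):
    (nodes_a, edges_a), (nodes_b, edges_b) = net_a, net_b
    return nodes_a | nodes_b, edges_a | edges_b


def intersection(net_a, net_b):
    (nodes_a, edges_a), (nodes_b, edges_b) = net_a, net_b
    return nodes_a & nodes_b, edges_a & edges_b


def difference(net_a, net_b):
    """Assumed semantics: edges only in A, their endpoints, and nodes only in A."""
    (nodes_a, edges_a), (nodes_b, edges_b) = net_a, net_b
    kept_edges = edges_a - edges_b
    kept_nodes = (nodes_a - nodes_b) | {v for edge in kept_edges for v in edge}
    return kept_nodes, kept_edges


def remove_self_loops(net):
    nodes, edges = net
    return nodes, {(u, v) for u, v in edges if u != v}


# Two toy networks that share the edge b-c; net_a also has a self-loop on c.
net_a = ({"a", "b", "c"}, normalize([("a", "b"), ("c", "b"), ("c", "c")]))
net_b = ({"b", "c", "d"}, normalize([("b", "c"), ("d", "c")]))

print("union:", union(net_a, net_b))
print("intersection:", intersection(net_a, net_b))
print("difference:", difference(net_a, net_b))
print("without self-loops:", remove_self_loops(net_a))
```

Normalizing edges to sorted tuples is what makes the set operations independent of the order in which the endpoints were listed; a directed variant would simply skip that step.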

3 CONCLUSIONS

NetworkAnalyzer is a versatile and user-friendly tool for the analysis of biological and other networks. This plugin is well integrated into Cytoscape and computes a comprehensive list of simple and complex topology parameters using efficient graph algorithms. It incorporates useful visualization settings to display and export the resulting distributions and adds node attributes for the results.

ACKNOWLEDGEMENTS

Part of this work has been financially supported by the German National Genome Research Network (NGFN) and the German Research Foundation (DFG), contract number KFO 129/1-1. The research has been conducted in the context of the BioSapiens Network of Excellence funded by the European Commission under grant number LSHG-CT-2003-503265.

Conflict of Interest: none declared.

REFERENCES

Albert,R. (2005) Scale-free networks in cell biology. J. Cell Sci., 118, 4947–4957.

Albrecht,M. et al. (2005) Decomposing protein networks into domain-domain interactions. Bioinformatics, 21 (Suppl. 2), ii220–ii221.

Almaas,E. (2007) Biological impacts and context of network theory. J. Exp. Biol., 210, 1548–1558.

Barabasi,A.L. and Oltvai,Z.N. (2004) Network biology: understanding the cell’s functional organization. Nat. Rev. Genet., 5, 101–113.

Batagelj,V. and Mrvar,A. (1998) Pajek – program for large network analysis. Connections, 21, 47–57.

Fig. 1. Analysis of a human protein interaction network for neurodegenerative diseases with 3607 nodes and 7093 edges (Lim et al., 2006). The shortest path length distribution (left) indicates that the network possesses the small-world property. The decreasing trend of the neighborhood connectivity (right) shows that the network is dominated by edges between lowly and highly connected nodes.

Dong,J. and Horvath,S. (2007) Understanding network concepts in modules. BMC Syst. Biol., 1, 24.

Hu,Z. et al. (2005) VisANT: data-integrating visual framework for biological networks and modules. Nucleic Acids Res., 33, W352–W357.

Junker,B.H. et al. (2006) Exploration of biological network centralities with CentiBiN. BMC Bioinformatics, 7, 219.

Lim,J. et al. (2006) A protein-protein interaction network for human inherited ataxias and disorders of Purkinje cell degeneration. Cell, 125, 801–814.

Maslov,S. and Sneppen,K. (2002) Specificity and stability in topology of protein networks. Science, 296, 910–913.

Ramírez,F. et al. (2007) Computational analysis of human protein interaction networks. Proteomics, 7, 2541–2552.

Shannon,P. et al. (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res., 13, 2498–2504.

Yip,K.Y. et al. (2006) The tYNA platform for comparative interactomics: a web tool for managing, comparing and mining multiple networks. Bioinformatics, 22, 2968–2970.

Yu,H. et al. (2004) TopNet: a tool for comparing biological sub-networks, correlating protein properties with topological statistics. Nucleic Acids Res., 32, 328–337.

Zhu,X. et al. (2007) Getting connected: analysis and principles of biological networks. Genes Dev., 21, 1010–1024.
