Supporting codes for the manuscript: Algorithms for Large, Sparse Network Alignment Problems.
This package contains datasets and programs to solve network alignment problems written in the Matlab language.
These codes are research prototypes and may not work for you. No promises.
The package is organized by directory
dataexperimentsmatlabprivate_dataWith a recent version of Matlab (2008b used for development), the best way to use the package is to open Matlab, navigate to the matlab directory, and then execute the following commands to solve the network alignment problem for Figure 1.
% load A, B, and L for figure 1
>> load('../data/example_overlap.mat');
% create S, w, and a variant of L from graph A, B, and L
>> [S,w,li,lj] = netalign_setup(A,B,L);
% use S, w, L, and alpha=0, beta=1 and the bp algorithm
>> x = netalign_bp(S,w,0,1,li,lj);
% x is just a heuristic, so we need to round it
>> [ma mb mi overlap weight] = mwmround(x,S,w,li,lj);
% [ma mb] give pairs of matched vertices
% mi is the binary matching indicator for L (as li, lj)
% overlap and weight are the overlap and weight objectives
If you want to get more information about how to use the tools, see the matlab/demo.m script.
All of the algorithms and codes are contained in the matlab directory.
| Algorithm | Code | Source |
|---|---|---|
| IsoRank | full_isorank.m | Singh et al. 2007 |
| SpaIsoRank | isorank.m | |
| NetAlginBP | netalignbp.m | |
| Exact enumeration | netalign_exact.m | |
| NetAlignMR | netalignmr.m | Klau 2009 |
| LP | netalign_lp_prob.m | Klau 2009 |
| Lagrangean LP | netalign_llp.m | Klau 2009 |
| Dataset | File | Source |
|---|---|---|
lcsh2wiki-small | private_data/lcsh2wiki-small.mat | |
lcsh2wiki-full | private_data/lcsh2wiki_full.mat | |
dmela-scere | [ available with IsoRank, see below ] | Singh et al. 2008 |
musm-homo | [ available with Natalie, see below ] | Klau 2009 |
lcsh2wikiThese data sets only come with S, w, li, and lj. The original datasets are considerably larger (hundreds of megabytes). Please contact us if you want them. None of the experiments require them.
dmela-scereThese datasets are distributed with IsoRank in the file http://groups.csail.mit.edu/cb/mna/packages/multiway_kpartite.tgz .
Use the python program experiments/bioinfo/convert_isorank_data.py to generate dmela-scere.smat dmela.smat and scere.smat which are L, A, and B, respectively. The program readSMAT.m will load these files into Matlab.
See http://groups.csail.mit.edu/cb/mna/ http://groups.csail.mit.edu/cb/mna/ for more about IsoRank.
After converting the data to smat with the above script, execute
A = readSMAT('dmela.smat');
B = readSMAT('scere.smat');
L = readSMAT('dmela-scere.smat');
[S,w,li,lj] = netalign_setup(A,B,L);
save '../../data/dmela-scere.mat' A B L S w li lj
to save the data for use with the experiments.
musm-homoThese datasets are distributed with Natalie in the file https://www.mi.fu-berlin.de/wiki/pub/LiSA/Natalie/natalie-0.9.tgz .
Use the perl script experiments/bioinfo/parse_natalie.pl in the to extract the files and then load_natalie.m to read the data in Matlab.
In the experiments/bioinfo directory, execute
load_natalie
[S,w,li,lj] = netalign_setup(A,B,L);
save '../../data/natalie_graphs.mat' A B L S w li lj
to save the data for use with the experiments.
gaimc package : gaimc, but it is distributed in the experiments/misc/gaimc directoryg++ RandomPowerLaw.cpp ought to work)| Experiment | Description | Figure |
|---|---|---|
misc/figure_2.m | Generate figure 2 | Figure 2 |
bioinfo/bioinfo.m | Playing with musm-homo and dmela-scere datasets | |
evaluation/evaluate_codes.m | Generate results for figures 5 and 6, plot with evaluation_plots.m | Figures 5, 6 |
grid_test/phase_trans.m | Synthetic grid experiments, plot with plot_phase_trans.m | Figure 4 |
klau/large_lcsh.m | Test of large lcsh data with netalignmr | |
klau/load_data.m | Tests on musm-homo | |
linprog/clp_lp.m | Solve network alignment with an LP | |
overlap_stats/overlap_stats.m | Old version of evaluation experiments | |
powerlaw/phase_trans.m | Synthetic power law experiments, plot with plot_phase_trans.m | Figure 4 |
repisorank/test_repisorank.m | Try repeating IsoRank for better solutions | |
rounding/round_test.m | Variations on rounding schemes | |
timing/timing_data.m | Run codes to generate timing information |