Department of Mathematics, University of Houston, 651 PGH Building, Houston TX, USA

Department of Biochemistry and Cell Biology, Rice University, W100 George R. Brown Hall, Houston TX, USA

Department of Computer Science, Virginia Tech, Blacksburg VA, USA

TNG Technology Consulting GmbH, Unterföhring, Germany

Center for Quantitative Medicine, University of Connecticut Health Center and Jackson Laboratory for Genomic Medicine, Farmington CT, USA

Abstract

Background

A key problem in the analysis of mathematical models of molecular networks is the determination of their steady states. The present paper addresses this problem for Boolean network models, an increasingly popular modeling paradigm for networks lacking detailed kinetic information. For small models, the problem can be solved by exhaustive enumeration of all state transitions. But for larger models this is not feasible, since the size of the phase space grows exponentially with the dimension of the network. The dimension of published models is growing to over 100, so that efficient methods for steady state determination are essential. Several methods have been proposed for large networks, some of them heuristic. While these methods represent a substantial improvement in scalability over exhaustive enumeration, the problem for large networks is still unsolved in general.

Results

This paper presents an algorithm that consists of two main parts. The first is a graph theoretic reduction of the wiring diagram of the network, while preserving all information about steady states. The second part formulates the determination of all steady states of a Boolean network as a problem of finding all solutions to a system of polynomial equations over the finite number system with two elements. This problem can be solved with existing computer algebra software. This algorithm compares favorably with several existing algorithms for steady state determination. One advantage is that it is not heuristic or reliant on sampling, but rather determines algorithmically and exactly all steady states of a Boolean network. The code for the algorithm, as well as the test suite of benchmark networks, is available upon request from the corresponding author.

Conclusions

The algorithm presented in this paper reliably determines all steady states of sparse Boolean networks with up to 1000 nodes. The algorithm is effective at analyzing virtually all published models, even those of moderate connectivity. The problem for large Boolean networks with high average connectivity remains open.

Background

Boolean network (BN) models are widely used in molecular and systems biology to capture coarse-grained dynamics of a variety of regulatory networks, with a particular focus on features such as steady state behavior. For small models, steady states can be found by exhaustive enumeration of the state space. Since a network with $n$ nodes has $2^n$ states, this approach becomes infeasible for larger models, those with more than approximately 30 variables, depending on the computational resources available. Also, for larger models, finding steady states (fixed points in this manuscript) through sampling is no longer effective either, since even large attractors can be missed entirely by this approach. On the theoretical side, it has been shown that the problem of finding, or even counting, steady states of Boolean networks is NP-hard.

Several methods have been proposed in the literature for dealing with this problem, including exact as well as heuristic methods. We provide a brief review of the different types here. For this purpose, we represent a Boolean network as follows. Let $x_1,\dots,x_n$ be the nodes of the network, each taking values in $\{0,1\}$. Each node $x_i$ has associated to it a Boolean function $f_i:\{0,1\}^n\to\{0,1\}$. The Boolean network is then the function $f=(f_1,\dots,f_n):\{0,1\}^n\to\{0,1\}^n$.

One can represent the variable dependencies through the wiring diagram, a directed graph with nodes $x_1,\dots,x_n$. There is an edge $x_i\to x_j$ if $x_i$ appears in the function $f_j$, that is, the state of $x_j$ depends on the state of $x_i$. The problem of finding steady states is then formulated as finding all states $x\in\{0,1\}^n$ such that $f(x)=x$, that is, $f_i(x)=x_i$ for all $i$.
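The steady state condition can be made concrete with a minimal Python sketch of exhaustive enumeration. The three-node network below is a hypothetical example chosen for illustration, not one of the benchmark networks, and the helper name `steady_states` is an assumption. The loop visits all $2^n$ states, which is exactly why this approach stops being practical beyond a few dozen nodes.

```python
from itertools import product

def steady_states(functions):
    """Return all states x in {0,1}^n with f_i(x) == x_i for every i."""
    n = len(functions)
    found = []
    for state in product((0, 1), repeat=n):  # visits all 2^n states
        if all(f(state) == state[i] for i, f in enumerate(functions)):
            found.append(state)
    return found

# Hypothetical 3-node network (illustration only):
#   f1 = x2,  f2 = x1 AND x3,  f3 = NOT x1
example = [
    lambda x: x[1],
    lambda x: x[0] & x[2],
    lambda x: 1 - x[0],
]

print(steady_states(example))  # [(0, 0, 1)]
```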

One approach to the problem is model reduction. Some existing reduction methods eliminate variables by substitution: if, for example, $f_i=f_i(x_j,x_k,x_l)$, then we can remove variable $x_j$ from the network by replacing $f_i(x_j,x_k,x_l)$ with the new function $f_i(f_j(x_1,\dots,x_n),x_k,x_l)$. By repeating this process, one obtains a reduced network that in practice is much smaller than the original network. The stopping criterion for reduction methods is that variables can be removed only if the steady state information is preserved. The steady states of the reduced network are in algorithmic one-to-one correspondence with the steady states of the original network. More precisely, the reduction algorithm decomposes a large system into a smaller system and a set of equations in triangular form, so that when the steady states of the reduced system are found, the steady states of the original system can be found simply by backward substitution. That is, the existence of the one-to-one correspondence is not just theoretical.
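The substitution step and the backward substitution can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the reduction algorithm of the paper: the helper names `eliminate` and `steady_states` are hypothetical, the three-node network is made up, and the node being removed is assumed not to depend on itself.

```python
from itertools import product

def eliminate(network, j):
    """Remove node j (assumed not to depend on itself) by substituting f_j for x_j.

    Returns the reduced network and f_j, which is kept for backward substitution.
    """
    f_j = network[j]

    def substituted(f_i):
        # Evaluate f_i with x_j replaced by f_j(state).
        return lambda state: f_i({**state, j: f_j(state)})

    reduced = {i: substituted(f) for i, f in network.items() if i != j}
    return reduced, f_j

def steady_states(network):
    """Brute-force steady states of a small network given as {node: function}."""
    nodes = sorted(network)
    result = []
    for values in product((0, 1), repeat=len(nodes)):
        state = dict(zip(nodes, values))
        if all(network[i](state) == state[i] for i in nodes):
            result.append(state)
    return result

# Hypothetical network: f1 = x2, f2 = x1 AND x3, f3 = NOT x1
network = {
    1: lambda s: s[2],
    2: lambda s: s[1] & s[3],
    3: lambda s: 1 - s[1],
}

reduced, f3 = eliminate(network, 3)
# Backward substitution: recover x3 from each steady state of the reduced network.
lifted = [{**s, 3: f3(s)} for s in steady_states(reduced)]
print(lifted)
```

In this example the reduced network has two nodes, and lifting its steady states through the stored function for node 3 gives the same set as a direct search on the full network.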

Another method uses the fact that one can represent a Boolean function as a polynomial function in the variables $x_1,\dots,x_n$, with coefficients in the finite number system $\mathbb{F}_2=\{0,1\}$. The steady states are then the solutions of the polynomial system $\{p_i:=f_i(x_1,\dots,x_n)-x_i=0;\ i=1,\dots,n\}$. Using tools from computational algebra, it is possible to find another set of polynomials that has the same roots (a Gröbner basis) and on which one can carry out a generalized version of Gaussian elimination. These computations can be done using several different software packages developed for this purpose.
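As a small worked illustration of the polynomial formulation, consider a hypothetical two-node network with $f_1=x_2$ and $f_2=x_1\wedge x_2$ (this network is an assumption made for the example, not taken from the benchmarks). Over the field with two elements, subtraction coincides with addition and $x^2=x$ for every element, so the system can be brought into triangular form by hand:

```latex
\begin{align*}
  p_1 &= f_1 - x_1 = x_2 + x_1 = 0, \\
  p_2 &= f_2 - x_2 = x_1 x_2 + x_2 = 0.
\end{align*}
% Substituting x_1 = x_2 (from p_1) into p_2 gives a triangular system:
\begin{align*}
  x_1 + x_2 &= 0, \\
  x_2^2 + x_2 &= 0 .
\end{align*}
% The second equation holds for x_2 = 0 and x_2 = 1, and the first then fixes x_1:
\[
  (x_1, x_2) \in \{(0,0),\ (1,1)\}.
\]
```

Both solutions can be checked directly against the Boolean rules: $(0,0)$ and $(1,1)$ are mapped to themselves. For larger systems, the same elimination is what a Gröbner basis computation with a lexicographic term order is meant to automate.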

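A graph-theoretic method reduces the problem of checking $f_i(x)=x_i$, for all $i$, over all $2^n$ states to the problem of checking $2^{|S|}$ states, where $|S|$ is the size of a feedback vertex set $S$ of the wiring diagram, that is, a set of nodes whose removal leaves a graph with no directed cycles.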

SAT methods, which determine whether a Boolean expression in several variables has a variable assignment that makes the expression true, have also been used to find steady states of Boolean networks. In this approach, the system of equations $f_i=x_i$, $i=1,\dots,n$, is rewritten as a single equation whose satisfying assignments are exactly the steady states. For networks in which the $f_i$ contain only the AND and OR operators, an algorithm with a time complexity that is exponential in the number of nodes $n$ has been described.

Integer programming-based methods have also been used to find the steady states of Boolean networks, for example by Tamura, Hayashida, and Akutsu.

A further method, for which a complexity bound that is exponential in $n$ has been reported, proceeds one equation at a time. First, one finds the solutions of $f_1=x_1$. Since the $f_i$'s depend on few variables in practice, one only has to keep track of the variables that appear in $f_1$. Then, one finds the solutions of $f_2=x_2$ that are compatible with the solutions previously found. The process continues until one finds solutions of all equations. In the worst case, however, the complexity of the algorithm can be exponential in $n$.
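The equation-by-equation strategy described above can be sketched as follows. This is a minimal illustration under assumptions, not the published algorithm: the function name `solve_sequentially`, the `(i, support, f)` encoding of an equation, and the three-node example are all hypothetical. Partial solutions only assign the variables seen so far, which is what keeps the intermediate lists small when each function depends on few variables.

```python
from itertools import product

def solve_sequentially(equations):
    """Solve x_i = f_i(...) one equation at a time.

    equations: list of (i, support, f), where support lists the variables
    that f reads. Partial solutions are dicts {variable: value}.
    """
    partial = [{}]
    for i, support, f in equations:
        needed = sorted(set(support) | {i})
        extended = []
        for sol in partial:
            # Only variables not yet assigned need to be enumerated.
            free = [v for v in needed if v not in sol]
            for values in product((0, 1), repeat=len(free)):
                cand = {**sol, **dict(zip(free, values))}
                if f(*(cand[v] for v in support)) == cand[i]:
                    extended.append(cand)
        partial = extended
    return partial

# Hypothetical network: f1 = x2, f2 = x1 AND x3, f3 = NOT x1
equations = [
    (1, (2,), lambda b: b),
    (2, (1, 3), lambda a, c: a & c),
    (3, (1,), lambda a: 1 - a),
]

print(solve_sequentially(equations))  # [{1: 0, 2: 0, 3: 1}]
```

In the worst case the list of partial solutions can still grow exponentially, in line with the remark above.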

Finally, the problem of finding attractors has also been studied by using Binary Decision Diagrams (BDD).

In this paper, we present a new method for computing steady states of a Boolean network, combining a graph theoretic reduction/transformation method with an approach using computational algebra. We show that the method performs favorably on some types of networks in comparison with other methods on a collection of benchmark networks, consisting of both published models and random networks with certain properties, namely Kauffman networks and networks whose in-degree distribution satisfies a power law.

Methods

The method we propose for steady state analysis is a combination of network reduction/transformation and computational algebra (see the flow chart below). The Boolean network is first transformed into an AND-NOT network, that is, a network in which every function has the form $y_1\wedge y_2\wedge\dots$, where $y_i\in\{x_i,\neg x_i\}$. The AND-NOT network has the property that its steady states are in one-to-one correspondence with the steady states of the original network. Furthermore, the one-to-one correspondence between steady states is algorithmic.

**Flow chart of steady state computation.** Main steps in our method highlighting the intermediate systems.

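The correspondence between Boolean and polynomial functions is accomplished via the “dictionary” $x\wedge y=xy$, $x\vee y=x+y+xy$, $\neg x=1+x$, with arithmetic carried out modulo 2. In this way, every Boolean function $\{0,1\}^n\to\{0,1\}$ can be written as a polynomial with coefficients in the field with two elements.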
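The translation between Boolean and polynomial functions can be checked numerically. The sketch below uses the standard rules (AND as product, OR as $x+y+xy$, NOT as $1+x$, all modulo 2) and confirms on every input that the Boolean function $\neg x_2\wedge(x_3\vee x_4)$ agrees with the polynomial $(1+x_2)(x_3+x_4+x_3x_4)$. The helper names are assumptions made for this illustration.

```python
from itertools import product

# Standard Boolean-to-polynomial translation over the two-element field (arithmetic mod 2).
def p_not(x):
    return (1 + x) % 2

def p_and(x, y):
    return (x * y) % 2

def p_or(x, y):
    return (x + y + x * y) % 2

def f1_boolean(x2, x3, x4):
    # f1 = NOT x2 AND (x3 OR x4), evaluated with Python's logical operators
    return int((not x2) and (x3 or x4))

def f1_polynomial(x2, x3, x4):
    # (1 + x2) * (x3 + x4 + x3*x4), reduced modulo 2
    return ((1 + x2) * (x3 + x4 + x3 * x4)) % 2

agree = all(
    f1_boolean(*v) == f1_polynomial(*v) == p_and(p_not(v[0]), p_or(v[1], v[2]))
    for v in product((0, 1), repeat=3)
)
print(agree)  # True
```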

The algorithm is summarized in the following pseudocode and a more detailed description follows. The source code can be found at github.com/PlantSimLab/ADAM.

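The input of our algorithm is an $n$-dimensional Boolean network $f=(f_1,\dots,f_n)$. In Step 1, we transform $f$ into an AND-NOT network with $m$ nodes, introducing new variables where needed. For example, $f_1=\neg x_2\wedge(x_3\vee x_4)$ can be written as $f_1=\neg x_2\wedge\neg x_5$, where $f_5=\neg x_3\wedge\neg x_4$ is the function of a new node $x_5$. Furthermore, the steady states of the AND-NOT network are in one-to-one correspondence with the steady states of $f$. The AND-NOT network is then reduced to a network with $l$ nodes; the steady states of the reduced network are in turn in one-to-one correspondence with those of the AND-NOT network. To pass to polynomials, we replace $\neg x_i$ with $1+x_i$, and $x_i\wedge x_j$ with $x_ix_j$, as explained earlier. In Step 6 we solve the system of polynomial equations $f_i=x_i$ for the reduced network. The steady states of the AND-NOT network, and from them the steady states of the original network, are then recovered by backward substitution.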
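The AND-NOT rewriting in the example above can be checked by brute force on a small network. In the sketch below only $f_1=\neg x_2\wedge(x_3\vee x_4)$ and the new node $x_5=\neg x_3\wedge\neg x_4$ come from the text; the functions chosen for $x_2$, $x_3$, $x_4$ and the helper `steady_states` are hypothetical. The point is that dropping the extra coordinate $x_5$ from each steady state of the AND-NOT network gives back exactly the steady states of the original network.

```python
from itertools import product

def steady_states(functions):
    """Brute-force steady states; states are tuples indexed from 0."""
    n = len(functions)
    return [s for s in product((0, 1), repeat=n)
            if all(f(s) == s[i] for i, f in enumerate(functions))]

# Original network; states are (x1, x2, x3, x4). Only f1 comes from the text,
# f2..f4 are hypothetical choices made for this illustration.
original = [
    lambda s: (1 - s[1]) & (s[2] | s[3]),  # f1 = NOT x2 AND (x3 OR x4)
    lambda s: 1 - s[0],                    # f2 = NOT x1
    lambda s: s[3],                        # f3 = x4
    lambda s: s[2],                        # f4 = x3
]

# AND-NOT version with the extra variable x5 = NOT x3 AND NOT x4.
and_not = [
    lambda s: (1 - s[1]) & (1 - s[4]),     # f1 = NOT x2 AND NOT x5
    lambda s: 1 - s[0],                    # f2 unchanged
    lambda s: s[3],                        # f3 unchanged
    lambda s: s[2],                        # f4 unchanged
    lambda s: (1 - s[2]) & (1 - s[3]),     # f5 = NOT x3 AND NOT x4
]

small = steady_states(original)
large = steady_states(and_not)
projected = [s[:4] for s in large]  # forget the auxiliary variable x5
print(small)
print(projected == small)
```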

**Example and individual performance of network reduction and computational algebra.**

**Instructions for usage.**

**Source code.**

Results and discussion

We first tested the software implementation of our algorithm on 1,000,000 Boolean networks with 50 nodes each, for which we also computed all steady states by a custom-made algorithm based on minimal feedback vertex sets. For each graph we found a minimal set of vertices that had to be removed so that the graph had no directed cycles; call this set $S$. For each of the $2^{|S|}$ possible assignments of values to the variables in $S$, the values of the other variables are completely determined. This gave us $2^{|S|}$ candidates for steady states, which we then checked by exhaustive search. In all cases our algorithm correctly computed all steady states. We are therefore confident that our implementation is error-free. This extends to the relevant functionalities of other software packages we used for intermediate computations (Macaulay2
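The feedback vertex set check described in this paragraph can be sketched as follows. This is an illustrative reconstruction under assumptions, not the custom-made code used in the study: the function name, the `inputs`/`functions` encoding, and the example network are hypothetical, and the feedback vertex set is supplied by hand rather than computed. Once the values on the set are fixed, the remaining nodes form an acyclic graph and can be evaluated in topological order, leaving $2^{|S|}$ candidates to verify.

```python
from itertools import product
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def steady_states_via_fvs(inputs, functions, fvs):
    """inputs[i]: tuple of nodes that f_i reads; functions[i]: callable on those values.

    fvs: a feedback vertex set (removing it leaves no directed cycle).
    """
    fvs_list = sorted(fvs)
    rest = [i for i in inputs if i not in fvs]
    # Dependencies among the remaining nodes only; graphlib raises CycleError
    # if the supplied set does not break every cycle.
    deps = {i: [j for j in inputs[i] if j not in fvs] for i in rest}
    order = list(TopologicalSorter(deps).static_order())

    result = []
    for values in product((0, 1), repeat=len(fvs_list)):  # 2^|S| candidates
        state = dict(zip(fvs_list, values))
        for i in order:  # remaining values are forced by the steady state equations
            state[i] = functions[i](*(state[j] for j in inputs[i]))
        if all(functions[s](*(state[j] for j in inputs[s])) == state[s]
               for s in fvs_list):
            result.append(state)
    return result

# Hypothetical network: f1 = x2, f2 = x1 AND x3, f3 = NOT x1; S = {1} breaks all cycles.
inputs = {1: (2,), 2: (1, 3), 3: (1,)}
functions = {1: lambda b: b, 2: lambda a, c: a & c, 3: lambda a: 1 - a}

print(steady_states_via_fvs(inputs, functions, {1}))
```

Here only two candidates are examined instead of the eight states of the full state space.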

Then we used over 100,000 Boolean networks to benchmark our method against others. The methods we used for comparison were those with published benchmarks or those for which the code was readily available. As we will see later, for Kauffman networks with

We used random biologically meaningful Boolean networks

First, we compare the performance of different methods on Kauffman networks with connectivity

| n | Zañudo mean | Zañudo stdev. | Devloo mean | Devloo stdev. | Tamura mean | Tamura stdev. | Our method mean | Our method stdev. |
|---|---|---|---|---|---|---|---|---|
| 2000 | 7.341 | 3.192 | 107.1∗ | 83.49∗ | **0.022** | NR | 0.490 | 0.023 |
| 4000 | 12.084 | 3.636 | 223.0∗ | 173.9∗ | **0.035** | NR | 1.123 | 0.049 |
| 6000 | 31.174 | 340.213 | 338.9∗ | 264.3∗ | **0.047** | NR | 2.172 | 0.114 |
| 8000 | 28.091 | 11.572 | 454.8∗ | 354.8∗ | **0.069** | NR | 3.642 | 0.212 |
| 10000 | 38.394 | 13.301 | 570.8∗ | 445.2∗ | **0.072** | NR | 5.218 | 0.235 |

The best results are in bold. ∗ = interpolated/extrapolated from reported results. NR = not reported.

| n | Zañudo mean | Zañudo stdev. | Devloo mean | Devloo stdev. | Tamura mean | Tamura stdev. | Our method mean | Our method stdev. |
|---|---|---|---|---|---|---|---|---|
| 20 | 1.024 | 0.403 | 0.110 | 0.090 | **0.011** | NR | 0.273 | 0.040 |
| 40 | DF | DF | 0.340 | 0.270 | **0.296** | NR | 0.300 | 0.126 |
| 60 | DF | DF | 2.251∗ | 2.120∗ | 2.414 | NR | **0.415** | 0.552 |
| 80 | DF | DF | 10.05∗ | 10.84∗ | 17.07 | NR | **1.143** | 8.414 |
| 100 | DF | DF | 60.10 | 59.10 | 94.08 | NR | **2.878** | 16.74 |
| 120 | DF | DF | 200.5∗ | 283.6∗ | 714.4∗ | NR | **9.278** | 51.79 |

The best results are in bold. ∗ = interpolated/extrapolated from reported results. DF = did not finish in a day. NR = not reported.

Not all molecular networks have properties similar to Kauffman networks; some instead exhibit a power law degree distribution. Thus, we supplemented the results from Tables

| n | Zañudo mean | Zañudo stdev. | Our method mean | Our method stdev. |
|---|---|---|---|---|
| 25 | 1.264 | 1.778 | 0.254 | 0.011 |
| 50 | 2.488 | 3.807 | 0.257 | 0.018 |
| 100 | 5.255 | 9.172 | 0.260 | 0.022 |
| 250 | DF | DF | 0.271 | 0.046 |
| 500 | DF | DF | 0.358 | 1.429 |
| 1000 | DF | DF | 6.798 | 65.39 |

DF = did not finish in a day.

| n | Zañudo mean | Zañudo stdev. | Our method mean | Our method stdev. |
|---|---|---|---|---|
| 20 | 3.828 | 5.133 | 0.251 | 0.029 |
| 40 | DF | DF | 0.259 | 0.055 |
| 60 | DF | DF | 0.288 | 0.222 |
| 80 | DF | DF | 0.543 | 4.724 |
| 100 | DF | DF | 1.331 | 7.752 |
| 120 | DF | DF | 3.033 | 25.94 |
| 140 | DF | DF | 7.185 | 57.23 |

DF = did not finish in a day.

Finally, our results on published networks are shown in Table

| Ref. | n | 〈k〉 | Zañudo mean | Zañudo stdev. | Our method mean | Our method stdev. |
|---|---|---|---|---|---|---|
| | 62 | 1.62 | 1.678 | 0.729 | 0.231 | 0.010 |
| | 94 | 1.65 | 1.300 | 0.074 | 0.234 | 0.012 |
| | 302 | 1.71 | 4.698 | 0.116 | 0.236 | 0.011 |
| | 60 | 2.10 | 4636.245 | 89.311 | 0.239 | 0.013 |
| | 120 | 2.45 | 2023.954 | 18448.754 | 0.312 | 0.141 |
| | 54 | 2.59 | 6878.594 | 22059.317 | 0.256 | 0.030 |
| | 54 | 3.62 | 3.789 | 3.903 | 0.492 | 0.247 |
| | 76 | 4.01 | DF | DF | 0.242 | 0.013 |
| | 130 | 5.00 | DF | DF | 23.19 | 98.42 |
| | 225 | 5.16 | DF | DF | 4186* | 12284 |

DF = did not finish in a day. * = 49% of simulations reported; 51% of simulations were stopped because they did not finish in a day or had a large memory consumption.

The computational complexity of our algorithm depends on the type of networks used as well as the connectivity. The algorithm seems to run in polynomial time for Kauffman networks with

Conclusions

The capability to analyze the attractors of discrete dynamic models of biological networks is a key technology in any systems biology toolkit that incorporates this popular type of model. This capability needs to include steady state analysis as well as the determination of periodic points of larger periods. And it needs to apply to models that allow an arbitrary (finite) number of states for their variables, such as logical models. In this paper, we have focused on Boolean networks as the model type most commonly used currently. And we have focused only on steady state analysis, to the exclusion of periodic limit cycles. As is the case in many situations, algorithms available for this purpose, some of which we used here for comparison, perform well on some types of models and not so well on others. For instance, for Kauffman networks with connectivity 2, the method in

We have used three types of networks for benchmarking: Kauffman networks, power law networks, and published networks. Kauffman networks are commonly used for this purpose, but they don’t capture all properties of molecular networks, which include a power law distribution of node connectivities. Our analysis of published networks shows that some of them have high average connectivity, not generally considered in theoretical studies. These pose serious challenges to computational methods, as we demonstrate. As more large published networks become available, they will represent the most important suite of benchmark models to be used, in our opinion.

We believe that this study also holds another important lesson. Our method is a combination of two methods, neither one of which performs particularly well when applied on its own (see Additional file

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

AV-C designed and applied the graph reduction methods, and combined them with the computer algebra algorithm. He also generated the suite of benchmark networks used in the study. BA implemented the graph reduction methods. He also surveyed the literature for other available methods and carried out and collected performance data for the other methods used in the study for comparison. FH carried out Gröbner basis calculations for a subset of the benchmark networks. RL conceived, planned, and directed the project. All authors contributed to the writing of the manuscript. All authors read and approved the final manuscript.

Acknowledgements

The work of R.L. was supported in part by the grant