Ugur HALICI, ARTIFICIAL NEURAL NETWORKS, CHAPTER 4
EE543 LECTURE NOTES, METU EEE, ANKARA

CHAPTER IV

Combinatorial Optimization by Neural Networks

Several authors have suggested the use of neural networks as a tool to provide approximate solutions for combinatorial optimization problems such as graph matching, the traveling salesman problem, task placement in a distributed system, etc.

In this chapter, we first give a brief description of combinatorial optimization problems. Next we explain in general how neural networks can be used in combinatorial optimization, and then introduce the Hopfield network as an optimizer for two well-known combinatorial optimization problems: graph partitioning and the traveling salesman problem. The Hopfield optimizer solves combinatorial optimization problems by gradient descent, which has the disadvantage of becoming trapped in local minima of the cost function.

The efficiency of neural networks in solving NP-hard combinatorial optimization problems has been investigated by several researchers [Bruck and Goodman 88, 90; Yao 92]. It has been shown that even finding approximate solutions to NP-hard problems is not an easy task. Using techniques of complexity theory, it has been proved that no network of polynomial size exists to solve the traveling salesman problem unless NP = P [Bruck and Goodman 1990]. However, their parallel nature and good performance in finding approximate solutions make neural optimizers interesting.

4.1 Combinatorial Optimization Problems

The problems typically having a large but finite set of solutions, among which we want to find the one that minimizes or maximizes a cost function, are often referred to as combinatorial optimization problems. Since any maximization problem can be reduced to a minimization problem simply by changing the sign of the cost function, we will consider only minimization problems with no loss of generality. An instance of a combinatorial optimization problem can be formalized as a pair (S, g). The solution space, denoted S, is the finite set of all possible solutions. The cost function, denoted by g, is a mapping from the set of solutions to the real numbers, that is, g : S → R [Aarts and Korst 89]. In the case of minimization, the problem is to find a solution S* ∈ S, called the globally-optimal solution, which satisfies

g(S*) = min_{S_i ∈ S} g(S_i)    (4.1.1)

Notice that for a given instance of the problem, such an optimal solution may not be unique.

Optimization problems can be divided into classes according to the time required to solve them. If there exists an algorithm that solves a problem in time that grows only polynomially with the size of the problem, then the problem is said to be polynomial. The set of polynomial-time problems, denoted P, is a subclass of another class called NP. Here NP stands for nondeterministic polynomial, implying that a polynomial-time algorithm exists for a nondeterministic Turing machine. However, for the problems in NP but not in P, there exists neither a polynomial-time algorithm for a deterministic Turing machine (although one exists for a nondeterministic Turing machine) nor a proof of the non-existence of such an algorithm. In spite of the unavailability of polynomial-time algorithms to solve this

circuit placement in VLSI, tool motion in manufacturing, network design, etc. Thus the development of methods that search for solutions close to the optimum, yet are not excessively time consuming, is the source of continued research. In the TSP, the shortest closed path traversing each city under consideration exactly once is searched for. For the TSP, the number of cities determines the size of the problem (Figure 4.2).

Figure 4.2 The traveling salesman problem: a) an instance with 4 cities, b) the optimum solution, c) a non-optimum solution, d) a non-feasible solution having some unvisited cities.

Another problem that we will consider in this chapter, because of its simplicity in designing a neural optimizer, is the vertex cover problem. It is also an NP-complete problem; therefore no efficient algorithm for its exact solution is available when the number of nodes in the graph is large. The problem size is determined by the number of nodes in the graph for which a minimum cover is searched.

The formal problem can be stated as follows: Let G = (V, E) be a graph where V = {v_1, v_2, ..., v_N} is the set of vertices and E = {(v_i, v_j)} is the set of edges of the graph. A cover C of G is a subset of V such that for each edge (v_i, v_j) in E, either v_i or v_j is in C. A minimum cover of G is a set C* such that the number of nodes in C* is the minimum among all the covers of G, that is, |C*| ≤ |C| for every cover C. For example, for the sample graph given in Figure 4.3, the covers are C_1 = (a,b,c,d,e), C_2 = (a,b,c,d), C_3 = (a,b,c,e), C_4 = (a,b,d,e), C_5 = (a,c,d,e), C_6 = (b,c,d,e), C_7 = (a,b,e), C_8 = (a,d,e), C_9 = (b,c,d), C_10 = (b,c,e), C_11 = (b,d,e), C_12 = (b,e), and the minimal cover is C_12 = (b,e).

Figure 4.3 A sample graph

If we have to solve an NP-complete problem, then a very long computation may be needed for an exact solution. The optimum solution of the vertex cover problem can be obtained by enumerating all the covers and then selecting the minimum one. However, such an enumerative search for the exact optimum solution has a time complexity of O(2^n), where n is the number of vertices in the graph. Being an NP-complete problem, finding the exact minimum cover of G is not practical when the number of vertices is very large. Thus, in some cases, approximate algorithms are preferred [Will 86].
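
Since the graph of Figure 4.3 is small, this enumerative search can be illustrated directly. The sketch below checks every subset of V; the edge set used here is an assumption, chosen only to be consistent with the covers C_1..C_12 listed above.

```python
from itertools import combinations

# Vertices of the sample graph of Figure 4.3 and an edge set assumed here
# to be consistent with the covers C_1..C_12 listed in the text.
V = ["a", "b", "c", "d", "e"]
E = [("a", "b"), ("b", "d"), ("b", "e"), ("c", "e"), ("d", "e")]

def is_cover(c, edges):
    """A set c covers the graph if every edge has at least one endpoint in c."""
    return all(u in c or v in c for (u, v) in edges)

# Enumerate all 2^n subsets of V and keep the covers: O(2^n) time.
covers = [set(s) for k in range(len(V) + 1)
                 for s in combinations(V, k) if is_cover(set(s), E)]
minimum_cover = min(covers, key=len)
print(len(covers), minimum_cover)   # 12 covers; the minimum cover is {'b', 'e'}
```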

Exercise: Explain what approximate solutions may be used for the vertex cover problem.

A heuristic solution to the problem using a greedy approach may be established as follows: first, the node having the highest degree in G is selected and included in C+, the cover being generated. Then the node and all edges adjacent to it, together with the related terminal nodes (that is, neighbouring nodes left with no remaining edges), are removed from G, and the procedure is repeated until all nodes in G have been removed.
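
A minimal sketch of this greedy heuristic, using the same assumed edge set as in the previous sketch:

```python
def greedy_cover(vertices, edges):
    """Greedy heuristic: repeatedly add a highest-degree vertex to the cover
    and remove it together with all edges adjacent to it, until no edges remain."""
    remaining = list(edges)
    cover = set()
    while remaining:
        # degree of each vertex in the remaining graph
        degree = {v: sum(v in e for e in remaining) for v in vertices}
        best = max(vertices, key=lambda v: degree[v])
        cover.add(best)
        remaining = [e for e in remaining if best not in e]
    return cover

# Assumed edge set of the Figure 4.3 graph (see the previous sketch).
V = ["a", "b", "c", "d", "e"]
E = [("a", "b"), ("b", "d"), ("b", "e"), ("c", "e"), ("d", "e")]
print(greedy_cover(V, E))   # {'b', 'e'}; here the heuristic happens to find the minimum cover
```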


function assigning a real value to each solution. The aim is to find a feasible solution for which the cost function is optimal [Aarts and Korst 89].

In order to use a neural optimizer to solve combinatorial optimization problems, the state space of the network is mapped onto the set of solutions. The state space X of a neural optimizer is the set of all possible state vectors x whose components correspond to the neuron outputs. For this purpose, first the given problem is formulated as a 0-1 programming problem. Then, a neural network is defined such that the state of each unit determines the value of a 0-1 variable. Thus, the neural network implements a bijective (one-to-one and onto) function m : X → S. The next step is to determine the strengths of the connections such that the energy function is order-preserving.

The energy function E of a neural network that implements a minimization problem (S, S', g) is called order-preserving if

g(m(x^k)) < g(m(x^l))  ⇒  E(x^k) < E(x^l)    (4.2.1)

for any x^k, x^l ∈ X with m(x^k), m(x^l) ∈ S'.

Exercise: Explain order preservation in terms of the traveling salesman problem.

Another desired property of the network is feasibility. Let X* denote the set of stable states of the neural network. The energy function E of the neural network is called feasible if each local minimum of the energy function corresponds to a feasible solution, that is,

m(X*) ⊆ S'    (4.2.2)

where

m(X*) = { S_i ∈ S | ∃ x^k ∈ X* : m(x^k) = S_i }.    (4.2.3)

Feasibility of the energy function implies that the solution achieved by the network will always be a feasible one, since a neural optimizer always converges to a configuration x ∈ X*.

Exercise: Explain feasibility in terms of the traveling salesman problem.

Note that, if the energy function is order preserving, then the energy will be minimal for configurations corresponding to an optimal solution (Figure 4.4). Furthermore, if the energy function is feasible, the network is guaranteed to converge to a feasible solution. Hence, feasibility and order-preservation of the energy function imply that the network will tend to find an optimal feasible solution for the given instance of the combinatorial optimization problem.

Figure 4.4: The goal of a neural optimizer is to converge to the global minimum of the energy function

Further notice that if {S*} ⊂ S' − m(X*), where S* is the minimum solution as defined by Eq. (4.1.1), then the neural network will never converge to a state

E(x) = −(1/2) ∑_{i=1}^{N} ∑_{j=1}^{N} w_{ji} x_i x_j − ∑_{i=1}^{N} θ_i x_i    (4.3.3)

where x_i is the output of neuron i, w_{ji} is the connection weight from neuron j to neuron i, θ_i is the input bias to neuron i, and N is the number of neurons in the network.

Notice that the energy function is bounded and non-increasing under the state transitions when x_i ∈ {0,1}, so it is a Lyapunov function. Therefore, the energy is minimized by the Hopfield network's state transitions. Furthermore, notice that x = x² whenever x ∈ {0,1}; hence the energy can be reorganized as:

E(x) = −(1/2) ∑_{i=1}^{n} ∑_{j=1}^{n} w_{ji} x_i x_j    (4.3.4)

where w_{ii} = 2θ_i.

Exercise: What happens to the restriction w_{ii} = 0? Is it still necessary for a binary-state Hopfield network?
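
As a small illustration (not part of the original notes), the following sketch computes the energy of Eq. (4.3.3) for a binary-state network and performs asynchronous threshold updates; with symmetric weights and zero self-connections, each update can only keep the energy the same or lower it.

```python
import numpy as np

def energy(x, W, theta):
    """Energy of Eq. (4.3.3): E = -1/2 * sum_ij w_ji x_i x_j - sum_i theta_i x_i."""
    return -0.5 * x @ W @ x - theta @ x

def async_update(x, W, theta, steps=100, rng=None):
    """Asynchronous updates: pick a neuron, set it to 1 if its net input is
    positive, to 0 otherwise.  For symmetric W with zero diagonal, each such
    update does not increase the energy."""
    rng = np.random.default_rng() if rng is None else rng
    x = x.copy()
    for _ in range(steps):
        i = rng.integers(len(x))
        net = W[i] @ x + theta[i]          # net input to neuron i
        x[i] = 1.0 if net > 0 else 0.0
    return x

# A tiny 3-neuron example with symmetric weights and zero self-connections.
W = np.array([[0.0, -2.0, 1.0],
              [-2.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
theta = np.array([0.5, 0.5, -0.5])
x0 = np.array([1.0, 1.0, 0.0])
xf = async_update(x0, W, theta, rng=np.random.default_rng(0))
print(energy(x0, W, theta), energy(xf, W, theta))   # the energy does not increase
```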

Now our goal is to represent the vertex cover problem by a Hopfield network so that the cost of the problem will be minimized as the energy of the network decreases at each step.

A solution to the vertex covering problem has the following constraints:

  1. Every edge in the graph must be adjacent to at least one of the vertices in the cover,
  2. There should be as few vertices in the cover as possible.

The first constraint is necessary for feasibility. The second one is, in fact, not a constraint but a statement of the minimization of the cost function. The problem can be represented by a neural network in which each neuron corresponds to a vertex in the graph. The outputs of the neurons indicate whether the corresponding vertex is included in the cover or not: the case x_i = 1 indicates that vertex i is in the cover, while x_i = 0 indicates it is not.

The energy function should be formed so that it satisfies the constraints discussed above. We are thus dealing with a special case of a very general class of problems, namely finding the minimum of a function in the presence of constraints. The standard method of solution is to introduce the constraints into the cost function via constants called Lagrange multipliers, so that the minimum of the cost function automatically satisfies the constraints for feasibility.

Let a 0-1 variable e_{ij} be assigned the value 1 if there is an edge from vertex i to vertex j in the graph, and 0 otherwise. Below, the cost function to be minimized is formulated as a 0-1 programming problem [Ghanwani 94]:

C(x) = A ( ∑_{i=1}^{N} ∑_{j=1}^{N} e_{ij} − 2 ∑_{i=1}^{N} ∑_{j=1}^{N} e_{ij} x_i + ∑_{i=1}^{N} ∑_{j=1}^{N} e_{ij} x_i x_j ) + B ∑_{i=1}^{N} x_i    (4.3.5)

The term with coefficient A in Eq. (4.3.5) is zero when the requirement for a valid cover has been met. That is, all the edges in the graph are adjacent to at least one of the vertices in the cover. The term with coefficient B increases the energy by an amount proportional to the number of vertices in the cover, emphasizing minimality. The constant part of the cost function can be dropped without affecting the solution. Hence the cost function becomes

C(x) = A ( −2 ∑_{i=1}^{n} ∑_{j=1}^{n} e_{ij} x_i + ∑_{i=1}^{n} ∑_{j=1}^{n} e_{ij} x_i x_j ) + B ∑_{i=1}^{n} x_i    (4.3.6)
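
One way to realize this cost as a Hopfield energy (a sketch under the stated assumptions, not the notes' own derivation) is to match Eq. (4.3.6) against Eq. (4.3.4): this suggests w_{ji} = −2A e_{ij} for i ≠ j and θ_i = 2A ∑_j e_{ij} − B, after which the network is run with asynchronous threshold updates.

```python
import numpy as np

# Assumed edge set of the Figure 4.3 graph, as an adjacency (e_ij) matrix.
names = ["a", "b", "c", "d", "e"]
e = np.zeros((5, 5))
for u, v in [(0, 1), (1, 3), (1, 4), (2, 4), (3, 4)]:   # a-b, b-d, b-e, c-e, d-e
    e[u, v] = e[v, u] = 1.0

A, B = 1.0, 0.5          # example coefficients chosen here so that 2A > B
deg = e.sum(axis=1)

# One possible assignment obtained by matching Eq. (4.3.6) with Eq. (4.3.4):
# w_ji = -2*A*e_ij for i != j, and theta_i = 2*A*deg_i - B.
W = -2.0 * A * e
theta = 2.0 * A * deg - B

x = np.ones(5)                              # start with every vertex in the cover
rng = np.random.default_rng(1)
for _ in range(200):                        # asynchronous threshold updates
    i = rng.integers(5)
    net = W[i] @ x + theta[i]
    x[i] = 1.0 if net > 0 else 0.0

# The stable states of this energy are feasible covers; often the minimum {'b','e'}.
print([names[i] for i in range(5) if x[i] == 1])
```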

Introducing a square matrix containing n × n binary elements, the solution can be represented in 0-1 programming (Figure 4.5). An entry having the value "1" in the i-th position of row α indicates that the visit order of city α is i. The matrix corresponds to a feasible solution if and only if each row and each column contains exactly one entry having the value "1" [Muller 90].

Figure 4.5 Representation of the tour of Figure 4.2b by an n × n matrix, in which the rows correspond to the cities while the columns indicate the order of visit.
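
As a quick check of this representation (an illustrative sketch with an assumed 4-city instance), a candidate matrix is feasible exactly when every row and every column sums to one:

```python
import numpy as np

def is_feasible_tour(X):
    """X[city, position] is the 0-1 tour matrix of Figure 4.5: feasible iff
    every row and every column contains exactly one 1."""
    return (X.sum(axis=0) == 1).all() and (X.sum(axis=1) == 1).all()

# A 4-city example: city 0 visited first, city 2 second, city 1 third, city 3 last.
X = np.zeros((4, 4), dtype=int)
for city, position in [(0, 0), (2, 1), (1, 2), (3, 3)]:
    X[city, position] = 1

print(is_feasible_tour(X))                  # True
X[3, 3] = 0                                 # city 3 left unvisited
print(is_feasible_tour(X))                  # False: not a valid tour
```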

When a matrix of neurons is used to represent the problem, the energy of the network becomes:

E(x) = −(1/2) ∑_{α=1}^{n} ∑_{i=1}^{n} ∑_{β=1}^{n} ∑_{j=1}^{n} w_{αi,βj} x_{αi} x_{βj}    (4.3.9)

where x_{αi} is the output of neuron αi, w_{αi,βj} is the connection strength between the units αi and βj, and w_{αi,αi} is related to the bias θ_{αi} such that w_{αi,αi} = 2θ_{αi}.

For the TSP, we have the following constraints:

  1. Each city should be visited exactly once;
  2. At each position of the travel route, there is exactly one city;
  3. The length of the tour should be the minimum.

An appropriate choice for the cost function is [Abe 91]:

C(x) = (A/2) ∑_{α=1}^{n} ( ∑_{i=1}^{n} x_{αi} − 1 )² + (B/2) ∑_{i=1}^{n} ( ∑_{α=1}^{n} x_{αi} − 1 )² + (D/2) ∑_{α=1}^{n} ∑_{β≠α} ∑_{i=1}^{n} d_{αβ} x_{αi} ( x_{β,i+1} + x_{β,i−1} )    (4.3.10)

where A and B are the Lagrange multipliers used to combine the constraints in the cost function, D weights the tour length, and d_{αβ} denotes the distance between cities α and β.
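
To see how Eq. (4.3.10) penalizes constraint violations, the sketch below evaluates its three terms directly (an illustration with arbitrarily chosen coefficients and a random distance matrix; the positions i ± 1 are taken modulo n here, assuming a closed tour):

```python
import numpy as np

def tsp_cost(X, d, A, B, D):
    """Cost of Eq. (4.3.10) for a 0-1 tour matrix X[city, position]."""
    n = X.shape[0]
    row = (A / 2) * ((X.sum(axis=1) - 1) ** 2).sum()    # each city visited once
    col = (B / 2) * ((X.sum(axis=0) - 1) ** 2).sum()    # one city per position
    length = 0.0
    for a in range(n):
        for b in range(n):
            if a == b:
                continue
            for i in range(n):                          # positions i-1, i+1 taken modulo n
                length += d[a, b] * X[a, i] * (X[b, (i + 1) % n] + X[b, (i - 1) % n])
    return row + col + (D / 2) * length

rng = np.random.default_rng(0)
pts = rng.random((4, 2))
d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)   # symmetric distances

tour = np.eye(4, dtype=int)          # visit cities in order 0,1,2,3: feasible
bad = tour.copy()
bad[3, 3] = 0                        # city 3 dropped: the A and B penalty terms become positive
A, B, D = 2.0, 2.0, 1.0              # example coefficients (not from the notes)
print(tsp_cost(tour, d, A, B, D), tsp_cost(bad, d, A, B, D))
```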

The cost function can be written as

C(x) = (A/2) ( ∑_{α=1}^{n} ∑_{i=1}^{n} ∑_{j=1}^{n} x_{αi} x_{αj} − 2 ∑_{α=1}^{n} ∑_{i=1}^{n} x_{αi} + n ) + (B/2) ( ∑_{i=1}^{n} ∑_{α=1}^{n} ∑_{β=1}^{n} x_{αi} x_{βi} − 2 ∑_{α=1}^{n} ∑_{i=1}^{n} x_{αi} + n ) + (D/2) ∑_{α=1}^{n} ∑_{β≠α} ∑_{i=1}^{n} d_{αβ} x_{αi} ( x_{β,i+1} + x_{β,i−1} )    (4.3.11)

In order to put the cost function of Eq. (4.3.11) into a form similar to the energy function given in Eq. (4.3.9), it can be reorganized, using x² = x, as:

C(x) = (A/2) ( ∑_{α=1}^{n} ∑_{i=1}^{n} ∑_{j≠i} x_{αi} x_{αj} − ∑_{α=1}^{n} ∑_{i=1}^{n} x_{αi} + n ) + (B/2) ( ∑_{i=1}^{n} ∑_{α=1}^{n} ∑_{β≠α} x_{αi} x_{βi} − ∑_{α=1}^{n} ∑_{i=1}^{n} x_{αi} + n ) + (D/2) ∑_{α=1}^{n} ∑_{β≠α} ∑_{i=1}^{n} d_{αβ} x_{αi} ( x_{β,i+1} + x_{β,i−1} )    (4.3.12)

Comparing Eq. (4.3.12) with the energy function of Eq. (4.3.9), and letting the diagonal terms w_{αi,αi} = 2θ_{αi} absorb the linear part as in Eq. (4.3.4), the connection weights are obtained as

w_{αi,βj} = −A δ_{αβ} (1 − δ_{ij}) − B δ_{ij} (1 − δ_{αβ}) − D d_{αβ} ( δ_{j,i+1} + δ_{j,i−1} ) + (A + B) δ_{αβ} δ_{ij}

where δ is the Kronecker delta, together with a constant term (A + B) n / 2, which can be dropped without affecting the minimization.
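
A minimal sketch of building the full weight matrix from this expression (an illustration with example coefficients, not code from the notes):

```python
import numpy as np

def tsp_weights(d, A, B, D):
    """Build w[(alpha,i),(beta,j)] from the weight expression above:
    w = -A*delta_ab*(1-delta_ij) - B*delta_ij*(1-delta_ab)
        - D*d_ab*(delta_{j,i+1} + delta_{j,i-1}) + (A+B)*delta_ab*delta_ij."""
    n = d.shape[0]
    W = np.zeros((n, n, n, n))            # indices: alpha, i, beta, j
    for a in range(n):
        for i in range(n):
            for b in range(n):
                for j in range(n):
                    same_city = float(a == b)
                    same_pos = float(i == j)
                    neighbour = float(j == (i + 1) % n) + float(j == (i - 1) % n)
                    W[a, i, b, j] = (-A * same_city * (1 - same_pos)
                                     - B * same_pos * (1 - same_city)
                                     - D * d[a, b] * neighbour
                                     + (A + B) * same_city * same_pos)
    return W.reshape(n * n, n * n)        # flatten to an (n^2 x n^2) weight matrix

d = np.array([[0.0, 1.0, 2.0, 1.0],       # a toy symmetric distance matrix
              [1.0, 0.0, 1.0, 2.0],
              [2.0, 1.0, 0.0, 1.0],
              [1.0, 2.0, 1.0, 0.0]])
W = tsp_weights(d, A=2.0, B=2.0, D=1.0)   # example coefficients (not from the notes)
print(W.shape, np.allclose(W, W.T))       # (16, 16) True: the weights are symmetric
```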