Suppose you write the a number \(N\) as the sum of \(k\) non-negative terms. That is, \[ N = n_1 + n_2 + \cdots + n_k \] where \(n_i \ge 0\) for \(i = 1,2,\ldots,k\). For example, for \(k=4\), \(N=10\) can be written as \(N = 2 + 3 + 4 + 1\), or as \(N = 5 + 0 + 2 + 3\).
Here’s a challenge: Find the sum that maximizes a function \(f(n_1,n_2,\dots,n_k)\). For example, let \(f(n_1, n_2, \dots, n_k) = (n_1 \times n_2 \times \cdots \times n_k)^\frac{1}{k}.\) Plenty of other examples abound, but let’s use this simple function for now.
Start by setting up the scene.
N <- 100
k <- 10
A state will refer to 10 numbers that sum to 100. Here is one way to obtain a random state:
vals <- sample(1:k, N, rep=T)
vals
## [1] 6 4 9 3 4 9 7 6 4 8 9 3 1 4 3 10 5 9 3 7 3 7 3
## [24] 7 3 5 7 5 5 8 5 4 4 7 7 1 7 1 8 4 3 2 3 8 5 1
## [47] 3 1 5 2 2 7 10 6 7 10 9 4 10 5 6 4 10 7 4 1 1 9 1
## [70] 7 9 8 5 10 9 2 3 8 9 1 7 6 7 6 7 9 1 2 10 1 5 1
## [93] 7 6 10 3 7 1 3 10
state <- as.vector(table(vals))
state
## [1] 13 5 13 10 10 7 17 6 10 9
Convince yourself that this really does produce a random state.
Now we want to maximize the product of any 10 values that sum to 100. Let’s slightly generalize this. We will maximize the log of the product (with a small tweak to account for a product that is 0). Clearly, finding \((n_1, n_2, \ldots, n_k)\) which maximizes the log of the product suffices.
f <- function(state){
##add a small value to get away from 0
-log(prod(state) + .001)/k
}
Did you notice the negative sign? Since we will use simulated annealing, we really want to solve a minimization problem. Thus, the minimum value of \(f\) corresponds to the maximum product of the 10 numbers that sum to 100.
Try it out:
f(state)
## [1] -2.24152
The state spaces consists of all \(k-\)tuples of non-negative values that sum to \(N\). If \((n_1, n_2, \ldots, n_k)\) is the current state, then we can propose to transition to a new state by:
For example, suppose \(N = 10\) and \(k = 4\), and \((0,2,5,3)\) is the current state. Randomly select, say, \(i=2\) and \(j=4\). In this case, the new state becomes \((0,1,5,4)\). However, if we select \(i=1\) and \(j=3\), then (since \(n_1 = 0\)), the next state is the same as the current state.
Here is one way to do this in R:
state
## [1] 13 5 13 10 10 7 17 6 10 9
sites <- sample(1:k, 2, rep=F)
sites
## [1] 7 5
if(state[sites[1]] > 0){
state[sites[1]] <- state[sites[1]] - 1
state[sites[2]] <- state[sites[2]] + 1
}
state
## [1] 13 5 13 10 11 7 16 6 10 9
From this, create the proposal function.
proposal <- function(currState){
propState <- currState
sites <- sample(1:k, 2, rep=F)
if(propState[sites[1]] > 0){
propState[sites[1]] <- propState[sites[1]] - 1
propState[sites[2]] <- propState[sites[2]] + 1
}
propState
}
Now create the doMove
function. Notice how much it looks like our previous doMove function.
sig <- 0.1
doMove <- function(currState,sig){
# propose a transition
propState <- proposal(currState)
# get current and proposed function values
currF <- f(currState)
propF <- f(propState)
# calculate rho and make a move based on result
dFunc <- propF - currF
rho <- exp(-dFunc/sig)
if(runif(1,0,1) < rho){
return(propState)
}
return(currState)
}
Never hurts to test it…
state
## [1] 13 5 13 10 11 7 16 6 10 9
(state <- doMove(state,1))
## [1] 14 5 13 9 11 7 16 6 10 9
(state <- doMove(state,1))
## [1] 14 5 13 9 11 7 16 6 11 8
Here we go with simulated annealing. Let’s start clean.
N <- 100
k <- 10
(state <- as.vector(table(sample(1:k, N, rep=T))))
## [1] 14 5 10 11 7 10 8 13 15 7
Ready to go. When doing simulated annealing, the choice of M
and decFac
are control parameters and needed to be fiddled with to get this to work properly.
sig <- .1
M <- 30000
decFac <- .9999
for(m in 1:M){
state <- doMove(state, sig)
# slowly decrease sig
sig <- sig*decFac
}
sig
## [1] 0.00497796
state
## [1] 13 9 7 8 12 10 9 12 9 11
As it turns out, the the answer is well-known—this function is maximized when all the values are as close to the same as possible.
Your turn: Change N
and k
, and see how the choices of \(\sigma^2\), decFac
, and M
need to be adjusted to get a reasonable answer.
There are lots of other interesting functions to consider. For example, try: \[ f(n_1, n_2, \dots, n_k) = \frac{1}{k}\sum_{i=1}^k n_i^2 \] That is, the mean of the sum of the squares of \((n_1, n_2, \dots, n_k)\). There isn’t too much different here, but you will need to think about the initial value of \(\sigma^2\) and how to decrease it properly.
A \(3 \times 3\) magic square is an arrangement of the numbers \(1, 2, \ldots, 9\) in a \(3 \times 3\) grid in such way that all the rows, columns, and diagonals have the same sum. Note there are 3 rows, 3 columns, and 2 diagonals—hence 8 sums total. For example:
\[\begin{array}{|c|c|c|} \hline 4 & 9 & 2\\ \hline 3 & 5 & 7\\ \hline 8 & 1 & 6\\ \hline \end{array} \] Here, all the rows, columns, and diagonals sum to 15.
Here is a simple way to create a random starting state. It almost certainly will not be magic!
mm <- matrix(sample(1:9, 9, rep=F), nrow=3)
The function to minimize is critically important. A good idea is to create a function that first calculates the 8 row/column/diagonal sums. Then for each of these values, determine (in absolute) value, how far they are from 15. The return value can be the mean (or max) of these distances.
It might seem sufficient to just swap entries around, but it turns out that it’s easy to get stuck in a bad square for which no swap will really improve things. A better idea is to randomly either
\[\begin{array}{|c|c|c|} \hline 1&\mathbf{2}&5\\ \hline 3&7&6\\ \hline 8&4&\mathbf{9}\\ \hline \end{array}\rightarrow\begin{array}{|c|c|c|} \hline 1&\mathbf{9}&5\\ \hline 3&7&6\\ \hline 8&4&\mathbf{2}\\ \hline \end{array} \]
\[\begin{array}{|c|c|c|} \hline 1&2&5\\ \hline 3&7&6\\ \hline 8&4&9\\ \hline \end{array}\rightarrow\begin{array}{|c|c|c|} \hline 3&7&6\\ \hline 1&2&5\\ \hline 8&4&9\\ \hline \end{array} \]
Hence your move function will have to make a few choices:
I suggest writing two functions: swapSites and swapRowCols, and blending these into your proposal function.
Submit your work to the Magic Squares assignment on Moodle. Your work should be an HTML or PDF file produced using R Markdown. Please delete the examples and discussion prior to the Assignment heading from the file that you turn in. This is a minor assignment—it is not necessary to provide a lot of discussion about your answers. This assignment is due on Monday, May 7.