# Balanced Subsets USACO 2021 US

Page Contents

## Balanced Subsets USACO

Farmer John’s pasture can be regarded as a large 2D grid of square “cells” (picture a huge chessboard) labeled by the ordered pairs (i,j)(i,j) for each 1≤i≤N1≤i≤N, 1≤j≤N1≤j≤N (1≤N≤1501≤N≤150). Some of the cells contain grass.

A nonempty subset of grid cells is called “balanced” if the following conditions hold:

1. All cells in the subset contain grass.
2. The subset is 4-connected. In other words, there exists a path from any cell in the subset to any other cell in the subset such that every two consecutive cells of the path are horizontally or vertically adjacent.
3. If cells (x1,y)(x1,y) and (x2,y)(x2,y) (x1≤x2x1≤x2) are part of the subset, then all cells (x,y)(x,y) with x1≤x≤x2x1≤x≤x2 are also part of the subset.
4. If cells (x,y1)(x,y1) and (x,y2)(x,y2) (y1≤y2y1≤y2) are part of the subset, then all cells (x,y)(x,y) with y1≤y≤y2y1≤y≤y2 are also part of the subset.

Count the number of balanced subsets modulo 109+7109+7.

#### INPUT FORMAT (input arrives from the terminal / stdin):

The first line contains NN.

The next NN lines each contain a string of NN characters. The jj-th character of the ii-th line from the top is equal to G if the cell at (i,j)(i,j) contains grass, or . otherwise.

#### OUTPUT FORMAT (print output to the terminal / stdout):

The number of balanced subsets modulo 109+7109+7.

```2
GG
GG
```

#### SAMPLE OUTPUT:

```13
```

For this test case, all 4-connected subsets are balanced.

```G.  .G  ..  ..  GG  .G  ..  G.  GG  .G  G.  GG  GG
.., .., G., .G, .., .G, GG, G., G., GG, GG, .G, GG
```

```4
GGGG
GGGG
GG.G
GGGG
```

#### SAMPLE OUTPUT:

```642
```

Here is an example of a subset that satisfies the second condition (it is 4-connected) but does not satisfy the third condition:

```GG..
.G..
GG..
....
```

#### SCORING:

• Test cases 1-4 satisfy N≤4N≤4.
• Test cases 5-10 satisfy N≤20N≤20.
• Test cases 11-20 satisfy no additional constraints.

Problem credits: Benjamin Qi

A key aspect in approaching this problem is more humanly characterizing what makes a 4-connected subset valid. For each row with at least one cell in the subset, the subset’s cells in this row must make up a contiguous range. This is necessary and sufficient to satisfy the third constraint in the problem statement.

However, we must characterize when the fourth constraint is also satisfied. First, the rows which have any subset cells must be a contiguous range of rows (otherwise it would not be 4-connected). Thus, let us characterize these rows. We can denote the column of the leftmost cell in the ii-th row (among those with cells in the subset) as LiLi and the column of the rightmost cell in the ii-th row as RiRi. To maintain 4-connectedness, each pair of consecutive rows must have some overlap (a necessary and sufficient requirement for this is that Li+1≤RiLi+1≤Ri and Ri+1≥LiRi+1≥Li). Finally, LL must be non-increasing for some prefix and then non-decreasing for the remaining (likewise RR must be non-decreasing and then non-increasing). If this condition is not met, then there will be a violation of the fourth condition. If this condition is met, there will be no such violation. Thus, we finalize our characterization by saying this property must hold for LL and RR.

Now, we must calculate the number of subsets that satisfy this characterization and contain only grass. Our main intuition is that we would like to use dynamic programming, where our state is the current row we are looking at, the starting and ending column of a contiguous range within this row, and flags that indicate whether our LL should be non-increasing or non-decreasing (and likewise for RR). There would be O(N3)O(N3) such states, and we could use this to ask how many valid subsets there are such that the bottom row of the subset is exactly this contiguous range of the row with these flags.

To be more concrete, one such method is supposing dp(a,b,h,L,R)dp(a,b,h,L,R) corresponds to the number of valid subsets with bottom row hh, cells from column LL to RR within hh, and flags aa and bb at the end of this process. The flag in aa corresponds to 00 if LL is in the non-increasing stage and 11 if it is in the non-decreasing stage. Likewise with the bb flag for RR, it is 00 for non-decreasing and 11 for non-increasing.

If we have computed the DP values for row h−1h−1, we can compute the values for row hh.

For example, dp(0,0,h,L,R)dp(0,0,h,L,R) (denoting row hh, leftmost column LL, rightmost column RR, LL is in the non-increasing prefix, and RR is in the non-decreasing prefix) is equal to 00 if there is a non-grass cell between columns LL and RR at row hh, and otherwise equal to:

1+∑Rl=L∑Rr=ldp(0,0,h−1,l,r)1+∑l=LR∑r=lRdp(0,0,h−1,l,r)

If we assume DP values are 0 when l>rl>r, then we can also use the following sum such that the sum is over a rectangle:

1+∑Rl=L∑Rr=Ldp(0,0,h−1,l,r)1+∑l=LR∑r=LRdp(0,0,h−1,l,r)

Note that there are more summands with different cases of flags. A naive transition given LiLi and RiRi is to loop over O(N2)O(N2) possible Li−1Li−1 and Ri−1Ri−1, which would result in an O(N5)O(N5) running time solution which is sufficient for the first two subtasks.

To speed up this solution, we can observe that the logic of this O(N2)O(N2) transition can instead be replaced with operations using prefix sums with DP values.

We maintain the required values in O(N2)O(N2) time for each row, enabling a solution with O(N3)O(N3) runtime.

Danny Mittal’s code:

```import java.io.BufferedReader;
import java.io.IOException;

public class BeautifulSubsets {
public static final long MOD = 1000000007;

public static void main(String[] args) throws IOException {
int n = Integer.parseInt(in.readLine());
boolean[][] grid = new boolean[n][n];
for (int y = 0; y < n; y++) {
String line = in.readLine();
for (int x = 0; x < n; x++) {
grid[y][x] = line.charAt(x) == 'G';
}
}
long answer = 0;
long[][][][] dpPrev = new long[n][n];
for (int y = 0; y < n; y++) {
long[][][][] sums1 = new long[n][n];
for (int x2 = 0; x2 < n; x2++) {
for (int x1 = x2; x1 >= 0; x1--) {
for (int b = 0; b < 2; b++) {
sums1[b][x1][x2] = dpPrev[b][x1][x2];
if (x1 < x2) {
sums1[b][x1][x2] += sums1[b][x1 + 1][x2];
sums1[b][x1][x2] %= MOD;
}
}
}
for (int x1 = 0; x1 <= x2; x1++) {
for (int b = 0; b < 2; b++) {
sums1[b][x1][x2] = dpPrev[b][x1][x2];
if (x1 > 0) {
sums1[b][x1][x2] += sums1[b][x1 - 1][x2] + dpPrev[b][x1 - 1][x2];
sums1[b][x1][x2] %= MOD;
}
}
}
}
long[][][][] sums2 = new long[n][n];
for (int x1 = 0; x1 < n; x1++) {
for (int x2 = x1; x2 < n; x2++) {
for (int a = 0; a < 2; a++) {
sums2[a][x1][x2] = sums1[a][x1][x2];
if (x2 > x1) {
sums2[a][x1][x2] += sums2[a][x1][x2 - 1];
sums2[a][x1][x2] %= MOD;
}
}
}
for (int x2 = n - 1; x2 >= x1; x2--) {
for (int a = 0; a < 2; a++) {
sums2[a][x1][x2] = sums1[a][x1][x2];
if (x2 < n - 1) {
sums2[a][x1][x2] += sums2[a][x1][x2 + 1] + sums1[a][x1][x2 + 1];
sums2[a][x1][x2] %= MOD;
}
}
}
}
long[][][][] dpNext = new long[n][n];
for (int x1 = 0; x1 < n; x1++) {
boolean valid = true;
for (int x2 = x1; x2 < n; x2++) {
valid = valid && grid[y][x2];
if (valid) {
dpNext[x1][x2] = (sums2[x1][x2] + 1L) % MOD;
dpNext[x1][x2] = (sums2[x1][x2] - (x2 == n - 1 ? 0L : (sums2[x2 + 1][x2 + 1] + dpPrev[x2 + 1][x2 + 1])) + (2L * MOD)) % MOD;
dpNext[x1][x2] = sums2[x1][x2];
dpNext[x1][x2] = sums2[x1][x2];
} else {
for (int a = 0; a < 2; a++) {
for (int b = 0; b < 2; b++) {
dpNext[a][b][x1][x2] = 0;
}
}
}
for (int a = 0; a < 2; a++) {
for (int b = 0; b < 2; b++) {
}
}
}
}
dpPrev = dpNext;
}
}
}```