# Routing Schemes USACO 2021 US

Page Contents

## Routing Schemes USACO

Consider a network of NN (2≤N≤1002≤N≤100) nodes labeled 1…N1…N. Each node is designated as a sender, a receiver, or neither. The number of senders, SS, is equal to the number of receivers (S≥1S≥1).

The connections between the nodes in this network can be described by a list of directed edges each of the form i→ji→j, meaning that node ii may route to node jj. Interestingly, all of these edges satisfy the property that i<ji<j, aside from KK that satisfy i>ji>j (0≤K≤20≤K≤2). There are no self-loops (edges of the form i→ii→i).

The description of a “routing scheme” consists of a set of SS directed paths from senders to receivers such that no two of these paths share an endpoint. That is, the paths connect distinct senders to distinct receivers. A path from a sender ss to a receiver rr can be described as a sequence of nodess=v0→v1→v2→⋯→ve=rs=v0→v1→v2→⋯→ve=rsuch that the directed edges vi→vi+1vi→vi+1 exist for all 0≤i<e0≤i<e. A node may appear more than once within the same path.

Count the number of distinct routing schemes such that every directed edge is traversed exactly once. Since the answer may be very large, report it modulo 109+7109+7. It is guaranteed that there is at least one routing scheme satisfying these constraints.

Each input contains TT (1≤T≤201≤T≤20) test cases that should be solved independently. It is guaranteed that the sum of N2N2 over all test cases does not exceed 2⋅1042⋅104.

#### INPUT FORMAT (input arrives from the terminal / stdin):

The first line of the input contains TT, the number of test cases.

The first line of each test case contains the integers NN and KK. Note that SS is not explicitly given within the input.

The second line of each test case contains a string of length NN. The ii-th character of the string is equal to S if the ii-th node is a sender, R if the ii-th node is a receiver, or . if the ii-th node is neither. The number of Rs in this string is equal to the number of Ss, and there is at least one S.

The next NN lines of each test case each contain a bit string of NN zeros and ones. The jj-th bit of the ii-th line is equal to 11 if there exists a directed edge from node ii to node jj, and 00 otherwise. As there are no self-loops, the main diagonal of the matrix consists solely of zeros. Furthermore, there are exactly KK ones below the main diagonal.

Consecutive test cases are separated by newlines for readability.

#### OUTPUT FORMAT (print output to the terminal / stdout):

For each test case, the number of routing schemes such that every edge is traversed exactly once, modulo 109+7109+7. It is guaranteed that there is at least one valid routing scheme for each test case.

```2

8 0
SS....RR
00100000
00100000
00011000
00000100
00000100
00000011
00000000
00000000

13 0
0001000000000
0001000000000
0001000000000
0000111000000
0000000000000
0000000000000
0000000000000
0000000001000
0000000001000
0000000000110
0000000000000
0000000000000
0000000000000
```

#### SAMPLE OUTPUT:

```4
12
```

For the first test case, the edges are 1→3,2→3,3→4,3→5,4→6,5→6,6→7,6→81→3,2→3,3→4,3→5,4→6,5→6,6→7,6→8.

There are four possible routing schemes:

• 1→3→4→6→7,2→3→5→6→81→3→4→6→7,2→3→5→6→8
• 1→3→5→6→7,2→3→4→6→81→3→5→6→7,2→3→4→6→8
• 1→3→4→6→8,2→3→5→6→71→3→4→6→8,2→3→5→6→7
• 1→3→5→6→8,2→3→4→6→71→3→5→6→8,2→3→4→6→7

For the second test case, the edges are 1→4,2→4,3→4,4→5,4→6,4→7,8→10,9→10,10→11,10→121→4,2→4,3→4,4→5,4→6,4→7,8→10,9→10,10→11,10→12.

One possible routing scheme consists of the following paths:

• 1→4→51→4→5
• 2→4→72→4→7
• 3→4→63→4→6
• 8→10→128→10→12
• 9→10→119→10→11

In general, senders {1,2,3}{1,2,3} can route to some permutation of receivers {5,6,7}{5,6,7} and senders {8,9}{8,9} can route to some permutation of receivers {11,12}{11,12}, giving an answer of 6⋅2=126⋅2=12.

```2

5 1
SS.RR
00101
00100
10010
00000
00000

6 2
S....R
001000
000100
010001
000010
001000
000000
```

#### SAMPLE OUTPUT:

```3
1
```

For the first test case, the edges are 1→3,1→5,2→3,3→1,3→41→3,1→5,2→3,3→1,3→4.

There are three possible routing schemes:

• 1→3→1→51→3→1→5, 2→3→42→3→4
• 1→3→41→3→4, 2→3→1→52→3→1→5
• 1→51→5, 2→3→1→3→42→3→1→3→4

For the second test case, the edges are 1→3,2→4,3→2,3→6,4→5,5→31→3,2→4,3→2,3→6,4→5,5→3.

There is only one possible routing scheme: 1→3→2→4→5→3→61→3→2→4→5→3→6.

```5

3 2
RS.
010
101
100

4 2
.R.S
0100
0010
1000
0100

4 2
.SR.
0000
0011
0100
0010

5 2
.SSRR
01000
10101
01010
00000
00000

6 2
SS..RR
001010
000010
000010
000010
100101
000000
```

```2
1
2
6
24
```

#### SCORING:

• Test cases 4-5 satisfy N≤6N≤6.
• Test cases 6-7 satisfy K=0K=0.
• Test cases 8-12 satisfy K=1K=1.
• Test cases 13-24 satisfy K=2K=2.

Problem credits: Benjamin Qi

First, create a new “start” node and add edges from it to every sender. Also add an “end” node and add edges from every receiver to it. In order for a routing scheme to exist, all NN original nodes must now have equal in-degree as out-degree; let deg(i)deg(i) equal this common degree.

For each node n∈[1,N]n∈[1,N], we must pair up every in-edge of the form i→ni→n with a distinct out-edge of the form n→on→o, meaning that if a packet enters nn through i→ni→n it will exit through n→on→o. Such pairings will result in a valid routing scheme as long as no cycles are induced. For example, in the first test case of the third sample input, {2→1,1→2→3→1}{2→1,1→2→3→1} uses all the edges but is not a valid routing scheme due to the induced cycle 1→2→3→11→2→3→1.

For K=0K=0, it is impossible to induce a cycle, so we can simply compute the number of ways to pair for each node independently and multiply the contributions together. The answer is given by the following expression:∏i=1Ndeg(i)!,∏i=1Ndeg(i)!,

For example, in the second test case of the first sample input, deg(4)=3deg(4)=3 and deg(10)=2deg(10)=2, so the answer is 3!⋅2!=123!⋅2!=12.

Code:

```#include <bits/stdc++.h>
using namespace std;

const int MOD = 1e9+7;

using ll = long long;

struct mi {
int v;
mi() { v = 0; }
mi(ll _v) { v = int(_v%MOD); }
};
mi& operator+=(mi& a, mi b) {
if ((a.v += b.v) >= MOD) a.v -= MOD;
return a; }
mi& operator-=(mi& a, mi b) {
if ((a.v -= b.v) < 0) a.v += MOD;
return a; }
mi operator+(mi a, mi b) { return a += b; }
mi operator-(mi a, mi b) { return a -= b; }
mi operator*(mi a, mi b) { return mi((ll)a.v*b.v); }
mi& operator*=(mi& a, mi b) { return a = a*b; }

void solve() {
bool send{}, recei{};
int in_deg{}, out_deg{}, deg{};

// setup
int N,K; cin >> N >> K; assert(K <= 1);
for (int i = 1; i <= N; ++i) {
char c; cin >> c;
if (c == 'S') send[i] = 1;
if (c == 'R') recei[i] = 1;
}
for (int i = 1; i <= N; ++i)
for (int j = 1; j <= N; ++j) {
char c; cin >> c;
if (c == '1') ++out_deg[i], ++in_deg[j];
}
for (int i = 1; i <= N; ++i) {
assert(in_deg[i]+send[i] == out_deg[i]+recei[i]);
deg[i] = in_deg[i]+send[i];
}
vector<mi> factorial(N+1); factorial = 1;
for (int i = 1; i <= N; ++i) factorial[i] = i*factorial[i-1];

mi ans = 1;
for (int i = 1; i <= N; ++i) ans *= factorial[deg[i]];
cout << ans.v << "\n";
}

int main() {
int T; cin >> T;
while (T--) solve();
}
```

For K=1K=1, suppose that the only edge satisfying i>ji>j is estart→eendestart→eend. Then the pairing is invalid if and only if there exists some sequence eend=v0→v1→v2→⋯→vk=estarteend=v0→v1→v2→⋯→vk=estart such that vi<vi+1vi<vi+1 and edge vi→vi+1vi→vi+1 paired with edge vi+1→v(i+2)%Kvi+1→v(i+2)%K at vi+1vi+1 for all i∈[0,k)i∈[0,k). We can count the number of valid routing schemes with an O(N2)O(N2) DP where we pair up the edges adjacent to ii in increasing order of ii. Let dp[i][j]dp[i][j] equal the number of ways to pair up the edges adjacent to vertices 1…i1…i such that there is currently a path eend=v0→v1→v2→⋯→vk=jeend=v0→v1→v2→⋯→vk=j where j>ij>i (or j=0j=0 if we know that regardless of how we pair up the edges adjacent vertices i+1…Ni+1…N, no cycle will be produced). Initially, dp[eend]=1dp[eend]=1, and our answer will be dp[N]dp[N].

There are several possible transitions from dp[i−1][j]dp[i−1][j]:

• If j≠ij≠i, it doesn’t matter how we pair up the edges adjacent to jj. So add dp[i−1][j]⋅deg(i)!dp[i−1][j]⋅deg(i)! to dp[i][j]dp[i][j].
• If j=i=estartj=i=estart, then there are (deg(i)−1)⋅(deg(i)−1)!(deg(i)−1)⋅(deg(i)−1)! ways to pair up the edges adjacent to ii such that no cycle is produced. Then regardless of how the edges are paired for i+1…Ni+1…N, we’ll never create a cycle. So add dp[i−1][j]⋅(deg(i)−1)⋅(deg(i)−1)dp[i−1][j]⋅(deg(i)−1)⋅(deg(i)−1) to dp[i]dp[i].
• If j=i≠estartj=i≠estart, then we transition from dp[i−1][j]dp[i−1][j] to dp[i]dp[i] if there is a receiver at ii and dp[i][k]dp[i][k] for all k>jk>j such that there exists an edge from jj to kk.

Code for K≤1K≤1:

```#include <bits/stdc++.h>
using namespace std;

const int MOD = 1e9+7;

using ll = long long;

struct mi {
int v;
mi() { v = 0; }
mi(ll _v) { v = int(_v%MOD); }
};
mi& operator+=(mi& a, mi b) {
if ((a.v += b.v) >= MOD) a.v -= MOD;
return a; }
mi& operator-=(mi& a, mi b) {
if ((a.v -= b.v) < 0) a.v += MOD;
return a; }
mi operator+(mi a, mi b) { return a += b; }
mi operator-(mi a, mi b) { return a -= b; }
mi operator*(mi a, mi b) { return mi((ll)a.v*b.v); }
mi& operator*=(mi& a, mi b) { return a = a*b; }

void solve() {
mi dp{};
bool send{}, recei{};
int in_deg{}, out_deg{}, deg{};

// setup
int N,K; cin >> N >> K; assert(K <= 1);
for (int i = 1; i <= N; ++i) {
char c; cin >> c;
if (c == 'S') send[i] = 1;
if (c == 'R') recei[i] = 1;
}
int e_start = 0, e_end = 0;
for (int i = 1; i <= N; ++i)
for (int j = 1; j <= N; ++j) {
char c; cin >> c;
if (c == '1') {
++out_deg[i], ++in_deg[j];
if (i > j) e_start = i, e_end = j;
}
}
for (int i = 1; i <= N; ++i) {
assert(in_deg[i]+send[i] == out_deg[i]+recei[i]);
deg[i] = in_deg[i]+send[i];
}
vector<mi> factorial(N+1); factorial = 1;
for (int i = 1; i <= N; ++i) factorial[i] = i*factorial[i-1];

// now do DP
dp[e_end] = 1;
for (int i = 1; i <= N; ++i)
for (int j = 0; j <= N; ++j) if (dp[i-1][j].v != 0) {
assert(j == 0 || j >= i);
if (j == i) {
if (j == e_start) dp[i] += (deg[i]-1)*ad;
else {
for (int k = i+1; k <= N; ++k) if (adj[j][k]) dp[i][k] += ad;
}
} else {
dp[i][j] += dp[i-1][j]*factorial[deg[i]];
}
}
cout << dp[N].v << "\n";
}

int main() {
int T; cin >> T;
while (T--) solve();
}
```

Essentially, our solution for K=1K=1 involves sweeping a vertical line from i=1i=1 to i=Ni=N and maintaining the endpoint of a path that we want to make sure does not become part of a cycle (the start point of the path is estartestart). When the line hits the current endpoint of the path (j=ij=i), we perform the appropriate DP transitions; otherwise, we can pair up the edges adjacent to ii in whatever way we want (for j≠ij≠i, just multiply by deg(i)!deg(i)!).

K>1K>1 is trickier but the idea is similar. Instead of a single path, we can maintain the start and end points of up to KK paths such that both the start and end points of each path are to the right of the line. Specifically, let dp[i][j][k]dp[i][j][k] equal the number of ways to pair up the edges adjacent to vertices 1…i1…i such that there is currently a path ends→starts→⋯→jends→starts→⋯→j where j≥ij≥i and ends≥iends≥i (or j=0j=0 if no such path exists), as well as a path ends→starts→⋯→kends→starts→⋯→k where k≥ik≥i and ends≥iends≥i (or k=0k=0 if no such path exists). We initialize dp[ends][ends]=1dp[ends][ends]=1 because initially, the paths lying to the right of the sweepline are starts→endsstarts→ends and starts→endsstarts→ends. The answer will be stored in dp[N]dp[N].

Figuring out the transitions from dp[i−1][j][k]dp[i−1][j][k] depends on whether i=ji=j, i=ki=k, or both. It requires a bit of casework since we may choose to merge the two paths into one; see the code for details.

```#include <bits/stdc++.h>
using namespace std;

const int MOD = 1e9+7;

using ll = long long;

struct mi {
int v;
mi() { v = 0; }
mi(ll _v) { v = int(_v%MOD); }
};
mi& operator+=(mi& a, mi b) {
if ((a.v += b.v) >= MOD) a.v -= MOD;
return a; }
mi& operator-=(mi& a, mi b) {
if ((a.v -= b.v) < 0) a.v += MOD;
return a; }
mi operator+(mi a, mi b) { return a += b; }
mi operator-(mi a, mi b) { return a -= b; }
mi operator*(mi a, mi b) { return mi((ll)a.v*b.v); }
mi& operator*=(mi& a, mi b) { return a = a*b; }

void solve() {
mi dp{};
bool send{}, recei{};
int in_deg{}, out_deg{}, deg{}, id{};

// setup
int N,K; cin >> N >> K;
for (int i = 1; i <= N; ++i) {
char c; cin >> c;
if (c == 'S') send[i] = 1;
if (c == 'R') recei[i] = 1;
}

vector<int> starts, ends;
for (int i = 1; i <= N; ++i)
for (int j = 1; j <= N; ++j) {
id[i][j] = -1;
char c; cin >> c;
if (c == '1') {
++out_deg[i], ++in_deg[j];
if (i > j) {
id[i][j] = starts.size();
starts.push_back(i); ends.push_back(j);
}
}
}
while (starts.size() < 2) starts.push_back(0), ends.push_back(0);
for (int i = 1; i <= N; ++i) {
assert(in_deg[i]+send[i] == out_deg[i]+recei[i]);
deg[i] = in_deg[i]+send[i];
}
vector<mi> factorial(N+1); factorial = 1;
for (int i = 1; i <= N; ++i) factorial[i] = i*factorial[i-1];

// DP
// sweep line: keep track of which segments cross the border
dp[ends][ends] = 1;
for (int i = 1; i <= N; ++i) for (int j = 0; j <= N; ++j) for (int k = 0; k <= N; ++k)
if (dp[i-1][j][k].v != 0) { // paths are (starts,j), (starts,k)
vector<int> in; // which path endpoints do we hit?
if (j) {
assert(starts >= i && j >= i);
if (j == i) in.push_back(0);
}
if (k) {
assert(starts >= i && k >= i);
if (k == i) in.push_back(1);
}
auto inc_dp = [&](int jj, int kk) {
if (starts <= i || jj <= i) jj = 0;
if (starts <= i || kk <= i) kk = 0;
};
if (in.empty()) { inc_dp(j,k); continue; } // paths remain same, ez
vector<int> out;
for (int jj = 1; jj <= N; ++jj)
if (adj[i][jj] || (i == jj && recei[jj]))
out.push_back(jj);
assert(out.size() == deg[i]);
if (in == vector<int>{0}) {
for (int jj: out) {
if (id[i][jj] == 0) continue;
if (id[i][jj] == 1) inc_dp(k,0); // merge paths
else inc_dp(jj,k);
}
} else if (in == vector<int>{1}) {
for (int kk: out) {
if (id[i][kk] == 1) continue;
if (id[i][kk] == 0) inc_dp(0,j); // merge paths
else inc_dp(j,kk);
}
} else {
assert((in == vector<int>{0,1}));
for (int jj: out) for (int kk: out) if (jj != kk) {
if (id[i][jj] == 0 || id[i][kk] == 1) continue;
if (id[i][jj] == 1) {
if (id[i][kk] == 0) continue;
assert(kk >= i);
inc_dp(kk,0); // merge paths
} else if (id[i][kk] == 0) {
assert(jj >= i);
inc_dp(0,jj); // merge paths
} else {
inc_dp(jj,kk);
}
}
}
}
cout << dp[N].v << "\n";
}

int main() {
int T; cin >> T;
while (T--) solve();
}
```

Another version that should work in O(NK+1)O(NK+1) for any fixed KK (ignoring factors of logNlog⁡N):

```#include <bits/stdc++.h>
using namespace std;

const int MOD = 1e9+7;

using ll = long long;
using vpi = vector<pair<int,int>>;

#define f first
#define s second
#define sz(x) (int)(x).size()

struct mi {
int v;
mi() { v = 0; }
mi(ll _v) { v = int(_v%MOD); }
};
mi& operator+=(mi& a, mi b) {
if ((a.v += b.v) >= MOD) a.v -= MOD;
return a; }
mi& operator-=(mi& a, mi b) {
if ((a.v -= b.v) < 0) a.v += MOD;
return a; }
mi operator+(mi a, mi b) { return a += b; }
mi operator-(mi a, mi b) { return a -= b; }
mi operator*(mi a, mi b) { return mi((ll)a.v*b.v); }
mi& operator*=(mi& a, mi b) { return a = a*b; }

string G, nodes;
int N,K;

int in_deg(int i) {
int in = 0;
if (nodes[i] == 'S') ++in;
for (int j = 0; j < N; ++j) if (G[j][i] == '1') ++in;
return in;
}

namespace std {

template<class Fun>
class y_combinator_result {
Fun fun_;
public:
template<class T>
explicit y_combinator_result(T &&fun): fun_(std::forward<T>(fun)) {}

template<class ...Args>
decltype(auto) operator()(Args &&...args) {
return fun_(std::ref(*this), std::forward<Args>(args)...);
}
};

template<class Fun>
decltype(auto) y_combinator(Fun &&fun) {
return y_combinator_result<std::decay_t<Fun>>(std::forward<Fun>(fun));
}

} // namespace std

vector<mi> factorial;
map<vpi,mi> dp;

void process_vert(int v) {
int deg = in_deg(v);
map<vpi,mi> DP;
for (pair<vpi,mi> tmp: dp) {
auto comp = [&](int x) { return -x-1; };
vector<int> going_in, going_out;
for (int j = 0; j < sz(tmp.f); ++j) {
if (tmp.f[j].s == v) going_in.push_back(comp(j));
if (tmp.f[j].f == v) going_out.push_back(comp(j));
}
for (int j = v+1; j < N; ++j) if (G[v][j] == '1') going_out.push_back(j);
while (sz(going_out) < deg) going_out.push_back(v);
vector<bool> done(sz(going_out));
auto tran = [&](vpi edges, mi num) {
map<int,int> nex, xen;
for (pair<int,int> e: edges) {
nex[e.f] = e.s;
xen[e.s] = e.f;
}
vpi ntmp;
for (pair<int,int> x: nex) {
set<int> vis; int lst = x.f;
while (1) { // check for cycle
if (vis.count(lst)) return; // found cycle
if (!nex.count(lst)) break;
vis.insert(lst); lst = nex[lst];
}
if (!xen.count(x.f)) {
if (lst < 0) lst = tmp.f[comp(lst)].s;
if (lst > v) ntmp.push_back({tmp.f[comp(x.f)].f,lst});
}
}
// if nex contains any cycle -> FAIL
for (pair<int,int> t: tmp.f) if (t.f > v && t.s > v) ntmp.push_back(t);
sort(begin(ntmp),end(ntmp)); DP[ntmp] += num;
};
auto generate = y_combinator([&](auto self, int ind, vpi edges) {
if (ind == sz(going_in)) {
tran(edges,tmp.s*factorial[sz(going_out)-ind]);
return;
}
for (int i = 0; i < sz(going_out); ++i) if (!done[i]) {
vpi nedges = edges;
nedges.emplace_back(going_in[ind],going_out[i]);
done[i] = 1; self(ind+1,nedges); done[i] = 0;
}
});
generate(0,vpi());
}
swap(dp,DP);
}

void solve() {
cin >> N >> K;
factorial.resize(N+1);
factorial = 1; for (int i = 1; i <= N; ++i) factorial[i] = i*factorial[i-1];

cin >> nodes;
for (int i = 0; i < N; ++i) cin >> G[i];
vpi back_edges;
for (int i = 0; i < N; ++i) for (int j = 0; j < i; ++j)
if (G[i][j] == '1') back_edges.emplace_back(i,j);
assert(sz(back_edges) == K);
dp.clear();
dp[back_edges] = 1;
for (int i = 0; i < N; ++i) process_vert(i);

mi ans = dp[{}];
cout << ans.v << "\n";
}

int main() {
int T; cin >> T;
while (T--) solve();
}
```

Danny Mittal’s code (slightly different approach):

```import java.io.BufferedReader;
import java.io.IOException;
import java.util.StringTokenizer;

public class RoutingSchemesMultitest {
public static final long MOD = 1000000007;

public static void main(String[] args) throws IOException {
for (int tc = 1; tc <= t; tc++) {
int n = Integer.parseInt(tokenizer.nextToken());
int k = Integer.parseInt(tokenizer.nextToken());
char[] types = ("." + in.readLine()).toCharArray();
boolean[][] adj = new boolean[n + 1][n + 1];
int[] inDegree = new int[n + 1];
int[] outDegree = new int[n + 1];
int specialFrom1 = 0;
int specialTo1 = 0;
int specialFrom2 = 0;
int specialTo2 = 0;
for (int a = 1; a <= n; a++) {
String line = " " + in.readLine();
for (int b = 1; b <= n; b++) {
outDegree[a]++;
inDegree[b]++;
if (a > b) {
if (specialFrom1 == 0) {
specialFrom1 = a;
specialTo1 = b;
} else {
specialFrom2 = a;
specialTo2 = b;
}
}
}
}
}
for (int a = 1; a <= n; a++) {
if (inDegree[a] + (types[a] == 'S' ? 1 : 0) != outDegree[a] + (types[a] == 'R' ? 1 : 0)) {
System.out.println(0);
return;
}
}
long[] factorial = new long[n + 1];
factorial = 1;
for (int j = 1; j <= n; j++) {
factorial[j] = (((long) j) * factorial[j - 1]) % MOD;
}
long[][][] dp = new long[n + 1][n + 1][n + 1];
dp[specialTo1][specialTo2] = 1;
for (int a = 1; a <= n; a++) {
int degree = Math.max(inDegree[a], outDegree[a]);
for (int b = 0; b <= n; b++) {
for (int c = 0; c <= n; c++) {
dp[a][b][c] += dp[a - 1][b][c];
dp[a][b][c] %= MOD;
dp[a][a][c] += dp[a - 1][b][c];
dp[a][a][c] %= MOD;
}
dp[a][b][a] += dp[a - 1][b][c];
dp[a][b][a] %= MOD;
}
dp[a][a][a] += dp[a - 1][b][c];
dp[a][a][a] %= MOD;
}
}
}
for (int b = 0; b <= n; b++) {
for (int c = 0; c <= n; c++) {
dp[a][b][c] *= factorial[Math.max(0, degree - (b == a ? 1 : 0) - (c == a ? 1 : 0))];
dp[a][b][c] %= MOD;
}
}
}
if (k == 0) {
} else if (k == 1) {
for (int a = 1; a <= n; a++) {
if (types[a] == 'R') {
}
}
} else {
for (int a = 1; a <= n; a++) {
if (types[a] == 'R') {
for (int b = 1; b <= n; b++) {
if (types[b] == 'R' && b != a) {
}
}

}
}
}
}
}
}
```

Here is a solution that works in O(N2)O(N2) time that uses the principle of inclusion and exclusion (courtesy of Andrew He). Essentially, you

2. subtract the number of ways to form a cycle with the first back-edge using get_path_1get_path_1
3. subtract the number of ways to form a cycle with the second back-edge using get_path_1get_path_1
4. add back the number of ways to form two cycles, one involving each back-edge
5. subtract the number of ways to form a single cycle involving both back-edges

The code below calculates (2) as prod_deg⋅get_path_1(estart,eend)prod_deg⋅get_path_1(estart,eend), (3) as prod_deg⋅get_path_1(estart,eend)prod_deg⋅get_path_1(estart,eend), and (4) – (5) asprod_deg⋅get_path_1(estart,eend)⋅prod_deg⋅get_path_1(estart,eend)prod_deg⋅get_path_1(estart,eend)⋅prod_deg⋅get_path_1(estart,eend)−prod_deg⋅get_path_1(estart,eend)⋅prod_deg⋅get_path_1(estart,eend).−prod_deg⋅get_path_1(estart,eend)⋅prod_deg⋅get_path_1(estart,eend).The intuition for this last expression is that if a proposed scheme results in a vertex being part of more than one cycle or part of the same cycle more than once, then it is added in (4) and subtracted in (5), so the net contribution is 0. Proof of correctness is left to the reader.

```#include <bits/stdc++.h>
using namespace std;

const int MOD = 1e9+7;

using ll = long long;

struct mi {
int v; explicit operator int() const { return v; }
mi() { v = 0; }
mi(ll _v):v(_v%MOD) { v += (v<0)*MOD; }
};
mi& operator+=(mi& a, mi b) {
if ((a.v += b.v) >= MOD) a.v -= MOD;
return a; }
mi& operator-=(mi& a, mi b) {
if ((a.v -= b.v) < 0) a.v += MOD;
return a; }
mi operator+(mi a, mi b) { return a += b; }
mi operator-(mi a, mi b) { return a -= b; }
mi operator*(mi a, mi b) { return mi((ll)a.v*b.v); }
mi& operator*=(mi& a, mi b) { return a = a*b; }
mi pow(mi a, ll p) { assert(p >= 0); // asserts are important!
return p==0?1:pow(a*a,p/2)*(p&1?a:1); }
mi inv(mi a) { assert(a.v != 0); return pow(a,MOD-2); }
mi operator/(mi a, mi b) { return a*inv(b); }

vector<mi> factorial;

void solve() {
bool send{}, recei{};
int in_deg{}, out_deg{}, deg{};

int N,K; cin >> N >> K;
for (int i = 1; i <= N; ++i) {
char c; cin >> c;
if (c == 'S') send[i] = 1;
if (c == 'R') recei[i] = 1;
}

vector<int> e_start, e_end;
for (int i = 1; i <= N; ++i)
for (int j = 1; j <= N; ++j) {
char c; cin >> c;
if (c == '1') {
++out_deg[i], ++in_deg[j];
if (i > j) {
e_start.push_back(i);
e_end.push_back(j);
}
}
}
while (e_start.size() < 2) {
e_start.push_back(0);
e_end.push_back(0);
}
for (int i = 1; i <= N; ++i) {
assert(in_deg[i]+send[i] == out_deg[i]+recei[i]);
deg[i] = in_deg[i]+send[i];
}
factorial.resize(N+1); factorial = 1;
for (int i = 1; i <= N; ++i) factorial[i] = i*factorial[i-1];

mi prod_deg = 1;
for (int i = 1; i <= N; ++i) prod_deg *= factorial[deg[i]];
auto get_path_1 = [&](int st, int en) -> mi {
if (st == 0 || en == 0 || en > st) return 0;
vector<mi> dp(N+1);
dp[en] = 1;
for (int i = en; i <= st; i++) {
dp[i] = dp[i]/max(deg[i],1);
for (int j = i+1; j <= st; j++) {
dp[j] += dp[i];
}
}
}
return dp[st];
};
auto get_path_2 = [&](int st1, int en1, int st2, int en2) -> mi {
return get_path_1(st1, en1) * get_path_1(st2, en2);
};
mi ans =1;
ans -= get_path_1(e_start,e_end);
ans -= get_path_1(e_start,e_end);
ans += get_path_2(e_start,e_end,e_start,e_end);
ans -= get_path_2(e_start,e_end,e_start,e_end);
ans *= prod_deg;
cout << ans.v << "\n";
}

int main() {
int T; cin >> T;
while (T--) solve();
}
```

Regarding the O(N2)O(N2) solution for K=2K=2 above, note that the answer is actually the product of prod_degprod_deg and a determinant:prod_deg⋅det[1−get_path_1(estart,eend)−get_path_1(estart,eend)−get_path_1(estart,eend)1−get_path_1(estart,eend)].prod_deg⋅det[1−get_path_1(estart,eend)−get_path_1(estart,eend)−get_path_1(estart,eend)1−get_path_1(estart,eend)].

In general, we need to compute K2K2 values of get_path_1get_path_1 and take the determinant of a K×KK×K matrix, which can be done in O(N2K+K3)O(N2K+K3) time.

Alternatively, suppose that we combine the start and end nodes into a single node nspecialnspecial. Now every node in the graph has equal in-degree as out-degree, and the number of routing schemes is equal to the number of Eulerian circuits in this graph divided by (deg(nspecial)−1)!(deg(nspecial)−1)!. The number of Eulerian circuits in this graph is given by the BEST theorem, which involves multiplying the determinant of an N×NN×N matrix by some factorials. This can be done in O(min(N3,KN2))O(min(N3,KN2)), where the KN2KN2 term comes from the observation that the matrix we are taking the determinant of has nonzero entries along its main diagonal and is almost upper triangular, with the exception of KK entries below the main diagonal.