# Finding cartesian product with PHP associative arrays

Say that I have an array like the following:

``````Array
(
[arm] => Array
(
 => A
 => B
 => C
)
[gender] => Array
(
 => Female
 => Male
)
[location] => Array
(
 => Vancouver
 => Calgary
)
)
``````

How can I find the cartesian product while preserving the keys of the outer associative array and using them in the inner ones? The result of the algorithm should be this:

``````Array
(
 => Array
(
[arm] => A
[gender] => Female
[location] => Vancouver
)

 => Array
(
[arm] => A
[gender] => Female
[location] => Calgary
)

 => Array
(
[arm] => A
[gender] => Male
[location] => Vancouver
)

...etc.
``````

I've looked up quite a number of cartesian product algorithms but I'm getting stuck on the specifics of how to preserve the associative keys. The current algorithm I am using gives numerical indices only:

``````    \$result = array();
foreach (\$map as \$a) {
if (empty(\$result)) {
\$result = \$a;
continue;
}
\$res = array();
foreach (\$result as \$r) {
foreach (\$a as \$v) {
\$res[] = array_merge((array)\$r, (array)\$v);
}
}
\$result = \$res;
}

print_r(\$result);
``````

Any help would be appreciated.

62

Here's a solution I wouldn't be ashamed to show.

## Rationale

Assume that we have an input array `\$input` with `N` sub-arrays, as in your example. Each sub-array has `Cn` items, where `n` is its index inside `\$input`, and its key is `Kn`. I will refer to the `i`th item of the `n`th sub-array as `Vn,i`.

The algorithm below can be proved to work (barring bugs) by induction:

1) For N = 1, the cartesian product is simply `array(0 => array(K1 => V1,1), 1 => array(K1 => V1,2), ... )` -- C1 items in total. This can be done with a simple `foreach`.

2) Assume that `\$result` already holds the cartesian product of the first N-1 sub-arrays. The cartesian product of `\$result` and the Nth sub-array can be produced this way:

3) In each item (array) inside `\$product`, add the value `KN => VN,1`. Remember the resulting item (with the added value); I 'll refer to it as `\$item`.

4a) For each array inside `\$product`:

4b) For each value in the set `VN,2 ... VN,CN`, add to `\$product` a copy of `\$item`, but change the value with the key `KN` to `VN,m` (for all `2 <= m <= CN`).

The two iterations 4a (over `\$product`) and 4b (over the Nth input sub-array) ends up with `\$result` having `CN` items for every item it had before the iterations, so in the end `\$result` indeed contains the cartesian product of the first N sub arrays.

Therefore the algorithm will work for any N.

This was harder to write than it should have been. My formal proofs are definitely getting rusty...

## Code

``````function cartesian(\$input) {
\$result = array();

while (list(\$key, \$values) = each(\$input)) {
// If a sub-array is empty, it doesn't affect the cartesian product
if (empty(\$values)) {
continue;
}

// Seeding the product array with the values from the first sub-array
if (empty(\$result)) {
foreach(\$values as \$value) {
\$result[] = array(\$key => \$value);
}
}
else {
// Second and subsequent input sub-arrays work like this:
//   1. In each existing array inside \$product, add an item with
//      key == \$key and value == first item in input sub-array
//   2. Then, for each remaining item in current input sub-array,
//      add a copy of each existing array inside \$product with
//      key == \$key and value == first item of input sub-array

// Store all items to be added to \$product here; adding them
// inside the foreach will result in an infinite loop
\$append = array();

foreach(\$result as &\$product) {
// Do step 1 above. array_shift is not the most efficient, but
// it allows us to iterate over the rest of the items with a
// simple foreach, making the code short and easy to read.
\$product[\$key] = array_shift(\$values);

// \$product is by reference (that's why the key we added above
// will appear in the end result), so make a copy of it here
\$copy = \$product;

// Do step 2 above.
foreach(\$values as \$item) {
\$copy[\$key] = \$item;
\$append[] = \$copy;
}

// Undo the side effecst of array_shift
array_unshift(\$values, \$product[\$key]);
}

// Out of the foreach, we can add to \$results now
\$result = array_merge(\$result, \$append);
}
}

return \$result;
}
``````

## Usage

``````\$input = array(
'arm' => array('A', 'B', 'C'),
'gender' => array('Female', 'Male'),
'location' => array('Vancouver', 'Calgary'),
);

print_r(cartesian(\$input));
``````
Tuesday, June 1, 2021

46

Use crossing from the `tidyr` package:

``````x <- data.frame(x=c("a","b","c"))
y <- data.frame(y=c(1,2,3))

crossing(x, y)
``````

Result:

``````   x y
1 a 1
2 a 2
3 a 3
4 b 1
5 b 2
6 b 3
7 c 1
8 c 2
9 c 3
``````
Saturday, July 3, 2021

31

A hard code solution would be:

``````for (int a1 : {0}) {
for (int a2 : {0,1,2}) {
for (int a3 : {0,1,2,3}) {
for (int a4 : {0}) {
for (int a5 : {0,1}) {
do_job(a1, a2, a3, a4, a5);
}
}
}
}
}
``````

You may use the following for a generic way (putting all `a`s into vector):

``````bool increase(const std::vector<std::size_t>& v, std::vector<std::size_t>& it)
{
for (std::size_t i = 0, size = it.size(); i != size; ++i) {
const std::size_t index = size - 1 - i;
++it[index];
if (it[index] > v[index]) {
it[index] = 0;
} else {
return true;
}
}
return false;
}

void iterate(const std::vector<std::size_t>& v)
{
std::vector<std::size_t> it(v.size(), 0);

do {
do_job(it);
} while (increase(v, it));
}
``````

Live Demo

Friday, August 6, 2021

68

You can use the `Sets.cartesianProduct()` method from Google's Guava libraries to generate Cartesian products:

``````com.google.common.collect.Sets.cartesianProduct(Set[] yourSets)
``````

If only everything was that easy!

Saturday, August 7, 2021

97

After the precision in the question, here's a revised version. I'm keeping the previous answer since it can be useful too and uses a less complex order.

``````# yields the possible cartesian products of [first, *rest], where the total
# of the indices that are "distributed" is exactly +nb+ and each index doesn't
# go beyong +depth+, but at least one of them is exactly +depth+
def distribute(nb, depth, reached, first, *rest)
from  = [nb - rest.size * depth, 0].max
to    = [first.size-1, depth, nb].min
from.upto(to) do |i|
obj = first[i]
reached ||= i == depth
if rest.empty?
yield [obj] if reached
else
distribute(nb - i, depth, reached, *rest) do |comb|
yield [obj, *comb]
end
end
end
end

def depth_first_cartesian(*arrays)
lengths = arrays.map(&:length)
total = lengths.inject(:+)
lengths.max.times do |depth|
depth.upto(arrays.size * depth) do |nb|
distribute(nb, depth, false, *arrays) {|c| yield c}
end
end
end

p depth_first_cartesian([1, 2, 3], [1, 2, 3, 4], [1, 2, 3]).to_a
# => [[1, 1, 1], [1, 1, 2], [1, 2, 1], [2, 1, 1], [1, 2, 2], [2, 1, 2], [2, 2, 1], [2, 2, 2],
#     [1, 1, 3], [1, 3, 1], [3, 1, 1], [1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2],
#     [3, 2, 1], [1, 3, 3], [2, 2, 3], [2, 3, 2], [3, 1, 3], [3, 2, 2], [3, 3, 1], [2, 3, 3],
#     [3, 2, 3], [3, 3, 2], [3, 3, 3], [1, 4, 1], [1, 4, 2], [2, 4, 1], [1, 4, 3], [2, 4, 2],
#     [3, 4, 1], [2, 4, 3], [3, 4, 2], [3, 4, 3]]
``````
Saturday, November 27, 2021