Asked  7 Months ago    Answers:  4   Viewed   31 times

I'm testing an algorithm for 2d bin packing and I've chosen PHP to mock it up as it's my bread-and-butter language nowadays.

As you can see on http://themworks.com/pack_v0.2/oopack.php?ol=1 it works pretty well, but you need to wait around 10-20 seconds for 100 rectangles to pack. For some hard to handle sets it would hit the php's 30s runtime limit.

I did some profiling and it shows that most of the time my script goes through different parts of a small 2d array with 0's and 1's in it. It either checks if certain cell equals to 0/1 or sets it to 0/1. It can do such operations million times and each times it takes few microseconds.

I guess I could use an array of booleans in a statically typed language and things would be faster. Or even make an array of 1 bit values. I'm thinking of converting the whole thing to some compiled language. Is PHP just not good for it?

If I do need to convert it to let's say C++, how good are the automatic converters? My script is just a lot of for loops with basic arrays and objects manipulations.

Edit. This function gets called more than any other. It reads few properties of a very simple object, and goes through a very small part of a smallish array to check if there's any element not equal to 0.

function fits($bin, $w, $h, $x, $y) {

    $w += $x;
    $h += $y;

    for ($i = $x; $i < $w; $i++) {

        for ($j = $y; $j < $h; $j++) {

            if ($bin[$i][$j] !== 0) {
                return false;
            }
        }
    }

    return true;    
}

Update: I've tried using 1d array instead of 2d as one of the answers suggested. Since I needed to always have current bin width accessible, I decided to wrap everything in the object. Also, now in every loop the index needs to be calculated. Now the script takes even more time to run. Other techniques didn't bring much performance boost, rather made code less readable. Time for HipHop I guess.

Update: since hiphop php only runs on linux, and I don't have one, I've decided to rewrite the whole thing in C++. It's nice to freshen up the old skills. Also, if I do find a way to use hiphop, it'll be interesting to compare hand-written C++ code and the one hiphop would generate.

Update: I rewrote this thing in c++, on average it works 20 times faster and uses much less memory. Let me see if I can make it even faster.

 Answers

72

Array access in PHP can certainly be slow. PHP uses hash tables to implement arrays, i.e. in order to access an element in an array it has to calculate a hash and traverse a linked list. Using a compiled language with real arrays will definitely improve performance, because there a direct memory access is made. For the interested: Code for hash access with string and with integer.

Concerning your code, there are several points I would optimize:

  • return directly, don't break twice.
  • put $file->get_width() and $file->get_height into simple variables. I assume that the height or width doesn't change throughout the process. Remember: Functions in PHP are slow.
  • Use a one-dimensional array, instead of nested arrays. You save one hash lookup per iteration that way. Actually a one-dimensional array is only marginally faster or even slightly slower. Comparison of several ways of saving the data concerning performance and memory usage.

.

function fits($bin, $x, $y, $w, $h) {
    $w += $x;
    $h += $y;

    for ($i = $x; $i < $w; ++$i) {
        for ($j = $y; $j < $h; ++$j) {
            if ($bin[$i][$j] !== 0) {
                return false;
            }
        } 
    }

    return true;   
}

Though I'm not sure, why you add $x to the $width / $y to the $height. Don't you want to iterate from the current coordinates to the image boundaries?

Wednesday, March 31, 2021
 
DilbertDave
answered 7 Months ago
96

My personal opinion is to use what makes sense in the context. Personally I almost never use for for array traversal. I use it for other types of iteration, but foreach is just too easy... The time difference is going to be minimal in most cases.

The big thing to watch for is:

for ($i = 0; $i < count($array); $i++) {

That's an expensive loop, since it calls count on every single iteration. So long as you're not doing that, I don't think it really matters...

As for the reference making a difference, PHP uses copy-on-write, so if you don't write to the array, there will be relatively little overhead while looping. However, if you start modifying the array within the array, that's where you'll start seeing differences between them (since one will need to copy the entire array, and the reference can just modify inline)...

As for the iterators, foreach is equivalent to:

$it->rewind();
while ($it->valid()) {
    $key = $it->key();     // If using the $key => $value syntax
    $value = $it->current();

    // Contents of loop in here

    $it->next();
}

As far as there being faster ways to iterate, it really depends on the problem. But I really need to ask, why? I understand wanting to make things more efficient, but I think you're wasting your time for a micro-optimization. Remember, Premature Optimization Is The Root Of All Evil...

Edit: Based upon the comment, I decided to do a quick benchmark run...

$a = array();
for ($i = 0; $i < 10000; $i++) {
    $a[] = $i;
}

$start = microtime(true);
foreach ($a as $k => $v) {
    $a[$k] = $v + 1;
}
echo "Completed in ", microtime(true) - $start, " Secondsn";

$start = microtime(true);
foreach ($a as $k => &$v) {
    $v = $v + 1;
}
echo "Completed in ", microtime(true) - $start, " Secondsn";

$start = microtime(true);
foreach ($a as $k => $v) {}
echo "Completed in ", microtime(true) - $start, " Secondsn";

$start = microtime(true);
foreach ($a as $k => &$v) {}    
echo "Completed in ", microtime(true) - $start, " Secondsn";

And the results:

Completed in 0.0073502063751221 Seconds
Completed in 0.0019769668579102 Seconds
Completed in 0.0011849403381348 Seconds
Completed in 0.00111985206604 Seconds

So if you're modifying the array in the loop, it's several times faster to use references...

And the overhead for just the reference is actually less than copying the array (this is on 5.3.2)... So it appears (on 5.3.2 at least) as if references are significantly faster...

Wednesday, March 31, 2021
 
muaddhib
answered 7 Months ago
72

I would just code it like:

$c = $a;
foreach ($b as $removeKey) {
    unset($c[$removeKey]);
}
Saturday, May 29, 2021
 
Dail
answered 5 Months ago
17

After quite some fiddling, I came up with the following code.

The idea is to identify conflicting elements and swap them to a column where they are no longer a problem. For cases where this is not applicable a random selection is done. The code works recursive and thus there are edge-cases where it takes very long to complete.

An extreme edge-case is an input where all rows consist of exactly the same values.

<?php
declare(strict_types=1);

final class SwapSolver
{
    /**
     * @param array $input
     *
     * @return array
     */
    public function solve(array $input): array
    {
        $input = array_values($input);

        return $this->swapDuplicates($this->prepare($input, $this->getMinRowLength($input)));
    }

    /**
     * @param array $input
     *
     * @return array
     */
    private function swapDuplicates(array $input): array
    {
        $unswappable = [];

        foreach ($this->duplicates($input) as $position) {
            list($r, $a) = $position;

            $swapped = false;
            foreach ($this->swapCandidates($input, $r, $a, true) as $b) {
                $input[$r] = $this->swap($input[$r], $a, $b);
                $swapped = true;
                break;
            }

            if (!$swapped) {
                $unswappable[] = $position;
            }
        }

        // still unswappable
        $unswappable = array_values(array_filter($unswappable, function (array $position) use ($input): bool {
            return $this->isDuplicate($input, ...$position);
        }));

        // tie breaker
        if (count($unswappable) > 0) {
            list($r, $a) = $unswappable[mt_rand(0, count($unswappable) - 1)];

            $candidates = [];
            foreach ($this->swapCandidates($input, $r, $a, false) as $b) {
                $candidates[] = $b;
            }

            $input[$r] = $this->swap($input[$r], $a, $candidates[mt_rand(0, count($candidates) - 1)]);

            return $this->swapDuplicates($input);
        }

        return $input;
    }

    /**
     * @param array $input
     *
     * @return Generator
     */
    private function duplicates(array &$input): Generator
    {
        foreach ($input as $r => $row) {
            foreach ($row as $c => $value) {
                if ($this->isDuplicate($input, $r, $c)) {
                    yield [$r, $c];
                }
            }
        }
    }

    /**
     * @param array $input
     * @param int   $row
     * @param int   $column
     *
     * @return bool
     */
    private function isDuplicate(array $input, int $row, int $column): bool
    {
        $candidate = $input[$row][$column];

        if (is_null($candidate)) {
            return false;
        }

        foreach (array_column($input, $column) as $r => $value) {
            if ($r !== $row && $value === $candidate) {
                return true;
            }
        }

        return false;
    }

    /**
     * @param array $input
     * @param int   $row
     * @param int   $column
     * @param bool  $strict
     *
     * @return Generator
     */
    private function swapCandidates(array &$input, int $row, int $column, bool $strict): Generator
    {
        foreach ($input[$row] as $c => $dst) {
            if ((!$strict || !in_array($input[$row][$column], array_column($input, $c), true))
                && (is_null($dst) || !in_array($dst, array_column($input, $column), true))
            ) {
                yield $c;
            }
        }
    }

    /**
     * @param array $row
     * @param int   $from
     * @param int   $to
     *
     * @return array
     */
    private function swap(array $row, int $from, int $to): array
    {
        $tmp = $row[$to];
        $row[$to] = $row[$from];
        $row[$from] = $tmp;

        return $row;
    }

    /**
     * @param array $input
     * @param int   $padSize
     *
     * @return array
     */
    private function prepare(array $input, int $padSize): array
    {
        return array_map(function (array $row) use ($padSize): array {
            $row = array_pad($row, $padSize, null);
            shuffle($row);

            return $row;
        }, $input);
    }

    /**
     * @param array $input
     *
     * @return int
     */
    private function getMinRowLength(array $input): int
    {
        return max(
            ...array_values(array_count_values(array_merge(...$input))),
            ...array_map('count', $input)
        );
    }
}

Usage:

<?php
$solver = new SwapSolver();
$solution = $solver->solve($input);

More code at: https://github.com/Yoshix/so-47261385

Sunday, August 8, 2021
 
DHranger
answered 3 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :