What is output buffering and why would one use it in PHP?
Output Buffering for Web Developers, a Beginner’s Guide:
Without output buffering (the default), your HTML is sent to the browser in pieces as PHP works through your script. With output buffering, your HTML is held in a buffer and sent to the browser as one piece at the end of your script.
Advantages of output buffering for Web developers
- Turning on output buffering alone can decrease the time it takes to download and render your HTML, because it is not sent to the browser in pieces as PHP processes it.
- All the fancy stuff we can do with PHP strings, we can now do with our whole HTML page as one variable.
- If you've ever encountered the message "Warning: Cannot modify header information - headers already sent by (output)" while setting cookies, you'll be happy to know that output buffering is your answer.
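A minimal sketch of the second point, treating the whole page as one string. The helper name capture_output() and the {{title}} placeholder are made up for illustration; only ob_start(), ob_get_clean(), ob_end_clean() and str_replace() are PHP's own.

```php
<?php
// Hypothetical helper: runs a callable and returns everything it echoed.
function capture_output(callable $render): string
{
    ob_start();                    // start buffering; nothing reaches the client yet
    try {
        $render();
    } catch (Throwable $e) {
        ob_end_clean();            // discard the partial output before rethrowing
        throw $e;
    }
    return (string) ob_get_clean(); // return the buffer contents and close the buffer
}

$page = capture_output(function (): void {
    echo "<h1>{{title}}</h1>\n";
    echo "<p>Hello</p>\n";
});

// The whole page is now an ordinary string, so any string function applies.
$page = str_replace('{{title}}', 'Welcome', $page);
```

Sending the result is then a single echo $page; at the point where you are done setting headers and cookies.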
The output of your application should use only one content encoding. If you have multiple chunks that are encoded differently, the browser gets a result it cannot work with, hence the encoding error.
Kohana itself already makes use of output buffering. If you want to combine that with your ob_gzhandler output buffer, you need to start your buffer before Kohana initializes its own. That's because output buffers are stackable. When Kohana has finished its output buffering, yours will apply:
ob_start('ob_gzhandler'); # your buffer
# ... the ob_start() and matching end calls made by Kohana
So whenever Kohana has produced some output, these chunks are passed on to your output callback (ob_gzhandler()) and are gz-encoded.
The browser should then only get gz-encoded data, because yours is the outermost output buffer.
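The nesting can be tried without Kohana or gzip. The sketch below is a stand-in built on assumptions: an uppercasing callback plays the role of ob_gzhandler, a plain inner buffer plays the role of the framework, and an extra outermost buffer exists only to capture the result for inspection.

```php
<?php
ob_start();                                   // capture buffer, only for this demonstration

// "Your" buffer, started first. The closure is a hypothetical stand-in for ob_gzhandler.
ob_start(function (string $buffer): string {
    return strtoupper($buffer);
});

ob_start();                                   // stand-in for the framework's own buffer
echo 'framework output';
ob_end_flush();                               // framework is done: its output moves into your buffer

ob_end_flush();                               // your callback now processes everything at once
$final = ob_get_clean();                      // what would have reached the browser
// $final is 'FRAMEWORK OUTPUT'
```

The order matters: had the callback buffer been started after the inner one, the inner buffer's output would not have passed through it.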
Using ob_gzhandler and manually echo'ing the buffer
If you use ob_start('ob_gzhandler') to let PHP deal with the compression and you then echo ob_get_clean(), you will produce unreliable output. That's related to how compression works together with output buffering:
PHP will buffer chunks of output. That means PHP starts to compress the output but holds back some bytes so it can continue compressing. So ob_get_clean() returns only the part of the buffer compressed so far, and that result is often incomplete.
To deal with that, flush the buffer first:
ob_start('ob_gzhandler') OR ob_start();
echo 'eh?';
ob_flush();
$gz = ob_get_clean();
echo $gz;
And ensure you don't have any more output after that.
If PHP had reached the end of your script, it would have taken care of that itself: flushing and outputting.
Now you need to call ob_flush() manually to explicitly make PHP push the buffer through the callbacks.
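To see what "push the buffer through the callbacks" means in isolation, here is a small sketch under stated assumptions: a bracket-wrapping closure stands in for ob_gzhandler (it is not a compressor), and the outermost buffer is only there to record what would have been sent.

```php
<?php
ob_start();                                   // outermost buffer, records what gets "sent"

// Hypothetical stand-in for ob_gzhandler: marks everything that passes through it.
ob_start(function (string $buffer): string {
    return '[' . $buffer . ']';
});

echo 'eh?';
ob_flush();                  // runs the callback on "eh?" and passes the result outward
$leftover = ob_get_clean();  // raw contents still in the buffer: empty after the flush
$sent     = ob_get_clean();  // what actually went through the callback
// $leftover is '' and $sent is '[eh?]'
```

Note that ob_get_clean() hands back the raw buffer contents, so anything that has to go through the callback must be flushed before that call.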
Inspecting HTTP Compression Problems with Curl
As Firefox will only return an error, another tool is needed to inspect what's causing the encoding error. You can use curl to track what's going on:
curl --compressed -i URL
This requests the URL with compression enabled and displays all response headers plus the decoded body. This is necessary because PHP transparently enables or disables compression in the ob_gzhandler callback based on the request headers.
The response also shows that PHP sets the needed response headers itself, so there is no need to specify them manually. Doing so would even be dangerous, because calling ob_start('ob_gzhandler') alone does not tell you whether compression is enabled or not.
If the compression is broken, curl will give an error description but will not display the body.
The following is such a curl error message, provoked by incompletely generated output from a faulty PHP script:
HTTP/1.1 200 OK
X-Powered-By: PHP/5.3.6
Content-Encoding: gzip
...

curl: (23) Error while processing content unencoding: invalid code lengths set
By adding the --raw switch, you can even peek into the raw response body:
curl --compressed --raw -i URL
That can give an impression of what's going wrong, such as uncompressed parts within the body.
Call ob_start() at the start of the script, before any output (not even an empty space).
When you want to output, use
The main differences:
1.) You can use "normal" output syntax, for example an echo statement. You don't have to rewrite your code.
2.) You have better control over the buffering, since buffers can be stacked. You don't have to know about naming conventions and the like, which makes implementations easier where the writing side and the using side are implemented separately from each other.
3.) No additional logic is required to output buffered content, you just flush. This is especially interesting if the output stream is something special. Why burden the controlling scope with dealing with that?
4.) You can use the same output implementation regardless of whether an output buffer has been created. This is a question of transparency.
5.) You can "catch" accidentally emitted output, such as warnings, and simply swallow it afterwards.
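Point 5 can be sketched as follows. The function noisy_legacy_function() is invented for the example and simply stands for any code that echoes something you did not ask for.

```php
<?php
// Hypothetical function that leaks debug output while doing its real job.
function noisy_legacy_function(): int
{
    echo "debug: computing...\n";  // stray output we do not want in the page
    return 42;
}

ob_start();                        // catch whatever gets echoed from here on
$value = noisy_legacy_function();
$stray = ob_get_clean();           // swallow it; nothing reaches the client

// $value holds the return value, $stray the captured text, which can be
// logged, inspected, or simply dropped.
```

Because the function keeps using plain echo, it works the same whether or not a buffer is active around it, which is the transparency mentioned in point 4.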
Output buffers are stackable, that is, you may call ob_start() while another ob_start() is active. Just make sure that you call ob_end_flush() the appropriate number of times. If multiple output callback functions are active, output is being filtered sequentially through each of them in nesting order.
So it is perfectly valid to assume that an ob_end/get call will end/return the buffer opened by the matching ob_start():
ob_start();
echo "<div class=outer>";
ob_start();
echo "<div class=inner></div>";
$inner = ob_get_clean(); // <div class=inner></div>
echo "</div>";
$outer = ob_get_clean(); // <div class=outer></div>