Inspecting HTTP Response Headers Without Downloading Body with Guzzle
In a recent project, the author needed to inspect the HTTP response headers of a large file download to determine if it was necessary to download the file. They specifically wanted to check the ETag header to see if it had changed since a previous download. However, simply issuing a HEAD request to the URL didn't work in this case, as some of the URLs were signed S3 URLs that only allowed GET requests.
To solve this problem, the author turned to Guzzle, a popular PHP HTTP client. They discovered the on_headers request option, which allows you to inspect the response headers before the body of the response is downloaded. By throwing a custom exception inside the on_headers callable, they were able to abort the downloading of the response body.
Here's an example of how the getHeaders() function was implemented:
    use GuzzleHttp\Client;
    use GuzzleHttp\Exception\RequestException;

    // BlockResponseBodyDownload is a custom exception class defined elsewhere
    // (any empty class extending \Exception will do).

    function getHeaders($url)
    {
        $client = new Client();
        // Filled in by the on_headers callable below
        $responseWithOnlyHeaders = null;
        try {
            $client->get($url, [
                // Capture by reference: $this is not available in a plain function
                'on_headers' => function ($response) use (&$responseWithOnlyHeaders) {
                    // Save the response with only headers for later use
                    $responseWithOnlyHeaders = $response;
                    // Abort the downloading of the response body
                    throw new BlockResponseBodyDownload();
                },
            ]);
        } catch (RequestException $e) {
            // Ignore our own custom exception; re-throw everything else
            if (!($e->getPrevious() instanceof BlockResponseBodyDownload)) {
                throw $e;
            }
        }
        // Continue with the rest of the code, using $responseWithOnlyHeaders...
    }
Note that the BlockResponseBodyDownload exception is a custom exception used specifically for this purpose. Guzzle wraps any exception thrown from the on_headers callable in a RequestException, so by checking the previous exception of the RequestException, the author was able to differentiate between their custom exception and Guzzle's native exceptions.
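The previous-exception check can be tried in isolation, without Guzzle. The following is a minimal sketch under assumptions: the article names BlockResponseBodyDownload but does not show its definition, so the parent class is a guess, isDeliberateAbort() is a hypothetical helper, and a plain RuntimeException stands in for Guzzle's RequestException wrapper.

```php
<?php

// Marker exception: it carries no data and only signals "headers are enough".
// Assumption: the article does not show this class, so the parent is a guess.
class BlockResponseBodyDownload extends \RuntimeException
{
}

// Hypothetical helper: true when $e merely wraps our deliberate abort.
function isDeliberateAbort(\Throwable $e): bool
{
    return $e->getPrevious() instanceof BlockResponseBodyDownload;
}

// Stand-in for Guzzle's wrapping: the marker is passed as the "previous" exception.
$wrapped = new \RuntimeException('on_headers failed', 0, new BlockResponseBodyDownload());
// An unrelated failure has no marker attached.
$unrelated = new \RuntimeException('Connection refused');

var_dump(isDeliberateAbort($wrapped));   // bool(true)
var_dump(isDeliberateAbort($unrelated)); // bool(false)
```

Keeping the marker class empty means the check depends only on its type, so an unrelated failure cannot be mistaken for the deliberate abort.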
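One caveat the author discovered is that Guzzle no longer automatically follows redirects with this approach: the on_headers callable also runs for the 3xx response itself, and the exception aborts the request before the redirect can be followed. To handle redirects, they had to manually implement the redirect logic.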
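The article does not show the author's redirect handling, so the following is only one possible sketch. It assumes Composer autoloading is already set up, and the function name getHeadersFollowingRedirects(), the redirect limit, and the list of status codes treated as redirects are all illustrative choices rather than the author's code.

```php
<?php
// Assumes Composer's autoloader has been loaded (require 'vendor/autoload.php').

use GuzzleHttp\Client;
use GuzzleHttp\Exception\RequestException;
use GuzzleHttp\Psr7\Uri;
use GuzzleHttp\Psr7\UriResolver;
use Psr\Http\Message\ResponseInterface;

// Marker exception, as described in the article (definition assumed).
class BlockResponseBodyDownload extends \RuntimeException
{
}

// Hypothetical helper: fetch headers only, following redirects by hand.
function getHeadersFollowingRedirects(Client $client, string $url, int $maxRedirects = 5): ResponseInterface
{
    for ($hop = 0; $hop <= $maxRedirects; $hop++) {
        $captured = null;

        try {
            $client->get($url, [
                // Redirects are handled by the loop, not by Guzzle.
                'allow_redirects' => false,
                'on_headers' => function (ResponseInterface $response) use (&$captured) {
                    $captured = $response;
                    throw new BlockResponseBodyDownload();
                },
            ]);
        } catch (RequestException $e) {
            // Only swallow the deliberate abort.
            if (!($e->getPrevious() instanceof BlockResponseBodyDownload)) {
                throw $e;
            }
        }

        if ($captured === null) {
            throw new \RuntimeException('No response headers were captured.');
        }

        $isRedirect = in_array($captured->getStatusCode(), [301, 302, 303, 307, 308], true);
        if (!$isRedirect || !$captured->hasHeader('Location')) {
            return $captured;
        }

        // Location may be relative, so resolve it against the current URL.
        $url = (string) UriResolver::resolve(new Uri($url), new Uri($captured->getHeaderLine('Location')));
    }

    throw new \RuntimeException("Gave up after {$maxRedirects} redirects.");
}
```

Passing the Client in as an argument keeps the helper easy to exercise with a mocked handler. The hop limit guards against redirect loops.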
This solution allowed the author to inspect the response headers without downloading the response body, saving time and resources when dealing with large file downloads.
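Once the headers are available, the ETag check that motivated this approach might look like the sketch below. The helper name needsDownload() and the decision to download whenever either value is missing are assumptions, not something the article specifies. The comparison ignores a leading W/ weak-validator prefix.

```php
<?php

// Hypothetical helper: compare the ETag stored from the previous download
// with the one just received, and report whether a new download is needed.
function needsDownload(?string $storedEtag, ?string $currentEtag): bool
{
    // Without both values there is nothing to compare, so download to be safe.
    if ($storedEtag === null || $storedEtag === '' || $currentEtag === null || $currentEtag === '') {
        return true;
    }

    // Strip an optional weak-validator prefix before comparing.
    $normalize = static function (string $etag): string {
        $etag = trim($etag);
        return strpos($etag, 'W/') === 0 ? substr($etag, 2) : $etag;
    };

    return $normalize($storedEtag) !== $normalize($currentEtag);
}

var_dump(needsDownload('"abc"', '"abc"'));   // bool(false)
var_dump(needsDownload('"abc"', 'W/"abc"')); // bool(false)
var_dump(needsDownload('"abc"', '"def"'));   // bool(true)
```

With a PSR-7 response, the current value can be read with $response->getHeaderLine('ETag'), which returns an empty string when the header is absent. That is why the helper treats an empty string like a missing value.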