Welcome to thinklikeamage.com

Here you'll find in-depth articles and information gathered throughout the author's professional software development career.

Overview

Compression is a common optimization that, in most cases, reduces the amount of data sent over the wire to the client. It is so common, in fact, that it should come as no surprise that Nginx supports it out of the box. What may come as a surprise to some readers, however, is that Nginx offers not one compression-related module but three. The versatility of these modules has helped Nginx earn its reputation as one of the best tools in the industry for serving static assets over the web.

Compression

The author assumes that the audience is already well acquainted with the theory and general implementation of compression. Specifically, Nginx relies on the zlib compression library, which implements the gzip format and is installed on virtually all Linux-based systems by default. Gzip is the de facto standard for well-rounded compression in terms of both size reduction and speed. It supports nine levels of compression, which lets the user trade speed against compression ratio. Nginx allows the compression level to be tuned to suit the hardware specifications of the server, the type of content, the expected load, and response time expectations.

ngx_http_gzip_module

The ngx_http_gzip_module consists of nine directives and a single embedded variable that may prove useful for gathering compression statistics. The following sections describe the directives and offer further insight into the effects of changing their values from the defaults.

gzip

The gzip directive either enables or disables gzip processing of responses in any (or all) of the http, server, or location contexts. While this might seem obvious at a glance, there are a few situations in which this directive can sap performance if combined carelessly with other directives. One notable example is serving pre-compressed content without first filtering by MIME type, which leads to redundantly compressing data that is already compressed.

The gzip directive defaults to off and should be explicitly enabled in almost all cases. From experience, the directive is usually enabled in the http context so that it covers all server and location contexts. There may be some unique cases that go against this rule of thumb:

  • Legacy / unique applications where it is known for certain that absolutely no clients support gzip encoding.
  • Applications that only serve assets that do not benefit from gzip compression such as compressed archives or optimized images.
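The rule of thumb and its exceptions can be sketched as a configuration fragment. This is a minimal sketch, not a complete nginx.conf: the host name and the /downloads/ location are hypothetical and stand in for any location that serves only already-compressed files.

```nginx
http {
    # Enable on-the-fly compression for every server and location below.
    gzip on;

    server {
        listen 80;
        server_name example.com;   # hypothetical host name

        # Hypothetical location that serves only compressed archives;
        # compressing them again would cost CPU time for little or no gain.
        location /downloads/ {
            gzip off;
        }
    }
}
```

Because gzip is inherited from the enclosing context, the location-level gzip off overrides the http-level setting only for that location.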

gzip_buffers

This directive is probably the least publicized of all the ngx_http_gzip_module directives, likely because of how involved the process of properly tuning the buffers is. The gzip_buffers directive takes two arguments: number, the maximum number of buffers that Nginx is allowed to allocate to compress a response, and size, the size of each buffer (for example, 4k). By default, size equals the system memory page size, which is either 4k or 8k depending on the platform. On platforms with 4k memory pages the default is 32 buffers, whereas on platforms with 8k memory pages the default is 16 buffers. The gzip_buffers directive may appear in any (or all) of the http, server, and location contexts.

Depending on the hardware specifications of the system, gzip_buffers can be configured to favor either lower processing overhead or more efficient memory usage. Setting a higher size and a lower number means that space may be wasted when a large buffer is allocated for small content. Conversely, setting a lower size and a higher number results in more efficient memory usage at the expense of increased processing costs. For this reason, it may be beneficial to log the $gzip_ratio embedded variable alongside the response size in order to estimate the average transfer size, and then adjust the buffer size to make as few allocations as possible without wasting excessive memory.
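As an illustration of the syntax only, the following sketch sets the buffers explicitly. The values are assumptions chosen for the example (they happen to match the default on platforms with 8k pages) and are not a tuning recommendation.

```nginx
http {
    gzip on;

    # Allow up to 16 buffers of 8k each (128k in total) when
    # compressing a single response.
    gzip_buffers 16 8k;
}
```

Any change from the defaults is best validated against real traffic, since the right balance depends on the typical size of the compressed responses.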

gzip_comp_level

The gzip_comp_level directive takes a single argument, level, which trades the speed of compression against the compression ratio. Acceptable values for level range from 1 to 9, where 1 is the fastest setting with the lowest compression ratio and 9 is the slowest setting with the highest compression ratio.

One noteworthy point about the gzip_comp_level directive is that the Nginx default compression level does not match the GNU default.  By default, Nginx sets the gzip_comp_level to 1 whereas GNU gzip sets the level to 6.  It is generally recommended to follow suit with the GNU default of 6 unless there is a reason to adjust it.
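A minimal sketch of the recommendation above, raising the level from the Nginx default of 1 to the GNU gzip default of 6:

```nginx
http {
    gzip on;

    # Match the GNU gzip default instead of the Nginx default of 1.
    gzip_comp_level 6;
}
```

The text-compression figures later in this article suggest that the gains beyond the middle levels are small, so higher values mostly cost CPU time.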

gzip_disable

The gzip_disable directive takes one or more regular expressions as arguments. Requests whose User-Agent header matches any of them are excluded from normal response compression. This allows for supporting older browsers, proxy servers, and firewalls that cannot be relied upon to handle gzip-compressed responses correctly, regardless of the Accept-Encoding request header they send.

There also exists a special value for this directive, msie6, which corresponds to the regular expression MSIE [4-6]\. but is handled without regular expression matching and is therefore faster. Unless there are additional special circumstances, it is generally recommended to use msie6 as the value for this directive.
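The two forms can be combined in a single directive. In this sketch the second pattern, LegacyProxy/1\., is a hypothetical user agent invented purely for illustration.

```nginx
http {
    gzip on;

    # "msie6" is the special mask described above; the second argument is
    # a hypothetical regular expression matched against User-Agent.
    gzip_disable "msie6" "LegacyProxy/1\.";
}
```

If no additional user agents need to be excluded, gzip_disable "msie6"; on its own is sufficient.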

gzip_http_version

The gzip_http_version directive sets the minimum HTTP protocol version that a request must use for its response to be compressed. At the time of this writing, the current HTTP version is 1.1, which also happens to be the default value for this directive.

In some circumstances it may make sense to specify 1.0, thus enabling compression on HTTP/1.0 connections. This comes at a subtle cost, however: the Content-Length header will not be set, which makes keep-alive connections impossible. [1] Whether to trade keep-alive potential for response compression must be decided on a case-by-case basis, weighing the content being served against the traffic patterns of the clients.
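A minimal sketch of lowering the threshold so that HTTP/1.0 requests also receive compressed responses, with the keep-alive trade-off described above:

```nginx
http {
    gzip on;

    # Also compress responses to HTTP/1.0 requests; such responses
    # carry no Content-Length, so those connections cannot be kept alive.
    gzip_http_version 1.0;
}
```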

gzip_proxied

The gzip_proxied directive allows for specialized treatment of requests that bear a Via HTTP header field (that is, requests that appear to have come from a proxy). Normally this directive comes into play when Nginx is used as the origin server behind one or more caching reverse proxies. It is generally advisable not to compress any response going back to a proxy if that response will be cached on the proxy for a time. Consider the following setup:

[Figure: topology]

Consider for a moment that in the above setup the origin server (Nginx) was configured to compress the responses to all requests coming in from the CDN cluster. In the first scenario, "normal use", which consists of HTTP/1.1 requests that are normally cached by the CDN cluster, everything would function without any unnecessary overhead. The second scenario, a byte-range request, would however require one of the CDN nodes to decompress the cached response, look up the requested byte range, and then optionally re-compress the requested range before sending it back to the client. The third scenario, a client that does not support gzip encoding, would fall into the same situation, forcing the CDN to decompress the content before sending the response to the client.

By conditionally compressing responses based upon whether or not the CDN is expected to cache the result, it is possible to prevent the above scenarios. As a final note, if it is known ahead of time that the majority of requests will not fall under the second and third scenarios, then it may be beneficial to serve pre-compressed assets via the gzip_static directive. The gzip_static directive is covered later in this article under the ngx_http_gzip_static_module section.

There are several parameters that can be used in conjunction with each other to offer granular control over how proxied requests are handled.

off

The off parameter takes precedence over any and all others: no response to a proxied request will be compressed. The default value for gzip_proxied is off.

expired

Setting the expired parameter will compress the response to a proxied request if the response includes an Expires header with a value that disables caching.

no-cache

Setting the no-cache parameter will compress the response to a proxied request if the response includes a Cache-Control header with the no-cache parameter.

no-store

Setting the no-store parameter will compress the response to a proxied request if the response includes a Cache-Control header with the no-store parameter.

private

Setting the private parameter will compress the response to a proxied request if the response includes a Cache-Control header with the private parameter.

no_last_modified

Setting the no_last_modified parameter will compress the response to a proxied request if the response does not include the Last-Modified header.

no_etag

Setting the no_etag parameter will compress the response to a proxied request if the response does not include the ETag header.

auth

Setting the auth parameter will compress the response to a proxied request if the request includes the Authorization header.

any

Setting the any parameter will compress the response to any and all proxied requests.
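One way to combine these parameters, sketched here under the assumption that the upstream proxy honors the usual caching headers, is to compress only those responses that the proxy is not expected to cache:

```nginx
http {
    gzip on;

    # Compress proxied responses only when they are unlikely to be cached:
    # caching disabled through Expires or Cache-Control, or the request
    # carried an Authorization header.
    gzip_proxied expired no-cache no-store private auth;
}
```

Responses that the proxy can cache are then sent uncompressed, which avoids the decompression work described in the second and third scenarios above.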

gzip_types

The effectiveness of compression also varies widely depending on what is actually being compressed. For example, most images gain little to nothing from being compressed with gzip. The following table shows the size reduction achieved by compressing the logo of this site at compression levels 1-9:

Compression Level   Original Size   Compressed Size   Reduction
No Compression      535 Bytes       N/A               N/A
1                   535 Bytes       502 Bytes         6.17%
2                   535 Bytes       502 Bytes         6.17%
3                   535 Bytes       502 Bytes         6.17%
4                   535 Bytes       502 Bytes         6.17%
5                   535 Bytes       501 Bytes         6.36%
6                   535 Bytes       501 Bytes         6.36%
7                   535 Bytes       501 Bytes         6.36%
8                   535 Bytes       501 Bytes         6.36%
9                   535 Bytes       501 Bytes         6.36%

Text, on the other hand, can show a dramatic reduction:

Compression Level   Original Size     Compressed Size   Reduction
No Compression      1212073 Bytes     N/A               N/A
1                   1212073 Bytes     77357 Bytes       93.62%
2                   1212073 Bytes     76238 Bytes       93.71%
3                   1212073 Bytes     75635 Bytes       93.76%
4                   1212073 Bytes     69167 Bytes       94.29%
5                   1212073 Bytes     68001 Bytes       94.39%
6                   1212073 Bytes     67842 Bytes       94.40%
7                   1212073 Bytes     67795 Bytes       94.41%
8                   1212073 Bytes     66970 Bytes       94.47%
9                   1212073 Bytes     65918 Bytes       94.56%

This is where the gzip_types directive comes into play. This directive allows for conditional compression based upon the MIME type of the content being served.
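A minimal sketch of restricting compression to text-based content follows. The particular list of MIME types is an assumption made for illustration and should be adapted to the assets actually served. Responses of type text/html are always compressed when gzip is on, so that type does not need to be listed.

```nginx
http {
    gzip on;

    # Compress text-based formats only; text/html is always included.
    # Raster images and archives are left out because they are
    # already compressed.
    gzip_types text/plain text/css application/json
               application/javascript application/xml image/svg+xml;
}
```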

gzip_vary

The gzip_vary directive causes Nginx to insert the Vary: Accept-Encoding header into responses if any of the gzip, gzip_static, or gunzip directives are active. It is important to enable gzip_vary in such cases to ensure proper and efficient delivery of content to the client. If gzip_vary is not enabled, then there are a couple of detrimental things that can happen:

  1. If a caching solution is being utilized and a client that does not support compression falls into a cache-miss situation, then the cache has the potential to retain an uncompressed response which will cause all subsequent clients to be served uncompressed content.  This can cause excessive bandwidth usage and slow down content delivery.
  2. Alternatively, if a caching solution is being utilized and a client that does support compression falls into a cache-miss situation, then the cache has the potential to retain a compressed response which will cause any subsequent clients that do not support compression to be served compressed content.  This will result in a page full of garbled text for these users.

gzip_vary allows for caching multiple versions of content based upon the clients' Accept-Encoding capabilities.
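A minimal sketch of enabling the header alongside compression:

```nginx
http {
    gzip on;

    # Emit "Vary: Accept-Encoding" so that caches store compressed and
    # uncompressed variants of a response separately.
    gzip_vary on;
}
```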

$gzip_ratio

The $gzip_ratio embedded variable grants insight into the effectiveness of response compression and can be used in logging just as any other variable.

http {
    # Log the client address, the bytes sent, and the achieved compression ratio.
    log_format compression '$remote_addr | $body_bytes_sent | "$gzip_ratio"';

    server {
        ...
        access_log /spool/logs/nginx-access.log compression;
        ...
    }
}

[1] RFC 2068, Section 19.7.1, Compatibility with HTTP/1.0 Persistent Connections