The easiest way to compress the data served to the visitors of your web application is to use mod_deflate. Once you have enabled that module and given it a suitable configuration, it compresses all relevant files on the fly as it serves them.
Given that I was already going to minify my JavaScript and CSS files ahead of time (i.e. not using mod_pagespeed), I figured there must be a way to serve pre-gzipped files directly.
"Compiling" Static Files
I decided to treat my web application like a C program. After all, it starts as readable source code and ends up as an unreadable binary file.
So I created a Makefile to minify and compress all CSS and JavaScript files using YUI Compressor and gzip:
.PHONY: all build clean

all: build

# Minify foo.css to foo.css.css (and foo.js to foo.js.js), skipping files
# generated by an earlier run, then gzip each minified file to foo.css.gz.
build:
	find static/css -type f -name "[^.]*.css" ! -name "*.css.css" -execdir yui-compressor -o {}.css {} \;
	find static/js -type f -name "[^.]*.js" ! -name "*.js.js" -execdir yui-compressor -o {}.js {} \;
	cd static/css && for f in *.css.css ; do gzip -c "$$f" > "$$(basename "$$f" .css).gz" ; done
	cd static/js && for f in *.js.js ; do gzip -c "$$f" > "$$(basename "$$f" .js).gz" ; done

clean:
	find static/css -name "*.css.css" -delete
	find static/js -name "*.js.js" -delete
	find static/css -name "*.css.gz" -delete
	find static/js -name "*.js.gz" -delete
	find . -name "*.pyc" -delete
This leaves the original files intact and adds minified .css.css and .js.js files as well as minified and compressed .css.gz and .js.gz files.
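To make the naming scheme concrete, here is a minimal Python sketch of the same two steps for a single file. It only illustrates the convention and is not part of the Makefile: the function names `derived_names` and `compress` are made up for this example, and the minification step itself is left out.

```python
import gzip
from pathlib import Path
from typing import Tuple


def derived_names(source: Path) -> Tuple[Path, Path]:
    """Return the (minified, gzipped) paths generated for a source file."""
    # style.css -> style.css.css (the extension is repeated)
    minified = source.with_name(source.name + source.suffix)
    # style.css -> style.css.gz
    gzipped = source.with_name(source.name + ".gz")
    return minified, gzipped


def compress(minified: Path, gzipped: Path) -> None:
    """Rough equivalent of `gzip -c minified > gzipped`; the input is kept."""
    gzipped.write_bytes(gzip.compress(minified.read_bytes()))
```

For `static/js/app.js`, this gives `static/js/app.js.js` for the minified copy and `static/js/app.js.gz` for the compressed one.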
How browsers advertise gzip support
Conveniently, browsers that can receive gzipped content (almost all of them nowadays) say so by including the following HTTP header in their requests:
Accept-Encoding: gzip,deflate
(Incidentally, if you want to test what browsers without gzip support see, just browse to about:config in Firefox and clear the value of the network.http.accept-encoding variable.)
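For illustration, this is roughly how server-side code could decide whether a request advertises gzip support. It is a simplified Python sketch, not what Apache does internally: the function name `accepts_gzip` is hypothetical, and it only handles the basic `coding;q=value` syntax of the header.

```python
def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if an Accept-Encoding header value allows gzip."""
    qvalues = {}
    for item in accept_encoding.split(","):
        coding, _, params = item.strip().partition(";")
        coding = coding.strip().lower()
        if not coding:
            continue
        q = 1.0  # a coding listed without a weight is fully acceptable
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        qvalues[coding] = q
    # An explicit gzip entry wins over the "*" wildcard.
    if "gzip" in qvalues:
        return qvalues["gzip"] > 0
    return qvalues.get("*", 0.0) > 0
```

With the header shown above, `accepts_gzip("gzip,deflate")` is true, while an empty value (what you get after clearing the Firefox setting) yields false.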
Serving compressed files to clients
To serve different files to different browsers, all that's needed is to enable Multiviews in our Apache configuration (as suggested on the Apache mailing list):
<Directory /var/www/static/css>
    AddEncoding gzip gz
    ForceType text/css
    Options +Multiviews
    SetEnv force-no-vary
    Header set Cache-Control "private"
</Directory>
<Directory /var/www/static/js>
    AddEncoding gzip gz
    ForceType text/javascript
    Options +Multiviews
    SetEnv force-no-vary
    Header set Cache-Control "private"
</Directory>
The ForceType directive is there to force the mimetype (as described in this solution) and to make sure that browsers (including Firefox) don't download the files to disk.
As for the SetEnv directive, it turns out that on Internet Explorer, most files with a Vary header (added by Apache) are not cached and so we must make sure it gets stripped out before the response goes out.
Finally, the Cache-Control headers are set to private to prevent intermediate/transparent proxies from caching our CSS and Javascript files, while allowing browsers to do so. If intermediate proxies start caching compressed content, they may incorrectly serve it to clients without gzip support.
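One way to check the result is to request the same file with and without the Accept-Encoding header and compare the response headers. The Python sketch below is only a starting point under a few assumptions: the URL is a placeholder for a stylesheet on your own server, and the helper names `build_request`, `interesting_headers` and `fetch_headers` are invented for this example.

```python
import urllib.request

# Hypothetical URL: substitute a file served from your own static directory.
URL = "http://localhost/static/css/style.css"


def build_request(url: str, gzip_ok: bool) -> urllib.request.Request:
    """Build a GET request that does or does not advertise gzip support."""
    headers = {"Accept-Encoding": "gzip,deflate"} if gzip_ok else {}
    return urllib.request.Request(url, headers=headers)


def interesting_headers(headers) -> dict:
    """Pick out the response headers that the configuration above affects."""
    names = ("Content-Type", "Content-Encoding", "Vary", "Cache-Control")
    return {name: headers.get(name) for name in names}


def fetch_headers(url: str, gzip_ok: bool) -> dict:
    """Perform the request and return the headers of interest."""
    with urllib.request.urlopen(build_request(url, gzip_ok)) as response:
        return interesting_headers(response.headers)
```

Calling `fetch_headers(URL, True)` and `fetch_headers(URL, False)` against your server and comparing the two dictionaries should show whether Content-Encoding changes between the two requests and whether the Vary and Cache-Control headers look as intended.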
I am sorry, but however I read the page you cite in support of removing the Vary header (https://code.google.com/speed/page-speed/docs/caching.html#LeverageProxyCaching), it says you should leave Vary in place and use Cache-Control: public.
It says IE does not cache resources whose Vary header contains anything BUT Accept-Encoding (and User-Agent), and Apache would be adding Vary: Accept-Encoding. So IE will, at least according to that page, cache properly. Plus, the page says all proxies in common use DO understand Vary: Accept-Encoding and can match it to the client's Accept-Encoding, so you want Cache-Control: public then.
Even worse, according to the Apache documentation, force-no-vary implies force-response-1.0, and you don't want that performance killer (it implies no keepalive). Apache will automatically use it when talking to known-broken user agents.
Hi François. Great idea!
What happens if you have images inside your CSS directory? As far as I can see, you treat the whole content inside the css/ and js/ directories the same way...
I suppose there's a way to exclude certain files in the Apache configuration, but that makes it a bit trickier. Next time I'm boosting a web application, I'll be sure to try that out.
@Jan Thanks for catching that. It turns out that the anchor on that link was pointing to the wrong section.
The link should have been:
https://code.google.com/speed/page-speed/docs/caching.html#LeverageBrowserCaching
which says:
"Internet Explorer does not cache any resources that are served with the Vary header and any fields but Accept-Encoding and User-Agent. To ensure these resources are cached by IE, make sure to strip out any other fields from the Vary header, or remove the Vary header altogether if possible."