Setting up Varnish for WordPress

Varnish is a caching HTTP reverse proxy. It sits in front of your web backend (e.g. Apache or nginx) and caches responses to reduce the load on the web server. It works best for sites that are not completely dynamic, since content can be cached for longer and round trips to the backend are less frequent. However, Varnish cannot cache HTTPS requests on its own because it does not do SSL termination.

This article covers setting up Varnish 3.0 on a CentOS 6 server. It assumes Apache is already configured and listening on port 80.

Installing Varnish

Installing Varnish is fairly easy on a RHEL 6 based system.

# add the Varnish 3.0 repository (both commands need root)
$ rpm --nosignature -i http://repo.varnish-cache.org/redhat/varnish-3.0/el6/noarch/varnish-release/varnish-release-3.0-1.el6.noarch.rpm
# install varnish via yum
$ yum install varnish

Setting up Apache

This assumes Apache is currently listening on port 80. Move Apache to a different port so that Varnish can take over port 80. That way, very little else needs to change when there is no load balancer in front of the server.

Edit /etc/httpd/conf/httpd.conf so that it includes these two lines. If the directives already exist, modify their values. Any existing <VirtualHost> blocks need to use the same address and port.

NameVirtualHost 127.0.0.1:81
Listen 127.0.0.1:81

Setting up Varnish

Edit /etc/varnish/default.vcl so it includes partial files instead of having a monolithic config file.

# include the list of backends
include "backends.vcl";
# include list of hosts that can purge
include "purge.vcl";
# include wordpress specific configuration
include "wordpress.vcl";

Edit /etc/varnish/backends.vcl to set up the backend server that Varnish will proxy to.

backend default {
  # this is the server where apache is running
  .host = "127.0.0.1";
  # this is the port apache is listening on
  .port = "81";
}

Edit /etc/varnish/purge.vcl to define which network locations are allowed to purge the cache.

acl purge {
  "localhost";
  "127.0.0.1";
  "192.168.1.0"/24;
}

Edit /etc/varnish/wordpress.vcl to set up the WordPress-specific cache configuration.

# Called after a document has been successfully retrieved from the backend.
sub vcl_fetch {
  # Uncomment the "set beresp.ttl" line below to make the default cache
  # "time to live" 5 minutes. Handy, but it may serve stale pages unless
  # they are purged. (TODO)
  # By default Varnish uses the headers sent by Apache (the backend server)
  # to work out the correct TTL.
  # WP Super Cache sends a TTL of 3 seconds, set in wp-content/cache/.htaccess

  # set beresp.ttl   = 300s;

  # Strip cookies for static files and set a long cache expiry time.
  if (req.url ~ "\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$") {
    unset beresp.http.set-cookie;
    set beresp.ttl = 24h;
  }

  # If WordPress cookies found then page is not cacheable
  if (req.http.Cookie ~"(wp-postpass|wordpress_logged_in|comment_author_)") {
    set beresp.ttl = 0s;
  }

  # Varnish determined the object was not cacheable
  if (beresp.ttl == 0s) {
    set beresp.http.X-Cacheable = "NO:Not Cacheable";
  } else if (req.http.Cookie ~"(wp-postpass|wordpress_logged_in|comment_author_)") {
    # Skip caching content for logged in users
    set beresp.http.X-Cacheable = "NO:Got Session";
    set beresp.ttl = 0s;
    return(hit_for_pass);
  } else if (beresp.http.cache-control ~ "(no-cache|private)" || beresp.http.pragma ~ "no-cache") {
    # You are respecting the Cache-Control=private header from the backend
    set beresp.http.X-Cacheable = "NO:Cache-Control=private";
    set beresp.ttl = 0s;
    return(hit_for_pass);
  } else if (beresp.ttl < 1s) {
    # You are extending the lifetime of the object artificially
    set beresp.ttl   = 300s;
    set beresp.grace = 300s;
    set beresp.http.X-Cacheable = "YES:Forced";
  } else {
    # Varnish determined the object was cacheable
    set beresp.http.X-Cacheable = "YES";
  }
  if (beresp.status == 404 || beresp.status >= 500) {
    set beresp.ttl = 0s;
  }

  # Fix a strange problem: HTTP 301 redirects to the same page sometimes go in
  if (beresp.http.Location == req.proto + "://" + req.http.host + req.url
    || beresp.http.Location == req.http.X-Forwarded-Proto + "://" + req.http.host + req.url
  ) {
    if (req.restarts > 2) {
      unset beresp.http.Location;
      # set beresp.http.X-Restarts = req.restarts;
    } else {
      return(restart);
    }
  }

  # Deliver the content
  return(deliver);
}

sub vcl_hash {
  # Each cached page has to be identified by a key that unlocks it.
  # Add the browser cookie only if a WordPress cookie found.
  if (req.http.Cookie ~ "(wp-postpass|wordpress_logged_in|comment_author_)") {
    hash_data(req.http.Cookie);
  }

  # fix ssl termination in LB redirect loop
  # http://akashbhunchal.blogspot.com/2013/09/https-redirection-with-elb-and-varnish.html
  hash_data(req.http.host);
  hash_data(req.url);
  hash_data(req.http.X-Forwarded-Proto);
  return(hash);
}

# Deliver
sub vcl_deliver {
  # Comment out these lines to debug Varnish.
  remove resp.http.X-Varnish;
  remove resp.http.Via;
  remove resp.http.Age;
  remove resp.http.X-Powered-By;
  remove resp.http.X-Cacheable;
}

# vcl_recv is called whenever a request is received
sub vcl_recv {
  # remove ?ver=xxxxx strings from urls so css and js files are cached.
  # Watch out when upgrading WordPress, need to restart Varnish or flush cache.
  set req.url = regsub(req.url, "\?ver=.*$", "");

  # Remove "replytocom" from requests to make caching better.
  set req.url = regsub(req.url, "\?replytocom=.*$", "");

  # Overwrite X-Forwarded-For header with the client IP
  #remove req.http.X-Forwarded-For;
  #set  req.http.X-Forwarded-For = client.ip;
  # OR append the forwarded ip in the header
  if (req.http.x-forwarded-for) {
    set req.http.X-Forwarded-For = req.http.X-Forwarded-For + "," + client.ip;
  } else {
    set req.http.X-Forwarded-For = client.ip;
  }

  # Exclude this site because it breaks if cached
  #if (req.http.host == "example.com") {
  #  return(pass);
  #}

  # Serve objects up to 2 minutes past their expiry if the backend is slow to respond.
  set req.grace = 120s;
  # Strip cookies for static files:
  if (req.url ~ "\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$") {
    unset req.http.Cookie;
    return(lookup);
  }
  # Remove has_js and Google Analytics __* cookies.
  set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js)=[^;]*", "");
  # Remove a ";" prefix, if present.
  set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
  # Remove empty cookies.
  if (req.http.Cookie ~ "^\s*$") {
    unset req.http.Cookie;
  }
  if (req.request == "PURGE") {
    # Only clients listed in the "purge" ACL may purge.
    if (!client.ip ~ purge) {
      error 405 "Not allowed.";
    }
    # Look the object up; vcl_hit and vcl_miss perform the purge.
    return(lookup);
  }

  # Pass anything other than GET and HEAD directly.
  if (req.request != "GET" && req.request != "HEAD") {
    return(pass);
  }    /* We only deal with GET and HEAD by default */

  # Remove the comments cookie to make caching better.
  set req.http.cookie = regsub(req.http.cookie, "1231111111111111122222222333333=[^;]+(; )?", "");

  # never cache the admin pages, or the server-status page
  if (req.request == "GET" && (req.url ~ "(wp-admin|bb-admin|server-status)")) {
    return(pipe);
  }
  # skip caching authenticated sessions
  if (req.http.Cookie && req.http.Cookie ~ "(wordpress_|PHPSESSID)") {
    return(pass);
  }
  # skip caching ajax requests
  if(req.http.X-Requested-With == "XMLHttpRequest" || req.url ~ "nocache" || req.url ~ "(control.php|wp-comments-post.php|wp-login.php|bb-login.php|bb-reset-password.php|register.php)") {
    return(pass);
  }
  return(lookup);
}

sub vcl_hit {
  if (req.request == "PURGE") {
    purge;
    error 200 "Purged.";
  }
}

sub vcl_miss {
  if (req.request == "PURGE") {
    purge;
    error 200 "Purged.";
  }
}
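
The two URL rewrites and the cookie cleanup in vcl_recv are plain regular-expression substitutions, so their effect can be previewed outside Varnish. The following is a minimal shell sketch, assuming a sed that supports -E. The function names normalize_url and strip_tracking_cookies are made up for illustration, and the sed expressions are hand translations of the VCL regexes, not something Varnish itself runs.

```shell
# Preview of the vcl_recv URL normalization: drop "?ver=..." and
# "?replytocom=..." (everything from the "?" to the end of the URL).
normalize_url() {
  printf '%s\n' "$1" | sed -E -e 's/\?ver=.*$//' -e 's/\?replytocom=.*$//'
}

# Preview of the vcl_recv cookie cleanup: remove has_js and "__name"
# cookies, then strip a leading ";" left behind by the first substitution.
strip_tracking_cookies() {
  printf '%s\n' "$1" | sed -E \
    -e 's/(^|;[[:space:]]*)(__[a-z]+|has_js)=[^;]*//g' \
    -e 's/^;[[:space:]]*//'
}
```

For example, normalize_url '/wp-includes/js/jquery/jquery.js?ver=1.10.2' prints /wp-includes/js/jquery/jquery.js. Note that the patterns only match when ver or replytocom is the first query parameter, and they drop everything that follows it.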

This setup of Varnish features:

  • not caching the admin pages
  • multiple host names (different sites)
  • support for a load balancer in front of Varnish

You can test the Varnish config with the following command:

varnishd -C -f /etc/varnish/default.vcl
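
varnishd -C compiles the VCL to C, prints the result, and exits without starting the daemon. It is expected to exit with a non-zero status when the VCL does not compile, so a small wrapper can turn it into a yes/no check before restarting the service. This is only a sketch: the function name check_vcl is hypothetical, and it assumes varnishd is on the PATH.

```shell
# Compile-check a VCL file without starting Varnish.
# Prints a one-line verdict and returns non-zero on failure.
check_vcl() {
  vcl_file="${1:-/etc/varnish/default.vcl}"
  # -C prints the generated C source, which is discarded here;
  # compiler error messages on stderr are left visible.
  if varnishd -C -f "$vcl_file" > /dev/null; then
    echo "VCL OK: $vcl_file"
  else
    echo "VCL failed to compile: $vcl_file" >&2
    return 1
  fi
}
```

Running check_vcl && service varnish restart then only restarts Varnish when the configuration compiles.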

Modifying the WordPress wp-config.php

Modify wp-config.php so that WordPress knows the real client IP address, which Varnish now passes along in the X-Forwarded-For header.

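if (!empty($_SERVER['HTTP_X_FORWARDED_FOR'])) {
    // Use the first (leftmost) address in the list as the client IP.
    // It can be spoofed by the client, so do not rely on it for access control.
    $forwarded_ips = explode(',', $_SERVER['HTTP_X_FORWARDED_FOR']);
    $_SERVER['REMOTE_ADDR'] = trim($forwarded_ips[0]);
}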

Starting Varnish Service

To have Varnish listen on port 80, edit /etc/sysconfig/varnish and make sure VARNISH_LISTEN_PORT is set as below. Otherwise Varnish listens on port 6081.

VARNISH_LISTEN_PORT=80

Start the Varnish service with:

service varnish start

Start Varnish on boot with:

chkconfig --level 345 varnish on

Configuring logging with Varnish

varnishncsa can log Varnish requests in the same format as the Apache access log. It writes to the file /var/log/varnish/varnishncsa.log.

The default configuration should be fine. However, if multiple domains pass through Varnish or there is a load balancer in front of it, make the following changes to /etc/sysconfig/varnishncsa so that the host name and the forwarded client address are logged.

# Configuration file for varnish
#
# /etc/init.d/varnish expects the variable $DAEMON_OPTS to be set from this
# shell script fragment.
#
LOG_FORMAT='%{Host}i %{X-Forwarded-For}i %{Varnish:handling}x %l %u %t "%r" %s %b "%{Referer}i" "%{User-Agent}i"'
DAEMON_OPTS="$DAEMON_OPTS $PREFER_X_FORWARDED_FOR -F '$LOG_FORMAT'"

Start the Varnish logging service with:

service varnishncsa start

Start Varnish logging on boot with:

chkconfig --level 345 varnishncsa on
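
With the log format above, the third field of each line is how Varnish handled the request (hit, miss, pass or pipe), which gives a quick way to see how much traffic is served from the cache. The sketch below counts requests per handling value. The function name handling_counts is made up for illustration, and it assumes the Host and X-Forwarded-For fields contain no spaces; a load balancer that writes "ip1, ip2" with a space would shift the field positions.

```shell
# Count requests per Varnish handling value (3rd field of the custom
# LOG_FORMAT: Host, X-Forwarded-For, handling, ...).
handling_counts() {
  log_file="${1:-/var/log/varnish/varnishncsa.log}"
  awk '{ count[$3]++ } END { for (h in count) print h, count[h] }' "$log_file" | sort
}
```

Called without arguments, handling_counts reads the default log file; pass another path to inspect a rotated log.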

Conclusion

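At this point, Varnish should be running and caching your site. With the provided configuration, static files are cached for 24 hours. Other cacheable pages use the TTL derived from the backend's headers, and a TTL above zero but under one second is extended to 5 minutes. Pages should load much faster after the first request, since later requests are served from Varnish's cache.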

You might need to set up W3 Total Cache or some other plugin to purge the cache when changes are made to pages in WordPress.
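
A purge can also be sent by hand from any address in the purge ACL, which is useful for checking that the PURGE handling in the VCL works. This is a minimal sketch using curl. The function name purge_page and the VARNISH_ADDR variable are hypothetical, and example.com stands in for your site's host name.

```shell
# Send a PURGE request for one path on one host and print the HTTP
# status code Varnish answers with.
# VARNISH_ADDR (hypothetical) is where Varnish listens; default 127.0.0.1:80.
purge_page() {
  host="$1"
  path="$2"
  curl -s -o /dev/null -w '%{http_code}\n' \
    -X PURGE -H "Host: $host" \
    "http://${VARNISH_ADDR:-127.0.0.1:80}$path"
}
```

purge_page example.com /some-page/ should print 200 when the client is allowed to purge and 405 otherwise, matching the error responses in the VCL. Because vcl_hash also includes X-Forwarded-Proto, a page that was cached behind a load balancer setting that header needs the same header on the purge request.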

These sites were particularly helpful to me when setting up Varnish.