Advanced Drupal 8 cache with Pound & Varnish 4 on Ubuntu

Last modified
Thursday, April 16, 2020 - 16:21

In a previous post I briefly explained what Varnish is and the advantages of running it alongside Pound to get a fast caching architecture under HTTPS for Drupal. In this tutorial I'll update those steps so we can run both Drupal 7 and Drupal 8 sites with Varnish 4.x, which at the time of this post is the latest version supported by Drupal 8. The Ubuntu version used is 16.04, the latest LTS release at the time of writing.

Alright, let's begin by getting the required packages. First, we need Pound 2.7 or later, which includes the fixes for POODLE attacks. Visit this page to get a full and up-to-date release of Pound 2.7, preferably the one listed under "The Yakkety Yak (current stable release)":

$cd ~
$wget https://launchpad.net/ubuntu/+archive/primary/+files/pound_2.7-1.2_amd64.deb

Let's install Pound:

$sudo dpkg -i pound_2.7-1.2_amd64.deb

Now let's grab Varnish 4 from the Ubuntu repositories. Running this command installs Varnish 4.1, which is the stable version available in the Ubuntu 16.04 repositories:

$sudo apt-get install varnish

OK, now that we have both daemons installed and running, let's start with the configuration. Consider the following scenario:

  • Pound is the server facing the internet, so it listens on ports 80 and 443 and forwards all requests to Varnish.
  • Varnish receives those requests from Pound on port 8080 (Pound has already terminated HTTPS) and forwards traffic to the designated backend server.
  • Pound and Varnish run on the same physical server.
  • We assume an Apache or Nginx backend server is already running and configured to listen on port 80.

Let's go with configuring Varnish first.

Since Ubuntu 16.04 uses systemd, the Varnish daemon is no longer configured through the old /etc/default/varnish file but through the systemd unit file located in /lib/systemd/system/. Keep in mind that a package upgrade may overwrite this file, so keep a copy of your changes. Open it for editing:

$sudo nano /lib/systemd/system/varnish.service

Update the ExecStart line, replacing it with the following:

ExecStart=/usr/sbin/varnishd -j unix,user=vcache -F -a :8080 -T localhost:6082 \
-f /etc/varnish/default.vcl \
-S /etc/varnish/secret \
-s malloc,1G \
-p workspace_client=1024k \
-p http_req_size=128000 \
-p http_req_hdr_len=64000 \
-p feature=+esi_disable_xml_check

Reload the systemd configuration so the changes take effect, then restart Varnish:

$sudo systemctl daemon-reload
$sudo systemctl restart varnish.service
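
Before moving on, it can help to confirm that Varnish is really accepting connections on the ports set in ExecStart. Here is a minimal Python sketch for that, assuming Python 3 is available on the server; the helper name port_is_open is made up for this example and is not part of Varnish:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing is accepting connections on that address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example calls for this setup (run them on the Varnish server):
#   port_is_open("127.0.0.1", 8080)  # listening address from "-a :8080"
#   port_is_open("127.0.0.1", 6082)  # management port from "-T localhost:6082"
```

If the first call returns False right after the restart, the ExecStart line is the first place I would look.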

Now we need to edit the Varnish backends and the caching specifics for Drupal sites:

$sudo nano /etc/varnish/default.vcl

Paste in the configuration below, and don't forget to edit the backend servers accordingly. In this sample, Varnish assumes the default backend server (Apache/Nginx) has the IP 192.168.10.5 and is listening on port 80:

# This is a basic VCL configuration file for varnish.  See the vcl(7)
# man page for details on VCL syntax and semantics.
#
# Default backend definition.  Set this to point to your content
# server.
#

# Marker to tell the VCL compiler that this VCL has been adapted to the
# new 4.0 format.
vcl 4.0;

import std;
import directors;

# Default backend server
backend default {
  .host = "192.168.10.5";
  .port = "80";
  .max_connections = 300;
  .connect_timeout = 900s;
  .first_byte_timeout = 900s;
  .between_bytes_timeout = 900s;
}

acl purge_ban {
  # ACL we'll use later to allow purges
  "localhost";
  "127.0.0.1";
  "::1";
}

# Init
sub vcl_init {
  # Called when VCL is loaded, before any requests pass through it.
  # Typically used to initialize VMODs.

  new vdir = directors.round_robin();
  vdir.add_backend(default);
  # vdir.add_backend(server...);
  # vdir.add_backend(servern);

  return (ok);
}

# Recv
sub vcl_recv {

  # Set Default Backend
  set req.backend_hint = vdir.backend(); # send all traffic to the vdir director

  # Server Redirects
  # Uncomment this if you want to redirect to different servers based on domains.
  # Make sure you comment the "Default Backend" line above though.
  # 
  # if ( req.http.host == "example.com" ) {
  #   set req.backend_hint = prod_dir.backend();
  # } else {  # Set Default Backend
  #   set req.backend_hint = vdir.backend(); # send all traffic to the vdir director
  # }
  # End Server redirect.

  # Remove the proxy header (see https://httpoxy.org/#mitigate-varnish)
  unset req.http.proxy;

  # Normalize the query arguments
  set req.url = std.querysort(req.url);

  # Normalize the header, remove the port (in case you're testing this on various TCP ports)
  set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");

  # Strip hash, server doesn't need it.
  if (req.url ~ "\#") {
    set req.url = regsub(req.url, "\#.*$", "");
  }

  # Strip a trailing ? if it exists
  if (req.url ~ "\?$") {
    set req.url = regsub(req.url, "\?$", "");
  }

  # save the cookies before the built-in vcl_recv
  # allow caching when backend sets cookies
  set req.http.Cookie-Backup = req.http.Cookie;

  # Modify HTTP X-Forwarded-For header.
  # This will replace Varnish's IP with actual client's.
  # Leave commented since Pound is taking care of this.
  # unset req.http.X-Forwarded-For;
  # set   req.http.X-Forwarded-For = client.ip;

  # Purge logic
  # See https://www.varnish-cache.org/docs/4.0/users-guide/purging.html#http-purging
  # SeeV3 https://www.varnish-software.com/static/book/Cache_invalidation.html#removing-a-single-object
  if ( req.method == "PURGE" ) {
    if ( client.ip !~ purge_ban ) {
      return (synth(405, "Not allowed."));
    }
    return (purge);
  }

  # Ban logic
  # See https://www.varnish-cache.org/docs/4.0/users-guide/purging.html#bans
  if ( req.method == "BAN" ) {
    if ( client.ip !~ purge_ban ) {
      return (synth(405, "Not allowed."));
    }
    if (req.http.Purge-Cache-Tags) {
      ban(  "obj.http.X-Host == " + req.http.host +
        " && obj.http.Purge-Cache-Tags ~ " + req.http.Purge-Cache-Tags
        );
    }
    else {
      # Assumes req.url is a regex. This might be a bit too simple
      ban(  "obj.http.X-Host == " + req.http.host +
        " && obj.http.X-Url ~ " + req.url
        );
    }
    return (synth(200, "Ban added"));
  }

  # Verify HTTP request methods
  # Only deal with "normal" types.
  if (req.method != "GET" &&
      req.method != "HEAD" &&
      req.method != "PUT" &&
      req.method != "POST" &&
      req.method != "TRACE" &&
      req.method != "OPTIONS" &&
      req.method != "PATCH" &&
      req.method != "DELETE") {
    /* Non-RFC2616 or CONNECT which is weird. */
    
    return (pipe);
  }

  # Normalize Accept-Encoding header
  # Although Varnish 4 handles gzipped content itself by default, just to be
  # sure we want to remove Accept-Encoding for some compressed formats.
  # See https://www.varnish-cache.org/docs/4.0/phk/gzip.html#what-does-http-gzip-support-do
  # See https://www.varnish-cache.org/docs/4.0/users-guide/compression.html
  # See https://www.varnish-cache.org/docs/4.0/reference/varnishd.html?highlight=http_gzip_support
  # See (for older configs) https://www.varnish-cache.org/trac/wiki/VCLExampleNormalizeAcceptEncoding
  if ( req.http.Accept-Encoding ) {
    if ( req.url ~ "(?i)\.(7z|avi|bz2|flv|gif|gz|jpe?g|mpe?g|mk[av]|mov|mp[34]|og[gm]|pdf|png|rar|swf|tar|tbz|tgz|woff2?|zip|xz)(\?.*)?$"
    ) {
      /* Already compressed formats, no sense trying to compress again */
      unset req.http.Accept-Encoding;
    } 
    if (req.http.Accept-Encoding ~ "gzip") {
      set req.http.Accept-Encoding = "gzip";
    } elsif (req.http.Accept-Encoding ~ "deflate" && req.http.user-agent !~ "MSIE") {
      set req.http.Accept-Encoding = "deflate";
    } else {
      unset req.http.Accept-Encoding;
    }
  }

  # Implementing websocket support (https://www.varnish-cache.org/docs/4.0/users-guide/vcl-example-websockets.html)
  if (req.http.Upgrade ~ "(?i)websocket") {
    return (pipe);
  }

  # Drupal's batch mode will behave in a funky manner since all cookies except
  # for the session get stripped out below.  This makes batch fall into
  # op=do_nojs mode, which isn't really needed.  Just get Varnish out of the way.
  if (req.url ~ "(^/batch)") {
    return (pipe);
  }

  # Only cache GET or HEAD requests. This makes sure the POST requests are always passed.
  if (req.method != "GET" && req.method != "HEAD") {
    /* We only deal with GET and HEAD by default */
    return (pass);
  }
  
  # Dont cache Authorization
  if ( req.http.Authorization ) {
    /* Not cacheable by default */
    return (pass);
  }
	
  # Modify (remove) progress.js request parameters.
  if (req.url ~ "^/misc/progress\.js\?[0-9]+$") {

    set req.url = "/misc/progress.js";
  }

  # Send Surrogate-Capability headers to announce ESI support to backend
  set req.http.Surrogate-Capability = "key=ESI/1.0";

  # Pipe these paths directly to backend for streaming.
  # This check has to come before the "do not cache" list below; otherwise
  # the "^/admin" rule there matches first and these paths are never piped.
  if ( req.url ~ "^/admin/content/backup_migrate/export"
    || req.url ~ "^/admin/config/system/backup_migrate"
  ) {
    return (pipe);
  }

  # Do not cache these paths.
  if (req.url ~ "^/status\.php$" ||
      req.url ~ "^/update\.php" ||
      req.url ~ "^/install\.php" ||
      req.url ~ "^/ooyala/ping$" ||
      req.url ~ "^/apc\.php$" ||
      req.url ~ "^/admin" ||
      req.url ~ "^/admin/.*$" ||
      req.url ~ "^/user" ||
      req.url ~ "^/user/.*$" ||
      req.url ~ "^/users/.*$" ||
      req.url ~ "^/info/.*$" ||
      req.url ~ "^/flag/.*$" ||
      req.url ~ "^.*/ajax/.*$" ||
      req.url ~ "^.*/ahah/.*$" ||
      req.url ~ "^/system/files/.*$") {

    return (pass);
  }

  # Anything else that starts with /system/files and was not passed above.
  if ( req.url ~ "^/system/files" ) {
    return (pipe);
  }

  # Large static files are delivered directly to the end-user without
  # waiting for Varnish to fully read the file first.
  # Varnish 4 fully supports Streaming, so set do_stream in vcl_backend_response()
  if (req.url ~ "^[^?]*\.(mp[34]|rar|tar|tgz|gz|wav|zip|bz2|xz|7z|avi|mov|ogm|mpe?g|mk[av]|webm)(\?.*)?$") {
    unset req.http.Cookie;
    return (hash);
  }

  # Some generic cookie manipulation, useful for all templates that follow
  # Remove the "has_js" cookie
  set req.http.Cookie = regsuball(req.http.Cookie, "has_js=[^;]+(; )?", "");

  # Remove any Google Analytics based cookies
  set req.http.Cookie = regsuball(req.http.Cookie, "__utm.=[^;]+(; )?", "");
  set req.http.Cookie = regsuball(req.http.Cookie, "_ga=[^;]+(; )?", "");
  set req.http.Cookie = regsuball(req.http.Cookie, "_gat=[^;]+(; )?", "");
  set req.http.Cookie = regsuball(req.http.Cookie, "_gid=[^;]+(; )?", "");
  set req.http.Cookie = regsuball(req.http.Cookie, "utmctr=[^;]+(; )?", "");
  set req.http.Cookie = regsuball(req.http.Cookie, "utmcmd.=[^;]+(; )?", "");
  set req.http.Cookie = regsuball(req.http.Cookie, "utmccn.=[^;]+(; )?", "");

  # Remove DoubleClick offensive cookies
  set req.http.Cookie = regsuball(req.http.Cookie, "__gads=[^;]+(; )?", "");

  # Remove the Quantcast cookies (added by some plugin, all __qca)
  set req.http.Cookie = regsuball(req.http.Cookie, "__qc.=[^;]+(; )?", "");

  # Remove the AddThis cookies
  set req.http.Cookie = regsuball(req.http.Cookie, "__atuv.=[^;]+(; )?", "");

  # Remove a ";" prefix in the cookie if present
  set req.http.Cookie = regsuball(req.http.Cookie, "^;\s*", "");

  # Remove any Piwik based cookies
  set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(_pk_(ses|id)[\.a-z0-9]*)=[^;]*", ""); # removes Piwik cookies

  # Remove all cookies for static files
  # A valid discussion could be held on this line: do you really need to cache static files that don't cause load? Only if you have memory left.
  # Sure, there's disk I/O, but chances are your OS will already have these files in their buffers (thus memory).
  # Before you blindly enable this, have a read here: https://ma.ttias.be/stop-caching-static-files/
  # Always cache the following static file types for all users.
  # Use with care if we control certain downloads depending on cookies.
  # Be careful also if appending .htm[l] via Drupal's clean URLs.
  if ( req.url ~ "(?i)\.(bz2|css|eot|gif|gz|html?|ico|jpe?g|js|mp3|ogg|otf|pdf|png|rar|svg|swf|tbz|tgz|ttf|woff2?|zip)(\?(itok=)?[a-z0-9_=\.\-]+)?$"
    && req.url !~ "/system/storage/serve"
  ) {
      unset req.http.Cookie;
  }

  # Remove all cookies that Drupal doesn't need to know about. We explicitly
  # list the ones that Drupal does need, the SESS and NO_CACHE. If, after
  # running this code we find that either of these two cookies remains, we
  # will pass as the page cannot be cached.
  if (req.http.Cookie) {
    # 1. Append a semi-colon to the front of the cookie string.
    # 2. Remove all spaces that appear after semi-colons.
    # 3. Match the cookies we want to keep, adding the space we removed
    #    previously back. (\1) is first matching group in the regsuball.
    # 4. Remove all other cookies, identifying them by the fact that they have
    #    no space after the preceding semi-colon.
    # 5. Remove all spaces and semi-colons from the beginning and end of the
    #    cookie string.
    set req.http.Cookie = ";" + req.http.Cookie;
    set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
    set req.http.Cookie = regsuball(req.http.Cookie, ";(SESS[a-z0-9]+|SSESS[a-z0-9]+|NO_CACHE)=", "; \1=");
    set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

    if (req.http.Cookie == "") {
      # If there are no remaining cookies, remove the cookie header. If there
      # aren't any cookie headers, Varnish's default behavior will be to cache
      # the page.
      unset req.http.Cookie;
    }
    else {
      # If there are any cookies left (a session or NO_CACHE cookie), do not
      # cache the page. Pass it on to Apache directly.
      return (pass);
    }
  }

  if (req.http.Cache-Control ~ "(?i)no-cache") {
    #if (req.http.Cache-Control ~ "(?i)no-cache" && client.ip ~ editors) { # create the acl editors if you want to restrict the Ctrl-F5
    # http://varnish.projects.linpro.no/wiki/VCLExampleEnableForceRefresh
    # Ignore requests via proxy caches and badly behaved crawlers
    # like msnbot that send no-cache with every request.
    if (! (req.http.Via || req.http.User-Agent ~ "(?i)bot" || req.http.X-Purge)) {
      #set req.hash_always_miss = true; # Doesn't seems to refresh the object in the cache
      return(purge); # Couple this with restart in vcl_purge and X-Purge header to avoid loops
    }
  }

  return (hash);
}

# Called after the response headers have been received from the backend (Apache/Nginx)
sub vcl_backend_response {
  
  # Ban lurker friendly bans support
  # See https://www.varnish-cache.org/docs/4.0/users-guide/purging.html#bans
  set beresp.http.X-Host = bereq.http.host;
  set beresp.http.X-Url = bereq.url;

  # Drupal 8's Big Pipe support
  # Tentative support, maybe: set beresp.ttl = 0s; is also needed
  if ( beresp.http.Surrogate-Control ~ "BigPipe/1.0" ) {
    set beresp.do_stream = true;
    # Varnish gzipping breaks streaming of the first response
    set beresp.do_gzip = false;
  }

  # Enable ESI processing and remove the Surrogate-Control header
  if (beresp.http.Surrogate-Control ~ "ESI/1.0") {
    unset beresp.http.Surrogate-Control;
    set beresp.do_esi = true;
  }

  # Enable cache for all static files
  # The same argument as the static caches from above: monitor your cache size, if you get data nuked out of it, consider giving up the static file cache.
  # Before you blindly enable this, have a read here: https://ma.ttias.be/stop-caching-static-files/
  /* Strip cookies from the following static file types for all users. */
  if ( bereq.url ~ "(?i)\.(bz2|css|eot|gif|gz|html?|ico|jpe?g|js|mp3|ogg|otf|pdf|png|rar|svg|swf|tbz|tgz|ttf|woff2?|zip)(\?(itok=)?[a-z0-9_=\.\-]+)?$"
  ) {
    unset beresp.http.set-cookie;
  }

  # Large static files are delivered directly to the end-user without
  # waiting for Varnish to fully read the file first.
  # Varnish 4 fully supports Streaming, so use streaming here to avoid locking.
  if (bereq.url ~ "^[^?]*\.(mp[34]|rar|tar|tgz|gz|wav|zip|bz2|xz|7z|avi|mov|ogm|mpe?g|mk[av]|webm)(\?.*)?$") {
    unset beresp.http.set-cookie;
    set beresp.do_stream = true;  # Check memory usage it'll grow in fetch_chunksize blocks (128k by default) if the backend doesn't send a Content-Length header, so only enable it for big objects
    set beresp.do_gzip = false;   # Don't try to compress it for storage
  }

  # Sometimes, a 301 or 302 redirect formed via Apache's mod_rewrite can mess with the HTTP port that is being passed along.
  # This often happens with simple rewrite rules in a scenario where Varnish runs on :80 and Apache on :8080 on the same box.
  # A redirect can then often redirect the end-user to a URL on :8080, where it should be :80.
  # This may need finetuning on your setup.
  #
  # To prevent accidental replace, we only filter the 301/302 redirects for now.
  if (beresp.status == 301 || beresp.status == 302) {
    set beresp.http.Location = regsub(beresp.http.Location, ":[0-9]+", "");
  }

  # Don't cache responses with no TTL, a Set-Cookie header, or Vary: *
  if (beresp.ttl <= 0s || beresp.http.Set-Cookie || beresp.http.Vary == "*") {
    /*
    * Mark as "Hit-For-Pass" for the next 2 minutes
    */
    set beresp.ttl = 120s; # Important, you shouldn't rely on this, SET YOUR HEADERS in the backend
    set beresp.uncacheable = true;
    return (deliver);
  }

  # Allow stale content, in case the backend goes down.
  # make Varnish keep all objects for 6 hours beyond their TTL
  set beresp.grace = 6h;

  # Only allow cookies to be set if we're in admin area
  # if (beresp.http.Set-Cookie && bereq.url !~ "^/wp-(login|admin)") {
  #   unset beresp.http.Set-Cookie;
  # }

  # don't cache response to posted requests or those with basic auth
  if ( bereq.method == "POST" || bereq.http.Authorization ) {
    set beresp.uncacheable = true;
    set beresp.ttl = 120s;
    return (deliver);
  }

  # don't cache search results
  # if ( bereq.url ~ "\?s=" ){
  #  set beresp.uncacheable = true;
  #  set beresp.ttl = 120s;
  #  return (deliver);
  # }

  # only cache status ok
  # if ( beresp.status != 200 ) {
  #  set beresp.uncacheable = true;
  #  set beresp.ttl = 120s;
  #  return (deliver);
  # }

  # A TTL of 24h
  # set beresp.ttl = 24h;
  # Define the default grace period to serve cached content
  # set beresp.grace = 30s;

  return (deliver);
}

# Pipe
sub vcl_pipe {

  # Called upon entering pipe mode.
  # In this mode, the request is passed on to the backend, and any further data from both the client
  # and backend is passed on unaltered until either end closes the connection. Basically, Varnish will
  # degrade into a simple TCP proxy, shuffling bytes back and forth. For a connection in pipe mode,
  # no other VCL subroutine will ever get called after vcl_pipe.

  # Note that only the first request to the backend will have
  # X-Forwarded-For set.  If you use X-Forwarded-For and want to
  # have it set for all requests, make sure to have:
  # set bereq.http.connection = "close";
  # here.  It is not set by default as it might break some broken web
  # applications, like IIS with NTLM authentication.

  set bereq.http.Connection = "Close";

  # Implementing websocket support (https://www.varnish-cache.org/docs/4.0/users-guide/vcl-example-websockets.html)
  if (req.http.upgrade) {
    set bereq.http.upgrade = req.http.upgrade;
  }

  return (pipe);
}

# Pass
sub vcl_pass {
  # Called upon entering pass mode. In this mode, the request is passed on to the backend, and the
  # backend's response is passed on to the client, but is not entered into the cache. Subsequent
  # requests submitted over the same client connection are handled normally.

  return (fetch);
}

# Hash
sub vcl_hash {
  # Called after vcl_recv to create a hash value for the request. This is used as a key
  # to look up the object in Varnish.

  hash_data(req.url);

  if (req.http.host) {
    hash_data(req.http.host);
  } else {
    hash_data(server.ip);
  }

  # If the client supports compression, keep that in a different cache
  if (req.http.Accept-Encoding) {
    hash_data(req.http.Accept-Encoding);
  }

  # Uncomment if different languages are served at the same URL.
  #if( req.http.Accept-Language ) {
  #   hash_data(req.http.Accept-Language);
  #}

  # hash cookies for requests that have them
  if (req.http.Cookie) {
    hash_data(req.http.Cookie);
  }

  # restore the cookies before the lookup if any
  if (req.http.Cookie-Backup) {
   set req.http.Cookie = req.http.Cookie-Backup;
   unset req.http.Cookie-Backup;
  }

  # Use special internal SSL hash for https content
  # X-Forwarded-Proto is set to https by Pound
  if (req.http.X-Forwarded-Proto ~ "https") {
    hash_data(req.http.X-Forwarded-Proto);
  }

  return (lookup);
}

# Hit
sub vcl_hit {

  # Called when a cache lookup is successful.

  if (obj.ttl >= 0s) {
    # A pure unadulterated hit, deliver it
    return (deliver);
  }

  /* Allow varnish to serve up stale content if it is responding slowly */
  # See https://www.varnish-cache.org/docs/4.0/users-guide/vcl-grace.html
  # See https://www.varnish-software.com/blog/grace-varnish-4-stale-while-revalidate-semantics-varnish
  if ( obj.ttl + 60s > 0s ) {
    // Object is in grace, deliver it
    // Automatically triggers a background fetch
    set req.http.X-Varnish-Grace = "normal";
    return (deliver);
  }
  /* Allow Varnish to serve up stale content if all backends are down */
  # See https://www.varnish-cache.org/docs/4.0/users-guide/vcl-grace.html
  # See https://www.varnish-software.com/blog/grace-varnish-4-stale-while-revalidate-semantics-varnish
  if ( ! std.healthy(req.backend_hint)
    && obj.ttl + obj.grace > 0s
  ) {
    // Object is in grace, deliver it
    // Automatically triggers a background fetch
    set req.http.X-Varnish-Grace = "extended";
    return (deliver);
  }
  /* Bypass built-in logic */
  # We make sure no built-in logic is processed after ours by returning
  # unconditionally.
  // fetch & deliver once we get the result
  return (fetch);
}

# Miss
sub vcl_miss {

  # Called after a cache lookup if the requested document was not found in the cache. Its purpose
  # is to decide whether or not to attempt to retrieve the document from the backend, and which
  # backend to use.

  return (fetch);
}

# Deliver - Set a header to track a cache HIT/MISS.
sub vcl_deliver {

  # Ban lurker friendly bans support
  # See https://www.varnish-cache.org/docs/4.0/users-guide/purging.html#bans
  unset resp.http.X-Host;
  unset resp.http.X-Url;

  # Drupal 8 Purge's module header cleanup
  # Purge's headers can become quite big, causing issues in upstream proxies, so we clean it here
  unset resp.http.Purge-Cache-Tags;

  # Debugging headers
  # Please consider the risks of showing publicly this information, we can wrap
  # this with an ACL.
  # Add whether the object is a cache hit or miss and the number of hits for
  # the object.
  # SeeV3 https://www.varnish-cache.org/trac/wiki/VCLExampleHitMissHeader#Addingaheaderindicatinghitmiss
  # In Varnish 4 the obj.hits counter behaviour has changed (see bug 1492), so
  # we use a different method: if X-Varnish contains only 1 id, we have a miss,
  # if it contains more (and therefore a space), we have a hit.
  if ( resp.http.X-Varnish ~ " " ) {
    set resp.http.X-Varnish-Cache = "HIT";
    # Since in Varnish 4 the behaviour of obj.hits changed, this might not be
    # accurate.
    # See https://www.varnish-cache.org/trac/ticket/1492
    set resp.http.X-Varnish-Cache-Hits = obj.hits;
  } else {
    set resp.http.X-Varnish-Cache = "MISS";
    /* Show the results of cookie sanitization */
    if ( req.http.Cookie ) {
      set resp.http.X-Varnish-Cookie = req.http.Cookie;
    }
  }
  # See https://www.varnish-software.com/blog/grace-varnish-4-stale-while-revalidate-semantics-varnish
  if ( req.http.X-Varnish-Grace ) {
    set resp.http.X-Varnish-Grace = req.http.X-Varnish-Grace;
  }

  # Restart count
  if ( req.restarts > 0 ) {
    set resp.http.X-Varnish-Restarts = req.restarts;
  }

  # Add the Varnish server hostname
  set resp.http.X-Varnish-Server = server.hostname;
  # If we have set a custom header with the detected device family we can show
  # it:
  # if ( req.http.X-UA-Device ) {
  #   set resp.http.X-UA-Device = req.http.X-UA-Device;
  # }
  # If we have received a custom header indicating the protocol in the request we
  # can show it:
  # if ( req.http.X-Forwarded-Proto ) {
  #   set resp.http.X-Forwarded-Proto = req.http.X-Forwarded-Proto;
  # }

  # Vary header manipulation
  # Empty in simple configs.
  # For example, if we are storing & serving different objects depending on
  # User-Agent header we must set the correct Vary header:
  if ( resp.http.Vary ) {
    set resp.http.Vary = resp.http.Vary + ",User-Agent";
  } else {
    set resp.http.Vary = "User-Agent";
  }

  # Please note that obj.hits behaviour changed in 4.0, now it counts per objecthead, not per object
  # and obj.hits may not be reset in some cases where bans are in use. See bug 1492 for details.
  # So take hits with a grain of salt
  set resp.http.X-Cache-Hits = obj.hits;

  # Remove some headers: Apache version & OS
  #unset resp.http.Server;
  #unset resp.http.X-Drupal-Cache;
  #unset resp.http.X-Varnish;
  #unset resp.http.Via;
  #unset resp.http.Link;
  #unset resp.http.X-Generator;
  #unset resp.http.X-Powered-By;

  return (deliver);
}

# vcl_purge: Called after the purge has been executed and all its variants have
# been evicted.
# See https://www.varnish-cache.org/docs/4.0/users-guide/vcl-built-in-subs.html#vcl-purge
sub vcl_purge {
  # Only handle actual PURGE HTTP methods, everything else is discarded
  if (req.method != "PURGE") {
    # restart request
    set req.http.X-Purge = "Yes";
    return(restart);
  }

  return (synth(200, "Purged"));
}

# Synth
sub vcl_synth {
  if (resp.status == 720) {
    # We use this special error status 720 to force redirects with 301 (permanent) redirects
    # To use this, call the following from anywhere in vcl_recv: return (synth(720, "http://host/new.html"));
    set resp.http.Location = resp.reason;
    set resp.status = 301;
    return (deliver);
  } elseif (resp.status == 721) {
    # And we use error status 721 to force redirects with a 302 (temporary) redirect
    # To use this, call the following from anywhere in vcl_recv: return (synth(721, "http://host/new.html"));
    set resp.http.Location = resp.reason;
    set resp.status = 302;
    return (deliver);
  }

  return (deliver);
}

# Error
sub vcl_backend_error {
  set beresp.http.Content-Type = "text/html; charset=utf-8";
  set beresp.http.Retry-After = "5";
  synthetic( {"
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
  <head>
    <title>"} + beresp.status + " " + beresp.reason + {"</title>
  </head>
  <body>
    <h1>Error "} + beresp.status + " " + beresp.reason + {"</h1>
    <p>"} + beresp.reason + {"</p>
    <h3>Guru Meditation:</h3>
    <p>XID: "} + bereq.xid + {"</p>
    <hr>
    <p>Varnish cache server</p>
  </body>
</html>
"} );
  
  return (deliver);
}
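
The cookie whitelist in vcl_recv is the part of this VCL that is hardest to reason about by eye, so here is a rough way to experiment with it outside Varnish. This is only a Python sketch that replays the five regsuball steps with Python's re module, assuming its regex behaviour matches Varnish's PCRE for these simple patterns; the function names strip_cookies and is_cacheable are made up for this example.

```python
import re

# Cookies the VCL keeps: Drupal session cookies (SESS*, SSESS*) and NO_CACHE.
KEEP = re.compile(r";(SESS[a-z0-9]+|SSESS[a-z0-9]+|NO_CACHE)=")


def strip_cookies(cookie_header: str) -> str:
    """Replay the five whitelist steps from vcl_recv on a Cookie header."""
    cookie = ";" + cookie_header                   # 1. prepend a semicolon
    cookie = re.sub(r"; +", ";", cookie)           # 2. drop spaces after semicolons
    cookie = KEEP.sub(r"; \1=", cookie)            # 3. mark kept cookies with a space
    cookie = re.sub(r";[^ ][^;]*", "", cookie)     # 4. remove every unmarked cookie
    cookie = re.sub(r"^[; ]+|[; ]+$", "", cookie)  # 5. trim leftover separators
    return cookie


def is_cacheable(cookie_header: str) -> bool:
    """True when no cookie survives, i.e. the VCL would unset Cookie and hash."""
    return strip_cookies(cookie_header) == ""
```

For instance, strip_cookies("has_js=1; _ga=GA1.2.3; SESSabc123=xyz") leaves "SESSabc123=xyz", which is the case where the VCL returns pass instead of caching the page.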

Restart Varnish one last time and make sure the status of the service is ok:

$sudo systemctl restart varnish.service
$sudo systemctl status varnish.service

You should get an output similar to the following:

[Screenshot: varnish status]

OK, that's it for Varnish. Let's move on to Pound.

First, set the daemon to start automatically with the system by editing the following file:

$sudo nano /etc/default/pound

and set startup=1 as shown below:

# Defaults for pound initscript
# sourced by /etc/init.d/pound
# installed at /etc/default/pound by the maintainer scripts

# prevent startup with default configuration
# set the below varible to 1 in order to allow pound to start
startup=1

Before moving to the next step, make sure you have configured SSL support for Pound; otherwise restarting Pound will fail!
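
A quick sanity check on the certificate file can save a failed restart. The configuration further down points Pound at /etc/pound/server.pem, and as far as I know Pound expects that single file to hold both the certificate and its private key. The Python sketch below works under that assumption and only looks at the PEM block markers, it does not validate the certificate itself; the function names are made up for this example.

```python
import re

# Matches the header line of each PEM block, e.g. "-----BEGIN CERTIFICATE-----".
PEM_BLOCK = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----")


def pem_block_types(pem_text: str) -> list:
    """Return the block labels found in a PEM file, in order of appearance."""
    return PEM_BLOCK.findall(pem_text)


def looks_like_pound_cert(pem_text: str) -> bool:
    """True if the text has at least one certificate and one private key block.

    Assumption: Pound needs both in the same file. This is a marker check only.
    """
    types = pem_block_types(pem_text)
    has_cert = "CERTIFICATE" in types
    has_key = any(label.endswith("PRIVATE KEY") for label in types)
    return has_cert and has_key
```

To use it on the server, read the file as root and pass the text in, for example looks_like_pound_cert(open("/etc/pound/server.pem").read()).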

Once SSL is configured, we need to tell Pound to listen on ports 80 and 443 and forward all requests to Varnish. Edit the following file:

$sudo nano /etc/pound/pound.cfg

Carefully replace its contents with the following:

## Minimal sample pound.cfg
##
## see pound(8) for details


######################################################################
## global options:

User		"www-data"
Group		"www-data"
#RootJail	"/chroot/pound"

## Logging: (goes to syslog by default)
##	0	no logging
##	1	normal
##	2	extended
##	3	Apache-style (common log format)
LogLevel	1

## check backend every X secs:
Alive		30

## use hardware-accelleration card supported by openssl(1):
#SSLEngine	"<hw>"

# poundctl control socket
Control "/var/run/pound/poundctl.socket"

TimeOut 180

######################################################################
## listen, redirect and ... to:

## HTTP
ListenHTTP
  HeadRemove "X-Forwarded-Proto"
  AddHeader "X-Forwarded-Proto: http"
  Address 0.0.0.0
  Port 80
  ## allow PUT and DELETE also (by default only GET, POST and HEAD)?:
  xHTTP 0
  # Ensure pound doesn't rewrite location headers, as this can cause a redirect loop
  ReWriteLocation 0

  ## Default -> all requests to Varnish
  Service
    BackEnd
     Address	127.0.0.1
     Port	8080
    End
  End

End

## HTTPS ##
ListenHTTPS

  HeadRemove "X-Forwarded-Proto"
  AddHeader "X-Forwarded-Proto: https"
  Address 0.0.0.0
  Port 443
  ## allow PUT and DELETE also (by default only GET, POST and HEAD)?:
  xHTTP 0
  # Ensure pound doesn't rewrite location headers, as this can cause a redirect loop
  ReWriteLocation 0
  ## Prevent Poodle attacks
  Disable SSLv3
  ## Load any certs available
  Cert "/etc/pound/server.pem"

  ## Default -> All Requests to Varnish
  Service
    Backend
      Address 127.0.0.1
      Port 8080
    End
  End

End

Restart the Pound service and also make sure the service is running ok:

$sudo systemctl restart pound.service
$sudo systemctl status pound.service

You should see output like that in the next screenshot:

[Screenshot: pound status]

We are almost there! Now we need to move to our Apache/Nginx server, basically where our Drupal lives. Since Drupal is going to run behind this proxy architecture, we need to configure it so that it is aware of the external IPs that are accessing it. From here on this tutorial assumes Apache as the web server, so we are going to focus on mod_remoteip for this task, but here's a nice tutorial for Nginx you can try.

Enable mod_remoteip in Apache:

$sudo a2enmod remoteip

Running the above should produce output like the following:

[Screenshot: remoteip module enabled]

Now we need to configure the module. Edit the following file:

$sudo nano /etc/apache2/mods-available/remoteip.conf

and replace its contents with the following directives:

#RemoteIPHeader X-Real-IP
RemoteIPHeader X-Forwarded-For
RemoteIPInternalProxy 192.168.10.15

Here RemoteIPInternalProxy is the IP address of your Pound/Varnish server, and RemoteIPHeader names the request header that carries the original client address; X-Forwarded-For is the de facto standard for that. If interested, you can find more information about these headers here.
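
To make the idea of a trusted proxy more concrete, here is a simplified Python model of how a client address can be recovered from X-Forwarded-For. It is not mod_remoteip's actual implementation, just a sketch of the principle under the assumption that trusted proxies are matched by exact IP string (no CIDR ranges); the function name resolve_client_ip is made up for this example.

```python
def resolve_client_ip(peer_ip: str, forwarded_for: str, trusted_proxies: set) -> str:
    """Walk X-Forwarded-For from right to left while the current hop is trusted.

    peer_ip is the address that opened the TCP connection to the web server.
    The header is only believed when the hop that delivered it is trusted.
    """
    client = peer_ip
    hops = [hop.strip() for hop in forwarded_for.split(",") if hop.strip()]
    while client in trusted_proxies and hops:
        # The rightmost entry was added by the proxy we just trusted.
        client = hops.pop()
    return client
```

With the proxy IP used in this post, resolve_client_ip("192.168.10.15", "203.0.113.7", {"192.168.10.15"}) gives "203.0.113.7", while a request coming straight from an untrusted address keeps its own IP no matter what the header claims. That is the reason the proxy address has to be listed explicitly.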

You might want to change the logging format of your Apache logs so they include the real IP that accessed your server. To do so, edit your apache2.conf file:

$sudo nano /etc/apache2/apache2.conf

Go to the LogFormat section and replace the combined and common formats as follows (%a logs the client IP as resolved by mod_remoteip):

LogFormat "%a %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%a %l %u %t \"%r\" %>s %b" common

Don't forget to restart Apache:

$sudo systemctl restart apache2.service

At this point we are all set up! Point your browser to your Pound/Varnish server, open Firebug in Firefox or the developer console in Google Chrome, and look in the Network section for the Varnish activity. The following screenshot shows a default Apache page served through Varnish:

[Screenshot: apache varnish console]
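
If you prefer a script over the browser console, the debug headers set in vcl_deliver (X-Varnish-Cache, and the stock X-Varnish header) can be read programmatically. The Python sketch below is one way to do it, assuming the VCL from this post is loaded; the function names and the URL in the usage note are placeholders, not anything Varnish provides.

```python
import urllib.request


def cache_status(headers: dict) -> str:
    """Classify a response as HIT, MISS or UNKNOWN from its Varnish headers."""
    lowered = {name.lower(): value for name, value in headers.items()}
    explicit = lowered.get("x-varnish-cache")
    if explicit in ("HIT", "MISS"):
        return explicit
    # Fallback used by the VCL itself: two ids in X-Varnish (a space) mean a hit.
    xid = lowered.get("x-varnish", "").strip()
    if not xid:
        return "UNKNOWN"
    return "HIT" if " " in xid else "MISS"


def fetch_headers(url: str, timeout: float = 10.0) -> dict:
    """Send a HEAD request and return the response headers as a dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())
```

Calling cache_status(fetch_headers("http://example.com/")) twice in a row against your own host should normally go from MISS to HIT for an anonymous, cacheable page.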

And finally, add these configuration directives to the settings.php of your Drupal 8 installation:

// Tell Drupal that we are behind a reverse proxy server
$settings['reverse_proxy'] = TRUE;
$settings['reverse_proxy_addresses'] = array('192.168.10.15'); #Pound/Varnish Server IP
#$settings['reverse_proxy_header'] = 'X_FORWARDED_FOR'; # don't change unless it is changed in the remoteip module configuration!

From here you can go ahead and configure the Varnish Drupal module, which at the time of writing is still in dev for Drupal 8 and not fully usable yet. There is another solution, though: the Purge module for Drupal 8.
You can always clear your Varnish caches manually by telling Varnish which host name to clear:

$sudo varnishadm -T localhost:6082 -S /etc/varnish/secret ban "req.http.host == example.com"
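
Besides varnishadm, the VCL above also accepts the PURGE method from the addresses in the purge_ban ACL (localhost only). Here is a small Python sketch of sending such a request, assuming it runs on the Varnish server itself and that Varnish listens on 127.0.0.1:8080 as configured earlier; the function name purge and the example host and path are placeholders.

```python
import http.client


def purge(site_host: str, path: str, varnish_host: str = "127.0.0.1",
          varnish_port: int = 8080, timeout: float = 5.0) -> tuple:
    """Send PURGE for one URL straight to Varnish; return (status, reason).

    site_host goes into the Host header so Varnish hashes the same object
    that visitors of that site would hit.
    """
    connection = http.client.HTTPConnection(varnish_host, varnish_port, timeout=timeout)
    try:
        connection.request("PURGE", path, headers={"Host": site_host})
        response = connection.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return response.status, response.reason
    finally:
        connection.close()
```

With this VCL, purge("example.com", "/node/1") should come back with status 200 when run on the Varnish box, and 405 when the client address is not in the ACL.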

Alright folks! You have configured a very powerful caching architecture for your Drupal sites, suitable not only for development and testing but also for hosting small to medium web sites.

Please let me know your thoughts!

Bibliography:

https://www.drupal.org/docs/7/caching-to-improve-performance/varnish-4x-configuration
https://github.com/iraneagle/Varnish-Drupal/blob/master/default.vcl
https://www.listekconsulting.com/articles/varnish-cache-migrate-3-4/
https://www.ignoredbydinosaurs.com/posts/283-varnish-4-vcl-drupal-7
https://www.getpagespeed.com/server-setup/varnish-4-cookie-handling
https://github.com/mattiasgeniar/varnish-4.0-configuration-templates/issues/37
http://www.geoffstratton.com/varnish-and-pound-apache
https://github.com/engintron/engintron/issues/278
https://www.youtube.com/watch?v=ybMdQWUjp2E&t=3459s
http://stackoverflow.com/questions/30616715/how-to-set-varnish-to-run-on-port-80-malfunction-of-daemon-opts-set-in-etc-def
http://deshack.net/how-to-varnish-listen-port-80-systemd/
https://www.varnish-software.com/wiki/content/tutorials/drupal/drupal_vcl.html
https://info.varnish-software.com/blog/yet-another-post-on-caching-vs-cookies
https://github.com/NITEMAN/varnish-bites/blob/master/varnish4/drupal-base.vcl

Some useful commands that will help you troubleshoot issues:

$sudo su #run all as root

To write requests that ended in a 5xx error to a log file:

#varnishlog -a -w /var/log/varnish/varnish50x.log -q "RespStatus >= 500 or BerespStatus >= 500"

To watch PURGE requests:

#varnishlog -g request -q 'ReqMethod eq "PURGE"'

 

Comments

I love using Pound and Varnish with Apache. Consistently get sites in the top 1% of performance. It's really nice to see someone else doing things this way, so I know it's not just me!
