Implementing the Nginx PageSpeed module's domain sharding on a Drupal site

Web browsers limit the number of concurrent connections to each host by default. When that limit is reached, the remaining resources (CSS, images, etc.) wait in a queue until earlier downloads complete. This results in slow page loads and a poor user experience. One way to reduce page load time is domain sharding. This technique splits resource downloads across multiple sub-domains (e.g. static1.webfoobar.com, static2.webfoobar.com), which increases the number of simultaneous connections. Although fetching many resources concurrently over a single connection is one of the main features of the newer HTTP/2, domain sharding is still useful for those who don't have the budget for the SSL certificate that browsers require for HTTP/2.
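
To make the idea concrete, here is a minimal Python sketch of how a resource URL could be mapped to one of several shard hosts. It assumes a simple hash-based assignment: the `pick_shard` helper, the `www.webfoobar.com` origin and the CRC32 choice are my own illustrative assumptions, not a description of how PageSpeed selects a shard internally. The point it illustrates is that the mapping should be deterministic, so a given resource is always requested from the same sub-domain and stays cacheable.

```python
import zlib
from urllib.parse import urlsplit, urlunsplit

# Shard hosts taken from the example sub-domains in the text.
SHARDS = ["static1.webfoobar.com", "static2.webfoobar.com"]


def pick_shard(url, shards=SHARDS):
    """Return `url` with its host replaced by a shard chosen from a hash of the path."""
    parts = urlsplit(url)
    # CRC32 is stable across runs, unlike Python's built-in hash() for strings.
    index = zlib.crc32(parts.path.encode("utf-8")) % len(shards)
    return urlunsplit(
        (parts.scheme, shards[index], parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    # Hypothetical resource paths, for illustration only.
    for path in ("/sites/default/files/a.png", "/sites/default/files/b.png"):
        print(pick_shard("http://www.webfoobar.com" + path))
```

Calling `pick_shard` twice with the same URL always yields the same shard host, which is the property that matters for browser caching.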

This article shows how to implement the Nginx PageSpeed module's domain sharding on a Drupal site. Please follow this article for instructions on compiling Nginx with PageSpeed. The Nginx configuration presented here builds on the configuration discussed here, which sets Nginx up as a reverse proxy. The initial PageSpeed settings can be copied from this article.

I set up a Drupal site and created a node that demonstrates PageSpeed domain sharding (http://test7.webfoobar.com/node/2). The node contains 200 image resources so that the effect of domain sharding is easy to see in testing. It is actually an 800x400 image cut into 200 small 40x40 images, arranged in 10 rows and 20 columns and glued together with CSS. I disabled all PageSpeed image optimizations (such as image spriting and conversion to WebP) to get a clean demonstration of domain sharding.
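
For readers who want to reproduce a similar test page, the following Python sketch generates markup for such a grid. It is only an approximation of the idea: the `build_markup` helper, the `tile-<row>-<column>.png` file names, the base path and the inline styles are all assumptions of mine, not the markup used on the demo node.

```python
TILE = 40      # tile edge in pixels
COLUMNS = 20   # 20 x 40 px = 800 px wide
ROWS = 10      # 10 x 40 px = 400 px high


def build_markup(base_url="/sites/default/files/tiles"):
    """Return HTML for a ROWS x COLUMNS grid of tile images (hypothetical file names)."""
    rows = []
    for row in range(ROWS):
        imgs = "".join(
            f'<img src="{base_url}/tile-{row}-{col}.png" '
            f'width="{TILE}" height="{TILE}" alt="">'
            for col in range(COLUMNS)
        )
        # font-size:0 removes the inline whitespace gap between images.
        rows.append(
            f'<div class="tile-row" style="height:{TILE}px;font-size:0">{imgs}</div>'
        )
    return "\n".join(rows)


if __name__ == "__main__":
    print(build_markup())
```

The output can be pasted into a node body whose text format allows `<div>` and `<img>` tags.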

I used four sub-domains for my domain sharding demo. An experiment by Yahoo! suggests that two to four sub-domains is the optimal range, and that using more than this only degrades performance. The sub-domains are:

  • st7a.webfoobar.com
  • st7b.webfoobar.com
  • st7c.webfoobar.com
  • st7d.webfoobar.com
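
As a rough back-of-the-envelope illustration of why a few shards help, the sketch below estimates how many sequential "rounds" of downloads a page needs. It assumes a limit of 6 connections per host (a common browser default, but an assumption here) and that each connection fetches one resource at a time. It deliberately ignores the DNS lookup and connection setup cost of each extra host, which is the reason more shards are not always better.

```python
import math


def download_rounds(resources, hosts, per_host_limit=6):
    """Estimate sequential download rounds for `resources` files spread over `hosts`."""
    parallel = hosts * per_host_limit  # total simultaneous connections
    return math.ceil(resources / parallel)


if __name__ == "__main__":
    for hosts in (1, 2, 4):
        print(hosts, "host(s):", download_rounds(200, hosts), "rounds")
```

With the 200 images of the demo node, this simplified model gives 34 rounds for a single host and 9 rounds for four shards.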

Steps:

  1. Open the domain's DNS manager and create CNAME records:

    • st7a.webfoobar.com
    • st7b.webfoobar.com
    • st7c.webfoobar.com
    • st7d.webfoobar.com

    All are aliases of test7.webfoobar.com.

  2. Create the Nginx configuration for the four static sub-domains. It should only serve static files (such as CSS, images, etc.) and prevent the static sub-domains from outputting any Drupal content (to avoid a duplicate content penalty).

    
    vi /etc/nginx/sites-available/st7x.webfoobar.com.conf
    
    

    ... add the following to it:

    
    # Static domains  
    server {
      ## Replace XXX.XXX.XXX.XXX with your server's IPv4 address
      listen XXX.XXX.XXX.XXX:80;
      ## Replace XXXX:XXXX::XXXX:XXXX:XXXX:XXXX with your server's IPv6 address
      listen [XXXX:XXXX::XXXX:XXXX:XXXX:XXXX]:80;
      server_name st7a.webfoobar.com st7b.webfoobar.com st7c.webfoobar.com st7d.webfoobar.com;
      access_log off;
      error_log  /var/log/nginx/error_log error;
      root /var/www;
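      ## Drupal generated static files. The handler file defines its own
      ## "location /" block and named locations, so it must be included
      ## at the server level rather than inside another location.
      include apps/drupal/static_files_handler.conf;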
    }
    
    
  3. Create the file /etc/nginx/apps/drupal/static_files_handler.conf and add the following to it:

    
    location / {
      ## Regular private file serving (i.e. handled by Drupal).
      location ^~ /system/files/ {
        proxy_pass http://phpapache;
        proxy_http_version 1.1; # keep alive to the Apache upstream
        proxy_set_header Connection '';
        ## Rewrite the 'Host' header to the value in the client request,
        ## or primary server name
        proxy_set_header Host $host;
        ## For not signaling a 404 in the error log whenever the
        ## system/files directory is accessed add the line below.
        ## Note that the 404 is the intended behavior.
        log_not_found off;
      }
      ## Trying to access private files directly returns a 404.
      location ~ ^/sites/[\.\-[:alnum:]]+/files/private/ {
        internal;
      }
      ## Support for the file_force module
      ## http://drupal.org/project/file_force.
      location ^~ /system/files_force/ {
        proxy_pass http://phpapache;
        proxy_http_version 1.1; # keep alive to the Apache upstream
        proxy_set_header Connection '';
        ## Rewrite the 'Host' header to the value in the client request,
        ## or primary server name
        proxy_set_header Host $host;
        ## For not signaling a 404 in the error log whenever the
        ## system/files directory is accessed add the line below.
        ## Note that the 404 is the intended behavior.
        log_not_found off;
      }
      ## If accessing an image generated by Drupal imagecache, serve it
      ## directly if available, if not relay the request to Drupal to (re)generate
      ## the image.
      location ~* /imagecache/ {
        ## Image hotlinking protection. If you don't want hotlinking
        ## protection for your images comment out the following line.
        include hotlinking_protection.conf;
        access_log off;
        expires 30d;
        try_files $uri $uri/ @drupal-noexp;
      }
      ## Drupal generated image handling, i.e., imagecache in core. See:
      ## http://drupal.org/node/371374.
      location ~* /files/styles/ {
        ## Image hotlinking protection. If you don't want hotlinking
        ## protection for your images comment out the following line.
        include hotlinking_protection.conf;
        access_log off;
        expires 30d;
        try_files $uri $uri/ @drupal-noexp;
      }
      ## Advanced Aggregation module CSS/JS
      ## support. http://drupal.org/project/advagg.
      location ~ ^/sites/[\.\-[:alnum:]]+/files/advagg_(?:css|js)/ {
        expires max;
        gzip_static on;
        add_header ETag '';
        add_header Accept-Ranges '';
        # Set a far future Cache-Control header to 52 weeks.
        add_header Cache-Control 'max-age=31449600, no-transform, public';
        location ~* (?:css|js)[_\-[:alnum:]]+\.(?:css|js)(\.gz)?$ {
          access_log off;
          try_files $uri $uri/ @drupal-noexp;
        }
      }
      ## All static files will be served directly.
      location ~* ^.+\.(?:css|cur|js|jpe?g|gif|htc|ico|png|htm|html|xml|txt|otf|ttf|eot|woff|svg|webp|webm|zip|gz|tar|rar)$ {
        access_log off;
        expires 30d;
        ## No need to bleed constant updates. Send the whole shebang in one
        ## fell swoop.
        tcp_nodelay off;
        ## Set the OS file cache.
        open_file_cache max=3000 inactive=120s;
        open_file_cache_valid 45s;
        open_file_cache_min_uses 2;
        open_file_cache_errors off;
        try_files $uri $uri/ @drupal-noexp;
      }
      ## PDFs and powerpoint files handling.
      location ~* ^.+\.(?:pdf|pptx?)$ {
        access_log off;
        expires 30d;
        ## No need to bleed constant updates. Send the whole shebang in one
        ## fell swoop.
        tcp_nodelay off;
        try_files $uri $uri/ @drupal-noexp;
      }
      ## MP3 and Ogg/Vorbis files are served using AIO when supported. Your OS must support it.
      location ~ ^/sites/[\.\-[:alnum:]]+/files/audio/mp3 {
        location ~* .*\.mp3$ {
          access_log off;
          directio 4k; # for XFS
          ## If you're using ext3 or similar uncomment the line below and comment the above.
          #directio 512; # for ext3 or similar (block alignments)
          tcp_nopush off;
          aio on;
          output_buffers 1 2M;
          try_files $uri $uri/ @drupal;
        }
      }
      location ~ ^/sites/[\.\-[:alnum:]]+/files/audio/ogg {
        location ~* .*\.ogg$ {
          access_log off;
          directio 4k; # for XFS
          ## If you're using ext3 or similar uncomment the line below and comment the above.
          #directio 512; # for ext3 or similar (block alignments)
          tcp_nopush off;
          aio on;
          output_buffers 1 2M;
          try_files $uri $uri/ @drupal;
        }
      }
      ## Pseudo streaming of FLV files:
      ## http://wiki.nginx.org/HttpFlvStreamModule.
      ## If pseudo streaming isn't working, try to comment
      ## out in nginx.conf line with:
      ## add_header X-Frame-Options SAMEORIGIN;
      location ~ ^/sites/[\.\-[:alnum:]]+/files/video/flv {
        location ~* .*\.flv$ {
          access_log off;
          flv;
          try_files $uri $uri/ @drupal;
        }
      }
      ## Pseudo streaming of H264/AAC files. This requires an Nginx
      ## version greater or equal to 1.0.7 for the stable branch and
      ## greater or equal to 1.1.3 for the development branch.
      ## Cf. http://nginx.org/en/docs/http/ngx_http_mp4_module.html.
      location ~ ^/sites/[\.\-[:alnum:]]+/files/video/mp4 { # videos
        location ~* .*\.(?:mp4|mov)$ {
          access_log off;
          mp4;
          mp4_buffer_size 1M;
          mp4_max_buffer_size 5M;
          try_files $uri $uri/ @drupal;
        }
      }
      location ~ ^/sites/[\.\-[:alnum:]]+/files/audio/m4a { # audios
        location ~* .*\.m4a$ {
          access_log off;
          mp4;
          mp4_buffer_size 1M;
          mp4_max_buffer_size 5M;
          try_files $uri $uri/ @drupal;
        }
      }
      ## Advanced Help module makes each module provided README available.
      location ^~ /help/ {
        location ~* ^/help/[^/]*/README\.txt$ {
          access_log off;
          proxy_pass http://phpapache;
          proxy_http_version 1.1; # keep alive to the Apache upstream
          proxy_set_header Connection '';
          ## Rewrite the 'Host' header to the value in the client request,
          ## or primary server name
          proxy_set_header Host $host;
        }
      }
      ## Replicate the Apache <FilesMatch> directive of Drupal standard
      ## .htaccess. Disable access to any code files. Return a 404 to curtail
      ## information disclosure. Hide also the text files.
      location ~* ^(?:.+\.(?:htaccess|make|txt|engine|inc|info|install|module|profile|po|pot|sh|.*sql|test|theme|tpl(?:\.php)?|xtmpl)|code-style\.pl|/Entries.*|/Repository|/Root|/Tag|/Template)$ {
        return 404;
      }
    }
    ## Restrict access to the strictly necessary PHP files. Reducing the
    ## scope for exploits. Handling of PHP code and the Drupal event loop.
    location @drupal {
      proxy_pass http://phpapache;
      proxy_http_version 1.1; # keep alive to the Apache upstream
      proxy_set_header Connection '';
      ## Rewrite the 'Host' header to the value in the client request,
      ## or primary server name
      proxy_set_header Host $host;
      ## Proxy microcache
      include microcache_proxy.conf;
      ## The Cache-Control and Expires headers should be delivered untouched
      ## from the upstream to the client.
      proxy_ignore_headers Cache-Control Expires;
      ## To avoid any interaction with the cache control headers we expire
      ## everything on this location immediately.
      expires epoch;
    }
    ## Same proxying as @drupal, but without 'proxy_ignore_headers' and
    ## 'expires epoch'. Used as the fallback for static file locations.
    location @drupal-noexp {
      proxy_pass http://phpapache;
      proxy_http_version 1.1; # keep alive to the Apache upstream
      proxy_set_header Connection '';
      ## Rewrite the 'Host' header to the value in the client request,
      ## or primary server name
      proxy_set_header Host $host;
      ## Proxy microcache.
      include microcache_proxy.conf;
    }
    ## Disallow access to .bzr, .git, .hg, .svn, .cvs directories
    ## Return 404 as not to disclose information.
    location ^~ /.bzr {
      return 404;
    }
    location ^~ /.git {
      return 404;
    }
    location ^~ /.hg {
      return 404;
    }
    location ^~ /.svn {
      return 404;
    }
    location ^~ /.cvs {
      return 404;
    }
    ## Disallow access to patches directory.
    location ^~ /patches {
      return 404;
    }
    ## Disallow access to drush backup directory.
    location ^~ /backup {
      return 404;
    }
    ## Disable access logs for robots.txt.
    location = /robots.txt {
      access_log off;
      ## Add support for the robotstxt module
      ## http://drupal.org/project/robotstxt.
      try_files $uri $uri/ @drupal;
    }
    ## RSS feed support.
    location = /rss.xml {
      try_files $uri $uri/ @drupal;
    }
    ## XML Sitemap support.
    location = /sitemap.xml {
      try_files $uri $uri/ @drupal;
    }
    ## Support for favicon.
    ## Return an 1x1 transparent GIF if it doesn't exist.
    location = /favicon.ico {
      expires 30d;
      try_files /favicon.ico @empty;
    }
    ## Return an in memory 1x1 transparent GIF.
    location @empty {
      expires 30d;
      empty_gif;
    }
    ## Any other attempt to access PHP files returns a 404.
    location ~* ^.+\.php$ {
      return 404;
    }
    
    

    The configuration above is meant to serve only static files on the static sub-domains.

  4. Let's now create the Nginx configuration for the Drupal site domain test7.webfoobar.com:

    
    vi /etc/nginx/sites-available/test7.webfoobar.com.conf
    
    

    ... add the following to it:

    
    # Drupal site domain test7.webfoobar.com
    server {
      ## Replace XXX.XXX.XXX.XXX with your server's IPv4 address
      listen XXX.XXX.XXX.XXX:80;
      ## Replace XXXX:XXXX::XXXX:XXXX:XXXX:XXXX with your server's IPv6 address
      listen [XXXX:XXXX::XXXX:XXXX:XXXX:XXXX]:80;
      server_name test7.webfoobar.com;
      pagespeed ShardDomain http://test7.webfoobar.com http://st7a.webfoobar.com,http://st7b.webfoobar.com,http://st7c.webfoobar.com,http://st7d.webfoobar.com;
      access_log off;
      error_log  /var/log/nginx/error_log error;
      root /var/www;
      index index.php;
    
      include apps/drupal/drupal.conf;
    }
    
    
  5. Now, enable the static domains and Drupal site domain:

    
    ln -s /etc/nginx/sites-available/test7.webfoobar.com.conf /etc/nginx/sites-enabled/test7.webfoobar.com.conf
    ln -s /etc/nginx/sites-available/st7x.webfoobar.com.conf /etc/nginx/sites-enabled/st7x.webfoobar.com.conf
    systemctl restart nginx
    
    
  6. In the Apache configuration file /etc/httpd/conf/httpd.conf, add the following lines below the existing ServerAlias test7.webfoobar.com line:

    
    ServerAlias st7a.webfoobar.com
    ServerAlias st7b.webfoobar.com
    ServerAlias st7c.webfoobar.com
    ServerAlias st7d.webfoobar.com
    
    

    So it will look something like this:

    
    ServerAlias test7.webfoobar.com
    ServerAlias st7a.webfoobar.com
    ServerAlias st7b.webfoobar.com
    ServerAlias st7c.webfoobar.com
    ServerAlias st7d.webfoobar.com
    DocumentRoot /var/www/
    ErrorLog /var/log/httpd/error_log
    DirectoryIndex index.php
    ...
    
    

    Restart Apache:

    
    systemctl restart httpd
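
Once the steps above are done, it can be worth confirming that the CNAME records from step 1 are live before testing. The following Python sketch compares the addresses each shard host resolves to with those of the main host. The `resolve` and `check_shards` helpers are names I made up for this illustration, and the check only looks at DNS, not at what Nginx serves.

```python
import socket

MAIN_HOST = "test7.webfoobar.com"
SHARD_HOSTS = [f"st7{letter}.webfoobar.com" for letter in "abcd"]


def resolve(host):
    """Return the sorted list of IP addresses that `host` resolves to."""
    infos = socket.getaddrinfo(host, 80, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def check_shards(main_host=MAIN_HOST, shard_hosts=SHARD_HOSTS, resolver=resolve):
    """Map each shard host to True if it resolves to the same addresses as the main host."""
    expected = resolver(main_host)
    return {host: resolver(host) == expected for host in shard_hosts}


if __name__ == "__main__":
    for host, ok in check_shards().items():
        print(f"{host}: {'OK' if ok else 'MISMATCH'}")
```

The `resolver` parameter exists so the comparison logic can be exercised without real DNS lookups, for example with a dictionary-backed function.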
    
    

The following are the results of domain sharding tests performed using webpagetest.org:

Waterfall View of test7.webfoobar.com without PageSpeed domain sharding (http://www.webpagetest.org/result/160507_TN_CZQ)

Waterfall view no domain sharding

Waterfall View of test7.webfoobar.com with PageSpeed domain sharding enabled (http://www.webpagetest.org/result/160507_VY_EZ0)

Waterfall view with domain sharding
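
Besides the waterfall charts, the effect can also be checked from the command line by counting which hosts the image URLs in the served page point to. The Python sketch below does that with the standard library only. The `ImageHostCounter` class and `count_image_hosts` function are hypothetical helper names, and the exact distribution across the four sub-domains depends on PageSpeed, so I make no claim about the numbers it prints.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlsplit


class ImageHostCounter(HTMLParser):
    """Count <img> tags per host name found in their src attribute."""

    def __init__(self):
        super().__init__()
        self.hosts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        host = urlsplit(src).netloc
        if host:  # relative URLs have no host and are skipped
            self.hosts[host] += 1


def count_image_hosts(html):
    """Return a dict mapping host name to the number of images served from it."""
    parser = ImageHostCounter()
    parser.feed(html)
    return dict(parser.hosts)


if __name__ == "__main__":
    from urllib.request import urlopen

    with urlopen("http://test7.webfoobar.com/node/2") as response:
        page = response.read().decode("utf-8", errors="replace")
    for host, total in sorted(count_image_hosts(page).items()):
        print(host, total)
```

If sharding is active, the image URLs should list the st7a to st7d sub-domains rather than test7.webfoobar.com.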
