Basics & Architecture
What is Percolate and how does it work?
Percolate Pagespeed Optimizer is a reverse proxy that sits in front of your website, just like Varnish. On pages that can be cached, it applies a range of static optimisation techniques to greatly increase the PageSpeed score and performance of your website.
Percolate is built on top of Kubernetes. For each application, a separate Percolate instance is started on the cluster. None of the components, settings, or versions used are shared between applications. This ensures maximum stability and complete control over the update process.
Internals
Percolate consists of a few components. You should not need to be concerned with most of them, but for the purpose of understanding how Percolate works it is useful to keep them in mind.
- Proxy - Proxies requests to the application backend, creates new optimize jobs and handles API requests.
- Worker - Runs optimize jobs.
- RabbitMQ - Job queue for optimize jobs.
- Redis - Cache storage for optimized responses.
- MongoDB - Stores the optimize state for pending and handled optimize requests.
- Chromium - Render engine.
- Cypress - Test environment.
- Test server - Test server to mock behavior of a real webserver during tests.
- Imgproxy - Service for resizing and optimizing images.
Cache objects
Each URL can have several versions in cache. These versions differ on:
- URL, with the query parameters defined in `QUERY_PARAM_STRIP` stripped.
- User agent, normalized to one of the user agent types: `desktop`, `mobile`, `desktopbot` or `mobilebot`.
- Vary cookie, the value of the cookie whose name is defined in `VARY_COOKIE`.
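As an illustration, the cache key for a request could be derived from these three inputs roughly like this. Only the inputs come from the list above; the key format, parameter names and example values are assumptions, not Percolate's actual internals:

```shell
# Illustrative values for the configuration variables described above.
QUERY_PARAM_STRIP="utm_source|utm_medium"   # query parameters to strip
VARY_COOKIE="site_version"                  # cookie name that varies the cache

url="https://www.example.com/products?page=2&utm_source=mail"
ua_type="mobile"    # user agent, normalized to one of the four types
vary_value="b"      # value of the $VARY_COOKIE cookie on this request

# Strip the configured query parameters from the URL.
path="${url%%\?*}"
query="${url#*\?}"
[ "$query" = "$url" ] && query=""
kept=$(printf '%s' "$query" | tr '&' '\n' | grep -Ev "^($QUERY_PARAM_STRIP)=" | paste -sd '&' -)

normalized_url="$path${kept:+?$kept}"

# A hypothetical cache key combining all three inputs.
cache_key="$normalized_url|$ua_type|$vary_value"
echo "$cache_key"
```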
The cache object state flow:
Cache objects in the states `Optimized`, `Refresh Queued` and `Refresh Processing` will result in a cache hit.
Client request
A request from the client browser is picked up by the K8S Ingress controller and sent to your Percolate instance. The request can result in either a cache hit or a cache miss.
For a detailed explanation of when Percolate queues a resource optimisation, see what-to-optimize.
Each page gets optimized in 4 different variants:
- desktop
- mobile
- desktop bot
- mobile bot
Note: The `Vary` header is ignored when creating or fetching cache objects. This is done to ensure all clients receive a page that is optimal for their device.
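The normalization of a `User-Agent` header into these four types can be sketched like this. The match patterns below are simplified illustrations, not Percolate's actual detection rules:

```shell
# Map a User-Agent string to one of the four variant types.
# The patterns are illustrative; real bot/mobile detection is more complete.
ua_type() {
  case "$1" in
    *Googlebot*Mobile*|*Mobile*Googlebot*) echo "mobilebot" ;;
    *Googlebot*|*bingbot*)                 echo "desktopbot" ;;
    *Mobile*|*Android*|*iPhone*)           echo "mobile" ;;
    *)                                     echo "desktop" ;;
  esac
}

ua_type "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X)"
ua_type "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
ua_type "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
```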
Optimize job
The flow when an optimize job is picked up by one of the workers:
Purging
When Percolate receives a purge command, by default it does not purge content immediately (direct purge), but instead places a new optimize job into the queue (delayed purge). Because of this design decision, served pages are always optimized and served from cache.
This behavior can be changed with the `CACHE_DELAYED_PURGE` variable, so that purges are executed directly without queuing.
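For example, switching to direct purges could look like the fragment below. The value shown is an assumption; check your instance's configuration reference for the exact accepted values:

```shell
# Execute purges directly instead of queuing a delayed re-optimize job.
CACHE_DELAYED_PURGE=false
```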
All pages are tagged with the tags defined in the tags header. This header is configured using the `CACHE_TAG_HEADER` variable. A purge request must contain the HTTP header defined in `CACHE_PURGE_HEADER`, and its value should be a regex that matches all the tags to be flushed.
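To see how such a regex selects tagged cache objects, here is a small local sketch. The tag names, the regex and the header name are assumptions for illustration:

```shell
# A purge regex that should flush one specific product page and
# every page tagged with a category tag.
purge_regex='^(product-42|category-.*)$'

matches() { printf '%s' "$1" | grep -Eq "$purge_regex"; }

for tag in product-42 product-7 category-shoes; do
  if matches "$tag"; then echo "purge: $tag"; else echo "keep:  $tag"; fi
done

# The purge request then carries this regex in the configured header,
# e.g. (assuming CACHE_PURGE_HEADER is set to "X-Purge-Tags"):
#   curl -XPURGE -H 'X-Purge-Tags: ^(product-42|category-.*)$' https://www.example.com/
```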
The Percolate API accepts PURGE requests for both ad-hoc flushing of the entire cache and purging a single URL. This can be useful in certain situations:
```shell
# flush entire cache (direct purge of all resources)
curl -XPURGE https://www.example.com/.percolate/cache/flush

# purge a single url (delayed or direct purge depends on settings)
curl -XPURGE https://www.example.com/some-path
```
This can also be done using the status page: `/.percolate/status`.
Optimisation pipeline
You can control which Percolate optimisations run as part of the optimisation pipeline by enabling or disabling their respective modules. For example, if your website already lazy loads images, that module can simply be disabled with a variable.
- Render critical CSS and lazy load the regular CSS.
- Minify CSS and check whether there was a significant decrease in size. If not, the original CSS is kept.
- Optimize images by stripping metadata, compressing losslessly and serving modern formats like WebP when the browser supports them.
- Create multiple sizes of an image and inject `srcset` so images are downloaded at the sizes they are actually rendered.
- Inject `dns-prefetch` and `preconnect` link tags based on the domains used on the page.
- Minify HTML.
- For bots, pages are also rendered and the resulting HTML is saved. This produces pages that are optimal for crawlers, since they do not need to execute any JavaScript.
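The "keep minified CSS only if it is significantly smaller" check from the pipeline above can be sketched as follows. The 5% threshold and the sample CSS are assumptions:

```shell
original='body {  color:  #ff0000;  margin:  0px; }'
minified='body{color:red;margin:0}'

orig_len=${#original}
min_len=${#minified}

# Keep the minified version only if it saves at least 5% of the bytes;
# otherwise fall back to the original stylesheet.
if [ $(( min_len * 100 )) -le $(( orig_len * 95 )) ]; then
  result="$minified"
else
  result="$original"
fi
echo "$result"
```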