How to prevent website downtime with WordPress VIP
What does it actually mean for a site to be considered down?
Often that depends on whom you ask.
A site being “down” can mean a number of different things:
- The website is completely unavailable.
- The website is online but unusably slow.
- The website is giving error messages for certain users or locations.
- The website is working for most visitors, but some users can’t log in to the CMS to create, edit, or publish content, for example.
No matter the cause or degree, the impact of website downtime can be serious, from lost ecommerce orders and frustrated users to weakened customer trust.
In this post, we explore classic root causes of website downtime and the role WordPress VIP can play in avoiding them.
- Not enough caching
The most important thing you can do to keep a site performant and stable is to make sure that every full page that can be cached is cached. Uncached pages must be built on the server each time they are requested, which is slower and more prone to errors.
- Caching challenges
Because they demand a personalized, fully interactive experience, some sites, particularly ecommerce ones, simply can’t be cached at the page-cache level.
Often a compromise can be found whereby a static page is served by edge cache, with dynamic features (e.g., logged-in status, shopping carts) added via JavaScript. Asynchronous requests from JavaScript can then be used to communicate with a WordPress REST API endpoint designed with a much lower overhead than a full page load.
Alternatively, this is where object caching comes into play. The page can remain dynamic, but parts of the page and any data used in it can be stored in and retrieved from the object cache, which avoids querying the database each time.
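To make the object-caching idea concrete, here is a minimal, language-neutral sketch in Python. It is an illustration under stated assumptions, not how WordPress VIP implements its cache: the `ObjectCache` class, the `load_related_products` function, and the `related:42` key are all hypothetical. On a WordPress site, the same pattern would typically be written in PHP using the `wp_cache_get()` and `wp_cache_set()` functions.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ObjectCache:
    """Tiny in-memory stand-in for a persistent object cache (hypothetical)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # Maps key -> (expiry timestamp, cached value)
        self._store: Dict[str, Tuple[float, Any]] = {}
        self._clock = clock

    def get_or_compute(
        self, key: str, ttl_seconds: float, compute: Callable[[], Any]
    ) -> Any:
        """Return the cached value; call compute() only on a miss or after expiry."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._store[key] = (now + ttl_seconds, value)
        return value


db_calls = 0


def load_related_products() -> list:
    """Hypothetical expensive database query; counts how often it runs."""
    global db_calls
    db_calls += 1
    return ["mug", "t-shirt"]


cache = ObjectCache()
# Two renders of the same dynamic page fragment within the TTL window
first = cache.get_or_compute("related:42", 300, load_related_products)
second = cache.get_or_compute("related:42", 300, load_related_products)
```

In this sketch the second call is served from the cache, so the simulated database query runs only once. Choosing the TTL is a trade-off between freshness and database load.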
- Untested code deployments
This is another common culprit of website downtime and pretty easy to diagnose, based on pure cause and effect.
If site issues appear immediately after untested code is deployed, that code is the likely cause. If you can, revert the suspect code to the previous version ASAP.
The best thing to do to avoid this situation? Thoroughly test every piece of code on a separate development or staging environment before releasing to production.
- PHP errors
WordPress runs PHP code on the server. A PHP error might be “fatal,” meaning that once the error occurs, the web page, script, or command will stop running. Fatal errors almost always surface as visible errors somewhere and are recorded in the PHP logs.
Note: Some PHP warnings in PHP 7 become fatal errors in PHP 8, so it’s important to take warnings seriously too.
- Slow MySQL database queries
Every WordPress website uses a database to store website content and configuration data. Database queries fetch that content data for web pages, but sometimes those queries are written inefficiently. They may work fine for sites with only a few hundred pages, but stall when handling large amounts of data (some websites on our platform have millions of stored records).
A slow query ties up database resources, potentially impacting site stability, not just for the page, script, or command running the SQL, but across the whole application. Sites often struggle because one or more database queries are slow, for example taking longer than 0.75 seconds to execute.
That said, slow database queries can’t always be resolved simply by adding more database resources. That’s why we advise customers to monitor slow database queries using Query Monitor and New Relic. These tools highlight where in the code slow queries originate, so your development team can refactor them to optimize performance. Finally, InfoBeans’ Application Support and Premier Engineers can also help your team find and analyze these queries, and suggest ways to improve them for speed and efficiency.
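As a rough illustration of what slow-query monitoring does under the hood, the sketch below times each query and records any that exceed the 0.75-second example threshold. It is a simplified stand-in, not how Query Monitor or New Relic work internally: it uses Python's built-in `sqlite3` module in place of MySQL, and the `timed_query` helper, the `slow_queries` list, and the `posts` table are hypothetical names.

```python
import sqlite3
import time
from typing import Any, Callable, List, Sequence, Tuple

SLOW_QUERY_THRESHOLD_SECONDS = 0.75  # the example threshold from the text above

# (sql, elapsed_seconds) pairs for queries that exceeded the threshold
slow_queries: List[Tuple[str, float]] = []


def timed_query(
    conn: sqlite3.Connection,
    sql: str,
    params: Sequence[Any] = (),
    clock: Callable[[], float] = time.perf_counter,
) -> List[Any]:
    """Run a query and record it if it takes longer than the threshold."""
    start = clock()
    rows = conn.execute(sql, params).fetchall()
    elapsed = clock() - start
    if elapsed > SLOW_QUERY_THRESHOLD_SECONDS:
        slow_queries.append((sql, elapsed))
    return rows


# In-memory SQLite database used purely as a stand-in for MySQL
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO posts (title) VALUES (?)", ("Hello world",))
rows = timed_query(conn, "SELECT title FROM posts WHERE id = ?", (1,))
```

Anything that lands in `slow_queries` is a candidate for refactoring, for example by adding an index or narrowing the query. The `clock` parameter exists only so the timing can be controlled in tests.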
- Excessive database writes
Sometimes a feature, such as custom logging or tracking code, updates the database on every request. This can lead to instability for two reasons:
- Forgoing database replicas: All write queries are directed to the primary database; subsequent database queries for the same table (or tables) in the same page request will also be directed there. Because those reads can no longer be served by database replicas, the scalability of the site is limited.
- Bypassing page caching: For a database write to happen on every page request, page caching must be bypassed. But doing so means the first (and best) line of defence has been compromised.
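The first point is easier to see with a small model of per-request query routing. The sketch below is an assumption-laden illustration, not the platform's actual database layer: the `QueryRouter` class is hypothetical, the caller passes the table name explicitly rather than having it parsed from the SQL, and the `request_log` table is made up.

```python
from typing import Set

WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE"}


class QueryRouter:
    """Models routing for ONE page request (hypothetical sketch).

    Writes go to the primary. Reads of a table already written during
    this request also go to the primary; other reads can use a replica.
    """

    def __init__(self) -> None:
        self._written_tables: Set[str] = set()

    def route(self, sql: str, table: str) -> str:
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in WRITE_VERBS:
            self._written_tables.add(table)
            return "primary"
        if table in self._written_tables:
            return "primary"
        return "replica"


router = QueryRouter()
before_write = router.route("SELECT * FROM posts", "posts")
write = router.route("INSERT INTO request_log (path) VALUES ('/')", "request_log")
read_after_write = router.route("SELECT COUNT(*) FROM request_log", "request_log")
unrelated_read = router.route("SELECT * FROM posts", "posts")
```

Here a single logging write sends the later read of `request_log` to the primary as well. If that write happens on every request, the primary handles traffic that replicas could otherwise absorb.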
- Plugins
There are thousands of popular, helpful third-party plugins in the WordPress ecosystem that provide fantastic features and functionality. Some, though, have challenges scaling, potentially leading to downtime issues when added to a website with tons of content and traffic.
- Custom logging
Custom logging is a powerful debugging tool, often the only viable method to track down a bug or issue that seems to happen only on a production site. On numerous occasions, however, we’ve seen custom logging built in PHP on a high-traffic site slow things down or put a site in danger of downtime through excessive database writes.
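One common way to reduce that write pressure is to buffer log entries and write them in batches rather than one at a time. The sketch below shows the idea in Python. It is an illustration only: the `BufferedLogger` class and its `write_batch` callback are hypothetical, and because each PHP request typically runs in isolation, a real WordPress site would need a shared buffer or a long-running worker to batch across requests.

```python
from typing import Callable, List


class BufferedLogger:
    """Collects log messages and writes them in batches (hypothetical sketch)."""

    def __init__(
        self, write_batch: Callable[[List[str]], None], batch_size: int = 100
    ) -> None:
        self._write_batch = write_batch  # e.g. one multi-row INSERT per batch
        self._batch_size = batch_size
        self._buffer: List[str] = []

    def log(self, message: str) -> None:
        """Queue a message; flush automatically once the batch is full."""
        self._buffer.append(message)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Write any buffered messages in a single call."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._write_batch(batch)


batches_written: List[List[str]] = []
logger = BufferedLogger(batches_written.append, batch_size=3)
for i in range(7):
    logger.log(f"event {i}")
logger.flush()  # write the remaining partial batch, e.g. at shutdown
```

Seven log calls result in three writes instead of seven. The trade-off is that buffered entries can be lost if the process dies before a flush, so this suits diagnostics better than audit data.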
- Remote API calls
Some websites make server-side REST API calls to other applications or services. These are usually fast under normal circumstances, but sometimes the remote application responds slowly, times out, or returns an error, and the page waiting on that call suffers with it.
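A defensive pattern for remote calls is to set a short timeout and fall back to a safe default when the call fails. The Python sketch below illustrates the pattern using only the standard library. The function names, the two-second timeout, and the fallback value are assumptions made for the example; a WordPress site would apply the same idea in PHP.

```python
import json
import urllib.request
from typing import Any, Callable


def fetch_json(url: str, timeout_seconds: float) -> Any:
    """Fetch and decode JSON, giving up after timeout_seconds."""
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return json.load(response)


def call_with_fallback(fetch: Callable[[], Any], fallback: Any) -> Any:
    """Return fetch(), or the fallback if the remote call fails.

    OSError covers connection failures and timeouts raised by urllib;
    ValueError covers a response body that is not valid JSON.
    """
    try:
        return fetch()
    except (OSError, ValueError):
        return fallback
```

A call might look like `call_with_fallback(lambda: fetch_json("https://api.example.com/rates", 2.0), fallback={"rates": []})`, where the URL is a placeholder. The page then renders with empty or previously cached data instead of hanging on a slow third party.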
When your business is on the line, you can’t afford to send new business elsewhere and tarnish your brand by having your content management system (CMS) deliver a poor digital experience.
High-traffic days ought to be a cause for celebration, not a nightmare for engineers caught on the back foot, scrambling to keep the site and its applications running under load and your reputation intact.
If you want to implement WordPress VIP or find out more about it, get in touch with us.