Website errors can shatter your peace of mind, especially when you have no clue what causes them.
A typical example is the Apache 503 error on your website. Unfortunately, the web server only reports it as “service unavailable.”
The tricky part is finding the real reason the service became unavailable.
That’s why we often get requests from customers to solve Apache 503 errors as part of our Technical Support Services.
Today, we’ll see how 1 onlyhost’s Engineers solved an Apache 503 error for a customer.
More About Apache 503 service unavailable error
Firstly, let’s try to understand the important points about the Apache 503 error.
A 503 service unavailable error means that the server was temporarily unable to handle the request for the website, even though it works perfectly at other times. In Apache, this typically happens when the server is temporarily overloaded, or when there is a problem with the applications that process website data.
For example, on PHP-based websites, requests that exceed PHP timeouts or memory limits can result in a 503 error. Similarly, it can happen when the Apache web server itself cannot keep up with the incoming website requests.
Thus, the underlying reason can vary depending on the actual server setup, and that’s where our server administration experience helps.
Background of the Apache 503 error
Now, it’s time for some background on a recent request to fix an Apache 503 error.
My website runs on a Virtualmin system which hosts 6 websites. 5 of them run flawless. The 6th website (xxx.com) experiences regular timeouts (Error 503 Service Unavailable). Can you fix it please?
That was the exact request from our customer. The affected website returned the 503 Service Unavailable error in the browser.
When our Dedicated Engineers checked, we could see that the server was already patched. Apache’s Multi Processing Module (MPM) was set to Prefork, and PHP 5.x was running in FastCGI mode.
We also confirmed that the server had enough free resources to support all the websites.
Additionally, the Apache 503 error on the website was intermittent. In most cases, a reload in the browser made the website work again.
How we fixed Apache 503 error
Now, we’ll see how our Support Engineers found the real reason for the service unavailable error and fixed it. This involved multiple steps, so let’s look at each of them in detail.
1. Checking logs
As always, we began by checking the logs. Since the error was Apache related, the right place to look was the Apache log files at /usr/local/apache/logs. We searched the Apache access and error logs for the problem domain name, and found detailed errors related to the website user:
[Wed Mar 13 07:37:32.473096 2019] [fcgid:warn] [pid 14628] [client 40.xx.xx.103:5181] mod_fcgid: can't apply process slot for /home/abc/fcgi-bin/php5.fcgi
Additionally, the logs were showing these messages:
[Thu Mar 21 07:02:23.056338 2019] [fcgid:warn] [pid 24842] mod_fcgid: process 10087 graceful kill fail, sending SIGKILL
[Thu Mar 21 07:02:23.056353 2019] [fcgid:warn] [pid 24842] mod_fcgid: process 10083 graceful kill fail, sending SIGKILL
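When the error log is large, it can help to count how often each kind of mod_fcgid warning appears rather than reading lines one by one. The following is a minimal sketch in Python, assuming the Apache 2.4 style error-log layout seen in the lines above. The function names and the category labels (no_process_slot, forced_kill) are made up for illustration and are not part of Apache or mod_fcgid.

```python
import re
from collections import Counter

# Assumed layout, based on the log lines quoted above:
# [timestamp] [module:level] [pid N] [client ip:port] message
# The ":tid N" part and the [client ...] field are treated as optional.
LOG_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\] "
    r"\[(?P<module>[^:\]]+):(?P<level>[^\]]+)\] "
    r"\[pid (?P<pid>\d+)(?::tid \d+)?\] "
    r"(?:\[client (?P<client>[^\]]+)\] )?"
    r"(?P<message>.*)$"
)


def classify(message):
    """Map a mod_fcgid message to an illustrative category label."""
    if "can't apply process slot" in message:
        return "no_process_slot"
    if "graceful kill fail" in message:
        return "forced_kill"
    return "other"


def summarize_fcgid_warnings(lines):
    """Count fcgid log entries per category; other modules are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None or match.group("module") != "fcgid":
            continue
        counts[classify(match.group("message"))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[Wed Mar 13 07:37:32.473096 2019] [fcgid:warn] [pid 14628] "
        "[client 40.xx.xx.103:5181] mod_fcgid: can't apply process slot "
        "for /home/abc/fcgi-bin/php5.fcgi",
        "[Thu Mar 21 07:02:23.056338 2019] [fcgid:warn] [pid 24842] "
        "mod_fcgid: process 10087 graceful kill fail, sending SIGKILL",
    ]
    print(summarize_fcgid_warnings(sample))
```

To run it against a real log, pass an open file object instead of the sample list, since the function only needs an iterable of lines.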
2. Tweaking PHP FCGI parameters
Luckily, in this case the logs gave enough hints about the underlying cause of the Apache 503 error. The “can't apply process slot” warning indicates that mod_fcgid could not obtain a slot to start another PHP process for the request.
Firstly, we began by tweaking the FCGI parameters of the website. In simple terms, FastCGI is a protocol for connecting interactive programs, such as PHP, with a web server like Apache. Apache allows setting FastCGI limits for each website. Therefore, we increased the value of the “FcgidMaxRequestsPerProcess” parameter to 500, and tuned the related FastCGI parameters as well. The final configuration for the website looked as shown below.
IdleTimeout 3600
ProcessLifeTime 7200
IPCConnectTimeout 8
IPCCommTimeout 600
BusyTimeout 300
MaxRequestLen 15728640
FcgidMaxRequestsPerProcess 500
Again, these values depend on the amount of server resources available for Apache. After a web server restart, the 503 error cleared and the website was working again. However, since the issue was intermittent, we kept monitoring the website. Our Dedicated Engineers also configured automatic alerts, so that we could check the server at the moment an error occurred.
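One detail worth noting about the configuration above: apart from FcgidMaxRequestsPerProcess, the directives use the older unprefixed spellings. The current mod_fcgid documentation lists Fcgid-prefixed names for them and treats the old ones as deprecated. The sketch below, in Python, rewrites the old names to the prefixed ones. The helper name modernize is made up for illustration, and the sketch assumes exact-case directive names at the start of a line, so check the result against the mod_fcgid documentation for your installed version before using it.

```python
# Old directive name -> Fcgid-prefixed name, as listed in the mod_fcgid docs.
LEGACY_TO_CURRENT = {
    "IdleTimeout": "FcgidIdleTimeout",
    "ProcessLifeTime": "FcgidProcessLifeTime",
    "IPCConnectTimeout": "FcgidIPCConnectTimeout",
    "IPCCommTimeout": "FcgidIOTimeout",
    "BusyTimeout": "FcgidBusyTimeout",
    "MaxRequestLen": "FcgidMaxRequestLen",
}


def modernize(config_text):
    """Return config_text with legacy mod_fcgid directive names replaced.

    Only the first word of each line is inspected; values, indentation and
    unrelated lines are left untouched.
    """
    out = []
    for line in config_text.splitlines():
        parts = line.split(None, 1)
        if parts and parts[0] in LEGACY_TO_CURRENT:
            indent = line[: len(line) - len(line.lstrip())]
            rest = parts[1] if len(parts) > 1 else ""
            line = f"{indent}{LEGACY_TO_CURRENT[parts[0]]} {rest}".rstrip()
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    legacy = "\n".join([
        "IdleTimeout 3600",
        "IPCCommTimeout 600",
        "MaxRequestLen 15728640",
        "FcgidMaxRequestsPerProcess 500",
    ])
    print(modernize(legacy))
```

For reference, the MaxRequestLen value of 15728640 bytes is 15 × 1024 × 1024, that is, 15 MiB.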
3. Tweaking Apache
The website was working again, but that was not enough. We wanted to give the customer a complete solution. The server logs showed performance issues in Apache, which is why we suggested that the customer change the Multi Processing Module (MPM) of the Apache server too.
The solution was to switch the Apache MPM from Prefork to Worker. Prefork dedicates a single-threaded process to each connection, which makes it less efficient under load than the threaded Worker MPM. However, changing the MPM is a major web server configuration change, and it takes PHP processing out of Apache’s hands. Instead, a separate service, PHP-FPM, is needed to handle PHP.
Unfortunately, these changes come with a risk that a website application fails to work properly in the new “worker” Apache environment. Therefore, we scheduled the Apache change so that there was enough time to plan ahead, and we asked the customer to check with the developers of each website about the change in the server environment.
Finally, our Support Engineers changed the Apache MPM to Worker at off-peak hours, which minimized the business impact on the websites. Luckily, the change did not take much time, and the sites started loading fine.
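Before and after a change like this, it is useful to confirm which MPM Apache is actually running. Apache reports it in the “Server MPM:” line of its `-V` output. Here is a small sketch in Python that extracts it. The function names are illustrative, and the control command is assumed to be `apachectl`, although it may be `apache2ctl` or `httpd` depending on the distribution.

```python
import re
import subprocess


def parse_mpm(version_output):
    """Extract the MPM name from the text printed by `apachectl -V`."""
    match = re.search(r"^Server MPM:\s*(\S+)", version_output, re.MULTILINE)
    return match.group(1).lower() if match else None


def current_mpm(command=("apachectl", "-V")):
    """Run the Apache control command and return the active MPM name.

    The default command is an assumption; pass the right one for your system.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return parse_mpm(result.stdout)


if __name__ == "__main__":
    print(current_mpm())
```

Running it once before the switch and once after gives a quick check that the new MPM is really in effect.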
4. Monitoring Websites
Major changes to the Apache web server can have an impact on websites, and that’s where website monitoring helps. Our Dedicated Engineers kept the server on our watch list and ensured that it was performing up to the mark. This involved monitoring the server resources, Apache memory usage, etc. during peak hours as well.
Additionally, we always do load testing on Apache servers to anticipate server performance well in advance.
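Because an intermittent 503 can disappear on a simple reload, one request is not a reliable check. A rough way to watch for it is to request the page repeatedly and measure what fraction of the responses are 503. The sketch below uses only the Python standard library. The function names, the default number of attempts and the pause between requests are illustrative assumptions, not the monitoring setup our engineers used.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses, including 503, are raised as HTTPError.
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def rate_503(url, attempts=20, pause=0.0, fetch=fetch_status):
    """Request url `attempts` times and return the fraction of 503 replies."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    statuses = []
    for _ in range(attempts):
        statuses.append(fetch(url))
        if pause:
            time.sleep(pause)
    return statuses.count(503) / attempts
```

A call such as `rate_503("https://xxx.com/", attempts=50, pause=1.0)` could then be run from a scheduled job, with an alert raised when the returned fraction goes above a threshold you choose. The `fetch` parameter exists so that the function can be tested without network access.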
Conclusion
In short, the Apache 503 error happens mainly when there are problems with the web server settings. But, as the 503 error occurs intermittently, the fix can be tricky. Today, we saw how our Support Engineers tracked down the exact reason for an Apache 503 error and fixed it for a customer on a production server.