Site is Down: Diagnostic Reference

This reference lists techniques for diagnosing a site that is down or unstable for an unknown reason. If they do not resolve the problem, please submit a ticket and include all log files and the result of each diagnostic step described below.

Checking Log Files

  • Follow the article Identify Important Log Files, then look for errors and warnings that can help you understand what has happened to your instance, tools, and applications.
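
As a starting point for that review, the following minimal sketch counts suspicious lines in each log file you pass to it. The function name count_log_problems and the search pattern (error, warn, fatal) are assumptions made for illustration, not part of any Engine Yard tooling; adjust the pattern to the formats your logs actually use.

```shell
#!/bin/sh
# Sketch: report how many lines in each log file look like errors or warnings.
# The pattern is an assumption; tune it to your own log formats.
count_log_problems() {
    for file in "$@"; do
        if [ -r "$file" ]; then
            # -c counts matching lines, -i ignores case, -E enables alternation.
            # "|| true" keeps the zero-match case from being treated as a failure.
            matches=$(grep -ciE 'error|warn|fatal' "$file" || true)
            printf '%s: %s\n' "$file" "$matches"
        else
            printf '%s: unreadable\n' "$file"
        fi
    done
}
```

For example, count_log_problems /data/myapp/current/log/production.log /var/log/mysqld.log prints one count per file, which helps you decide which log to read first.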

Checking the Dashboard

Can you access the Dashboard?

  • If not, go to http://status.engineyard.com/ to check whether any recent incidents have been reported for the Engine Yard Cloud platform.
    • To be notified immediately of critical incidents at Engine Yard Cloud, click the Subscribe to Updates button and select how you want to be notified: email, SMS, Twitter, or RSS feed.

  • If yes, log in and check whether your application shows red status circles, which usually indicate problems with the instance.
    • Check the Base log (log output for Engine Yard’s Chef scripts) and the Custom log (log output for your custom Chef scripts). You can find configuration problems in the logs.
      • Action: Fix the configuration problems and click Apply to re-run the scripts.
    • Review the Alerts on the Environment page. They can indicate whether the instance has issues with its resources.
      • Note: Alerts are enabled by default, but to receive email notifications you must enable email alerts.
      • Action: If your application is using too many resources and causing alerts, you can move to a larger instance size or reconfigure your environment so that no single instance uses too many resources.
    • Click View Log and review the entries to ensure that the most recent deployment logged was successful.

Checking the Environment

If the Dashboard hasn't clarified the root cause of the problem, SSH into your instance and check these items:

  • Go to /data/<appname>/current/log and view your application log. This log indicates if there are problems with your running application.
$ cd /data/myapp/current/log 
$ tail production.log
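
If the tail output is too noisy, a filter such as the sketch below can narrow it to server errors. It assumes a Rails-style log in which each finished request is written as a line like "Completed 500 Internal Server Error in 12ms"; the function name recent_server_errors and the default of 1000 lines are illustrative choices, not part of the platform.

```shell
#!/bin/sh
# Sketch: show recent 5xx responses from a Rails-style application log.
# Assumes request lines of the form "Completed <status> <text> in <time>".
recent_server_errors() {
    log_file=$1
    line_count=${2:-1000}   # how many trailing lines to inspect (assumed default)
    # "|| true" avoids reporting a failure when no 5xx lines are present.
    tail -n "$line_count" "$log_file" | grep -E 'Completed 5[0-9]{2} ' || true
}
```

For example, recent_server_errors /data/myapp/current/log/production.log 5000 inspects the last 5000 lines. No output means no 5xx responses were logged in that window.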

Checking the Cluster

If you have a cluster of instances:

  1. Confirm that HAProxy is running: $ ps ax | grep haproxy
  2. If HAProxy is not running, run sudo /etc/init.d/haproxy start.
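
Note that a plain grep over ps output usually lists the grep command itself, which can make a stopped HAProxy look as if it were running. The sketch below filters that entry out. The function name running_processes is hypothetical, and it reads ps output from standard input so it can be tried on any text.

```shell
#!/bin/sh
# Sketch: filter `ps ax` output for a process name, ignoring grep's own entry.
running_processes() {
    # Reads process listing lines on stdin and prints those that mention $1.
    # "|| true" keeps an empty result from being reported as a failure.
    grep -- "$1" | grep -v 'grep' || true
}
```

Usage: ps ax | running_processes haproxy. Empty output suggests that no HAProxy process is running.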

Checking the Web Server

  1. On the application instances, ensure that Nginx is running:
$ sudo /etc/init.d/nginx status
* status: started
  2. If Nginx is not running, run sudo /etc/init.d/nginx start.
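
The check-then-start pattern used here for Nginx (and above for HAProxy) can be wrapped in a small helper. This is a minimal sketch under two assumptions: the init script's status action exits non-zero when the service is stopped, and the caller already has the privileges that sudo provides in the commands above. The function name ensure_service and the INIT_DIR override are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch: start an init.d service only when its status check fails.
# INIT_DIR is a hypothetical override; it defaults to /etc/init.d.
# Assumes "status" exits non-zero when the service is stopped.
ensure_service() {
    service_script="${INIT_DIR:-/etc/init.d}/$1"
    if "$service_script" status >/dev/null 2>&1; then
        echo "$1 is already running"
    else
        echo "$1 is not running; starting it"
        "$service_script" start
    fi
}
```

Usage, from a root shell: ensure_service nginx. Leaving an already running service untouched avoids dropping active connections during diagnosis.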

Checking the Workers

If Passenger is in use:

  1. On the application instances, confirm Passenger is running:
$ passenger-status
  2. If Passenger is not running, restart Nginx:
$ sudo /etc/init.d/nginx restart

If Sidekiq is in use:

  • Look for entries in the log files that show whether the workers are affected. Follow the wiki article Sidekiq Logging for guidance on finding the logs.

Checking the Database

1. On your single instance or database instance, confirm that the database is running:

  • For a PostgreSQL 9.1 database:

    $ sudo /etc/init.d/postgresql-9.1 status

    If PostgreSQL is not running, start it:

    $ sudo /etc/init.d/postgresql-9.1 start
  • For a MySQL database:

    $ sudo /etc/init.d/mysql status 

    If MySQL is not running, start it:

    $ sudo /etc/init.d/mysql start 

2. Investigate pending database processes on the server by following the article Checking for Connections.

3. Look for errors reported by the database in the log files, following the article Troubleshoot Your Database.

  • For MySQL: the error log is typically located at /var/log/mysqld.log on your instance; the path is defined by the log_error configuration variable. MySQL server log files are usually named in the form mysql.nameOfLogFile.
  • For PostgreSQL: review the Product Documentation Guide for complete information on log file locations and configuration.
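
If the MySQL error log is not at the typical path, the configured location can be read from the option file. The sketch below assumes the setting appears as a log_error or log-error line (MySQL option files accept both spellings); the function name mysql_error_log_path is hypothetical, and the location of the option file on your instance is something you need to confirm.

```shell
#!/bin/sh
# Sketch: print the error-log path set in a MySQL option file.
# Matches both "log_error = ..." and "log-error = ..." and ignores commented lines.
mysql_error_log_path() {
    # If the setting appears more than once, print the last occurrence.
    sed -n 's/^[[:space:]]*log[-_]error[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}
```

Usage: mysql_error_log_path /etc/mysql/my.cnf, where the path is an assumption. On a running server, the SQL statement SHOW VARIABLES LIKE 'log_error' reports the value actually in effect.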
