Troubleshooting Applications

Page last updated:

Note: The procedures in this topic refer to versions of Cloud Foundry that use the older Droplet Execution Agent (DEA) architecture instead of the newer Diego architecture.

Tracking Down an Application by CPU Usage

Applications do not always functions as designed. If you notice a sharp decrease in available staging VMs or an unanticipated increase in CPU, disk utilization, or memory utilization across the VMs in your deployment, use the following steps to determine if a specific application is at fault.

  1. Run bosh ssh to open a secure shell into the DEA node that is experiencing the unexpected high resource usage.

  2. Install the htop interactive process viewer. In a terminal window, run htop to launch the application.

    Htop open

  3. Press the F6 key and use the arrow keys to sort the htop output by the “TIME+” column. Processes using excessive resources display a TIME+ value significantly greater than other processes. Note the names of each of these processes.

    Htop name

  4. Press the F5 key to view all the processes as a tree.

  5. Press the / key to search by the name of the processes. Locate the names of any processes found to be using excessive resources. Note the wshd command above each of these processes. Each of these wshd commands is a Warden handle for an application.

    In this example, the wshd command is 17vg99ooo1n:

    Htop find

  6. Press the F10 key to exit htop.

  7. Run sudo su - to enter the root environment with root privileges.

  8. Search /var/vcap/data/dea_next/db/instances.json to find the Warden handles identified using htop. The JSON object containing a Warden handle also contains the application GUID and name.

    In this example, the JSON object containing the Warden handle 17vg99ooo1n also contains the application GUID a8a5fb5e-f41d-49b5-af18-576a28f80e7f and name console-workers.

    Htop log handle

    Htop log name

Accessing a Warden Container

Cloud Foundry runs each application in an isolated environment called a Warden container. When troubleshooting an issue with an application, you might want direct access to the Warden container running the application.

To access a Warden container:

  1. Run bosh ssh to open a secure shell into the DEA node containing the application you want to troubleshoot.

  2. Use the process described in Tracking Down an Application by CPU Usage to find the Warden handle for the application.

  3. On the DEA node, use the following command to open an interactive shell in the Warden container running your application:

/var/vcap/packages/warden/warden/src/wsh/wsh --socket /var/vcap/data/warden/depot/WARDEN-HANDLE/run/wshd.sock --user vcap

$ /var/vcap/packages/warden/warden/src/wsh/wsh --socket /var/vcap/data/warden/depot/17vg99ooo1n/run/wshd.sock --user vcap

password for vcap: ********
Logged into vcap@17vg99ooo1n
$

Preserving a Warden Container

By default, a Warden container exists only as long as the application it runs exists. If an application consistently crashes while starting, you might not be able to access the Warden container running the application without changing this default behavior.

To keep a Warden container from being deleted even when the application it runs has stopped, set the container_grace_time parameter in the Warden linux.yml file to ~.

---
server:
  container_klass: Warden::Container::Linux

  container_grace_time: ~

  unix_domain_permissions: 0777
  . . .

Monitoring NATS Message Bus Traffic

To help troubleshoot an issue with an application, you can monitor NATS message bus traffic related to the application. To do this:

  1. Run cf target to target the space containing the application.

  2. Run cf app APPLICATION-NAME --guid to show the GUID of the application.

    $ cf app my-new-app --guid
    
    0a709d5b-0d9b-400b-900a-81594583b634
    
  3. Run bosh ssh to open a secure shell into a NATS node.

  4. Run nats-sub -s "nats://${NATS_USERNAME}:${NATS_PASSWORD}@${NATS_HOST}:${NATS_PORT}" ">" | grep APPLICATION-GUID to subscribe to NATS and show messages related to the application.

Viewing the HM9000 Data Store

The HM9000 data store contains information about every VM and application in a deployment. Reviewing this data store can provide insights into the state of a Cloud Foundry deployment as an aid to troubleshooting.

Use the following processes to view the contents of the HM9000 data store.

  1. Run bosh ssh hm9000_z1/0 to open a secure shell into the HM9000 VM.
  2. Run /var/vcap/packages/hm9000/hm9000 dump --config=/var/vcap/jobs/hm9000/config.json > /tmp/hm9000ds. This command outputs the data store into a text file, /tmp/hm9000ds.

Accessing Apps with SSH

If you need to troubleshoot an instance of an app, you can gain SSH access to the app with the Diego SSH. See the Accessing Apps with Diego-SSH topic for details about using the commands, or the Diego SSH Overview for broader details and procedures.

View the source for this page in GitHub