UAA Performance Metrics

Introduction

User Account and Authentication (UAA) emits metrics continuously. These metrics can help you understand performance over time, troubleshoot problems, and assess the health of your installation in real time. UAA emits these metrics through Loggregator.

This topic describes different metrics that UAA, virtual machines (VMs), and Java Virtual Machines (JVMs) emit.

Understanding UAA Performance

These tables explain the different types of UAA and UAA-related metrics you can view. The tables cover three metric areas: global performance metrics, virtual machine vitals, and Java Virtual Machine vitals.

Each table provides the following information:

  • Name: The name of the metric.
  • Type: How the metric is displayed (for example, counters, gauges, or timers).
  • Description: An explanation of what values this metric displays.
  • Example: A code sample of this metric’s output.
  • Indicator: A discussion of what changes in the metric’s value may indicate over time.
  • Status: Active metrics may change between metric emissions, whereas static metrics are fixed and do not change.
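
The entries in the Example column are StatsD-style lines of the form name:value|type, where c marks a counter and g marks a gauge. The following is a minimal Python sketch for splitting such a line into its parts. The names Metric and parse_metric_line are hypothetical and are not part of UAA.

```python
from typing import NamedTuple


class Metric(NamedTuple):
    name: str
    value: float
    kind: str  # "c" for a counter, "g" for a gauge, as in the Example column


def parse_metric_line(line: str) -> Metric:
    """Split a StatsD-style line such as 'uaa.server.inflight.count:1|g'."""
    name, _, rest = line.strip().partition(":")
    value_text, _, kind = rest.partition("|")
    if not name or not value_text or not kind:
        raise ValueError(f"not a name:value|type line: {line!r}")
    return Metric(name, float(value_text), kind)


if __name__ == "__main__":
    print(parse_metric_line("uaa.requests.global.completed.time:60|g"))
```

The sketch handles only the single-value form shown in this topic and raises ValueError for anything else.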

Note: If you consume UAA metrics through the Firehose, incremental metrics (in other words, metrics that capture an increment or decrement in a value since the last emission) are expressed as cumulative values. For more information, see statsd-injector.
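
If you read cumulative values from the Firehose, you can recover the per-interval change by subtracting consecutive readings. The following Python sketch shows one way to do this. It assumes that the readings are ordered by time and that the value was not reset between them, for example by a restart. The function name deltas_from_cumulative is hypothetical.

```python
from typing import Iterable, List


def deltas_from_cumulative(samples: Iterable[float]) -> List[float]:
    """Return the change between each pair of consecutive cumulative readings."""
    values = list(samples)
    # Pair each reading with the one that follows it and subtract.
    return [later - earlier for earlier, later in zip(values, values[1:])]


if __name__ == "__main__":
    # Illustrative readings only, not real UAA output.
    print(deltas_from_cumulative([10, 15, 15, 22]))  # prints [5, 0, 7]
```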

Global Performance Metrics

This section describes metrics that UAA emits.

Name | Type | Description | Example | Indicator | Status
requests.global.completed.count | Counter | Number of HTTP requests the server has processed since the last metric emission. This metric includes all requests sent to the server, including health checks. | uaa.requests.global.completed.count:1|c | You can use this metric to calculate request load and throughput. | Active
requests.global.completed.time | Gauge | Average time in milliseconds spent per HTTP request. This metric is calculated as an average across all completed requests, including health checks. | uaa.requests.global.completed.time:60|g | A rise may indicate problems with the server or database. | Active
server.inflight.count | Gauge | Number of requests the server is currently processing (also known as in-flight requests). | uaa.server.inflight.count:1|g | If this number climbs continuously, the servers may be saturated and unable to handle the incoming load. | Active
requests.global.unhealthy.count | Counter (gauge in 4.9.0) | Number of completed requests that exceeded the tolerable response time since the last metric emission. Each URL group can have a different tolerable completion time, which is preconfigured in each UAA release. These values are currently not configurable. | uaa.requests.global.unhealthy.count:1|c | If the number of requests that miss the tolerable completion time is growing, then either the tolerable request time needs to be fine-tuned for false negatives, or the server does not have enough capacity to handle the request load. The underlying cause may be a need for more server or database resources. Consult further metrics before you make a scaling decision. | Active
requests.global.unhealthy.time | Gauge | Average time in milliseconds, since startup, per completed HTTP request that did not finish within the set tolerable time. | uaa.requests.global.unhealthy.time:250|g | It can be useful to compare this metric to uaa.requests.global.completed.time. | Active
requests.global.status_4xx.count | Counter | Number of HTTP requests that returned 4xx status codes (client errors) since the last metric emission. These do not indicate server errors. A 4xx code may indicate an invalid request to the server. | uaa.requests.global.status_4xx.count:1|c | This metric lets a client calculate error rates. It is often used to detect faulty applications that may be causing unnecessary processing on the server. | Active
requests.global.status_5xx.count | Counter | Number of HTTP requests that returned 5xx status codes (server errors) since the last metric emission. | uaa.requests.global.status_5xx.count:1|c | This metric lets a client calculate error rates and determine whether further investigation is needed. | Active
server.up.time | Timer | The number of milliseconds that have elapsed since this server instance started. | uaa.server.up.time:42346751|g | This metric indicates the time since the last startup. | Active
server.idle.time | Timer | The number of milliseconds that the server has spent in an idle state, when no requests were being processed. A client can calculate the actual, rather than cumulative, time the server has spent processing requests as up.time - idle.time. | uaa.server.idle.time:2346751|g | This metric lets a client calculate how much time the server has spent under load. | Active
database.global.completed.count | Counter | Number of database queries the server has processed since the last metric emission. | uaa.database.global.completed.count:1|c | This metric lets you track the number of queries that have reached and been processed by the server over a period of time. | Active
database.global.completed.time | Gauge | The average amount of time in milliseconds per database query. | uaa.database.global.completed.time:248|g | This metric lets you track the average time to complete a database query. | Active
database.global.unhealthy.count | Counter | Number of database queries that failed or did not meet the tolerated response time since the last metric emission. The response time is not configurable at runtime. By default, it is currently set to 3 seconds. | uaa.database.global.unhealthy.count:1|c | This metric lets you monitor database query success and failure over time. | Active
database.global.unhealthy.time | Timer | The average amount of time in milliseconds per database query that was not within the tolerated response time. | uaa.database.global.unhealthy.time:4678623|g | This metric lets you monitor database response time. | Active
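
Two derived values are mentioned in the table above: the time the server has spent processing requests (up.time minus idle.time) and error rates based on the status counters. The following Python sketch computes both. The function names are hypothetical, and the error rate assumes that both counts cover the same interval and that requests.global.completed.count includes the failed requests.

```python
def busy_time_ms(up_time_ms: float, idle_time_ms: float) -> float:
    """Time spent processing requests: server.up.time minus server.idle.time."""
    return up_time_ms - idle_time_ms


def error_rate(error_count: float, completed_count: float) -> float:
    """Fraction of completed requests that returned an error status.

    Returns 0.0 when no requests completed, to avoid dividing by zero.
    """
    if completed_count <= 0:
        return 0.0
    return error_count / completed_count


if __name__ == "__main__":
    # Values taken from the Example column for server.up.time and server.idle.time.
    print(busy_time_ms(42346751, 2346751))  # prints 40000000
    # Illustrative counts only, not real UAA output.
    print(error_rate(5, 200))  # prints 0.025
```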

Understanding UAA Vitals

This section describes metrics that the UAA VM and JVM emit.

Virtual Machine Vitals

Name | Type | Description | Example | Indicator | Status
vitals.vm.cpu.count | Gauge | Number of CPUs on this VM, as reported by the Java VM. This metric is useful when you read the system load average, because the load average must be correlated to the number of CPUs. | uaa.vitals.vm.cpu.count:4|g | This metric is required for a proper CPU load calculation. | Active
vitals.vm.cpu.load | Gauge | Average system CPU load, as reported by the Java VM. The value is reported as a whole number. Multiply it by 0.01 to get the load average. For example, a value of 163 is read as 1.63. | uaa.vitals.vm.cpu.load:50|g | If the value of (cpu.load / 100.0 / cpu.count) is more than 2.0, the system may be overloaded and processing data slowly. | Active
vitals.vm.memory.total | Gauge | Total OS memory, in bytes, as reported by the Java VM. | uaa.vitals.vm.memory.total:1073741824|g | Use this metric in conjunction with other performance metrics to assess system health. | Active
vitals.vm.memory.committed | Gauge | OS memory, in bytes, committed to UAA processes, as reported by the Java VM. | uaa.vitals.vm.memory.committed:1073741824|g | Use this metric in conjunction with other performance metrics to assess system health. | Active
vitals.vm.memory.free | Gauge | Free OS memory, in bytes, as reported by the Java VM. | uaa.vitals.vm.memory.free:1073741824|g | Use this metric in conjunction with other performance metrics to assess system health. | Active
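
The overload check described for vitals.vm.cpu.load can be sketched in Python as follows. The function names are hypothetical. The threshold of 2.0 comes from the table above, and it is exposed as a parameter here only for illustration.

```python
def load_per_cpu(cpu_load_metric: float, cpu_count: int) -> float:
    """Convert vitals.vm.cpu.load to a per-CPU load average.

    The metric is the load average multiplied by 100, so divide by 100.0
    first and then by the number of CPUs from vitals.vm.cpu.count.
    """
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return cpu_load_metric / 100.0 / cpu_count


def may_be_overloaded(cpu_load_metric: float, cpu_count: int,
                      threshold: float = 2.0) -> bool:
    """Return True when the per-CPU load exceeds the threshold."""
    return load_per_cpu(cpu_load_metric, cpu_count) > threshold


if __name__ == "__main__":
    # Values taken from the Example column: cpu.load of 50 on 4 CPUs.
    print(load_per_cpu(50, 4))       # prints 0.125
    print(may_be_overloaded(50, 4))  # prints False
```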

Java Virtual Machine Vitals

Name | Type | Description | Example | Indicator | Status
vitals.jvm.cpu.load | Gauge | The UAA process CPU load average, as reported by the Java VM. The value is multiplied by 100 and reported as a whole number that represents the CPU load that the UAA process places on the VM, excluding any other processes on the VM. | uaa.vitals.jvm.cpu.load:25|g | If CPU load is high, use this metric to confirm that UAA, and not another job on the same VM, is consuming the CPU. | Active
vitals.jvm.thread.count | Gauge | Number of threads running inside the UAA process, as reported by the Java VM. | uaa.vitals.jvm.thread.count:53|g | Use this metric in conjunction with other performance metrics to assess system health. | Active
vitals.jvm.heap.init | Gauge | Minimum amount of OS memory, in bytes, requested by the UAA JVM process for use as Java heap memory, as reported by the Java VM. To learn more about JVM memory, see the Oracle documentation. | uaa.vitals.jvm.heap.init:1073741824|g | Use this metric in conjunction with other performance metrics to assess system health. | Static
vitals.jvm.heap.committed | Gauge | Guaranteed amount of Java heap memory, in bytes, committed to the UAA JVM process, as reported by the Java VM. To learn more about JVM memory, see the Oracle documentation. | uaa.vitals.jvm.heap.committed:1073741824|g | Use this metric in conjunction with other performance metrics to assess system health. | Active
vitals.jvm.heap.used | Gauge | Java heap memory, in bytes, currently in use by the UAA process, as reported by the Java VM. To learn more about JVM memory, see the Oracle documentation. | uaa.vitals.jvm.heap.used:1073741824|g | Use this metric in conjunction with other performance metrics to assess system health. | Active
vitals.jvm.heap.max | Gauge | Upper limit of Java heap memory, in bytes, for the UAA process, as reported by the Java VM. To learn more about JVM memory, see the Oracle documentation. | uaa.vitals.jvm.heap.max:1073741824|g | Use this metric in conjunction with other performance metrics to assess system health. | Static
vitals.jvm.non-heap.init | Gauge | Minimum non-heap memory, in bytes, acquired by the UAA process, as reported by the Java VM. To learn more about JVM memory, see the Oracle documentation. | uaa.vitals.jvm.non-heap.init:1073741824|g | Use this metric in conjunction with other performance metrics to assess system health. | Active
vitals.jvm.non-heap.committed | Gauge | Guaranteed non-heap memory, in bytes, committed to the UAA process, as reported by the Java VM. To learn more about JVM memory, see the Oracle documentation. | uaa.vitals.jvm.non-heap.committed:1073741824|g | Use this metric in conjunction with other performance metrics to assess system health. | Active
vitals.jvm.non-heap.used | Gauge | Non-heap memory, in bytes, currently in use by the UAA process, as reported by the Java VM. To learn more about JVM memory, see the Oracle documentation. | uaa.vitals.jvm.non-heap.used:1073741824|g | Use this metric in conjunction with other performance metrics to assess system health. | Active
vitals.jvm.non-heap.max | Gauge | Upper limit of non-heap memory, in bytes, that the UAA process can use, as reported by the Java VM. To learn more about JVM memory, see the Oracle documentation. | uaa.vitals.jvm.non-heap.max:1073741824|g | Use this metric in conjunction with other performance metrics to assess system health. | Active