Configuring Cell Disk Cleanup Scheduling

This topic describes how to configure disk cleanup scheduling on Diego cells in Cloud Foundry (CF).

What is Disk Cleanup

CF isolates application instances (AIs) from each other using containers that run inside Diego cells. Containers enforce a set of isolation layers including file system isolation. A CF container file system can either be a CF stack or the result of pulling a Docker image.

For performance reasons, the cells cache the Docker image layers and the CF stacks that running AIs use. When CF destroys an AI, for example as a result of cf delete, or reschedules an AI to a different cell, certain Docker image layers or an old CF stack may become unused. If CF does not clean up these unused layers, the cell's ephemeral disk slowly fills up.

Disk cleanup is the process of removing unused layers from the cell disk. The disk cleanup process removes all unused Docker image layers and old CF stacks, regardless of their size or age.

Options for Disk Cleanup

CF provides the following options for scheduling the disk cleanup process on Diego cells:

  • Never clean up the Cell Disk: We do not recommend using this option for production environments.
  • Routinely clean up the Cell Disk: This option makes the cell schedule a disk cleanup every time a container gets created. Running the disk cleanup process so frequently may have a negative impact on the cell performance.
  • Clean up Disk space once a threshold is reached: Choosing this option makes the cell schedule the disk cleanup process only when a configurable disk space threshold is reached or exceeded.

See the Configure Disk Cleanup Scheduling section for how to choose one of these options.

Recommendations

Choosing the best option for disk cleanup depends on the workload that the Diego cells run: Docker images or Cloud Foundry buildpack-based apps.

For CF installations that mainly run buildpack-based apps, we recommend the second option: Routinely clean up the Cell Disk. This option ensures that when a new stack becomes available on a cell, the old stack is dropped from the cache immediately.

For CF installations that mainly run Docker images, or both Docker images and buildpack-based apps, we recommend the third option, Clean up Disk space once a threshold is reached, combined with a reasonable threshold. See Choosing a Threshold for how to pick a value.

Choosing a Threshold

To choose a realistic value for the disk cleanup threshold, you must identify the most frequently used Docker images in your CF installation. Docker images tend to be constructed by creating layers over existing base images. In some cases, you may find it easier to identify which base Docker images are most frequently used.

Follow the steps below to configure the disk cleanup threshold:

  1. Identify the most frequently used Docker images or base Docker images.

    Example: The most frequently used images in a test deployment are openjdk:7, nginx:1.13, and php:7-apache.

  2. Using the Docker CLI, measure the size of those images.

    Example:

    # Pull identified images locally
    $> docker pull openjdk:7
    $> docker pull nginx:1.13
    $> docker pull php:7-apache

    # Measure their sizes
    $> docker images
    REPOSITORY    TAG         IMAGE ID        CREATED       SIZE
    php           7-apache    2720c02fc079    2 days ago    391 MB
    openjdk       7           f45207c01009    5 days ago    586 MB
    nginx         1.13        3448f27c273f    5 days ago    109 MB
    ...

  3. Calculate the threshold as the sum of the frequently used image sizes plus a reasonable buffer such as 15-20%.

    Example: Using the output above, the sample threshold calculation is ( 391 MB + 586 MB + 109 MB ) * 1.2 = 1303.2 MB
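The calculation in step 3 can be sketched as a small shell function. This is only an illustration: the function name cleanup_threshold_mb is hypothetical, the image sizes are typed in by hand as whole MB values taken from the docker images output, and the result is truncated to a whole number of MB.

```shell
# Hypothetical helper, not part of CF or the Docker CLI.
# Usage: cleanup_threshold_mb BUFFER_PERCENT SIZE_MB [SIZE_MB ...]
# Sums the given image sizes (whole MB) and adds the buffer percentage.
cleanup_threshold_mb() {
  buffer_percent=$1
  shift
  total=0
  for size in "$@"; do
    total=$((total + size))
  done
  # Integer arithmetic: the result is truncated to a whole MB.
  echo $(( total * (100 + buffer_percent) / 100 ))
}

# Sizes from the example above: php:7-apache, openjdk:7, nginx:1.13, 20% buffer
cleanup_threshold_mb 20 391 586 109   # prints 1303
```

Truncating rather than rounding up keeps the result in line with the 1303.2 MB figure above; choose a larger buffer if you prefer to err on the side of fewer cleanups.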

Configure Disk Cleanup Scheduling

To control how Diego cells schedule their disk cleanup, set the BOSH property garden.graph_cleanup_threshold_in_mb to one of the following options:

  • -1: the cell never runs disk cleanup.
  • 0: the cell runs disk cleanup whenever it creates a new container.
  • A positive number: the cell runs disk cleanup whenever its total disk usage exceeds the number, in MB. The number acts as the threshold.

    Example: Using the threshold calculated in Choosing a Threshold (1303.2 MB, rounded down to a whole number), you would set the BOSH property to 1303.

If you deploy CF with GrootFS, configure the grootfs.graph_cleanup_threshold_in_mb property instead.
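As a rough illustration, the property might appear in a BOSH deployment manifest as shown below. Only the property name and its values come from this topic. The instance group name (diego-cell), the job name (garden), and the surrounding layout are assumptions about a typical deployment and may differ in yours.

```yaml
# Hypothetical manifest excerpt. Instance group and job names are assumptions.
instance_groups:
- name: diego-cell              # assumed name of the cell instance group
  jobs:
  - name: garden                # assumed job that reads the property
    properties:
      garden:
        # -1 = never clean up, 0 = clean up on every container creation,
        # positive number = threshold in MB
        graph_cleanup_threshold_in_mb: 1303
```

For a GrootFS-based deployment, the same value would presumably go under a grootfs key rather than garden, matching the grootfs.graph_cleanup_threshold_in_mb property name. Check the job spec of your release for the exact location.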
