Cloud Controller Blobstore

This topic describes how Cloud Controllers and Diego interact with the CC’s blobstore.

Overview

Cloud Foundry uses a blobstore to store the source code that developers push, stage, and run.

This topic references staging and treats all blobstores as generic (S3-style) object stores.

For more general information about staging, see How Apps Are Staged.

For more information about how specific 3rd party blobstores can be configured, see Cloud Controller Blobstore Configuration.

How staging uses the blobstore

This section describes how staging buildpack apps utilizes the blobstore.

The following diagram illustrates, in-depth, how the staging process uses the blobstore:

The staging process for buildpack apps includes a developer and the following components: CF Command Line, Cloud Controller (CCNG), Blobstore, cc-uploader, Diego Cell (Staging), and Diego Cell (Running). Step 1 is CF Push from Developer to CF Command Line. Step 2 is Checksum source files from Developer to CF Command Line. Step 3 is Resource Match from CF Command Line to the CCNG. Step 4 is Check file existence from the CCNG to Blobstore. Step 5 is Upload unmatched files from CF Command Line to CCNG. Step 6 is Download cached files from Blobstore to CCNG. Step 7 is Upload complete package from CCNG to Blobstore. Step 8 is Download package and buildpack from Blobstore to Diego Cell (Staging). Step 9 is Upload droplet from Diego Cell (Staging) through cc-uploader, then CCNG, to the Blobstore. Step 10 is Download droplet from Blobstore to CCNG.

For a description of each step where staging a buildpack app interacts with the blobstore, see the following:

  1. A developer runs cf push.

  2. The cf CLI gets all the developers local source code files and computes a checksum of each.

  3. The cf CLI makes a “Resource Match” request to the cloud controller, listing those files and their checksums.

  4. This step, “Check file existence”, includes the following:

    1. The Cloud Controller makes a series of “HEAD”-type requests to the blobstore to find out which files it has cached.
    2. Cloud Controller content-addresses its cached files so that changes to a file result in it being stored as a different object.
    3. Eventually Cloud Controller is able to compute which files it has and which it needs the CLI to upload.
    4. In response to the resource match request, Cloud Controller lists the files the CLI needs to upload .

  5. The CLI compresses and uploads the unmatched files to the Cloud Controller.

  6. Asynchronously, The Cloud Controller downloads the matched files from the blobstore to its local disk.

  7. This step includes the following:

    1. The Cloud Controller zips together the freshly uploaded files with the downloaded cached files.
    2. The Cloud Controller uploads the complete package to the blobstore.

  8. The Diego Cell downloads the package and its buildpack(s) into a container and stages the app.

  9. This step includes the following:

    1. After the app has been staged, the Diego Cell uploads the complete droplet to the cc-uploader.
    2. The cc-uploader makes a multipart upload request to upload the droplet to the Cloud Controller.
    3. The Cloud Controller enqueues an asyncronous job to upload the blobstore.

  10. This step includes the following:

    1. The Diego Cell attempts to download the droplet from the Cloud Controller into the app’s container so it can run it.
    2. The Cloud Controller redirects the Cell’s droplet download request to the blobstore.
    3. The Diego Cell downloads the app’s droplet from the blobstore and runs it.

Blobstore load

The shape of the load CC generates on its blobstore is a bit unusual because of resource matching. Many blobstores that perform very well under normal read, write, and delete load are overwhelmed by CC’s heavy usage of HEAD requests during resource matching.

Pushing an app with large number of files will cause the CC to check the blobstore for the existence of each and every one of those files.

Parallel BOSH deploys of Diego Cells can also generate significant read load on CC’s Blobstore as the cells perform evacuation.

How CC reaps expired packages, droplets, and buildpacks

As new droplets and packages are created, the oldest ones associated with an app are marked as “EXPIRED” if they exceed the configured limits for packages and droplets stored per app.

Nightly, starting at midnight, the Cloud Controller runs a series of jobs to delete the data associated with expired packages, droplets, and buildpacks.

Enabling the native versioning feature on your blobstore will increase the number of resources stored and increase costs.

Blobstore Interaction timeouts

The Cloud Controller inherits its default blobstore operation timeouts from excon. Excon defaults to a 60 second read, write, and connect timeouts.

Create a pull request or raise an issue on the source for this page in GitHub