Troubleshooting Diego for Windows

Page last updated:

This topic describes how to troubleshoot a Windows cell in a Diego deployment.

Resolve Application Errors

Ensure that your .NET app is ready for deployment. You usually see the following errors after pushing an app.

Error

NoCompatibleCell This error usually indicates that the RepService has not yet registered your Windows cell with the rest of your Cloud Foundry (CF) deployment. The RepService attempts to reconnect on an interval, and can sometimes resolve itself within a few minutes.

diegoWindows-no-compatible-cell

Resolution

Restart the RepService within your cell to trigger an immediate reconnection.

diegoWindows-no-compatible-cell


Error

Start Unsuccessful This error usually indicates that your app is misconfigured for your CF Windows environment, but can also indicate your app does not contain the required .dlls and dependencies.

diegoWindows-no-compatible-cell

diegoWindows-no-compatible-cell

Resolution

Push your app from a directory containing either a .exe binary or a valid Web.config file for .NET apps. Alternatively, add the -p flag to your cf push command and specify the path to the directory that contains the .exe or Web.config file.

Ensure that your app dependencies are contained in your pushed app.

Find Errors Using Hakim

Hakim is a diagnostic tool that reveals common configuration issues with Windows cells.

Install and Run Hakim

  1. Download the hakim.exe corresponding to your installed DiegoWindows version from the GitHub release page.

  2. In a shell window, navigate to the directory that contains the downloaded binary.

  3. Execute the binary. Here is example hakim output:

        PS C:\Users\Administrator\Downloads> .\hakim.exe
        2016/02/26 21:04:35 The following processes are not running: garden-windows.exe
        2016/02/26 21:04:36 Failed to create container
        Post http://api/containers: dial tcp 127.0.0.1:9241: ConnectEx tcp: No connection could be made because the target machine actively refused it.
    

Resolve Common Errors

Hakim only outputs to the console if it detects errors. Here are some common errors and resolutions:

Error

The following processes are not running This usually indicates a failed deployment.

Resolution

Re-provision your Windows components.


Error

Failed to resolve consul host This usually indicates interference with DNS resolution on your Windows cell.

Resolution

To resolve this error, set localhost 127.0.0.1 as the primary DNS server for the active network adapter.


Error

Fair Share CPU Scheduling must be disabled

Resolution

You must disable this setting for your Windows cell to function properly. Turn this off through the Group Policy Management console, and then restart your Windows cell.


Error

Windows firewall service is not enabled Diego for Windows enforces CF security group settings for apps running on the cell through the Windows firewall. Apps can run without this, but security groups do not work correctly and apps have unrestricted network access.

Resolution

Enable the Windows firewall service.


Error

There was an error detecting ntp synchronization on your machine Clock skew with other CF components can occur if NTP is not configured. Clock skew can result in odd errors: for example, not receiving any application metrics for apps running on the affected machine.

Resolution

For your Windows cell, use the same NTP server as the rest of your CF deployment.


Error

Failed to create container This usually indicates an issue with the Windows containerization service.

Troubleshoot Other Issues

Look at the Event Viewer logs in Windows to troubleshoot other issues:

  1. Navigate to Windows Logs > Application.

  2. Review log messages from the services running in DiegoWindows.

  3. To isolate the issue, clear the log, reproduce the issue, and review the latest messages.

event-viewer

Create a pull request or raise an issue on the source for this page in GitHub