NTP client time out of Sync?

Starter:

We know that NTP synchronizes time from an authoritative time server to a client, keeping the client's local time consistent with standard time.

But what if the NTP servers are set up successfully and the client still drifts out of sync with them from time to time, causing authentication issues? We need to know how to troubleshoot this kind of problem.

 

Knowledge Prerequisite:

Before troubleshooting NTP-related issues, you should know the answers to the following questions:

  • What is a reference clock?
  • How will NTP use a reference clock?
  • How will NTP know about Time Sources?
  • What happens if the Reference Time changes?
  • How is Time synchronized?
  • Which Network Protocols are used by NTP?
  • When are the Servers polled?
  • How frequently will the System Clock be updated?
  • How frequently are Correction Values updated?

If you are not sure of the answer to any of these questions, check the NTP website, which has a page dedicated to these questions and more.
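As a pointer for the "How is Time synchronized?" question, here is a minimal sketch of the standard offset and round-trip delay calculation that an NTP client performs on the four timestamps of one request/response exchange. The function name and the sample numbers are made up for illustration only.

```python
from typing import Tuple


def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> Tuple[float, float]:
    """Return (offset, delay) in seconds for one NTP exchange.

    t1: request leaves the client    (client clock)
    t2: request arrives at server    (server clock)
    t3: reply leaves the server      (server clock)
    t4: reply arrives at the client  (client clock)
    """
    # Offset: how far the server clock is ahead of the client clock.
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    # Delay: total network round trip, excluding server processing time.
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Hypothetical exchange: the client is 5 s behind, each network leg takes
# 0.25 s, and the server spends 0.5 s before replying.
print(ntp_offset_and_delay(100.0, 105.25, 105.75, 101.0))  # (5.0, 0.5)
```

The offset is only exact when the two network legs are symmetric, which is one reason a client's estimate depends so much on which server it listens to.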

 

Real-life Scenario:

Issue Description:
======================

We deployed a file server and joined it to an Active Directory domain, but it repeatedly alerted us about time skew. Sometimes the file server even reported the AD domain as offline, so authenticated users could not access the file server and the data became unavailable.

 

 

We were eager to find out what caused this issue, since NTP was already configured. We had the following information on hand:

  • The file server is a cluster of several nodes.
  • By design, the first three nodes sync time with the external NTP servers, and the remaining nodes sync with those three nodes.
  • According to the ntpq view on each node, every node's time always appeared well synced. Two NTP servers are configured on this cluster: 150.99.100.26 and 10.25.17.10.
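To read an ntpq peer listing programmatically, a small parser like the sketch below can help. It assumes the common ten-column `ntpq -pn` layout (remote, refid, st, t, when, poll, reach, delay, offset, jitter) with a one-character tally code in front of each peer. The sample text is invented, uses documentation-range addresses, and is not the output from this cluster.

```python
from typing import Dict, List

# Illustrative sample only; not captured from the cluster in this article.
SAMPLE = "\n".join([
    "     remote           refid      st t when poll reach   delay   offset  jitter",
    "==============================================================================",
    "*192.0.2.10      .GPS.            1 u   33   64  377   12.345    0.512   0.120",
    "+198.51.100.20   192.0.2.10       2 u   40   64  377    0.800    1.204   0.300",
])


def parse_ntpq_peers(text: str) -> List[Dict[str, object]]:
    """Parse peer lines, assuming the ten-column layout described above."""
    peers = []
    for line in text.splitlines():
        # Skip blank lines, the column header, and the "=====" separator.
        if not line.strip() or line.lstrip().startswith("remote") or line.startswith("="):
            continue
        tally, fields = line[0], line[1:].split()
        if len(fields) != 10:
            continue  # unexpected layout, ignore the line
        peers.append({
            "tally": tally,                # '*' = system peer, '+' = candidate
            "remote": fields[0],
            "refid": fields[1],            # where this server gets its time
            "stratum": int(fields[2]),
            "reach": int(fields[6], 8),    # reach is an octal shift register
            "offset_ms": float(fields[8]),
        })
    return peers


for peer in parse_ntpq_peers(SAMPLE):
    print(peer["tally"], peer["remote"], "stratum", peer["stratum"], "refid", peer["refid"])
```

A peer list can look healthy by tally code and reach while the refid and stratum columns tell a different story about where each server gets its time, so those two columns deserve a second look.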

 

 

Troubleshooting steps:

  • Even though the ntpd daemon appeared to be working normally, we followed the procedure below to update the NTP configuration file, sync time with the NTP server, and gracefully restart the service.

 

 

  • Unfortunately, the issue recurred after these steps. To figure out why, we needed to know what was going on behind the scenes.
  • As a last resort, we still had a sharp tool in our toolkit: packet analysis. We captured packets on the aggregation interface of one of the nodes that exchanged traffic with the NTP servers.
  • We found two-way communication between the file cluster and each of the NTP servers. Because the same exchange repeats over and over, we picked four frames at random.

 

 

 

After reviewing the NTP section of those packets, the answer emerged. Have you spotted it?

The issue was caused by the file cluster requesting time from two NTP servers at different strata, where the higher-stratum server itself syncs with another NTP server at a lower stratum. That is a really bad design. After we removed 10.25.17.10 from the NTP server list, the issue was gone.

The key takeaways from this troubleshooting all trace back to the knowledge prerequisites, and you can also learn the meaning of each field in the NTP section of a packet.
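One way to catch this kind of setup early is to check whether any configured server names another configured server as its time source. The sketch below assumes you have already collected each server's reference ID (for example from ntpq or from a packet capture) and that upstream servers are identified by IPv4 address. The function name and the addresses are hypothetical.

```python
from typing import Dict, List, Tuple


def find_dependent_sources(refids: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (server, upstream) pairs where a configured server
    reports another configured server as its reference ID."""
    return [
        (server, upstream)
        for server, upstream in refids.items()
        if upstream in refids and upstream != server
    ]


# Hypothetical values using documentation addresses, not the real servers.
configured = {
    "192.0.2.10": "GPS",            # stratum 1: refid names a reference clock
    "198.51.100.20": "192.0.2.10",  # stratum 2: refid is its upstream server
}
print(find_dependent_sources(configured))  # [('198.51.100.20', '192.0.2.10')]
```

Any pair returned here means the two entries are not independent time sources, so listing both gives the client less redundancy than it appears to.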
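If you want to read those fields without a packet analyzer, the 48-byte NTP header can be decoded by hand. The following is a minimal sketch: it treats root delay and root dispersion as unsigned 16.16 fixed-point values (adequate for positive values) and assumes that for stratum 2 and above the reference ID is the upstream server's IPv4 address. The helper names and the packet at the bottom are synthetic, built only for this example.

```python
import ipaddress
import struct
from typing import Dict

# flags, stratum, poll, precision, root delay, root dispersion,
# reference ID, then four 64-bit timestamps (reference, originate,
# receive, transmit): 48 bytes in network byte order.
NTP_HEADER = struct.Struct("!BBbbII4s4Q")
NTP_UNIX_DELTA = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def ntp_to_unix(timestamp: int) -> float:
    """Convert a 64-bit NTP timestamp (32.32 fixed point) to Unix seconds."""
    return (timestamp >> 32) - NTP_UNIX_DELTA + (timestamp & 0xFFFFFFFF) / 2**32


def parse_ntp_header(payload: bytes) -> Dict[str, object]:
    if len(payload) < NTP_HEADER.size:
        raise ValueError("NTP header needs at least 48 bytes")
    (flags, stratum, poll, precision, root_delay, root_disp,
     refid, _ref_ts, _orig_ts, _recv_ts, xmit_ts) = NTP_HEADER.unpack_from(payload)
    if stratum <= 1:
        # Stratum 0/1: four ASCII characters (kiss code or clock name).
        refid_text = refid.rstrip(b"\x00").decode("ascii", errors="replace")
    else:
        # Assumption: IPv4 upstream, so the refid is its address.
        refid_text = str(ipaddress.IPv4Address(refid))
    return {
        "leap": flags >> 6,
        "version": (flags >> 3) & 0x7,
        "mode": flags & 0x7,            # 3 = client, 4 = server
        "stratum": stratum,
        "poll_interval_s": 2.0 ** poll,
        "precision_s": 2.0 ** precision,
        "root_delay_s": root_delay / 65536.0,
        "root_dispersion_s": root_disp / 65536.0,
        "reference_id": refid_text,
        "transmit_unix": ntp_to_unix(xmit_ts),
    }


# Synthetic server reply: version 3, mode 4, stratum 3, upstream 192.0.2.10.
xmit = ((NTP_UNIX_DELTA + 1000) << 32) | 0x80000000
packet = NTP_HEADER.pack((3 << 3) | 4, 3, 6, -20, 0x8000, 0x4000,
                         bytes([192, 0, 2, 10]), 0, 0, 0, xmit)
print(parse_ntp_header(packet))
```

In a capture like the one described above, the stratum and reference ID fields of each server's reply are the ones that show which server is feeding which.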

 

Referenced Links and documents:

Current RFC:

  • RFC 1305: Network Time Protocol (Version 3)

Obsoleted RFCs:

  • RFC 958: Network Time Protocol
  • RFC 1059: Network Time Protocol (Version 1) Specification and Implementation
  • RFC 1119: Network Time Protocol (Version 2) Specification and Implementation

