KodeShip

Lunar Bug

Published Sun 16 January 2022 in Software

by Sridhar Parasuraman Software Systems Java Linux Debugging

On the morning of 1 July 2015, I was notified of high CPU load on most of our servers (openSUSE Linux). Needless to say, this made many of our customer-facing applications unavailable. The engineers who were troubleshooting were at their wit's end. No code had been deployed. No new script had been run. There were no hacker attacks. We had not run any huge campaigns.


Most Java processes exhibited the high load. We ran strace and found that the system call they kept making was futex. A futex (fast userspace mutex) is a Linux locking primitive, used as a building block for higher-level synchronization constructs.


Turns out that most of our Java processes were spinning on futex because of a "leap second". Because the Moon is slowing down Earth's rotation, the international timekeepers occasionally add a second to reconcile precise atomic time with the less regular, observed solar time. As a result, NTP (a protocol that keeps server clocks in sync) counted 23:59:59 twice on 30 June 2015. But the internal server clocks didn't. Futex calls were always timing out.
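The "counted 23:59:59 twice" part can be made concrete with a short Python sketch. It only illustrates how Unix time treats a leap second and is not code from the incident; the constant and function names are made up for this example. Unix timestamps have no slot for 23:59:60, so the leap second and the following midnight land on the same number.

```python
import calendar
from datetime import datetime, timezone

# Unix timestamps for the seconds around the leap second of 30 June 2015 (UTC).
# calendar.timegm does plain arithmetic on the tuple, so second=60 is accepted.
LAST_NORMAL_SECOND = calendar.timegm((2015, 6, 30, 23, 59, 59))
LEAP_SECOND = calendar.timegm((2015, 6, 30, 23, 59, 60))
NEXT_MIDNIGHT = calendar.timegm((2015, 7, 1, 0, 0, 0))


def shares_timestamp_with_midnight():
    """True if 23:59:60 and 00:00:00 map to the same Unix timestamp."""
    return LEAP_SECOND == NEXT_MIDNIGHT


def datetime_accepts_leap_second():
    """Check whether Python's datetime can represent 23:59:60 at all."""
    try:
        datetime(2015, 6, 30, 23, 59, 60, tzinfo=timezone.utc)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    print(LAST_NORMAL_SECOND, LEAP_SECOND, NEXT_MIDNIGHT)
    print(shares_timestamp_with_midnight(), datetime_accepts_leap_second())
```

Two real seconds pass between 23:59:59 and 00:00:00 that night, yet the timestamps differ by only one, so a clock that follows Unix time has to repeat a second to absorb the difference.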


To give an analogy, imagine you decide to clean your inbox by finding very old emails and deleting them to make space. Because one second is like eons to a CPU, all locks acquired were treated as old emails from the Jurassic era and were cleaned up before they could even be opened. This continuous denial of lock requests causes spinning. Spinning locks that never resolve leave the CPU in a vegetative state.
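The spinning can be sketched with a toy simulation. This is not kernel or JVM code. It assumes, purely for illustration, that whatever judges wait deadlines runs one second ahead of the clock the application uses to compute them. `simulate_worker` and all of its parameters are hypothetical names, and the numbers are arbitrary.

```python
def simulate_worker(skew_us, timeout_us=100_000, run_for_us=1_000_000, cost_us=10):
    """Simulate a loop that does a timed wait, then a little work, repeatedly.

    skew_us    -- assumed gap between the timer's idea of "now" and the wall clock
    timeout_us -- how long each timed wait is supposed to block
    run_for_us -- simulated wall-clock duration of the run
    cost_us    -- assumed CPU cost of handling one wakeup

    Returns (number of wakeups, fraction of the run spent on the CPU).
    """
    wall_us = 0   # simulated wall-clock time
    busy_us = 0   # simulated CPU time burned by the loop itself
    wakeups = 0
    while wall_us < run_for_us:
        # The worker asks for a timed wait relative to the wall clock.
        deadline_us = wall_us + timeout_us
        # The timer side judges that deadline against its own, skewed time.
        remaining_us = deadline_us - (wall_us + skew_us)
        if remaining_us > 0:
            wall_us += remaining_us   # the thread really sleeps
        # Whether it slept or not, handling the wakeup costs some CPU time.
        wall_us += cost_us
        busy_us += cost_us
        wakeups += 1
    return wakeups, busy_us / wall_us


if __name__ == "__main__":
    print("clocks agree:   ", simulate_worker(skew_us=0))
    print("one second skew:", simulate_worker(skew_us=1_000_000))
```

With no skew the loop sleeps through almost the whole run. With a one-second skew every sub-second deadline already looks expired, the wait returns at once, and the loop does nothing but wake up and retry.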


The fix was simple in the end. Run this command as root, which sets the system clock to the time it already reports, and then restart the processes.

date -s "`date`"


We blamed lunar gravity in our RCA (root cause analysis).


The Moon slowing down Earth's rotation. Frozen servers as a result. Wow!! I felt the heavens were pointing us to the glitch in the matrix!!
