When is “none” not really empty? This time, in DNS (it’s always DNS, right?)
Last week we had an interesting issue pop up on Lagoon. We use continuous integration (CI) widely, and it didn’t seem to be working. Troubleshooting the usual suspects (ahem, Jenkins) told us nothing. Jobs just hung indefinitely until they were killed, in both Jenkins and GitHub Actions.
We managed to narrow the issue down to Lagoon waiting an excessive amount of time when it created a project, causing subsequent API requests to time out without an error message. However, we couldn’t reproduce this locally, or in a similar setup in a different CI repo.
It turns out that we were using “none.com” as the placeholder URL for a deprecated Harbor configuration (installing Harbor connected to lagoon-core) - which used to work just fine! Until, it would seem, the owners of that domain set it to just eat any requests for 5 minutes, instead of returning a value or returning nothing at all. Neither the alternate nor the local configurations we tested used none.com as a fallback, hence their success!
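To illustrate the failure mode (this is a generic sketch, not Lagoon's actual code), here is how a client can put an explicit timeout on a request so that a host which swallows traffic produces a clear failure instead of a silent hang. The function name `probe` and the 5-second default are assumptions made for the example.

```python
import socket
import urllib.request


def probe(url: str, timeout_s: float = 5.0) -> str:
    """Fetch url and return a short status string instead of hanging.

    timeout_s bounds each blocking socket operation (connect, read).
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return f"ok ({resp.status})"
    except OSError as exc:
        # URLError wraps the underlying cause in .reason; a read timeout
        # is raised directly as socket.timeout. Both are OSError subclasses.
        reason = getattr(exc, "reason", exc)
        if isinstance(exc, socket.timeout) or isinstance(reason, socket.timeout):
            return "timed out"
        return f"failed: {reason}"
```

With something like this in a test harness, a placeholder host that accepts a connection and then never answers shows up as "timed out" after a few seconds, rather than as a job that sits there until it is killed.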
From now on, we’ll be using the .invalid domain - RFC 6761 reserves it so that names under it never resolve (https://www.rfc-editor.org/rfc/rfc6761#section-6.4)! We’re also working hard to remove this now-obsolete code from the codebase.
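One way to keep a real domain from sneaking back in as a placeholder is a small check in the test harness. The sketch below is hypothetical (the helper `uses_reserved_tld` and the hostname `harbor.invalid` are made up for illustration); it simply tests whether a hostname ends in one of the special-use top-level domains listed in RFC 6761.

```python
# Special-use TLDs from RFC 6761 that are not delegated in the public DNS.
# "localhost" is also reserved, but it resolves to loopback, so it is
# left out of this placeholder list.
RESERVED_TLDS = frozenset({"invalid", "test", "example"})


def uses_reserved_tld(hostname: str) -> bool:
    """Return True if hostname ends in a reserved, non-resolving TLD."""
    # Normalise case and drop a trailing root dot ("harbor.invalid.").
    labels = hostname.strip().rstrip(".").lower().split(".")
    return labels[-1] in RESERVED_TLDS


print(uses_reserved_tld("harbor.invalid"))  # hypothetical placeholder host
print(uses_reserved_tld("none.com"))        # a real, registered domain
```

The check is purely string-based and never touches the network, so it can run in CI without depending on how any resolver behaves.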
On the bright side, we’ve been able to update a lot of the test harness to add in useful debug and failure options, and have also identified an additional (unrelated) stability issue.