Operations grimoire/DNS: Difference between revisions

From Nasqueron Agora
 
(4 intermediate revisions by the same user not shown)
DNS is currently not hosted directly on Nasqueron:
DNS is provided on both Nasqueron and Hurricane Electric infrastructure.
* for nasqueron.org, IPv6 blocks and most domains, DNS is hosted by Hurricane Electric -> reach Dereckson for modifications
* for eglide.org, it's managed by Gandi -> reach Sandlayth for modifications


KnotDNS is currently being installed in our servers by Dorian, to act as primary server.
== Nasqueron primary DNS server ==
Content of the zones will be stored in operation repository.
Knot serves our DNS records from dns-001, which acts as the primary server ns1.nasqueron.org.


== DNS change workflow ==
Zones are then automatically submitted to the Hurricane Electric servers, which act as secondaries.


# If needed, open a task, or include to an existing task you need the DNS change
=== Scope ===
# Reach DNS contact, ask them to comment back on the task when done
; ICANN Domains
* nasqueron.org: {{Ops file|roles/dns/knot/files/zones/nasqueron.org.zone}}


=== Tips ===
=== Edit a zone ===
To update records for our DNS:
 
# Edit the relevant zone file in {{Ops file|roles/dns/knot/files/zones/}}
# Deploy from Complector with <code>salt dns-001 state.apply roles/dns/knot/config</code>
# Lint the zone on dns-001 with <code>kzonecheck /var/db/knot/nasqueron.org.zone</code>
# Reload Knot server with <code>service knot reload</code>
# Check new record with nslookup or dig against ns1.nasqueron.org
# Check a little later against ns2.he.net
 
== Hurricane Electric as primary DNS servers ==
Some DNS zones aren't currently hosted directly on Nasqueron.
 
In such case, open a task on DevCentral or reach Dorian or Dereckson for modifications.
 
=== Scope ===
* IPv6 blocks (nasqueron account)
* Most extra domains (dereckson account)
 
== Special cases ==
=== Eglide ===
The eglide.org domain is managed by Gandi: reach Sandlayth for modifications.
 
== Tips ==


Web domains use CNAME, see [https://netbox.nasqueron.org/ipam/services/?filter=web-domains-cname CNAME for web domains] (NetBox) for the full list.


== Proposals to host DNS at Nasqueron ==
== Troubleshoot ==
=== Knot DNS ===
==== kzonecheck: error: failed to run semantic checks ====
Did you invoke kzonecheck directly on a Jinja template? If so, it can't validate it.
 
You can run {{Ops file|_tests/roles/python/dns/test_dns_zones.py}} through <code>(cd _tests && make test-roles-dns)</code>: it resolves every (pillar configuration, zone file) pair we have in the repository, then calls kzonecheck.
 
If you ran it directly in /usr/local/etc/knot against a .zone file, that file contains a syntax error.
Check that no Jinja such as <code><nowiki>{{ identity }}</nowiki></code> remains in it; if some does, the file has perhaps been copied instead of deployed through Salt.
 
==== kzonecheck: 1    missing SOA at the zone apex ====
The top domain of the zone is the "apex"; the SOA record needs to exist and to use '@' syntax:
 
<syntaxhighlight lang="diff">
-nasqueron.org. 172800 IN SOA {{ identity }}. ops-dns.nasqueron.org. (
+@ 172800 IN SOA {{ identity }}. ops-dns.nasqueron.org. (
    2025090200 ;serial
    10800      ;refresh
    1800      ;retry
    604800    ;expire
    86400 )    ;minimum
</syntaxhighlight>
 
==== kzonecheck: 1    missing NS at the zone apex ====
The top domain of the zone is the "apex". The NS records need to exist and to use '@' syntax:
 
<syntaxhighlight lang="diff">
+@ 86400 IN NS {{ identity }}.
+@ 86400 IN NS ns1.he.net.
+@ 86400 IN NS ns2.he.net.
+@ 86400 IN NS ns3.he.net.
+@ 86400 IN NS ns4.he.net.
+@ 86400 IN NS ns5.he.net.
+
-nasqueron.org. 86400 IN NS {{ identity }}.
-nasqueron.org. 86400 IN NS ns1.he.net.
-nasqueron.org. 86400 IN NS ns2.he.net.
-nasqueron.org. 86400 IN NS ns3.he.net.
-nasqueron.org. 86400 IN NS ns4.he.net.
-nasqueron.org. 86400 IN NS ns5.he.net.
</syntaxhighlight>


Network:
==== Notify is refused ====
* 172.27.27.2/28 is reserved for a primary DNS server -> we'd also need an IPv4
A known issue with the current configuration is that Hurricane Electric (HE) refuses DNS NOTIFY requests:
* Secondary can be hosted in another datacenter, or to an external provider with zone replication (HE?)


How we want to work:
    Oct 18 22:34:21 dns-001 knot[20816]: warning: [nasqueron.org.] notify, outgoing, remote 216.218.130.2@53 TCP, server responded with error 'REFUSED'
* Git repository with the direct configuration files or YAML template to generate it
* Web visualisation of the current zone
* DNSSEC


Products comments:
The task {{T|2148}} has been created to track this issue.
* We liked in the past:
** djbdns, but it isn't maintained anymore (nor is its fork dbndns)
** Unbound, but it's not an authoritative server
* The decision is to be analyzed amongst
** BIND - de facto standard
** CoreDNS - a newcomer for Kubernetes - is it suitable for non-Docker workload too? If not, we could use CoreDNS with Kubernetes subdomains, and another product for other records.
** Knot DNS - maintained actively by CZ.NIC, the .CZ domain registry, oriented security (DNSSEC), registries are first stakeholders, but with features like DynDNS support, it's a full authoritative server
** PowerDNS - used by various ISP


== Useful links ==


* [https://netbox.nasqueron.org/ipam/services/?filter=web-domains-cname CNAME for web domains] (NetBox)

Latest revision as of 20:38, 20 October 2025

DNS is provided on both Nasqueron and Hurricane Electric infrastructure.

Nasqueron primary DNS server

Knot serves our DNS records from dns-001, which acts as the primary server ns1.nasqueron.org.

Zones are then automatically submitted to the Hurricane Electric servers, which act as secondaries.

Scope

ICANN Domains

  • nasqueron.org: rOPS: roles/dns/knot/files/zones/nasqueron.org.zone

Edit a zone

To update records for our DNS:

  1. Edit the relevant zone file in rOPS: roles/dns/knot/files/zones/
  2. Deploy from Complector with salt dns-001 state.apply roles/dns/knot/config
  3. Lint the zone on dns-001 with kzonecheck /var/db/knot/nasqueron.org.zone
  4. Reload Knot server with service knot reload
  5. Check new record with nslookup or dig against ns1.nasqueron.org
  6. Check a little later against ns2.he.net
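
Steps 5 and 6 run the same query against two servers. A minimal sketch of a helper that prints both dig commands follows; the function names and the record www.nasqueron.org are hypothetical, only the two name servers come from the steps above.

```python
import shlex

# Name servers taken from the steps above.
PRIMARY = "ns1.nasqueron.org"
SECONDARY = "ns2.he.net"


def dig_command(name, rtype, server):
    """Build the argv for a short-form dig query sent to one specific server."""
    return ["dig", "+short", f"@{server}", name, rtype]


def check_commands(name, rtype="A"):
    """Return the shell commands for steps 5 and 6, primary first."""
    return [
        shlex.join(dig_command(name, rtype, server))
        for server in (PRIMARY, SECONDARY)
    ]


# Hypothetical record name, for illustration only.
for command in check_commands("www.nasqueron.org"):
    print(command)
```

If the secondary still answers with the old value, wait and run the second command again, as step 6 suggests.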

Hurricane Electric as primary DNS servers

Some DNS zones aren't currently hosted directly on Nasqueron.

In such case, open a task on DevCentral or reach Dorian or Dereckson for modifications.

Scope

  • IPv6 blocks (nasqueron account)
  • Most extra domains (dereckson account)

Special cases

Eglide

The eglide.org domain is managed by Gandi: reach Sandlayth for modifications.

Tips

Web domains use CNAME, see CNAME for web domains (NetBox) for the full list.

Troubleshoot

Knot DNS

kzonecheck: error: failed to run semantic checks

Did you invoke kzonecheck directly on a Jinja template? If so, it can't validate it.

You can run rOPS: _tests/roles/python/dns/test_dns_zones.py through (cd _tests && make test-roles-dns): it resolves every (pillar configuration, zone file) pair we have in the repository, then calls kzonecheck.

If you ran it directly in /usr/local/etc/knot against a .zone file, that file contains a syntax error. Check that no Jinja such as {{ identity }} remains in it; if some does, the file has perhaps been copied instead of deployed through Salt.
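
To spot leftover Jinja quickly, a check along these lines could help. It is a sketch, not part of the operations repository: the function name is hypothetical, and it only looks for the usual Jinja delimiters on a per-line basis.

```python
import re

# Matches Jinja expressions {{ ... }}, statements {% ... %} and comments {# ... #}.
JINJA_MARKER = re.compile(r"\{\{.*?\}\}|\{%.*?%\}|\{#.*?#\}")


def find_jinja_markers(zone_text):
    """Return (line number, marker) pairs for every Jinja marker found."""
    hits = []
    for number, line in enumerate(zone_text.splitlines(), start=1):
        for match in JINJA_MARKER.finditer(line):
            hits.append((number, match.group(0)))
    return hits
```

An empty list means the file looks like a rendered zone; any hit points to a template that was copied without being rendered by Salt.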

kzonecheck: 1 missing SOA at the zone apex

The top domain of the zone is the "apex"; the SOA record needs to exist and to use '@' syntax:

-nasqueron.org. 172800 IN SOA {{ identity }}. ops-dns.nasqueron.org. (
+@ 172800 IN SOA {{ identity }}. ops-dns.nasqueron.org. (
     2025090200 ;serial
     10800      ;refresh
     1800       ;retry
     604800     ;expire
     86400 )    ;minimum
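
Secondary servers rely on the SOA serial to decide whether a zone changed, so the serial has to increase with each edit. The serial above, 2025090200, looks like the common YYYYMMDDNN convention; assuming our zones follow it (this page does not say so), the next value could be computed as in this sketch, where next_serial is a hypothetical helper.

```python
from datetime import date


def next_serial(current, today):
    """Return the next serial, assuming the YYYYMMDDNN convention.

    Uses today's date with revision 00 when that is higher than the
    current serial, otherwise increments the current serial by one.
    """
    base = int(today.strftime("%Y%m%d")) * 100
    return max(base, current + 1)
```

A second edit on the same day goes from 2025090200 to 2025090201; the first edit on a later day starts again at revision 00 for that date.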

kzonecheck: 1 missing NS at the zone apex

The top domain of the zone is the "apex". The NS records need to exist and to use '@' syntax:

+@ 86400 IN NS {{ identity }}.
+@ 86400 IN NS ns1.he.net.
+@ 86400 IN NS ns2.he.net.
+@ 86400 IN NS ns3.he.net.
+@ 86400 IN NS ns4.he.net.
+@ 86400 IN NS ns5.he.net.
+
-nasqueron.org. 86400 IN NS {{ identity }}.
-nasqueron.org. 86400 IN NS ns1.he.net.
-nasqueron.org. 86400 IN NS ns2.he.net.
-nasqueron.org. 86400 IN NS ns3.he.net.
-nasqueron.org. 86400 IN NS ns4.he.net.
-nasqueron.org. 86400 IN NS ns5.he.net.
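
Both errors come from the same pattern: an apex SOA or NS record written with the full zone name instead of '@'. The sketch below lists such lines before kzonecheck is run; the function name is hypothetical and the parsing is deliberately naive (whitespace-separated fields, ';' comments stripped), so it is not a replacement for kzonecheck.

```python
def apex_owner_issues(zone_text, origin):
    """List (line number, type) for SOA/NS records owned by the spelled-out origin."""
    fqdn = origin.rstrip(".").lower() + "."
    issues = []
    for number, line in enumerate(zone_text.splitlines(), start=1):
        # Drop the comment part, then split the record into fields.
        fields = line.split(";", 1)[0].split()
        if len(fields) < 2 or fields[0].lower() != fqdn:
            continue
        # The type follows the owner, after an optional TTL and class.
        candidates = [field.upper() for field in fields[1:4]]
        for rtype in ("SOA", "NS"):
            if rtype in candidates:
                issues.append((number, rtype))
                break
    return issues
```

Each reported line is a candidate for the '@' rewrite shown in the diffs above.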

Notify is refused

A known issue with the current configuration is that Hurricane Electric (HE) refuses DNS NOTIFY requests:

   Oct 18 22:34:21 dns-001 knot[20816]: warning: [nasqueron.org.] notify, outgoing, remote 216.218.130.2@53 TCP, server responded with error 'REFUSED'

The task T2148 has been created to track this issue.
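
To find which zones and remotes are affected, the warning can be parsed from the logs. The pattern below is a sketch derived only from the single log line quoted above, so other Knot versions or messages may need a different expression; parse_notify_error is a hypothetical helper.

```python
import re

# Pattern modelled on the one log line quoted above.
NOTIFY_ERROR = re.compile(
    r"\[(?P<zone>[^\]]+)\] notify, outgoing, "
    r"remote (?P<remote>[^@\s]+)@(?P<port>\d+) \S+, "
    r"server responded with error '(?P<error>[^']+)'"
)


def parse_notify_error(line):
    """Extract zone, remote, port and error from a notify warning, or None."""
    match = NOTIFY_ERROR.search(line)
    if match is None:
        return None
    return {
        "zone": match.group("zone"),
        "remote": match.group("remote"),
        "port": int(match.group("port")),
        "error": match.group("error"),
    }
```

Lines that do not match the pattern return None, so the function can be applied to a whole log file and the results filtered.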

Useful links

  • CNAME for web domains (NetBox)