It is not DNS, it is you! #
It is !always DNS #
It has really started grating on me that people are always blaming DNS, including the infrastructure peeps that defend a crappy internal DNS setup - calling it a design would be way too nice, as it is most often the result of a wizard that was run by someone who struggled to even spell DNS. While there have been some high-profile outages that can be attributed to DNS, the root cause of some of those was bad design - specifically centralization.
For internal/corporate DNS designs, it’s way too often “we ran a wizard when installing Active Directory”. On top of that, many have bundled a number of forwarders, making matters even worse.
That’s why I made this meme (Yeah, need to up my meme game).
Or if you do not trust me, take it from the one and only Psychotic Network Ferret over on Mastodon / infosec.exchange.
Specifically nuintari’s rules of networking 0x07
All too often DNS Designs are bad - way too often. I see it everywhere, and a lot of well-meaning people argue that “it’ll break if we touch it, it is so fragile”. The problem here is that yes, it is fragile, but that is exactly why you need to change the original “design”. Until this changes you will keep experiencing issues caused by badly implemented DNS - and who would want a core element of their network to be fragile? On top of that, those organizations are missing out on one of the best log sources there is. While this is quite a huge topic, hopefully this blog post provides a little help towards building better DNS Designs.
Risks #
Some of the risks you must consider when running DNS Servers are:
- Denial of Service (DoS): Flooding the DNS server with name resolution requests and forcing the system to 100% utilization, preventing the server from processing name resolutions. Denial of Service (DoS) and Distributed Denial of Service (DDoS) attacks can, even if only directed against one DNS Server, affect the whole network, e.g. by hindering valid users in resolving public DNS names. Other networks can be affected when your server is abused in a DNS amplification attack.
- IP Spoofing: Intruders use IP addresses (often acquired through footprinting) to gain access to network resources and send corrupted packets to recipient computers. Spoofing can enable packets to pass through filters that are designed to block traffic. Spoofing attacks that trick a DNS Server into caching false Resource Records (RRs) can direct users to unknown systems in the belief that they are known ones. This may result in hosts being infected with unwanted software. This phenomenon is known as cache pollution (or cache poisoning), and has been used actively in a number of attacks and defacements.
- False dynamic updates: Where adversaries, through errors in configuration or software are able to add or change Resource Records in new or existing zones.
- DNS Tunnels: The DNS query/reply payload is used to create a tunnel out of the internal network.
- Data Exfiltration: The DNS query/reply payload is used to exfiltrate data.
- Reconnaissance: A seemingly innocent zone transfer or a large number of lookups can reveal information about the internal infrastructure, helping an adversary select targets.
- Reputation: Your DNS server can potentially be used as a bot against other systems by adversaries. This may lead to legal action, but will almost always result in negative publicity. You are going to have a bad day if your DNS Servers are used in a DNS Reflection attack.
DNS Servers #
In this blog post the following terms are used.
- Authoritative DNS server: Name server, serving public zones for your organization.
- Recursive DNS server: Name server resolving public names on behalf of your internal infrastructure.
- Externally facing DNS server, DNS server, or just “server”: When used here, specifies both authoritative and recursive DNS servers.
- Internal DNS Server: Specifies servers hosting zones that must not be accessible externally.
- Other server: Specifies other supporting systems, such as management servers.
- Your organization: Take this to mean the network you’re trying to protect, whether it is your home network, the corporate network belonging to your employer, your own company, and/or the NGO you’re helping.
Authoritative DNS servers #
These are your DNS Servers serving public DNS Zones.
The only duty that the public Authoritative servers have is answering queries from the Internet on zones these servers are authoritative for. These servers do not make external name resolution requests and do not perform recursive queries. Attack surface on these servers can therefore be minimized. It is very important that these servers cannot perform recursion. For BIND9 add recursion no; to named.conf.options. If recursion is allowed, your server may be used in reflection (DDoS) attacks. These happen when an adversary leverages a botnet to generate DNS requests with the source address spoofed to be that of the target. The target will then experience a flood of DNS replies, all coming from UDP source port 53, and if not configured correctly your server will be one of the senders.
Regularly test your Authoritative DNS Servers for misconfigurations, including requesting a recursive query for a domain not hosted on that server, e.g. dig google.com @yourauthoritativeserver. The result should include something similar to:
And the DNS Server log should show something like this:
If, on the contrary, your server returns the information requested, you must immediately go back and configure your Authoritative DNS Server to not allow recursion.
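If you would rather automate that test than eyeball dig output, something along these lines could work. It is only a sketch using the Python standard library: the function names are mine, google.com is just the test name from above, and the "looks like recursion" heuristic (RA flag set, NOERROR, at least one answer) is an assumption you may want to tighten.

```python
import socket
import struct

def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query (QTYPE 1 = A) with the RD bit set."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)  # 0x0100 = RD
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN

def parse_header(response: bytes) -> dict:
    """Decode the parts of the 12-byte DNS header that matter for this test."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return {
        "ra": bool(flags & 0x0080),   # recursion available
        "rcode": flags & 0x000F,      # 0 = NOERROR, 5 = REFUSED
        "answers": ancount,
    }

def allows_recursion(header: dict) -> bool:
    """Heuristic (an assumption): RA set, NOERROR, and at least one answer."""
    return header["ra"] and header["rcode"] == 0 and header["answers"] > 0

def check_server(server: str, name: str = "google.com", timeout: float = 3.0) -> bool:
    """Send one UDP query to the server and report whether it recursed."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(4096)
    return allows_recursion(parse_header(response))
```

Calling check_server() with the address of one of your authoritative servers should return False. If it returns True, the server answered a recursive query for a zone it does not host.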
Recursive DNS servers #
The Recursive servers are responsible for querying on behalf of internal hosts. Queries against these servers are answered directly when the servers are authoritative for a zone, or through recursive queries to the internet when the servers are not authoritative for the zone and do not have the answer cached. This means that these servers perform DNS queries as per the DNS RFCs, and thus must be allowed to communicate with other DNS servers on the internet. It is therefore imperative that these servers are configured with correct and up-to-date Root Hints. Automate updating of the Root Hints with a simple cron job or with a systemd service.
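As a rough idea of what such a scheduled job could run, here is a small Python sketch. The download URL is the InterNIC copy of the root hints; the destination path, the function names and the sanity check are my assumptions, so adjust them to wherever your server reads its hints from.

```python
import os
import tempfile
import urllib.request

ROOT_HINTS_URL = "https://www.internic.net/domain/named.root"

def looks_like_root_hints(text: str) -> bool:
    """Sanity check: expect NS records for "." naming the 13 root servers."""
    servers = set()
    for line in text.splitlines():
        if line.startswith(";"):
            continue  # comment line
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "." and fields[-2].upper() == "NS":
            servers.add(fields[-1].upper())
    return len(servers) >= 13 and all(
        s.endswith(".ROOT-SERVERS.NET.") for s in servers
    )

def update_root_hints(destination: str) -> None:
    """Download the hints, validate them, then atomically replace the old file."""
    with urllib.request.urlopen(ROOT_HINTS_URL, timeout=30) as response:
        text = response.read().decode("ascii")
    if not looks_like_root_hints(text):
        raise ValueError("downloaded file does not look like root hints")
    directory = os.path.dirname(os.path.abspath(destination))
    with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as tmp:
        tmp.write(text)
    os.chmod(tmp.name, 0o644)          # the DNS server must be able to read it
    os.replace(tmp.name, destination)  # atomic on the same filesystem
```

Validating before replacing means a failed or hijacked download does not silently wipe out a working hints file. The DNS server still needs a reload afterwards to pick up the new file.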
In order for the recursive servers to be as effective as possible, configure them as secondaries for all your relevant zones (stealth secondaries for the external zones so as to not give away internal infrastructure information), as this will speed up queries for your own domains while providing a single resolution, logging and monitoring plane.
Make sure that your recursive DNS Servers have DNSSEC Validation enabled. For BIND9 configure this in named.conf.options using the options
Regularly test that the ad bit is set in the response, by querying an RR in a known signed zone, e.g.
The flags set above include ad which is exactly what we want, as that indicates that the server validated DNSSEC.
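To turn that check into something a monitoring job can run, you could parse the flags line that dig prints. A minimal sketch; the function names are mine, and the zone name and resolver address in run_dig() are placeholders for a signed zone and one of your own recursive servers.

```python
import re
import subprocess

def dig_flags(dig_output: str) -> set:
    """Extract the header flags from dig's ';; flags: ...;' line."""
    match = re.search(r"^;; flags:([^;]*);", dig_output, re.MULTILINE)
    return set(match.group(1).split()) if match else set()

def is_validated(dig_output: str) -> bool:
    """True when the resolver set the ad (authenticated data) flag."""
    return "ad" in dig_flags(dig_output)

def run_dig(name: str = "isc.org", resolver: str = "127.0.0.1") -> str:
    """Run dig against the resolver; both defaults are placeholders."""
    result = subprocess.run(
        ["dig", "+dnssec", name, f"@{resolver}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

is_validated(run_dig()) returning False for a zone you know is signed is worth an alert, as it suggests validation is switched off or broken.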
If you enable logging for DNSSEC (and you really should, but more about that later), you should get something like this in the log when starting BIND9:
Meaning that you now trust the root zone’s key-signing key, “Key 20326 for zone .”.
Internal DNS servers #
These are the DNS Servers serving all your internal DNS Zones. Technically they are also Authoritative DNS Servers, but those zones should never be advertised on the internet. Your internal domain names should always be officially registered and owned by your organization. Do not use your own fancy domains like .local or whatever. It costs next to nothing to maintain a “real” domain, and later on you’ll be happy you paid the few bucks for it.
Hidden Primary #
Configuring a hidden primary DNS server prevents attacks on said primary DNS server. The server should not be listed with the registrar, has no NS Resource Record, and is not referenced in the SOA Resource Record of the DNS zone (the hostname of one of the secondaries should be used within the SOA Resource Record). The server itself goes behind a firewall configured to only allow traffic to/from the secondaries. This way the primary DNS server is well protected, as it will be almost invisible, have a very limited attack surface, and can be monitored thoroughly for any unwanted behavior. Please note that this is more of a backup and recovery solution and will not stop a DDoS attack on your secondaries.
Authority Resource Records #
Zones are based on a concept of server authority. When a DNS server is configured to load a zone, it uses two types of Resource Records to determine the authoritative properties of the zone; the Start of Authority (SOA) and the Name server (NS) Resource Records (RR).
The SOA and NS Resource Records have a special role in zone configuration. They are required records for any zone and are typically the first Resource Records listed in DNS zone files.
Name Server (NS) Resource Records #
NS Resource Records are used to note which DNS servers are designated as authoritative for a zone. By listing a server in the NS Resource Record, it becomes known to others as an authoritative server for the zone. This means that any server specified in the NS Resource Record is to be considered an authoritative source by others, and is able to answer with certainty any queries made for names included in the zone. When configuring a Hidden Primary, that specific server does not have an NS Resource Record, but it is still configured as the primary on the secondaries. For BIND9 that is done through the use of type primary|secondary (master|slave in older releases) and the allow-transfer { ip-address-1; ip-address-2; }; directives in named.conf.local, as well as some fiddling with the SOA Resource Record (see next).
Start of Authority (SOA) Resource Record #
The SOA Resource Record is always first in any zone. It indicates the DNS server that is the primary server for the zone (the origin). It also stores other properties such as version information and timings that affect zone renewal or expiration. These properties affect how often transfers of the zone are done between servers authoritative for the zone.
The SOA Resource Record contains the following information:
Primary server: Host name of the primary DNS server for the zone. (must be faked when using a hidden primary).
Responsible person: E-mail address of the person responsible for administering the zone. A period (.) is used instead of an at sign (@).
Serial Number: Revision number of the zone file. This number must increase each time a Resource Record changes.
Refresh interval: Time¹ a secondary DNS server waits before querying its source for the zone to attempt renewal of the zone. When the refresh interval expires, the secondary DNS server requests a copy of the current SOA Resource Record for the zone from its source, which answers this request.
Retry interval: Time¹ a secondary server waits before retrying a failed zone transfer. This time is normally shorter than the refresh interval.
Expire interval: Time¹ before a secondary server stops responding to queries after a lapsed refresh interval where the zone was not refreshed or updated. Expiration occurs because at this point in time, the secondary server must consider its local data unreliable.
Negative response caching TTL: The time¹ clients are asked to cache negative results from the zone. The most common negative response is NXDOMAIN (Non-Existent Domain).
Here’s a practical example:
¹All times are in seconds, e.g. a value of 3600 means 1 hour.
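To make the fields above a bit more tangible, here is a small Python sketch that renders an SOA Resource Record as a zone file line and checks the relationships between the timers described above. The class, the hostnames and the timer values are illustrative assumptions, not recommendations.

```python
from dataclasses import dataclass

def email_to_rname(email: str) -> str:
    """Turn an e-mail address into the SOA 'responsible person' form:
    the @ becomes a dot, and dots in the local part are escaped."""
    local, domain = email.rsplit("@", 1)
    return local.replace(".", "\\.") + "." + domain.rstrip(".") + "."

@dataclass
class SOA:
    mname: str             # primary server (a secondary's name with a hidden primary)
    rname: str             # responsible person, see email_to_rname()
    serial: int            # must increase with every change
    refresh: int = 3600    # seconds, illustrative value
    retry: int = 900       # seconds, illustrative value
    expire: int = 1209600  # seconds, illustrative value
    minimum: int = 3600    # negative response caching TTL, illustrative value

    def check(self) -> list:
        """Return a list of problems with the timer relationships."""
        problems = []
        if self.retry >= self.refresh:
            problems.append("retry should be shorter than refresh")
        if self.expire <= self.refresh:
            problems.append("expire should be longer than refresh")
        return problems

    def to_zone_line(self, origin: str = "@") -> str:
        """Render the record on a single zone file line."""
        return (
            f"{origin} IN SOA {self.mname} {self.rname} "
            f"( {self.serial} {self.refresh} {self.retry} "
            f"{self.expire} {self.minimum} )"
        )
```

For example, SOA("ns1.example.com.", email_to_rname("hostmaster@example.com"), 2024010101) gives a record whose check() returns an empty list. Shrinking refresh below retry makes it complain.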
Block and Log #
DNS is an awesome log source and a powerful way of identifying adversaries as, and when, they attack, enabling early & successful detection of and response to attacks as they happen. As this is primarily a blog post about DNS, this topic will not be covered in any detail, however hopefully the next sections will be food for thought and inspiration on this subject.
However, do make sure you log everything on your DNS Servers. Previously it worked well to capture all traffic going to and from the DNS Servers. With DoH and DoT this has become harder, although you might still be able to do that if you are load balancing and/or offloading TLS to other devices. Remember that the only way to detect DNS Tunneling is your DNS Logs.
Blocking #
Use your own DNS Servers to rapidly and efficiently block unwanted sites - refer to the section on Response Policy Zones (RPZ) later in this blog post. You can also just simply black hole entire domains or hosts, redirecting users to a site in your control informing them of what and why this is happening.
DFIR and DNS #
There’s a lot of context available from DNS logs, e.g. the hostname to IP address association provided by DNS logging is very useful in characterizing NetFlow observations, which have no layer 7 context.
You should also add additional context to the logs, for example by using Mark Baggett’s Freq Server, a web server that integrates with your SIEM system and uses character frequency analysis to identify random hostnames, thereby identifying C2 and exfil traffic, and Domain Stats, which identifies recently registered and “first visit” domains.
This is just scratching the surface, however the possibilities are almost endless.
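To give a feel for what “identify random hostnames” can mean in practice, here is a deliberately simple stand-in. It is not how Freq Server works (that uses character frequency tables); it just scores each label by Shannon entropy, and both the minimum length and the threshold are assumptions you would have to tune against your own logs.

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def suspicious_labels(hostname: str, min_length: int = 12,
                      threshold: float = 3.5) -> list:
    """Return the labels that are both long and high-entropy.
    min_length and threshold are assumed starting points, not tuned values."""
    return [
        label
        for label in hostname.rstrip(".").lower().split(".")
        if len(label) >= min_length and shannon_entropy(label) >= threshold
    ]
```

Run over the query names in your DNS logs, a non-empty result is only a hint to look closer: CDNs and cloud services produce plenty of random-looking names too.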
Threat Hunting with DNS #
Some of us (including Mandiant and Microsoft) did not catch the initial DGA DNS traffic from the trojanized SolarWinds Orion (Sunburst/UNC2452), even with (some) DGA detection in place (see Freq Server). The encoding and hashing going into the generated domain names got past that, and since the group was using an AWS domain (IIRC), the domain was neither recently registered nor a first visit, so Domain Stats did not alert us either. However, once it was known how the encoding was done, it was easy to identify if and when Orion had communicated. Not least, we learned a lot about how to get even better at detecting DGA weirdness. As with everything in Threat Hunting, make sure to follow a playbook and turn the learnings and findings into automated detection.
The above is just one (oversimplified) example, however it should give you an idea of the possibilities.
Example Logging Config for BIND9 #
Below is a decent template to use as the starting point for logging configuration on a BIND9 recursive server. Please always use print-time iso8601-utc instead of “just” print-time yes. No-one, and I mean NO-ONE, sane wants any other format and timezone for their logs.
Note: BIND must be compiled with dnstap support included and configured to enable that support at runtime (comment it out in the config if yours isn’t.)
Cryptography #
Cryptography is used for signing zones (DNSSEC) and encrypting traffic with TLS for DNS over HTTPS (DoH), DNS over TLS (DoT), and DNS over QUIC (DoQ). This section covers the very basics of that.
DNSSEC #
The original DNS protocol (Do53) provided no protection against malicious or forged answers. DNS Security Extensions (DNSSEC) addresses this by adding digital signatures so DNS responses can be verified for integrity and authenticity. In other words, with DNSSEC the query reply can be verified to not have been tampered with and to be coming from the authoritative servers, not from a Meddler in the Middle (MitM). This is all done using asymmetric cryptography, that is Public/Private Key Pairs.
While there’s been some criticism of DNSSEC, I strongly encourage using it, at least for your internet facing domains. If you want to utilize DANE/TLSA there is no way around it either (and you really should have that in place).
Encryption #
For the purposes of this section, encryption of data involves transforming the data into another form, known as ciphertext, whereas the original data to be encrypted is known as plaintext. The protocols used are rather complex and involve both asymmetric and symmetric cryptography, however this is not critical for understanding the next sections.
Privacy #
What about privacy? Several clever minds set out to add privacy to DNS lookups, resulting in two different standards: DNS over HTTPS (DoH) and DNS over TLS (DoT). Both are an alternative transport to the long-time standard “native” DNS, carried in UDP packets (sometimes TCP) over port 53. DNS over TLS (DoT) is the better alternative to DoH. Your DNS Server should be capable of accepting queries over traditional DNS (Do53), DNS over HTTPS (DoH), and DNS over TLS (DoT). Which transport is used for an individual query should depend on what the client uses to contact your servers.
What is the privacy issue with Do53? By default, DNS queries and responses are sent in the clear over UDP. This means that all queries can be read by networks, ISPs, or anybody able to monitor transmissions - that is, any Meddler in the Middle (MitM) of DNS queries. Even when querying for the address of a website that you intend to access over HTTPS, the DNS query is exposed. DNS queries are not private, allowing governments to censor the Internet and adversaries to stalk your online behavior. A good example of this is the huge collections of historical DNS data, which are useful for analysis and investigations but are used for both good and bad.
However, do not believe that DoH or DoT solve the privacy issues we’re seeing. They are merely an additional step towards a better internet.
DNS over TLS (DoT) #
DoT is defined by RFC 7858. DoT is the network security protocol for encrypting and wrapping DNS queries and replies inside the Transport Layer Security (TLS) protocol to increase privacy and security by preventing eavesdropping and manipulation of DNS data via man-in-the-middle attacks. The well-known port number for DoT is 853/TCP.
Some claim that DoT provides less privacy as it will be obvious that DNS resolution is going on from the use of port 853, while on the contrary DoH is hidden with all the other HTTPS traffic. That seems more like security through obscurity to me, however feel free to educate me on the topic.
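On the wire, DoT is really just ordinary DNS-over-TCP inside a TLS session: each DNS message is prefixed with a two-byte length field. The sketch below shows that framing and a bare-bones exchange using only the Python standard library. The function names are mine, and the server address and hostname you pass in are whatever DoT server you want to test.

```python
import socket
import ssl
import struct

def frame(message: bytes) -> bytes:
    """Prefix a DNS message with its length as a 16-bit big-endian integer."""
    if len(message) > 0xFFFF:
        raise ValueError("DNS message too large for TCP framing")
    return struct.pack("!H", len(message)) + message

def unframe(data: bytes) -> bytes:
    """Strip the two-byte length prefix and return exactly one message."""
    (length,) = struct.unpack("!H", data[:2])
    payload = data[2:2 + length]
    if len(payload) != length:
        raise ValueError("truncated DNS message")
    return payload

def _read_exact(stream, count: int) -> bytes:
    """Read exactly count bytes from a socket-like object."""
    chunks = b""
    while len(chunks) < count:
        chunk = stream.recv(count - len(chunks))
        if not chunk:
            raise ConnectionError("connection closed early")
        chunks += chunk
    return chunks

def dot_exchange(server_ip: str, server_name: str, message: bytes,
                 timeout: float = 5.0) -> bytes:
    """Send one DNS message over TLS on port 853 and return the reply."""
    context = ssl.create_default_context()  # verifies the server certificate
    with socket.create_connection((server_ip, 853), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=server_name) as tls:
            tls.sendall(frame(message))
            (length,) = struct.unpack("!H", _read_exact(tls, 2))
            return _read_exact(tls, length)
```

The message argument is a normal wire-format DNS query, so anything that can build a Do53 query can be reused unchanged.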
DNS over HTTPS (DoH) #
DNS over HTTPS is a way to transport DNS queries and responses via HTTPS URIs, using the TLS security provided by HTTPS to encrypt those messages. DoH is defined by RFC 8484 for communications between a DNS client and a recursive resolver. It uses port 443/TCP for communications.
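For the curious, this is roughly what a DoH GET request looks like under the hood: the wire-format DNS query is base64url-encoded without padding and passed in the dns parameter. A small sketch; the function names are mine and the endpoint URL is a made-up placeholder.

```python
import base64
import struct

def build_query(name: str, qtype: int = 1) -> bytes:
    """Wire-format DNS query with ID 0 (RFC 8484 suggests 0 for cache
    friendliness) and the RD bit set. QTYPE 1 = A, QCLASS 1 = IN."""
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)

def doh_get_url(endpoint: str, message: bytes) -> str:
    """Build the GET URL: base64url without padding in the dns parameter."""
    encoded = base64.urlsafe_b64encode(message).rstrip(b"=").decode("ascii")
    return f"{endpoint}?dns={encoded}"
```

With a query for www.example.com this yields ?dns=AAABAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB, the same value used in the RFC 8484 example. The request is then sent with an Accept: application/dns-message header.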
There are some issues with DoH worth mentioning here:
- Centralization of the Internet: It appears that DoH leads to more centralization of DNS query traffic - e.g. Mozilla has now enabled DoH and is pointing to Cloudflare’s 1.1.1.1 for all your queries in Firefox. Bert Hubert wrote a great blog post on this topic in 2019 over at RIPE Labs.
- Increased Security Risk to your Network: You should be using DNS to “block and log”. Response Policy Zones (RPZ), for example, can be used to make DNS a control point to block unwanted traffic. The logs from DNS, discussed earlier, can be one of the best log sources on your network.
- What DNS Server is really used?: Traditionally your clients were assigned which DNS Servers to use by DHCP (or manually) and everything on that system used those DNS Servers. With DoH your applications, such as Firefox and Chrome, will be using something else in their default configurations, effectively circumventing your security controls discussed above, leaving you with less control of, and visibility into, what is going on within your network.²
²There’s an IETF Working Group working on adaptive DNS discovery. Hopefully this will not turn into the next WPAD.
DNS over QUIC (DoQ) #
It was not until the other day, when reminded by Patrick Mevzek, that I realized DoQ was published as RFC 9250 in May of 2022.
DoQ is the latest addition to these DNS transports. QUIC itself was designed to make HTTP traffic more secure, efficient, and faster, and features mandatory encryption, provides multiplexing, and improves on connection establishment time by combining the transport and encryption handshakes into a single round trip. Those qualities apply to DoQ as well.
DoQ uses port 853/UDP unless there is a mutual agreement to use another port, but it must not use port 53/UDP. As mentioned above, DoT uses port 853/TCP, and since QUIC is UDP based I guess port 853/UDP made sense and was there for the taking.
Zones #
A DNS Zone is a partition of the domain namespace delegated to you or your organization. Managing that zone well is your responsibility.
Forward Lookup Zones #
These are the zones for the domains you own. Manage them well. Make sure that you get them signed (DNSSEC) and create records for SPF, DMARC, and DKIM. Even if a domain is not used for e-mail, you should still create those Resource Records as null records. Example:
- DMARC: _dmarc in TXT v=DMARC1; p=reject; sp=reject; rua=mailto:dmarc-report-email; ruf=mailto:forensics-report-mail; fo=1; aspf=s; adkim=s | This DMARC Resource Record means that you will receive reports from MTAs supporting DMARC on the specified email addresses. See also ParseDMARC for further details on how to parse, ingest, and report on this with Elastic or Splunk.
- DKIM: *._domainkey in TXT v=DKIM1; p=
- SPF: @ in TXT v=spf1 -all | The DKIM and SPF Resource Records basically tell receiving MTAs that neither mail servers nor a DKIM key exist, thus any mail purporting to come from the domain in question should be rejected according to the DMARC policy.
- CAA: @ in CAA 0 issue "letsencrypt.org" | The CAA Resource Record is used to provide additional confirmation for the Certification Authority (CA) when validating an SSL certificate. This Resource Record allows you to specify which certification authorities are authorized to issue SSL certificates for your domain. This provides a little extra security if an adversary is trying to obtain certificates for your domain.
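If you have more than a handful of parked domains, generating these null records beats typing them. A minimal sketch, assuming zone-file style output; the function name and the report address parameter are mine.

```python
def null_mail_records(rua: str = "") -> list:
    """Zone file lines telling the world that this domain sends no mail.
    rua is an optional address for DMARC aggregate reports."""
    dmarc = "v=DMARC1; p=reject; sp=reject; adkim=s; aspf=s"
    if rua:
        dmarc += f"; rua=mailto:{rua}"
    return [
        '@ IN TXT "v=spf1 -all"',            # no host may send mail
        '*._domainkey IN TXT "v=DKIM1; p="',  # empty key: no valid DKIM key
        f'_dmarc IN TXT "{dmarc}"',           # reject anything that fails
    ]
```

The returned lines can be appended to the zone file of each parked domain (remember to bump the serial).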
Reverse Lookup Zones #
Always create reverse lookup zones for all of your networks. A lot of systems and services perform a reverse lookup, and while they often still work when receiving NXDOMAIN for the query, additional delays are incurred. Many applications will behave better and faster with the right Reverse Lookup Zones in place.
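Working out the zone names for your networks is easy to get wrong by hand, especially for IPv6. Here is a small sketch using Python's ipaddress module; the function names are mine, and it deliberately only handles prefixes that fall on an octet (IPv4) or nibble (IPv6) boundary.

```python
import ipaddress

def reverse_zone(network: str) -> str:
    """Name of the reverse lookup zone covering the given network."""
    net = ipaddress.ip_network(network)
    if net.version == 4:
        if net.prefixlen not in (8, 16, 24):
            raise ValueError("only /8, /16 and /24 are handled in this sketch")
        octets = str(net.network_address).split(".")[: net.prefixlen // 8]
        return ".".join(reversed(octets)) + ".in-addr.arpa"
    if net.prefixlen % 4:
        raise ValueError("IPv6 prefix must fall on a nibble boundary")
    nibbles = net.network_address.exploded.replace(":", "")[: net.prefixlen // 4]
    return ".".join(reversed(nibbles)) + ".ip6.arpa"

def ptr_name(address: str) -> str:
    """Owner name of the PTR record for a single address."""
    return ipaddress.ip_address(address).reverse_pointer
```

For example, reverse_zone("192.0.2.0/24") gives 2.0.192.in-addr.arpa, and ptr_name("192.0.2.10") gives the record name inside it, 10.2.0.192.in-addr.arpa.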
Response Policy Zones #
Use RPZs extensively! Automate creation of RPZs from MISP (or similar), ensuring that you have a core set of blocked domains based on good intel, and have the possibility to react very fast when detecting something anomalous going on. If your detection game is strong you may be able to stop the adversary before they get the loot. If you also add some ad-networks to your block-lists, you’ve not just increased security but also saved some bandwidth.³ Malvertising is real and used by several adversaries, so do block as much as you can.
³Some studies have shown up to a 40% decrease in bandwidth usage with ad-blocking. Add other savings in terms of CPU usage, and thus power usage, and it might mean real fiscal savings.
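The automation part can be as simple as turning a list of domains into RPZ records. The sketch below assumes you already have the domains as plain strings (how you pull them out of MISP or another feed is up to you) and uses the CNAME . action, which makes the resolver answer NXDOMAIN. The function name is mine.

```python
def rpz_records(domains) -> list:
    """RPZ lines that answer NXDOMAIN for each domain and its subdomains.
    Names are relative to the RPZ zone origin, hence no trailing dot."""
    cleaned = {d.strip().lower().rstrip(".") for d in domains if d.strip()}
    lines = []
    for domain in sorted(cleaned):
        lines.append(f"{domain} CNAME .")    # the domain itself
        lines.append(f"*.{domain} CNAME .")  # everything below it
    return lines
```

Write the result below the SOA and NS records of your policy zone, bump the serial, and reload. De-duplicating and sorting keeps diffs between generated versions small.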
Forwarders & Conditional Forwarders #
A DNS forwarder is a DNS server that forwards DNS queries for external DNS names to DNS servers outside of that network. A Conditional Forwarder is a DNS server that only forwards queries for specific domain names. Instead of forwarding all queries it cannot resolve locally, it forwards a query to specific forwarders based on the domain name contained in the query.
Forwarders #
Configure your recursive servers to use Root Hints and use proper recursion, then let your internal servers forward to those. Do not “stack” forwarders as that will spoil performance and mess with your DNS Resolution. Keep it to a single forwarder if at all possible.
Do not forward to Cloudflare, your ISP, Google, Quad9, or whatever. That is centralization, and centralization of DNS is not a good thing in this instance. You are also providing those entities with a lot of information about your organization and yourself by using them, whether or not you are using DoH and DoT. The benefit may be that they block some malicious domains, but they might also block domains that you do not think should be blocked. Nor is it going to be faster than your local cache for most of your queries, as many claim. And your ISP is not in the business of running DNS.
If using a DNS Provider, such as those discussed above, Cisco Umbrella, or similar services, you could consider those your recursive servers. It is still a good idea to cache those queries in what can be called a caching forwarder (which brings you back to running Recursive DNS Servers as discussed herein). This provides the following benefits and should be preferred over clients querying the DNS Provider’s servers directly:
- Performance: Reducing the number of queries that recurse to the DNS Provider, saving bandwidth and providing a faster experience for the end user when their queries are already in your DNS Server’s cache.
- Security: Logging and monitoring is advised to identify possible compromise from specific endpoints or customers, and may be required by local law.
- Local Policy: The ability to “block and log” all queries at the forwarder level puts more control in your hands (as discussed previously), without relying exclusively on the DNS Provider to block malicious domains, and lets you block e.g. ad-networks which the DNS Provider may not want a fight with.
Conditional Forwarders #
As discussed, Conditional Forwarders are able to forward queries for specific domains. This can be very useful for resolution of e.g. domain names inside a partner’s network with their own internal DNS, and during migrations of DNS infrastructures. Do not overdo it, as it is still more efficient to have your own servers configured as secondaries of that zone instead.
Tying it all together #
After all this rambling, what would the layered DNS Design actually look like?
Wow, that is a lot of work?
Well, it really is not. At least not in terms of hardware and software. If you’re running Active Directory you probably already have the internal DNS servers needed, then you “just” need:
Internal DNS Servers: If you’re running the Hackers Super Highway⁴ Active Directory you should already have Internal DNS Servers in place. They may require a little reconfiguration, however Microsoft provides a decent DNS Server with good features for secure updates etc., so stick with that here. This is the only place where any Forwarding shall be done, and that is to the Recursive DNS Servers. Do not do any conditional forwarding here, that is for the Recursive DNS Servers to do.
Recursive DNS Servers: This is where you’ll likely need some new servers. Install some *Nix boxes with BIND9, Unbound, PowerDNS Recursor, Knot Resolver, or whatever you prefer, as long as it can recurse. I’ll be installing BIND9, thank you! Configure these to tie other namespaces together. Let these host all internal zones as well as communicate with DNS Servers on the Internet, starting with the Root Name Servers (remember to update the root hints regularly). You might even want to make these secondaries for the AD Zone(s) as well - the goal here is to create a complete view of all internal DNS zones, reaching a state of “if it does not exist on the recursive servers, it does not exist.” This provides you with an Enterprise Management Plane for all name resolution, allowing you to Block and Log everything DNS as well as optimize performance. This may be overkill for smaller organizations, but really helps in complex environments. If you need to limit access to some zones or parts thereof, use ACLs and Views as needed.
- Important: This also makes the Recursive DNS Servers a juicy target for reconnaissance, so design and build them well and monitor the crap out of them.
- Bonus: The logs from the Recursive DNS Servers will tell you almost everything going on in your organization. Cherish them, use them, detect attacks with them, Threat Hunt with them, do DFIR with them.
Authoritative DNS Servers: These can be outsourced, and probably should be unless it is your core business to run DNS. However, do make sure that the provider supports a number of core requirements, including:
- 1) DNSSEC: This is a must, if they do not support this, run!
- 2) Hidden Primary: It is nice to have your own hidden primary and have that control the public secondaries hosted by the provider.
- 3) API: An API would allow you to manage your records in a structured and controlled way. Either an API or a Hidden Primary is a must.
- 4) Export: If neither 2 nor 3 is available, at a bare minimum you should be able to easily both import and export your configuration from/to standard zone files.
⁴As per John Strand of Black Hills Information Security. (I tend to agree).
There’s tons of great advice and good practice on managing and maintaining your DNS Servers securely on Knowledge-Sharing and Instantiating Norms for DNS and Naming Security | kindns.org, heed their advice!
References #
[1] Security and Privacy for public DNS Resolvers | ENISA
[2] Adaptive DNS Discovery | IETF
[3] Centralised DoH is bad for Privacy, in 2019 and beyond | RIPE Labs
[4] The Big DNS Privacy Debate | RIPE Labs
[5] Adblock Plus Efficacy Study
[6] Malvertising: When Online Ads Attack | Trend Micro
[7] Malvertising definition | Malwarebytes
[9] How SUNBURST Used DNS to Avoid Detection | ExtraHop
[10] Threathunting. Frequency analysis to identify C2 over DNS
[11] THREAT HUNTING USE CASE: DNS QUERIES | ReliaQuest
[12] Threat Hunting using DNS logs
[13] Mark Baggett’s Domain Stats | GitHub
[14] Mark Baggett’s Freq Server | GitHub
[15] Sean the Geek’s ParseDMARC | GitHub
[16] Knowledge-Sharing and Instantiating Norms for DNS and Naming Security | kindns.org
[17] The untold story of SolarWinds
[18] State of DNS Rebinding in 2023
[19] DNS Training and Labs | nsrc.org
[20] Everything about DNS