I’ve not been writing here much lately, and I think it’s for a pretty good reason. My daughter was born about a month ago, and I have been very busy there! In getting ready for the baby, my “office” has become the baby’s room, which means that my hardware projects are safely stashed away in the basement (for now). I do intend to get back to them once I get a decent schedule in place.
Earlier this year I set up Pi-hole in my home. For those unaware, Pi-hole is a network-wide ad blocker. This means that any device using my network at home will benefit from ad blocking. I’ve had a wildly unnecessary Dell server running in my basement for the last few years, and initially set it up there, since it had a static IP already. Configuring my router to use the server as a DNS server was fairly straightforward, and for the most part it worked flawlessly.
There were some minor issues I had when updating it, which Google helped me to solve. These issues generally were due to nsswitch.conf, and for some reason it seemed like I had something else on the server that was fighting with the Pi-hole over config files. I finally decided to shut down the server and set up a new instance of Pi-hole from scratch on an old Raspberry Pi I had around, and that issue seems to have resolved itself (for now?).
Out of curiosity, I started to take a peek at the codebase and realized that a good chunk of the code is simply shell scripts. On the one hand, this is impressive and fairly portable! On the other, it seemed like something I could set out to write for myself, as a learning exercise.
I knew the very high-level basics of what DNS is going into this: essentially a way to convert a name (e.g. google.com) into an IP address (e.g. 172.217.10.78). I knew that it relied on UDP for communication, for speed and simplicity (I learned later that a TCP connection is used in certain cases, such as when a response is too large for a single UDP packet).
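To make the "name in, IP out" idea a bit more concrete, here is a minimal sketch of what a DNS query looks like on the wire, built with only the standard library. The function names (encodeName, buildQuery) are made up for this illustration and are not from my project; the layout (a 12-byte header followed by a length-prefixed name, a type, and a class) follows the DNS protocol as I understand it.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"strings"
)

// encodeName converts a dotted name such as "google.com" into DNS wire
// format: each label is prefixed with its length, and a zero byte ends the name.
// (Hypothetical helper for illustration only.)
func encodeName(name string) ([]byte, error) {
	var out []byte
	for _, label := range strings.Split(strings.TrimSuffix(name, "."), ".") {
		if len(label) == 0 || len(label) > 63 {
			return nil, errors.New("invalid label length")
		}
		out = append(out, byte(len(label)))
		out = append(out, label...)
	}
	return append(out, 0), nil
}

// buildQuery assembles a minimal query: a 12-byte header plus one question.
func buildQuery(id uint16, name string, qtype uint16) ([]byte, error) {
	qname, err := encodeName(name)
	if err != nil {
		return nil, err
	}
	msg := make([]byte, 12, 12+len(qname)+4)
	binary.BigEndian.PutUint16(msg[0:2], id)     // transaction ID
	binary.BigEndian.PutUint16(msg[2:4], 0x0100) // flags: recursion desired
	binary.BigEndian.PutUint16(msg[4:6], 1)      // question count
	msg = append(msg, qname...)
	msg = append(msg, byte(qtype>>8), byte(qtype)) // question type (1 = A)
	msg = append(msg, 0, 1)                        // question class (1 = IN)
	return msg, nil
}

func main() {
	q, err := buildQuery(0x1234, "google.com", 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d bytes: % x\n", len(q), q)
}
```

The whole query fits comfortably inside a single small UDP packet, which is a big part of why UDP is a natural fit here.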
I found the “DNS Protocol” article on NS1’s website to be pretty informative, and used this as a good starting point. I came up with a pretty simple sketch for what I wanted to do:
As I’ve found over time, Go has some pretty great packages readily available, both in the standard library and in the golang.org/x extension modules. In this case, Google's Go team maintains the dnsmessage package (golang.org/x/net/dns/dnsmessage), which handles a lot of the packing/unpacking of packets, and makes it easy to work with the protocol.
This is where I started. I created an event loop that would listen for requests and farm them out to handlers. The code for this is quite straightforward, and really didn’t change much from the first time I wrote it:
s.conn, err = net.ListenUDP("udp", &net.UDPAddr{Port: 53})
if err != nil {
	log.Fatalf("Failed to listen on UDP port: %v", err)
}
defer s.conn.Close()
for {
	// Allocate a fresh buffer per packet so each goroutine owns its own data.
	packetBuffer := make([]byte, 512)
	n, remoteAddr, err := s.conn.ReadFromUDP(packetBuffer)
	if err != nil {
		log.Fatalf("Failed to read packets from UDP port: %v", err)
	}
	// Hand off only the bytes that were actually read.
	go handleReceivedDnsRequest(packetBuffer[:n], remoteAddr)
}
That’s it! The code here is fairly self-explanatory, so I won’t say much except to mention that I chose to call handleReceivedDnsRequest(...) as a goroutine, so that requests can be handled concurrently. I don’t expect to have much of a load at home, but it seemed like good practice to do it this way.
To get started, I decided to take a very simple approach to filtering domains: I’m not going to support regular expressions to start with. I’d love to add this down the road, but to start, I think it’s a good simplification to use explicit lists. To this end, I decided that the blacklist and whitelist would simply be []string slices. Down the road, it may make sense to use a database for this, but for a starting point, this should work nicely.
I decided to use Pi-hole’s approach of using Hosts files as a way to define block lists. Hosts files are well-known and simple to understand, and it will allow me to quickly build lists of domains to block.
Again to simplify things, I chose to discard any IP information in hosts files. This is for two reasons: I only plan to “block” the domains, so I don’t need them to redirect to specific places, and it’s less data to carry around.
I wrote a quick parser to retrieve Hosts files from a given URL (e.g. https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts) and parse it into a map of domain name to IP address. Then I simply iterated through the map to retrieve the keys (i.e. domain names), discarding the IP address information.
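For illustration, here is a minimal sketch of that kind of parser. It is not the code from my repository: parseHosts and parseHostsString are names invented for this example, it skips the intermediate map and collects names directly, and it reads from an io.Reader (an HTTP response body would work just as well as the string used here). Note that it does not filter out local entries such as localhost, which a real block list loader would probably want to do.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// parseHosts reads hosts-file lines ("IP name [name...]") and returns the
// domain names, discarding the IP column. Comments, blank lines, and
// duplicate names are skipped.
func parseHosts(r io.Reader) ([]string, error) {
	var domains []string
	seen := make(map[string]bool)
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		line := scanner.Text()
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i] // strip a full-line or trailing comment
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // blank, comment-only, or malformed line
		}
		// fields[0] is the IP address, which is deliberately ignored.
		for _, name := range fields[1:] {
			name = strings.ToLower(name)
			if !seen[name] {
				seen[name] = true
				domains = append(domains, name)
			}
		}
	}
	return domains, scanner.Err()
}

// parseHostsString is a small convenience wrapper for in-memory input.
func parseHostsString(s string) ([]string, error) {
	return parseHosts(strings.NewReader(s))
}

func main() {
	sample := "# sample list\n0.0.0.0 ads.example.com # inline comment\n0.0.0.0 tracker.example.net\n"
	domains, err := parseHostsString(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(domains)
}
```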
The whitelist is expected to be much shorter, and probably more personalized. As such, I don’t expect this to be parsed from several remote lists, and even if it is, I suspect the parsing will be of a simple list, not a map, so using a Hosts file for this is unnecessary.
So we’ve received the DNS request and parsed it into a Go structure. The next step is to figure out whether or not to block this request. The logic I’ve chosen is simply:
allowedRequest = whitelisted or not(blacklisted)
Put another way: we will only block requests that are not on the whitelist and are on the blacklist.
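As a small sketch of that rule in Go, assuming the two lists are plain []string slices as described earlier (the filter type and its method names are invented for this example, as is the choice to lowercase names and trim a trailing dot before comparing):

```go
package main

import (
	"fmt"
	"strings"
)

// filter holds the two explicit lists (hypothetical type for illustration).
type filter struct {
	whitelist []string
	blacklist []string
}

// normalize lowercases a name and removes the trailing dot that
// fully-qualified DNS names carry, so list entries compare consistently.
func normalize(name string) string {
	return strings.ToLower(strings.TrimSuffix(name, "."))
}

// contains reports whether name appears in list (linear scan).
func contains(list []string, name string) bool {
	for _, entry := range list {
		if normalize(entry) == name {
			return true
		}
	}
	return false
}

// allowed implements: allowedRequest = whitelisted or not(blacklisted).
func (f filter) allowed(name string) bool {
	name = normalize(name)
	return contains(f.whitelist, name) || !contains(f.blacklist, name)
}

func main() {
	f := filter{
		whitelist: []string{"good.example.com"},
		blacklist: []string{"good.example.com", "ads.example.com"},
	}
	for _, name := range []string{"good.example.com.", "ads.example.com.", "example.org."} {
		fmt.Printf("%s allowed=%v\n", name, f.allowed(name))
	}
}
```

A linear scan over a slice is fine for short lists; for a block list with many thousands of entries, a map[string]struct{} lookup would likely be a better fit.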
A likely unnecessary thing I did was to support multiple “DNS Questions” (a DNS Question is essentially one domain name to look up) per DNS Request. In practice this is apparently fairly rare, but it was pretty simple to implement, so I went ahead and did it anyway. Basically the logic is to filter the questions into “block” and “allow” lists, creating DNS Answers for the blocked Questions that point to 0.0.0.0 (or :: for an IPv6 Question), and forwarding the allowed DNS Questions to an upstream DNS Resolver (e.g. 8.8.8.8).
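The splitting step can be sketched roughly like this. The question and answer structs here are simplified stand-ins invented for the example (the real code works with the dnsmessage types), and the choice to silently drop blocked questions that are neither A nor AAAA is an assumption of this sketch, not something my implementation necessarily does.

```go
package main

import (
	"fmt"
	"net"
)

// DNS record type codes for IPv4 (A) and IPv6 (AAAA) addresses.
const (
	typeA    uint16 = 1
	typeAAAA uint16 = 28
)

// question and answer are simplified, hypothetical stand-ins for the
// corresponding DNS message structures.
type question struct {
	Name string
	Type uint16
}

type answer struct {
	Name string
	Type uint16
	TTL  uint32
	IP   net.IP
}

// partition splits questions into sinkhole answers for blocked names and a
// list of questions that should be forwarded upstream.
func partition(qs []question, blocked func(string) bool) (answers []answer, forward []question) {
	for _, q := range qs {
		if !blocked(q.Name) {
			forward = append(forward, q)
			continue
		}
		switch q.Type {
		case typeA:
			answers = append(answers, answer{Name: q.Name, Type: q.Type, TTL: 0, IP: net.IPv4zero})
		case typeAAAA:
			answers = append(answers, answer{Name: q.Name, Type: q.Type, TTL: 0, IP: net.IPv6unspecified})
		default:
			// Assumption: other record types for blocked names get no answer.
		}
	}
	return answers, forward
}

func main() {
	isBlocked := func(name string) bool { return name == "ads.example.com." }
	qs := []question{
		{Name: "ads.example.com.", Type: typeA},
		{Name: "ads.example.com.", Type: typeAAAA},
		{Name: "example.org.", Type: typeA},
	}
	answers, forward := partition(qs, isBlocked)
	for _, a := range answers {
		fmt.Println("blocked:", a.Name, "->", a.IP)
	}
	for _, q := range forward {
		fmt.Println("forward:", q.Name)
	}
}
```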
The way I implemented this upstream communication isn’t great. Since each incoming DNS request is handled in its own goroutine, I chose to have the upstream communication done over its own UDP socket. This means that if several DNS requests come in, there will be several concurrent UDP packet exchanges with the upstream server. In a future version, maybe I will choose to use a single centralized goroutine for communicating with the upstream DNS resolver. That said, this could also be a bottleneck and UDP ports are cheap, so there isn’t really a good reason to change it. Another thing I am relatively unhappy with is the logic for when something goes wrong in communicating with the upstream server. Right now, the goroutine will block indefinitely if the request or response UDP packet is dropped.
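One possible way to address the indefinite blocking, sketched here rather than taken from my code, is to set a deadline on the per-request upstream socket so a dropped packet turns into an error instead of a stuck goroutine. The function name forwardQuery and the two-second upstreamTimeout are assumptions chosen for this example; the hard-coded bytes in main are a hand-built A query for example.com.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// upstreamTimeout is an arbitrary value picked for this sketch.
const upstreamTimeout = 2 * time.Second

// forwardQuery sends a raw DNS query to the upstream resolver over its own
// UDP socket and waits at most timeout for the reply.
func forwardQuery(query []byte, upstream string, timeout time.Duration) ([]byte, error) {
	conn, err := net.Dial("udp", upstream)
	if err != nil {
		return nil, err
	}
	defer conn.Close()

	// The deadline covers both the write and the read, so a lost request
	// or a lost response makes Read return a timeout error.
	if err := conn.SetDeadline(time.Now().Add(timeout)); err != nil {
		return nil, err
	}
	if _, err := conn.Write(query); err != nil {
		return nil, err
	}
	buf := make([]byte, 512)
	n, err := conn.Read(buf)
	if err != nil {
		return nil, err
	}
	return buf[:n], nil
}

func main() {
	// Minimal query: header (ID 0x1234, recursion desired, one question),
	// then "example.com", type A, class IN.
	query := []byte{
		0x12, 0x34, 0x01, 0x00, 0x00, 0x01, 0, 0, 0, 0, 0, 0,
		7, 'e', 'x', 'a', 'm', 'p', 'l', 'e', 3, 'c', 'o', 'm', 0,
		0, 1, 0, 1,
	}
	reply, err := forwardQuery(query, "8.8.8.8:53", upstreamTimeout)
	if err != nil {
		fmt.Println("upstream error:", err)
		return
	}
	fmt.Printf("received %d bytes from upstream\n", len(reply))
}
```

With something like this in place, the handler could answer the client with a failure response (or retry) when the error comes back, instead of leaving the goroutine waiting forever.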
Once we have all of this information, we construct the response to the DNS request. This essentially means putting the DNS Answers that correspond to the DNS Questions asked of us into a packet, and sending that UDP packet back to the requester.
I tested this using dig in Ubuntu on Windows (WSL):
jfisher@JFisher-Desktop:~ $ dig @127.0.0.1 google.com
; <<>> DiG 9.10.3-P4-Ubuntu <<>> @127.0.0.1 google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5330
;; flags: qr aa; QUERY: 0, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; ANSWER SECTION:
google.com. 140 IN A 172.217.10.110
;; Query time: 20 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Sep 20 21:04:14 EDT 2019
;; MSG SIZE rcvd: 38
jfisher@JFisher-Desktop:~ $ dig @127.0.0.1 itunes.net
; <<>> DiG 9.10.3-P4-Ubuntu <<>> @127.0.0.1 itunes.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8573
;; flags: qr aa; QUERY: 0, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; ANSWER SECTION:
itunes.net. 14399 IN A 165.160.13.20
itunes.net. 14399 IN A 165.160.15.20
;; Query time: 29 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Sep 20 21:04:21 EDT 2019
;; MSG SIZE rcvd: 54
and then with a blocked domain (using this block list):
jfisher@JFisher-Desktop:~ $ dig @127.0.0.1 googletagservices.com
; <<>> DiG 9.10.3-P4-Ubuntu <<>> @127.0.0.1 googletagservices.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53880
;; flags: qr aa; QUERY: 0, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; ANSWER SECTION:
googletagservices.com. 0 IN A 0.0.0.0
;; Query time: 14 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Sep 20 21:08:19 EDT 2019
;; MSG SIZE rcvd: 49
Note the response is 0.0.0.0, as expected of a blocked domain.
If you want to take a peek at my code, it is on my GitHub: https://github.com/jonathanfisher/DnsFilter.
Some next steps, if I ever get to it:
I am not sure that this method of blocking ads/tracking/etc. will be around very long. A simple way for some developers to get around this is to embed their own calls to DNS servers in their applications, bypassing the need for network-level DNS servers.