Transcoding streams for the Nokia N900 @ 3G data rates

January 22nd, 2011

Every four years, I relearn how to stream World Cup matches from DVB into the LAN and onto 3G mobile devices. This time I finally wrote down some notes. Besides DVB, any input that vlc can decode works. I used these scripts to watch the morning talks at 27C3 on the N900 from my hotel room. Streaming video from a file server at home should work, too.

Local Multicast

  • Convince your boss: Productivity improves if everyone can watch the matches on their work screen and doesn’t need to hang around the break room. Internet uplink availability improves, too, with a local multicast source instead of many unicast streams.
  • Realtek NICs continually dropped packets when sending multicast UDP, resulting in ugly decoding errors at the receivers. I observed no packet loss with Intel NICs.
  • IGMP snooping switches: If it works, only the ports with active receivers pass the full stream. If it doesn’t, you might end up bombarding slow servers with 5-12MBit/s UDP traffic.
  • You need multicast filters in front of POS/VoIP adapters. At least if you want them to send and receive faxes. Better yet, isolate them in a separate VLAN.
  • Putting filters in place at VPN ethernet bridges over slow SDSL links is also a good idea.
  • Use multiple processes: stream multicast with one, transcode the multicast stream with another. It allows tweaking transcoding options without annoying the local viewers.

cvlc -I rc -v channels.conf --mtu=512 --sout '#rtp{mux=ts,dst=226.23.23.42,port=1234,sdp=sap://,name="Fussi"}'
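To check whether a NIC or switch drops multicast packets, a receiver can join the group and watch the RTP sequence numbers. The following is a minimal sketch, assuming the group address and port from the command above; the function names are my own, and the loss counter assumes packets arrive in order and without duplicates, which is usually true on a LAN.

```python
import socket
import struct

GROUP = "226.23.23.42"  # multicast group from the cvlc command
PORT = 1234             # RTP port from the cvlc command


def membership_request(group, interface="0.0.0.0"):
    """Build the ip_mreq structure (group address + local interface address)."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_receiver(group=GROUP, port=PORT):
    """Open a UDP socket and join the multicast group (this triggers the IGMP join)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def rtp_sequence(datagram):
    """Return the 16-bit sequence number stored in bytes 2-3 of the RTP header."""
    return struct.unpack("!H", datagram[2:4])[0]


def count_lost(sequence_numbers):
    """Count missing packets between consecutive sequence numbers.

    Handles the 16-bit wraparound; assumes in-order delivery without duplicates.
    """
    lost = 0
    for prev, cur in zip(sequence_numbers, sequence_numbers[1:]):
        lost += (cur - prev - 1) % 65536
    return lost
```

To use it, call open_receiver(), collect rtp_sequence(sock.recv(2048)) for a while and feed the list to count_lost(). A steadily growing count on one sender but not on another points at the NIC rather than the network.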

Mobile Limits

  • Variable data rates and latencies
    • I restricted the bit rate to less than 384kBit/s. Although HSDPA offers data rates up to 7MBit/s, the mobile network often delivers less, especially in subway tunnels or crowded places with many smartphone users.
    • Network delay in a moving train can reach 10 seconds and more. The receive buffer compensates for short outages. Restart the stream after an outage of more than 5s.
    • TCP transport is not ideal in this case as the link layer already provides reliability. All packets retransmitted by TCP eventually arrive and must be discarded. TODO: Read this.
    • Only HTTP gets reliably through NAT and proxies.
  • The N900 cannot handle b-frames
    • vlc and mplayer lag behind and lose audio-sync.
    • The hardware H264 decoder just blocks on the first b-frame.
    • Restricting the H264 profile to “baseline” helps.
  • The iPhone can play the stream with vlc
    • Not without voiding your warranty, of course. It’s an iPhone.
    • Baseline restriction, too.
  • Android? Computer says: no
    • We didn’t find any media player that would play the stream in any form. We tried MP4/TS, ASF/MMSH and RTP.
    • Is there really no vlc for Android?

Transcoding Options Voodoo

I tried lots of different options and fiddled with x264 encoder settings. I started from “preset=faster,profile=baseline” and gradually increased encoding quality until vlc reached about 60% CPU load on the transcoding machine. Then I reduced decoding-intensive features until the N900 would play it smoothly in parallel to my mail and IRC clients.

This option set is what I use now:

vlc -I rc -L -Z $1 \
  --sout '#transcode{deinterlace,audio-sync,height=288,fps=25,threads=1,high-priority,vcodec=h264,acodec=aac,ab=24,channels=1,vb=0,venc=x264{preset=faster,profile=baseline,keyint=125,min-keyint=25,qpmin=20,qpmax=51,qpstep=50,crf=25,vbv-maxrate=280,vbv-bufsize=450,scenecut=-1,ref=6,mixed-refs,subme=7,merange=32,partitions=all,direct=auto,trellis=2,no-dct-decimate,deblock=1:1}}:duplicate{dst=std{access=http,mux=ts,dst=0.0.0.0:3380},dst=std{access=mmsh,mux=asfh,dst=0.0.0.0:3370}}'
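The numbers in this option set can be checked against the 384kBit/s limit with a little arithmetic. This is a rough sketch using only values from the command above (vbv-maxrate, vbv-bufsize, ab, fps, keyint); the remaining headroom has to cover TS muxing and HTTP/TCP overhead, which I have not measured.

```python
VIDEO_MAXRATE_KBIT = 280  # vbv-maxrate, kBit/s
VIDEO_BUFSIZE_KBIT = 450  # vbv-bufsize, kBit
AUDIO_KBIT = 24           # ab, kBit/s
FPS = 25                  # fps
KEYINT = 125              # keyint, frames
LINK_BUDGET_KBIT = 384    # target ceiling for the mobile link


def stream_budget():
    """Summarize the bit rate budget implied by the transcoding options."""
    payload = VIDEO_MAXRATE_KBIT + AUDIO_KBIT
    return {
        "payload_kbit": payload,
        # What is left for container and transport overhead
        "headroom_kbit": LINK_BUDGET_KBIT - payload,
        # How many seconds of video at the maximum rate fit into the VBV buffer
        "vbv_buffer_seconds": VIDEO_BUFSIZE_KBIT / VIDEO_MAXRATE_KBIT,
        # Worst-case wait for the next key frame when joining the stream
        "max_keyframe_interval_seconds": KEYINT / FPS,
    }
```

The result is 304kBit/s of payload with 80kBit/s headroom, a VBV buffer of about 1.6 seconds, and up to 5 seconds between key frames. The last figure matters whenever a player joins or restarts the stream, because it cannot start decoding before the next key frame.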

Using the N900 hardware H264 decoder

Maemo comes with special gstreamer plugins to support the Texas Instruments hardware codecs on the N900. The plugin source code is available.

You can browse the plugin options with gst-inspect:

narf900:~$ gst-inspect dsp
Plugin Details:
  Name:			dsp
  Description:		Texas Instruments DSP elements
  Filename:		/usr/lib/gstreamer-0.10/libgstdsp.so
  Version:		0.7.0-0maemo2.4+0m5
  License:		LGPL
  Source module:	none
  Binary package:	none
  Origin URL:		none

  dsph264enc: DSP video encoder
  dspjpegenc: DSP video encoder
  dspmp4venc: DSP MPEG-4 video encoder
  dsph263enc: DSP video encoder
  dspvdec: DSP video decoder
  dspdummy: DSP dummy element

Running the gstreamer pipeline from the shell

The simplest way to watch the transcoded stream is to start the gstreamer pipeline via gst-launch. Unfortunately, gst-launch cannot use full-screen, so you’ll see the clock and status line at the top of the display.

The decoder sometimes has problems finding the first key frame in the stream and gstreamer aborts with “pipeline doesn’t want to preroll”. I just use a loop to restart the pipeline in this case:

while :; do
  gst-launch gnomevfssrc location=http://yourserver:3380 ! queue ! ffdemux_mpegts name=demux \
     demux.audio_00 ! queue ! nokiaaacdec ! autoaudiosink \
     demux.video_00 ! queue ! dspvdec ! autovideosink
  sleep 1
done
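The same restart loop can be written in Python, which makes it easier to add logging or a restart limit later. This is only a sketch: the pipeline string is copied from the shell loop above, the URL is a placeholder, and the max_runs, runner and sleep parameters are my own additions so the loop can be exercised without gstreamer.

```python
import shlex
import subprocess
import time

# Pipeline description taken from the shell loop; {url} is filled in by build_command()
PIPELINE = (
    "gnomevfssrc location={url} ! queue ! ffdemux_mpegts name=demux "
    "demux.audio_00 ! queue ! nokiaaacdec ! autoaudiosink "
    "demux.video_00 ! queue ! dspvdec ! autovideosink"
)


def build_command(url):
    """Return the gst-launch argument list for the given stream URL."""
    return ["gst-launch"] + shlex.split(PIPELINE.format(url=url))


def restart_loop(cmd, delay=1.0, max_runs=None,
                 runner=subprocess.call, sleep=time.sleep):
    """Run cmd again whenever it exits, pausing `delay` seconds in between.

    With max_runs=None the loop never ends, like `while :; do ... done`.
    Returns the number of runs when max_runs is reached.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        runner(cmd)
        runs += 1
        sleep(delay)
    return runs
```

Calling restart_loop(build_command("http://yourserver:3380")) behaves like the shell version.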

Using the gstreamer C api

A dedicated application is the more flexible approach to control the gstreamer pipeline and handle asynchronous events.

I ripped code from a Maemo gstreamer example, added a few lines and stitched it back together. It proves the concept but has lots of room for improvement. Aspect-ratio is often not set correctly. Main source file here. Autoconf environment for building in the Maemo SDK there.

Remote Control

  • vlc-remote works.
  • The transcoding vlc often needs a restart after a change of source/channel.
  • TODO: Control the source and transcoding chain from the viewer application.

DNS Traceroute with Scapy

April 14th, 2010

[Work in progress, not the final version]

DNS Traceroute with Scapy can

  • detect forged MITM answers, as they happen on network routes into China
  • find packet filters that drop queries with EDNS flags
  • find packet filters that drop large responses

This is my first experience with Scapy, and even though I haven’t used Python in several years, it certainly is easier to learn than Perl (IMHO, I often confuse myself with the different meanings of %@$#!). It is also extensible: a new payload type takes only a few lines of code. Scapy’s main UI is the Python shell, which lets you inspect, twist and turn packets in more ways than many other utilities. I’m still exploring the vast possibilities of Scapy and Python, so this article will likely change over time.

On-Path DNS forgeries

My primary reason for a DNS traceroute was to get a better view of the Chinese DNS filters, which have spurred a lot of news reports lately. hping2 (with a hand-crafted DNS payload) was not flexible enough. Scapy offers many more options to examine the returned packets.

Here’s a simple traceroute to a Chinese DNS server, querying one of the blocked substrings. Because the DNS filters are on the path of the packets, they don’t have to guess the query’s ID or source port, so randomization won’t help against this kind of poisoning:

>>> ans=sr(IP(dst="dns.baidu.com",ttl=(1,20))/UDP(dport=53)/DNS(qd=DNSQR(qname="facebook.com.baidu.com")),multi=1,timeout=2,inter=1)[0];ans.show()

The forged answers appear only a few hops behind the border routers. The queries are not dropped, though. They reach the destination server and eventually result in the correct response:

[...]
0008 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / ICMP 80.91.252.156 > 10.0.8.222 time-exceeded 0 / IPerror / UDPerror
0009 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / ICMP 213.248.94.126 > 10.0.8.222 time-exceeded 0 / IPerror / UDPerror
0010 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / ICMP 219.158.30.169 > 10.0.8.222 time-exceeded 0 / IPerror / UDPerror
0011 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / UDP / DNS Ans "93.46.8.89"
0012 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / ICMP 219.158.3.201 > 10.0.8.222 time-exceeded 0 / IPerror / UDPerror
0013 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / UDP / DNS Ans "8.7.198.45"
0014 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / ICMP 219.158.4.69 > 10.0.8.222 time-exceeded 0 / IPerror / UDPerror
0015 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / UDP / DNS Ans "46.82.174.68"
0016 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / ICMP 123.126.0.166 > 10.0.8.222 time-exceeded 0 / IPerror / UDPerror
0017 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / UDP / DNS Ans "159.106.121.75"
0018 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / ICMP 202.106.227.166 > 10.0.8.222 time-exceeded 0 / IPerror / UDPerror
0019 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / UDP / DNS Ans "37.61.54.158"
0020 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / ICMP 61.148.156.138 > 10.0.8.222 time-exceeded 0 / IPerror / UDPerror
0021 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / ICMP 202.106.43.66 > 10.0.8.222 time-exceeded 0 / IPerror / UDPerror
0022 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / UDP / DNS Ans "46.82.174.68"
0023 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / ICMP 61.135.165.253 > 10.0.8.222 time-exceeded 0 / IPerror / UDPerror
0024 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / UDP / DNS Ans "243.185.187.39"
0025 IP / UDP / DNS Qry "facebook.com.baidu.com" ==> IP / UDP / DNS Ans

>>> conf.AS_resolver.resolve(ans[9][1][IP].src)
[('213.248.94.126', 1299, 'TELIANET TeliaNet Global Network')]

>>> conf.AS_resolver.resolve(ans[10][1][IP].src)
[('219.158.30.169', 4837, 'CHINA169-BACKBONE CNCGROUP China169 Backbone')]

The last response contains the actual NXDOMAIN answer from dns.baidu.com:

>>> ans[25][1]
<IP version=4L ihl=5L tos=0x20 len=111 id=33106 flags= frag=0L ttl=53 proto=udp chksum=0x3b2 src=202.108.22.220 dst=10.0.8.222 options='' |<UDP sport=domain dport=domain len=91 chksum=0x1eac |<DNS id=0 qr=1L opcode=QUERY aa=1L tc=0L rd=0L ra=0L z=0L rcode=name-error qdcount=1 ancount=0 nscount=1 arcount=0 qd=<DNSQR qname='facebook.com.baidu.com.' qtype=A qclass=IN |> an=0 ns=<DNSRR rrname='baidu.com.' type=SOA rclass=IN ttl=7200 rdata="\x03dns\xc0\x19\x02sa\xc0\x19w\xce\xcb\xd6\x00\x00\x01,\x00\x00\x01,\x00'\x8d\x00\x00\x00\x1c " |> ar=0 |>>>

>>> ans[25][1][DNSRR].show()
###[ DNS Resource Record ]###
rrname= 'baidu.com.'
type= SOA
rclass= IN
ttl= 7200

Thinking point: If a resolver expects a DNSSEC-signed answer, should it ignore unsigned or invalid responses and keep listening for further packets until the “real” answer arrives or a timeout occurs? Current implementations don’t do this. They would accept the forged packet and return a name resolution failure due to invalid signatures.
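The strategy from the thinking point could look roughly like this. It is only a sketch of the idea, not how any existing resolver behaves: is_valid stands in for real DNSSEC validation, and the (arrival_time, payload) tuples are a hypothetical representation of the packets received for one query.

```python
def first_valid_answer(responses, is_valid, deadline):
    """Return the first response that validates, ignoring earlier invalid ones.

    responses: iterable of (arrival_time, payload) in arrival order
    is_valid:  callable that returns True for a correctly signed payload
    deadline:  latest arrival time still accepted (the query timeout)
    """
    for arrival_time, payload in responses:
        if arrival_time > deadline:
            break
        if is_valid(payload):
            return payload
        # Unsigned or badly signed: possibly forged, keep listening
    return None
```

With the trace above, the forged A records would be skipped and the signed answer from the real server would win, as long as it arrives before the timeout.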

The DNS interceptors appear on every route:

>>> ans=sr(IP(dst="a.cnnic.cn",ttl=(1,20))/UDP(dport=53)/DNS(qd=DNSQR(qname="xmarks.com")),multi=1,timeout=2,inter=1)[0];ans.show()
[...]
0009 IP / UDP / DNS Qry "xmarks.com" ==> IP / ICMP 203.192.137.174 > 10.42.23.3 time-exceeded 0 / IPerror / UDPerror
0010 IP / UDP / DNS Qry "xmarks.com" ==> IP / ICMP 159.226.254.253 > 10.42.23.3 time-exceeded 0 / IPerror / UDPerror
0011 IP / UDP / DNS Qry "xmarks.com" ==> IP / ICMP 159.226.254.29 > 10.42.23.3 time-exceeded 0 / IPerror / UDPerror
0012 IP / UDP / DNS Qry "xmarks.com" ==> IP / UDP / DNS Ans "37.61.54.158"
[...]

>>> conf.AS_resolver.resolve(ans[9][1][IP].src)
[('203.192.137.174', 10026, 'ANC Asia Netcom Corporation')]

>>> conf.AS_resolver.resolve(ans[10][1][IP].src)
[('159.226.254.253', 7497, 'CSTNET-AS-AP Computer Network Information Center')]

>>> ans=sr(IP(dst="123.123.123.123",ttl=(1,20))/UDP(dport=53)/DNS(qd=DNSQR(qname="twitter.com")),multi=1,timeout=2,inter=1)[0];ans.show()
[...]
0004 IP / UDP / DNS Qry "twitter.com" ==> IP / ICMP 89.221.35.21 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0005 IP / UDP / DNS Qry "twitter.com" ==> IP / ICMP 219.158.33.189 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0006 IP / UDP / DNS Qry "twitter.com" ==> IP / ICMP 219.158.30.241 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0007 IP / UDP / DNS Qry "twitter.com" ==> IP / ICMP 219.158.4.225 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0008 IP / UDP / DNS Qry "twitter.com" ==> IP / UDP / DNS Ans "159.106.121.75"
0009 IP / UDP / DNS Qry "twitter.com" ==> IP / UDP / DNS Ans "37.61.54.158"
[...]

>>> conf.AS_resolver.resolve(ans[4][1][IP].src)
[('89.221.35.21', 6762, 'SEABONE-NET Telecom Italia Sparkle')]

>>> conf.AS_resolver.resolve(ans[5][1][IP].src)
[('219.158.33.189', 4837, 'CHINA169-BACKBONE CNCGROUP China169 Backbone')]

EDNS Flags

Some packet filters drop DNS packets with EDNS flags. Traceroute can help to locate these devices.

Scapy does not yet support EDNS, although I found an old Trac entry without a patch. I added a DNSOPTRR object and AD/CD flags, but parsing of large packets still produces warning messages, and received OPT records are stored in DNSRR objects, not DNSOPTRR.

Usage is simple. Just add DNSOPTRR records to the query’s additional section:

>>> ans=sr(IP(dst="nsig4.attraktor.org",ttl=(1,10))/UDP(sport=RandShort(),dport=53)/DNS(qd=DNSQR(qname="attraktor.org",qtype="ALL"),ar=DNSOPTRR(edns_flags="DO",edns_bufsize=65535)),multi=1,timeout=2)[0];ans.show()

0000 IP / UDP / DNS Qry "attraktor.org" ==> IP / ICMP / IPerror / UDPerror / DNS Qry "attraktor.org."
0001 IP / UDP / DNS Qry "attraktor.org" ==> IP / ICMP 213.191.84.236 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0002 IP / UDP / DNS Qry "attraktor.org" ==> IP / ICMP 62.109.116.125 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0003 IP / UDP / DNS Qry "attraktor.org" ==> IP / ICMP / IPerror / UDPerror / DNS Qry "attraktor.org." / Padding
0004 IP / UDP / DNS Qry "attraktor.org" ==> IP / ICMP 213.191.66.138 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0005 IP / UDP / DNS Qry "attraktor.org" ==> IP / ICMP 80.81.192.164 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0006 IP / UDP / DNS Qry "attraktor.org" ==> IP / ICMP 213.239.240.200 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0007 IP / UDP / DNS Qry "attraktor.org" ==> IP / ICMP 213.239.244.176 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0008 IP / UDP / DNS Qry "attraktor.org" ==> IP / ICMP / IPerror / UDPerror / DNS Qry "attraktor.org."
0009 IP / UDP / DNS Qry "attraktor.org" ==> IP / UDP 88.198.161.124:domain > 10.42.20.232:50427 / Raw
0010 IP / UDP / DNS Qry "attraktor.org" ==> 88.198.161.124 > 10.42.20.232 udp frag:93 / Raw
0011 IP / UDP / DNS Qry "attraktor.org" ==> 88.198.161.124 > 10.42.20.232 udp frag:185 / Raw
0012 IP / UDP / DNS Qry "attraktor.org" ==> 88.198.161.124 > 10.42.20.232 udp frag:278 / Raw
0013 IP / UDP / DNS Qry "attraktor.org" ==> 88.198.161.124 > 10.42.20.232 udp frag:370 / Raw
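For reference, this is what an OPT pseudo-record looks like on the wire according to RFC 6891. The sketch below is independent of my DNSOPTRR patch and only shows the byte layout; the function name and its parameters are illustrative.

```python
import struct


def build_opt_record(bufsize=4096, do=True, version=0, ext_rcode=0):
    """Encode an EDNS0 OPT pseudo-record without options.

    Layout: NAME (root, one zero byte), TYPE 41, CLASS = UDP payload size,
    TTL = extended RCODE (8 bit) | version (8 bit) | flags (16 bit), RDLENGTH 0.
    The DO bit is the most significant bit of the flags field.
    """
    flags = 0x8000 if do else 0
    return b"\x00" + struct.pack("!HHBBHH", 41, bufsize, ext_rcode, version, flags, 0)
```

A packet filter that only knows pre-EDNS DNS may stumble over the record type 41 or over the class field being used as a buffer size.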

Todo: Analyse and display a path with a restrictive packet filter. EDNS blockage is rare but it does exist.

The AD flag is accessible just like any other DNS header flag:

>>> ans=sr(IP(dst="149.20.64.20")/UDP(sport=RandShort(),dport=53)/DNS(rd=1,qd=DNSQR(qname="isc.org"),ar=DNSOPTRR(edns_flags="DO",edns_bufsize=4096)))[0]

>>> ans[0][1][DNS].ad
1L
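On the wire, AD is a single bit in the second 16-bit word of the DNS header (mask 0x0020, next to CD at 0x0010). A small sketch that extracts the flags from a raw DNS message without Scapy; the function name is my own:

```python
import struct


def header_flags(message):
    """Decode selected flag bits from the 12-byte DNS header of a raw message."""
    (flags,) = struct.unpack("!H", message[2:4])
    return {
        "qr": (flags >> 15) & 1,
        "rd": (flags >> 8) & 1,
        "ra": (flags >> 7) & 1,
        "ad": (flags >> 5) & 1,  # authenticated data
        "cd": (flags >> 4) & 1,  # checking disabled
        "rcode": flags & 0x000F,
    }
```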

Large DNS packets

Note: This part is likely to change over the next days. I’m still exploring the vast possibilities of Scapy.

Answers larger than 512 bytes can pose a problem with packet filters that rely on a rather old definition of DNS. Scapy can help locate these devices on the network.

We can generate large queries by adding long records to the query’s answer section. BIND and PowerDNS answer such queries; Unbound and dnscache ignore them:

>>> ans=sr(IP(dst="85.10.240.248",ttl=(1,10))/UDP(dport=53,sport=RandShort())/DNS(rd=1,id=RandShort(),qd=DNSQR(qname="localhost"),an=DNSRR(rrname="example.com",type="TXT",rdata=("\xff"+"A"*255)*5),ar=DNSOPTRR()))[0];ans.show()
[...]
0000 IP / UDP / DNS Qry "localhost" ==> IP / ICMP 10.42.21.1 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror / Raw
0001 IP / UDP / DNS Qry "localhost" ==> IP / ICMP 213.191.84.236 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0002 IP / UDP / DNS Qry "localhost" ==> IP / ICMP 62.109.116.125 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0003 IP / UDP / DNS Qry "localhost" ==> IP / ICMP 213.191.66.73 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror / Raw
0004 IP / UDP / DNS Qry "localhost" ==> IP / ICMP 213.191.66.138 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0005 IP / UDP / DNS Qry "localhost" ==> IP / ICMP 80.81.192.164 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0006 IP / UDP / DNS Qry "localhost" ==> IP / ICMP 213.239.240.234 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0007 IP / UDP / DNS Qry "localhost" ==> IP / ICMP 213.239.244.68 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror
0008 IP / UDP / DNS Qry "localhost" ==> IP / ICMP 85.10.240.248 > 10.42.20.232 time-exceeded 0 / IPerror / UDPerror / Raw
0009 IP / UDP / DNS Qry "localhost" ==> IP / UDP / DNS Ans "127.0.0.1"

To debug large answer sizes, Scapy offers an extensible “AnsweringMachine” (see code below). I’ll expand it further, so that answer size and traceroute results can be controlled by querying different names, building a remote-controlled diagnostics loop.

>>> am=DNS_debug_am(verbose=1,promisc=0)
>>> am()

$ dig rlen=254.example.com txt @10.0.8.222
[...]
rlen=254.example.com. 0 IN TXT "AAAA[...]"
;; MSG SIZE rcvd: 336

$ dig rlen=65000.example.com txt @10.0.8.222
[...]
;; MSG SIZE rcvd: 65360
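Two details of the debugging responder are worth spelling out: parameters travel as key=value labels in front of the domain, and TXT data is a sequence of strings of at most 255 bytes, each with a one-byte length prefix. The following Python 3 sketch shows both steps in isolation. It is not the AnsweringMachine itself, and unlike the listing below it emits no empty trailing string when the length is a multiple of 255.

```python
def parse_debug_qname(qname, domain="example.com."):
    """Extract key=value labels in front of the domain, e.g. rlen=254.example.com."""
    suffix = "." + domain
    if not qname.endswith(suffix):
        return {}
    args = {}
    for label in qname[: -len(suffix)].split("."):
        key, sep, value = label.partition("=")
        if sep:
            args[key] = value
    return args


def txt_rdata(length, fill=b"A"):
    """Build TXT RDATA carrying `length` payload bytes in chunks of up to 255."""
    chunks = []
    remaining = length
    while remaining > 0:
        size = min(remaining, 255)
        # Each character string starts with its own length byte
        chunks.append(bytes([size]) + fill * size)
        remaining -= size
    return b"".join(chunks)
```

For example, txt_rdata(254) is 255 bytes long: one length byte followed by 254 fill bytes.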

High-level functions

Often-used functions can be combined into higher-level wrappers that provide reasonable defaults for most parameters. I’ll use this section to document the functions and objects I wrote for easier routine tasks.


#! /usr/bin/env python

# Set log level to benefit from Scapy warnings
import logging
logging.getLogger("scapy").setLevel(1)

from scapy.all import *

class DNS_debug_am(AnsweringMachine):
	function_name="dns_debug_responder"
	filter="udp dst port 53"

	def parse_options(self, domain="example.com.", rlen=255, mult=1, bsize=65535):
		self.domain=domain
		self.rlen=rlen
		self.mult=mult
		self.bsize=bsize

	def is_request(self, req):
		return req.haslayer(DNS) and \
				req.getlayer(DNS).qr == 0 and \
				req.getlayer(DNS).qd.qname.endswith(self.domain)

	def make_reply(self, req):
		ip = req.getlayer(IP)
		dns = req.getlayer(DNS)
		resp = IP(dst=ip.src, src=ip.dst)/UDP(dport=ip.sport,sport=ip.dport)

		if dns.qd.qtype != 16:
			resp /= DNS(id=dns.id, qr=1, rcode="refused")
			return resp

		args = {"rlen": self.rlen, "mult": 1, "bsize": self.bsize}

		for e in [ s.split("=") for s in dns.qd.qname[:dns.qd.qname.rfind("."+self.domain)].split(".") ]:
			try:
				args.update(dict([e]))
			except ValueError:
				pass

		print args

		try:
			c = int(args["rlen"]) / 255
			r = int(args["rlen"]) % 255

			if dns.qd.qname == "stats." + self.domain:
				dnsresp = DNS(id=dns.id, qr=1, qd=dns.qd,
						an=DNSRR(rrname=dns.qd.qname, type="TXT", rdata="\x05Stats"))
			else:
				dnsresp = DNS(id=dns.id, qr=1, qd=dns.qd,
						an=DNSRR(rrname=dns.qd.qname, type="TXT", rdata=("\xff" + "A"*255)*c + (chr(r) + "A"*r)),
						ar=DNSOPTRR(edns_bufsize=int(args["bsize"])))
		except ValueError:
			return resp/DNS(id=dns.id, qr=1, rcode="server-failure")
		else:
			return resp/dnsresp

def show_dns_traceroute(trace):
	print "%3s  %15s  %9s   %s" % ("TTL", "src IP", "RTT (ms)", "pkt summary")
	for s,r in trace:
		# r.time and s.time are in seconds; scale the difference to milliseconds
		print "%3d  %15s  %9.3f   %s" % (s.ttl, r.src, (r.time-s.time)*1000, r.summary())

def _dns_traceroute(target,minttl,maxttl,dport,sport,timeout,dnspkt):
	if minttl < 1:
		minttl = 1
	if maxttl < minttl:
		maxttl = minttl
	if not dport:
		dport=53

#	conf.checkIPsrc=0
	trace=SndRcvList([])

	for i in range(minttl-1, maxttl):
		qpkt=IP(dst=target,ttl=i+1,id=RandShort())/\
			UDP(sport=RandShort(),dport=dport)/\
			dnspkt
		if sport:
			qpkt[UDP].sport=sport

		ans,unans=sr(qpkt, multi=1, timeout=timeout)
		trace.extend(ans)

	# flatten trace tuples into serialized packet list
	pkts=PacketList()
	lasttl=0
	for s,r in trace:
		if s.ttl != lasttl:
			pkts.append(s)
			lasttl = s.ttl
		pkts.append(r)

	return trace, pkts

def dns_traceroute(target,minttl,maxttl,qname,qtype="A",qclass="IN",rd=0,dport=53,sport=0,timeout=2):
	trace,pkts = _dns_traceroute(target,minttl,maxttl,dport,sport,timeout,\
		DNS(rd=rd,qd=DNSQR(qname=qname,qtype=qtype,qclass=qclass),id=RandShort()))

	show_dns_traceroute(trace)
	return trace,pkts

#	trace.make_table( lambda(s,r): (s.dst, s.ttl, (r.src, r.summary()))

#trace,pkts=dns_traceroute("123.123.123.123", 1, 15, "twitter.com")
#trace,pkts=dns_traceroute("123.123.123.123", 1, 15, "twitter.com", "A", "IN", 0, 0, 0, 2)
#trace,pkts=dns_traceroute("85.10.240.250", 7, 11, "twitter.com", "A", "IN", 1, 0, 0)
#trace,pkts=dns_traceroute("85.10.240.250", 7, 8, "twitter.com", "A", "IN", 1, 0, 0)
#trace,pkts=dns_traceroute("85.10.240.250", 1, 3, "twitter.com", "A", "IN", 1, 0, 0, 1)
#wrpcap("/tmp/foo.pcap",pkts)

if __name__ == "__main__":
    interact(mydict=globals())


How to install DNSCurve on your authoritative name server

February 26th, 2010

Update:


I can haz running code?

DNSCurve has been vaporware, hyped and promoted for nearly two years, while DNSSEC finally enjoys steady deployment and interoperable implementations. Now that OpenDNS has announced its adoption of DNSCurve and many twits were tweeted about how OpenDNS uses “DNSCurve today, not DNSSEC tomorrow”, I wanted to find out how ready and usable DNSCurve really is today.

I won’t go into any of the aesthetic and operational problems of the DNSCurve design. I just wanted to make it work. Now.

To install the (so far) only available DNSCurve prototype by Matthew Dempsky, you need the DNSCurve forwarding proxy for authoritative servers and a DNSCurve patch for dnscache from the djbdns package. Both require the NaCl library to compile.

For easier installation, I built my own Ubuntu DNSCurve repository:

apt-get install daemontools djbdns dnscurve-forward


The UDP DNSCurve proxy

First, install the UDP part of the DNSCurve proxy server.

Create a user account for dnscurve:

useradd -r -d /var/lib/dnscurve-forward dnscurve

Now initialize a standard daemontools environment. Because dnscurve has no configure script yet, I used dnscache-conf:

dnscache-conf dnscurve dnscurve /var/lib/dnscurve-forward 85.10.240.252

Remove unnecessary files:

rm -r seed env/CACHESIZE env/DATALIMIT root/ip root/servers

The current implementation doesn’t seem to support IP address selection for outgoing queries (towards the authoritative server) using $IPSEND. Use route and/or iptables to force a specific source address if necessary.

Put the IP address of the authoritative server into env/FORWARD:

echo 85.10.240.254 >env/FORWARD

Modify the run script to start dnscurve-forward via udpserver:

#!/bin/sh
exec 2>&1
exec envdir ./env sh -c '
exec envuidgid dnscurve softlimit -d300000 udpserver "$IP" 53 /usr/bin/dnscurve-forward "$FORWARD"
'

Create a key pair for the server:

$ dnscurve-keygen
DNS public key: uz5mjzrmru60lc6kdsszqhlsw0gvjdj6j9cknmr22qkjwsl7mrtdyz
Hex public key: 13fe3baf36402e1319c6df3e8939701b3268605a91ce2b848d906379e6cdcc7f
Hex secret key: 7c4db642ea6ee136c9ce05588d695326f90a95fa5959245a60a23e4c1dd0df83

Put the private key into env/DNSCURVE_PRIVATE_KEY:

echo "7c4db642ea6ee136c9ce05588d695326f90a95fa5959245a60a23e4c1dd0df83" >env/DNSCURVE_PRIVATE_KEY

Enable the service by linking it into the daemontools service directory:

ln -s /var/lib/dnscurve-forward /etc/service/

The service should start immediately:

svstat /etc/service/dnscurve-forward
/etc/service/dnscurve-forward: up (pid 18158) 1 seconds

Now place the “DNS public key” into the zone file and point the A record towards your DNSCurve proxy server:

example.com. NS uz5mjzrmru60lc6kdsszqhlsw0gvjdj6j9cknmr22qkjwsl7mrtdyz.example.com.
uz5mjzrmru60lc6kdsszqhlsw0gvjdj6j9cknmr22qkjwsl7mrtdyz.example.com. A 85.10.240.252
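The “DNS public key” is the same key as the “Hex public key”, written as the magic prefix uz5 followed by 51 characters of DNSCurve’s base32 encoding (5 bits per character, least significant bits first). To double-check that the label in the zone file matches the key pair, it can be decoded like this. This is a sketch; the function name is hypothetical and not part of the DNSCurve tools.

```python
# Base32 alphabet used by DNSCurve: digits, then letters without a, e, i, o
ALPHABET = "0123456789bcdfghjklmnpqrstuvwxyz"


def decode_dnscurve_label(label):
    """Decode a 54-character 'uz5...' name server label into the 32-byte public key."""
    if len(label) != 54 or not label.startswith("uz5"):
        raise ValueError("not a DNSCurve public key label")
    value = 0
    for position, char in enumerate(label[3:]):
        # Each character contributes 5 bits, starting at the low end of the number
        value |= ALPHABET.index(char) << (5 * position)
    return value.to_bytes(32, "little")
```

Calling decode_dnscurve_label("uz5mjzrmru60lc6kdsszqhlsw0gvjdj6j9cknmr22qkjwsl7mrtdyz").hex() gives back the hex public key printed by the key generator.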

Your DNSCurve proxy should now forward requests to the authoritative server. Watch log/main/current for error messages if it doesn’t work.

$ dig example.com SOA @85.10.240.252

Large UDP packets

The proxy passes on EDNS flags and returns large UDP responses unharmed:

$ dig +dnssec kein.sicherheitsproblem.de DNSKEY @85.10.240.252
[...]
;; MSG SIZE rcvd: 794

Fragmented UDP responses also pass through the proxy:

$ dig +bufsize=4096 +ignore kein.sicherheitsproblem.de ANY @85.10.240.252
[...]
;; MSG SIZE rcvd: 1949

DNSSEC zones

DNSSEC records passed through the DNSCurve proxy (in clear, without encryption) still validate. So, at least in theory, DNSSEC and DNSCurve could be used together, with DNSCurve encrypting the DNSSEC-signed records on the transport link between the last-hop resolver and the DNSCurve content server.


The TCP DNSCurve proxy

Update: The DNSCurve proxy does not support TCP yet. Because the client side implementation in dnscache does not use EDNS0, DNSCurve response size is currently limited to 512 bytes or less.

Installing the TCP part of the DNSCurve forwarder should be simple.
In practice, it doesn’t work for me. Every TCP query results in an error message.

I used axfrdns-conf to create the environment:

axfrdns-conf dnscurve dnscurve /var/lib/dnscurve-forward/tcp /var/lib/dnscurve-forward 85.10.240.252

The TCP proxy shares the environment variables with the UDP proxy, so just remove the env directory. The server should also accept queries from everyone, so it won’t need the acl database and Makefile:

rm -r env tcp Makefile

Modify the run script to read the environment variables from ../env instead of ./env and start dnscurve-forward via tcpserver:


#!/bin/sh
exec 2>&1
exec envdir ../env sh -c '
exec envuidgid dnscache softlimit -d300000 tcpserver -vDRHl0 -X -- "$IP" 53 /usr/bin/dnscurve-forward "$FORWARD"
'

Note the “-X” instead of “-x tcp.cdb”. This allows everyone to query your nameserver.

Enable the service by linking it into the daemontools service directory:

ln -s /var/lib/dnscurve-forward/tcp /etc/service/dnscurve-forward-tcp

The service should start immediately:

svstat /etc/service/dnscurve-forward-tcp
/etc/service/dnscurve-forward-tcp: up (pid 20910) 3 seconds

However, queries to the TCP proxy always result in an error:

==> /var/lib/dnscurve-forward/tcp/log/main/current <==
@400000004b870bf6036de43c tcpserver: status: 1/40
@400000004b870bf6036dec0c tcpserver: pid 20929 from 85.177.243.207
@400000004b870bf6036df3dc tcpserver: ok 20929 0:85.10.240.252:53 :85.177.243.207::39986
@400000004b870bf6036df7c4 epoll_ctl: Operation not permitted
@400000004b870bf6036ff394 tcpserver: end 20929 status 256
@400000004b870bf6036ff77c tcpserver: status: 0/40
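The @4000… prefixes in these log lines are TAI64N timestamps written by multilog. daemontools ships tai64nlocal to convert them, but they are easy to decode by hand when correlating log lines with packet captures. This is a minimal sketch with a function name of my own. The seconds are on a TAI-based scale, so they differ from Unix time by a small offset that depends on how the logging system derives TAI (commonly a fixed 10 seconds).

```python
def parse_tai64n(label):
    """Split a TAI64N label into (seconds, nanoseconds).

    Format: '@' + 16 hex digits (seconds, biased by 2**62) + 8 hex digits (nanoseconds).
    """
    if len(label) != 25 or not label.startswith("@"):
        raise ValueError("not a TAI64N label")
    seconds = int(label[1:17], 16) - 2**62
    nanoseconds = int(label[17:25], 16)
    return seconds, nanoseconds
```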

I know, the djbdns world doesn’t value DNS over TCP much, but I hope to see a solution to this problem soon. I wouldn’t run a name server for my zones without TCP support.


The DNSCurve resolver


The patched dnscache resolver doesn’t need any special configuration:

dnscache-conf dnscache dnscache /var/lib/dnscache 85.10.240.251

(don’t forget to allow clients to query your server by creating files in $ROOT/ip/)

Enable the service by linking it into the daemontools service directory:

ln -s /var/lib/dnscache /etc/service/dnscache

Now query your dnscache for a domain with DNSCurve public keys:

$ dig kein.sicherheitsproblem.de SOA @85.10.240.251

The dnscache logfile signals DNSCurve queries with a “+” in the “tx” log lines:

==> /var/lib/dnscache/log/main/current <==
@400000004b8717b12538965c tx 0 6 kein.sicherheitsproblem.de. kein.sicherheitsproblem.de. + 550af0fc

Hooray, it works!


IPv6 and all the other stuff


Unfortunately, the DNSCurve patch conflicts with the widely used IPv6 patches which in turn also conflict with the useful CNAME correcting patches. Oh what a joy is Open Source. Why does djb software always need so many patches to work right?


Drop ICMP rejections of slow DNS responses

January 17th, 2010

If you run a busy DNS resolver, you probably see a lot of ICMP port-unreachable packets from your resolver to authoritative name servers. This happens when slow DNS responses arrive after the resolver already closed the corresponding socket, e.g. because it received an answer from another name server.

The result can be large streams of unnecessary ICMP traffic to authoritative name servers.

To save resources, I drop these packets at the gateway:

## drop ICMP port-unreachable responses to UDP packets from source port 53
iptables -A FORWARD ! -f -p icmp --icmp-type port-unreachable \
-m u32 --u32 "0>>22&0x3C@ 14&0xFF=17 && "\
"0>>22&0x3C@ 12&0x1FFF=0 && "\
"0>>22&0x3C@ 8>>22&0x3C@ 8>>16&0xFFFF=53" \
-j DROP

The u32 match consists of three patterns:

  1. Skip the IP header, test for protocol 17 (UDP) in the IP header embedded after the 8 byte ICMP header
  2. Skip the IP header, test if the embedded IP packet is the first fragment, if any
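  3. Skip both IP headers, test for source port number 53 in the UDP header. The UDP header begins at offset 8, not at 0 as I expected. I haven’t investigated this further yet; a likely explanation is that each @ advances the start position only by the header length it computed, so the 8-byte ICMP header between the two IP headers still has to be skipped explicitly.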
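The same checks can be mirrored in plain Python to try them against captured packets before touching the firewall. This is a sketch that assumes a raw IPv4 packet as bytes; the function and the sample builder are illustrative names and not part of iptables.

```python
import struct


def is_late_dns_rejection(packet):
    """True for an ICMP port-unreachable that quotes a UDP datagram sent from port 53."""
    if len(packet) < 20 or packet[9] != 1:  # outer protocol must be ICMP
        return False
    if struct.unpack("!H", packet[6:8])[0] & 0x1FFF:  # "! -f": skip later fragments
        return False
    icmp = packet[(packet[0] & 0x0F) * 4:]  # skip the outer IP header
    if len(icmp) < 8 + 20 or icmp[0] != 3 or icmp[1] != 3:  # type 3, code 3
        return False
    inner = icmp[8:]  # embedded IP header follows the 8-byte ICMP header
    if inner[9] != 17:  # pattern 1: embedded protocol is UDP
        return False
    if struct.unpack("!H", inner[6:8])[0] & 0x1FFF:  # pattern 2: first fragment
        return False
    udp = inner[(inner[0] & 0x0F) * 4:]  # skip the embedded IP header
    if len(udp) < 2:
        return False
    return struct.unpack("!H", udp[0:2])[0] == 53  # pattern 3: source port 53


def sample_rejection(source_port):
    """Build a minimal test packet (checksums left at zero, documentation addresses)."""
    resolver, server = bytes([192, 0, 2, 1]), bytes([192, 0, 2, 53])
    udp = struct.pack("!HHHH", source_port, 40000, 8, 0)
    inner = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28, 0, 0, 64, 17, 0,
                        server, resolver) + udp
    icmp = struct.pack("!BBHI", 3, 3, 0, 0) + inner
    outer = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(icmp), 0, 0, 64, 1, 0,
                        resolver, server)
    return outer + icmp
```

is_late_dns_rejection(sample_rejection(53)) is true, while the same packet quoting any other source port is left alone.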

If you have suggestions on how to achieve the same for IPv6, please let me know.

Bookmarks:

I learned how to skip headers with variable length from the IPTables U32 Match Tutorial.
The RFC Sourcebook has lots of protocol reference tables.


How to boot a remote cryptroot with Debian Lenny

November 6th, 2009

Update:
Since Debian Squeeze, the default initrd does not contain a busybox shell anymore and therefore no ifconfig, route and other tools. You must either add “copy_exec” calls for the tools required by your keyscript or configure the device another way.


[No introduction yet, will contain these thoughts:]

  • supply the passphrase over the internet
  • transmit the passphrase over a secure connection
  • use a passphrase that can be written down, stored in a safe and read out over the phone if necessary
  • not require a special application, not even SSH. Anything that opens an SSL connection should do

To do:

  • look again at configure_network() and how to use it.
  • If an attacker gains access to the server and can modify initrd, kernel or boot loader, the passphrase can be compromised. Unless the server’s BIOS can boot from a remote network without local DHCP support, I have to accept that risk, though. My laptop computer boots grub, kernel and initrd from a USB drive I keep on my keychain.

Server side – prepare the initrd image

SSL certificate and key

Create an SSL certificate and key. This certificate identifies your server to the client providing the passphrase. Proper validation avoids leaking the transmitted secret to a man-in-the-middle.

Put both certificate and key into a single file /root/cryptroot/cryptroot_remotepass_keyscript.pem and save a copy of the certificate for later.

Key script

Create a shell script in /root/cryptroot/cryptroot_remotepass_keyscript.sh that

  • brings up the network interface
  • listens on a socket for SSL connections
  • echoes the received passphrase to stdout
  • takes down the network interface afterwards

Note: The file must be executable (chmod u+x), otherwise update-initramfs won’t copy it into the initrd.

Keep in mind that the script runs from initrd’s busybox shell, although you can add additional binaries with an initramfs-hook, as we’ll see later.

I use static network configuration and socat:

#!/bin/sh

## configure your network parameters here
IP=213.133.110.42
NETMASK=255.255.255.224
BROADCAST=213.133.110.63
GW=213.133.110.33
PORT=3333
## use static device name or first available eth* device
DEVICE=eth0
#DEVICE=`ifconfig -a|grep ^eth|awk '{ print $1 }'|head -n1`

[ -z "$DEVICE" ] && {
        echo "No ethernet device found. Aborting." >&2
        exit 1
}

echo "Configuring boot network device ${DEVICE}..." >&2
ifconfig $DEVICE down
ifconfig $DEVICE $IP netmask $NETMASK broadcast $BROADCAST up
route add default gw $GW

echo "Waiting for remote key..." >&2
socat OPENSSL-LISTEN:${PORT},verify=0,reuseaddr,cert=/etc/cryptroot_remotepass_keyscript.pem STDOUT |tr -d "\r\n"

echo "De-Configuring boot network..." >&2
route del default gw $GW
ifconfig $DEVICE down

Additional binaries

Create a shell script in /etc/initramfs-tools/hooks/cryptroot_remotepass that

  • Copies socat into the initrd’s bin/ directory
  • Copies all necessary libraries. copy_exec() will do that for you
  • Copies cryptroot_remotepass_keyscript.pem to etc/

Note: The file must be executable (chmod u+x), otherwise update-initramfs ignores it.

#!/bin/sh -e

PREREQS=""

prereqs() { echo "$PREREQS"; }

case "$1" in
    prereqs)
    prereqs
    exit 0
    ;;
esac

## hook-functions provides copy_exec()
. /usr/share/initramfs-tools/hook-functions

#set -x
copy_exec /usr/bin/socat /bin/
# Enable the following lines for Debian Squeeze:
#copy_exec /usr/bin/tr /bin/
#copy_exec /sbin/ifconfig /sbin/
#copy_exec /sbin/route /sbin/

cp /root/cryptroot/cryptroot_remotepass_keyscript.pem "${DESTDIR}/etc/"

Back up your working initrd

Now would be a good time to save a copy of your working initrd and put it where your boot loader can find it.
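
A minimal sketch of such a backup. The ".pre-keyscript" suffix and the function name are hypothetical, and a boot loader entry that points at the copy still has to be added by hand.

```sh
#!/bin/sh
# Sketch: keep a copy of the known-good initrd next to the original.
# The ".pre-keyscript" suffix and the function name are hypothetical.
backup_initrd() {
    src=$1
    [ -r "$src" ] || { echo "Cannot read $src" >&2; return 1; }
    # -p preserves mode and timestamps of the original image
    cp -p "$src" "${src}.pre-keyscript"
}

# Usage (as root):
#   backup_initrd "/boot/initrd.img-$(uname -r)"
```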

Modify crypttab

Add a “keyscript” option to your root device’s crypttab entry. update-initramfs later copies it to keyscripts/ in the initrd.

md2_crypt  /dev/md2  none  luks,keyscript=/root/cryptroot/cryptroot_remotepass_keyscript.sh

Note: With a key script, cryptsetup will not ask for a passphrase on the console. Save a backup copy of an initrd without the key script if you need to keep the ability to unlock the encrypted root device from the console.
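
Since update-initramfs only copies the key script if it is executable, it can help to read the path back from crypttab and test it before building the initrd. A small sketch, assuming the usual four-column crypttab layout (target, source, key file, options); the function name is hypothetical.

```sh
#!/bin/sh
# Sketch: print the keyscript path configured for a crypttab target.
# Function name is hypothetical; crypttab columns are assumed to be
# target, source device, key file, options.
crypttab_keyscript() {
    target=$1
    file=${2:-/etc/crypttab}
    awk -v t="$target" '
        $1 == t {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] ~ /^keyscript=/) {
                    sub(/^keyscript=/, "", opts[i])
                    print opts[i]
                }
        }' "$file"
}

# Usage: complain if the script is missing or not executable
#   ks=$(crypttab_keyscript md2_crypt)
#   [ -x "$ks" ] || echo "keyscript missing or not executable: $ks" >&2
```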

Generate a new initrd

Run update-initramfs to generate a new initrd containing your key script.

update-initramfs -u -k all

Look for warnings (“skipped”, “ignored”) that would indicate problems with the cryptroot setup.
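
One way to avoid overlooking such a message is to filter the output. This is only a sketch: the function name and the log file path are hypothetical, and the two keywords are simply the ones mentioned above.

```sh
#!/bin/sh
# Sketch: print lines of update-initramfs output that mention skipped or
# ignored items. Reads stdin; returns 0 if such a line was found, 1 otherwise.
# Function name is hypothetical.
initramfs_warnings() {
    grep -Ei 'skipped|ignored'
}

# Usage:
#   update-initramfs -u -k all 2>&1 | tee /tmp/update-initramfs.log | initramfs_warnings \
#       && echo "Check the cryptroot setup before rebooting" >&2
```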

Check the initrd contents

You should now be able to reboot and wait for your server to listen for a passphrase. If it doesn’t work, or if you want to examine the initrd contents first, decompress the file with gzip and pipe it through cpio:

mkdir /tmp/initrd
cd /tmp/initrd
zcat /boot/initrd.img-2.6.26-2-xen-amd64 | cpio -iv

Check that

  • conf/conf.d/cryptroot contains an entry for the root device with the “keyscript” option
    target=md2_crypt,source=/dev/md2,key=none,lvm=hendrek-dom0_root,keyscript=/keyscripts/cryptroot_remotepass_keyscript.sh
  • the key script lies in keyscripts/
  • bin/socat exists
  • usr/lib/ contains the necessary libraries (libcrypto, libssl, libz)
  • cryptroot_remotepass_keyscript.pem exists in etc/
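
The checklist above can be sketched as a small script that runs against the unpacked tree. It assumes the directory layout described in this list; the function name is hypothetical.

```sh
#!/bin/sh
# Sketch: verify an unpacked initrd against the checklist above.
# Function name is hypothetical; paths follow the layout listed in the text.
check_initrd_tree() {
    root=$1
    rc=0
    for f in conf/conf.d/cryptroot \
             keyscripts/cryptroot_remotepass_keyscript.sh \
             bin/socat \
             etc/cryptroot_remotepass_keyscript.pem; do
        [ -e "$root/$f" ] || { echo "missing: $f" >&2; rc=1; }
    done
    # the cryptroot entry must carry the keyscript option
    grep -q 'keyscript=' "$root/conf/conf.d/cryptroot" 2>/dev/null ||
        { echo "no keyscript option in conf/conf.d/cryptroot" >&2; rc=1; }
    # socat's libraries must have been copied as well
    for lib in libcrypto libssl libz; do
        ls "$root"/usr/lib/${lib}* >/dev/null 2>&1 ||
            { echo "missing library: $lib" >&2; rc=1; }
    done
    return $rc
}

# Usage:
#   check_initrd_tree /tmp/initrd && echo "initrd looks complete"
```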

For debugging, you can chroot into the busybox shell:

chroot /tmp/initrd /bin/sh
./keyscripts/cryptroot_remotepass_keyscript.sh

Note: You probably want to disable “ifconfig down” and “route del” in the key script before you test it on a remote computer.

Client side – transmit the passphrase

You can transmit the passphrase with any application that supports SSL connections. I use socat but openssl s_client and telnet-ssl probably work, too. Disable local terminal echo to keep your passphrase confidential.

{
        echo -n "Password: "
        socat STDIN,echo=0 OPENSSL:213.133.110.42:3333,verify=1,cafile=/home/hauke/cryptroot/hendrek.cert
}

sendpass.sh

Here’s a slightly more complex script that decrypts a passphrase file with gpg and sends its contents to the server:

#!/bin/sh

KEYDIR=/home/hauke/cryptroot

HOST=$1
[ -z "$1" ] && { echo "Usage: $0 <hostname> [<IP address>] [<Port>]"; exit 2; };

[ -r "$KEYDIR/${HOST}.conf" ] && . "$KEYDIR/${HOST}.conf"
[ -r "$KEYDIR/${HOST}.gpg" -a -r "$KEYDIR/${HOST}.cert" ] || { echo "Passfile or certificate not found for $HOST in $KEYDIR"; exit 1; }
[ -n "$2" ] && IP4="$2"
[ -z "$IP4" ] && { echo "No IP address"; exit 1; };
[ -n "$3" ] && PORT="$3"
[ -z "$PORT" ] && PORT=3333

gpg -o - "$KEYDIR/${HOST}.gpg" | socat STDIN "OPENSSL:${IP4}:${PORT},verify=1,cafile=$KEYDIR/${HOST}.cert"
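
The script sources $KEYDIR/<hostname>.conf if it is readable, so that file consists of plain shell variable assignments. A sketch of what it might contain for the server from the examples above; the file name hendrek.conf is inferred from the certificate name used earlier, and the values are the ones from the key script:

```sh
# /home/hauke/cryptroot/hendrek.conf (sourced by sendpass.sh)
# Values taken from the key script example; adjust to your server.
IP4=213.133.110.42
PORT=3333
```

With such a file in place, "sendpass.sh hendrek" needs no further arguments. An IP address or port given on the command line still overrides the configured values.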

DNSSEC-enabled Ubuntu packages

November 2nd, 2009

Ubuntu’s Personal Package Archives are a nice way to provide customized packages for Ubuntu without the need to rebuild packages for all architectures on your own.

Instead of waiting two hours or more for glibc to compile, you can now install DNSSEC-enabled packages with back-ported patches from my DNSSEC PPA for Ubuntu 9.04 (Jaunty) and 9.10 (Karmic).

Install updated packages from PPA

Put these lines into /etc/apt/sources.list or add them in Ubuntu’s software manager:

deb http://ppa.launchpad.net/hauke/dnssec-enabled/ubuntu jaunty main
deb-src http://ppa.launchpad.net/hauke/dnssec-enabled/ubuntu jaunty main

(Karmic users obviously should replace “jaunty” with “karmic” or use the new syntax “ppa:hauke/dnssec-enabled”)

and install the package archive key:

apt-key adv --keyserver pool.sks-keyservers.net --recv-keys 890387116FBBE07B

Sign your fingerprints

To use SSH’s DNSSEC features, you need to:

  • Install glibc and openssh packages from this repository
  • Put SSHFP records with your host key fingerprints into DNS. ssh-keygen generates those records for you:
      orbit:~# for keyfile in /etc/ssh/ssh_host_?sa_key.pub; do ssh-keygen -r `hostname -f` -f $keyfile; done
        orbit.attraktor.org IN SSHFP 2 1 d1ff1cbfa68dd167a13d342eb030cc5f640ced97
        orbit.attraktor.org IN SSHFP 1 1 3fe08662ca9d72032601e5a333d2588cc8569dff
  • Sign your DNS zone
  • Use a DNSSEC-verifying resolver with the necessary trust anchors to validate your zone (e.g. by Look-aside Validation)
  • Add “options edns0” to /etc/resolv.conf (or /etc/resolvconf/resolv.conf.d/head if you use resolvconf)
  • Enable “VerifyHostKeyDNS” to let OpenSSH trust fingerprints that match signed SSHFP records:
      hauke@snorri:~$ ssh -v -o VerifyHostKeyDNS=yes orbit.attraktor.org
        [...]
        debug1: found 1 secure fingerprints in DNS
        debug1: matching host key fingerprint found in DNS
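
Before involving OpenSSH, it can be useful to check that the resolver actually validates the SSHFP records. A minimal sketch that looks for the AD flag in the header line dig prints; the function name is hypothetical, and the lookup itself needs network access and a validating resolver.

```sh
#!/bin/sh
# Sketch: report whether a dig response carries the AD (Authenticated Data) flag.
# Reads dig output on stdin; returns 0 if "ad" is among the header flags.
# Function name is hypothetical.
has_ad_flag() {
    awk '/^;; flags:/ {
            sub(/^;; flags:/, "")    # drop the label
            sub(/;.*/, "")           # drop the counters after the flag list
            n = split($0, f, " ")
            for (i = 1; i <= n; i++) if (f[i] == "ad") found = 1
         }
         END { exit !found }'
}

# Usage:
#   dig +dnssec SSHFP orbit.attraktor.org | has_ad_flag && echo "validated by the resolver"
```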

Nagios plugin to check RRSIG expiration dates

November 1st, 2009

Expired record signatures are a common problem in the introduction of DNSSEC. Scripts break, automatic processes fail. If the problem isn’t fixed in time, whole domains can become unresolvable.

I wrote a bash script to check RRSIG expiration dates on selected names, suitable to run as a nagios plugin. It uses dig and parses its output with a convoluted sed expression. It also relies on “date -d” to support intervals like “+2days”. I really plan to rewrite it in Perl and use Net::DNS::SEC to validate the signatures. For now, the shell script works well enough, though.

Download

Download check_rrsig.

Usage

  check_rrsig -n <ns> -z <name> [ -w <refresh interval> ]
     -n <ns>                query nameserver <ns>
     -z <name>              check RRSIGs on <name>
     -w <refresh interval>  warn if signatures expire within <interval>. Passed to "date -d"

Examples

$ check_rrsig -n nsig2.hauke-lampe.de -z hauke-lampe.de -w +1week
RRSIG OK: Valid signature timestamps for hauke-lampe.de (min: 20091112102302 / max: 20091112102302)

$ check_rrsig -n parent.rfc5011.shinkuro.com -z roll_one-31.rfc5011.shinkuro.com -w +2days
RRSIG OK: Valid signature timestamps for roll_one-31.rfc5011.shinkuro.com (min: 20091104111136 / max: 20091106111650)

$ check_rrsig -n parent.rfc5011.shinkuro.com -z roll_one-31.rfc5011.shinkuro.com -w +7days
RRSIG WARNING: Signatures expire soon on roll_one-31.rfc5011.shinkuro.com: 20091104111136 < 20091108095820 (max: 20091106111650)

$ check_rrsig -n nsig2.hauke-lampe.de -z badsig.dnstest.hauke-lampe.de -w +1week
RRSIG CRITICAL: Expired signature for badsig.dnstest.hauke-lampe.de: 20080615225814 < 20091101095856 (max: 20080615225814)
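
The three states in these examples follow from comparing the earliest expiration timestamp with the current time and with the warning limit. The following is a sketch of that comparison, not the source of check_rrsig; the function names are hypothetical, and it assumes dig's presentation format, where the expiration is the ninth field of an RRSIG line.

```sh
#!/bin/sh
# Sketch: map RRSIG expiration timestamps to Nagios states.
# Not the check_rrsig source; function names are hypothetical.
# Timestamps use the YYYYMMDDHHMMSS form that dig prints in RRSIG records.

# Print the earliest expiration found in dig answer lines read from stdin.
min_rrsig_expiry() {
    awk '$4 == "RRSIG" { print $9 }' | sort -n | head -n 1
}

# rrsig_state <expiry> <now> <warn-limit>
# Prints the state and returns the matching Nagios exit code.
rrsig_state() {
    awk -v e="$1" -v now="$2" -v warn="$3" 'BEGIN {
        if (e + 0 < now + 0)  { print "CRITICAL"; exit 2 }
        if (e + 0 < warn + 0) { print "WARNING";  exit 1 }
        print "OK"; exit 0
    }'
}

# Usage (GNU date assumed for "-d"):
#   expiry=$(dig +dnssec @"$ns" "$name" SOA | min_rrsig_expiry)
#   rrsig_state "$expiry" "$(date -u +%Y%m%d%H%M%S)" "$(date -u -d +7days +%Y%m%d%H%M%S)"
```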

Use with nagios

Install check_rrsig on your nagios server (e.g. in /usr/local/bin).

Add a command definition to your nagios configuration:

# DNSSEC RRSIG
define command {
        command_name    check_dns_rrsig
        command_line    /usr/local/bin/check_rrsig -n '$HOSTADDRESS$' -z '$ARG1$' -w '$ARG2$'
}

Finally, define a new service:

define service {
    host_name               brax
    service_description     dns-rrsig
[...]
    check_command           check_dns_rrsig!hauke-lampe.de!+7days
[...]
}

Links

Another nagios plugin from The Measurement Factory with a different approach: It verifies all of a zone’s nameservers and also the delegation tree.


Update: RES_USE_DNSSEC backports for Ubuntu and Debian

November 1st, 2009

My older patches are obsolete, now that glibc 2.11 supports RES_USE_DNSSEC and fixed an EDNS-related bug.

I backported the changes to the glibc versions used in Ubuntu 9.04 (Jaunty Jackalope) and 9.10 (Karmic Koala) as well as Debian Lenny:

Binary packages for Ubuntu are available from my Personal Package Archive.


Adding RES_USE_DNSSEC to the glibc resolver

July 4th, 2009

Note: This article is obsolete. Follow this link for backported RES_USE_DNSSEC patches from glibc 2.11.


While my recent take on making glibc more DNSSEC-friendly is “good enough” to let OpenSSH verify SSHFP records, it’s a workaround and not a real solution. A better approach would be to give applications the choice to use the RES_USE_DNSSEC resolver option and let them test for the AD flag in lookup responses or even verify the signatures on their own.

Here’s a patch to enable RES_USE_DNSSEC without adding actual signature validation support to the glibc resolver.

EDNS0 in glibc

Stock glibc already comes with support for EDNS0. RES_USE_EDNS0 or “options edns0” in /etc/resolv.conf add an EDNS0 OPT record to queries but offer no method to set DO=1:

127.0.0.1.47056 > 127.0.0.1.53: [1au] A? fnord.gov. ar: . OPT UDPsize=1024 (38)
127.0.0.1.53 > 127.0.0.1.47056: NXDomain q: A? fnord.gov. 0/1/1 [...] ar: . OPT UDPsize=4096 (103)

Note: A problem can occur in the interaction of (portable) OpenSSH and glibc, resulting in queries with an invalid UDPsize=0. See this post on the openssh-unix-dev mailing list with a proposed workaround (committed). Also reported to Ubuntu and the glibc bug tracker (fixed).

Adding support for DO

This patch adds RES_USE_DNSSEC to resolv.h and a resolv.conf option “dnssec-ok”.

[download patch]

A resolver option for security-aware applications

An application may now set RES_USE_DNSSEC and test for the AD flag in DNS responses while resolver functions oblivious to DNSSEC are not irritated by large extraneous record sets.

You may have to recompile DNSSEC-aware applications (e.g. OpenSSH) to recognize and use RES_USE_DNSSEC.

hauke@pope:~$ ssh -v -o VerifyHostKeyDNS=yes orbit.attraktor.org
OpenSSH_5.1p1 Debian-5ubuntu1a, OpenSSL 0.9.8g 19 Oct 2007
[...]
debug1: found 1 secure fingerprints in DNS
debug1: matching host key fingerprint found in DNS
127.0.0.1.41424 > 127.0.0.1.53: [1au] A? orbit.attraktor.org. ar: . OPT UDPsize=1024 (48)
127.0.0.1.53 > 127.0.0.1.41424: q: A? orbit.attraktor.org. 1/0/1 [...] ar: . OPT UDPsize=4096 (64)
127.0.0.1.60880 > 127.0.0.1.53: [1au] SSHFP? orbit.attraktor.org. ar: . OPT UDPsize=65535 OK (48)
127.0.0.1.53 > 127.0.0.1.60880: q: SSHFP? orbit.attraktor.org. 3/0/1 [...] ar: . OPT UDPsize=4096 OK (428)

Global settings in /etc/resolv.conf

Additionally, DO can be enabled globally in /etc/resolv.conf through “options dnssec-ok”. Don’t do it unless you have a good reason, as it can lead to unwanted behaviour.

Under the global option, all queries set DO=1 and may return DNSSEC records even where the resolver functions don’t expect them. These functions usually return the correct answer, yet they log warnings:

firefox: gethostby*.getanswer: asked for "bd.hauke-lampe.de IN A", got type "46"

Also, the default buffer size is too small for many DNSSEC answers:

127.0.0.1.48227 > 127.0.0.1.53: 31891+ [1au] A? fnord.gov. ar: . OPT UDPsize=1024 OK (38)
127.0.0.1.53 > 127.0.0.1.48227: 31891 NXDomain|$ q: A? fnord.gov. 0/4/1 [...] ar: . OPT UDPsize=4096 OK (767)

An NXDOMAIN response in .gov with NSEC3 exceeds the 1024-byte buffer and needs to be retried with TCP:

127.0.0.1.56428 > 127.0.0.1.53: S, 1:1(0)
127.0.0.1.53 > 127.0.0.1.56428: S, 1:1(0) ack 1
127.0.0.1.56428 > 127.0.0.1.53: ., 1:1(0) ack 1
127.0.0.1.56428 > 127.0.0.1.53: P, 1:41(40) ack 1 31891+ [1au] A? fnord.gov. ar: . OPT UDPsize=1024 OK (38)
127.0.0.1.53 > 127.0.0.1.56428: ., 1:1(0) ack 41
127.0.0.1.53 > 127.0.0.1.56428: P, 1:1504(1503) ack 41 31891 NXDomain$ q: A? fnord.gov. 0/8/1 [...] ar: . OPT UDPsize=4096 OK (1501)
127.0.0.1.56428 > 127.0.0.1.53: ., 41:41(0) ack 1504
127.0.0.1.56428 > 127.0.0.1.53: F, 41:41(0) ack 1504
127.0.0.1.53 > 127.0.0.1.56428: F, 1504:1504(0) ack 42
127.0.0.1.56428 > 127.0.0.1.53: ., 42:42(0) ack 1505

Frequent retries over TCP increase the load on the name server and delay the answer to the client.

Larger default buffer size

The second patch changes the default query buffer size from 1024 to 4096 bytes. It is usually not needed unless you have to enable “options dnssec-ok” for some reason and want to reduce the number of TCP queries.

Beware: Default buffer sizes are statically defined in several files. I certainly have not checked all cases for side-effects and may have missed a definition or changed one where it was not necessary.

[download patch]

With a 4096-byte buffer, even larger responses return without truncation:

127.0.0.1.35864 > 127.0.0.1.53: 64866+ [1au] A? fnord.gov. ar: . OPT UDPsize=4096 OK (38)
127.0.0.1.53 > 127.0.0.1.35864: 64866 NXDomain$ q: A? fnord.gov. 0/8/1 [...] ar: . OPT UDPsize=4096 OK (1501)

127.0.0.1.38231 > 127.0.0.1.53: 40972+ [1au] A? bd.hauke-lampe.de. ar: . OPT UDPsize=4096 OK (46)
127.0.0.1.53 > 127.0.0.1.38231: 40972$ q: A? bd.hauke-lampe.de. 3/5/16 [...] ar: . OPT UDPsize=4096 OK (2661)
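
To estimate in advance whether a given answer fits into a buffer, the size that dig reports can be compared with the advertised UDP size. A small sketch; both function names are hypothetical, and it assumes the ";; MSG SIZE  rcvd:" line that dig prints at the end of its output.

```sh
#!/bin/sh
# Sketch: compare a response size reported by dig with an EDNS buffer size.
# Function names are hypothetical.

# Print the response size from dig output read on stdin.
dig_msg_size() {
    sed -n 's/^;; MSG SIZE *rcvd: *\([0-9][0-9]*\).*/\1/p'
}

# fits_udp_buffer <response size> <buffer size>; returns 0 if the answer fits
fits_udp_buffer() {
    [ "$1" -le "$2" ]
}

# Usage (query over TCP to see the full, untruncated size):
#   size=$(dig +dnssec +tcp fnord.gov | dig_msg_size)
#   fits_udp_buffer "$size" 1024 || echo "needs a TCP retry with a 1024 byte buffer"
```

With the sizes from the traces in this article, the 1501 byte NXDOMAIN answer does not fit into 1024 bytes, while both the 1501 and the 2661 byte answers fit into 4096.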

Fragmented responses “work for me”:

(flags [DF], proto UDP (17), length 74) 10.42.20.232.48882 > 10.42.23.3.53: [udp sum ok] 25583+ [1au] A? bd.hauke-lampe.de. ar: . OPT UDPsize=4096 OK (46)
(flags [+], proto UDP (17), length 1500) 10.42.23.3.53 > 10.42.20.232.48882: 25583$ q: A? bd.hauke-lampe.de. 3/5/16 [...] [|domain]
(flags [none], proto UDP (17), length 1209) 10.42.23.3 > 10.42.20.232: udp

I did not find the source of this 256 byte buffer size:

127.0.0.1.36225 > 127.0.0.1.53: 44875+ [1au] ANY ANY? 1.0.0.127.in-addr.arpa. ar: . OPT UDPsize=256 OK (51)
127.0.0.1.53 > 127.0.0.1.36225: 44875 Refused q: ANY ANY? 1.0.0.127.in-addr.arpa. 0/0/1 ar: . OPT UDPsize=4096 OK (51)

Where to go from here

[This section is still in draft]

  • Does this patch make the stub resolver “security-aware” in the terms of DNSSEC RFCs? I have not verified the MUSTs and SHOULDs against this code
  • The next step would probably be DNSSEC validation in the resolver. The glibc stub resolver uses old code from BIND8 (see resolv/README). Maybe it can be amended with features from libUnbound?
    • Must support NSEC3, otherwise rather pointless
    • Example: .se provides a DNSSEC patch for dkim-milter
    • Still, implementing DNSSEC in a stub-resolver is expensive and introduces bugs in critical libraries
  • Where expensive DNSSEC validation is undesirable, could the query be protected by TSIG between the resolver and the recursor, or even between the application and the resolver?
    • port TSIG from BIND8 libraries to glibc
    • ldns supports TSIG, too
  • Other applications could set RES_USE_DNSSEC and test the AD flag
    • gnupg: “--auto-key-locate”, RFC2538bis/RFC4398
      • secure fingerprint and URL / public key in CERT RR
      • gpg --auto-key-locate cert -e -r [email protected]

How to get OpenSSH to see DNSSEC AD flags on SSHFP lookups with glibc

June 29th, 2009

Update: If you just want to know how to use OpenSSH with SSHFP DNS records, read this.

1. Create the SSHFP record, repeat with DSA key if necessary:

ssh-keygen -r foohost.example.com -f /etc/ssh/ssh_host_rsa_key.pub

2. Put the record into the DNS zone:

foohost.example.com.    A     192.168.1.1
                        SSHFP 1 1 1cee8dde082a7d50b8f46440f4c12a7bcd7c2741

3. In /etc/ssh/ssh_config or ~/.ssh/config, set:

Host *
  VerifyHostKeyDNS yes

4. Test:

ssh -v foohost.example.com
[...]
debug1: found 1 secure fingerprints in DNS

Done.
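
For reference, the hex string in an “SSHFP 1 1” record from step 1 is the SHA-1 digest of the decoded public key blob (algorithm 1 is RSA, fingerprint type 1 is SHA-1). ssh-keygen -r computes it for you; the sketch below only shows how to reproduce the value by hand. The function name is hypothetical, and base64 and sha1sum from GNU coreutils are assumed.

```sh
#!/bin/sh
# Sketch: compute the SHA-1 fingerprint field of an SSHFP record by hand.
# Function name is hypothetical. Needs base64 and sha1sum (GNU coreutils).
sshfp_sha1() {
    # field 2 of an OpenSSH public key file is the base64-encoded key blob
    awk '{ print $2 }' "$1" | base64 -d | sha1sum | awk '{ print $1 }'
}

# Usage: the output should match the last field of the record from step 1
#   sshfp_sha1 /etc/ssh/ssh_host_rsa_key.pub
```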


Note: This article is obsolete. Follow this link for backported RES_USE_DNSSEC patches from glibc 2.11.


OpenSSH can match host keys against fingerprints stored in SSHFP records in DNS and even check for the “Authenticated Data” (AD) flag in DNSSEC-secured lookups. While users of OpenBSD and derivative run-time libraries already benefit from this, glibc doesn’t support it yet. With a single line of unofficial patchery, we can get this bit set, though.

This is a patch for the DNS security-aware user. It does not make your application any more secure than it is now.

The compatible way

The most compatible method is to set the AD flag in the query header. That way, the response won’t contain all the additional DNSSEC-related records that could otherwise confuse an application which does not expect them. Neither the stub resolver nor OpenSSH validates RRSIGs on its own yet, anyway. Also, this method doesn’t need EDNS0 support in the resolver library. The validating recursor simply sets AD in its response header if the answer passed signature validation. OpenSSH tests for AD and can optionally use the result to trust a fingerprint without user interaction.

The RFCs define AD only for DNS responses at this time but IETF work is underway to standardize its use in queries, too. Both BIND and Unbound already implement it.

The downside is that incompatible DNS proxies and packet filters could strip AD flags or drop the packets altogether. If you’re stuck behind such an appliance and cannot run your own recursor, you’re probably out of luck. For further information on DNSSEC compatibility in popular broadband routers and SOHO firewalls, read the results of a study conducted by Nominet and Core Competence: DNSSEC-CPE-Report.pdf (pg. 12, Request Flag Compatibility)

So, my current quick and ugly hack is to unconditionally set AD in all queries sent by the stub resolver by hard-wiring “hp->ad = 1” in res_nmkquery(). Ideally, the library would allow AD to be set through a resolver option.

This patch “works for me” with Ubuntu Jaunty glibc-2.9-4ubuntu6 and also applies against stock glibc 2.10.1:

[download patch]

--- glibc-2.9/resolv/res_mkquery.c
+++ glibc-2.9-ad/resolv/res_mkquery.c
@@ -141,6 +141,7 @@
    hp->opcode = op;
    hp->rd = (statp->options & RES_RECURSE) != 0;
    hp->rcode = NOERROR;
+   hp->ad = 1;
    cp = buf + HFIXEDSZ;
    buflen -= HFIXEDSZ;
    dpp = dnptrs;

With this patch and “VerifyHostKeyDNS=yes” enabled in SSH config, OpenSSH can trust a host key fingerprint found in DNS:

debug1: found 1 secure fingerprints in DNS
debug1: matching host key fingerprint found in DNS

The True Way

The preferred method would be to use an EDNS0 OPT record and set its DNSSEC OK (DO) bit. This allows the resolver to negotiate a larger maximum response size and also returns all DNSSEC-related records, so that the resolver or application could validate the responses themselves, closing the insecure gap between a stub resolver and its validating recursor.

Unfortunately, glibc’s resolver code still needs some work before it can make good use of EDNS0 for DNSSEC.

Since glibc 2.6, EDNS0 can be enabled by “options edns0” in /etc/resolv.conf or the RES_OPTIONS environment variable (see manpage) and also by adding RES_USE_EDNS0 to the resolver API options. But this alone won’t let OpenSSH see the AD flag in DNS responses.
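
For experiments, the environment variable is handy because it limits EDNS0 to a single process instead of the whole system. A minimal sketch, assuming a glibc resolver that honors RES_OPTIONS; the wrapper name is hypothetical.

```sh
#!/bin/sh
# Sketch: run one command with EDNS0 enabled through RES_OPTIONS.
# Existing options are kept and "edns0" is appended.
# Function name is hypothetical.
with_edns0() {
    RES_OPTIONS="${RES_OPTIONS:+$RES_OPTIONS }edns0" "$@"
}

# Usage:
#   with_edns0 ssh -v -o VerifyHostKeyDNS=yes orbit.attraktor.org
```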

Note: A problem can occur in the interaction of (portable) OpenSSH and glibc, resulting in queries with an invalid UDPsize=0. See this post on the openssh-unix-dev mailing list with a proposed workaround (committed). Also reported to Ubuntu and the glibc bug tracker (fixed).

Also, RES_USE_DNSSEC is not yet defined in resolv.h or honored by the resolver code.

Additionally, the resolver’s default buffer size of 1024 bytes is too small if RES_USE_DNSSEC is enabled for all queries. The DNSSEC RFCs require a minimum EDNS buffer size of 1220 bytes (although there is ongoing discourse on this limitation), and the buffer should be much larger in practice. DNSSEC records increase the size of an answer considerably. Too small a buffer leads to truncated responses that need to be retried over TCP, placing a high burden on the server’s resources.

I addressed the issues with a different patch.

If you think you know what you’re doing and still want to experiment with DO=1 in stub resolver queries right now, add NS_OPT_DNSSEC_OK to the flags in __res_nopt():

[download patch]

--- glibc-2.9/resolv/res_mkquery.c
+++ glibc-2.9-do/resolv/res_mkquery.c
@@ -248,6 +248,7 @@
    *cp++ = NOERROR;    /* extended RCODE */
    *cp++ = 0;        /* EDNS version */
    /* XXX Once we support DNSSEC we change the flag value here.  */
+   flags |= NS_OPT_DNSSEC_OK;
    NS_PUT16(flags, cp);
    NS_PUT16(0, cp);    /* RDLEN */
    hp->arcount = htons(ntohs(hp->arcount) + 1);