
Protocol reverse engineering with tcpdump

Sometimes network protocols don’t entirely behave as documented. Other times there is no documentation at all beyond the code. Either way, you sometimes need to sniff the traffic of a connection to find out what is really going on.

Whilst I have been working on MariaDB ColumnStore for a year now, there are still some parts of the codebase I know little about. I recently had to write some code that worked with ColumnStore’s network protocol, but in a few places it was difficult to understand exactly what was happening just by looking at the code. This is where tcpdump came in.

tcpdump is a powerful tool for sniffing the raw packet data of network connections. It can be very verbose, showing parts of the TCP/IP handshake, headers and so on. This is often far more than I need for reverse engineering network protocols, so I use tcpflow to filter the results. The final command looks a little like this:

sudo tcpdump -i lo -l -w - port <PORT> | tcpflow -D -C -r -

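Breaking this down: tcpdump listens on the loopback interface (-i lo), line-buffers its output (-l) and writes the raw packets to standard output (-w -), capturing only traffic on the given port. tcpflow then reads those packets from the pipe (-r -) and prints just the payload data to the console (-C) as hex (-D).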

If we look at port 8616 (the DBRM controller) for ColumnStore, the end result can look a little like this during a small insert query:

0000: 37c1 fb14 0500 0000 3100 0000 00 7.......1....

0000: 37c1 fb14 0600 0000 0000 0000 0000 7.............

0000: 37c1 fb14 0100 0000 2d 7.......-

0000: 37c1 fb14 0d00 0000 00bd 1d00 0000 0000 0000 0000 00 7....................

0000: 37c1 fb14 0100 0000 34 7.......4

0000: 37c1 fb14 0500 0000 0029 0000 00 7........)...

0000: 37c1 fb14 9100 0000 1a05 0000 0000 102d 0000 0000 0000 0000 0000 0000 80ff ffff 7..............-................
0020: ffff ffff 7ffe ffff ff00 202d 0000 0000 0000 0000 0000 0000 80ff ffff ffff ffff .......... -....................
0040: 7ffe ffff ff00 302d 0000 0000 0000 0000 0000 0000 80ff ffff ffff ffff 7ffe ffff ......0-........................
0060: ff00 502d 0000 0000 0000 0000 0000 0000 80ff ffff ffff ffff 7ffe ffff ff00 702d ..P-..........................p-
0080: 0000 0000 0000 0000 0000 0000 80ff ffff ffff ffff 7ffe ffff ff .........................

From reading the ColumnStore messaging code I know that “37c1 fb14” is an uncompressed packet header and the next 4 bytes are the packet length. The next byte is usually the packet type (or response), which we can look up in the enums in the code. From there we can figure out the rest of the packet contents. I won’t go into details here, but on some occasions it required printing out this data and using highlighters to mark up the parts of the packet.
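As a rough illustration of that layout, here is a minimal Python sketch that splits captured bytes into packets. It is not ColumnStore code: the function name split_frames is made up for this example, and it assumes the length field is a little-endian 32-bit value that counts only the bytes after it, which is what the captures above suggest (0500 0000 is followed by 5 bytes). Treating the first payload byte as the packet type is only the "usual" case described above.

```python
import struct

# Uncompressed packet header as seen in the captures above.
MAGIC = bytes.fromhex("37c1fb14")


def split_frames(stream: bytes):
    """Yield (packet_type, payload) for each complete packet in stream."""
    offset = 0
    while offset + 8 <= len(stream):
        if stream[offset:offset + 4] != MAGIC:
            raise ValueError(f"unexpected header at offset {offset}")
        # Assumption: 4-byte little-endian length covering only the payload.
        (length,) = struct.unpack_from("<I", stream, offset + 4)
        start = offset + 8
        end = start + length
        if end > len(stream):
            break  # incomplete packet, wait for more data
        payload = stream[start:end]
        # Assumption: the first payload byte is the packet type (or response).
        packet_type = payload[0] if payload else None
        yield packet_type, payload
        offset = end


# Two of the short packets from the capture above, joined for the example.
capture = bytes.fromhex(
    "37c1fb14 05000000 3100000000"
    "37c1fb14 01000000 2d"
)
for packet_type, payload in split_frames(capture):
    print(f"type=0x{packet_type:02x} length={len(payload)}")
```

The type values printed here still have to be looked up against the enums in the code by hand. The sketch only removes the tedious part of finding where each packet starts and ends.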

This method has been extremely useful for other things in the past as well, such as debugging MySQL’s replication protocol. It is definitely part of my toolset for working on network daemons. If there are any similar tools you use, please put them in the comments below. I’m always interested in improving my workflow and toolset.

Image credit: Terry Robinson, used under a Creative Commons license

Published by
LinuxJedi
