Reading network addresses in /proc

If you happen to be in a read-only container that lacks the usual network utilities (netstat, ss (socket statistics), lsof, etc.), your only option is to rely on procfs. However, procfs displays some of its data in hexadecimal.

This blog post briefly presents a few tricks to read /proc in a human-readable way.

tl;dr: the final command could look like this:

lists destination IPs on port 9042
$ cat /proc/$(pidof java)/net/tcp \
  | awk -v DPORT="$(printf ':%04X$' 9042)" '$3 ~ DPORT { print $3 }' \
  | sort -u \
  | cut -f1 -d':' \
  | awk '{gsub(/../,"0x& ")} OFS="." {for(i=NF;i>0;i--) printf "%d%s", $i, (i == 1 ? ORS : OFS)}'
10.45.12.17
10.45.1.18
10.45.77.20
10.45.31.23
10.45.84.25
10.45.8.26
10.45.54.30
10.45.55.30
10.45.12.32
10.45.19.34
10.45.10.75
10.45.32.123

I understand these long awk scripts are intimidating, but they are really helpful when you don’t have a choice, and you learn some awk basics along the way. So let’s see how I built it, then whether parts of this command can be repurposed for other use cases.

Constructing the above command

First, we need to see which information to extract from procfs.

output of procfs net/tcp
$ cat /proc/$(pidof java)/net/tcp
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:3A98 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1337        0 1694605946 1 0000000000000000 100 0 0 10 0
   1: 00000000:3A99 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1337        0 1694637598 1 0000000000000000 100 0 0 10 0
   2: 00000000:3A9E 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1337        0 1694637607 1 0000000000000000 100 0 0 10 0
   3: 00000000:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000 43514        0 1694688232 1 0000000000000000 100 0 0 10 0
   4: 00000000:3AF2 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1337        0 1694605955 1 0000000000000000 100 0 0 10 0
   5: 0D07D00A:3A9E 0F01D00A:9008 01 00000000:00000000 00:00000000 00000000  1337        0 1716103554 1 0000000000000000 20 4 0 20 33
   6: 0D07D00A:9152 1D20D00A:1F90 01 00000000:00000000 00:00000000 00000000  1337        0 1713179716 1 0000000000000000 20 4 22 10 -1
   7: 0D07D00A:D290 2708DC0A:0050 01 00000000:00000000 00:00000000 00000000 43514        0 1716022449 1 0000000000000000 20 4 28 10 145
   8: 0100007F:1F90 0100007F:86DC 01 00000000:00000000 00:00000000 00000000 43514        0 1716279440 2 0000000000000000 21 4 2 10 24
   9: 0D07D00A:AC3E 7126D00A:1F90 01 00000000:00000000 00:00000000 00000000  1337        0 1694739437 1 0000000000000000 21 4 22 10 -1
  10: 0D07D00A:88FA 901BD00A:1F90 01 00000000:00000000 00:00000000 00000000  1337        0 1694718568 1 0000000000000000 20 4 30 10 -1
...

kernel documentation for /proc/net/tcp (proc_net_tcp.txt)
This document describes the interfaces /proc/net/tcp and /proc/net/tcp6.
Note that these interfaces are deprecated in favor of tcp_diag.

These /proc interfaces provide information about currently active TCP
connections, and are implemented by tcp4_seq_show() in net/ipv4/tcp_ipv4.c
and tcp6_seq_show() in net/ipv6/tcp_ipv6.c, respectively.

It will first list all listening TCP sockets, and next list all established
TCP connections. A typical entry of /proc/net/tcp would look like this (split
up into 3 parts because of the length of the line):

   46: 010310AC:9C4C 030310AC:1770 01
   |      |      |      |      |   |--> connection state
   |      |      |      |      |------> remote TCP port number
   |      |      |      |-------------> remote IPv4 address
   |      |      |--------------------> local TCP port number
   |      |---------------------------> local IPv4 address
   |----------------------------------> number of entry

   00000150:00000000 01:00000019 00000000
      |        |     |     |       |--> number of unrecovered RTO timeouts
      |        |     |     |----------> number of jiffies until timer expires
      |        |     |----------------> timer_active (see below)
      |        |----------------------> receive-queue
      |-------------------------------> transmit-queue

   1000        0 54165785 4 cd1e6040 25 4 27 3 -1
    |          |    |     |    |     |  | |  | |--> slow start size threshold,
    |          |    |     |    |     |  | |  |      or -1 if the threshold
    |          |    |     |    |     |  | |  |      is >= 0xFFFF
    |          |    |     |    |     |  | |  |----> sending congestion window
    |          |    |     |    |     |  | |-------> (ack.quick<<1)|ack.pingpong
    |          |    |     |    |     |  |---------> Predicted tick of soft clock
    |          |    |     |    |     |              (delayed ACK control data)
    |          |    |     |    |     |------------> retransmit timeout
    |          |    |     |    |------------------> location of socket in memory
    |          |    |     |-----------------------> socket reference count
    |          |    |-----------------------------> inode
    |          |----------------------------------> unanswered 0-window probes
    |---------------------------------------------> uid

timer_active:
  0  no timer is pending
  1  retransmit-timer is pending
  2  another timer (e.g. delayed ack or keepalive) is pending
  3  this is a socket in TIME_WAIT state. Not all fields will contain
     data (or even exist)
  4  zero window probe timer is pending

According to the documentation, to select the connections targeting the remote port 9042, I need the third column (rem_address).

remote address
$ cat /proc/$(pidof java)/net/tcp \
  | awk '{print $3}'
rem_address
00000000:0000
00000000:0000
00000000:0000
00000000:0000
00000000:0000
0F01D00A:9008
1D20D00A:1F90
2708DC0A:0050
0100007F:86DC
0100007F:1F90
...

Then we need to select the port. It’s simple: procfs prints ports as four uppercase hexadecimal digits, so we just need to match the hexadecimal value of 9042, which is 2352.

connection’s destination addresses for port 9042
$ cat /proc/$(pidof java)/net/tcp \
  | awk '{print $3}' \
  | grep "$(printf ':%04X$' 9042)"
1148D00A:2352
142CD00A:2352
2207D00A:2352
2207D00A:2352
1E1BD00A:2352
1934D00A:2352
1A4CD00A:2352
4B1FD00A:2352
4B1FD00A:2352
1E2DD00A:2352
1A4CD00A:2352
1148D00A:2352
1E2DD00A:2352
1E1BD00A:2352
...

The grep command can be included as part of the awk script.

$ cat /proc/$(pidof java)/net/tcp \
  | awk -v DPORT="$(printf ':%04X$' 9042)" '$3 ~ DPORT { print $3 }'
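
A pattern built with a bare %x can be fragile: procfs writes ports as four uppercase hexadecimal digits, so 8080 appears as 1F90 (a lowercase 1f90 would not match) and 80 appears as 0050 (an unpadded 50 could also match 5000). Here is a minimal sketch of a helper that produces the exact form; the name port_to_hex is my own and not part of any tool.

```shell
# port_to_hex is a hypothetical helper name, not something procfs or awk provides.
# It prints a decimal port the way /proc/net/tcp does: 4 uppercase hex digits.
port_to_hex() {
    printf '%04X\n' "$1"
}

port_to_hex 9042   # 2352
port_to_hex 8080   # 1F90
port_to_hex 80     # 0050
```

The result can then be anchored at the end of the address field, for example with awk -v DPORT=":$(port_to_hex 9042)\$" '$3 ~ DPORT'.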

The first thing to notice is the duplicate entries. That’s expected: there are multiple connections to the same remote IP address (possibly in different states). Duplicates can be removed with sort -u. Then we need to extract the IP.

all destination IPs (reversed hexadecimal) on port 9042
$ cat /proc/$(pidof java)/net/tcp \
  | awk -v DPORT="$(printf ':%04X$' 9042)" '$3 ~ DPORT { print $3 }' \
  | sort -u \
  | cut -f1 -d:
110C2D0A
12012D0A
144D2D0A
171F2D0A
19542D0A
1A082D0A
1E362D0A
1E372D0A
200C2D0A
22132D0A
4B0A2D0A
7B202D0A

Then we need to print these in human-readable form. Notice they all end with 2D0A; those two octets are 0x2D = 45 and 0x0A = 10 respectively. This suggests these IPs are actually reversed. The likely reason is that the kernel prints the address, which is stored in network byte order, as a native 32-bit integer, so the bytes come out reversed on little-endian machines such as x86. I had only ever used reversed IPs for reverse DNS lookups before. (If you know more, please drop a comment ;))

So, to get a human-readable form, we need to separate the 4 bytes, convert them to decimal, and reverse their order.

$ echo "110C2D0A" |
    sed 's/../0x& /g' |                                       # (1)
    awk '{ for(i=NF;i>0;i--) printf "%d.",$i; print "" }' |   # (2)
    sed 's/.$//'                                              # (3)
1 Tell sed to split the stream every two characters and prefix each byte with 0x, so that printf "%d" reads it as a hexadecimal number.
2 Reverse the order of the fields: a simple for-loop decrements the index and prints each field as a decimal followed by a dot. Then print "" ends the line.
3 The last sed simply removes the trailing dot.

These two sed invocations are a bit inelegant and can be replaced by a better awk script:

$ echo "110C2D0A" \
  | awk '{gsub(/../,"0x& ")} OFS="." {for(i=NF;i>0;i--) printf "%d%s", $i, (i == 1 ? ORS : OFS)}'
  1. Here sed 's/../0x& /g' is replaced by {gsub(/../,"0x& ")}.

  2. Replacing the last sed requires a proper join in the awk script: OFS (the output field separator) separates the octets of each IP, and ORS (the output record separator) ends the line, which gives printf "%d%s", $i, (i == 1 ? ORS : OFS) for each octet.
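
One caveat: the "0x& " trick relies on awk reading a string such as 0x11 as a hexadecimal number. Some awk implementations do (the output in this post shows that mine did), but GNU awk treats such strings as 0 unless it is started with --non-decimal-data. As a fallback, here is a minimal sketch that leaves the conversion to the shell's own printf, which accepts the 0x prefix. The function name hex_to_ipv4 is hypothetical.

```shell
# hex_to_ipv4 is a hypothetical helper name.
# Converts one reversed hexadecimal IPv4 address (e.g. 110C2D0A) to dotted decimal.
hex_to_ipv4() {
    hex=$1
    # Cut the 8 hex digits into 4 bytes, left to right.
    b1=$(printf '%s\n' "$hex" | cut -c1-2)
    b2=$(printf '%s\n' "$hex" | cut -c3-4)
    b3=$(printf '%s\n' "$hex" | cut -c5-6)
    b4=$(printf '%s\n' "$hex" | cut -c7-8)
    # Print them last byte first; the shell printf parses the 0x prefix itself.
    printf '%d.%d.%d.%d\n' "0x$b4" "0x$b3" "0x$b2" "0x$b1"
}

hex_to_ipv4 110C2D0A   # 10.45.12.17
hex_to_ipv4 0100007F   # 127.0.0.1
```

To convert a whole list, pipe it into a loop such as while read -r h; do hex_to_ipv4 "$h"; done. It spawns a few processes per address, so it is slower than the awk version, but it only needs cut and printf.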

human readable list of destination IPs on port 9042
$ cat /proc/$(pidof java)/net/tcp \
  | awk -v DPORT="$(printf ':%04X$' 9042)" '$3 ~ DPORT { print $3 }' \
  | sort -u \
  | cut -f1 -d':' \
  | awk '{gsub(/../,"0x& ")} OFS="." {for(i=NF;i>0;i--) printf "%d%s", $i, (i == 1 ? ORS : OFS)}'
10.45.12.17
10.45.1.18
10.45.77.20
10.45.31.23
10.45.84.25
10.45.8.26
10.45.54.30
10.45.55.30
10.45.12.32
10.45.19.34
10.45.10.75
10.45.32.123
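
The same pipeline can also be folded into a single awk pass. The following is only a sketch under a few assumptions: the function name remote_ips_for_port is hypothetical, the input is the content of a net/tcp pseudo file on stdin, and the hexadecimal conversion is done by hand so that it does not depend on how a given awk treats 0x-prefixed strings.

```shell
# remote_ips_for_port is a hypothetical helper; it reads /proc/<pid>/net/tcp
# content on stdin and takes the decimal remote port as its only argument.
remote_ips_for_port() {
    awk -v port="$1" '
    # Convert a string of hex digits to a number without relying on
    # awk parsing a "0x" prefix.
    function hex2dec(h,    i, n) {
        h = toupper(h)
        n = 0
        for (i = 1; i <= length(h); i++)
            n = n * 16 + index("0123456789ABCDEF", substr(h, i, 1)) - 1
        return n
    }
    NR > 1 {                                  # skip the header line
        split($3, addr, ":")                  # addr[1] = IP, addr[2] = port
        if (hex2dec(addr[2]) != port) next
        if (addr[1] in seen) next             # print each remote IP once
        seen[addr[1]] = 1
        # Bytes are stored in reverse order: read the pairs right to left.
        printf "%d.%d.%d.%d\n",
            hex2dec(substr(addr[1], 7, 2)), hex2dec(substr(addr[1], 5, 2)),
            hex2dec(substr(addr[1], 3, 2)), hex2dec(substr(addr[1], 1, 2))
    }'
}
```

It would be called as remote_ips_for_port 9042 < /proc/$(pidof java)/net/tcp. Unlike the sort -u version, addresses come out in the order they are first seen rather than sorted.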

Exploring other usage

There are many files in the procfs net directory. I only showed the /proc/net/tcp pseudo file, and I started this blog post with a port filter, but there are other elements to look at, e.g. the TCP connection state. There are also other pseudo files, such as /proc/net/udp and /proc/net/route.

TCP Connection state

While this may seem tricky at first, it’s easy to turn these few lines into functions, then combine or modify them to filter on other criteria. For example, the fourth field holds the connection state, whose values are defined in the kernel sources:

#define _LINUX_TCP_STATES_H

enum {
	TCP_ESTABLISHED = 1,
	TCP_SYN_SENT,
	TCP_SYN_RECV,
	TCP_FIN_WAIT1,
	TCP_FIN_WAIT2,
	TCP_TIME_WAIT,
	TCP_CLOSE,
	TCP_CLOSE_WAIT,
	TCP_LAST_ACK,
	TCP_LISTEN,
	TCP_CLOSING,	/* Now a valid state */
	TCP_NEW_SYN_RECV,

	TCP_MAX_STATES	/* Leave at the end! */
};

Thanks to this stackoverflow answer for pointing to the kernel code.
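
Since the enum starts at 1 and the st column is printed as two hexadecimal digits, the mapping can be written down directly. Here is a small sketch of a lookup function; the name tcp_state_name is hypothetical, and the labels are simply the enum names without their TCP_ prefix.

```shell
# tcp_state_name is a hypothetical helper name.
# Maps the 2-digit hex value of the "st" column to the kernel enum name.
tcp_state_name() {
    case "$1" in
        01) echo ESTABLISHED ;;
        02) echo SYN_SENT ;;
        03) echo SYN_RECV ;;
        04) echo FIN_WAIT1 ;;
        05) echo FIN_WAIT2 ;;
        06) echo TIME_WAIT ;;
        07) echo CLOSE ;;
        08) echo CLOSE_WAIT ;;
        09) echo LAST_ACK ;;
        0A) echo LISTEN ;;
        0B) echo CLOSING ;;
        0C) echo NEW_SYN_RECV ;;
        *)  echo "UNKNOWN($1)" ;;
    esac
}

tcp_state_name 0A   # LISTEN
tcp_state_name 06   # TIME_WAIT
```

The state column itself can be extracted with awk 'NR > 1 { print $4 }' and each value passed to this function.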

So, to list which ports are open, we need to filter on the TCP_LISTEN state (0x0A). In this case the local address indicates which addresses the process has bound.

$ cat /proc/$(pidof java)/net/tcp \
  | awk -v TCP_STATE=0A '($4 == TCP_STATE) { print $2 }'
00000000:1F90
00000000:3AF2
0100007F:3A98
00000000:3A99
00000000:3A9E

Now my previous awk script

$ echo "110C2D0A" \
  | awk '{gsub(/../,"0x& ")} OFS="." {for(i=NF;i>0;i--) printf "%d%s", $i, (i == 1 ? ORS : OFS)}'

can only format IPs. Let’s rewrite it so that it can parse a whole address (IP:port):

$ echo "0100007F:3A98" \
  | awk -F: '{gsub(/../,"0x& ", $1)} {l=split($1,hip," "); for(i=l;i>0;i--) printf "%d%s", hip[i], (i == 1 ? ":" : "."); printf "%d%s","0x"$2, ORS}'
  1. Addresses are split on the colon separator with -F:.

  2. Then {gsub(/../,"0x& ", $1)} splits the IP field into octets and prefixes each of them with 0x.

  3. Then I need to reverse the IP octets. As awk does not re-split $1 into new fields, I split the IP field into an array with l=split($1,hip," "), where l is the length of that array, then run the for-loop on this array: for(i=l;i>0;i--) printf "%d%s", hip[i], (i == 1 ? ":" : "."). The line ends by printing the port field in decimal: printf "%d%s","0x"$2, ORS.

$ cat /proc/$(pidof java)/net/tcp \
  | awk -v TCP_STATE=0A '($4 == TCP_STATE) { print $2 }' \
  | awk -F: '{gsub(/../,"0x& ", $1)} {l=split($1,hip," "); for(i=l;i>0;i--) printf "%d%s", hip[i], (i == 1 ? ":" : "."); printf "%d%s","0x"$2, ORS}'
0.0.0.0:8080
0.0.0.0:15090
127.0.0.1:15000
0.0.0.0:15001
0.0.0.0:15006

To make this script reusable for either an IP or a full address (IP:port), it just needs a little tweak:

- awk -F: '{gsub(/../,"0x& ", $1)} {l=split($1,hip," "); for(i=l;i>0;i--) printf "%d%s", hip[i], (i == 1 ? ":" : "."); printf "%d%s","0x"$2, ORS}'
+ awk -F: '{gsub(/../,"0x& ", $1)} {l=split($1,hip," "); for(i=l;i>0;i--) printf "%d%s", hip[i], (i == 1 ? "" : "."); if ($2 != "") printf ":%d", "0x"$2; print ""}'

This last revision simply skips the port $2 if its value is blank, and ends the line in both cases.

format hexadecimal IPs or network addresses
$ echo "0100007F:3A98
0100007F" \
  | awk -F: '{gsub(/../,"0x& ", $1)} {l=split($1,hip," "); for(i=l;i>0;i--) printf "%d%s", hip[i], (i == 1 ? "" : "."); if ($2 != "") printf ":%d", "0x"$2; print ""}'
127.0.0.1:15000
127.0.0.1

Reading /proc/net/route

Likewise, in a stripped-down container /sbin/route may be missing; in that case procfs offers an alternative:

$ cat /proc/$(pidof java)/net/route
Iface	Destination	Gateway 	Flags	RefCnt	Use	Metric	Mask		MTU	Window	IRTT
eth0	00000000	0104F80A	0003	0	0	0	00000000	0	0	0
eth0	0204F80A	00000000	0001	0	0	0	00FFFFFF	0	0	0

Reversed hexadecimal IPs are immediately identifiable, although having a human-readable form would be better. Reusing our previous awk command is pure profit.

$ cat /proc/$(pidof java)/net/route \
  | awk '$1 == "eth0" {print $3}' \
  | awk '{gsub(/../,"0x& ")} OFS="." {for(i=NF;i>0;i--) printf "%d%s", $i, (i == 1 ? ORS : OFS)}'
10.248.4.1
0.0.0.0

I’m not quite sure about the flags; this Stack Overflow answer suggests looking at the net-tools source,

which in turn suggests looking at the include/linux/route.h file (v3.6), which moved to include/uapi/linux/route.h#L50-L60 in v3.7. This happened in this commit, which created what is called the user space API of the kernel (uapi), i.e. the headers that can be used publicly.

See this discussion, and thanks for this Stack Overflow answer.

Anyway, here are the flags:

#define	RTF_UP		0x0001		/* route usable		  	*/
#define	RTF_GATEWAY	0x0002		/* destination is a gateway	*/
#define	RTF_HOST	0x0004		/* host entry (net otherwise)	*/
#define RTF_REINSTATE	0x0008		/* reinstate route after tmout	*/
#define	RTF_DYNAMIC	0x0010		/* created dyn. (by redirect)	*/
#define	RTF_MODIFIED	0x0020		/* modified dyn. (by redirect)	*/
#define RTF_MTU		0x0040		/* specific MTU for this route	*/
#define RTF_MSS		RTF_MTU		/* Compatibility :-(		*/
#define RTF_WINDOW	0x0080		/* per route window clamping	*/
#define RTF_IRTT	0x0100		/* Initial round trip time	*/
#define RTF_REJECT	0x0200		/* Reject route			*/
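
With these values, the Flags column can be decoded with a few bitwise tests. The sketch below uses shell arithmetic only; the function name route_flags is hypothetical, and the labels are the macro names above without their RTF_ prefix (RTF_MSS is omitted since it is an alias of RTF_MTU).

```shell
# route_flags is a hypothetical helper name.
# Decodes the hexadecimal Flags column of /proc/net/route into flag names.
route_flags() {
    value=$((0x$1))
    names=""
    # Each pair is <bit value in decimal>:<name>, following the header above.
    for pair in 1:UP 2:GATEWAY 4:HOST 8:REINSTATE 16:DYNAMIC 32:MODIFIED \
                64:MTU 128:WINDOW 256:IRTT 512:REJECT; do
        bit=${pair%%:*}
        name=${pair#*:}
        if [ $((value & bit)) -ne 0 ]; then
            names="$names $name"
        fi
    done
    echo "${names# }"
}

route_flags 0003   # UP GATEWAY
route_flags 0001   # UP
```

Applied to the two routes shown earlier, the first one (0003) is a usable route through a gateway, and the second one (0001) is a usable, directly connected route.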

IPv6

Please keep in mind that the above commands only account for IPv4 connections. IPv6 connections are in their own pseudo files, e.g. /proc/$(pidof java)/net/tcp6 and /proc/$(pidof java)/net/ipv6_route.

/proc/$(pidof java)/net/tcp6
$ cat /proc/$(pidof java)/net/tcp6
  sl  local_address                         remote_address                        st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000000000000000000000000000:3AAC 00000000000000000000000000000000:0000 0A 00000000:00000000 00:00000000 00000000  1337        0 2247569106 1 0000000000000000 100 0 0 10 0
   1: 00000000000000000000000000000000:FFFD 00000000000000000000000000000000:0000 0A 00000000:00000000 00:00000000 00000000 65533        0 2247536566 1 0000000000000000 100 0 0 10 0
   2: 0000000000000000FFFF0000171F2D0A:3AAC 0000000000000000FFFF00000134D00A:9C74 06 00000000:00000000 03:000002B9 00000000     0        0 0 3 0000000000000000
   3: 0000000000000000FFFF0000171F2D0A:3AAC 0000000000000000FFFF00000134D00A:9A36 06 00000000:00000000 03:00000060 00000000     0        0 0 3 0000000000000000
   4: 0000000000000000FFFF0000171F2D0A:3AAC 0000000000000000FFFF00000134D00A:9ED8 06 00000000:00000000 03:00000511 00000000     0        0 0 3 0000000000000000

IPv6 addresses should be a tad easier, as they are written in hexadecimal anyway, so the only thing left is formatting (grouping and removing leading 0s). However, there’s a catch: the last bytes look exactly like a reversed IPv4 address. The kernel prints an IPv6 address as four 32-bit words, each as a native integer, so on a little-endian machine the bytes are reversed within each group of 8 hexadecimal digits, not across the whole address. In this case the IPv6 address is

  1. 0000000000000000FFFF0000171F2D0A

  2. 00000000 00000000 0000FFFF 0A2D1F17 bytes reversed within each 32-bit word

  3. 0000:0000:0000:0000:0000:FFFF:0A2D:1F17 grouped by words (2 bytes)

  4. 0:0:0:0:0:ffff:a2d:1f17 leading 0s removed

This is ::ffff:10.45.31.23, the IPv4-mapped form of the local address. The example above only covers an IPv4-mapped address, so I’d rather stay prudent for other IPv6 addresses.
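
Here is a hedged sketch of an IPv6 conversion, assuming the kernel prints the address as four 32-bit words, each in host byte order (so each group of 8 hexadecimal digits is byte-reversed independently on a little-endian machine). The function name hex_to_ipv6 is hypothetical, and the output is deliberately left uncompressed (no :: shortening, leading 0s kept).

```shell
# hex_to_ipv6 is a hypothetical helper name.
# Converts the 32 hex digits found in /proc/net/tcp6 to 8 colon-separated groups,
# assuming each 32-bit word was printed in little-endian host byte order.
hex_to_ipv6() {
    echo "$1" | awk '{
        addr = ""
        # Walk the four 8-digit words; inside each, read the byte pairs
        # right to left, then emit them as two 16-bit groups.
        for (w = 0; w < 4; w++) {
            word = substr($0, w * 8 + 1, 8)
            rev = substr(word, 7, 2) substr(word, 5, 2) substr(word, 3, 2) substr(word, 1, 2)
            addr = addr (w ? ":" : "") substr(rev, 1, 4) ":" substr(rev, 5, 4)
        }
        print tolower(addr)
    }'
}

hex_to_ipv6 0000000000000000FFFF0000171F2D0A   # 0000:0000:0000:0000:0000:ffff:0a2d:1f17
hex_to_ipv6 00000000000000000000000001000000   # 0000:0000:0000:0000:0000:0000:0000:0001
```

The second call is the loopback address ::1, which is a convenient sanity check for the per-word reversal.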

Closing thoughts

When containers are stripped down and read-only, and you can’t attach a debug container, it’s fortunate to still be able to introspect process internals thanks to the /proc pseudo filesystem. However, the information is sometimes not easily readable by humans. I hope the few awk tricks here will be useful to someone else.