Patrick McHardy, nftables

September 30, 2008 on 5:24 pm | In Uncategorized | No Comments

Iptables and Netfilter problems (bashing iptables)

  • Full dump during ruleset update: every change requires transferring the whole ruleset. Userspace is very primitive: extensions build a blob which is passed to the core, and there is no optimisation
  • Very little abstraction: UDP and TCP, for example, don’t share port matching code
  • Inconsistencies among matches: support for negation, ranges and prefixes differs
  • The iptables parsing code is not in the core, which introduces inconsistencies in the command line
  • One target per rule: this requires rule duplication and duplicated changes (e.g. LOG + DROP, MARK + ACCEPT)
  • Static target module parametrization: the options of a target expand to constants, so a target cannot take its value from a variable. This leads to building specific modules such as IPMARK and IPCLASSIFY, and could lead to a TCPPORTCONNMARK module

Iptables good side

  • Lots of matches, very flexible classification
  • Features beside packet filtering: load balancing, multipath routing, packet manipulation
  • Fast built-in filtering operations compared to TC
  • Easy to add new extensions, although this ease is also responsible for the poor quality of the external modules

Some other classifiers

  • tc has u32 which is powerful but not easy to use
  • BPF is interesting because it is fully programmable

It could be interesting to have a syntax similar to that of the Linux classifiers. nftables aims at this.

Patrick McHardy

Feature wishlist

  • Use netlink for incremental changes and change notifications
  • Multijump: extensions can return arbitrary verdicts, in particular jumps, so that they can direct the flow into subchains
  • Runtime target parametrization
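To make the multijump idea more concrete, here is a minimal Python sketch of a chain evaluator in which a rule can return a jump verdict chosen at runtime. This is only an illustration, not the nftables implementation: the verdict names, the `("jump", chain)` tuple, `evaluate` and `dispatch_on_dport` are all invented for this example.

```python
# Hypothetical verdict values for this sketch.
ACCEPT, DROP, CONTINUE = "accept", "drop", "continue"

def evaluate(chains, chain_name, packet):
    """Walk a chain; a rule may return a final verdict, CONTINUE or ("jump", chain)."""
    for rule in chains[chain_name]:
        verdict = rule(packet)
        if verdict == CONTINUE:
            continue
        if isinstance(verdict, tuple) and verdict[0] == "jump":
            result = evaluate(chains, verdict[1], packet)
            if result != CONTINUE:
                return result  # the subchain reached a final verdict
            continue           # the subchain fell through: resume the caller
        return verdict
    return CONTINUE

def dispatch_on_dport(packet):
    """A single rule that selects the subchain from the packet itself."""
    target = {22: "ssh", 80: "web"}.get(packet["dport"])
    return ("jump", target) if target else CONTINUE

chains = {
    "input": [dispatch_on_dport, lambda packet: DROP],
    "ssh": [lambda packet: ACCEPT],
    "web": [lambda packet: CONTINUE],
}
```

With these chains, a packet to port 22 is accepted in the `ssh` subchain, while a packet to port 80 falls through `web` and is dropped by the second rule of `input`.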

Nftables components

Three components:

  • Kernel implementation
  • nftables userspace frontend: parses the textual ruleset representation, performs error checks and postprocessing, and finally passes on the raw data needed by libnl
  • libnl: netlink message parsing and construction

The libnl point was heavily discussed, as libnl is distributed under the LGPL. Almost all developers do not want to ease the work of writers of proprietary interfaces; they would prefer to build a GPL-licensed library that could be used by third-party open source projects.

Libnl: Legal point and objectives

This legal point was followed by a discussion on how to avoid code duplication and on how far the userspace tools provided by the Netfilter project should go. Some Netfilter developers want to provide only a library that stays close to the code and let other developers bring the tool to a higher level. Others want to provide a more user-ready interface.

Some Nftables details

Patrick showed the substantial work done on internal structure definitions. He has tried to factorize things and in particular to avoid putting protocol-related information in the main structures. This is the case for the packet structure, which contains a pointer to the network header and a pointer to the transport header. These pointers are used to jump easily to protocol data.

Filter expressions look like the TC u32 syntax and are expressed internally as register-based operations.

ip daddr 192.168.0.1

[ payload load 4 offset network header + 16 => reg 1 ]
[ compare reg 1 192.168.0.1 ]
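As a rough illustration of how such register-based operations could be evaluated, here is a minimal Python sketch (not the kernel code). The instruction tuples, `run_rule` and `ipv4_header` are names invented for this example; only the 4-byte load at offset 16 from the network header and the comparison against 192.168.0.1 come from the example above.

```python
import ipaddress
import struct

def run_rule(packet, network_offset, instructions):
    """Run the instructions in order; the rule matches if no compare fails."""
    registers = {}
    for instruction in instructions:
        if instruction[0] == "payload":
            _, length, offset, dreg = instruction
            start = network_offset + offset
            registers[dreg] = packet[start:start + length]
        elif instruction[0] == "compare":
            _, sreg, value = instruction
            if registers.get(sreg) != value:
                return False
        else:
            raise ValueError(f"unknown instruction {instruction[0]!r}")
    return True

def ipv4_header(src, dst):
    """Build a bare 20-byte IPv4 header (checksum left at zero) for testing."""
    return struct.pack(
        "!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, 6, 0,
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    )

# ip daddr 192.168.0.1
RULE = [
    ("payload", 4, 16, 1),  # load 4 bytes at network header + 16 into reg 1
    ("compare", 1, ipaddress.IPv4Address("192.168.0.1").packed),
]
```

The point of the design is that the evaluator knows nothing about IPv4: the destination address is just "4 bytes at offset 16", so the same two operations can express a match on any header field.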

Sets

Sets are supported and should be able to support ipset-style sets. They are currently represented as an rbtree. This is suboptimal in many ways, but there are lots of possibilities for optimization.

An interval tree set is implemented. It could be combined with jump maps to implement the nf-hipac algorithm, but incremental changes would be very difficult.
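To give an idea of what an interval set lookup does, here is a small Python sketch. It uses a sorted list and binary search instead of a tree, and it assumes the intervals do not overlap; the class name `IntervalSet` is invented for this example and nothing here reflects the actual kernel data structure.

```python
import bisect
import ipaddress

class IntervalSet:
    """Set of non-overlapping address ranges, searched by binary search."""

    def __init__(self, networks):
        ranges = sorted(
            (int(net.network_address), int(net.broadcast_address))
            for net in map(ipaddress.ip_network, networks)
        )
        self._starts = [low for low, _ in ranges]
        self._ends = [high for _, high in ranges]

    def __contains__(self, address):
        value = int(ipaddress.ip_address(address))
        # Find the last interval starting at or before the address.
        index = bisect.bisect_right(self._starts, value) - 1
        return index >= 0 and value <= self._ends[index]

addresses = IntervalSet(["192.168.0.0/24", "192.168.1.1", "10.0.0.0/8"])
```

A lookup such as `"10.1.2.3" in addresses` then costs a logarithmic number of comparisons, whatever the number of prefixes in the set.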

A hash set has been developed too. It is a fixed-size (non-resizable) hash and can be used for arbitrary sets.

Matching

  • Payload: the payload module implements generic packet parsing. It supports encapsulation, even though the userspace part is not implemented
  • Meta: It is used to match on specific Netfilter data (length, protocol, mark, iifindex, iifname, …)
  • ct: conntrack matching
  • concat (planned): concatenates multiple keys to perform multidimensional equality expressions and lookups
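The planned concat operation can be pictured as follows: several fields are glued into one key, and a single set lookup then answers a multidimensional equality question. This Python sketch only illustrates the idea; `concat_key`, `lookup` and the choice of destination address plus destination port are assumptions made for the example.

```python
import ipaddress
import struct

def concat_key(daddr, dport):
    """Concatenate a 4-byte address and a 2-byte port into one 6-byte key."""
    return ipaddress.IPv4Address(daddr).packed + struct.pack("!H", dport)

# One set holds (address, port) pairs instead of one rule per pair.
allowed = {
    concat_key("192.168.0.1", 22),
    concat_key("192.168.0.2", 80),
}

def lookup(daddr, dport):
    """A single hash lookup replaces two separate comparisons."""
    return concat_key(daddr, dport) in allowed
```

Here `lookup("192.168.0.1", 22)` matches, while `lookup("192.168.0.1", 80)` does not, even though both the address and the port appear somewhere in the set.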

Kernel userspace interface

This is basically a netlink interface:

  • standard operations (GET/NEW/DELETE)
  • tables, chains and rules can be addressed
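The difference with a full dump can be sketched with a toy in-memory model in Python. Each operation touches a single rule and triggers a change notification. This is not the netlink protocol: the class `Ruleset`, its method names, the numeric handles and the listener callback are all hypothetical.

```python
import itertools

class Ruleset:
    """Toy model of NEW/GET/DELETE on rules addressed by table and chain."""

    def __init__(self):
        self._rules = {}                     # (table, chain) -> {handle: rule}
        self._handles = itertools.count(1)   # hypothetical rule handles
        self.listeners = []                  # called on every change

    def _notify(self, event, table, chain, handle):
        for listener in self.listeners:
            listener(event, table, chain, handle)

    def new_rule(self, table, chain, rule):
        handle = next(self._handles)
        self._rules.setdefault((table, chain), {})[handle] = rule
        self._notify("NEW", table, chain, handle)
        return handle

    def get_rules(self, table, chain):
        return dict(self._rules.get((table, chain), {}))

    def delete_rule(self, table, chain, handle):
        del self._rules[(table, chain)][handle]
        self._notify("DELETE", table, chain, handle)
```

Deleting one rule leaves the other rules of the chain untouched, and any registered listener is told exactly which rule changed.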

Performance comparison with iptables

For a ruleset consisting of 1000 instances of an ICMP rule, the average number of cycles per rule is 110 for iptables and 115-120 for nftables. Performance is thus quite comparable, although nftables is more generic and has not been optimized yet.
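To put these figures in perspective, the relative overhead can be computed directly from the cycle counts quoted above. The helper name `relative_overhead` is just for this example.

```python
def relative_overhead(baseline, measured):
    """Extra cost of `measured` relative to `baseline`, as a fraction."""
    return (measured - baseline) / baseline

IPTABLES_CYCLES = 110
NFTABLES_CYCLES = (115, 120)

low, high = (relative_overhead(IPTABLES_CYCLES, c) for c in NFTABLES_CYCLES)
print(f"nftables overhead: {low:.1%} to {high:.1%}")
```

This gives an overhead of roughly 4.5% to 9.1% per rule for the unoptimized nftables code.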

Userspace tool

The basic syntax is the following:

nft rule add filter output tcp dport ssh

nft rule add filter output ip daddr 191.68.0.1 ip protocol tcp

or

nft rule add filter output tcp dport == 22

nft rule add filter output ip daddr == 191.68.0.1 ip protocol == tcp

The implicit equality sign introduces some complexity in the implementation.
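One way to picture this complexity: the parser has to decide, after each selector, whether the next token is an operator or already the value. The Python sketch below normalizes both spellings into the same triples. It assumes every selector is exactly two words, as in the examples above, and the list of operators other than `==` is a guess; `normalize` is an invented name and this is not the real nft parser.

```python
# Relational operators accepted by this sketch (an assumption; only "==" is shown above).
OPERATORS = {"==", "!=", "<", ">", "<=", ">="}

def normalize(tokens):
    """Turn match tokens into (selector, operator, value) triples."""
    matches = []
    i = 0
    while i < len(tokens):
        selector = " ".join(tokens[i:i + 2])   # e.g. "ip daddr"
        i += 2
        if i < len(tokens) and tokens[i] in OPERATORS:
            operator = tokens[i]
            i += 1
        else:
            operator = "=="                    # implicit equality
        if i >= len(tokens):
            raise ValueError(f"missing value after {selector!r}")
        matches.append((selector, operator, tokens[i]))
        i += 1
    return matches
```

Both `"ip daddr 191.68.0.1 ip protocol tcp"` and `"ip daddr == 191.68.0.1 ip protocol == tcp"`, once split on whitespace, normalize to the same two triples.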

nft rule add filter output tcp flags syn

Flags support is not fully implemented, but the syntax could look like this:

nft rule add filter output tcp flags syn | ack
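Such a flag expression naturally maps to a bitmask. The Python sketch below turns `syn | ack` into the corresponding mask using the standard TCP header flag bits. The function name `parse_flags` is invented, and how nftables will actually compare the mask with the packet flags is not covered here.

```python
# Standard TCP header flag bits.
TCP_FLAGS = {
    "fin": 0x01,
    "syn": 0x02,
    "rst": 0x04,
    "psh": 0x08,
    "ack": 0x10,
    "urg": 0x20,
}

def parse_flags(expression):
    """OR together the bits of every flag named in e.g. "syn | ack"."""
    mask = 0
    for name in expression.split("|"):
        mask |= TCP_FLAGS[name.strip()]
    return mask
```

For instance, `parse_flags("syn | ack")` yields `0x12`, the value carried by the flags of a SYN/ACK segment.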

It is possible to define composed objects:

nft add filter output ip addr  {192.168.0.0/24, 192.168.1.1, 10.0.0.0/8}
