How to find 0day

Brother · Nov 10, 2021

I decided to write about a common error in network applications and network devices. I will explain the problem using the Linux stack as an example, but I will argue in the abstract, trying to convey the principle. Applications differ, but the essence is the same: they move bits back and forth. Each network application or network device needs its own approach to fixing the mistakes made in it.

The problem lies in the small chunks of data that systems exchange. More precisely, not in the chunks themselves, but in how they are later reassembled and processed.

For example, one system sends another system the phrase "Hello World!"

How is that done? It doesn't matter. Some transport protocol is chosen (TCP, MPTCP, UDP), the data is split into chunks, which we will call packets, and they are sent to the destination system. How exactly is irrelevant here.

What matters is what the destination system does with the received data, and how. Suppose the string "Hello World!" was split into two packets:

"Hello" - the contents of the first packet.

"World!" - the contents of the second packet.

When the network device starts receiving the packets, it may process them itself; if that succeeds, they are passed on to the network stack of the operating system (OS). (The problems at this stage are described later.)

Once the packets reach the network stack, the OS processes each packet layer by layer. Part of every packet describes how it was delivered to the system; we can safely skip that part because it does not matter here. We skip everything up to the important point: the reassembly of data at the transport layer.

Ultimately, the system must work out the correct packet order. After all, the first packet to reach the system could be

"World!" - the second packet.

And only then

"Hello" - the first packet.

If the OS did not restore the correct order, the user-level application would receive the string "World! Hello". Most people would not accept that, so the protocols carry sequence numbers (as well as timestamps, but those are details; we will only talk about sequence numbers and call them "seq").
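The reordering step can be sketched in a few lines. This is only an illustration: struct chunk and reassemble are hypothetical names, not kernel code, and the sketch ignores sequence number wraparound and overlapping data.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical chunk type for illustration; not a kernel structure. */
struct chunk {
    uint32_t seq;      /* sequence number of the first payload byte */
    const char *data;  /* payload */
    size_t len;        /* payload length in bytes */
};

/* qsort comparator: ascending seq (no wraparound handling). */
static int chunk_cmp(const void *a, const void *b)
{
    const struct chunk *x = a, *y = b;
    return (x->seq > y->seq) - (x->seq < y->seq);
}

/* Sort the chunks by seq and concatenate them into out as a
 * NUL-terminated string. Returns the number of payload bytes
 * written, or 0 if out is too small. */
size_t reassemble(struct chunk *chunks, size_t n, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return 0;
    qsort(chunks, n, sizeof(*chunks), chunk_cmp);
    for (size_t i = 0; i < n; i++) {
        /* Keep one byte free for the terminating NUL. */
        if (chunks[i].len >= out_size - used)
            return 0;
        memcpy(out + used, chunks[i].data, chunks[i].len);
        used += chunks[i].len;
    }
    out[used] = '\0';
    return used;
}
```

Feeding it "World!" (seq 7) followed by "Hello " (seq 1) yields "Hello World!", regardless of arrival order.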

So far we have two important things: the data and seq. Let's add a third: the data size. The data and its size are all you need to send something over a socket. So why mention seq at all? Because, knowing it, you can predict the state of the second system, the one receiving our "Hello World!".

Let's look at Linux and the TCP protocol (I quote code from the Linux 5.10.7 kernel). The file net/ipv4/tcp_ipv4.c contains the function tcp_v4_fill_cb, which sheds light on what happens on the target system. More precisely, we are interested in this line:

Code:

TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +

end_seq marks the end of the packet's data in sequence space. (The end_seq of each subsequent packet must be at least one greater than that of the previous one, otherwise the packet is discarded before its data is ever copied from the socket.)

For example: "Hello " is the first packet, 6 bytes in size (the count includes the trailing space).

The first packet has seq = 0x1 and end_seq = 0x7.

"World!" is the second packet, 7 bytes in size (presumably counting the terminating zero byte).

The second packet has seq = 0x7 and end_seq = 0xe.

The data is then assembled into the string "Hello World!".

But if the second packet has seq = 0x4 and end_seq = 0xb,

the data is assembled into the string "Hello ld!".
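The overlap case can be reproduced with a small model. This is a sketch under assumptions: struct rx_state and rx_segment are hypothetical names, the model keeps only rcv_nxt and one flat buffer, and it simply drops out-of-order segments instead of queueing them as a real stack would.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical receiver state; not the kernel's struct tcp_sock. */
struct rx_state {
    uint32_t rcv_nxt;  /* next sequence number expected */
    char buf[64];      /* reassembled stream, kept NUL-terminated */
    size_t used;       /* bytes stored in buf */
};

/* Accept one segment. Bytes below rcv_nxt count as already received
 * and are skipped. Returns the number of bytes appended. */
size_t rx_segment(struct rx_state *rx, uint32_t seq,
                  const char *data, uint32_t len)
{
    uint32_t end_seq = seq + len;
    uint32_t skip;

    /* Whole segment is old data: end_seq is not after rcv_nxt. */
    if ((int32_t)(end_seq - rx->rcv_nxt) <= 0)
        return 0;
    /* Segment starts beyond rcv_nxt: out of order, dropped here. */
    if ((int32_t)(seq - rx->rcv_nxt) > 0)
        return 0;

    skip = rx->rcv_nxt - seq;  /* overlap with data already received */
    len -= skip;
    if (len > sizeof(rx->buf) - 1 - rx->used)
        return 0;              /* no room left in the model buffer */
    memcpy(rx->buf + rx->used, data + skip, len);
    rx->used += len;
    rx->buf[rx->used] = '\0';
    rx->rcv_nxt = end_seq;
    return len;
}
```

Starting from rcv_nxt = 0x1, accepting "Hello " (seq 0x1, 6 bytes) and then "World!" with seq 0x4 and 7 bytes skips the first three bytes of the second payload and leaves "Hello ld!" in the buffer, with rcv_nxt at 0xb.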

The characters "Wor" are discarded. (This happens when the data is copied out of the packets, while the socket is being read.) The tcp_rcv_nxt_update function in net/ipv4/tcp_input.c sets the rcv_nxt field of the struct tcp_sock protocol structure. rcv_nxt is advanced to the packet's end_seq once its in-order data has been accepted into the socket's receive queue. (The initial rcv_nxt value is set right after the TCP handshake.)

In short: a packet arrives and its end_seq is computed. If the data is in order, it is queued on the socket and rcv_nxt is set to end_seq. When the application reads the socket, the data is copied from the packets (socket buffers) into the shared buffer and the copied_seq field of struct tcp_sock is advanced. Then the cycle repeats.

Now consider the following. What happens if the first packet has seq = 0xfffffff0 and a size of 0x64 (100) bytes? Then

Code:

TCP_SKB_CB(skb)->end_seq = (0xfffffff0 + 0x1 + 0 + 0x64); /* seq + syn + fin + 100 bytes of data */

That is, an overflow occurs: the 32-bit result is end_seq = 0x55 (85 in decimal), while rcv_nxt is still 0xfffffff0, since this packet's data has not been accepted yet.

This is how we get a situation where end_seq < seq (0x55 < 0xfffffff0).
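The wraparound is easy to check in isolation. The helper below is a sketch: end_seq_of is a hypothetical name, and it only mirrors the shape of the expression quoted above (seq + syn + fin + payload length) in 32-bit unsigned arithmetic.

```c
#include <stdint.h>

/* Hypothetical helper: end of a segment in 32-bit sequence space.
 * syn and fin are 0 or 1, payload_len is the number of data bytes.
 * The sum is reduced modulo 2^32, exactly like a __u32 field. */
uint32_t end_seq_of(uint32_t seq, unsigned syn, unsigned fin,
                    uint32_t payload_len)
{
    return (uint32_t)(seq + syn + fin + payload_len);
}
```

With seq = 0xfffffff0, syn = 1, fin = 0 and 0x64 bytes of data, the result is 0x55, numerically smaller than the seq it started from.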

You can then go on sending packets with large sequence numbers and larger payloads (in order to influence the value of skb->len).

For example, this packet is also valid: seq = 0xfffffc6e with 1000 bytes of data (end_seq = 0x56). None of this is interesting until you remember the URG flag. Wikipedia describes the urgent pointer as a 16-bit value, a positive offset from the sequence number in the given segment; the field indicates the sequence number of the octet at which the urgent data ends.

And this is where the potential errors are. It all depends on how the network device processes URG packets. The errors can be in the driver implementation or in the hardware (with a TCP Offload Engine).

With large seq values (close to integer overflow), control over the socket buffer size (the size of the packet data), the ability to set the urgent offset anywhere in the range 0 - 0xffff, and the ability to steer the rcv_nxt value, you can influence the further behavior of the system at the kernel level.

I will add some more information; it will not be superfluous.

Sequence number checks are built on the after and before functions, which test whether one number lies "before" or "after" another in sequence space (together with equality checks).

Here are the results of functions with some numbers, as an example.

Code:

inline bool before(__u32 seq1, __u32 seq2)

As you can see, the ordering comparison is reduced to subtracting the second argument from the first and comparing the signed result with zero.
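To see what such a comparison returns for concrete numbers, here is a user-space sketch. seq_before and seq_after are hypothetical stand-ins written for this post; they assume the signed-difference form described above and are not copied from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if seq1 lies before seq2 in 32-bit sequence space:
 * the unsigned difference is reinterpreted as a signed number. */
bool seq_before(uint32_t seq1, uint32_t seq2)
{
    return (int32_t)(seq1 - seq2) < 0;
}

/* True if seq1 lies after seq2: the same test with swapped arguments. */
bool seq_after(uint32_t seq1, uint32_t seq2)
{
    return seq_before(seq2, seq1);
}
```

Some results: seq_before(0xfffffff0, 0x55) is true, so the wrapped end_seq 0x55 counts as "after" 0xfffffff0 even though it is numerically smaller. seq_before(1, 1) is false. For two numbers exactly 0x80000000 apart the answer is ambiguous: seq_before(0, 0x80000000) and seq_before(0x80000000, 0) are both true.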

In most cases these functions decide what happens to a packet. If a packet's end_seq turns out to be less than or equal to the sequence number already received, the packet is discarded.

For example, end_seq < seq (0x55 < 0xfffffff0).

The interesting cases are those where arithmetic is performed on two or more parameters that can be influenced, and the result is compared with some_number:

Code:

if (TCP_SKB_CB(skb)->end_seq - TCP_SKB_CB(skb)->seq >= some_number) {
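Why this pattern deserves attention can be shown with two small functions. Both are hypothetical illustrations, not driver code: the first mirrors the bare unsigned subtraction, the second adds one possible guard that rejects the case where the first operand is behind the second.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bare pattern: if a is behind b, a - b wraps to a huge value
 * and the check passes for almost any some_number. */
bool naive_at_least(uint32_t a, uint32_t b, uint32_t some_number)
{
    return a - b >= some_number;
}

/* One possible guard: treat the difference as signed first and
 * refuse negative distances before comparing. */
bool guarded_at_least(uint32_t a, uint32_t b, uint32_t some_number)
{
    int32_t diff = (int32_t)(a - b);

    return diff >= 0 && (uint32_t)diff >= some_number;
}
```

For a = 0x10, b = 0x11 and some_number = 1000 the naive check is true (0x10 - 0x11 wraps to 0xffffffff), while the guarded one is false. For a legitimately wrapped pair such as a = 0x55, b = 0xfffffff0 both agree that the distance is 0x65.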

Here is sample code from drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c that may contain such an error.

The handle_urg_ptr function contains this code:

Code:

if (skb && tp->copied_seq - ULP_SKB_CB(skb)->seq >= skb->len)

It involves three values that look like they can be influenced: copied_seq, seq and skb->len.

In fact, they cannot all be: due to the peculiarities of the device, you cannot influence ULP_SKB_CB(skb)->seq. If you could, it would be possible to choose values of seq, copied_seq and skb->len for which the socket buffer is freed while the driver code carries on, processing the packet from already freed memory.

Next, consider another example from the same chtls_cm.c file.

The chtls_recv_data function contains this code:

Code:

if (unlikely(hdr->urg))

Note the call handle_urg_ptr(sk, tp->rcv_nxt + ntohs(hdr->urg)); and in particular its second argument, the sum of two values: tp->rcv_nxt + ntohs(hdr->urg).

rcv_nxt can be predicted, and hdr->urg can be set in the packet. So the sender gets to pass some_number to handle_urg_ptr for further processing. But this is not as critical as the following code:

Code:

if (unlikely(tp->urg_data == TCP_URG_NOTYET &&

Here again we see tp->urg_seq - tp->rcv_nxt < skb->len: two numbers that the sender of the packets can affect are compared with the size of the socket buffer data, which the sender also controls (through the amount of data sent). And then comes skb->data[tp->urg_seq - tp->rcv_nxt];

Again the index is built from two remotely set values, tp->urg_seq - tp->rcv_nxt. So which element of the data does it refer to? The only guard is:

Code:

tp->urg_seq - tp->rcv_nxt < skb->len

This results in a memory read at the chosen index, or in a system freeze.
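For comparison, here is how an index derived from two sequence numbers can be validated against the number of bytes that are actually present in the buffer. This is a sketch only: urg_byte and its parameters are hypothetical, and linear_len stands for whatever length the buffer really holds, which is an assumption of this example and not a statement about the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Return a pointer to the byte with sequence number urg_seq in a
 * buffer whose first byte has sequence number rcv_nxt, or NULL if
 * that byte is not inside the linear_len bytes actually present. */
const uint8_t *urg_byte(const uint8_t *data, size_t linear_len,
                        uint32_t urg_seq, uint32_t rcv_nxt)
{
    /* Wraps to a huge value if urg_seq is behind rcv_nxt,
     * which the range check below then rejects. */
    uint32_t off = urg_seq - rcv_nxt;

    if (off >= linear_len)
        return NULL;
    return data + off;
}
```

The check is only as good as the length it is compared against: if the length is itself chosen by the sender, or describes more data than the buffer holds, the index still lands outside the buffer.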

Here is another example.

This time the packet data is processed on the network device itself (by what exactly, I cannot make out). It is reasonable to assume that the inside of the device has the same problems as a large number of other network devices (and programs): problems with sequence number checks in packet protocols.

In fact, it is possible to test it as a "black box", provided that the tests cover different conditions on the packets' sequence numbers.
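Which sequence number conditions are worth covering can be written down as a list of boundary values. The generator below is a sketch: boundary_seqs is a hypothetical helper, and the choice of candidates is my assumption about what is worth trying, not an exhaustive list.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill out with up to cap candidate seq values for a segment of
 * len payload bytes, given the receiver's expected rcv_nxt.
 * Returns the number of values written. */
size_t boundary_seqs(uint32_t rcv_nxt, uint32_t len,
                     uint32_t *out, size_t cap)
{
    const uint32_t cand[] = {
        rcv_nxt - 1u,           /* one byte already received */
        rcv_nxt,                /* exactly in order */
        rcv_nxt + 1u,           /* one byte ahead */
        0u - len,               /* end_seq lands exactly on 0 */
        0u - len + 1u,          /* end_seq wraps to 1 */
        UINT32_MAX,             /* last value before the wrap */
        rcv_nxt + 0x7fffffffu,  /* farthest value still "after" */
        rcv_nxt + 0x80000000u,  /* ambiguous half-range distance */
    };
    size_t n = sizeof(cand) / sizeof(cand[0]);

    if (n > cap)
        n = cap;
    for (size_t i = 0; i < n; i++)
        out[i] = cand[i];
    return n;
}
```

Each value would then be combined with different payload sizes and urgent offsets in the test harness.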

When a device processes TCP packets in hardware, it is assumed that the client/server interaction follows the standard scenario, in which everything is left to the network stack and all the data carries the sequence numbers (timestamps, packet sizes, etc.) that the network subsystem assigned. But if the packets are generated outside that "system" (by hand, which is not usually done), there is a huge chance that something will go wrong. After all, when a programmer tests a network application (or hardware), he does not step outside the standard testing scheme. He takes an application, or writes one himself, creates a socket and sends data. How that data is sent does not concern the programmer; delivery is left entirely to the OS. This is the problem.

An error like the array index above cannot be found with the usual testing scheme, by sending arbitrary data through a socket.

And when a network device is tested this way, the device may, after an error (or even without one) in its own data processing, send the driver incorrect data that, in theory, could never have appeared. (But once you keep the problems with sequence number checks in mind, the probability of detecting such an error rises sharply.)

Here is an example from the chtls_main.c file.

The chtls_recv function works with a packet that has already been processed by the network device:

Code:

void chtls_recv(struct chtls_dev *cdev,

After the device has processed the packet, the network card passes the driver the opcode of the operation that the system should perform.

opcode = *(u8 *)rsp; holds the opcode of the operation that the system will execute.

But if you send the device packets that pass all the seq checks and add urg to them, the device will hand over an opcode that should never have been issued at all.

In this example

Code:

ret = chtls_handlers[opcode](cdev, skb);

The function at address NULL will be called, which will lead to the system freezing.
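The general shape of the problem, and one way to defend against it, can be shown with a sparse handler table. Everything here is hypothetical and written for illustration (handler_fn, handlers, dispatch, the opcode 0x10); it is not the driver's table.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_OPCODES 256  /* one slot per possible 8-bit opcode */

typedef int (*handler_fn)(void *dev, void *msg);

/* Example handler for the single opcode registered below. */
static int handle_data(void *dev, void *msg)
{
    (void)dev;
    (void)msg;
    return 0;
}

/* Sparse table: only opcode 0x10 has a handler, the rest are NULL. */
static handler_fn handlers[NUM_OPCODES] = { [0x10] = handle_data };

/* Dispatch with a NULL check: an opcode without a registered
 * handler is rejected instead of being called through. */
int dispatch(uint8_t opcode, void *dev, void *msg)
{
    handler_fn fn = handlers[opcode];

    if (fn == NULL)
        return -1;
    return fn(dev, msg);
}
```

Without the NULL check, any opcode that the device "should never" produce turns into a call through a null function pointer.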

And such mistakes go unnoticed for YEARS.

I should add that I described the problem using TCP and the urg flag as the example. But exactly the same problems can be found in the reassembly of UDP packets, and everywhere else: in drivers, in application-level programs, in network devices. I don't know how many network devices are affected by this problem. I suspect it is quite a few.

I checked only one device (the one I had on hand). The result: a remotely triggered null pointer call in the kernel, i.e. denial of service (plus access to system memory within certain ranges). I wrote to the manufacturer about the problems, but they were not interested; they ignored the report and released a new flagship with the same problems.

That's all for now.
