
Frame by Frame, Kernel Streaming Keeps Giving Vulnerabilities

Angelboy 2025-05-17

This post is part of a research series on the Kernel Streaming attack surface. It is recommended to read the earlier articles in the series first.

Welcome to Part III of my series on streaming vulnerabilities in the Windows kernel. This research was also presented at OffensiveCon 2025.

Over the past year, we uncovered an overlooked bug class called Proxying to Kernel. It has severe consequences and makes exploitation of the Windows kernel straightforward. However, this is just the tip of the iceberg for Kernel Streaming.

After discovering several vulnerabilities — including those related to the Proxying series — in Kernel Streaming, we decided to dive deeper into its internals. Between late 2023 and the end of 2024, we identified over 20 vulnerabilities. Approximately 14 of them are related to AVStream, with most occurring during frame handling. In this post, I’ll focus on these frame-related issues.

Let’s start with the Kernel Streaming frame.

Brief overview of Kernel Streaming Frame

In Kernel Streaming, when reading data from a device, the framework allocates a KS frame to carry streaming data such as video or audio.

struct _KSPFRAME_HEADER
{
  _LIST_ENTRY ListEntry;
  _KSPFRAME_HEADER *NextFrameHeaderInIrp;
  void *Queue;
  _IRP *OriginalIrp;
  _MDL *Mdl;
  _IRP *Irp;
  KSPIRP_FRAMING_ *IrpFraming;
  KSSTREAM_HEADER *StreamHeader;
  void *FrameBuffer;
  KSPMAPPINGS_TABLE *MappingsTable;
  unsigned int StreamHeaderSize;
  unsigned int FrameBufferSize;
  void *Context;
  int RefCount;
  void *OriginalData;
  void *BufferedData;
  int Status;
  unsigned __int8 DismissalCall;
  _KSPFRAME_HEADER_TYPE Type;
  _KSPSTREAM_POINTER *FrameHolder;
  unsigned int OriginalOptionsFlags;
  _KSPMDLCACHED_STREAM_POINTER *MdlCaching;
};

The frame buffer inside the KS frame stores the actual image or audio data. Most frame buffers are described by a Memory Descriptor List (MDL) that maps their physical memory. If you’re not familiar with what an MDL is, don’t worry — here’s a quick overview.

MDL

MDL (Memory Descriptor List) is a kernel-mode structure used in Windows to describe the physical pages backing a virtual memory buffer. It allows kernel components and drivers to perform direct memory access (DMA) and safely share buffers across different contexts. MDLs are widely used throughout the Windows kernel, commonly in conjunction with IRPs during Direct I/O, as well as in file system and network drivers during data transfer operations.

The MDL (Memory Descriptor List) structure is defined as follows:

typedef struct _MDL {
  struct _MDL      *Next;
  CSHORT           Size;
  CSHORT           MdlFlags;
  struct _EPROCESS *Process;
  PVOID            MappedSystemVa;
  PVOID            StartVa;
  ULONG            ByteCount;
  ULONG            ByteOffset;
  ULONG64          PFN[];  // Variable-length array of page frame numbers

} MDL, *PMDL;

This is a variable-sized structure: the PFN (Page Frame Number) array is stored at the end of the MDL. Each PFN identifies a physical page backing the virtual buffer described by the MDL.

In Kernel Streaming, an MDL describes a buffer that is mapped into user space and kernel space, and both mappings refer to the same physical memory.

As a result, when data is read from a device, it is written to both the user-mode and kernel-mode buffers at the same time.

Let’s take a quick look at how MDLs are typically used.

Basic Usage of MDL

When the kernel needs to access user-mode memory, especially at an elevated IRQL such as DISPATCH_LEVEL or within a DPC, it often relies on an MDL to safely describe and lock that memory. Typically, this process invokes the following set of APIs.

IoAllocateMdl

First, the kernel calls IoAllocateMdl to allocate an MDL structure and initialize it to describe a buffer based on the provided virtual address and length. However, it does not initialize the PFN (Page Frame Number) array in the MDL.

MmProbeAndLockPages

Next, the kernel calls MmProbeAndLockPages to lock the physical pages corresponding to the virtual address range, and to populate the PFN (Page Frame Number) array inside the MDL.

MmMapLockedPagesSpecifyCache

Once the kernel needs to access the memory, it calls MmMapLockedPagesSpecifyCache to map a new virtual address using the PFNs stored in the MDL.

By the way, it’s also possible to map kernel buffers into user space using this API.

MmUnlockPages/IoFreeMdl

After the kernel has finished using the buffer mapped through the MDL, it must call MmUnlockPages to release the locked physical pages. Finally, the MDL itself should be freed using IoFreeMdl.

For the purposes of this post, it’s enough to understand that Kernel Streaming uses MDLs to manage frame buffers shared between user space and kernel space.

If you’re interested in more details about MDL, here are some helpful references:

Next, let’s take a look at how a typical application reads data from a webcam — and how Kernel Streaming implements this functionality under the hood.

How to Read Streams from webcam

Here is a simplified overview of the workflow for reading a video stream from a webcam using Kernel Streaming:

Open the device to obtain a handle to the webcam device.
Use this device handle to create an instance of the Pin on this filter and obtain the Pin handle.
Set the Pin’s state to RUN using IOCTL_KS_PROPERTY. When the Pin enters the RUN state, the webcam’s indicator light usually turns on, indicating that the device is active and ready to stream.
Finally, you can use IOCTL_KS_READ_STREAM to read data from this Pin. When sending the IOCTL to read the stream, we need to provide a KSSTREAM_HEADER structure as input to specify the necessary information.

typedef struct {
  ULONG    Size;
  ULONG    TypeSpecificFlags;
  KSTIME   PresentationTime;
  LONGLONG Duration;
  ULONG    FrameExtent; // buffer size
  ULONG    DataUsed;
  PVOID    Data; // points to the image buffer
  ULONG    OptionsFlags;
  ULONG    Reserved;
} KSSTREAM_HEADER, *PKSSTREAM_HEADER;

The kernel will use this structure to copy data from the device into memory. The most important fields are Data, which points to your user-space buffer, and FrameExtent, which indicates the size of the buffer. Kernel Streaming will map a frame buffer based on these values and write the image data into the memory region you provided. Optionally, you can also use the OptionsFlags field to describe the attributes of the frame.

Stream Reading in Kernel Streaming

Let’s briefly look at how ks.sys implements reading a frame.

First, a buffer must be allocated in user space to store the incoming image data. Then, a KSSTREAM_HEADER structure is prepared, containing the buffer’s address and size, and passed to the kernel via an IOCTL_KS_READ_STREAM. When this IOCTL is sent to the webcam device, it is handled by ksthunk.sys and ks.sys. If the request does not originate from a WoW64 process, it will be passed to ks.sys for further processing.

Once ks.sys receives the request, it parses the KSSTREAM_HEADER, creates an MDL based on the provided buffer and size, and inserts it into the IRP. The user-space buffer is then mapped into kernel space as a frame buffer through this MDL. At this point, both the user buffer and the frame buffer point to the same physical memory, enabling efficient zero-copy data transfers between user space and kernel space.

Finally, ks.sys allocates a KS Frame (_KSPFRAME_HEADER) in kernel. This structure contains the associated MDL, a pointer to the frame buffer, the buffer size, and other metadata used for managing the streaming operation.

The KS FRAME is then placed into an internal queue, where it waits to be filled with data. Next, the Kernel Streaming worker thread dequeues a KS FRAME and begins capturing image data from the device into the associated frame buffer. Any remaining KS FRAME structures in the queue will be processed one by one in the order they were enqueued.

By the way, it’s also possible to submit multiple KSSTREAM_HEADER structures in a single IOCTL call to request multiple frames. In that case, ks.sys will process each frame request in order, based on the KSSTREAM_HEADER array provided in the input buffer. Each frame has a one-to-one mapping with a KSSTREAM_HEADER, an MDL, and a KS FRAME.
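To make the multi-frame request concrete, here is a minimal sketch of how a caller might describe several user buffers with one header each. The `model_stream_header` and `model_kstime` types are local mirrors of the 64-bit layout shown earlier so the sketch compiles without the Windows SDK, and `fill_stream_headers` is a hypothetical helper name; real code would use the definitions from ks.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: local mirror of KSTIME, only for this sketch. */
typedef struct {
    int64_t  Time;
    uint32_t Numerator;
    uint32_t Denominator;
} model_kstime;

/* Assumption: local mirror of the 64-bit KSSTREAM_HEADER layout. */
typedef struct {
    uint32_t     Size;
    uint32_t     TypeSpecificFlags;
    model_kstime PresentationTime;
    int64_t      Duration;
    uint32_t     FrameExtent;   /* size of the user buffer */
    uint32_t     DataUsed;
    uint64_t     Data;          /* address of the user buffer */
    uint32_t     OptionsFlags;
    uint32_t     Reserved;
} model_stream_header;

/* Fill one header per requested frame; headers[i] describes buffers[i]. */
static void fill_stream_headers(model_stream_header *headers,
                                void *const *buffers,
                                const uint32_t *sizes,
                                size_t count)
{
    assert(headers != NULL && buffers != NULL && sizes != NULL);
    memset(headers, 0, count * sizeof(*headers));
    for (size_t i = 0; i < count; i++) {
        headers[i].Size        = (uint32_t)sizeof(headers[i]);
        headers[i].FrameExtent = sizes[i];
        headers[i].Data        = (uint64_t)(uintptr_t)buffers[i];
    }
}
```

The resulting array is what would be handed to IOCTL_KS_READ_STREAM in a single call. The order of the entries matters, because ks.sys walks them in sequence.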

With the basics of the architecture and frame reading in place, we can now examine things from an attacker’s point of view.

From Attacker’s Perspective

So, where should we focus our attention?

The first and most intuitive target is the transition between ksthunk.sys and ks.sys. When 32-bit requests are converted to 64-bit, improper handling of user-controlled KSSTREAM_HEADER structures may lead to memory corruption; CVE-2024-38054 is one such case. This transition layer can also introduce inconsistency issues.

Another interesting target is how ks.sys manages frame buffers. If MDLs are misused during frame buffer handling, it can result in various forms of memory corruption. We’ll examine some examples of these issues later.

In the course of our research on Kernel Streaming, we identified several new bug classes worth highlighting.

New Bug Classes in Kernel Streaming

The first bug class we identified is MDL mismatch.

MDL Mismatch

When ksthunk.sys receives a 32-bit request, it not only converts the request to its 64-bit equivalent, but also pre-allocates an MDL to describe the frame buffer.

As illustrated in the diagram, when a 32-bit request is issued, ksthunk.sys is the first to handle it. During this step, it sets up the MDL and performs the mapping for the frame buffer.

Once ksthunk.sys completes its preprocessing, it passes the IRP to ks.sys for further handling. Since the MDL has already been created by ksthunk.sys, ks.sys will not allocate a new one. At this point, a KS FRAME is allocated to represent the frame within the Kernel Streaming framework.

Moreover, if multiple frames are requested in a single call, ksthunk.sys will pre-allocate all the necessary MDLs and perform the corresponding frame buffer mappings.

However, if the OptionsFlags field is set to KSSTREAM_HEADER_OPTIONSF_PERSIST_SAMPLE (0x8000), ksthunk.sys will skip the normal MDL allocation process. This flag is actually part of Kernel Streaming’s MDL caching mechanism. While we won’t go into the full details here, it’s important to understand that enabling this flag causes ksthunk to skip MDL allocation for that frame.

Additionally, since each frame is handled independently, it’s possible to intentionally mark only one of the submitted frames as caching by setting the KSSTREAM_HEADER_OPTIONSF_PERSIST_SAMPLE flag on that specific frame when submitting multiple frames in a single request.

Let me give you an example:

Suppose we submit two frames, with the second frame marked as caching.

ksthunk.sys will check the OptionsFlags field for each frame. If the cache flag is not set, it allocates an MDL and maps the frame buffer accordingly. Since the second frame has the cache flag set, ksthunk.sys will skip MDL allocation for that frame.

After that, the IRP is passed down to ks.sys, which will once again inspect the OptionsFlags field for each frame. However, the logic here is reversed compared to ksthunk.sys.

For the first frame — because it doesn’t have the cache flag — ks.sys assumes the MDL has already been allocated by ksthunk, and therefore skips MDL allocation.
For the second frame, since the cache flag is set, ks.sys will allocate a new MDL and map the frame buffer.

ks.sys then creates a KS FRAME for each KSSTREAM_HEADER entry, in order. Each KS FRAME is paired one-to-one with its corresponding MDL, and the frames are placed into an internal queue, waiting to be pulled and processed by the worker thread.

But… is it really safe?

There seems to be something inconsistent. Let’s abuse the MDL chain!

Suppose we submit two frames:

For the first frame, we set the buffer size to 0x1000 and enable the cache flag.
For the second frame, we set the buffer size to 0x20000, but do not set the cache flag.

ksthunk.sys checks each stream header as usual. For the first frame, since the cache flag is set, it skips MDL allocation. For the second frame, since the cache flag is not set, ksthunk allocates a new MDL and maps the frame buffer accordingly.

After that, the IRP is passed down to ks.sys, which once again inspects the OptionsFlags field for each frame.

For the first frame, since the cache flag is set, ks.sys will allocate a new MDL, map the frame buffer, and insert it into the MDL chain.
For the second frame, the cache flag is not set, so ks.sys assumes the MDL has already been allocated by ksthunk, and therefore skips the allocation.

Finally, ks.sys creates the KS FRAMEs by walking the MDL chain and the KSSTREAM_HEADER entries in order. The FrameExtent field from each header is stored in the associated KS FRAME, defining the expected frame size.

As shown in the diagram above, the first frame will have a size of 0x1000 stored, while the second frame will have 0x20000 stored.

Do you notice the problem? After we run it …

Why?

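The root cause of this issue is a mismatch between each KSSTREAM_HEADER and its corresponding MDL. ksthunk.sys runs first, so the MDL for the second frame enters the chain before ks.sys adds the MDL for the first frame, while the KS FRAMEs are created in header order. As a result, the first KSSTREAM_HEADER gets paired with the MDL of the second frame, while the second KSSTREAM_HEADER ends up linked to the MDL of the first frame.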

What’s the actual impact?

When the worker copies data from the device, it relies on the buffer address and size stored in each KS FRAME to perform the copy operation. Both frames are treated the same — the worker refers to the KS FRAME structure to determine where and how much data to copy. However, here lies the problem…

For the second KS FRAME, the actual allocated buffer is only 0x1000 bytes, but the FrameExtent field in the structure indicates a size of 0x20000. As a result, the worker attempts to copy 0x20000 bytes into a much smaller buffer, leading to a buffer overflow.
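The two-pass behavior can be reproduced with a small user-mode model. This is only a sketch of the logic described above, not the real driver code: the types and the function name are hypothetical, and the model assumes every newly allocated MDL is appended to the tail of the chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PERSIST_SAMPLE 0x8000u  /* KSSTREAM_HEADER_OPTIONSF_PERSIST_SAMPLE */
#define MAX_FRAMES     8

typedef struct {
    uint32_t frame_extent;
    uint32_t options_flags;
} model_header;

/* Returns the index of the first frame whose FrameExtent is larger than
   the MDL it ends up paired with, or -1 if every pairing is consistent. */
static int first_mismatched_frame(const model_header *hdrs, size_t count)
{
    uint32_t chain[MAX_FRAMES];   /* byte count of each MDL, in chain order */
    size_t chain_len = 0;
    size_t i;

    assert(count <= MAX_FRAMES);

    /* Pass 1 (ksthunk.sys): allocate MDLs for frames WITHOUT the flag. */
    for (i = 0; i < count; i++)
        if ((hdrs[i].options_flags & PERSIST_SAMPLE) == 0)
            chain[chain_len++] = hdrs[i].frame_extent;

    /* Pass 2 (ks.sys): allocate MDLs for frames WITH the flag. */
    for (i = 0; i < count; i++)
        if ((hdrs[i].options_flags & PERSIST_SAMPLE) != 0)
            chain[chain_len++] = hdrs[i].frame_extent;

    /* Pairing: header i is matched with the i-th MDL in the chain. */
    for (i = 0; i < count; i++)
        if (hdrs[i].frame_extent > chain[i])
            return (int)i;
    return -1;
}
```

With the flag on the second frame only, both passes append in header order and nothing is mismatched. With the flag on the first frame (0x1000 bytes) and not on the second (0x20000 bytes), the model reports frame index 1: a 0x20000-byte copy into a 0x1000-byte mapping.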

In fact, several of the vulnerabilities we discovered stem from this exact issue. As long as an attacker can create a mismatch between a KSSTREAM_HEADER and its corresponding MDL, the result is a buffer overflow.

CVE-2024-38237
CVE-2025-21375
…

The second bug class we’re going to discuss is called The Forgotten Lock in MDL, a vulnerability pattern involving incorrect handling of MDLs.

This bug class is a bit more unusual.

The Forgotten Lock

At its core, it is an uninitialized-memory issue in the MDL.

Before we discuss this issue, let’s first look at some common mistakes developers make when working with MDLs.

Security Risks of MDL

The first one is an issue that has become common recently, and one that I also mentioned in a previous post.

Incorrect access mode flag in MmProbeAndLockPages

When the kernel calls MmProbeAndLockPages to lock a user-supplied memory buffer, it may incorrectly set the access mode flag. This mistake causes the kernel to skip the check that verifies whether the target address belongs to user space. As a result, a user-mode process could supply a kernel-mode address, leading to arbitrary memory writes in kernel space.

For more details, please refer to Synacktiv’s presentation at HITB 2023 HKT and Nicolas Zilio (@Big5_sec)’s blog post.
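The effect of the access mode can be illustrated with a tiny model of the range check. This is a sketch and not the kernel's implementation: `model_probe_accepts` is a hypothetical name, and the probe limit below is an assumed typical x64 value of MmUserProbeAddress used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: typical x64 value of MmUserProbeAddress. */
#define MODEL_USER_PROBE_ADDRESS 0x00007FFFFFFF0000ull

/* Same numeric values as KernelMode and UserMode in the WDK MODE enum. */
typedef enum {
    MODEL_KERNEL_MODE = 0,
    MODEL_USER_MODE   = 1
} model_access_mode;

/* Returns true if the modelled probe would accept [address, address+length). */
static bool model_probe_accepts(uint64_t address, uint64_t length,
                                model_access_mode mode)
{
    if (mode == MODEL_KERNEL_MODE)
        return true;  /* trusted caller: the user-range check is skipped */
    return length <= MODEL_USER_PROBE_ADDRESS &&
           address <= MODEL_USER_PROBE_ADDRESS - length;
}
```

In this model, a kernel address supplied by a user process is rejected when the driver passes the user access mode, and silently accepted when it passes the kernel access mode by mistake.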

Double Free in I/O Complete

Another common issue occurs when a kernel driver frees an MDL without clearing the corresponding MDL pointer in the IRP. Later, when the IRP is completed, the system attempts to free the MDL again, resulting in a double free vulnerability during IoCompleteRequest. This pattern can also be found in Kernel Streaming (CVE-2025-24046).

When frame allocation fails, ks.sys releases the MDLs in the MDL chain, but it does not clear the MDL pointer stored in the IRP. As a result, the MDL is freed again when the IRP completes — leading to a double free.
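The pattern is easy to see in a reduced model. The sketch below uses hypothetical stand-in types and counts frees instead of actually releasing memory, so it is safe to run; the "fixed" variant shows the general remedy of clearing the pointer, and is not a claim about how the actual patch is written.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int free_count; } model_mdl;
typedef struct { model_mdl *mdl_address; } model_irp;

/* Stand-in for IoFreeMdl: counts the call instead of freeing memory. */
static void model_free_mdl(model_mdl *mdl) { mdl->free_count++; }

/* Error path as described above: the MDL is freed, the IRP still points to it. */
static void cleanup_buggy(model_irp *irp)
{
    model_free_mdl(irp->mdl_address);
}

/* Same error path, but the stale pointer is cleared afterwards. */
static void cleanup_fixed(model_irp *irp)
{
    model_free_mdl(irp->mdl_address);
    irp->mdl_address = NULL;
}

/* Stand-in for IRP completion: frees whatever MDL is still attached. */
static void model_complete_request(model_irp *irp)
{
    if (irp->mdl_address != NULL) {
        model_free_mdl(irp->mdl_address);
        irp->mdl_address = NULL;
    }
}

/* Returns how many times the MDL was freed after a failed frame allocation. */
static int frees_after_failure(int use_fixed_cleanup)
{
    model_mdl mdl = { 0 };
    model_irp irp = { &mdl };

    if (use_fixed_cleanup)
        cleanup_fixed(&irp);
    else
        cleanup_buggy(&irp);
    model_complete_request(&irp);
    return mdl.free_count;
}
```

Here `frees_after_failure(0)` reports two frees of the same MDL, while `frees_after_failure(1)` reports one.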

These two bug patterns are quite common, but there are many more overlooked issues out there.

Let’s take an example from Microsoft’s driver security guidance.

In this document, Microsoft warns that if developers use MmMapIoSpace without properly validating the physical address, it could result in arbitrary physical memory being mapped into virtual address space — potentially leading to serious security issues.

To illustrate safe usage, Microsoft provides the following secure coding example:

Func ConstrainedMap(PHYSICAL_ADDRESS paAddress)
{
    // expected_Address must be constrained to required usage boundary to prevent abuse
    if(paAddress == expected_Address && qwSize == valid_Size)  //-----[1]
    {
        lpAddress = MmMapIoSpace(paAddress, qwSize, ...);   
        pMdl = IoAllocateMdl( lpAddress, ...); //----------[2]
        MmMapLockedPagesSpecifyCache(pMdl, UserMode, ... ); //-------------[3]
    }
    else
    {
        return error;
    }
}

First, the physical address is validated at [1]. Then, at [2], an MDL is allocated to describe the mapped memory region. Finally, [3] calls MmMapLockedPagesSpecifyCache to map the physical memory into a user-space virtual address.

Now… you might notice something strange here.

As we mentioned earlier, in typical usage, after allocating an MDL, you are expected to call MmProbeAndLockPages to lock the underlying physical pages. However, in this case, the code calls MmMapLockedPagesSpecifyCache directly, without locking the pages first. This results in undefined behavior, as the MDL may not correctly describe valid or accessible physical memory.

As shown in the diagram above, IoAllocateMdl is used to allocate the MDL structure and initialize some basic metadata. However, if we immediately call MmMapLockedPagesSpecifyCache without first locking the pages, the function will still attempt to access the PFN array inside the MDL. This can lead to undefined behavior, or worse, controlled memory corruption. In many cases, this leads directly to a BSoD.

However, this kind of mistake is widespread throughout Kernel Streaming. In the following section, I will examine CVE-2024-38238, which clearly demonstrates this issue in practice.

CVE-2024-38238

We once again construct two KSSTREAM_HEADER structures — and this time, both frames are of the same size. The first frame has the cache flag set, while the second frame does not.

As mentioned earlier, ksthunk.sys will allocate and lock an MDL only for the frame that does not have the cache flag set. Once that’s done, the IRP is passed down to ks.sys for further processing.

Now, let’s take a closer look at how ks.sys handles this frame.

__int64 CKsMdlcache::MdlCacheHandleThunkBufferIrp(...)
{
  ...
  while (TotalSize >= sizeof(KSSTREAM_HEADER)) { //-------[4]
      ...
      if ((OptionsFlags & 0x8000) == 0) //-------[5]
          return KsProbeStreamIrp(irp, a3, 0); //-------[8]
      IoAllocateMdl(header->Data, header->FrameExtent, ..., irp); //-------[6]
  }
  ...
  for (i = irp->MdlAddress; i; i = i->Next) {
      MmProbeAndLockPages(i, irp->RequestorMode, IoWriteAccess); //-------[7]
  }
}

Looking at the while loop in ks!CKsMdlcache::MdlCacheHandleThunkBufferIrp at [4], we can see that it iterates through each KSSTREAM_HEADER and checks the OptionsFlags at [5] to determine whether an MDL should be allocated.

If the cache flag is set, it proceeds to allocate a new MDL at [6]. Under WOW64, if the MDL was already allocated (e.g., by ksthunk), KS will then call MmProbeAndLockPages at [7] to lock the memory pages.

However, in our specific case:

The first frame has the cache flag set.
The second frame does not.

So, when KS begins processing the second frame, it takes the path to KsProbeStreamIrp at [8]. At this point, the MDL chain inside the IRP looks like this:

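The first MDL (created by ksthunk.sys for the second frame) has already been properly locked, but the second one (allocated at [6] for the first frame) is not locked at all, because the function returned at [8] before reaching the locking loop at [7].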

After that, ks!KsProbeStreamIrp handles the mapping of the frame buffers:


NTSTATUS KsProbeStreamIrp(PIRP Irp, ULONG ProbeFlags, ULONG HeaderSize){
    ...
    MDL = Irp->MdlAddress;
    if ( (MDL->MdlFlags & is_locked_and_nonpaged) != 0 ) { //----[9]
        while ( MDL )
        {
            // 5 == MDL_MAPPED_TO_SYSTEM_VA | MDL_SOURCE_IS_NONPAGED_POOL
            if ( (MDL->MdlFlags & 5) != 0 )
                MappedSystemVa = MDL->MappedSystemVa;
            else
                MappedSystemVa = MmMapLockedPagesSpecifyCache(MDL, 0, MmCached, 0LL, 0, 0x40000010u);

            MDL = MDL->Next;
        }
    }
}

As shown above, the function uses MmMapLockedPagesSpecifyCache to map the frame buffer described by each MDL. If the MDL is marked as locked, the function maps it directly. However, there’s a critical flaw: it only checks the first MDL in the MDL chain at [9], and assumes that the entire chain has already been locked.

When MmMapLockedPagesSpecifyCache is called on the second MDL, it attempts to map memory based on an uninitialized PFN list.
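The difference between checking only the head of the chain and checking every MDL can be shown with a small model. This is a sketch under assumptions: the struct and function names are hypothetical, only the MDL_PAGES_LOCKED bit is modelled, and the per-MDL variant is one possible stricter check, not necessarily how the issue was fixed.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MDL_PAGES_LOCKED 0x0002u  /* same value as MDL_PAGES_LOCKED */

typedef struct model_mdl {
    struct model_mdl *next;
    unsigned flags;
} model_mdl;

/* Head-only check, as in the flawed logic: returns how many MDLs would be
   mapped although their pages were never locked. If the head itself is not
   locked, this model maps nothing (the real code takes a path not shown). */
static int unlocked_maps_with_head_check(const model_mdl *head)
{
    int unlocked = 0;

    if (head == NULL || (head->flags & MODEL_MDL_PAGES_LOCKED) == 0)
        return 0;
    for (const model_mdl *m = head; m != NULL; m = m->next)
        if ((m->flags & MODEL_MDL_PAGES_LOCKED) == 0)
            unlocked++;  /* PFN array was never populated */
    return unlocked;
}

/* Per-MDL check: refuses the chain at the first unlocked MDL.
   Returns the number of MDLs mapped, or -1 if the chain is rejected. */
static int map_chain_with_per_mdl_check(const model_mdl *head)
{
    int mapped = 0;

    for (const model_mdl *m = head; m != NULL; m = m->next) {
        if ((m->flags & MODEL_MDL_PAGES_LOCKED) == 0)
            return -1;
        mapped++;
    }
    return mapped;
}
```

For the chain in our case (a locked MDL followed by an unlocked one), the head-only check lets one unlocked MDL through, while the per-MDL check rejects the chain.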

Unexploitable?

The good news is that IoAllocateMdl allocates memory from NonPagedPoolNx without zero-initializing it. This means the PFN array located at the end of the MDL structure will contain leftover memory.

As shown above, when IoAllocateMdl allocates memory, it uses the POOL_FLAG_UNINITIALIZED flag, and does not initialize the PFN array in the MDL. This behavior allows us to apply pool spraying techniques to gain partial or full control over the PFN values inside the MDL.

By calculating the exact size of the MDL structure — including the number of PFNs based on the frame size — we can perform a pool spray using Named Pipes to populate NonPagedPoolNx memory with carefully crafted data.
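The size calculation can be sketched as follows. The page-count arithmetic mirrors the WDK macro ADDRESS_AND_SIZE_TO_SPAN_PAGES; the 0x30-byte fixed header and 8-byte PFN entries are assumptions for x64, the function name is hypothetical, and the result excludes the pool header.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT       12u                  /* assumption: 4 KiB pages */
#define PAGE_SIZE        (1ull << PAGE_SHIFT)
#define MDL_HEADER_SIZE  0x30ull              /* assumption: sizeof(MDL) on x64 */
#define PFN_ENTRY_SIZE   8ull                 /* assumption: 64-bit PFN entries */

/* Bytes needed for an MDL describing [va, va + length): the fixed header
   followed by one PFN entry per page the buffer touches. */
static uint64_t mdl_alloc_size(uint64_t va, uint64_t length)
{
    uint64_t pages = ((va & (PAGE_SIZE - 1)) + length + PAGE_SIZE - 1)
                     >> PAGE_SHIFT;
    return MDL_HEADER_SIZE + pages * PFN_ENTRY_SIZE;
}
```

Under these assumptions, a page-aligned 0x20000-byte frame buffer spans 32 pages and needs a 0x130-byte MDL, which is the size the sprayed objects would have to match.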

When IoAllocateMdl reuses this memory without zero-initialization, the leftover values will be interpreted as valid PFNs, giving the attacker control over physical-to-virtual mappings.

As shown above, when MmMapLockedPagesSpecifyCache is called afterward, it treats the attacker-controlled PFNs as valid physical page mappings and uses them to map the frame buffer.

Finally, when the worker thread copies image data from the device, it writes directly to the physical addresses specified by the attacker, resulting in a powerful arbitrary physical memory write primitive.

Actually, not all PFNs can be mapped — they must be valid, such as ResidentPage. But for our purposes, that’s more than enough.

The next step is to achieve elevation of privilege (EoP) using the arbitrary physical memory write primitive. But that raises the question:

Where should we write?

During testing on several Windows 24H2 systems, we observed a consistent behavior: the physical base address of ntoskrnl.exe was typically fixed at 0x100400000.

We tested it on Hyper-V and VMware. The value might have changed in newer builds, but it’s still likely to remain fixed in many cases. This behavior may also depend on the device or hardware configuration.

So… does that mean we can just write directly to nt and take over the kernel?

There is a problem…

We cannot control the data being written, because it comes directly from the webcam device.

Initially, it seemed like we were stuck. But with a primitive this powerful — stable and repeatable arbitrary physical memory writes — we knew there had to be a way forward.

So we went back, carefully reviewed the entire Kernel Streaming workflow, and eventually discovered a new angle of attack.

Buffered

Kernel Streaming offers a feature called buffered mode. When a KS FRAME is created with the buffered flag (KSSTREAM_HEADER_OPTIONSF_BUFFEREDTRANSFER) set, ks.sys allocates an additional intermediate buffer in kernel space.

During the streaming process, the contents from the original image buffer are first copied into this intermediate buffer.

As shown in diagram above, after the device finishes writing data — or if an error occurs during the transfer — ks.sys will copy the contents of the buffered memory into the frame buffer. However, in our case, this frame buffer has already been mapped to the physical address of the ntoskrnl.exe image. In other words, we now have an arbitrary physical memory write primitive with fully controlled data. This opens the door to directly modifying kernel code.

In our exploit, we chose to overwrite a security check inside PsOpenProcess. Specifically, we replaced the check for SeDebugPrivilege with SeChangeNotifyPrivilege. As a result, any normal user can open a high-privilege process except PPL. For more details on the technique of replacing the check with SeChangeNotifyPrivilege, you can refer to my previous post.

There are multiple ways to cause this issue in Kernel Streaming:

CVE-2024-38238
CVE-2024-38241
CVE-2025-24066
…

As long as you find a way to make Kernel Streaming forget the lock, the result can be an arbitrary physical memory write.

The last issue we would like to share is Frame Buffer Misalignment.

Frame Buffer Misalignment (CVE-2024-38245)

Before diving into that, we first need to introduce a key object in Kernel Streaming: the KS Allocator. The KS Allocator is responsible for pre-allocating a set of frame buffers that can be reused during streaming operations. This significantly reduces the overhead of dynamic memory allocation at runtime. Typically, an allocator object is associated with a pin, and third-party drivers can also implement their own custom allocator if needed. Kernel Streaming also provides a default allocator for use when no custom implementation is specified.

In general, a KS Allocator can be created using the KsCreateAllocator API, and configured through a structure called KSALLOCATOR_FRAMING. This structure allows you to specify parameters such as the number of frame buffers, the size of each buffer, and even the alignment requirements for each frame buffer.

typedef struct {
  union {
    ULONG OptionsFlags;
    ULONG RequirementsFlags;
  };
#if ...
  POOL_TYPE PoolType;
#else
  ULONG     PoolType;
#endif
  ULONG     Frames;
  ULONG     FrameSize;
  union {
    ULONG FileAlignment;
    LONG  FramePitch;
  };
  ULONG     Reserved;
} KSALLOCATOR_FRAMING, *PKSALLOCATOR_FRAMING;

Note: To specify the alignment of a frame buffer, you must provide an alignment mask during allocator configuration.

After creating a KS Allocator, we can attach it to the pin. Before reading data from the pin, we need to set its state to KSSTATE_RUN.

At that moment, the allocator will pre-allocate the number of frame buffers based on the configuration provided earlier.

From that point on, data is streamed from the device into pre-allocated frame buffers. Corresponding KS FRAME structures are also allocated. When we send an IOCTL_KS_READ_STREAM to read data, the process begins just as described earlier.

However, instead of reading data from the device each time, the worker thread will copy data from the pre-allocated frame buffers managed by the allocator. In the following section, we’ll focus on how the default allocator manages these pre-allocated buffers.

Let’s take a deeper look at DefaultAllocator.

ks!KsCreateDefaultAllocatorEx

When we call KsCreateAllocator, Kernel Streaming creates a default allocator and initializes it using the parameters we provide. Internally, ks.sys implements its own custom allocation routines, DefAllocatorAlloc and DefAllocatorFree, and utilizes a LookasideList to efficiently manage buffer allocations and reuse.

The allocation function is quite simple:

char *__fastcall DefAllocatorAlloc(POOL_TYPE PoolType, SIZE_T NumberOfBytes, ULONG Alignment)
{
    ...
    if ( Alignment >= FILE_OCTA_ALIGNMENT )
        FileAlignment = Alignment;
    ...
    buffer = ExAllocatePoolWithTag((PoolType | 0x400), v8, 'adSK');//-----[10]
    if ( buffer )
    {
        padding = (~FileAlignment & (buffer + FileAlignment + 4)) - buffer;
        buffer += padding;
        *((ULONG *)buffer - 1) = padding; // 4-byte padding size stored just before the frame buffer -------[11]
    }
    return buffer;
}

It simply calls ExAllocatePoolWithTag to allocate memory at [10]. If an alignment is specified, ks.sys records the size of the required padding in front of the frame buffer, as shown at [11].

In the free routine:

void __fastcall DefAllocatorFree(unsigned int *Buffer)
{
  __int64 padding;
  ...
  if ( (Buffer & 0xFFF) != 0 )
    padding = *(Buffer - 1); //---------------[12]
  else
    padding = 0LL;
  ExFreePoolWithTag((char *)Buffer - padding, 0); // step back by the padding, in bytes, to the pool chunk
}

KS uses this padding size, read at [12], to calculate the original pointer returned by ExAllocatePoolWithTag.

As shown in the diagram below, the memory layout of the pool looks like this:

The purple region represents the padding, while the blue region corresponds to the frame buffer itself. The 4 bytes immediately preceding the frame buffer are used to store the padding size. In the normal case, the alignment mask is expected to be a power of two minus one (e.g., 0x3F or 0xFFF).

However, here’s the problem:

KS only checks whether the alignment mask is greater than 0xFFF. If it is not, any value is accepted, even if it is not a valid alignment mask.
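To make the gap concrete, compare a range-only check with a check that the mask really is a power of two minus one. Both function names are hypothetical, and the second function is a sketch of the kind of validation that is missing, not a description of any actual fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The check as described above: only the upper bound is enforced. */
static bool passes_range_check_only(uint32_t mask)
{
    return mask <= 0xFFFu;
}

/* A well-formed alignment mask is a power of two minus one, i.e. all of
   its set bits are contiguous starting at bit 0. For such a value,
   mask + 1 is a power of two and shares no bits with mask. */
static bool is_valid_alignment_mask(uint32_t mask)
{
    return (mask & (mask + 1u)) == 0u;
}
```

A mask such as 0x17 passes the range check but is not a valid alignment mask, while 0x3F and 0xFFF satisfy both.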

Useless Bug?

At first glance, this might seem like a harmless bug — just a minor issue with memory alignment. But what happens when that misaligned buffer meets the LookasideList?

LookasideList

LookasideLists are per-processor caches optimized for fixed-size memory blocks. Instead of using the general pool allocator, they maintain a simple singly linked list for fast allocation and deallocation. Both allocations and frees always check the list first before using the general pool, and the list operates in LIFO (Last-In, First-Out) order. One important constraint is that entries stored in the LookasideList are expected to be aligned to 0x10 bytes. For reference, see SLIST_ENTRY.

As you can see in ExAllocateFromNPagedLookasideList:

PSLIST_ENTRY ExAllocateFromNPagedLookasideList(...){
    ...
    ReturnChunk = ListHead->FreeChunk & 0xFFFFFFFFFFFFFFF0;
    ListHead->FreeChunk = ReturnChunk->Next;
    ListHead->Depth--;
    ...
}

The allocation logic aligns the returned chunk address to 0x10 bytes before returning it to the caller.

PSLIST_ENTRY ExFreeToNPagedLookasideList(...,PSLIST_ENTRY Chunk){
    ...
    NextChunk = ListHead->FreeChunk & 0xFFFFFFFFFFFFFFF0;
    Chunk->Next = NextChunk;
    ListHead->FreeChunk = Chunk;
    ListHead->Depth++;
    ...
}

Similarly, the free routine applies the same alignment. As the snippet above shows, it aligns the current first entry in the list before storing it as the next pointer of the chunk being freed.

Non–0x10-byte–aligned Frame Buffer + LookasideList

So, what happens if a frame buffer that’s not 0x10-byte aligned is inserted into a LookasideList?

Let’s play the funky frame.

We wrote a script to list all possible alignment masks and their resulting padding sizes. In this case, we use an alignment mask that results in 8 bytes of padding. Then, we configure the allocator to pre-allocate 4 frame buffers. As a result, each buffer follows the same layout, and due to the 8-byte padding, the resulting frame buffer addresses all end with 0x8.
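A minimal sketch of such an enumeration is shown below. It applies the padding formula from DefAllocatorAlloc to a simulated pool address; the function names are hypothetical, and the search range starting at 0xF is an assumption based on the FILE_OCTA_ALIGNMENT comparison in the snippet above. It is not the script we actually used.

```c
#include <assert.h>
#include <stdint.h>

/* Padding formula from DefAllocatorAlloc, applied to a simulated address. */
static uint64_t def_alloc_padding(uint64_t pool_addr, uint64_t mask)
{
    return ((pool_addr + mask + 4) & ~mask) - pool_addr;
}

/* Returns the smallest mask in [0xF, 0xFFF] that yields the wanted padding
   for a pool chunk at pool_addr, or 0 if no mask in that range does. */
static uint64_t first_mask_with_padding(uint64_t pool_addr, uint64_t wanted)
{
    for (uint64_t mask = 0xF; mask <= 0xFFF; mask++)
        if (def_alloc_padding(pool_addr, mask) == wanted)
            return mask;
    return 0;
}
```

For a 0x10-aligned chunk, a regular mask such as 0xF gives 0x10 bytes of padding and keeps the frame buffer aligned, whereas an irregular mask such as 0x14 or 0x17 gives 8 bytes of padding, so the frame buffer address ends in 0x8.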

The buffers look like this:

After that, the allocator returns four buffers — A, B, C, and D — all of which have addresses ending in 0x8 due to the applied padding.

When these buffers are freed, ks.sys releases them one by one and inserts each of them into the LookasideList in order.

As illustrated in the diagram above, we first free Frame A, which gets inserted into the LookasideList without any issues.

When Frame B is freed, the allocator first aligns the address of the current list head (Frame A) to satisfy the 0x10-byte alignment requirement. It then stores this aligned address in the next pointer field of Frame B, and inserts Frame B at the head of the LookasideList.

We continue by freeing Frame C and Frame D, both of which follow the same pattern as before. In the end, the LookasideList will look like the layout illustrated in the diagram above.

Have you spotted the issue?

The issue lies in the next pointer of Frame D. Due to alignment, the next pointer ends up pointing to the start of the pool chunk, rather than the actual frame buffer.

As shown in the diagram above, you’ll notice that the next pointer of Frame C points to the padding area, which contains the stored padding size, not the expected list entry structure. When interpreted as a 64-bit value, this pointer becomes something like 0x800000000 — which falls within the user-space address range.
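The corruption can be reproduced in a small user-mode simulation of the push and pop logic shown earlier. Everything here is a model: the addresses are simulated, the names are hypothetical, and the sketch assumes the first four bytes of each pool chunk happen to be zero (in a real pool they hold whatever was there before).

```c
#include <assert.h>
#include <stdint.h>

#define SIM_BASE   0x10000ull  /* hypothetical 0x10-aligned pool address */
#define CHUNK_SIZE 0x40ull
#define NUM_CHUNKS 4
#define PADDING    8u          /* padding produced by the irregular mask */

static uint8_t  sim_mem[CHUNK_SIZE * NUM_CHUNKS];
static uint64_t list_head;     /* models ListHead->FreeChunk */

/* Little-endian store of `bytes` bytes at a simulated address. */
static void write_le(uint64_t addr, uint64_t value, unsigned bytes)
{
    for (unsigned i = 0; i < bytes; i++)
        sim_mem[addr - SIM_BASE + i] = (uint8_t)(value >> (8 * i));
}

/* Little-endian 64-bit load from a simulated address. */
static uint64_t read_le64(uint64_t addr)
{
    uint64_t value = 0;
    for (unsigned i = 0; i < 8; i++)
        value |= (uint64_t)sim_mem[addr - SIM_BASE + i] << (8 * i);
    return value;
}

/* Free path: the current head is aligned before being stored as Next. */
static void sim_push(uint64_t chunk)
{
    write_le(chunk, list_head & ~0xFull, 8);
    list_head = chunk;
}

/* Allocation path: the head is aligned before Next is read from it. */
static uint64_t sim_pop(void)
{
    uint64_t chunk = list_head & ~0xFull;
    list_head = read_le64(chunk);
    return chunk;
}

/* Frees four misaligned frames A..D, pops once, returns the new list head. */
static uint64_t head_after_first_pop(void)
{
    list_head = 0;
    for (unsigned i = 0; i < sizeof sim_mem; i++)
        sim_mem[i] = 0;
    for (unsigned i = 0; i < NUM_CHUNKS; i++) {
        uint64_t frame = SIM_BASE + i * CHUNK_SIZE + PADDING;
        write_le(frame - 4, PADDING, 4);  /* padding size before the frame */
        sim_push(frame);
    }
    (void)sim_pop();
    return list_head;
}
```

After the four frees, the first pop aligns Frame D down to the start of its pool chunk and reads the next pointer from the padding area. Under the zero-fill assumption, the new list head is 0x800000000.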

Our plan is to allocate a memory page at 0x800000000, allowing us to gain control over the LookasideList. We then configure the final node in the list to point to our desired target address. After that, when the device performs a read operation, ks.sys will write the incoming data into these frame buffers — including the one pointing to our chosen address.

In theory, this gives us an arbitrary memory write primitive, right?

However, we still face the same limitation as before: we cannot control the content that gets written.

Additionally, we cannot use the buffered flag in this scenario, which means we’re limited to whatever data the device sends — making precise exploitation much more difficult.

At this point, we were stuck again.

But after thinking it through once more, we found another way forward.

Let’s make the LookasideList great again

As shown in the diagram above, we first construct a fake linked list in user space. The address 0x41410000 represents a user-controlled memory region, which we use to construct a valid LookasideList entry. Then, we proceed to allocate the frame buffer, which causes the allocator to traverse the fake list we’ve constructed.

In ExAllocateFromNPagedLookasideList, the allocator first aligns the chunk and then updates the list head. However, due to the misalignment, the alignment logic mistakenly interprets the start of Frame D as a next pointer — leading to incorrect traversal of the LookasideList.

Once the first chunk is popped from the list, the linked list transforms into the state shown in the diagram above. Next, we allocate all remaining chunks from the LookasideList. We also configure the allocator to use smaller frame buffers, which causes the webcam to enter a wait state — it no longer reads data from the device. Next, we trigger a STOP to release all of the frame buffers.

The frame buffer will appear as shown in the diagram above. At this point, ks.sys begins returning the buffers to the LookasideList, one by one. First, it releases Frame D. Then, it frees the malicious chunk at 0x800000000. After that, it frees the fake chunk at 0x41410000.

Once the three chunks have been released, the structure of the LookasideList transforms into the layout illustrated above. In the end, the allocator will release our target address.

This causes the next pointer at the target address to point to 0x41410000. This value can be any user-space address controlled by the attacker.

In other words, we now have a powerful arbitrary memory write primitive.

After gaining arbitrary memory write on Windows 23H2, we can use NtQuerySystemInformation to leak the address of the thread object. With that address, we flip the necessary bit in the token structure to escalate privileges. From here, we can apply any well-known EoP technique to achieve full privilege escalation. By the way, once you’ve achieved arbitrary memory write, don’t forget to restore the LookasideList to a valid state — otherwise, the system may crash during subsequent allocations.

We’ve successfully turned what seemed like a harmless bug into a serious vulnerability.

The Next & Summary

These bug patterns may not be limited to Kernel Streaming alone. By paying closer attention to MDL-related issues, you might be able to discover many more bugs in other drivers. Kernel Streaming remains a fascinating research target and likely still harbors many undiscovered vulnerabilities beneath its surface.

Gaining a deep understanding of Windows API implementations — and recognizing the risks of their misuse — is essential to uncovering new vulnerabilities and building effective exploitation techniques.

Keep these patterns in mind — it might be your next vulnerability.
