
tricendentAshu/Early-Cascade-Injection

Deadlock - When Threads Wait Forever

Let's take an analogy.

Imagine a hostel with only one washing machine. Once a wash cycle starts, the machine automatically locks its door and keeps it locked until the cycle finishes.

While the machine is still running, the system forces another start request. To start again, the door must be unlocked. The door unlocks only after the current cycle finishes. Since neither condition can be satisfied, the system becomes stuck.

This situation is deadlock.

Now that we have a basic intuition about deadlock, let us look at the technical definition.

Deadlock is a state in which two or more threads or processes are stuck indefinitely because they are dependent on each other. Each thread holds a resource that another thread needs, and none of them can proceed because the required resource is already occupied.

Since every thread waits for a resource held by another thread, and no thread releases its own resource until it acquires the next one, the system enters a state where progress becomes impossible. The threads remain stuck forever unless something externally breaks this cycle.
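The definition above can be reproduced in a few lines. This is a minimal sketch in Python, not code from this repository: the names `deadlock_demo`, `lock_a`, and `lock_b` are made up for illustration, and a timeout is used so that the demo ends instead of hanging forever.

```python
import threading

def deadlock_demo(timeout=0.2):
    """Two threads each hold one lock and then ask for the other one."""
    lock_a, lock_b = threading.Lock(), threading.Lock()
    barrier = threading.Barrier(2)
    results = {}

    def worker(name, first, second):
        with first:                          # hold the first resource
            barrier.wait()                   # both threads now hold one lock each
            # Ask for the resource the other thread holds. A real deadlock
            # would block forever; the timeout only exists to end the demo.
            got = second.acquire(timeout=timeout)
            results[name] = got
            if got:
                second.release()
            barrier.wait()                   # keep `first` held until both attempts end

    t1 = threading.Thread(target=worker, args=("T1", lock_a, lock_b))
    t2 = threading.Thread(target=worker, args=("T2", lock_b, lock_a))
    t1.start(); t2.start()
    t1.join(); t2.join()
    return results

if __name__ == "__main__":
    print(deadlock_demo())   # neither thread obtains its second lock
```

Without the timeout, both `acquire` calls would wait on each other indefinitely, which is exactly the stuck state described above.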

A natural question arises. Can deadlock occur randomly?

The answer is no.

Deadlock does not happen by chance. For a deadlock to occur, specific conditions must exist simultaneously.

Conditions Required for Deadlock

Deadlock can occur only when all four of the following conditions are present at the same time.

Mutual Exclusion - Only one thread can hold a resource at a time.

Hold and Wait - A thread holds at least one resource while waiting to acquire additional resources.

No Preemption - Resources cannot be forcibly taken away from a thread. They must be released voluntarily.

Circular Wait - A circular chain of threads exists, where each thread is waiting for a resource held by the next thread in the cycle.

Important Note

For deadlock to occur, all four conditions must be satisfied simultaneously. If even one condition is broken, deadlock is prevented.

This is why deadlock is not accidental. It is the result of a very specific execution state.
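To see how breaking a single condition prevents deadlock, here is a small Python sketch that removes Circular Wait by always acquiring locks in one global order. The helper names (`LOCKS`, `acquire_in_order`, `release_all`, `run`) are hypothetical and chosen only for this illustration.

```python
import threading

# Index in this list is the global rank of each lock.
LOCKS = [threading.Lock(), threading.Lock()]

def acquire_in_order(*ranks):
    """Acquire the requested locks in ascending rank, whatever order was asked for."""
    held = [LOCKS[r] for r in sorted(ranks)]
    for lock in held:
        lock.acquire()
    return held

def release_all(held):
    for lock in reversed(held):
        lock.release()

def run(rounds=1000):
    done = []

    def worker(name, ranks):
        for _ in range(rounds):
            held = acquire_in_order(*ranks)   # no thread ever holds 1 while waiting for 0
            release_all(held)
        done.append(name)

    threads = [
        threading.Thread(target=worker, args=("T1", (0, 1))),
        threading.Thread(target=worker, args=("T2", (1, 0))),  # asks in the opposite order
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(done)

if __name__ == "__main__":
    print(run())
```

Both threads finish even though they request the locks in opposite orders, because the sorted acquisition makes a cycle impossible. The other three conditions still hold; removing one is enough.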

Okay, now let's come to the main point: why did we study deadlock? The answer is its real-world application, and that is the Special User APC.

Deadlock Risk in Special User APC


Special User APC introduces deadlock risk not because of what it executes, but because of when it executes.

A Windows thread normally runs in user mode and occasionally enters kernel mode through a system call. After the kernel finishes servicing the call, execution must return back to user mode. This return path is a critical boundary.

During this return-to-user transition, the kernel checks whether a Special User APC is queued for the thread.

If no Special User APC exists, the thread resumes normal user-mode execution. If a Special User APC exists, it is executed immediately, before the thread regains control.

Where the Risk Appears

At the moment the Special User APC executes, the thread has not yet resumed its original user-mode execution. Although the kernel is done, the thread may still logically hold user-mode resources such as heap locks, loader locks, GUI locks, or critical sections.

The APC runs inside the same thread context, not a new thread. If the APC code or any API it triggers needs a resource that the thread already holds, the thread ends up waiting for itself.

At this point, progress becomes impossible.

The thread cannot continue because it is blocked. The resource cannot be released because execution is blocked.

This is a self-deadlock.
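The same shape can be shown with a toy model. This is only an analogy in Python, not Windows code: `fake_syscall`, `pending_apcs`, and `heap_lock` are invented names, and the sketch assumes a non-reentrant lock (`threading.Lock`). With a reentrant lock such as `threading.RLock`, the second acquire from the same thread would succeed, so the sketch applies only to resources that cannot be re-acquired by their owner.

```python
import threading

heap_lock = threading.Lock()   # non-reentrant stand-in for a user-mode lock
pending_apcs = []              # hypothetical per-thread queue of callbacks

def fake_syscall():
    """Toy 'system call': on the way back to the caller, run queued callbacks first."""
    results = []
    while pending_apcs:
        apc = pending_apcs.pop(0)
        results.append(apc())  # runs in the SAME thread, before the caller resumes
    return results

def apc_needing_heap():
    # The callback needs the lock that the interrupted code already holds.
    got = heap_lock.acquire(timeout=0.2)   # timeout only so the demo terminates
    if got:
        heap_lock.release()
    return got

def run_demo():
    pending_apcs.append(apc_needing_heap)
    with heap_lock:                # the thread is in the middle of a locked operation
        return fake_syscall()      # the callback runs before the lock is released

if __name__ == "__main__":
    print(run_demo())
```

The callback cannot get the lock because its own thread holds it, and the thread cannot release the lock until the callback returns. Without the timeout, this would never finish.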

Why the Kernel Cannot Prevent This

The kernel cannot see user-mode locks. It does not know what resources the thread was holding before the system call. Its responsibility is limited to delivering the APC at the correct architectural point, which is the return-to-user path.

Because the kernel lacks visibility into user-mode state, it cannot verify whether APC execution is safe. The timing is correct from the kernel’s perspective, but unsafe from the thread’s perspective.

Key Takeaway

Special User APC executes on the return-to-user path, before normal user-mode execution resumes. If the thread still holds user-mode resources and the APC needs those same resources, the thread waits for itself.

This is why Microsoft discourages the use of Special User APC from user mode.

Deadlock here is not caused by malicious code or incorrect logic. It is caused by forcing execution before the thread regains control and releases its resources.

IRQL

IRQL stands for Interrupt Request Level. It is a Windows kernel priority mechanism that determines which code can execute at a given time. It also governs how interrupts (events) are handled and how kernel components synchronize access to shared resources.

It is easy to confuse IRQL with thread priority, but the two are different. Thread priority is in charge of scheduling when the CPU is free, while IRQL determines which events get immediate attention regardless of what is scheduled.

So IRQL is a key player in Windows system stability. But what if we violate the rules of IRQL?

Well, THE SYSTEM WILL CRASH.

Since IRQL is a kernel-mode primitive, malware can sometimes abuse it for evasion or tampering, interfering with execution handling in ways that disrupt monitoring.

For example:

By delaying or racing callbacks, blocking certain APCs, or starving a monitoring thread. All of this degrades what an EDR "sees" or how reliably it can respond.

IRQL HIERARCHY

IRQL has a hierarchy that determines which code gets to execute first. There are 5 levels in it.

The Five Levels are :

  • PASSIVE LEVEL

  • APC LEVEL

  • DISPATCH LEVEL

  • DIRQL (DEVICE IRQL)

  • HIGH LEVEL

Now let's understand what these 5 levels mean:

1. PASSIVE LEVEL

It is just a Normal Zone. User-mode applications such as Chrome and Notepad run at this level, along with some standard kernel threads. It is the lowest priority level, where most standard application code runs.

This level has full access to pageable memory, i.e. memory that can be swapped out to disk.

Threads are allowed to wait or sleep on synchronization objects such as mutexes and semaphores.

2. APC LEVEL

It is a Software Interrupt Zone. The activity executed at this level includes basic I/O completion routines and thread suspension. But why is it called APC level? Because an APC is a software interrupt targeted at a specific thread, and the I/O routines are handled by the threads of the system.

For example:

When a file read finishes, the system uses a special kernel APC to notify the thread that its data is ready.

There are some constraints at this level: you can still access pageable memory, but this creates a "critical section" where the current APC cannot be interrupted by other APCs.

3. DISPATCH LEVEL

It is a No Waiting Zone. This level hosts the Windows Scheduler and Deferred Procedure Calls (DPCs), and it handles time-critical operations. Pageable memory cannot be used here because there is no waiting at this level, and a page fault here can cause a system crash.

But what is Windows Scheduler - when windows raises the IRQL to

4. DIRQL (DEVICE IRQL)

This level serves the hardware of a system, so it is also called the Hardware Zone or Device IRQL. Interrupt Service Routines (ISRs) for devices such as network cards (if people still use them), mice, or storage devices run at this level.

The basic goal at this level is to execute extremely quickly: capture the data, queue a DPC, and return control. The constraint is that execution must be almost immediate, since no waiting or paging is allowed.

This takes us to our last Level

5. HIGH LEVEL

The DO NOT DISTURB ZONE

This level also serves the system itself: it halts all other system activity to execute what it needs. Two important mechanisms are linked to it:

1. Bug Check - Mainly used during system crashes to safely write diagnostic data (dump files) without interference from drivers or the scheduler (as in Level 3).

2. NMI (Non-Maskable Interrupts) - Reserved for critical hardware failures such as memory corruption.
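The rules for the five levels can be summed up in a small model. This is a rough sketch in Python, not kernel code. The values 0, 1, and 2 for the first three levels match the usual Windows definitions, while the numbers used for `DIRQL` and `HIGH` are placeholders that only preserve the ordering (real values depend on the architecture). The helper names are invented for this example.

```python
from enum import IntEnum

class Irql(IntEnum):
    PASSIVE = 0
    APC = 1
    DISPATCH = 2
    DIRQL = 3    # placeholder: device levels sit above DISPATCH
    HIGH = 4     # placeholder: highest level, only the ordering matters here

def can_wait(level):
    """Blocking waits are only possible below DISPATCH level."""
    return level < Irql.DISPATCH

def can_touch_pageable(level):
    """A page fault needs a wait, so pageable memory is off-limits from DISPATCH up."""
    return level < Irql.DISPATCH

def preempts(incoming, current):
    """Work at a higher level interrupts work at a lower level, never the reverse."""
    return incoming > current

if __name__ == "__main__":
    for level in Irql:
        print(level.name, can_wait(level), can_touch_pageable(level))
```

For example, `preempts(Irql.DIRQL, Irql.DISPATCH)` is true, which mirrors a device interrupt cutting in on a DPC, while `preempts(Irql.APC, Irql.APC)` is false, which mirrors the note above that one APC is not interrupted by another at the same level.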

Mapping APC Types to IRQL Levels

After understanding the IRQL hierarchy, APC behavior can be mapped based purely on where execution is permitted.

PASSIVE_LEVEL -- User APC Execution

All user-mode APC payloads execute at PASSIVE_LEVEL. This is the only IRQL at which user code can safely run.

Mapped APC types:

Simple User APC

Early Bird APC

Special User APC

Early Cascade Injection

Differences between these APC types affect delivery timing, not execution level.

APC_LEVEL -- Kernel APC Execution

Kernel APC routines execute at APC_LEVEL and are part of kernel scheduling and I/O mechanisms.

Mapped APC types:

Normal Kernel APC

Special Kernel APC

These APCs operate under stricter execution rules and do not involve user-mode payload execution.

Key Mapping Rule :

User APCs → PASSIVE_LEVEL

Kernel APCs → APC_LEVEL

IRQL determines where execution is allowed, while APC type determines when the execution opportunity occurs.
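The mapping rule is simple enough to write down as a lookup table. The sketch below is only a restatement of the lists above in Python; the dictionary and function names are made up for illustration.

```python
# APC type -> IRQL at which its routine runs, as listed in this section.
APC_IRQL = {
    "Simple User APC": "PASSIVE_LEVEL",
    "Early Bird APC": "PASSIVE_LEVEL",
    "Special User APC": "PASSIVE_LEVEL",
    "Early Cascade Injection": "PASSIVE_LEVEL",
    "Normal Kernel APC": "APC_LEVEL",
    "Special Kernel APC": "APC_LEVEL",
}

def irql_for(apc_type):
    """Return the execution level for a known APC type."""
    return APC_IRQL[apc_type]

def types_at(level):
    """Return every APC type that executes at the given level."""
    return sorted(name for name, irql in APC_IRQL.items() if irql == level)

if __name__ == "__main__":
    print(irql_for("Special User APC"))
    print(types_at("APC_LEVEL"))
```

Note that every user APC variant lands on the same level. What separates them is the moment of delivery, which the table does not capture.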

Refer to the image below for better understanding.

MAPPING THREE TYPES OF APC TO IRQL

WHAT COULD BE THE DEFENSIVE APPROACH TO THE PROCESS

MOST ADVANCED FORM OF APC INJECTION ( A NOVEL APPROACH )

In APC injection, we exploited other processes to execute our code on the system, which can get flagged by the system's EDR. But have you ever thought, as an attacker, what if we could insert the code and queue it from within the process itself, without getting flagged by the EDR? Well, well, well: Early Cascade Injection aims at exactly this goal. Before starting with this novel approach, we need to know some terms that will help us understand it.

REQUIRED PREREQUISITES

Early Cascade Injection relies on very early user-mode initialization. To understand what this means, the following prerequisites are essential.

Since every process injection technique executes inside a process, we first need to understand how a process is created.

1. Windows Process Creation (User-Mode Focus)

This helps us understand what happens before a process actually "starts running".

  • Before the process starts running, it is created in a suspended state.
  • Then the critical memory sections are set up, including:
    • .mrdata Section
    • .data Section
  • Then the core DLLs are loaded, for example kernel32.dll.

One very important point to note is that all of this initialization finishes before EDR user-mode hooks are fully active.

We have already discussed APC injection in our previous blog. Early Cascade Injection depends on advanced APC behavior.

To understand this behavior, we need to know some terms that we have either already discussed in the blog or will discuss below.

2. APC Fundamentals

  • User APCs
    • They always execute at PASSIVE_LEVEL (the first and lowest level of the IRQL hierarchy).
  • Intra-process APCs
    • Early Cascade is a within-process injection technique, so its APCs are queued from inside the same process (discussed in detail later).
  • ntdll!NtTestAlert
    • This function checks whether the current thread has pending user APCs and, if so, delivers them.

APCs must be queued before the process resumes, not after.

Now we should know:

3. WHY PASSIVE_LEVEL MATTERS

It matters because:

  • User APCs execute at PASSIVE_LEVEL only.
  • Early Cascade executes only in user mode.
  • No kernel IRQL abuse is required, which is a stealth advantage.

4. SHIM ENGINE

The Shim Engine, or Windows Compatibility Engine, is basically a subsystem of Windows that helps older or incompatible applications run on newer versions of Windows without modifying the application itself.

Now you must be thinking: BUT WHY DOES THIS EXIST?

It exists because as Windows evolves (Windows 10 -> Windows 11), APIs change and security restrictions increase. This can cause older applications to crash or refuse to run. So instead of forcing developers to rewrite their applications after every evolution (which they would hate), Windows uses SHIMS.

This takes us to our last prerequisite:

5. Payload (or Shellcode) Staging Concept

ECI (Early Cascade Injection) uses two payloads :

  1. PAYLOAD STUB
    • It acts as an initializer for our main payload.
    • It is executed via the Shim Engine.
    • It queues APCs internally.

Now the MALICIOUS CODE

  2. MAIN PAYLOAD
    • It is the actual shellcode.
    • It is executed later, via APC, before the process begins its normal execution.

EARLY CASCADE INJECTION

Now that we know the prerequisites, we can start understanding how this novel approach injects shellcode within the process.

Early Cascade Injection is a Windows user-mode process injection technique that was introduced by Outflank in 2024. It injects and executes the shellcode during the process creation stage (not after the process is created, as in classic APC injection).

This technique executes specifically in the user-mode initialization phase, before most EDR (Endpoint Detection and Response) solutions fully initialize their user-mode detection mechanisms.

In simple words, it executes shellcode before most EDR solutions are fully watching.

This technique also avoids drawbacks such as cross-process APC queuing and loader-lock restrictions, because the APC is queued and executed within the same process and does not need another process to do the work.

Now,

WHY EARLY CASCADE INJECTION ?

Here is a simple breakdown:

WHY EARLY.

  • Because the execution occurs well before kernel32.dll finishes loading.
  • And also before user-mode EDR hooks activate.

&

WHY CASCADE.

  • Because the execution happens in stages: the shim engine triggers the payload stub.
  • Then the payload stub queues an APC.
  • The APC executes the main payload later.

Simply put, CASCADE means a process in which something is successively passed on or executed.

But how does this novel approach proceed?

To understand this, we will use the following technical flow and go through it step by step:

→ Create suspended process

→ Write Stub + Main Payload

→ g_ShimsEnabled = 1

→ Point g_pfnSE_DllLoaded → Stub

→ Resume thread

→ Stub runs (via Shim callback)

→ g_ShimsEnabled = 0 immediately

→ Stub queues INTRA APC → Main Payload

→ NtTestAlert executes APC

→ Main Payload runs

I know what you're thinking: is this some rocket science? Well, it's not. We've got you covered, so let's break this down in steps!

STEP 1

Process is Created in Suspended State

  • Here, the memory sections are mapped first.
  • And the most important sections, .mrdata and .data, become accessible.

Before we proceed with this flow, let us understand what the .mrdata and .data sections are.

.mrdata section - a memory section that is

  • always readable.
  • writable only while the process is suspended.
  • home to the most important item (g_pfnSE, i.e. the shim engine function pointer).

.data section -

  • this memory section is always readable and writable.
  • It contains the shim engine enable/disable flag (i.e. a boolean).

Now let's carry on with the technical flow.

STEP 2

The shim engine flag is enabled in .data, which causes the g_pfnSE function pointer to be read from .mrdata. When the shim engine is enabled, it loads its dependencies, and the very first thing it does is call through the g_pfnSE pointer stored in the .mrdata section. Because that pointer holds the address of the payload stub, the stub is called, and the stub then queues the main payload in the same process.

STEP 3

In this step the Shim Engine executes the payload stub (discussed in the previous step).

  • g_pfnSE calls back into the payload stub, but at this moment
    • kernel32.dll is NOT fully loaded and
    • EDR hooks are not active

The main payload cannot run yet because its dependencies are still missing. They arrive in later steps.

STEP 4

In this step the payload stub queues an intra-process APC.

  • The payload stub relies on the NtTestAlert logic here in an indirect manner.
  • The APC is queued inside the same process (since this is ECI).
  • This is an intra-process APC, not a cross-process one.

EDRs mostly flag external APC queues, not internal ones.

STEP 5

In this step the shim engine flag is disabled immediately. This helps to prevent:

  • Application crashes. The shim engine has several other parameters that are expected to contain valid values, and if they do not, the program will crash. In Early Cascade our only interest is the g_pfnSE function, so as soon as it has executed, the shim engine is turned off to prevent a program crash.
  • Compatibility engine instability, which leaving the flag enabled can sometimes cause.

STEP 6

In this step the process resumes normally.

  • kernel32.dll is loaded.
  • User-mode initialization is completed.
  • ntdll!NtTestAlert is executed.

STEP 7

The main payload is executed via the queued APC.

  • The APC runs at PASSIVE_LEVEL.
  • The main payload is executed safely.
  • Because the APC is intra-process, it is stealthy: the EDR sees no remote injection and no suspicious APC origin.

After all these steps you must be thinking: WHY IS ECI MORE ADVANCED THAN APC INJECTION?

A few parameters help us understand this:

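| Feature           | Classic APC Injection | Early Cascade Injection |
|-------------------|-----------------------|-------------------------|
| Timing            | After process start   | During process creation |
| APC Type          | Often cross-process   | Intra-process           |
| Shim Engine       | No                    | Yes                     |
| EDR Visibility    | High                  | Very Low                |
| Dependency Safety | Risky                 | Safe                    |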

For more clarity, refer to the image.

FUNCTION ARCHITECTURE

TECHNICAL FLOW

FAQs
