MuddyWater Malware Loader drops LightPhoenix

Summary

This post dissects a malware loader that executes an embedded LightPhoenix backdoor payload. LightPhoenix is a C++ backdoor capable of writing files and executing CMD commands. The malware is associated with one of the latest MuddyWater campaigns.

The loader decodes and deobfuscates an embedded MZ payload. Its functionality is fairly straightforward, but it is a good example for discussing several reverse engineering topics and how to work more effectively in IDA Pro.

Who is MuddyWater?

MuddyWater is an Iranian state-sponsored cyber espionage group, active since at least 2017, that is commonly linked to Iran’s Ministry of Intelligence and Security (MOIS). The group focuses on intelligence collection and disruptive operations against government, telecommunications, defense, and critical infrastructure targets across the Middle East, Europe, and North America.

The group is known for relatively simple but effective tradecraft, including spear-phishing, use of legitimate remote administration tools, PowerShell-based backdoors, and custom malware families such as POWERSTATS and MuddyC3.

MuddyWater often leverages compromised infrastructure and living-off-the-land techniques to blend into normal network activity, making detection more difficult, and is frequently associated with broader Iranian cyber campaigns aimed at strategic surveillance and regional influence.

Reverse Engineering the loader

What are we analyzing?

This section analyzes the loader sample that unpacks and executes the LightPhoenix malware. The MD5 hashes of both binaries are in the following table (if you want to follow along in IDA/Binja/Ghidra).

Hash	Description
32F51A376A8277649088047DD61EFDF5	The malware loader
96CA9282847651CB806ADAA82E532D17	The embedded payload (LightPhoenix)

Opening the malware loader binary in IDA Pro

In the WinMain function of the binary there is a call made through qword_14009AB90. Double-clicking this global variable shows that it has no defined value in the binary, meaning that it is resolved at runtime before WinMain executes.

Because the result is immediately used in a memcpy call, we can assume that this is a call to VirtualAlloc, but we can confirm this.

Figure 1: Showing unknown global variable in WinMain

Initialization Functions

When the binary begins to run, many initialization steps are taken before execution reaches the WinMain function. We can look at the function that calls WinMain and locate where the initialization functions are iterated through in order to investigate them.

Figure 2: Identifying the initterm function starting address

Find the call to the initterm function. Its arguments show the start address of the table of functions called as part of initialization, beginning at &First. Viewing this location reveals a series of functions to investigate.

Figure 3: Initterm function list
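
To make the role of this table concrete, here is a minimal Python sketch of the semantics of the CRT's initterm routine: it walks a table of function pointers and calls every non-null entry in order. The table contents and the names g_func_LoadLibraryA and g_func_VirtualAlloc are hypothetical stand-ins, and the return value is added purely for illustration (the real routine returns nothing).

```python
from typing import Callable, Optional, Sequence


def initterm(table: Sequence[Optional[Callable[[], None]]]) -> int:
    """Call every non-null entry in order; return how many were called."""
    called = 0
    for fn in table:
        if fn is None:  # null slots in the table are skipped
            continue
        fn()
        called += 1
    return called


# Hypothetical initializers that each fill one global function pointer.
resolved = {}
table = [
    None,
    lambda: resolved.setdefault("g_func_LoadLibraryA", 0x1000),
    lambda: resolved.setdefault("g_func_VirtualAlloc", 0x2000),
]
count = initterm(table)
```

This is why unresolved globals in WinMain are worth tracing back to this table: by the time WinMain runs, every initializer has already executed.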

We will look at only one of these, as they all do the same thing. They perform a SUB operation to decode a string that yields either a library name or a function name. The name is then used in LoadLibrary or GetProcAddress calls to get a pointer that is saved to a global variable.

Figure 4: Setting global variables that point to Win32 functions

Decoding the string results in “Cabinet.dll” and a pointer to this is saved into the global variable &unk_14009ABD0. I then repeated this review for all of the functions and updated the names of the global variables to make it clear which function each one points to. This makes the rest of the decompilation easier to understand.

Figure 5: Decoding a function name string
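
A SUB-based string decoder of this kind can be sketched in a few lines of Python. The key value 0x05 and the helper names are assumptions for illustration only; the actual constant used by the loader is visible in the decompilation, not reproduced here.

```python
def sub_decode(encoded: bytes, key: int) -> bytes:
    """Subtract key from every byte, wrapping modulo 256."""
    return bytes((b - key) & 0xFF for b in encoded)


def sub_encode(plain: bytes, key: int) -> bytes:
    """Inverse operation, used here only to build test input."""
    return bytes((b + key) & 0xFF for b in plain)


KEY = 0x05  # hypothetical key, not taken from the sample
encoded = sub_encode(b"Cabinet.dll", KEY)
decoded = sub_decode(encoded, KEY)
```

Running the same routine over each encoded blob referenced by the init functions is a quick way to recover all library and function names at once before renaming the globals.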

With all of the functions understood, they are renamed and most importantly, the global variables used everywhere else are named properly as “g_func_X”.

Figure 6: Renamed all init functions and global variables

Decoding of Embedded Binary

Back in the WinMain function, the renamed global variables now make sense of calls that were previously unresolved. The g_payloadBytes global variable points to an address in the binary that holds a very large blob of bytes, which is the embedded payload. The payload bytes are copied into a newly allocated memory region and passed into sub_140001480 along with a long string.

Figure 7: Function calls to decode payload bytes

The function is similar to what we saw with the module and function name decoding. The long string is a rotating XOR key, and the function XOR-decodes the entirety of the payload bytes with it.

Figure 8: XOR decoding routine
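
The rotating XOR routine can be reproduced offline with a short Python sketch. The key below is a placeholder, not the key from the sample; substitute the long string passed to sub_140001480.

```python
from itertools import cycle


def rotating_xor(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


key = b"example-key"  # placeholder key for illustration
plain = b"MZ\x90\x00" + b"payload bytes"
encoded = rotating_xor(plain, key)
decoded = rotating_xor(encoded, key)  # XOR is its own inverse
```

Because XOR is symmetric, the same function serves for both encoding and decoding.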

The second function, sub_140001500, is called right after. It allocates memory on the heap and selectively copies bytes across: the copy is performed in 10-byte chunks, and the 11th byte is skipped each time.

Figure 9: Special logic to drop every 11th byte
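
The byte-dropping step can be sketched as follows. This assumes the junk byte is the last byte of every 11-byte stride starting at offset 0, and that a short trailing chunk is copied as-is; the loader's handling of the tail is not confirmed here.

```python
def drop_every_11th(data: bytes, keep: int = 10) -> bytes:
    """Copy `keep` bytes, skip one, and repeat across the buffer."""
    stride = keep + 1
    out = bytearray()
    for off in range(0, len(data), stride):
        out += data[off:off + keep]
    return bytes(out)


padded = bytes(range(22))          # two full 11-byte strides
cleaned = drop_every_11th(padded)  # bytes 10 and 21 are dropped
```

Combined with the XOR step, this is enough to recover the payload statically from the blob at g_payloadBytes without running the loader.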

Looking at the remainder of the WinMain function, we can infer what the still-unknown sub_140002970 is meant to return from how its result is used. The malware is going to use the handle of its own process to create a new thread within its virtual address space.

Figure 10: IDA showing creating a new thread to execute the embedded payload

Jumping into the sub_140002970 function, we can see several offsets being used. These should be quickly identifiable purely based on the initial a1 + 0x3C offset.

Figure 11: Variable using offsets

Resolving PE Header Structures

The first offset we see, 0x3C, is e_lfanew, the DOS header field that holds the offset of IMAGE_NT_HEADERS. We can resolve this in IDA by applying the known type. Selecting the variable and pressing “Y” will bring up the dialog to change the name and type of the variable. Once this is done, the unknown offsets resolve to named OptionalHeader entries.

Figure 12: Identifying PIMAGE_NT_HEADERS64 struct

At the end we see nt_headers + 1, which does not refer to another header but rather to the section table. It is effectively IMAGE_FIRST_SECTION(nt_headers). IDA renders this as pointer arithmetic on nt_headers instead of as a PIMAGE_SECTION_HEADER.

We can re-type this to what it actually is and see that it is looping through all section headers.

Figure 13: Identifying PIMAGE_SECTION_HEADER struct
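
The arithmetic behind these two offsets can be checked with a small Python sketch. It builds a synthetic header (all values below are made up for the test) and computes where the section table starts, which is what IMAGE_FIRST_SECTION does. For a PE32+ image with a standard 0xF0-byte optional header, this is the same address as nt_headers + 1, since IMAGE_NT_HEADERS64 is 0x108 bytes.

```python
import struct


def first_section_offset(buf: bytes) -> int:
    """Return the file offset of the first IMAGE_SECTION_HEADER."""
    e_lfanew = struct.unpack_from("<I", buf, 0x3C)[0]
    if buf[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # SizeOfOptionalHeader sits at offset 16 of IMAGE_FILE_HEADER.
    size_opt = struct.unpack_from("<H", buf, e_lfanew + 4 + 16)[0]
    # Signature (4) + IMAGE_FILE_HEADER (20) + optional header.
    return e_lfanew + 4 + 20 + size_opt


# Synthetic header: e_lfanew = 0x80, PE32+ optional header of 0xF0 bytes.
buf = bytearray(0x200)
buf[:2] = b"MZ"
struct.pack_into("<I", buf, 0x3C, 0x80)
buf[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<H", buf, 0x80 + 4 + 16, 0xF0)
offset = first_section_offset(bytes(buf))
```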

What is it doing?

It takes the PE file stored in payloadBytes and reconstructs it in memory at new_Mem, section by section. This is exactly what the Windows loader normally does, but in this case the malware is doing it manually.

There are two cases that determine whether it performs a memcpy or a memset on a section:

- Section has raw data: perform a memcpy
- Section has no raw data (BSS-style section): perform a memset (zero-fills the section)
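
The two cases can be sketched in Python against a bytearray standing in for the allocated image. The function name map_sections and the synthetic section table are assumptions for illustration; only the memcpy/memset split mirrors the decompiled loop.

```python
import struct


def map_sections(payload: bytes, image: bytearray, sec_off: int, count: int) -> None:
    """Copy or zero-fill each section at its VirtualAddress in image."""
    for i in range(count):
        s = sec_off + i * 0x28  # sizeof(IMAGE_SECTION_HEADER)
        vsize, vaddr, raw_size, raw_ptr = struct.unpack_from("<IIII", payload, s + 8)
        size = raw_size if raw_size else vsize
        if vaddr + size > len(image) or raw_ptr + raw_size > len(payload):
            raise ValueError(f"section {i} out of bounds")
        if raw_size:   # section has raw data: memcpy
            image[vaddr:vaddr + raw_size] = payload[raw_ptr:raw_ptr + raw_size]
        else:          # BSS-style section: memset to zero
            image[vaddr:vaddr + vsize] = bytes(vsize)


def make_section(name, vsize, vaddr, raw_size, raw_ptr):
    """Pack a 40-byte section header with the remaining fields zeroed."""
    return struct.pack("<8sIIIIIIHHI", name, vsize, vaddr, raw_size, raw_ptr, 0, 0, 0, 0, 0)


# Synthetic payload: section table at 0x100, code bytes at 0x200.
payload = bytearray(0x400)
payload[0x100:0x128] = make_section(b".text", 4, 0x1000, 4, 0x200)
payload[0x128:0x150] = make_section(b".bss", 8, 0x2000, 0, 0)
payload[0x200:0x204] = b"\xCC\xCC\xCC\xC3"

image = bytearray(b"\xFF" * 0x3000)
map_sections(bytes(payload), image, 0x100, 2)
```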

This is what the Windows loader does:

1. Reads PE headers
2. Allocates memory
3. Maps sections
4. Fixes imports
5. Applies relocations
6. Calls entry point

The loop above performs step 3 manually. This pattern is extremely common in reflective loaders, packers, and fileless malware.

Technique	Behavior
Reflective loaders	Load a DLL/EXE from memory; no disk write
Packers	Unpack the payload into memory, then jump to it
Fileless malware	Inject the payload into another process

Import Resolution Phase

The previous loop mapped sections into memory. This next block walks the Import Directory and fills in the IAT so the unpacked PE can call Windows APIs normally.

What it does at a high level

It is manually doing the equivalent of what the Windows loader does for imports:

1. Read the PE Import Directory
2. For each imported DLL:
- load or locate the DLL
3. For each imported function:
- resolve its address with GetProcAddress
4. Write the resolved address into the unpacked image’s IAT

DataDirectory[1] is the Import Directory. Re-typing the v17 variable as a pointer to IMAGE_IMPORT_DESCRIPTOR resolves the remaining offsets to named fields.

Figure 14: Identifying IMAGE_IMPORT_DESCRIPTOR struct
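
The walk over the import descriptors can be sketched in Python against a mapped image held in a bytearray. The resolve callback stands in for the LoadLibrary and GetProcAddress pair, and the synthetic image and fake export address are invented for the example; 64-bit thunks are assumed.

```python
import struct
from typing import Callable

ORDINAL_FLAG64 = 1 << 63  # IMAGE_ORDINAL_FLAG64


def cstr(image: bytearray, rva: int) -> str:
    """Read a NUL-terminated ASCII string starting at rva."""
    end = image.index(b"\x00", rva)
    return image[rva:end].decode("ascii")


def resolve_imports(image: bytearray, import_rva: int,
                    resolve: Callable[[str, str], int]) -> None:
    """Walk IMAGE_IMPORT_DESCRIPTOR entries and fill each 64-bit IAT slot."""
    desc = import_rva
    while True:
        oft, _stamp, _fwd, name_rva, first_thunk = struct.unpack_from("<5I", image, desc)
        if name_rva == 0 and first_thunk == 0:   # all-zero descriptor ends the table
            break
        dll = cstr(image, name_rva)
        lookup = oft or first_thunk              # fall back to the IAT itself
        slot = 0
        while True:
            thunk = struct.unpack_from("<Q", image, lookup + slot)[0]
            if thunk == 0:
                break
            if thunk & ORDINAL_FLAG64:
                symbol = f"#{thunk & 0xFFFF}"    # import by ordinal
            else:
                symbol = cstr(image, thunk + 2)  # skip the 2-byte Hint
            struct.pack_into("<Q", image, first_thunk + slot, resolve(dll, symbol))
            slot += 8
        desc += 20                               # sizeof(IMAGE_IMPORT_DESCRIPTOR)


# Synthetic mapped image with one DLL and one named import.
image = bytearray(0x400)
struct.pack_into("<5I", image, 0x100, 0x140, 0, 0, 0x180, 0x160)
struct.pack_into("<Q", image, 0x140, 0x1A0)      # lookup table entry
struct.pack_into("<Q", image, 0x160, 0x1A0)      # IAT entry before binding
image[0x180:0x18D] = b"kernel32.dll\x00"
image[0x1A2:0x1AF] = b"VirtualAlloc\x00"

fake_exports = {("kernel32.dll", "VirtualAlloc"): 0x7FF800001000}
resolve_imports(image, 0x100, lambda dll, name: fake_exports[(dll, name)])
iat_value = struct.unpack_from("<Q", image, 0x160)[0]
```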

Base Relocation Phase

At this point, the loader has already parsed the PE, allocated memory, copied sections, and resolved imports. It now fixes up absolute addresses, because the image was not loaded at its preferred ImageBase.

Figure 15: Base relocation phase
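
As a rough illustration of this phase, the sketch below applies IMAGE_REL_BASED_DIR64 fixups to a bytearray image. The base addresses and the synthetic relocation block are made up; only the block layout (page RVA, block size, then 16-bit entries with a 4-bit type and 12-bit offset) follows the PE format.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0  # padding entry, no fixup
IMAGE_REL_BASED_DIR64 = 10    # 64-bit absolute address


def apply_relocations(image: bytearray, reloc_rva: int, reloc_size: int, delta: int) -> int:
    """Add delta to every DIR64 target; return the number of fixups applied."""
    patched = 0
    off = reloc_rva
    end = reloc_rva + reloc_size
    while off + 8 <= end:
        page_rva, block_size = struct.unpack_from("<II", image, off)
        if block_size < 8:  # malformed block, stop rather than loop forever
            break
        for i in range((block_size - 8) // 2):
            entry = struct.unpack_from("<H", image, off + 8 + 2 * i)[0]
            rel_type, rel_off = entry >> 12, entry & 0x0FFF
            if rel_type == IMAGE_REL_BASED_DIR64:
                target = page_rva + rel_off
                value = struct.unpack_from("<Q", image, target)[0]
                struct.pack_into("<Q", image, target, (value + delta) & 0xFFFFFFFFFFFFFFFF)
                patched += 1
        off += block_size
    return patched


PREFERRED_BASE = 0x140000000
ACTUAL_BASE = 0x7FF600000000  # hypothetical allocation address

image = bytearray(0x3000)
struct.pack_into("<Q", image, 0x1010, PREFERRED_BASE + 0x1234)
# One block for page 0x1000: a DIR64 entry at offset 0x010 plus one padding entry.
struct.pack_into("<IIHH", image, 0x2000, 0x1000, 12, (IMAGE_REL_BASED_DIR64 << 12) | 0x010, 0)
patched = apply_relocations(image, 0x2000, 12, ACTUAL_BASE - PREFERRED_BASE)
fixed = struct.unpack_from("<Q", image, 0x1010)[0]
```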

Final memory-protection stage

At this point the loader has already mapped the PE into memory, resolved imports, and applied relocations. It now changes the mapped memory from broad writable/executable loader permissions to section-appropriate protections based on each section’s Characteristics.

That is usually one of the last steps before execution transfers to the unpacked payload.

Figure 16: Setting memory protections

Since Characteristics is being treated as a signed integer, the sign check is really a test of the high bit, IMAGE_SCN_MEM_WRITE (0x80000000):

if Characteristics >= 0 -> high bit not set -> not writable
if Characteristics < 0 -> high bit set -> writable
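
The signed comparison can be reproduced in Python to see why it works. The mapping from flags to PAGE_* constants below is one plausible choice and an assumption on my part; the loader's exact table should be read from the decompilation in Figure 16.

```python
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000

PAGE_READONLY = 0x02
PAGE_READWRITE = 0x04
PAGE_EXECUTE_READ = 0x20
PAGE_EXECUTE_READWRITE = 0x40


def as_int32(value: int) -> int:
    """Reinterpret a 32-bit value as a signed integer, as the decompiler does."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def section_protection(characteristics: int) -> int:
    """Pick a page protection from section Characteristics (assumed mapping)."""
    writable = as_int32(characteristics) < 0  # same as testing IMAGE_SCN_MEM_WRITE
    executable = bool(characteristics & IMAGE_SCN_MEM_EXECUTE)
    if executable:
        return PAGE_EXECUTE_READWRITE if writable else PAGE_EXECUTE_READ
    return PAGE_READWRITE if writable else PAGE_READONLY


text_prot = section_protection(0x60000020)   # typical .text flags
data_prot = section_protection(0xC0000040)   # typical .data flags
rdata_prot = section_protection(0x40000040)  # typical .rdata flags
```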

Back in the WinMain function, we can now fully understand what startAddress points to. It is used to create a new thread that executes the payload in memory.

Figure 17: The start address of the new thread

Saving the embedded payload to disk

To extract the payload from the loader, I wrote some Python scripts to assist in dumping the payload from memory right after it is written. We can set a breakpoint right after the bytes are copied to the heap. The memory pointed to by &ptr_mem should now contain the proper bytes to dump.

Figure 18: Breakpointing after heap written

The address to know at this point is the &ptr_mem address. In this case it is 0x5F6FD0.

Figure 19: ptr_mem address

We can attempt to dump out the beginning of the file to determine if we are getting a good binary. The following script can be executed to dump the first 0x1000 bytes to validate it is dumping correctly.

Make sure you enter the correct payload address and the test size when prompted (the script's size prompt defaults to 0x100000, so change it to 0x1000 for this test):

Payload address: 0x5F6FD0
Payload size: 0x1000

dump_ptr_mem_runtime.py

  
import struct
import ida_kernwin
import idaapi
import idc

def ask_hex(prompt: str, default: str = "") -> int:
    s = ida_kernwin.ask_str(default, 0, prompt)
    if not s:
        raise RuntimeError("Cancelled")
    return int(s, 16)


def ask_path(default_name: str) -> str:
    path = ida_kernwin.ask_file(True, default_name, "Save dumped payload")
    if not path:
        raise RuntimeError("Cancelled")
    return path


def read_dbg_mem(ea: int, size: int) -> bytes:
    data = idc.read_dbg_memory(ea, size)
    if not data or len(data) != size:
        raise RuntimeError(f"Failed reading memory at 0x{ea:X}, size 0x{size:X}")
    return data


def validate_pe(buf: bytes) -> None:
    if len(buf) < 0x40:
        print("[!] Too small")
        return

    if buf[:2] != b"MZ":
        print("[!] Missing MZ")
        print(f"    First bytes: {buf[:16].hex()}")
        return

    e_lfanew = struct.unpack_from("<I", buf, 0x3C)[0]
    print(f"[*] e_lfanew = 0x{e_lfanew:X}")

    if e_lfanew + 4 > len(buf):
        print("[!] e_lfanew out of range")
        return

    sig = buf[e_lfanew:e_lfanew+4]
    if sig != b"PE\x00\x00":
        print(f"[!] Invalid PE sig at 0x{e_lfanew:X}: {sig.hex()}")
        return

    print("[+] Valid MZ/PE detected")

    file_hdr = e_lfanew + 4
    machine = struct.unpack_from("<H", buf, file_hdr)[0]
    num_sections = struct.unpack_from("<H", buf, file_hdr + 2)[0]
    size_opt = struct.unpack_from("<H", buf, file_hdr + 16)[0]
    opt_off = file_hdr + 20
    magic = struct.unpack_from("<H", buf, opt_off)[0]
    ep = struct.unpack_from("<I", buf, opt_off + 0x10)[0]

    print(f"[*] Machine            : 0x{machine:04X}")
    print(f"[*] NumberOfSections   : {num_sections}")
    print(f"[*] SizeOfOptionalHdr  : 0x{size_opt:X}")
    print(f"[*] OptionalHdr.Magic  : 0x{magic:04X}")
    print(f"[*] AddressOfEntryPoint: 0x{ep:X}")


def main():
    if not idaapi.is_debugger_on():
        raise RuntimeError("Debugger is not active")

    payload_ea = ask_hex("Payload address", "0x5F6FD0")
    payload_size = ask_hex("Payload size", "0x100000")
    out_path = ask_path("payload_manual_dump.bin")

    print(f"[*] Dumping 0x{payload_size:X} bytes from 0x{payload_ea:X}")
    buf = read_dbg_mem(payload_ea, payload_size)

    with open(out_path, "wb") as f:
        f.write(buf)

    print(f"[+] Wrote {len(buf):,} bytes to {out_path}")
    validate_pe(buf)


if __name__ == "__main__":
    try:
        main()
    except Exception as e:
        print(f"[!] Error: {e}")

The bytes will be dumped to the specified location.

Figure 20: Script dumping bytes to disk

In Detect-It-Easy we can see, even from just the first 4 KB, that this appears correct.

Figure 21: Detect-It-Easy showing good payload

The following script reads the headers and prints the sizes needed for a full dump. Run it against the previously dumped 4 KB file to determine what the real size should be.

get_header_info.py

  
import struct
import ida_kernwin

def u16(b, o):
    return struct.unpack_from("<H", b, o)[0]

def u32(b, o):
    return struct.unpack_from("<I", b, o)[0]

def parse_pe(buf: bytes):
    if len(buf) < 0x40:
        raise ValueError("Too small")

    if buf[:2] != b"MZ":
        raise ValueError("Missing MZ")

    e_lfanew = u32(buf, 0x3C)
    if e_lfanew + 4 > len(buf):
        raise ValueError(f"e_lfanew out of range: 0x{e_lfanew:X}")

    if buf[e_lfanew:e_lfanew+4] != b"PE\x00\x00":
        raise ValueError("Missing PE signature")

    file_hdr = e_lfanew + 4
    machine = u16(buf, file_hdr + 0x00)
    num_sections = u16(buf, file_hdr + 0x02)
    size_opt = u16(buf, file_hdr + 0x10)

    opt = file_hdr + 20
    magic = u16(buf, opt + 0x00)
    size_of_image = u32(buf, opt + 0x38)
    size_of_headers = u32(buf, opt + 0x3C)

    sec_off = opt + size_opt

    print(f"e_lfanew         = 0x{e_lfanew:X}")
    print(f"Machine          = 0x{machine:04X}")
    print(f"NumberOfSections = {num_sections}")
    print(f"OptionalMagic    = 0x{magic:04X}")
    print(f"SizeOfImage      = 0x{size_of_image:X}")
    print(f"SizeOfHeaders    = 0x{size_of_headers:X}")
    print()

    max_raw_end = size_of_headers

    for i in range(num_sections):
        s = sec_off + i * 0x28
        name = buf[s:s+8].split(b"\x00", 1)[0].decode(errors="ignore")
        virtual_size = u32(buf, s + 0x08)
        virtual_address = u32(buf, s + 0x0C)
        size_of_raw_data = u32(buf, s + 0x10)
        pointer_to_raw_data = u32(buf, s + 0x14)
        characteristics = u32(buf, s + 0x24)

        raw_end = pointer_to_raw_data + size_of_raw_data
        if raw_end > max_raw_end:
            max_raw_end = raw_end

        print(
            f"[{i}] {name:<8} "
            f"VA=0x{virtual_address:08X} "
            f"VSz=0x{virtual_size:08X} "
            f"RawPtr=0x{pointer_to_raw_data:08X} "
            f"RawSz=0x{size_of_raw_data:08X} "
            f"RawEnd=0x{raw_end:08X} "
            f"Chars=0x{characteristics:08X}"
        )

    print()
    print(f"Computed on-disk file size = 0x{max_raw_end:X} ({max_raw_end})")
    print(f"In-memory image size       = 0x{size_of_image:X} ({size_of_image})")


def main():
    path = ida_kernwin.ask_file(False, "*.bin", "Select dumped payload")
    if not path:
        print("Cancelled")
        return

    with open(path, "rb") as f:
        buf = f.read()

    parse_pe(buf)


if __name__ == "__main__":
    try:
        main()
    except Exception as e:
        print(f"Error: {e}")

Upon running this, we get the exact size we should dump for an on-disk representation of the file.

Figure 22: Output showing header details

Run the dump_ptr_mem_runtime.py script from before again, this time with the computed on-disk file size. This produces the properly reconstructed embedded binary.

Embedded Payload

The final dropped payload has the MD5 hash 96CA9282847651CB806ADAA82E532D17 and can be found on VirusTotal. This payload is the LightPhoenix backdoor attributed to the MuddyWater threat group.

Figure 23: LightPhoenix backdoor