Summary
This post dissects a malware loader that executes an embedded LightPhoenix backdoor payload. LightPhoenix is a C++ backdoor capable of writing files and executing CMD commands. The malware is associated with one of the latest MuddyWater campaigns.
The loader decodes and deobfuscates an embedded MZ payload. Its functionality is fairly straightforward, but it is a good example for discussing several reverse engineering topics and how to work more effectively in IDA Pro.
Who is MuddyWater?
MuddyWater is an Iranian state-sponsored cyber espionage group, active since at least 2017, that is commonly linked to Iran’s Ministry of Intelligence and Security (MOIS). The group focuses on intelligence collection and disruptive operations against government, telecommunications, defense, and critical infrastructure targets across the Middle East, Europe, and North America.
The group is known for relatively simple but effective tradecraft, including spear-phishing, use of legitimate remote administration tools, PowerShell-based backdoors, and custom malware families such as POWERSTATS, MuddyC3, and SimpleTea.
MuddyWater often leverages compromised infrastructure and living-off-the-land techniques to blend into normal network activity, making detection more difficult, and is frequently associated with broader Iranian cyber campaigns aimed at strategic surveillance and regional influence.
Reverse Engineering the loader
What are we analyzing?
This section analyzes the loader sample that unpacks and executes the LightPhoenix malware. The MD5 hashes of both binaries are in the following table (if you want to follow along in IDA/Binja/Ghidra).
| MD5 hash | Description |
|---|---|
| 32F51A376A8277649088047DD61EFDF5 | The malware loader |
| 96CA9282847651CB806ADAA82E532D17 | The embedded payload (LightPhoenix) |
Opening the malware loader binary in IDA Pro
When looking at the WinMain function of the binary, there is a call made through qword_14009AB90. Double-clicking this global variable shows that it has no defined value in the static image, meaning that it must be resolved at runtime before WinMain executes.
Because its result is immediately used in a memcpy call, we can assume that this is a call to VirtualAlloc, but we should confirm this.
Initialization Functions
When the binary begins to run, many initialization steps are taken before it reaches the WinMain function. We can look at the function that calls WinMain and locate where the initialization functions are iterated through in order to investigate them.
Find the call to the initterm function; it shows the address of the table of functions that are called as part of initialization, starting at &First. Viewing this location reveals a series of functions to investigate.
We will look at only one of these, as they all do the same thing. Each performs a SUB operation to decode a string into either a library or function name. The name is then used in a LoadLibrary or GetProcAddress call to obtain a pointer that is saved to a global variable.
Decoding the string results in “Cabinet.dll”, and a pointer to it is saved into the global variable &unk_14009ABD0. I then repeat this review for all of the functions and rename the global variables to make it clear which function each one points to. This makes the rest of the decompilation easier to understand.
With all of the functions understood, they are renamed and most importantly, the global variables used everywhere else are named properly as “g_func_X”.
Decoding of Embedded Binary
Coming back to the WinMain function, the renamed global variables now show up and make sense of what was previously unresolved. The g_payloadBytes global variable points to an address in the binary holding a very large blob of bytes, which is the embedded payload. The payload bytes are copied into a newly allocated memory region and passed into sub_140001480 along with a long string.
The function is similar to what we saw with the module and function decoding. The long string is a rotating XOR key and it XOR decodes the entirety of the payload bytes.
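A rotating (repeating-key) XOR is its own inverse, so the same routine both encodes and decodes. The sketch below shows the general technique; `HYPOTHETICAL_KEY` is a placeholder, since the actual key string from the sample is not reproduced in this post.

```python
from itertools import cycle


def rotating_xor(data: bytes, key: bytes) -> bytes:
    """XOR every data byte with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Placeholder key: the real key string is not given in the post.
HYPOTHETICAL_KEY = b"example-key"

plain = b"MZ\x90\x00 demo payload bytes"
encoded = rotating_xor(plain, HYPOTHETICAL_KEY)
decoded = rotating_xor(encoded, HYPOTHETICAL_KEY)
print(decoded == plain)
```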
The second function, sub_140001500, is called right after. It allocates memory on the heap and selectively copies bytes across. The copy is performed in 10-byte chunks, skipping the 11th byte each time.
Looking at the remainder of the WinMain function, we can infer what the unknown sub_140002970 is intended to return. The malware is going to use the handle of its own process to create a new thread within its virtual address space.
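The chunked copy can be modeled as "keep 10 bytes, drop 1, repeat". The following is a minimal sketch of that behavior; `drop_every_nth` is a name chosen here, and how the sample handles a trailing partial chunk is an assumption (this version simply keeps whatever bytes remain).

```python
def drop_every_nth(data: bytes, keep: int = 10) -> bytes:
    """Copy `keep` bytes, skip the next one, and repeat until the input ends."""
    stride = keep + 1
    out = bytearray()
    for off in range(0, len(data), stride):
        out += data[off:off + keep]
    return bytes(out)


# 22 input bytes: offsets 10 and 21 are the skipped filler bytes.
sample = bytes(range(22))
print(drop_every_nth(sample).hex())
```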
Jumping into the sub_140002970 function, we can see several offsets being used. These should be quickly identifiable purely based on the initial a1 + 0x3C offset.
Resolving PE Header Structures
The first offset we see with 0x3C translates to e_lfanew -> IMAGE_NT_HEADERS. We can resolve this in IDA by applying the known type. Selecting the variable and pressing “Y” will bring up the dialog to change the name and type of the variable. Once this is performed, the unknown offsets resolve to reveal they are referencing OptionalHeader entries.
At the end we see nt_headers + 1 which is not referring to another header but rather the section table. It is effectively stating IMAGE_FIRST_SECTION(nt_headers). IDA is treating this as pointer arithmetic instead of being a PIMAGE_SECTION_HEADER.
We can re-type this to what it actually is and see that it is looping through all section headers.
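To see why `nt_headers + 1` lands on the section table, compare it with what `IMAGE_FIRST_SECTION` computes. The sketch below assumes a 64-bit (PE32+) image, where `IMAGE_OPTIONAL_HEADER64` is 0xF0 bytes; the two offsets agree only when `SizeOfOptionalHeader` has that standard value. The function names are illustrative.

```python
import struct

SIZEOF_SIGNATURE = 4             # "PE\0\0"
SIZEOF_FILE_HEADER = 20          # IMAGE_FILE_HEADER
SIZEOF_OPTIONAL_HEADER64 = 0xF0  # IMAGE_OPTIONAL_HEADER64 (assumes PE32+)
SIZEOF_NT_HEADERS64 = SIZEOF_SIGNATURE + SIZEOF_FILE_HEADER + SIZEOF_OPTIONAL_HEADER64


def first_section_offset(buf: bytes) -> int:
    """Equivalent of IMAGE_FIRST_SECTION: uses SizeOfOptionalHeader from the file header."""
    e_lfanew = struct.unpack_from("<I", buf, 0x3C)[0]
    size_opt = struct.unpack_from("<H", buf, e_lfanew + SIZEOF_SIGNATURE + 16)[0]
    return e_lfanew + SIZEOF_SIGNATURE + SIZEOF_FILE_HEADER + size_opt


def nt_headers_plus_one(buf: bytes) -> int:
    """Equivalent of the decompiler's `nt_headers + 1` pointer arithmetic."""
    e_lfanew = struct.unpack_from("<I", buf, 0x3C)[0]
    return e_lfanew + SIZEOF_NT_HEADERS64


# Synthetic header: e_lfanew = 0x80, SizeOfOptionalHeader = 0xF0.
demo = bytearray(0x200)
struct.pack_into("<I", demo, 0x3C, 0x80)
struct.pack_into("<H", demo, 0x80 + 4 + 16, SIZEOF_OPTIONAL_HEADER64)
print(hex(first_section_offset(demo)), hex(nt_headers_plus_one(demo)))
```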
What is it doing?
It takes the PE file stored in payloadBytes and reconstructs it in memory at new_Mem, section by section. This is exactly what the Windows loader normally does, but in this case the malware is doing it manually.
There are two separate scenarios for whether it will perform a memcpy on the section or a memset.
- Section has raw data = perform a memcpy
- Section has no raw data (BSS-style section) = perform a memset (zero-fills the section)
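The two cases can be modeled in Python with a `bytearray` standing in for the allocated region. This is a simplified sketch, not the loader's exact logic: the `Section` tuple and `map_sections` are illustrative names, and the demo buffer is prefilled with 0xAA only so the zero-fill is visible.

```python
from typing import Iterable, NamedTuple


class Section(NamedTuple):
    virtual_address: int
    virtual_size: int
    pointer_to_raw_data: int
    size_of_raw_data: int


def map_sections(raw: bytes, sections: Iterable[Section], image: bytearray) -> None:
    """Place each section at its VirtualAddress inside `image` (modified in place)."""
    for s in sections:
        if s.size_of_raw_data:
            # memcpy case: copy the raw bytes from the file layout
            chunk = raw[s.pointer_to_raw_data:s.pointer_to_raw_data + s.size_of_raw_data]
            image[s.virtual_address:s.virtual_address + len(chunk)] = chunk
        else:
            # memset case: no raw data, so zero-fill the virtual range
            image[s.virtual_address:s.virtual_address + s.virtual_size] = bytes(s.virtual_size)


raw_file = bytes(0x10) + b"\xCC" * 8
image = bytearray(b"\xAA" * 0x40)
map_sections(raw_file, [Section(0x20, 8, 0x10, 8), Section(0x30, 0x10, 0, 0)], image)
print(image[0x20:0x28].hex(), image[0x30:0x40].hex())
```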
This is what the Windows loader does:
1. Reads PE headers
2. Allocates memory
3. Maps sections
4. Fixes imports
5. Applies relocations
6. Calls entry point
In this case, the code is doing step 3 manually. This pattern is extremely common in reflective loaders, packers, and fileless malware.
| Technique | Typical behavior |
|---|---|
| Reflective loaders | Load DLL/EXE from memory; no disk write |
| Packers | Unpack payload into memory; jump to it |
| Fileless malware | Inject payload into another process |
Import Resolution Phase
The previous loop mapped sections into memory. This next block walks the Import Directory and fills in the IAT so the unpacked PE can call Windows APIs normally.
What it does at a high level
It is manually doing the equivalent of what the Windows loader does for imports:
- Read the PE Import Directory
- For each imported DLL:
  - load or locate the DLL
  - For each imported function:
    - resolve its address with GetProcAddress
    - Write the resolved address into the unpacked image’s IAT
DataDirectory[1] is the Import Directory. Re-typing the v17 variable to the proper type resolves the remaining offsets to named fields.
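The import walk above can be sketched against a mapped image held in a `bytearray`. This is a model under stated assumptions: it handles 64-bit thunks only, and the `resolver` callback is a hypothetical stand-in for the LoadLibrary/GetProcAddress pair, so the addresses in the demo are fake.

```python
import struct

ORDINAL_FLAG64 = 1 << 63  # IMAGE_ORDINAL_FLAG64


def read_cstr(image: bytearray, rva: int) -> str:
    """Read a NUL-terminated ASCII string at the given RVA."""
    end = image.index(b"\x00", rva)
    return image[rva:end].decode("ascii")


def resolve_imports(image: bytearray, import_rva: int, resolver) -> None:
    """Walk IMAGE_IMPORT_DESCRIPTOR entries and fill each IAT slot."""
    desc = import_rva
    while True:
        oft, _stamp, _fwd, name_rva, first_thunk = struct.unpack_from("<5I", image, desc)
        if name_rva == 0 and first_thunk == 0:
            break  # an all-zero descriptor terminates the array
        dll = read_cstr(image, name_rva)
        lookup = oft or first_thunk  # fall back to the IAT if there is no lookup table
        slot = first_thunk
        while True:
            entry = struct.unpack_from("<Q", image, lookup)[0]
            if entry == 0:
                break
            if entry & ORDINAL_FLAG64:
                symbol = entry & 0xFFFF              # import by ordinal
            else:
                symbol = read_cstr(image, entry + 2)  # skip the 2-byte Hint
            struct.pack_into("<Q", image, slot, resolver(dll, symbol))
            lookup += 8
            slot += 8
        desc += 20  # sizeof(IMAGE_IMPORT_DESCRIPTOR)


# Synthetic image: one DLL, one import by name and one by ordinal.
image = bytearray(0x100)
struct.pack_into("<5I", image, 0x10, 0x40, 0, 0, 0x80, 0x60)
struct.pack_into("<2Q", image, 0x40, 0x90, ORDINAL_FLAG64 | 5)
image[0x80:0x8D] = b"kernel32.dll\x00"
image[0x92:0x98] = b"Sleep\x00"

fake = {("kernel32.dll", "Sleep"): 0x7FF800001000, ("kernel32.dll", 5): 0x7FF800002000}
resolve_imports(image, 0x10, lambda dll, sym: fake[(dll, sym)])
print([hex(v) for v in struct.unpack_from("<2Q", image, 0x60)])
```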
Base Relocation Phase
At this point, the loader has already parsed the PE, allocated memory, copied sections, and resolved imports. It now fixes up absolute addresses because the image was not loaded at its preferred ImageBase.
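The fix-up step follows the standard base relocation format: a series of blocks, each with a page RVA, a block size, and 16-bit entries whose top 4 bits are the type and low 12 bits the offset within the page. The sketch below handles only `IMAGE_REL_BASED_DIR64` and skips everything else, which is a simplification; `apply_relocations` and the demo addresses are illustrative.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0  # padding entry, no fix-up
IMAGE_REL_BASED_DIR64 = 10    # 64-bit absolute address


def apply_relocations(image: bytearray, reloc_rva: int, reloc_size: int, delta: int) -> int:
    """Add `delta` (actual base minus preferred ImageBase) to every DIR64 target."""
    patched = 0
    pos, end = reloc_rva, reloc_rva + reloc_size
    while pos < end:
        page_rva, block_size = struct.unpack_from("<II", image, pos)
        if block_size < 8:
            break  # malformed or terminating block
        count = (block_size - 8) // 2
        for (entry,) in struct.iter_unpack("<H", image[pos + 8:pos + 8 + count * 2]):
            kind, offset = entry >> 12, entry & 0xFFF
            if kind == IMAGE_REL_BASED_DIR64:
                target = page_rva + offset
                value = struct.unpack_from("<Q", image, target)[0]
                struct.pack_into("<Q", image, target, (value + delta) & 0xFFFFFFFFFFFFFFFF)
                patched += 1
            # other types (including ABSOLUTE padding) are skipped in this sketch
        pos += block_size
    return patched


# Synthetic image: one pointer at RVA 0x40, one relocation block at RVA 0x80.
image = bytearray(0x100)
struct.pack_into("<Q", image, 0x40, 0x140001000)
struct.pack_into("<IIHH", image, 0x80, 0x0, 12, (IMAGE_REL_BASED_DIR64 << 12) | 0x40, 0)
delta = 0x20000000 - 0x140000000  # loaded base minus preferred base (demo values)
patched = apply_relocations(image, 0x80, 12, delta)
print(patched, hex(struct.unpack_from("<Q", image, 0x40)[0]))
```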
Final memory-protection stage
At this point the loader has already mapped the PE into memory, resolved imports, and applied relocations. It now changes the mapped memory from broad writable/executable loader permissions to section-appropriate protections based on each section’s Characteristics.
That is usually one of the last steps before execution transfers to the unpacked payload.
Since Characteristics is being treated as a signed integer, the sign check is effectively a test of the high bit, IMAGE_SCN_MEM_WRITE (0x80000000):
- if Characteristics >= 0 -> high bit not set -> not writable
- if Characteristics < 0 -> high bit set -> writable
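The signed comparison and a typical mapping from section flags to page protections can be sketched as follows. The flag and `PAGE_*` values are the standard Windows constants, but the exact mapping the sample uses is not shown in this post, so `section_protection` is an assumed, conventional choice.

```python
import struct

IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000

PAGE_NOACCESS = 0x01
PAGE_READONLY = 0x02
PAGE_READWRITE = 0x04
PAGE_EXECUTE = 0x10
PAGE_EXECUTE_READ = 0x20
PAGE_EXECUTE_READWRITE = 0x40


def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as signed, as the decompiler shows it."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


def is_writable(characteristics: int) -> bool:
    """High bit set means negative when signed, which means IMAGE_SCN_MEM_WRITE."""
    return as_signed32(characteristics) < 0


def section_protection(characteristics: int) -> int:
    """Assumed conventional mapping from section flags to a page protection."""
    executable = bool(characteristics & IMAGE_SCN_MEM_EXECUTE)
    readable = bool(characteristics & IMAGE_SCN_MEM_READ)
    if is_writable(characteristics):
        return PAGE_EXECUTE_READWRITE if executable else PAGE_READWRITE
    if readable:
        return PAGE_EXECUTE_READ if executable else PAGE_READONLY
    return PAGE_EXECUTE if executable else PAGE_NOACCESS


# Typical Characteristics values for .text, .rdata, and .data sections.
for name, chars in ((".text", 0x60000020), (".rdata", 0x40000040), (".data", 0xC0000040)):
    print(name, is_writable(chars), hex(section_protection(chars)))
```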
Now back in the WinMain function, we can fully understand what the startAddress is now pointing to. This is used to create a new thread to execute it in memory.
Saving the embedded payload to disk
To extract the payload from the loader, I wrote some Python scripts to dump the payload from memory right after it is written. We can set a breakpoint right after the bytes are copied to the heap. The memory pointed to by &ptr_mem should then contain the proper bytes to dump.
The address to know at this point is the &ptr_mem address. In this case it is 0x5F6FD0.
We can attempt to dump out the beginning of the file to determine if we are getting a good binary. The following script can be executed to dump the first 0x1000 bytes to validate it is dumping correctly.
Make sure you enter the correct payload address and the test size when prompted:
- Payload address: 0x5F6FD0
- Payload size: 0x1000
dump_ptr_mem_runtime.py
import struct

import ida_kernwin
import idaapi
import idc


def ask_hex(prompt: str, default: str = "") -> int:
    s = ida_kernwin.ask_str(default, 0, prompt)
    if not s:
        raise RuntimeError("Cancelled")
    return int(s, 16)


def ask_path(default_name: str) -> str:
    path = ida_kernwin.ask_file(True, default_name, "Save dumped payload")
    if not path:
        raise RuntimeError("Cancelled")
    return path


def read_dbg_mem(ea: int, size: int) -> bytes:
    data = idc.read_dbg_memory(ea, size)
    if not data or len(data) != size:
        raise RuntimeError(f"Failed reading memory at 0x{ea:X}, size 0x{size:X}")
    return data


def validate_pe(buf: bytes) -> None:
    if len(buf) < 0x40:
        print("[!] Too small")
        return
    if buf[:2] != b"MZ":
        print("[!] Missing MZ")
        print(f" First bytes: {buf[:16].hex()}")
        return
    e_lfanew = struct.unpack_from("<I", buf, 0x3C)[0]
    print(f"[*] e_lfanew = 0x{e_lfanew:X}")
    if e_lfanew + 4 > len(buf):
        print("[!] e_lfanew out of range")
        return
    sig = buf[e_lfanew:e_lfanew+4]
    if sig != b"PE\x00\x00":
        print(f"[!] Invalid PE sig at 0x{e_lfanew:X}: {sig.hex()}")
        return
    print("[+] Valid MZ/PE detected")
    file_hdr = e_lfanew + 4
    machine = struct.unpack_from("<H", buf, file_hdr)[0]
    num_sections = struct.unpack_from("<H", buf, file_hdr + 2)[0]
    size_opt = struct.unpack_from("<H", buf, file_hdr + 16)[0]
    opt_off = file_hdr + 20
    magic = struct.unpack_from("<H", buf, opt_off)[0]
    ep = struct.unpack_from("<I", buf, opt_off + 0x10)[0]
    print(f"[*] Machine : 0x{machine:04X}")
    print(f"[*] NumberOfSections : {num_sections}")
    print(f"[*] SizeOfOptionalHdr : 0x{size_opt:X}")
    print(f"[*] OptionalHdr.Magic : 0x{magic:04X}")
    print(f"[*] AddressOfEntryPoint: 0x{ep:X}")


def main():
    if not idaapi.is_debugger_on():
        raise RuntimeError("Debugger is not active")
    payload_ea = ask_hex("Payload address", "0x5F6FD0")
    payload_size = ask_hex("Payload size", "0x100000")
    out_path = ask_path("payload_manual_dump.bin")
    print(f"[*] Dumping 0x{payload_size:X} bytes from 0x{payload_ea:X}")
    buf = read_dbg_mem(payload_ea, payload_size)
    with open(out_path, "wb") as f:
        f.write(buf)
    print(f"[+] Wrote {len(buf):,} bytes to {out_path}")
    validate_pe(buf)


if __name__ == "__main__":
    try:
        main()
    except Exception as e:
        print(f"[!] Error: {e}")
The bytes will be dumped to the specified location.
In Detect-it-Easy we can see, even from just the first 4KB, that the dump appears correct.
The following script reads the headers and prints the sizes needed for a full dump. Run it against the previously dumped 4KB file to determine what the real size should be.
get_header_info.py
import struct

import ida_kernwin


def u16(b, o):
    return struct.unpack_from("<H", b, o)[0]


def u32(b, o):
    return struct.unpack_from("<I", b, o)[0]


def parse_pe(buf: bytes):
    if len(buf) < 0x40:
        raise ValueError("Too small")
    if buf[:2] != b"MZ":
        raise ValueError("Missing MZ")
    e_lfanew = u32(buf, 0x3C)
    if e_lfanew + 4 > len(buf):
        raise ValueError(f"e_lfanew out of range: 0x{e_lfanew:X}")
    if buf[e_lfanew:e_lfanew+4] != b"PE\x00\x00":
        raise ValueError("Missing PE signature")
    file_hdr = e_lfanew + 4
    machine = u16(buf, file_hdr + 0x00)
    num_sections = u16(buf, file_hdr + 0x02)
    size_opt = u16(buf, file_hdr + 0x10)
    opt = file_hdr + 20
    magic = u16(buf, opt + 0x00)
    size_of_image = u32(buf, opt + 0x38)
    size_of_headers = u32(buf, opt + 0x3C)
    sec_off = opt + size_opt
    print(f"e_lfanew = 0x{e_lfanew:X}")
    print(f"Machine = 0x{machine:04X}")
    print(f"NumberOfSections = {num_sections}")
    print(f"OptionalMagic = 0x{magic:04X}")
    print(f"SizeOfImage = 0x{size_of_image:X}")
    print(f"SizeOfHeaders = 0x{size_of_headers:X}")
    print()
    max_raw_end = size_of_headers
    for i in range(num_sections):
        s = sec_off + i * 0x28
        name = buf[s:s+8].split(b"\x00", 1)[0].decode(errors="ignore")
        virtual_size = u32(buf, s + 0x08)
        virtual_address = u32(buf, s + 0x0C)
        size_of_raw_data = u32(buf, s + 0x10)
        pointer_to_raw_data = u32(buf, s + 0x14)
        characteristics = u32(buf, s + 0x24)
        raw_end = pointer_to_raw_data + size_of_raw_data
        if raw_end > max_raw_end:
            max_raw_end = raw_end
        print(
            f"[{i}] {name:<8} "
            f"VA=0x{virtual_address:08X} "
            f"VSz=0x{virtual_size:08X} "
            f"RawPtr=0x{pointer_to_raw_data:08X} "
            f"RawSz=0x{size_of_raw_data:08X} "
            f"RawEnd=0x{raw_end:08X} "
            f"Chars=0x{characteristics:08X}"
        )
    print()
    print(f"Computed on-disk file size = 0x{max_raw_end:X} ({max_raw_end})")
    print(f"In-memory image size = 0x{size_of_image:X} ({size_of_image})")


def main():
    path = ida_kernwin.ask_file(False, "*.bin", "Select dumped payload")
    if not path:
        print("Cancelled")
        return
    with open(path, "rb") as f:
        buf = f.read()
    parse_pe(buf)


if __name__ == "__main__":
    try:
        main()
    except Exception as e:
        print(f"Error: {e}")
Upon running this, we get the exact size we should dump for an on-disk representation of the file.
Run the dump_ptr_mem_runtime.py script from before again, this time entering the computed on-disk file size as the payload size. This produces the complete embedded binary.
Embedded Payload
The final dropped payload has the MD5 hash 96CA9282847651CB806ADAA82E532D17 and can be found on VirusTotal. This payload is the LightPhoenix backdoor attributed to the MuddyWater threat group.






















