Introduction to Malware Analysis
Key Takeaways
- Before analyzing a sample, knowing the major malware families shapes your approach and tells you what to look for.
- Static analysis extracts information from the file without running it.
- Dynamic analysis means executing the sample in a controlled environment and observing what it does.
- When sandbox and dynamic analysis aren't enough (the sample is evasive, you need to understand the exact encryption algorithm, you need to write a decoder for the C2 protocol, or you're documenting a novel malware family), you need a disassembler.
- Thorough IOC extraction is the final deliverable of malware analysis.
Every piece of malware that gets analyzed started as an unknown executable on a compromised machine. Incident responders triaging a breach, threat intelligence analysts tracking a new campaign, and malware researchers trying to understand what the latest ransomware group deployed all face the same initial problem: what does this thing do, and what did it do to this system?
Malware analysis answers that question. The methodologies — static analysis, dynamic analysis, and code-level reverse engineering — form a tiered approach where each layer reveals more detail while demanding more skill and time. This guide covers professional-level malware analysis from initial triage through Ghidra-based reverse engineering, with working tool commands and code examples throughout.
Real Malware Families Worth Understanding
Before analyzing a sample, knowing the major malware families shapes your approach and tells you what to look for:
Emotet (2014-2021, resurfaced November 2021): One of the most prolific malware loaders on record. Originally a banking trojan, it evolved into a dropper-for-hire that distributed TrickBot, QakBot, and eventually Ryuk ransomware. Emotet used Word macro documents and PowerShell for initial delivery, polymorphic code to evade signatures, and a tiered C2 infrastructure built on compromised hosts. The global coordinated takedown in January 2021 (Europol + FBI + multiple national agencies) seized 700+ servers and disrupted its botnet for about 10 months before it resurfaced.
LockBit Ransomware (2019-2024): The most prolific ransomware-as-a-service operation until law enforcement action in February 2024. LockBit 3.0 (Black) implemented novel anti-analysis techniques including an encryption key passed on the command line (the sample cannot decrypt itself without the key, defeating static analysis). It targeted print spoolers and domain controllers for lateral movement, used MITRE ATT&CK T1490 (inhibit system recovery) to delete VSS copies, and achieved encryption speeds of ~4GB/minute using AES-NI. Estimated total ransoms: $120 million+ before disruption.
Pegasus (NSO Group, 2016-present): Commercial spyware sold to governments. It has used multiple iOS zero-days, including FORCEDENTRY (CVE-2021-30860), a zero-click exploit against the JBIG2 decoder in Apple's image processing code. The exploit used an integer overflow in the JBIG2 decoder to gain arbitrary code execution from a maliciously crafted PDF delivered over iMessage, with no user interaction required. Citizen Lab's forensic analysis was the first to document the exploit; Amnesty International's MVT (Mobile Verification Toolkit) is the tool commonly used to check devices for the resulting technical indicators.
XZ Utils Backdoor (CVE-2024-3094): Not ransomware or spyware — a supply chain attack in a fundamental Linux compression library. A malicious contributor (operating as "Jia Tan") spent two years building trust in the xz-utils project before inserting a backdoor that modified the OpenSSH authentication flow, potentially allowing remote code execution as root. Discovered by accident when a Microsoft engineer noticed unexplained 500ms SSH login latency on Debian Sid. CVSS score: 10.0. If widely deployed, this would have been the most significant Linux backdoor in history.
Safe Analysis Environment Setup
Never analyze malware on your host machine, on a machine connected to your corporate network, or on any system whose compromise you aren't prepared for. Assume every sample can detect the VM environment and either go dormant or attempt to escape. Modern malware routinely checks for VM artifacts before deploying its payload.
The Baseline Lab Architecture
[Host Machine]
├── VMware Workstation / VirtualBox (Hypervisor)
│ ├── [Flare-VM] (Windows 10/11 — analysis workstation)
│ │ IP: 192.168.100.2
│ │ Network: Host-only adapter
│ │ Snapshot: "Clean-Baseline" (taken before any analysis)
│ │
│ └── [REMnux] (Linux — network simulation)
│ IP: 192.168.100.1
│ Network: Host-only adapter
│ Runs: INetSim, FakeNet-NG (simulates DNS/HTTP/SMTP/FTP)
│
└── NO internet connection from analysis VMs
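Before booting either VM, it can help to sanity-check the addressing plan in the diagram above. The following is a minimal sketch, not part of any lab tooling: `check_lab_plan` and its parameters are hypothetical names, and it only validates the plan on paper (same private subnet, DNS pointed at the services VM). It cannot confirm that the hypervisor adapter is really host-only.

```python
import ipaddress


def check_lab_plan(analysis_ip, services_ip, dns_server, prefix=24):
    """Return a list of problems found in a host-only lab addressing plan."""
    problems = []
    analysis = ipaddress.ip_address(analysis_ip)
    services = ipaddress.ip_address(services_ip)
    # Derive the lab subnet from the services VM address (assumed /24 by default)
    lab_net = ipaddress.ip_network(f"{services_ip}/{prefix}", strict=False)

    for label, addr in (("analysis VM", analysis), ("services VM", services)):
        if not addr.is_private:
            problems.append(f"{label} address {addr} is not in a private range")
    if analysis not in lab_net:
        problems.append(f"analysis VM {analysis} is outside {lab_net}")
    if ipaddress.ip_address(dns_server) != services:
        problems.append(f"DNS server {dns_server} is not the services VM {services}")
    return problems


if __name__ == "__main__":
    # Values taken from the diagram: Flare-VM, REMnux, and DNS pointed at REMnux
    issues = check_lab_plan("192.168.100.2", "192.168.100.1", "192.168.100.1")
    print(issues or "addressing plan looks consistent")
```

An empty list means the plan is internally consistent; it says nothing about whether a second, internet-facing adapter is still attached to the VM.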
# Set up Flare-VM (run in a clean Windows 10 VM with admin rights)
# Install from https://github.com/mandiant/flare-vm
Set-ExecutionPolicy Unrestricted
$env:Path += ";C:\Python312"
(New-Object net.webclient).DownloadFile('https://raw.githubusercontent.com/mandiant/flare-vm/main/install.ps1', "$env:TEMP\install.ps1")
Unblock-File "$env:TEMP\install.ps1"
& "$env:TEMP\install.ps1" # run the script from the location it was downloaded to
# This installs: IDA Free, x64dbg, Ghidra, PEView, CFF Explorer, Wireshark,
# ProcessMonitor, ProcessHacker, PE-bear, FLOSS, strings, binwalk, and 100+ more tools
# REMnux setup: on a separate Linux VM
# Download from https://remnux.org/
# After import, configure INetSim to simulate network services
# /etc/inetsim/inetsim.conf — configure fake network services
# Enable all services so malware believes it has real internet connectivity
service_bind_address 192.168.100.1 # REMnux IP on the host-only network
dns_default_ip 192.168.100.1 # IP returned for every DNS query (the default is 127.0.0.1)
inetsim &
# From Flare-VM, configure DNS to point to REMnux:
# Settings → Network → DNS: 192.168.100.1
# Now any DNS query from Flare-VM goes to INetSim, which resolves everything to 192.168.100.1
# HTTP requests return a default page, SMTP accepts mail, FTP accepts connections
Anti-Analysis Bypass
Modern malware checks for virtual environments before deploying its payload. Common checks:
# Common VM detection techniques malware uses:
# 1. Check for VMware registry keys
# HKLM\SOFTWARE\VMware, Inc.\VMware Tools
# HKLM\HARDWARE\DEVICEMAP\Scsi\Scsi Port 0\Scsi Bus 0\Target Id 0\Logical Unit Id 0: Identifier = VBOX HARDDISK
# 2. Check for VM-specific processes
# vmtoolsd.exe, vmwaretray.exe, vboxservice.exe, VBoxTray.exe
# 3. Check for specific hardware (VM-typical values)
# CPU cores: 1-2 (physical machines typically have more)
# RAM: 1-4GB (VM baseline)
# Screen resolution: default VM resolution
# 4. Timing checks (CPUID/RDTSC are slower in VMs due to hypervisor interception)
# 5. Disk size checks (VMs often have small disks)
# 6. Check for analysis tool processes (procmon.exe, wireshark.exe, ollydbg.exe)
# Bypass strategies:
# - Give the VM 4+ cores and 8+ GB RAM
# - Install VMware Tools but then remove the registry artifacts
# - Use Pafish to check what fingerprints your VM exposes
# - Use al-khaser to test your VM against 100+ anti-analysis checks
# - Use ScyllaHide (available as an x64dbg plugin) to hide the debugger
# Check your VM's fingerprint before analyzing malware
# Download Pafish: https://github.com/a0rtega/pafish
.\pafish.exe
# Lists all VM/sandbox detection methods that would trigger
# Fix each one before detonating a sample
# Remove VMware artifacts that malware checks
Remove-ItemProperty -Path "HKLM:\SOFTWARE\VMware, Inc.\VMware Tools" -Name "InstallPath" -ErrorAction SilentlyContinue
# Note: this may break some VMware functionality, so snapshot before doing this
Static Analysis: No Execution Required
Static analysis extracts information from the file without running it. It's safe, fast, and often reveals enough to identify the malware family, understand its capabilities, and extract network indicators.
Step 1: Hashing and Threat Intelligence
# Compute file hashes — SHA256 is the standard identifier
sha256sum malware.exe
md5sum malware.exe # Legacy compatibility, also useful for VirusTotal search
ssdeep malware.exe # Fuzzy hash — detects slight variants of the same sample
# Submit to VirusTotal (if the sample is not sensitive)
# Web UI: virustotal.com
# API:
vt file <sha256_hash>
# Or submit the file directly (only if engagement allows — public VirusTotal means all subscribers can see it)
vt scan file malware.exe
# Check other threat intel platforms
# AbuseIPDB: https://www.abuseipdb.com (for C2 IPs extracted from the sample)
# URLhaus: https://urlhaus.abuse.ch (for malicious URLs/domains)
# MalwareBazaar: https://bazaar.abuse.ch (sample search, IOC lookup)
# Hybrid Analysis: https://www.hybrid-analysis.com (sandbox + static analysis)
# IMPORTANT: Before submitting to ANY public platform, check your engagement rules
# If the sample contains or was found on client infrastructure, it may have embedded IPs,
# internal hostnames, or proprietary information. Public submission = public disclosure.
Step 2: Strings Extraction
# Basic strings extraction — 4-character minimum by default, increase to reduce noise
strings -n 8 malware.exe | sort -u > strings_ascii.txt
# Unicode strings (16-bit encoding — common in Windows applications)
strings -e l -n 8 malware.exe | sort -u > strings_unicode.txt
# Combine both
strings -n 8 malware.exe > strings_all.txt
strings -e l -n 8 malware.exe >> strings_all.txt
sort -u strings_all.txt > strings_deduped.txt
# Extract interesting categories from strings
# Network indicators
grep -iE '([0-9]{1,3}\.){3}[0-9]{1,3}' strings_deduped.txt # IP addresses
grep -iE 'https?://[a-zA-Z0-9./\-]+' strings_deduped.txt # URLs
grep -iE '[a-z0-9.-]+\.(com|net|org|io|ru|cn|onion)' strings_deduped.txt # Domains
# Windows API calls (visible in plain text before import obfuscation)
grep -i "VirtualAlloc\|CreateRemoteThread\|WriteProcessMemory\|LoadLibrary" strings_deduped.txt
# File system operations
grep -i "\\\\Windows\|\\\\System32\|AppData\|Temp\\\\" strings_deduped.txt
# Registry persistence
grep -i "CurrentVersion\\\\Run\|SOFTWARE\\\\Microsoft" strings_deduped.txt
# Encryption indicators
grep -i "CryptEncrypt\|CryptAcquireContext\|bcrypt\|AES\|RSA" strings_deduped.txt
# C2 communication patterns
grep -i "User-Agent\|GET\|POST\|HTTP" strings_deduped.txt
# Mutex names (unique per malware family, excellent detection IOC)
grep -i "Global\\\\\|\\\\Global\\\\" strings_deduped.txt
# FLOSS: FLARE Obfuscated String Solver
# Recovers strings that are constructed at runtime (stack strings, encoded strings)
# These are invisible to standard 'strings' but are recovered by FLOSS
floss malware.exe > floss_output.txt
# FLOSS output categories:
# STATIC STRINGS: same as strings command
# STACK STRINGS: strings built on the stack at runtime (encryption key loading pattern)
# TIGHT STRINGS: strings decoded in short loops (common in packed samples)
# DECODED STRINGS: strings that were base64/XOR decoded before use
What the strings tell you:
Suspicious strings and their implications:
"cmd.exe /c" + "powershell -enc" → Command execution capability
"CreateRemoteThread" + "VirtualAllocEx" + "WriteProcessMemory" → Process injection
"net.exe user /add" + "net.exe localgroup administrators" → Privilege escalation
"HKCU\Software\Microsoft\Windows\CurrentVersion\Run" → Registry persistence
"bcrypt.dll" + file enumeration → Ransomware candidate
"GetTickCount" + "QueryPerformanceCounter" → Anti-sandbox timing check
".onion" → Tor-based C2
"Mozilla/5.0" → Custom HTTP client mimicking legitimate browser
"MachineGuid" → Unique machine fingerprinting for victim tracking
Global\\<unique_string> → Mutex (prevents multiple instances — IOC for YARA rule)
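The mapping above can be applied mechanically to a strings dump. This is a minimal sketch under stated assumptions: `CAPABILITY_RULES` and `infer_capabilities` are hypothetical names, the rules cover only part of the table, and the choice of `any` versus `all` per rule is an illustrative judgment. A hit is a lead to verify, not a verdict.

```python
# Each rule: (capability label, substrings to look for, any/all combinator).
# The rule set mirrors part of the table above; it is illustrative only.
CAPABILITY_RULES = [
    ("Command execution", ["cmd.exe /c", "powershell -enc"], any),
    ("Process injection",
     ["CreateRemoteThread", "VirtualAllocEx", "WriteProcessMemory"], all),
    ("Registry persistence", ["CurrentVersion\\Run"], any),
    ("Anti-sandbox timing check",
     ["GetTickCount", "QueryPerformanceCounter"], all),
    ("Tor-based C2", [".onion"], any),
    ("Victim fingerprinting", ["MachineGuid"], any),
]


def infer_capabilities(strings):
    """Return capability labels whose indicator strings appear in the dump."""
    haystack = "\n".join(strings).lower()
    hits = []
    for label, needles, combine in CAPABILITY_RULES:
        # Case-insensitive substring match for every needle in the rule
        if combine(needle.lower() in haystack for needle in needles):
            hits.append(label)
    return hits


if __name__ == "__main__":
    dump = ["VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread",
            "http://examplehiddenservice.onion/gate"]
    print(infer_capabilities(dump))
```

Feeding it the combined output of `strings` and FLOSS gives a quick capability shortlist to confirm during dynamic analysis.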
Step 3: PE Header Analysis
The Portable Executable (PE) format is the Windows executable container. Its headers contain metadata that reveals the file's capabilities, compilation details, and structure before a single instruction executes.
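To make the header layout concrete, here is a minimal sketch that reads the first few fields by hand with `struct`, with no third-party library. It relies only on the documented PE layout: the `MZ` magic at offset 0, the `e_lfanew` pointer at offset 0x3C, the `PE\0\0` signature, and the 20-byte COFF file header that follows. The function name `read_coff_header` and the returned dictionary keys are illustrative choices.

```python
import struct
from datetime import datetime, timezone

MACHINE_NAMES = {0x014C: "x86", 0x8664: "x86-64", 0xAA64: "ARM64"}
IMAGE_FILE_DLL = 0x2000  # Characteristics flag: file is a DLL


def read_coff_header(data: bytes) -> dict:
    """Parse the DOS stub pointer and COFF file header from raw PE bytes."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew: file offset of the PE signature, stored at 0x3C
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file: missing PE signature")
    if len(data) < e_lfanew + 24:
        raise ValueError("truncated COFF header")
    # COFF header: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics
    (machine, n_sections, timestamp, _symtab, _nsyms,
     _opt_size, characteristics) = struct.unpack_from("<HHIIIHH", data, e_lfanew + 4)
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "sections": n_sections,
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat(),
        "is_dll": bool(characteristics & IMAGE_FILE_DLL),
    }


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as fh:
        print(read_coff_header(fh.read(4096)))
```

Reading only the first 4096 bytes is enough for typical samples; a file with an unusually large `e_lfanew` will be reported as truncated rather than misparsed.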
#!/usr/bin/env python3
# PE analysis script (requires: pip install pefile)
import math
import sys
from collections import Counter
from datetime import datetime, timezone

import pefile

# Section characteristic flags (IMAGE_SCN_MEM_*)
SCN_EXECUTE = 0x20000000
SCN_READ = 0x40000000
SCN_WRITE = 0x80000000

SUSPICIOUS_APIS = {
    'CreateRemoteThread': 'Process injection',
    'VirtualAllocEx': 'Remote memory allocation (injection)',
    'WriteProcessMemory': 'Remote memory write (injection)',
    'NtUnmapViewOfSection': 'Process hollowing',
    'CreateProcess': 'Process creation',
    'ShellExecute': 'Process/URL execution',
    'WinExec': 'Command execution',
    'CryptEncrypt': 'Encryption (ransomware)',
    'CryptAcquireContext': 'Crypto context (ransomware)',
    'RegSetValueEx': 'Registry modification (persistence)',
    'IsDebuggerPresent': 'Anti-debug',
    'CheckRemoteDebuggerPresent': 'Anti-debug',
    'GetTickCount': 'Timing (anti-sandbox)',
    'WSAStartup': 'Network capability',
    'InternetOpenUrl': 'HTTP download',
    'URLDownloadToFile': 'File download',
}


def entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def base_api_name(name: str) -> str:
    """Strip the ANSI/Unicode suffix so CreateProcessA and CreateProcessW both match."""
    if name[-1:] in ('A', 'W') and name[:-1] in SUSPICIOUS_APIS:
        return name[:-1]
    return name


def analyze_pe(filepath: str) -> None:
    pe = pefile.PE(filepath)
    print("=" * 60)
    print("PE HEADER ANALYSIS")
    print("=" * 60)

    # Compilation timestamp (often spoofed, but sometimes genuine)
    ts = pe.FILE_HEADER.TimeDateStamp
    compiled = datetime.fromtimestamp(ts, tz=timezone.utc)
    print(f"[*] Compilation timestamp: {compiled.isoformat()}")
    if ts == 0:
        print("    ^ Timestamp zeroed: possible anti-forensics")

    # Architecture
    machine = pe.FILE_HEADER.Machine
    arch = {0x014C: "x86 (32-bit)", 0x8664: "x86-64 (64-bit)",
            0x01C4: "ARM", 0xAA64: "ARM64"}.get(machine, f"Unknown ({hex(machine)})")
    print(f"[*] Architecture: {arch}")

    # Subsystem (GUI app vs console vs driver)
    subsystem = pe.OPTIONAL_HEADER.Subsystem
    subsystems = {1: "Native (driver)", 2: "Windows GUI", 3: "Windows console", 10: "EFI application"}
    print(f"[*] Subsystem: {subsystems.get(subsystem, subsystem)}")

    # Section analysis
    print(f"\n[*] Sections ({len(pe.sections)} total):")
    print(f"    {'Name':<12} {'VirtSize':>10} {'RawSize':>10} {'Entropy':>8} {'Flags'}")
    for section in pe.sections:
        name = section.Name.decode('utf-8', errors='replace').rstrip('\x00')
        vsize = section.Misc_VirtualSize
        rsize = section.SizeOfRawData
        sec_entropy = entropy(section.get_data())
        flags = []
        if section.Characteristics & SCN_EXECUTE:
            flags.append('EXEC')
        if section.Characteristics & SCN_READ:
            flags.append('READ')
        if section.Characteristics & SCN_WRITE:
            flags.append('WRITE')
        flag_str = '+'.join(flags)
        alert = " ← HIGH ENTROPY (packed/encrypted)" if sec_entropy > 7.0 else ""
        print(f"    {name:<12} {vsize:>10} {rsize:>10} {sec_entropy:>8.3f} {flag_str}{alert}")

    # Import analysis
    if hasattr(pe, 'DIRECTORY_ENTRY_IMPORT'):
        print("\n[*] Imported DLLs and key functions:")
        for entry in pe.DIRECTORY_ENTRY_IMPORT:
            dll = entry.dll.decode('utf-8', errors='replace')
            flagged = []
            for imp in entry.imports:
                if imp.name:  # imports by ordinal have no name
                    api = imp.name.decode('utf-8', errors='replace')
                    reason = SUSPICIOUS_APIS.get(base_api_name(api))
                    if reason:
                        flagged.append(f"{api} [{reason}]")
            if flagged:
                print(f"    {dll}:")
                for item in flagged:
                    print(f"      - {item}")


if __name__ == '__main__':
    if len(sys.argv) != 2:
        sys.exit(f"usage: {sys.argv[0]} <pe_file>")
    analyze_pe(sys.argv[1])
What the PE analysis reveals:
- High entropy section (>7.0): The section is compressed or encrypted, which usually means the sample is packed. The actual malicious code is hidden and unpacked at runtime. Common packers: UPX, MPRESS, Themida, custom LZMA/APLIB.
- .text section with write permissions: Legitimate code sections are read+execute only. A writable .text section suggests self-modifying code, where the sample patches itself at runtime (an anti-disassembly technique).
- Missing imports: A sample with very few imports (especially suspicious: LoadLibrary and GetProcAddress only) is resolving all other imports dynamically at runtime to hide its actual capabilities from static analysis.
- Imports containing CreateRemoteThread + VirtualAllocEx + WriteProcessMemory: This triad is the classic process injection signature. The sample will inject code into another process's memory space.
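The four observations above can be expressed as simple checks over data you already have from the PE script. This is a minimal sketch: `triage_findings`, the finding labels, and the "five or fewer imports" cutoff are assumptions chosen for illustration, and the inputs are plain tuples and names rather than `pefile` objects so the logic can be read on its own.

```python
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000
INJECTION_TRIAD = {"CreateRemoteThread", "VirtualAllocEx", "WriteProcessMemory"}
RESOLVERS = ("LoadLibrary", "GetProcAddress")


def triage_findings(sections, imports, entropy_threshold=7.0, few_imports=5):
    """Apply the PE heuristics to pre-extracted section and import data.

    sections: iterable of (name, entropy, characteristics) tuples
    imports:  iterable of imported function names
    """
    findings = set()
    for _name, entropy, characteristics in sections:
        if entropy > entropy_threshold:
            findings.add("packed_or_encrypted")
        # Section is both executable and writable: possible self-modifying code
        if characteristics & IMAGE_SCN_MEM_EXECUTE and characteristics & IMAGE_SCN_MEM_WRITE:
            findings.add("writable_executable_section")

    names = set(imports)
    # startswith() so LoadLibraryA / LoadLibraryW / LoadLibraryExW all count
    has_resolvers = all(any(n.startswith(r) for n in names) for r in RESOLVERS)
    if has_resolvers and len(names) <= few_imports:
        findings.add("dynamic_import_resolution")
    if INJECTION_TRIAD <= names:
        findings.add("process_injection_triad")
    return findings


if __name__ == "__main__":
    print(triage_findings([(".text", 7.6, 0xE0000020)],
                          ["LoadLibraryA", "GetProcAddress"]))
```

Returning a set of labels rather than printing keeps the checks easy to feed into a report or a batch triage loop.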
Dynamic Analysis: Runtime Behavior
Dynamic analysis means executing the sample in a controlled environment and observing what it does. You lose the safety of static analysis but gain ground truth about behavior.
Automated Sandboxing
For initial behavioral overview without setting up manual analysis:
Recommended public sandboxes:
Any.Run (app.any.run):
- Interactive sandbox — you can click through installer dialogs, observe process trees live
- Free tier: public results, 60-second execution
- Community: full execution time, task sharing
- Reveals: process tree, network connections, registry changes, file drops, screenshots
Triage (tria.ge):
- Fast automated analysis with YARA and Suricata IDS integration
- Extracts configuration from known malware families automatically
- Free tier for researchers
VirusTotal Sandbox:
- Runs the sample across multiple sandbox environments simultaneously
- Available for uploaded samples — view under "Behavior" tab
Hybrid Analysis (hybrid-analysis.com):
- Deep static + dynamic analysis
- MITRE ATT&CK technique mapping
- Free tier available
Joe Sandbox (joesecurity.org):
- Most comprehensive of the commercial options
- Excellent evasion bypass capabilities
- Private analysis requires a paid plan; the free Cloud Basic tier makes results public
Uploading samples to public sandboxes means the sample hash, indicators, and sometimes the full sample become accessible to all subscribers. For incident response involving a client breach, a sensitive engagement, or a government-classified environment, use a private sandbox or analyze manually. Public submission of a novel sample also tips off the malware authors that their implant has been discovered.
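One low-risk way to act on this is to hash the sample locally and search public platforms by hash, which reveals nothing beyond the hash itself. The sketch below is a minimal illustration using only the standard library; `file_hashes` is a hypothetical helper name, and the chunk size is an arbitrary choice to avoid loading large samples into memory.

```python
import hashlib


def file_hashes(path, chunk_size=1 << 20):
    """Compute MD5 and SHA256 of a file in one pass, reading it in chunks."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes (EOF)
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}


if __name__ == "__main__":
    import sys
    for algo, digest in file_hashes(sys.argv[1]).items():
        print(f"{algo}: {digest}")
```

If the hash search returns nothing, that is the point at which the engagement rules decide whether a public upload is acceptable.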
Manual Dynamic Analysis
When automated sandboxing isn't sufficient (evasive samples, custom malware, in-depth IOC extraction):
# Pre-execution baseline (Flare-VM — Windows analysis machine)
# 1. Run Autoruns — captures current persistence mechanisms
# Save to baseline_autoruns.arn
# 2. Run Procmon filter setup
# Process Monitor → Filter → Add:
# "Process Name" "is" "malware.exe" → Include
# "Process Name" "is" "System" → Exclude (reduces noise from system processes)
# Reset and clear existing events
# 3. Start Wireshark capture
# Interface: your host-only adapter
# Filter: ip.dst != 192.168.100.1 (exclude INetSim traffic — focus on unusual connections)
# Or just capture everything and filter after
# Execute the sample (in an isolated terminal):
.\malware.exe
# Observe in real-time:
# - Process Hacker: new processes spawning, memory injections, network connections
# - Procmon: file system writes, registry modifications, network activity
# - Wireshark: DNS queries (even to INetSim, because the domain names reveal C2 infrastructure)
# Post-execution analysis:
# 1. Process Hacker analysis
# Check for:
# - Suspicious processes spawning from malware.exe (cmd.exe, powershell.exe, rundll32.exe)
# - Injected code in existing processes (Properties → Memory → look for executable regions without a mapped file)
# - Network connections: which process is connecting where
# - Strings in process memory (can recover decrypted C2 configs from memory)
# 2. Procmon analysis
# Filter: Process Name is malware.exe
# Operations to focus on:
# - WriteFile: what files are created or modified?
# - RegSetValue: what registry keys are written (persistence)?
# - Process Create: what child processes are launched?
# - TCP Connect/UDP Send: what network connections were made?
# 3. Wireshark analysis
tshark -r malware_capture.pcap -Y "dns" -T fields -e dns.qry.name | sort -u
# Extract all DNS queries — even failed ones reveal C2 domains
tshark -r malware_capture.pcap -Y "http.request" -T fields \
-e http.host -e http.request.uri -e http.user_agent | sort -u
# HTTP requests with User-Agent (C2 beacons often use distinctive UA strings)
tshark -r malware_capture.pcap -Y "tls.handshake.type == 1" -T fields \
-e ip.dst -e tls.handshake.extensions_server_name
# TLS ClientHello — reveals destination IPs and SNI (domain names in encrypted traffic)
# 4. Compare Autoruns
# Run Autoruns again — compare to baseline
# New entries reveal persistence mechanisms:
# HKCU\...\Run: malware installed itself as startup item
# Scheduled Tasks: malware created a scheduled task
# Services: malware installed itself as a Windows service
Code-Level Reverse Engineering
When sandbox and dynamic analysis aren't enough — the sample is evasive, you need to understand the exact encryption algorithm, you need to write a decoder for the C2 protocol, or you're documenting a novel malware family — you need a disassembler.
Ghidra
Ghidra is the NSA-developed open-source reverse engineering framework. It disassembles binaries and attempts to decompile them back to C-like pseudocode. Free, capable, and now the industry standard for most analysts who aren't paying for IDA Pro.
# Ghidra initial analysis
# 1. Launch Ghidra
# 2. New Project → Non-Shared Project
# 3. File → Import File → select malware.exe
# 4. Auto-Analysis → select all options → click Analyze
# Analysis takes 1-5 minutes for most samples
# Headless analysis for automation
analyzeHeadless /tmp/ghidra_projects MalwareProject \
-import /path/to/malware.exe \
-postScript PrintStrings.java \
-scriptPath /opt/ghidra/Ghidra/Features/Base/ghidra_scripts
What to focus on in Ghidra:
1. Entry Point (main function)
Window: Functions → _entry or main
Follow the control flow from entry — most malware starts with:
- Anti-analysis checks (debugger detection, VM detection)
- Decryption/unpacking routines
- C2 initialization
- Persistence setup
2. Suspicious Function Patterns
Search → Search for Strings → your C2 domain → References
Double-click any reference to jump to the function using that string
3. Import Analysis
Symbol Table → filter "EXTERNAL" → look for Windows API calls
Click an API name → Show References → see all call sites
4. Cryptography Identification
Look for: byte arrays with high entropy (encryption keys, encrypted payloads)
Functions built from tight XOR/rotate/add loops or large lookup tables (AES S-box, ChaCha20 "expand 32-byte k" constant)
FLOSS output pointing to decryption functions
5. C2 Communication
Search for: "HTTP/1.", "GET /", "POST /", "User-Agent"
Follow references to find the function that builds C2 requests
Look for base64 encoding/decoding routines near network calls
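For the cryptography step, a quick scan for well-known constants can point you at the right functions before you read any disassembly. This is a minimal sketch: `CRYPTO_SIGNATURES` holds only three patterns (the first bytes of the AES S-box and inverse S-box, and the ChaCha20/Salsa20 setup string), and `find_crypto_constants` is a hypothetical helper name. Implementations that use AES-NI instructions or build their tables at runtime will not match.

```python
CRYPTO_SIGNATURES = {
    "AES S-box": bytes.fromhex("637c777bf26b6fc5"),
    "AES inverse S-box": bytes.fromhex("52096ad53036a538"),
    "ChaCha20/Salsa20 constant": b"expand 32-byte k",
}


def find_crypto_constants(data: bytes):
    """Return sorted (offset, label) pairs for every known constant in data."""
    hits = []
    for label, pattern in CRYPTO_SIGNATURES.items():
        offset = data.find(pattern)
        while offset != -1:
            hits.append((offset, label))
            # Continue searching just past the start of the previous match
            offset = data.find(pattern, offset + 1)
    return sorted(hits)


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as fh:
        for offset, label in find_crypto_constants(fh.read()):
            print(f"0x{offset:08x}  {label}")
```

The printed values are raw file offsets, so they need converting to virtual addresses before you can jump to them in Ghidra.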
# Ghidra Python script: list defined strings with the first function that references each one
# Run from Ghidra Script Manager (Window → Script Manager → New Script)
# Uses str.format() instead of f-strings so it also runs under Ghidra's bundled Jython 2.7
program = getCurrentProgram()
listing = program.getListing()
refManager = program.getReferenceManager()

print("=== String References with Context ===")
for dataUnit in listing.getDefinedData(True):
    # hasStringValue() covers ASCII, Unicode, and terminated string data types
    if not dataUnit.hasStringValue():
        continue
    string_value = str(dataUnit.getValue())
    if len(string_value) <= 6:  # filter short strings
        continue
    refs = list(refManager.getReferencesTo(dataUnit.getAddress()))
    if refs:
        func = getFunctionContaining(refs[0].getFromAddress())
        func_name = func.getName() if func else "unknown"
        print("[{}] {}".format(func_name, string_value))
YARA Rule Development
After analysis, YARA rules convert your findings into detection signatures that can scan file systems, memory dumps, and network traffic.
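To illustrate how findings become a signature, here is a minimal sketch that renders a list of extracted strings into a rule skeleton. `render_rule`, `yara_escape`, and the "N of the strings" condition are illustrative assumptions, not a recommended detection policy; a real rule still needs hand-tuned conditions and testing against clean files.

```python
import re


def yara_escape(value: str) -> str:
    """Escape backslashes and double quotes for a YARA text string."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def render_rule(name, strings, min_matches=2):
    """Render a PE-only YARA rule that fires when min_matches strings are present."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError("rule name must be a valid YARA identifier")
    if not 1 <= min_matches <= len(strings):
        raise ValueError("min_matches must be between 1 and len(strings)")
    lines = [f"rule {name} {{", "    strings:"]
    for index, value in enumerate(strings):
        lines.append(f'        $s{index} = "{yara_escape(value)}" ascii wide')
    lines += [
        "    condition:",
        f"        uint16(0) == 0x5A4D and {min_matches} of ($s*)",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical indicators, used here only as sample input
    print(render_rule("Demo_Family", ["Global\\NightOwlMutex_", "HOW_TO_RECOVER.txt"]))
```

Generating the skeleton this way avoids escaping mistakes in paths and mutex names, which are a common cause of rules that compile but never match.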
// YARA rule for a hypothetical ransomware sample
// Based on identified strings and behavioral patterns
rule Ransomware_HypotheticalFamily {
    meta:
        description = "Detects HypotheticalFamily ransomware based on strings and imports"
        author = "analyst@yourcompany.com"
        date = "2026-03-27"
        hash = "sha256:abc123def456..." // SHA256 of analyzed sample
        mitre_attack = "T1486" // Data Encrypted for Impact
    strings:
        // C2 domain found in strings analysis
        $c2_domain = "update-service-cdn.org" ascii wide nocase
        // Mutex name: prevents double-execution on victim
        $mutex = "Global\\NightOwlMutex_" ascii wide
        // Ransom note filename
        $ransom_note = "HOW_TO_RECOVER.txt" ascii wide
        // API sequence suggesting process injection (could be loader component)
        $api_virtualalloc = "VirtualAllocEx" ascii
        $api_writepm = "WriteProcessMemory" ascii
        $api_createthread = "CreateRemoteThread" ascii
        // File extension being appended to encrypted files
        $ext = ".nlock" ascii wide
        // Bitcoin address pattern in ransom note template
        $btc = /bc1[a-zA-HJ-NP-Z0-9]{25,39}/ ascii
    condition:
        // File is a PE (MZ header); the outer parentheses apply this check to both branches
        uint16(0) == 0x5A4D and
        (
            // Either the C2 domain or mutex, plus a ransomware artifact
            (($c2_domain or $mutex) and ($ransom_note or $ext or $btc)) or
            // Injection API triad plus a ransomware artifact suggests a loader component
            (all of ($api_*) and ($ransom_note or $ext or $btc))
        )
}
# Test YARA rules against samples
yara -r ransomware_rule.yar /path/to/samples/
yara -r ransomware_rule.yar suspicious.exe
# Scan running processes in memory
yara ransomware_rule.yar 1234 # a numeric target is treated as a process ID
# Scan a directory of memory dumps
yara -r ransomware_rule.yar /tmp/memory_dumps/
# Update and download community YARA rules
git clone https://github.com/InQuest/yara-rules
git clone https://github.com/Neo23x0/signature-base
# Florian Roth's signature-base is the most maintained community YARA collection
IOC Extraction and Documentation
Thorough IOC extraction is the final deliverable of malware analysis. These indicators feed detection systems, threat intelligence platforms, and hunting queries.
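Raw indicators usually need validating and defanging before they go into a report. The sketch below is a minimal illustration: `extract_iocs` and `defang` are hypothetical helper names, the regexes are deliberately simple, and version strings such as `1.2.3.4` will still pass IPv4 validation, so results need a manual review.

```python
import ipaddress
import re

IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def extract_iocs(strings):
    """Collect syntactically valid IPv4 addresses and URLs from extracted strings."""
    ips, urls = set(), set()
    for text in strings:
        for candidate in IPV4_RE.findall(text):
            try:
                ipaddress.IPv4Address(candidate)  # rejects octets above 255
            except ValueError:
                continue
            ips.add(candidate)
        urls.update(URL_RE.findall(text))
    return {"ipv4": sorted(ips), "url": sorted(urls)}


def defang(indicator: str) -> str:
    """Make an indicator non-clickable for reports: hxxp scheme, bracketed dots."""
    if indicator.lower().startswith("http"):
        indicator = "hxxp" + indicator[4:]
    return indicator.replace(".", "[.]")


if __name__ == "__main__":
    found = extract_iocs([
        "connect 203.0.113.7:443",
        "GET http://update-service-cdn.org/api/v2/telemetry?uid=",
    ])
    for kind, values in found.items():
        for value in values:
            print(f"| {kind} | {defang(value)} | |")
```

Defanged values are for human-readable reports only; detection systems and threat intelligence platforms generally expect the original form.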
# Malware Analysis Report — Sample IOCs
## File Indicators
| Type | Value | Notes |
|---|---|---|
| SHA256 | abc123def456... | Original sample |
| SHA256 | def789ghi012... | Dropped secondary payload |
| Filename | svchost32.exe | Masquerades as Windows process |
| Path | C:\Users\%USERNAME%\AppData\Roaming\Microsoft\svchost32.exe | Drop location |
| File size | 286,720 bytes | |
## Network Indicators
| Type | Value | Notes |
|---|---|---|
| C2 Domain | update-service-cdn.org | Primary C2 |
| C2 Domain | analytics-tracker-v2.com | Secondary/fallback C2 |
| C2 IP | 185.x.x.x | Resolved IP (may change) |
| URI Pattern | /api/v2/telemetry?uid= | Beacon URI format |
| User-Agent | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 | Custom beacon UA |
| Beacon interval | ~300 seconds ± 30 second jitter | Observed in network capture |
| Protocol | HTTPS on 443/tcp | TLS 1.2 |
## Host Indicators
| Type | Value | Notes |
|---|---|---|
| Registry key | HKCU\Software\Microsoft\Windows\CurrentVersion\Run\WindowsUpdate | Persistence |
| Registry value | C:\Users\%USERNAME%\AppData\Roaming\Microsoft\svchost32.exe | Persistence value |
| Scheduled task | \Microsoft\Windows\WindowsUpdate\AuditPolicy | Persistence (task name) |
| Mutex | Global\\NightOwlMutex_[GUID] | Prevents multiple instances |
| Process | svchost32.exe | Masquerades as svchost |
| Parent process | winlogon.exe (injected) | Process injection target |
## MITRE ATT&CK Techniques
| Technique ID | Technique Name | Evidence |
|---|---|---|
| T1059.001 | PowerShell | Uses PowerShell for secondary payload download |
| T1547.001 | Registry Run Keys / Startup Folder | HKCU Run key persistence |
| T1055.001 | Dynamic-link Library Injection | Injects into winlogon.exe |
| T1071.001 | Application Layer Protocol: Web Protocols | C2 over HTTPS |
| T1486 | Data Encrypted for Impact | AES-256 file encryption |
| T1490 | Inhibit System Recovery | Deletes VSS copies via vssadmin |
| T1497.003 | Virtualization/Sandbox Evasion: Time Based Evasion | RDTSC timing checks |
Recommended Analysis Workflow
For each sample, in this order:
1. INTAKE AND INTEL CHECK (5 minutes)
├── Compute SHA256 hash
├── Query VirusTotal — how many vendors detect it? Existing reports?
├── Query MalwareBazaar and Any.Run for existing analysis
└── If well-known sample: read existing reports and proceed to targeted analysis
2. QUICK STATIC ANALYSIS (15 minutes)
├── strings + FLOSS → extract network indicators and API hints
├── pefile/PE-bear → check imports, sections, entropy
└── Determine: is the sample packed? (entropy > 7.0 in code section)
3. AUTOMATED SANDBOX (20 minutes)
├── Submit to Any.Run for interactive analysis
├── Review: process tree, network activity, file drops, registry changes
└── Extract: C2 domains/IPs, dropped file hashes, persistence mechanisms
4. MANUAL DYNAMIC ANALYSIS (60-90 minutes if sandbox insufficient)
├── Prepare Flare-VM + INetSim
├── Procmon + Wireshark + Process Hacker running before execution
├── Execute sample → observe and document behavior
└── Run Autoruns diff to catch persistence
5. CODE-LEVEL ANALYSIS (hours to days for novel/evasive samples)
├── Load in Ghidra → auto-analyze
├── Find entry point → trace control flow
├── Identify: anti-analysis, decryption, C2 init, persistence
└── Decode: encrypted config, C2 protocol, encryption key material
6. DOCUMENTATION AND DETECTION
├── Complete IOC table with all indicators
├── Write YARA rules for detection
├── Map techniques to MITRE ATT&CK
└── Write report or threat intel brief
The depth you apply at each stage depends on the objective. An incident responder triaging an active breach may stop at stage 3 to get enough indicators to block the C2 and start remediation. A malware researcher producing a public threat intelligence report goes all the way through stage 6. Know your objective before you start, and go as deep as the situation requires but no deeper.