
dexisworking/Project_Helix_Write-up

🧬 Project Helix — CTF Digital Forensics Write-Up

"It speaks in strands. A four-fold tongue." — Dr. Owens, LOG_088

Platform: TCM Security CTF
Category: Digital Forensics
Difficulty: Hard
Flag: TCM{V01D_S1GN4L_4U7H3N71C473D}
Author: Dibyanshu Sekhar
Date: March 12, 2026



🔬 Scenario

Three days ago, lead geneticist Dr. Owens sent a final frantic transmission claiming he had intercepted a biological broadcast — a repeating signal he believed to be a synthetic RNA sequence manifesting as radio frequency interference.

In violation of Protocol 4, Owens began unauthorized testing on a high-risk specimen. All comms went dark shortly after. A security sweep of his lab confirmed he's missing. His workstation was found powered on, but all research has been wiped.

Mission: Recover the deleted notes and find the RNA-based key that unlocks his encrypted specimen archive.


⚡ Quick Summary

Finding              Value
Recovered file       freq.txt (via NTFS MFT Zone.Identifier ADS)
Download URL         https://ctf.tcmsecurity.com/3c7d7997c1a7/freq.txt
Broadcast frequency  3817.3 Hz
Signal encoding      4-frequency FSK → RNA bases → quaternary → ASCII
Decoded key          |RNAPolymeraseSlip|
Flag                 TCM{V01D_S1GN4L_4U7H3N71C473D}

🔗 Attack Chain

NTFS Triage Image
       │
       ▼
 LNK Shortcut  ──────────────► freq.txt existed in Downloads
       │
       ▼
 $MFT Parser   ──────────────► Zone.Identifier ADS → Download URL
       │
       ▼
 curl freq.txt ──────────────► Lab logs + 3817.3 Hz frequency
       │
       ▼
 WebSocket Capture ──────────► 165s raw PCM audio
       │
       ▼
 FFT + FSK Decode ───────────► |RNAPolymeraseSlip|
       │
       ▼
 pyzipper AES-256 ───────────► TCM{V01D_S1GN4L_4U7H3N71C473D}

Phase 1 — Identifying the Forensic Artifact

The artifact at /home/dex/Projects/C/ was an extracted NTFS triage image — not a full disk dump, but plenty to work with.

$Boot           — NTFS boot sector
$MFT            — Master File Table (134 MB)
$LogFile        — Journal
$Secure_$SDS
ProgramData/
Users/
  └── drowens/  — Dr. Owens' profile
Windows/

Classic NTFS layout. One real user: drowens.


Phase 2 — LNK Shortcut Analysis

Windows .lnk files in %APPDATA%\Microsoft\Windows\Recent persist even after the target file is deleted. They contain the original file path and machine name.

find /home/dex/Projects/C/Users/drowens -name "*.lnk" 2>/dev/null
# → .../Recent/freq.lnk

strings .../freq.lnk
# → freq.txt
# → C:\Users\drowens\Downloads\freq.txt
# → helix

Finding: A file called freq.txt once lived in Owens' Downloads folder. Machine name: helix.
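
LNK files store some strings as ASCII and others as UTF-16LE, so a plain `strings` run can miss paths depending on its encoding flags. The sketch below is a rough Python stand-in that scans for both encodings. It is not a real LNK parser, and the function name `extract_strings` is my own, not something from the repo scripts.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return printable ASCII and UTF-16LE strings found in a binary blob.

    This is a heuristic scan (like `strings` plus `strings -el`), not a
    structured parse of the Shell Link format.
    """
    # Runs of printable ASCII bytes.
    ascii_hits = re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)
    # Runs of printable ASCII characters each followed by a NUL byte (UTF-16LE).
    utf16_hits = re.findall(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len, data)

    found = [hit.decode("ascii") for hit in ascii_hits]
    found += [hit.decode("utf-16-le") for hit in utf16_hits]
    return found
```

Calling `extract_strings(open(path, "rb").read())` on the shortcut should surface the same target path and machine name that `strings` reported, regardless of which encoding each field uses.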


Phase 3 — MFT Analysis & Zone.Identifier Recovery

Why the MFT Matters

When a file is deleted on NTFS, the MFT record doesn't immediately disappear — only the in-use flag is cleared. Metadata, including Alternate Data Streams (ADS), often survives.
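
As a minimal sketch of that in-use check: the 16-bit flags field sits at offset 0x16 of the FILE record header, with bit 0 meaning "in use" and bit 1 meaning "directory". The helper name `record_state` is illustrative and not taken from scripts/mft_parser.py.

```python
import struct

FLAG_IN_USE = 0x0001     # cleared when the file is deleted
FLAG_DIRECTORY = 0x0002  # set for directory records


def record_state(record: bytes) -> str:
    """Describe an MFT record as file/directory and in use/deleted."""
    if record[:4] != b"FILE":
        return "invalid"
    # Flags are a little-endian uint16 at offset 0x16 of the record header.
    (flags,) = struct.unpack_from("<H", record, 0x16)
    kind = "directory" if flags & FLAG_DIRECTORY else "file"
    state = "in use" if flags & FLAG_IN_USE else "deleted"
    return f"{kind}, {state}"
```

A record that still carries the FILE signature but reports "deleted" is exactly the kind of leftover worth inspecting for surviving attributes.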

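The Zone.Identifier ADS is written by browsers and other applications that apply the Mark of the Web to files downloaded from the internet. On recent Windows versions it usually includes a HostUrl field: the exact source URL.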

MFT Parser

See scripts/mft_parser.py for the full script. Core logic:

  1. Walk all 1024-byte FILE records in $MFT
  2. Apply the Update Sequence Array (USA) fixup, which restores the last two bytes of each 512-byte sector
  3. Extract 0x30 (filename) and 0x80 (data) attributes
  4. Filter for freq.txt, extract the Zone.Identifier ADS
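
Step 2 is the part that is easiest to get wrong, so here is a minimal sketch of it. It assumes 512-byte sectors and the standard header layout (USA offset at 0x04, USA entry count at 0x06). The function name `apply_fixup` is illustrative; the repo script may structure this differently.

```python
import struct

SECTOR_SIZE = 512  # assumed sector size for the fixup stride


def apply_fixup(record: bytes) -> bytes:
    """Restore the original last two bytes of every sector in a FILE record."""
    # Offset of the update sequence array and its entry count (USN + one per sector).
    usa_offset, usa_count = struct.unpack_from("<HH", record, 0x04)
    usn = record[usa_offset:usa_offset + 2]  # the update sequence number itself

    fixed = bytearray(record)
    for i in range(1, usa_count):
        tail = i * SECTOR_SIZE - 2  # last two bytes of sector i
        if fixed[tail:tail + 2] != usn:
            raise ValueError(f"fixup mismatch in sector {i}: record may be torn")
        saved = usa_offset + 2 * i  # original bytes stashed in the array
        fixed[tail:tail + 2] = record[saved:saved + 2]
    return bytes(fixed)
```

Skipping this step leaves two wrong bytes at offsets 510 and 1022 of each record, which can silently corrupt a resident stream that happens to span a sector boundary.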

Result

Record 41: freq.txt [DELETED]
$DATA (resident): Zone.Identifier stream

[ZoneTransfer]
ZoneId=3
HostUrl=https://ctf.tcmsecurity.com/3c7d7997c1a7/freq.txt

The actual file data clusters were wiped — but the ADS was small enough to be stored inline in the MFT record. Deletion didn't touch it.
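
The recovered stream is plain INI text, so once it has been decoded to a string it can be read with the standard library. This is a small sketch under that assumption; `parse_zone_identifier` is a name I made up for illustration.

```python
import configparser


def parse_zone_identifier(text: str) -> dict[str, str]:
    """Parse the [ZoneTransfer] section of a Zone.Identifier stream."""
    # Interpolation is disabled because URLs may legitimately contain '%'.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case (HostUrl, not hosturl)
    parser.read_string(text)
    return dict(parser["ZoneTransfer"])


ads = (
    "[ZoneTransfer]\n"
    "ZoneId=3\n"
    "HostUrl=https://ctf.tcmsecurity.com/3c7d7997c1a7/freq.txt\n"
)
fields = parse_zone_identifier(ads)
print(fields["HostUrl"])
```

ZoneId=3 is the Internet zone, which is consistent with the file having been downloaded through a browser.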


Phase 4 — Recovering and Analyzing freq.txt

curl -o /tmp/freq.txt "https://ctf.tcmsecurity.com/3c7d7997c1a7/freq.txt"

The 441 KB file contained two parts:

Owens' Lab Logs

LOG_086: Found a hum. It's too constant... did I just discover the next WOW signal??
         Not a machine. Too rhythmic. Too... wet. Like a heartbeat in the wires.

LOG_087: It's getting louder... it repeats|it repeats|it repeats
         Every 78 seconds.

LOG_088: It speaks in strands. A four-fold tongue. I saw it on the waterfall today.
         A... C... G... U... quaternary? If I start counting from 0 it starts to make sense.

LOG_089: Black van in the driveway again. They know what I found.
         THEY ARE TRANSCRIBING ME. STAY OFF THE LADDER.

Frequency Scan

The second part was a frequency scan from 1000.0 to 5000.0 Hz. Every entry was marked (X) except one:

3817.3 (?)  ← THE ONE
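
With a 441 KB file, picking out the odd entry by eye is tedious. The sketch below filters the scan programmatically. It assumes each scan line looks like `<frequency> (<mark>)`, which matches the one line quoted above; the exact layout of the other lines is my assumption, and `unmarked_frequencies` is a hypothetical helper.

```python
import re

# Assumed line shape: a decimal frequency followed by a one-character mark in parentheses.
SCAN_LINE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*\((.)\)")


def unmarked_frequencies(text: str, rejected: str = "X") -> list[float]:
    """Return every scanned frequency whose mark is not the rejected one."""
    hits = []
    for line in text.splitlines():
        match = SCAN_LINE.match(line)
        if match and match.group(2) != rejected:
            hits.append(float(match.group(1)))
    return hits
```

Running `unmarked_frequencies(open("/tmp/freq.txt").read())` should leave a single candidate if the file follows that layout.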

Encoding Scheme (from LOG_088)

RNA Base  Quaternary Value
A         0
C         1
G         2
U         3

Formula for 4-base codons:

ASCII = b0×64 + b1×16 + b2×4 + b3

Example: CCAG = 1(64) + 1(16) + 0(4) + 2 = 82 = 'R'
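
The formula is just base-4 positional notation, so it can be written as a short loop. A minimal sketch (the names `BASE_VALUE` and `codon_to_char` are mine, not from the repo scripts):

```python
BASE_VALUE = {"A": 0, "C": 1, "G": 2, "U": 3}


def codon_to_char(codon: str) -> str:
    """Convert one 4-base codon to its ASCII character (base-4, big-endian)."""
    if len(codon) != 4:
        raise ValueError(f"expected 4 bases, got {codon!r}")
    value = 0
    for base in codon:
        # Shift the running value one quaternary digit left, then add this base.
        value = value * 4 + BASE_VALUE[base]
    return chr(value)


def decode(codons: list[str]) -> str:
    """Decode a list of codons into a string."""
    return "".join(codon_to_char(codon) for codon in codons)


print(codon_to_char("CCAG"))  # R
```

Four quaternary digits cover 0 to 255, so every codon maps to exactly one byte.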


Phase 5 — Capturing the Signal

The CTF radio page exposed a WebSocket endpoint:

wss://ctf.tcmsecurity.com/3c7d7997c1a7?freq={freq}

Audio format: raw Int16 PCM at 44100 Hz, mono.

See scripts/capture_signal.py for the capture script. Key decision: capture 165 seconds (>2 full 78-second cycles).
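
For orientation, here is a rough sketch of what such a capture loop can look like. It assumes the server pushes binary frames containing nothing but raw PCM (no header or JSON wrapper), which I have not verified against the repo script. The names `bytes_for` and `capture` are illustrative.

```python
SAMPLE_RATE = 44100   # Hz, from the stream format above
BYTES_PER_SAMPLE = 2  # Int16, mono


def bytes_for(seconds: float) -> int:
    """Number of PCM bytes needed to cover the given duration."""
    return int(seconds * SAMPLE_RATE) * BYTES_PER_SAMPLE


async def capture(url: str, seconds: float, out_path: str) -> int:
    """Append binary WebSocket frames to out_path until enough audio is stored."""
    # Imported here so bytes_for() stays usable without the third-party package.
    import websockets

    target = bytes_for(seconds)
    received = 0
    with open(out_path, "wb") as out:
        async with websockets.connect(url) as ws:
            async for message in ws:
                if not isinstance(message, bytes):
                    continue  # assumption: text frames carry no audio
                out.write(message)
                received += len(message)
                if received >= target:
                    break
    return received
```

It would be driven with something like `asyncio.run(capture(url, 165, "/tmp/signal.raw"))`. At 44100 Hz mono Int16, 165 seconds comes to 14,553,000 bytes, comfortably more than two 78-second cycles (156 s).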

⚠️ Lesson learned the hard way: My first capture was only ~80 seconds. I decoded eraseSlip||RNAPolym — the tail of one cycle glued to the head of the next. Always capture multiple full cycles.


Phase 6 — Decoding the FSK Signal

See scripts/fsk_decoder.py for the full demodulator.

Approach:

  1. Load raw PCM into numpy
  2. Sliding FFT window (N=2048, hop=1024) to find dominant frequency per frame
  3. Merge consecutive same-frequency frames into segments
  4. Discard segments < 0.15s (noise)
  5. Group segments between silence gaps into 4-base codons
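
Steps 1 to 3 can be sketched roughly as follows. The window size and hop come from the list above; the Hann window, the RMS silence threshold and the 50 Hz merge tolerance are my own assumptions rather than values taken from scripts/fsk_decoder.py.

```python
import numpy as np

SAMPLE_RATE = 44100
N_FFT = 2048
HOP = 1024


def load_pcm(path: str) -> np.ndarray:
    """Load raw little-endian Int16 mono PCM into a numpy array."""
    return np.fromfile(path, dtype="<i2")


def dominant_frequencies(samples: np.ndarray, silence_rms: float = 500.0) -> list[float]:
    """Return the dominant frequency (Hz) of each frame, or 0.0 for silence."""
    window = np.hanning(N_FFT)
    bin_freqs = np.fft.rfftfreq(N_FFT, d=1.0 / SAMPLE_RATE)
    result = []
    for start in range(0, len(samples) - N_FFT + 1, HOP):
        frame = samples[start:start + N_FFT].astype(np.float64)
        if np.sqrt(np.mean(frame ** 2)) < silence_rms:
            result.append(0.0)  # assumed threshold for "no tone present"
            continue
        spectrum = np.abs(np.fft.rfft(frame * window))
        result.append(float(bin_freqs[np.argmax(spectrum)]))
    return result


def merge_frames(freqs: list[float], tolerance: float = 50.0) -> list[tuple[float, float]]:
    """Collapse runs of near-identical frames into (frequency, seconds) segments."""
    runs: list[list[float]] = []
    for freq in freqs:
        if runs and abs(runs[-1][0] - freq) <= tolerance:
            runs[-1][1] += 1
        else:
            runs.append([freq, 1])
    # Each frame advances the signal by HOP samples.
    return [(freq, count * HOP / SAMPLE_RATE) for freq, count in runs]
```

With N=2048 at 44100 Hz each FFT bin is about 21.5 Hz wide, which is far finer than the 1000 Hz spacing between the four tones, so the peak bin is enough and no interpolation is needed.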

FSK Frequency Map

Frequency  RNA Base  Quaternary
1000 Hz    A         0
2000 Hz    C         1
3000 Hz    G         2
4000 Hz    U         3

Each base: ~0.85s · Silence separator: ~0.75s

Full Codon Table (one cycle)

#   Codon  Calc                       Char
1   CUUA   1(64)+3(16)+3(4)+0 = 124   |
2   CCAG   1(64)+1(16)+0(4)+2 = 82    R
3   CAUG   1(64)+0(16)+3(4)+2 = 78    N
4   CAAC   1(64)+0(16)+0(4)+1 = 65    A
5   CCAA   1(64)+1(16)+0(4)+0 = 80    P
6   CGUU   1(64)+2(16)+3(4)+3 = 111   o
7   CGUA   1(64)+2(16)+3(4)+0 = 108   l
8   CUGC   1(64)+3(16)+2(4)+1 = 121   y
9   CGUC   1(64)+2(16)+3(4)+1 = 109   m
10  CGCC   1(64)+2(16)+1(4)+1 = 101   e
11  CUAG   1(64)+3(16)+0(4)+2 = 114   r
12  CGAC   1(64)+2(16)+0(4)+1 = 97    a
13  CUAU   1(64)+3(16)+0(4)+3 = 115   s
14  CGCC   1(64)+2(16)+1(4)+1 = 101   e
15  CCAU   1(64)+1(16)+0(4)+3 = 83    S
16  CGUA   1(64)+2(16)+3(4)+0 = 108   l
17  CGGC   1(64)+2(16)+2(4)+1 = 105   i
18  CUAA   1(64)+3(16)+0(4)+0 = 112   p
19  CUUA   1(64)+3(16)+3(4)+0 = 124   |

Decoded: |RNAPolymeraseSlip|

The | characters aren't just formatting: each one is codon CUUA = ASCII 124, transmitted as a 2000→4000→4000→1000 Hz tone sequence. They're part of the key.

The name references RNA Polymerase Slippage — a real molecular biology phenomenon where RNA polymerase slips on repetitive template sequences. The endlessly repeating broadcast is a thematic nod to this.


Phase 7 — Corroborating Evidence

Windows Search Database

sqlite3 /tmp/Windows_search.db \
  "SELECT System_ItemName FROM SystemIndex WHERE System_ItemName LIKE '%freq%';"
# → freq.txt

Indexed before deletion. Confirms the file existed.
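
The same lookup can be done from Python with a bound parameter instead of a hand-built LIKE string. This is a small sketch that reuses the table and column names from the query above; `find_indexed_names` is a hypothetical helper, not part of the repo.

```python
import sqlite3
from contextlib import closing


def find_indexed_names(db_path: str, fragment: str) -> list[str]:
    """Return indexed item names that contain the given fragment."""
    pattern = f"%{fragment}%"
    # closing() makes sure the connection is released once the query is done.
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT System_ItemName FROM SystemIndex "
            "WHERE System_ItemName LIKE ?",
            (pattern,),
        ).fetchall()
    return [name for (name,) in rows]
```

`find_indexed_names("/tmp/Windows_search.db", "freq")` should return the same single row as the shell query.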

Registry (NTUSER.DAT)

OpenSavePidlMRU\* → 0 = freq.txt
RecentDocs        → 0 = freq.txt / freq.lnk
FileExts\.TS\UserChoice → Hash = xKeyJTOPm78=  ← red herring (standard Windows UserChoice hash)
OneDrive          → HostNameCollection = helix
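
For the RecentDocs entry, each numbered value is a binary blob that begins with the file name as a NUL-terminated UTF-16LE string. The sketch below decodes that prefix and lists the entries with python-registry. The helper names are illustrative, and the hive path is whatever NTUSER.DAT was pulled from the triage image.

```python
RECENT_DOCS = r"Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs"


def recentdocs_name(blob: bytes) -> str:
    """Extract the leading UTF-16LE file name from a RecentDocs value."""
    # Walk in 2-byte steps so a NUL high byte is not mistaken for the terminator.
    for i in range(0, len(blob) - 1, 2):
        if blob[i:i + 2] == b"\x00\x00":
            return blob[:i].decode("utf-16-le")
    return blob.decode("utf-16-le", errors="replace")


def list_recent_docs(hive_path: str) -> dict[str, str]:
    """Map each numbered RecentDocs value to the file name it records."""
    # Third-party python-registry; imported here so the decoder above
    # can be used on its own.
    from Registry import Registry

    key = Registry.Registry(hive_path).open(RECENT_DOCS)
    names = {}
    for value in key.values():
        if value.name().isdigit():  # skip MRUListEx, which only holds ordering
            names[value.name()] = recentdocs_name(value.value())
    return names
```

The rest of each blob after the file name is a shell item that references the matching .lnk, which is why both freq.txt and freq.lnk appear under value 0.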

Phase 8 — Cracking the Archive

curl -o /tmp/flag.zip "https://ctf.tcmsecurity.com/3c7d7997c1a7/flag.zip"
# AES-256 encrypted, requires pyzipper (standard unzip can't handle it)

Password Graveyard

Tried                   Why
eraseSlip||RNAPolym     Wrong-framed single-cycle decode
eraseSlipRNAPolym       Without pipes
RNAPolymeraseSlippage   Full biology term
RNAPolymeraseSlip       Right text, missing delimiters
3817.3                  The frequency
helix / HELIX           Machine name
drowens                 Username
xKeyJTOPm78=            Registry red herring

The Winner

import pyzipper

with pyzipper.AESZipFile('/tmp/flag.zip') as z:
    z.setpassword(b'|RNAPolymeraseSlip|')
    print(z.read('flag.txt').decode())

# TCM{V01D_S1GN4L_4U7H3N71C473D}

See scripts/decrypt_flag.py.


🛠️ Tools Used

Tool             Purpose
strings          Extract readable text from LNK file
Python 3         MFT parsing, WebSocket capture, FFT decoding
curl             Download files from the CTF server
numpy            FFT computation and signal processing
websockets       Connect to the audio WebSocket stream
python-registry  Parse NTUSER.DAT registry hive
sqlite3          Query Windows Search database
pyzipper         Handle AES-256 encrypted zip files

Install dependencies:

pip install numpy websockets python-registry pyzipper

💡 Key Takeaways

  1. Zone.Identifier ADS is incredibly useful. Even when a file's data clusters are gone, the download URL can survive in the MFT record. It's stored inline as a small resident stream — deletion often doesn't touch it.

  2. Capture more signal than you think you need. One cycle isn't enough to identify where the message starts and ends. Two full cycles make boundaries obvious.

  3. Delimiters can be data. The | characters looked like formatting — they were actually explicitly encoded as codon CUUA and are part of the password itself.

  4. Read the narrative carefully. Every clue was in Owens' logs: the period (78s), the alphabet (A/C/G/U), the encoding (quaternary), and the starting index (0). CTF designers don't add lore for decoration.

  5. Don't chase red herrings too long. The registry hash xKeyJTOPm78= looked promising but was just a standard Windows UserChoice hash. If something doesn't fit in a few minutes, move on.


📁 Repo Structure

Project-Helix-Writeup/
├── README.md                            ← Main technical write-up
├── C/                                   ← NTFS triage artifact set
├── docs/
│   ├── writeup.md                       ← Full narrative write-up
│   └── Project_Helix_CTF_WriteUp.pdf    ← PDF export
├── scripts/
│   ├── mft_parser.py                    ← NTFS $MFT parser with USA fixup
│   ├── capture_signal.py                ← WebSocket audio capture (165s)
│   ├── fsk_decoder.py                   ← FFT-based FSK demodulator + RNA decoder
│   └── decrypt_flag.py                  ← AES-256 zip decryption
└── artifacts/
    └── codon_table.md                   ← Full codon decode reference

— Dibyanshu Sekhar · March 2026

About

Project Helix Write-up is a digital forensics CTF repository documenting the full investigation chain—from NTFS artifact recovery and MFT/ADS analysis to RF signal capture/FSK decoding and final AES zip decryption—along with reproducible Python scripts, detailed write-ups, and supporting artifacts.
