Decoding SQL Injection Exfiltrated Data from Network Traffic Dump

Decoding SQL Injection Exfiltrated Data from Network Traffic Dump

Sequel Dump task was hard level task included into the latest Hackfinity Battle Room. Players had network traffic dump available. Opening it with Wireshark reveals HTTP traffic and long SQL queries in the URL query parameters. Something like:

/search_app/search.php?query=1%20AND%20ORD%28MID%28%28SELECT%20IFNULL%28CAST%28%60description%60%20AS%20NCHAR%29%2C0x20%29%20FROM%20profile_db.%60profiles%60%20ORDER%20BY%20id%20LIMIT%202%2C1%29%2C40%2C1%29%29%3E100
/search_app/search.php?query=1%20AND%20ORD%28MID%28%28SELECT%20IFNULL%28CAST%28%60description%60%20AS%20NCHAR%29%2C0x20%29%20FROM%20profile_db.%60profiles%60%20ORDER%20BY%20id%20LIMIT%202%2C1%29%2C36%2C1%29%29%3E41        

Or URL decoded:

/search_app/search.php?query=1 AND ORD(MID((SELECT IFNULL(CAST(`description` AS NCHAR),0x20) FROM profile_db.`profiles` ORDER BY id LIMIT 2,1),40,1))>100
/search_app/search.php?query=1 AND ORD(MID((SELECT IFNULL(CAST(`description` AS NCHAR),0x20) FROM profile_db.`profiles` ORDER BY id LIMIT 2,1),36,1))>41        

This is a typical SQLMap traffic. In the example above, SQL map enumerates the description field. In the first query it checks if the code of the 40-th character in the field description in the second (2) table row is greater than 100. The second query does the same for the 36-th character of the same field. SQL Map is the orchestrator that guides the "greater or less game" and generates the queries. We shall just observe the results. The server responds with:

<h2>Search Results:</h2><p><strong>Void:</strong> The cryptography expert who deciphers the toughest encryptions, searching for vulnerabilities in Void's encoded fortress.</p>

<h2>Search Results:</h2>No results found.	        

The first response is generated if the query evaluates to True condition and returns content. The second one means the query evaluated to False.

The decoding algorithm is simple: For each row, field and character position we shall remember the largest value that yielded True statement. The character code will be that value + 1. In other words we shall find the max(value) < character code and after that character code = max(value)+1.

For example, tracking the 40-th character in description in the second row could be:

ORD(MID((SELECT `description` FROM profile_db.`profiles` ORDER BY id LIMIT 2,1),40,1))>100 # True
ORD(MID((SELECT `description` FROM profile_db.`profiles` ORDER BY id LIMIT 2,1),40,1))>110 # False
ORD(MID((SELECT `description` FROM profile_db.`profiles` ORDER BY id LIMIT 2,1),40,1))>105 # True
ORD(MID((SELECT `description` FROM profile_db.`profiles` ORDER BY id LIMIT 2,1),40,1))>108 # False
ORD(MID((SELECT `description` FROM profile_db.`profiles` ORDER BY id LIMIT 2,1),40,1))>106 # True
ORD(MID((SELECT `description` FROM profile_db.`profiles` ORDER BY id LIMIT 2,1),40,1))>107 # False

# 106 is the maximum confirmed value the character code is greater from
#character code is 106 + 1 = 107        

SQLMap does not enumerate sequentially, but in parallel. The order of the request/response pairs in the dump is arbitrary. We shall correlate somehow requests and responses. For that purpose we will use the concept of streams available in the Wireshark like dumps.

First, let's extract the requests and their stream number.

tshark -r challenge.pcapng -Y 'http.request.method=="GET"' -T fields -e http.request.uri -e tcp.stream > requests.txt        

Extract the responses too:

tshark -r challenge.pcapng -Y 'http.response' -T fields -e http.file_data  -e tcp.stream > responses.txt        

The following Python script loads these 2 files, processes requests and responses, correlates them and implements the decoding logic.

import urllib.parse
import re
from collections import OrderedDict

request_stream = {}

response_stream = {}

streams = []

with open("requests.txt") as f:

    lines = f.readlines()

    for line in lines:

        decoded = line.strip().split("\t")

        stream = decoded[1]

        request = urllib.parse.unquote(decoded[0])

        request_stream[stream] = request[35:]

with open("responses.txt") as f:

    lines = f.readlines()

    for line in lines:

        decoded = line.strip().split("\t")

        if len(decoded) > 1:

            stream = decoded[1]

            response = decoded[0]

        else:

            stream = decoded[0]

            response = ""

        response_stream[stream] = response

for stream, request in request_stream.items():

    if stream in response_stream:

        response = response_stream[stream]

        streams.append((request, response))

    else:

        print(f"No response for {stream} and request {request}")

        pass

# ORD(MID((SELECT IFNULL(CAST(`description` AS NCHAR),0x20) FROM profile_db.`profiles` ORDER BY id LIMIT 0,1),1,1))>64

pattern = r"ORD\(MID\(\(SELECT IFNULL\(CAST\(`?([a-zA-Z_]+)`? AS.+LIMIT (\d+),\d+\),(\d+).*>(\d+)+$"

state = {}

for request, response in streams:

    if "INFORMATION_SCHEMA" in request:
        continue

    if "CHAR_LENGTH" in request:
        continue

    if "DATABASE()" in request:
        continue

    if not "IFNULL(CAST(" in request:
        continue

    if "COUNT(*)" in request:
        continue

    if len(response) == 0:
        continue

    result = True

    if "No results found" in response:
        continue

    match = re.match(pattern, request)

    if not match:
        continue
  
    field_name, row, pos, bound = match.groups()

    row = int(row)

    pos = int(pos)

    key = f"{row:04d}_{field_name}_{pos:04d}"

    bound = int(bound)

    if key not in state:

        state[key] = bound

    else:

        value = state[key]

        if bound > value:

            state[key] = bound

sorted_state = OrderedDict(sorted(state.items()))

current_prefix = ""

current_value = []

for key, value in sorted_state.items():

    prefix = "_".join(key.split("_")[0:2])

    if prefix != current_prefix:

        if current_prefix != "":
            print(f"{current_prefix}: {''.join(current_value)}")
        
        current_prefix = prefix
        
        current_value = []
    
    current_value.append(chr(value + 1))

print(f"{current_prefix}: {''.join(current_value)}")
        

Executing the script gives us the exfiltrated data from the profile table:

0000_description: The cryptography expert who deciphers the toughest encryptaons, searching for vulnerabilities in Void's encoded fortress.
0000_id: 1
0000_name: Void
0001_description: The exploit hunter who detects and patches vulnerabilities before they can be weaponized, skilled in penetration testing and reverse engineering.
0001_id: 2
0001_name: Zero-Day
0002_description: The OSINT (Open-Source Intelligence) specialist who tracks Voids movements through the dark web and gathers intelligence from hidden networks.
0002_id: 3
0002_name: Phantom
0003_description: The network infiltrator who can breach even the most fortified systems, bypassing firewalls and uncovering hidden data.
0003_id: 4
0003_name: Root
0004_description: The forensic analyst who reconstructs digital crime scenes, piecing together evidence from fragmented files and corruptedkmgs.
0004_id: 5
0004_name: Specter
0005_description: The AI security specialist who builds countermeasures against Voids evolving attack algorithms and neutralizes rogue AI threats.
0005_id: 6
0005_name: Sentinel
0006_description: Here's the flag: THM{REDACTED}
0006_id: 7
0006_name: supeqsacrethiddenuser        

That's it.


To view or add a comment, sign in

More articles by Marjan Sterjev

  • Secure Multi-Party Computation: Idea and Example

    We are seeing massive AI adoption these days. At least everyone is talking about it.

    1 Comment
  • Theseus TryHackMe Room Writeup

    Theseus is an old room, almost 5 years old. Its difficulty has been specified as Insane.

  • Breaking weak Captcha: Tesseract OCR

    Capture Returns TryHackMe room has been recently added. Sometimes CTF scenarios could be artificial.

  • Behaviour Analysis: Outlier Sequence Detection

    Each production system deals with a lot of signals and sequence of events. Some of the sequences are unusual and…

    6 Comments
  • RSA Optimizations: Think Twice

    I will start here straight ahead with the summary: Resist your temptations and do not implement your customised and…

    2 Comments
  • RSA & Chinese Remainder Theorem MEGA Exploits

    There is a lot of data we are dealing with today. Most of us have multiple digital devices.

    4 Comments
  • A word or two about the parallelisation

    This article will be a short one. It talks about parallelisation and efficiency, so it shall waste as less as possible…

  • Covert Tcp - Scapy Version

    Covert Tcp is one of the tools for covert communication mentioned in the white hacking courses. Instead of sending…

  • Network Traffic Anomaly Detection

    It is well known today that anomaly detection helps us spot interesting data, events, malicious behaviour, potential…

  • Kyber: NTT and Efficient Polynomial Multiplication

    In my previous article From Baby to Teenage Kyber I've explained how Kyber encryption/decryption works. As it was…

    4 Comments

Insights from the community

Others also viewed

Explore topics