Domain OSINT: From WHOIS to Hidden Infrastructure

Dec 12, 2025 · By UserSearch Team · 11 min read

Disclaimer: All information provided in this article is for educational purposes and authorized security research only. The tools and techniques discussed should only be used on systems you own or have explicit permission to test. Unauthorized information gathering may violate laws such as the Computer Fraud and Abuse Act (CFAA), GDPR, or the Investigatory Powers Act.

TL;DR: The Executive Summary

The Problem: WHOIS privacy services redact ownership data, making 90% of domains look anonymous.
The Solution: Domain OSINT pivots on "hidden" infrastructure signals—historical records, favicon hashes, analytics IDs, and SSL certificates.
The Tooling: While manual CLI tools (`dig`, `whois`) provide raw data, UserSearch automates the correlation, linking a single scam site to an entire affiliate network in seconds.
Key Takeaway: Don't stop at the "redacted" screen. Use technical breadcrumbs to unmask the operator.

Every cyberattack, every scam shop, and every disinformation campaign needs a home. In the digital world, they rent domains. But unlike a physical safehouse, a domain is tethered to a massive, public ledger of technical dependencies. It has an IP address, a registrar, a nameserver, an SSL certificate, and often, a messy trail of analytics codes that the operator forgot to scrub.

The problem for investigators is that the "front door"—the WHOIS record—is usually locked. Privacy protection services redact the registrant’s name. But if you look at the infrastructure—the "wiring" of the house—you find the breadcrumbs. You find that the scam site uses the exact same Google Analytics ID as a known personal blog. You find that the domain pointed to a personal IP address three years ago before the privacy proxy was turned on.

This connects deeply with other infrastructure analysis techniques, such as wireless signal tracing, where physical proximity meets digital identifiers.

This is Domain OSINT. It is the art of pivoting from a single URL to a complete map of an adversary’s infrastructure.

What Is Domain OSINT?

Domain OSINT (Open Source Intelligence) is the process of collecting data about a domain name to understand its ownership, technical configuration, and relationships with other entities on the internet. Unlike simple "website visiting," domain OSINT focuses on the metadata and infrastructure records that exist regardless of whether the website is online or offline.

At its core, a domain is just a human-readable pointer. To function, it must connect to servers (A Records), mail exchanges (MX Records), and name servers (NS Records). Each connection creates a "link" in the chain of evidence. Even if a threat actor uses a fake name, they often reuse the same hosting account, the same SSL certificate issuer, or the same analytics tracking codes across their entire network. Domain OSINT is about finding these reused signals to map the full scope of an operation.

For a deeper technical definition of how domain registration works, refer to ICANN's Guide to WHOIS.

Why It Matters in Investigations

In modern investigations, the domain is often the first—and sometimes only—indicator of compromise (IOC) you have. Whether you are tracking a phishing campaign, investigating brand infringement, or mapping a disinformation network, the domain is the thread that unravels the sweater.

Consider the Pegasus spyware investigations conducted by groups like Citizen Lab. By analyzing the domain infrastructure used for Command and Control (C2) servers, researchers were able to identify patterns in how the spyware infrastructure was set up. Similarly, frameworks like MITRE ATT&CK document how adversaries buy or lease domains (T1583: Acquire Infrastructure) or compromise existing ones (T1584: Compromise Infrastructure) to stage attacks. A single slip-up in domain configuration, such as reusing a specific SSL certificate or pointing to a shared IP address, can expose an entire global spy network.

For corporate security teams, Domain OSINT is critical for:

Brand Protection: Identifying "typosquatting" domains (e.g., g0ogle.com) that steal credentials.
Attribution: Determining if a specific attack comes from a known APT group based on their infrastructure habits.
Due Diligence: Verifying whether a potential partner company actually owns the web assets it claims to.

The Manual Method (The "Hard Way")

Before using automated tools, it is vital to understand the raw data. Automated tools are simply scripts that run these manual commands faster. If you don't understand the output of dig, you won't understand the output of an automated report.

1. WHOIS Registration Data

WHOIS is a query-and-response protocol for looking up the registered users or assignees of an Internet resource, such as a domain name or an IP address block. In the past, this was a goldmine of names, emails, and phone numbers. Today, due to GDPR, it is often redacted.

To run a manual check:

whois example.com

What to look for:

Creation Date: A banking domain created 2 days ago is almost certainly a scam.
Registrar: Large brands often use corporate registrars such as MarkMonitor or CSC. Scammers often use cheap, bulk registrars like Namecheap or smaller offshore providers.
Name Servers: Does the site use standard hosting (ns1.godaddy.com) or custom nameservers (ns1.evilcorp-internal.com)?
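
To pull just those three fields out of a noisy WHOIS response, a small filter can help. The sketch below is a minimal example, not a complete parser. It assumes the registry uses the common gTLD labels (Creation Date, Registrar, Name Server); many ccTLD registries label these fields differently, so treat the patterns as an assumption. The function name whois_summary is hypothetical.

```shell
# whois_summary: read raw WHOIS text on stdin and print only the
# Creation Date, Registrar, and Name Server lines, with leading
# whitespace removed. Field labels are an assumption (common gTLD style).
whois_summary() {
    grep -iE '^[[:space:]]*(Creation Date|Registrar|Name Server):' |
        sed -E 's/^[[:space:]]+//'
}

# Usage (requires network access):
#   whois example.com | whois_summary
```

Because the pattern requires a colon directly after the label, lines such as "Registrar URL:" and "Registrant Name:" are filtered out.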

2. DNS Interrogation (dig)

The Domain Name System (DNS) is the phonebook of the internet. The dig command (Domain Information Groper) allows you to query these records directly. This tells you where the domain "lives."

# Get the IP address (A Record)
dig example.com A +short

# Get the Mail Server (MX Record)
dig example.com MX +short

# Get the Text Records (TXT Record)
dig example.com TXT +short

Analysis:

A Record: Tells you the hosting provider. If it resolves to 104.21.x.x, it's behind Cloudflare, masking the true server. If it resolves to a residential ISP range, that's a huge red flag.
MX Record: Tells you who handles their email. If a "Bank" uses ProtonMail or a temporary email service for its MX records, that is a strong indicator of fraud. Legitimate organizations typically host their own email or use Google Workspace/Office 365.
TXT Record: Often contains verification strings for services like Google Search Console, SPF records for email, or ownership proofs. Scammers often copy-paste these records blindly, inadvertently linking their scam sites to their legitimate projects.
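
Since TXT records are the most likely of the three to carry reusable identifiers, it can be worth reducing them to just the values you would pivot on. The following is a minimal sketch, assuming you only care about Google site-verification tokens and SPF include hosts; other verification formats exist and are not covered. The function name txt_pivots is hypothetical.

```shell
# txt_pivots: read TXT record values on stdin (one per line, as printed
# by `dig +short`) and print Google site-verification tokens and SPF
# include hosts, de-duplicated. Only these two formats are matched.
txt_pivots() {
    tr -d '"' |
        grep -oE '(google-site-verification=[A-Za-z0-9_-]+|include:[A-Za-z0-9._-]+)' |
        sort -u
}

# Usage (requires network access):
#   dig example.com TXT +short | txt_pivots
```

Running the same filter against a second domain and comparing the output is one way to spot a verification token that was copied between sites.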

3. Source Code and ID Extraction

Websites are built with code, and that code often contains unique tracking identifiers. If a threat actor runs 50 scam sites, they often want to track traffic on all of them. To do this, they embed the same Google Analytics or AdSense code.

To manually find these:

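# Download the page source
curl -sL https://example.com > source.html

# Grep for common patterns (UA- for legacy Universal Analytics, pub- for AdSense)
grep -oE "UA-[0-9]+-[0-9]+" source.html | sort -u
grep -oE "pub-[0-9]+" source.html | sort -u

# GA4 measurement IDs look like G-XXXXXXXXXX; the length range is a loose
# assumption, so expect some false positives
grep -oE "G-[A-Z0-9]{8,12}" source.html | sort -u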

If you find UA-12345678-1 on scam-crypto.com, you can then search the web for that specific ID. If you find it also running on john-doe-blog.com, you have likely identified the operator.
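
When you already have a second candidate site, you can check for overlap locally instead of searching the web. The sketch below is a minimal example, assuming both pages have already been saved to disk with curl and that only the UA- and pub- patterns are of interest. The function names extract_ids and shared_ids are hypothetical.

```shell
# extract_ids: print the unique UA- and pub- tracking IDs found in one
# saved HTML file.
extract_ids() {
    grep -oE '(UA-[0-9]+-[0-9]+|pub-[0-9]+)' "$1" | sort -u
}

# shared_ids: print the IDs that appear in BOTH saved pages. Each list
# is already unique, so any duplicate in the merged list is a shared ID.
shared_ids() {
    { extract_ids "$1"; extract_ids "$2"; } | sort | uniq -d
}

# Usage:
#   curl -sL https://scam-crypto.com > a.html
#   curl -sL https://john-doe-blog.com > b.html
#   shared_ids a.html b.html
```

An empty result means no overlap for these two ID types only; it does not rule out a link through other identifiers.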

4. SSL Certificate Analysis

SSL certificates (the padlock icon) contain metadata about the entity that requested them. While domain-validated certificates such as those from "Let's Encrypt" carry no organization details, paid certificates (OV or EV) contain verified company names. You can also search Certificate Transparency logs using free tools like crt.sh to find the publicly logged certificates issued for a domain.

# Connect and dump the certificate details
echo | openssl s_client -showcerts -servername example.com -connect example.com:443 2>/dev/null | openssl x509 -inform pem -noout -text

Look for:

Subject Alternative Name (SAN): Often, a single certificate is valid for multiple domains. You might find example.com also lists dev-server.com and admin-panel.com in its SAN field, revealing hidden parts of the infrastructure.
Issuer: Who verified them?
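
The full certificate dump is long, and the SAN entries are easy to miss. As a minimal sketch, the filter below reduces the text output of openssl x509 to one hostname per line. It assumes the usual "DNS:name, DNS:name" layout that OpenSSL prints for the Subject Alternative Name extension; the function name san_names is hypothetical.

```shell
# san_names: read `openssl x509 -noout -text` output on stdin and print
# each DNS entry from the Subject Alternative Name field, one per line,
# de-duplicated. IP and email SAN entries are ignored.
san_names() {
    grep -oE 'DNS:[^,[:space:]]+' | sed 's/^DNS://' | sort -u
}

# Usage (requires network access):
#   echo | openssl s_client -servername example.com -connect example.com:443 2>/dev/null |
#       openssl x509 -noout -text | san_names
```

Each hostname in the output is a candidate for the WHOIS and DNS checks described earlier.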

The Pivot: UserSearch Automation

The manual methods above are precise but slow. You cannot run dig and curl on 500 domains per hour without writing complex scripts. Furthermore, manual checks only show you the current state of the domain. They don't show you what the WHOIS record looked like three years ago before the owner enabled privacy protection.

UserSearch unifies this fragmented workflow into a single, historical, and correlational console. Instead of just querying the current state, it pivots across time and datasets to find the connections that manual tools miss.

1. Domain Ownership History (The "Time Machine")

The single most effective technique in Domain OSINT is historical regression. Most bad actors are not perfect from day one. They often register a domain using their personal email or home address, realize their mistake a month later, and then pay for WHOIS privacy.

A manual whois check today will only show "Redacted for Privacy." UserSearch’s Domain Ownership (History) module queries historical databases to find the record before it was scrubbed.

The Check: Enter the domain name in the Domain Search module.
The Output: You see a timeline of records. Scroll back to the earliest entries (2018, 2019, etc.).
The Win: You find [email protected] listed as the registrant in 2019. You can then pivot on this email address using the Email Search module to find their social media profiles, physical location, and other domains.

2. Favicon Search (Visual Fingerprinting)

A favicon is the small icon that appears in your browser tab. To the computer, this image is just a file with a specific cryptographic hash (often MMH3). Phishing kits are often lazy; they clone the legitimate banking website, including the favicon file.

When a scammer sets up 100 fake login pages on 100 different domains, they rarely change the favicon. This creates a unique digital fingerprint.

The Check: Use the Search by Favicon module in UserSearch.
The Logic: The system calculates the hash of the target’s favicon and searches the entire internet for other sites serving the exact same image.
The Win: You enter the favicon of a suspected phishing site. The results return 45 other domains—some created today, some months ago—all using that same icon. You have instantly mapped the entire active campaign, not just the single URL you started with.
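
To see the fingerprinting idea on your own machine, you can hash two saved favicon files and compare them. Note the swap: favicon search services often index an MMH3 value, which plain shell tools cannot compute without an extra library, so this sketch uses SHA-256 as a stand-in for local comparison only. It assumes sha256sum (GNU coreutils) is installed, and the function names are hypothetical. The digest it prints cannot be pasted into a service that expects MMH3.

```shell
# favicon_fingerprint: print the SHA-256 hex digest of a saved favicon
# file. This is a stand-in for MMH3, useful for local comparison only.
favicon_fingerprint() {
    sha256sum < "$1" | cut -d ' ' -f 1
}

# same_favicon: succeed (exit status 0) when two saved favicon files
# are byte-identical, judged by their SHA-256 digests.
same_favicon() {
    [ "$(favicon_fingerprint "$1")" = "$(favicon_fingerprint "$2")" ]
}

# Usage (requires network access; /favicon.ico is the conventional
# location, but a site may declare a different path in its HTML):
#   curl -sL https://example.com/favicon.ico -o a.ico
#   curl -sL https://example.org/favicon.ico -o b.ico
#   same_favicon a.ico b.ico && echo "identical favicon"
```

A content hash only matches exact copies. A favicon that has been resized or re-encoded will produce a different digest even if it looks the same.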

3. 3rd Party Website Lookup (ID Correlation)

As mentioned in the manual section, tracking codes (Google Analytics, AdSense, New Relic IDs) are sticky. They often persist across different projects owned by the same person. UserSearch automates the extraction and reverse-search of these IDs.

The Check: Run the 3rd Party Website Lookup on the target domain.
The Output: A list of every unique ID found in the source code, followed by a list of other domains that share those same IDs.
The Scenario: You are investigating a "Fake News" site. You find it uses a specific Google AdSense ID. UserSearch shows that this same AdSense ID is used on a website selling "Herbal Supplements" and another site hosting "Pirated Movies." You have now connected the disinformation campaign to a financially motivated affiliate network, rather than a state actor.

4. Threat Intelligence Integration

Domain OSINT isn't just about ownership; it's about risk. UserSearch integrates with Hudson Rock and Spamhaus to layer threat data on top of infrastructure data.

Compromised Employees (Hudson Rock): This is a powerful, often overlooked vector. If a domain’s registered admin email appears in a stealer log (data exfiltrated by a malware infection), it means the credentials to manage that domain might be for sale on the dark web. If you are defending a client, this is a critical alert.
Reputation (Spamhaus): Has this domain or its IP been flagged for sending spam or hosting C2 malware? Checking the Spamhaus Block List (SBL) helps corroborate malicious activity quickly.

5. The 'Registered-By' Pivot

In the pre-GDPR era, the "Registrant Email" was the gold standard of OSINT. While rare today on live WHOIS, it remains the most common pivot point in historical data. Tools like DomainTools built their reputation on this, and UserSearch provides similar historical lookups.

If you find an old email address like [email protected] in a 2017 record for a current scam site, you haven't just found a name—you've found a corporate entity. You can then investigate agency-design.com to see if the design agency is complicit, or if their employee went rogue. This technique—moving from a technical asset to a corporate identity—is central to infrastructure intelligence methodologies taught by vendors like ThreatConnect.

Advanced Strategies and Use Cases

Now that we have the tools, let’s look at how to combine them into sophisticated investigation workflows.

Scenario 1: The Phishing Clone Network

Context: You work for a mid-sized crypto exchange. Users are reporting a fake login page that stole their funds. (See our guide on crypto scam wallet infrastructure for more on tracing the funds themselves.)

Initial Scan: You take the phishing URL secure-login-crypto.net and run a Domain Ownership check. It’s redacted.
Visual Pivot: You notice the site uses your company’s logo as the favicon. You run a Search by Favicon. UserSearch returns 12 other domains, including update-crypto-wallet.com and verify-assets-now.org.
Infrastructure Pivot: You run a DNS/IP check on the new domains. They all resolve to the same dedicated server IP in the Netherlands.
Historical Check: You check the history of the oldest domain in the cluster, verify-assets-now.org. The 2020 record shows a registrant email: [email protected].
Attribution: You pivot to Email Search on that Gmail address, finding a GitHub account full of "phishing kit" repositories. You now have a complete case package to send to law enforcement.

Scenario 2: Corporate Espionage Attribution

Context: A competitor seems to launch products identical to yours, days after you announce them. You suspect a leak.

Discovery: You identify a suspiciously timed blog industry-insider-leaks.com that leaks your product specs.
ID Correlation: You run 3rd Party Website Lookup on the blog. It finds a unique New Relic application ID.
Reverse Search: The module reveals that this New Relic ID is also active on competitor-marketing-portal.com.
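Conclusion: This technical link strongly suggests that the "independent leak blog" is running on the same monitoring infrastructure as your competitor’s official marketing site. Because infrastructure evidence deals in probabilities, corroborate it with a second signal before treating the blog as part of a coordinated campaign rather than an independent leak.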

Common Pitfalls in Infrastructure Analysis

Even seasoned investigators get tripped up by false positives. Infrastructure analysis is rarely binary; it deals in probabilities.

1. The Cloudflare Curtain

Finding an IP address like 104.21.x.x or 172.67.x.x does not tell you where the server is. These belong to Cloudflare. If you report this IP to a hosting provider, you are reporting the CDN, not the host. You need to look for historical IP records (before the CDN was enabled) or subdomains (like mail.target.com or ftp.target.com) that often bypass the CDN and point directly to the origin server.
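
A quick way to triage resolved addresses is to flag the ones that sit behind the CDN. The sketch below is a minimal example that checks only the two prefixes named above; it is not Cloudflare's full published range list, and the function name is hypothetical.

```shell
# is_cloudflare_prefix: succeed (exit status 0) when an IPv4 address
# starts with 104.21. or 172.67., the two prefixes named in the text.
# This does NOT cover Cloudflare's complete address space.
is_cloudflare_prefix() {
    case "$1" in
        104.21.*|172.67.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (requires network access): check whether a subdomain resolves
# outside the CDN, which may indicate the origin server.
#   ip=$(dig +short mail.target.com A | head -n 1)
#   if [ -n "$ip" ] && ! is_cloudflare_prefix "$ip"; then
#       echo "mail.target.com resolves outside the checked ranges: $ip"
#   fi
```

An address outside these prefixes is a lead, not a conclusion: it may belong to a different CDN or to a third-party mail provider rather than the origin host.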

2. Shared Hosting Noise

If you find that scam-site.com is hosted on 192.0.2.1, and you perform a reverse-IP lookup, you might find 50,000 other domains on that same IP. This is "Shared Hosting" (e.g., GoDaddy, Bluehost). Being on the same IP as a legitimate flower shop does not mean the flower shop is involved. You must look for unique identifiers (shared Analytics IDs, specific sub-network ranges), not just shared public IPs.

3. The Parked Domain Trap

Many domains show up in threat feeds simply because they are "parked" (expired and bought by an ad aggregator). These pages often host generic ads that might flag as "spam," but they represent an abandoned asset, not an active threat actor. Always check the content using a safe scanner like urlscan.io or VirusTotal before concluding it is an active attack.

Legal and Ethical Guardrails

Domain OSINT is powerful because it relies on publicly available data. However, there is a fine line between observation and interaction.

Passive vs. Active: Looking at WHOIS records, DNS entries, and passive DNS history is passive. You are looking at records that already exist. This is generally legally safe for research.
Port Scanning and Probing: Actively scanning the server for vulnerabilities (e.g., running Nmap against the IP found in the A record) is active. Without explicit permission, this can be interpreted as an attack under laws like the Computer Fraud and Abuse Act (CFAA) in the US or the Computer Misuse Act in the UK.
Touching the Glass: Be careful when visiting suspicious domains in a browser. They may host browser exploits. Always use a sandboxed environment or a tool like UserSearch that fetches data on your behalf, keeping your machine safe.

Always ensure your investigation has a legitimate purpose (security research, fraud prevention, legal defense) and document your process.

Conclusion

The domain name is just the tip of the iceberg. Beneath the surface lies a complex web of historical records, technical configurations, and shared identifiers that can link even the most careful adversary to their other operations. By combining manual understanding with the automated historical and correlational power of UserSearch, you can turn a single opaque URL into a rich intelligence map.

Don't stop at the "Redacted for Privacy" screen. Dig deeper into the infrastructure.

Stop guessing. Start investigating. Run structured domain OSINT with UserSearch.

About the author

UserSearch Team

This profile is managed by a team of industry-leading Cyber and OSINT specialists. Our articles are authored by subject matter experts, including Lee Lewis—a Digital Forensics veteran and the founder of UserSearch.

Updated on Dec 13, 2025