In the first three parts of this series, we covered manual methods for gathering intelligence about a target company, its external hosts, and its employees through a myriad of services. Now we will cover different ways of automating the OSINT gathering process using theHarvester, Amass, and Recon-ng.
Links to posts in this series:
Recon Methods Part 1 – OSINT Host Discovery
Recon Methods Part 2 – OSINT Host Discovery Continued
Recon Methods Part 3 – OSINT Employee Discovery
Recon Methods Part 5 – Traffic on the Target
theHarvester
The first tool we use after manual discovery is theHarvester. This tool gathers subdomain names, IP addresses, email addresses and employee names while only needing an initial domain name to start. To fully utilize theHarvester, you will need to get API keys for the following services:
- Bing (paid)
- GitHub
- Hunter.io
- Intelx
- SecurityTrails
- Shodan (paid)
- Spyse
theHarvester will work fine without these API keys, but the search results may be limited. All of the API keys listed above, except Shodan and Bing, can be obtained for free. A list of the sources that theHarvester uses for OSINT gathering can be seen below.

Once you have your API keys configured, just run the following command and go grab a drink. It will take a while to complete but should return quite a bit of information.
The following pulls the top 1,000 results from the target domain:
python3 theHarvester.py -d <target_domain> -b all -l 1000
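If you need to run theHarvester against several domains, a small wrapper can save some typing. The following is a minimal Python sketch that assembles the same command shown above with the -d, -b, and -l flags. The function names, the default script path, and the example.com placeholder are illustrative assumptions and are not part of theHarvester itself.

```python
import subprocess
from typing import List


def build_harvester_command(domain: str, limit: int = 1000,
                            script: str = "theHarvester.py") -> List[str]:
    """Return the argument list for the theHarvester command shown above."""
    if not domain or " " in domain:
        raise ValueError("expected a single domain name")
    # -d: target domain, -b all: query all sources, -l: result limit
    return ["python3", script, "-d", domain, "-b", "all", "-l", str(limit)]


def run_harvester(domain: str, limit: int = 1000) -> int:
    """Run theHarvester and return its exit code.

    Assumes theHarvester.py is in the current working directory.
    """
    return subprocess.run(build_harvester_command(domain, limit)).returncode


if __name__ == "__main__":
    # example.com is a placeholder; use a domain you are authorized to test.
    print(" ".join(build_harvester_command("example.com")))
```

Passing the arguments as a list instead of a single shell string avoids quoting problems if a value ever contains unexpected characters.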
Amass
Another great tool for subdomain recon is Amass. While Amass does not perform OSINT on employee names or email addresses, it makes up for it with a large number of subdomain OSINT sources. Amass is set up as a suite of tools that can search for subdomains, ASNs, and IP addresses as well as perform brute force subdomain discovery.
Amass also performs best when configured with API keys to various OSINT services. A lot of the API keys theHarvester uses can also be used with Amass. A full list can be seen below. I was able to get quite a few of them for free, but not all of them.
- AlienVault
- BinaryEdge
- Censys
- CIRCL
- DNSDB (paid)
- GitHub
- NetworksDB
- PassiveTotal (paid)
- SecurityTrails
- Shodan (paid)
- Spyse
- Umbrella (paid)
- URLScan
- VirusTotal
- WhoisXML (paid)
Once Amass is configured, the first step is to use the Intel module. With Intel, you can search using a starting domain with the -d flag, search for ASNs containing the company name with the -org flag, and search for potentially related domains through reverse WHOIS searches with the -whois flag. Examples of each of these commands can be found below.
Search for domain names associated with the target domain through reverse WHOIS:
amass intel -d <target_domain> -whois
Search for the company name in ASN names:
amass intel -org <target_organization>
Search for domain names and associated IP addresses in an ASN, and print where Amass found them:
amass intel -asn <target_asn> -ip -src
Next, the Amass Enum module can be used to search for subdomains through pure OSINT resources or through active DNS brute forcing. The Enum module also lets you pick which DNS servers to query using the -r flag. Be warned that public DNS servers will sometimes block your requests if you send all of your queries to a single provider's DNS servers. Amass will spread requests out over multiple resolvers such as Google, Cloudflare, Hurricane Electric, Verisign, and Yandex. Additional resolvers can be configured in the config file as well. The brute force subdomain discovery can accept a file of common subdomains as well as masks similar to those used by Hashcat.
Search for subdomains found strictly in open-source intelligence resources:
amass enum -d <target_domain> -passive
Search for subdomains and verify info about the host through direct connections:
amass enum -d <target_domain> -active
Brute force subdomains using a mask of aaa-[a-z][a-z][a-z]:
amass enum -d <target_domain> -active -brute -wm "aaa-?l?l?l"
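To get a feel for how many DNS queries a mask will generate before you launch it, you can expand the mask offline. The following Python sketch illustrates Hashcat-style mask expansion for the ?l (lowercase letter) and ?d (digit) placeholders only. It is an illustration of the idea, not Amass's own implementation, and the function names are assumptions made for this example.

```python
import itertools
import string
from typing import Iterator, List

# Hashcat-style placeholders; only these two are handled in this sketch.
CHARSETS = {"l": string.ascii_lowercase, "d": string.digits}


def parse_mask(mask: str) -> List[str]:
    """Split a mask into one string of candidate characters per position."""
    positions = []
    i = 0
    while i < len(mask):
        if mask[i] == "?":
            if i + 1 >= len(mask) or mask[i + 1] not in CHARSETS:
                raise ValueError(f"unsupported placeholder at index {i}")
            positions.append(CHARSETS[mask[i + 1]])
            i += 2
        else:
            positions.append(mask[i])  # literal character
            i += 1
    return positions


def expand_mask(mask: str) -> Iterator[str]:
    """Yield every subdomain label the mask can produce."""
    for combo in itertools.product(*parse_mask(mask)):
        yield "".join(combo)


def mask_size(mask: str) -> int:
    """Count the candidates without generating them."""
    size = 1
    for charset in parse_mask(mask):
        size *= len(charset)
    return size


if __name__ == "__main__":
    print(mask_size("aaa-?l?l?l"))  # 26 ** 3 candidates
    print(list(itertools.islice(expand_mask("aaa-?l?l?l"), 3)))
```

Each additional ?l multiplies the number of candidates by 26, so checking the size first is a quick way to decide whether a mask is practical against the resolvers you have configured.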
One last thing to note about Amass is that it does not automatically create a configuration file. I grabbed the example file from GitHub and filled in my API keys. There are other features of Amass not covered here, such as the ability to track changes in discovered domains over time and to generate a JavaScript-based VizDB.
Recon-ng
The final tool we’ll talk about is Recon-NG. Recon-NG is more a framework of tools than a single tool. What makes it great is its extensibility through the Recon-NG Marketplace. You can choose which add-ons you want to install as well as create your own for others to use.
As with Amass and theHarvester, Recon-NG works best with API keys to various services. The full list at the time of writing can be seen below. I was able to get most of these for free; however, some required a subscription.
- BinaryEdge
- Bing (paid)
- BuiltWith
- Censys
- Flickr
- FullContact
- GitHub
- GoogleAPI
- GoogleCSE
- Hashes.org
- HaveIBeenPwned (paid)
- IPInfoDB
- IPStack
- NameChk (paid)
- Shodan (paid)
- VirusTotal
On first run, you’ll need to add the API keys. You’ll also find that no modules are installed by default. Installing is as easy as running the following commands. Some of the modules will require Python dependencies which will need to be installed outside of Recon-NG. Modules that have external dependencies will have an asterisk in the D column of Marketplace results and those requiring an API key will have an asterisk in the K column. In both cases, Recon-NG will warn you about missing dependencies and API keys after installing.
Gives a list of all modules in the Marketplace:
marketplace search
Install an individual module:
marketplace install <relative path listed in search results>
Install all modules in the recon relative path:
marketplace install recon
To use a module, run modules load <module name>, enter any prerequisite information such as a domain name or CIDR IP address range, and type run. Once the module is loaded, you can type info to see what information is required. This works great if you only need to run one or two modules. The real power of Recon-NG comes in when you script out the commands. We typically like to start a script with the following commands.
#Logs all Recon-NG Output
spool start <absolute_filepath>/recon-ng.log
#Creates a workspace to store results separate from other workspaces
workspaces create <target_name>
workspaces load <target_name>
#Sets a timeout for requests, sets DNS server to Google, and sets user agent to whatever you choose
options set TIMEOUT 30
options set NAMESERVER 8.8.8.8
options set USER-AGENT <user-agent>
#Insert domains for the target with the domain name on the line after the command and a blank line after the domain name. This can be repeated for each domain name you have.
db insert domains
<domain name>
<intentionally blank line>
#Insert full company name with a blank line after the name
db insert companies
<company name>
<intentionally blank line>
#Insert netblocks one per line
db insert netblocks
<netblock CIDR IP range>
<intentionally blank line>
Once all of those options are set, it’s just a matter of adding a line for each module that you want to run, as modules load <module name> followed by run on the next line. Each module needs to be listed on a separate line, and the modules need to be listed in the order they should be executed. For example, you’ll need to do subdomain enumeration before trying to pull information about the hosts with Shodan, and gather email addresses before attempting to verify them. The final section of the script we like to add is reporting. You can get a summary of all the OSINT gathered by adding the following lines to the end of your script.
#Loads the Excel spreadsheet export module and creates a report at the file path given.
modules load reporting/xlsx
options set FILENAME <absolute_filepath>/recon-results.xlsx
run
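If you build these scripts often, generating them from a template keeps the module order consistent between engagements. The following is a minimal Python sketch that writes out the commands described above for a list of domains and modules. The function name and parameters are assumptions made for this example, and the module path in the usage section is only an example; check marketplace search for the paths available in your installation.

```python
from typing import Iterable, List


def build_resource_script(workspace: str, domains: Iterable[str],
                          modules: Iterable[str], log_path: str,
                          report_path: str) -> str:
    """Return the text of a Recon-NG command script.

    Modules are emitted in the order given, so list enumeration
    modules before the modules that depend on their results.
    """
    lines: List[str] = [
        f"spool start {log_path}",
        f"workspaces create {workspace}",
        f"workspaces load {workspace}",
        "options set TIMEOUT 30",
        "options set NAMESERVER 8.8.8.8",
    ]
    for domain in domains:
        # The domain goes on the line after the command, then a blank line.
        lines += ["db insert domains", domain, ""]
    for module in modules:
        lines += [f"modules load {module}", "run"]
    # Reporting goes last so it sees everything gathered above.
    lines += [
        "modules load reporting/xlsx",
        f"options set FILENAME {report_path}",
        "run",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values; substitute your own target and paths.
    script = build_resource_script(
        workspace="example_target",
        domains=["example.com"],
        modules=["recon/domains-hosts/hackertarget"],
        log_path="/tmp/recon-ng.log",
        report_path="/tmp/recon-results.xlsx",
    )
    print(script)
```

The resulting text can be saved to a file and passed to Recon-NG as a resource file, which, as far as I am aware, is done with the -r option.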
Conclusion
Today, we covered three great tools for performing OSINT. While there is a lot of overlap in the results they provide, each tool covers ground the others miss, giving a more complete picture. In the next and final installment of this series, we will cover some recon methods that interact with the target while masking the source of the scan, as well as unintended information disclosures that can help with password spraying and social engineering attempts.