TheHarvester

Information Gathering Tool


Author: Christian Martorella
License: GPLv2
Software: TheHarvester
Date created: 2011
Updated: Yes
GitHub: TheHarvester GitHub

Last updated: 21 November 2018

  1. Description
  2. TheHarvester is an OSINT tool for gathering subdomains, email addresses, open ports, banners, employee names, and more from public sources such as Google, Bing, and PGP key servers. It collects this information through both passive and active information gathering.

    Use this tool to assess how visible your company is on the internet, or for information gathering during a penetration test.

  3. Cheatsheet
  4. # Syntax
    theharvester -d [domain] -l [result limit] -b [search engines] -f [filename]

    # Basic scan of the given domain, limited to 500 results per search engine
    theharvester -d google.com -l 500 -b all

    # Write the scan results to a visual report in HTML format
    theharvester -d google.com -l 500 -b all -f results.html

    -d : Specifies the domain to scan

    -l : Limits the number of search results to work with. A higher limit finds more but takes longer.

    -b : Specifies the source(s) to search. Options as of May 2018: google, googleCSE, bing, bingapi, pgp, linkedin, google-profiles, jigsaw, twitter, googleplus, all.

    -f : Specifies an output file for the results. The file is written in HTML format to the current working directory unless another path is given.
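
When a scan is launched from a script, the flags above can be assembled into an argument list. The following is a minimal Python sketch and not part of TheHarvester itself: the helper name `build_harvester_command` is hypothetical, and it assumes the `theharvester` executable is available on the PATH.

```python
import shlex


def build_harvester_command(domain, limit=500, sources="all", output=None):
    """Assemble a theharvester argument list from the flags described above.

    Hypothetical helper for illustration; it only builds the command and
    does not run it.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive number of results")
    command = ["theharvester", "-d", domain, "-l", str(limit), "-b", sources]
    if output is not None:
        # -f writes the report to the given file (HTML, per the notes above)
        command += ["-f", output]
    return command


command = build_harvester_command("google.com", output="results.html")
# Prints: theharvester -d google.com -l 500 -b all -f results.html
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` to start the scan.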

  5. Passive gathering sources:
  6. - ThreatCrowd: Open source threat intelligence

    - crtsh: Comodo certificate search

    - Google: Google search engine

    - GoogleCSE: Google custom search engine

    - Google-profiles: Google search engine, with the search restricted to Google profiles

    - Bing: Microsoft search engine

    - BingAPI: Microsoft search engine API

    - Dogpile: Dogpile search engine

    - PGP: PGP key server

    - LinkedIn: Google search engine, with the search restricted to LinkedIn users

    - Shodan: Shodan search engine; searches for open ports and banners of the discovered hosts

    - Baidu: Baidu search engine

    - Yahoo: Yahoo search engine

    - vhost: Bing virtual hosts search

    - Twitter: Searches for Twitter accounts related to the target domain (uses Google search)

    - Google+: Searches for users who work at the target company (uses Google search)
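
To illustrate what passive gathering amounts to, the sketch below pulls email addresses and hostnames for a target domain out of text that has already been retrieved from a public source. It is a simplified illustration under stated assumptions: the function name `extract_emails_and_hosts` and the regular expressions are made up for this example and are not taken from TheHarvester's own parser.

```python
import re


def extract_emails_and_hosts(text, domain):
    """Pull e-mail addresses and hostnames for `domain` out of raw text.

    Illustrative only: the patterns are deliberately simple and are not
    TheHarvester's actual parsing rules.
    """
    escaped = re.escape(domain)
    # Local part, "@", optional subdomain labels, then the target domain
    email_pattern = re.compile(
        r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*" + escaped + r"(?![A-Za-z0-9-])",
        re.IGNORECASE,
    )
    # One or more labels followed by the target domain; the lookbehind
    # avoids matching the host part of an e-mail address
    host_pattern = re.compile(
        r"(?<![A-Za-z0-9@._-])(?:[A-Za-z0-9-]+\.)+" + escaped + r"(?![A-Za-z0-9-])",
        re.IGNORECASE,
    )
    emails = sorted({match.lower() for match in email_pattern.findall(text)})
    hosts = sorted({match.lower() for match in host_pattern.findall(text)})
    return emails, hosts


sample = "Contact jane.doe@example.com or visit mail.example.com and VPN.example.com."
emails, hosts = extract_emails_and_hosts(sample, "example.com")
print(emails)  # ['jane.doe@example.com']
print(hosts)   # ['mail.example.com', 'vpn.example.com']
```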

  7. Active gathering sources:
  8. - Port scanning of the discovered hosts and takeover checks

    - DNS brute force: A plugin that runs a dictionary-based brute-force enumeration of hostnames.

    - DNS reverse lookup: Reverse lookup of discovered IP addresses to find hostnames.

    - DNS TLD (Top-level domain) expansion: TLD dictionary brute force enumeration.
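
The three DNS techniques above can be sketched with Python's standard `socket` module. This is an illustration of the general idea, not TheHarvester's implementation: the function names, the `resolver` parameter, and the sample records are assumptions made for this example.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address for hostname, or None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


def dns_bruteforce(domain, words, resolver=resolve):
    """Dictionary brute force: try each word as a subdomain label."""
    found = {}
    for word in words:
        candidate = f"{word}.{domain}"
        address = resolver(candidate)
        if address is not None:
            found[candidate] = address
    return found


def tld_expansion(name, tlds, resolver=resolve):
    """TLD expansion: try the same name under each top-level domain."""
    found = {}
    for tld in tlds:
        candidate = f"{name}.{tld}"
        address = resolver(candidate)
        if address is not None:
            found[candidate] = address
    return found


def reverse_lookup(address):
    """Reverse lookup: map an IP address back to a hostname, or None."""
    try:
        return socket.gethostbyaddr(address)[0]
    except OSError:
        return None


# Offline demonstration with a stand-in resolver (made-up records)
fake_records = {"www.example.com": "192.0.2.10", "example.org": "192.0.2.20"}
print(dns_bruteforce("example.com", ["www", "mail"], resolver=fake_records.get))
print(tld_expansion("example", ["com", "org"], resolver=fake_records.get))
```

Passing the resolver as a parameter keeps the enumeration logic separate from the network call, so the word list handling can be tried offline before any real DNS queries are sent.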

  9. Modules that require API keys:
  10. - GoogleCSE: You need to create a Google Custom Search Engine (CSE) and add your Google API key and CSE ID to the plugin (discovery/googleCSE.py).

    - Shodan: You need to provide your API key in discovery/shodansearch.py.

** For more information, check out the extra links and sources. **

50URC35: