Google hacking, also known as Google dorking,[2][3] is a hacker technique that uses Google Search and other Google applications to find security holes in the configuration and computer code that websites use.
Basics
Google hacking involves using operators in the Google search engine to locate specific strings of text on websites that are evidence of vulnerabilities, such as specific versions of vulnerable web applications. For example, the query intitle:admbook intitle:Fversion filetype:php locates PHP web pages with the strings "admbook" and "Fversion" in their titles, indicating that the site runs Admbook, a PHP-based guestbook with a known code-injection vulnerability. Default installations of many applications include their version number in every page they serve, for example "Powered by XOOPS 2.2.3 Final", a string that can be used to search for websites running vulnerable versions.
Devices connected to the Internet can also be found. For example, a search string such as inurl:"Mode=" will turn up public web cameras.
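The operator syntax described above is simple enough to assemble programmatically. The sketch below shows how such query strings are put together; the operators (intitle, inurl, filetype) are real Google search operators, but the build_dork helper is a hypothetical illustration, not part of any tool named in this article.

```python
# Sketch: assembling Google "dork" query strings from (operator, value) pairs.
# The build_dork helper is hypothetical, for illustration only.

def build_dork(*terms: tuple) -> str:
    """Join (operator, value) pairs into a single query string.

    Values containing spaces or '=' are quoted, mirroring the
    inurl:"Mode=" style used to find exposed web cameras.
    """
    parts = []
    for op, value in terms:
        if any(ch in value for ch in ' ='):
            value = f'"{value}"'
        parts.append(f"{op}:{value}")
    return " ".join(parts)

# The Admbook query from the text:
print(build_dork(("intitle", "admbook"),
                 ("intitle", "Fversion"),
                 ("filetype", "php")))
# prints: intitle:admbook intitle:Fversion filetype:php

# The webcam query:
print(build_dork(("inurl", "Mode=")))
# prints: inurl:"Mode="
```

Databases such as the GHDB (discussed below) are essentially curated collections of query strings like these, paired with descriptions of what each one exposes.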
History
The concept of "Google hacking" dates back to August 2002, when Chris Sullo included the "nikto_google.plugin" in the 1.20 release of the Nikto vulnerability scanner.[4] In December 2002 Johnny Long began to collect Google search queries that uncovered vulnerable systems and/or sensitive information disclosures – labeling them googleDorks.[5]
The list of Google Dorks grew into a large dictionary of queries, which were eventually organized into the original Google Hacking Database (GHDB) in 2004.[6][7]
Google Dorking has been involved in some notorious cybercrime cases, such as the Bowman Avenue Dam hack[12] and the CIA breach in which around 70% of its worldwide networks were compromised.[13] Star Kashman, a legal scholar, was one of the first to study the legality of this technique.[14] Kashman argues that while Google Dorking is technically legal, it has often been used to carry out cybercrime and frequently leads to violations of the Computer Fraud and Abuse Act.[15] Her research has highlighted the legal and ethical implications of the technique, emphasizing the need for greater attention to, and regulation of, its use.
Protection
The robots.txt file, well known from search engine optimization, also offers some protection against Google dorking. It can be used to disallow crawling of the entire site or of specific endpoints, which prevents Google's crawlers from indexing sensitive pages such as admin panels. Note, however, that robots.txt is itself publicly readable, so attackers can inspect it to discover the very endpoints it lists.
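As a minimal sketch, a robots.txt file served at the site root might look like the following; the /admin/ path is an illustrative example, not a path from any specific application.

```
# robots.txt, served at the root of the website
# Ask all compliant crawlers not to index a sensitive area.
# (/admin/ is a hypothetical example path)
User-agent: *
Disallow: /admin/
```

Because compliant crawlers honor these directives voluntarily and the file is public, robots.txt only keeps pages out of search results; actual access control (authentication, IP restrictions) is still required to protect the endpoints themselves.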
^Kashman, Star (2023). "Google Dorking or Legal Hacking: From the CIA Compromise to Your Cameras at Home, We Are Not as Safe as We Think". Washington Journal of Law, Technology & Arts. 18 (2).