Case Studies

Case Study #1 - Deep Web Database - Network Security

As an example of applying some of the principles in this presentation, let’s do a search on “network security” using a surface search engine and a deep Web database.

First let’s do a surface search on Google. The result is 42 million hits. I don’t really have time to look at 42 million hits; even a million might take a while. Realistically, I’ll look at the first 100 or so, perhaps adjust my keywords, and then search again. The results show too many vendor sites and only a few dozen sites that might have good information. These results were above average for ten minutes of work. However, I will need to evaluate these few dozen sites, and that could take a few hours. Maybe I will find something useful among them, maybe not.

Now let’s try a deep Web site like Educause, again using the same keywords, “network security.” There are 2,620 hits. I look at the first hundred or so and none of these are vendors. There are, however, many PDF files that look like they contain useful information. The first hit says, “Welcome to the Computer and Network Security Web site, developed by the EDUCAUSE/Internet2 Computer and Network Security Task Force” (para. 5). Following this link, I read, “The Web site is intended to be a focal point of information and resources on computer and network security for the higher education community. The navigation on the left will lead you to content determined to be most relevant by the Task Force” (para. 1). I go to the "About Task Force" link. It tells me this task force has 36 members, lists their names, positions and contact information, tells me they have been working together on this project for the past five years, and states that the task force “actively promotes effective practices and solutions for the protection of information assets and critical infrastructures” (para. 1). Now I go back to the main page. This page is the hub of access to all the resources assembled for security, including best practices, reports, seminars and a cyber security forum.

Comparing the quality of the results from the two methods for this search, I find that the deep Web results have more substance and credibility. Of course, this will not always be the case. The surface and deep Web each have their advantages and disadvantages depending on the search topic. You need both aspects of the Web, plus a phone to call people (not all information is on the Web). In this example, within three minutes, the deep Web search revealed a goldmine of high-quality information very relevant to the search topic.

Case Study #2 - Specialized Search Engines - PDA Security

As another example of applying some of the principles in this presentation, let’s do a search on “PDA security” using data mining with a specialized search engine.

Specialized search engines search for databases and help eliminate the “noise” associated with general search engines. For example, using the specialized search engine Beaucoup to search on the keyword "security" finds 69 sites having security databases. Going to one of these 69 websites, SANS, and doing a keyword search on "security mobile handheld" yields 8 hits. The first hit is a PDF file entitled "S.C.O.R.E Personal Digital Assistant Audit Checklist, July 2005." This is precisely what was sought. It gives a checklist for securing PDAs and a list of vendors that provide security devices for PDAs in these categories: user authentication, anti-virus, theft protection, file encryption, firewall, virtual private network, data integrity, device enterprise management, and device backup.

As a bonus, in this search we have learned that the SANS website has security checklists available that will help with a wide variety of security concerns besides PDAs. SANS is a website we will want to visit again in the future. 

Beaucoup is just one of many specialized search engines. You may need to try several before you find a database that will answer your needs. Additional specialized search engines are listed in the "Data Mining" section of this presentation.
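
The two-stage search in this case study (first find databases on a topic, then run a keyword search inside one of them) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the directory, the database names, and the document titles below are all made up, and the matching is simple substring matching. It does not represent how Beaucoup or SANS actually index or search their content.

```python
from dataclasses import dataclass, field


@dataclass
class Database:
    """A hypothetical searchable database listed in a directory."""
    name: str
    topics: set[str]
    documents: list[str] = field(default_factory=list)


def find_databases(directory, topic):
    """Stage 1: return the databases tagged with the given topic."""
    topic = topic.lower()
    return [db for db in directory if topic in db.topics]


def search_database(db, query):
    """Stage 2: return titles that contain every keyword in the query."""
    keywords = query.lower().split()
    return [
        title for title in db.documents
        if all(keyword in title.lower() for keyword in keywords)
    ]


# Hypothetical directory; none of these entries are real sites or documents.
DIRECTORY = [
    Database(
        "Example Security Institute",
        {"security", "audit"},
        [
            "Handheld Mobile Device Security Checklist",
            "Firewall Audit Checklist",
            "Mobile Security Policy for Handheld Users",
        ],
    ),
    Database("Example Recipe Archive", {"cooking"}, ["Mobile Kitchen Handbook"]),
]

# Stage 1 narrows the field to topical databases; stage 2 searches one of them.
candidates = find_databases(DIRECTORY, "security")
hits = search_database(candidates[0], "security mobile handheld")
print([db.name for db in candidates])
print(hits)
```

The point of the sketch is the order of operations: the first step removes unrelated sources before any keyword search happens, which is why the second step returns so little noise.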