Faulty Filters:
How Content Filters Block Access to
Kid-Friendly Information on the Internet

December 1997

Electronic Privacy Information Center
Washington, DC
http://www.epic.org/


SUMMARY

In order to determine the impact of software filters on the open exchange of information on the Internet, the Electronic Privacy Information Center conducted 100 searches using a traditional search engine and then conducted the same 100 searches using a new search engine that is advertised as the "world's first family-friendly Internet search site." We tried to locate information about 25 schools; 25 charitable and political organizations; 25 educational, artistic, and cultural institutions; and 25 concepts that might be of interest to young people. Our search terms included such phrases as the "American Red Cross," the "San Diego Zoo," and the "Smithsonian Institution," as well as such concepts as "Christianity," the "Bill of Rights" and "eating disorders." In every case in our sample, we found that the family-friendly search engine prevented us from obtaining access to almost 90 percent of the materials on the Internet containing the relevant search terms. We further found that in many cases, the search service denied access to 99 percent of material that would otherwise be available without the filters. We concluded that the filtering mechanism prevented children from obtaining a great deal of useful and appropriate information that is currently available on the Internet.


INTRODUCTION

The subject of whether to promote techniques to limit access to information available on the Internet grows out of the litigation against the Communications Decency Act. In that case, the Supreme Court ruled that the First Amendment protected the right to publish information on the Internet. The Court also found that "the interest in encouraging freedom of expression in a democratic society outweighs any theoretical but unproven benefit of censorship."

Shortly after the Supreme Court issued its decision, the White House convened a meeting to discuss the need to develop content filters for the Internet. The Administration unveiled a "Strategy for a Family Friendly Internet." According to the White House proposal, a key component would be the promotion of labeling and screening systems designed to shield children from inappropriate Internet content.

President Clinton said that he thought it was necessary to develop search engines specifically designed to screen out objectionable material. He said that it "must be our objective" to ensure that the labeling of Internet content "will become standard practice." Vice President Gore said, "Our challenge is to make these blocking technologies and the accompanying rating systems as common as the computers themselves."

In a statement released during the White House meeting, five Internet companies -- CNET, Excite, Infoseek, Lycos and Yahoo! -- expressed their support of the "White House proposal for the Internet industry to adopt a self-regulated rating system for content on the Web."

Following the White House summit, several companies announced that they would develop products and services for content filtering. On October 6, Net Shepherd and AltaVista launched Family Search. They described the product as "the world's first family-friendly Internet search site." Family Search is the first product to incorporate two of the goals identified at the July White House meeting -- content rating and filtered search engines.
 

THE FAMILY SEARCH SERVICE

Net Shepherd Family Search is a web-based search engine located on the Internet at http://family.netshepherd.com. According to the "Frequently Asked Questions" (FAQ) file available at the site, Family Search "is designed to make the Internet a friendlier, more productive place for families. This is achieved though filtering out web sites judged by an independent panel of demographically appropriate Internet users, to be inappropriate and/or objectionable to average user families."

The Family Search service operates as follows: A user submits a search request, such as "American Red Cross." That request is then directed to the AltaVista search engine. The AltaVista results are then filtered through Net Shepherd's ratings database, and the filtered results are presented to the user. For this reason, conducting a search using the AltaVista search engine, and then conducting the same search using the Net Shepherd search engine, shows exactly how much information is removed by the Net Shepherd filter.
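
The sequence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the example URLs, and the idea of modelling the ratings database as a simple yes/no check are hypothetical, since Net Shepherd does not describe how its filter is implemented.

```python
from typing import Callable, Iterable, List


def family_search(
    unfiltered_results: Iterable[str],
    is_acceptable: Callable[[str], bool],
) -> List[str]:
    """Return only the results that pass the ratings check.

    `unfiltered_results` stands in for the documents AltaVista returns;
    `is_acceptable` stands in for a lookup in the ratings database.
    Both are hypothetical placeholders, not Net Shepherd's actual interface.
    """
    return [url for url in unfiltered_results if is_acceptable(url)]


# Made-up example data: three unfiltered results, one rated acceptable.
unfiltered = [
    "http://example.org/a",
    "http://example.org/b",
    "http://example.org/c",
]
rated_acceptable = {"http://example.org/b"}

filtered = family_search(unfiltered, lambda url: url in rated_acceptable)
removed = len(unfiltered) - len(filtered)
print(f"{removed} of {len(unfiltered)} results removed by the filter")
```

Because the filtered list can only ever be a subset of the unfiltered list, the difference between the two counts is the number of documents the filter removed.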

Net Shepherd claims that it has completed the most comprehensive rating of material on the World Wide Web. According to the company (as reported in the FAQ), in March of 1997 it had rated "97% of the English language sites on the Web."

For this survey, it is particularly important to emphasize two claims made by Net Shepherd about its family-friendly search engine. First, Net Shepherd states that the filtering criterion is whether a web site is "inappropriate and/or objectionable to average user families." Second, Net Shepherd states that its review of material available on the Web is comprehensive -- "97% of the English language sites."
 

SURVEY METHODOLOGY

We set out to determine the actual effect of the filtering process -- to quantify the amount of information that was actually blocked by a filtered search engine. Family Search's use of AltaVista results enabled us to conduct a straightforward comparison of a filtered and an unfiltered search. We first entered our search criteria into the AltaVista search engine [http://altavista.digital.com] and recorded the number of documents produced in response to our request. This number appeared at the top of search results returned by AltaVista.

We then duplicated our search request with Family Search and recorded the number of documents located through that search engine. Unlike AltaVista, Family Search does not report the number of matching documents. We had to read each page of the search results and manually count the number of documents retrieved.

All search requests containing more than one word were submitted in quotation marks.

Family Search allows the user to designate a desired "quality" level for its search results. In conducting our searches, we used the default of "no preference." This is the most comprehensive setting and allowed us to retrieve all of the documents that Family Search would provide.

All of our searches were conducted between November 17 and November 26, 1997. We conducted 100 searches for key phrases using the unfiltered and the filtered search engines. We divided the 100 searches into four groups:

Elementary, middle and high schools
Charitable and political organizations
Educational, artistic and cultural institutions
Miscellaneous concepts and entities

We were particularly interested in the topics that would interest young people. For this reason we selected search phrases for organizations and ideas that we thought would be or should be of interest to children ages 18 and below. We are aware that not all families would agree that all of the phrases we selected would be appropriate for their children, but by and large we thought the 100 phrases we selected would likely be the types of searches that children who are using the Internet for non-objectionable purposes would conduct and that their parents would probably encourage.

Our findings are contained in the attached table. The results are summarized below:
 

Survey of Elementary, Middle and High Schools

With the growth of the Internet, many schools are today taking advantage of new communications technology. Not only are students able to access information around the world from a computer terminal in their classroom, they are also able to set up web sites. Many of these sites contain practical information -- how to contact teachers, homework assignments, and cancellation policies. Many sites also include school projects. Although the content of the sites is as different as are the schools, one thing seems clear -- the web sites in this category are web sites created for young people and often by young people. Thus when we tried locating these sites through the family-friendly search engine, we were surprised by the outcome.

The Arbor Heights Elementary School in Seattle, Washington maintains a highly regarded web site at http://www.halcyon.com/arborhts/arborhts.html. More than 70,000 people have visited the web site in the last two years. The school also publishes a magazine specifically for kids aged 7 through 12 called "Cool Writers Magazine" that is available at the web site.

If you go to the AltaVista search engine and search for "Arbor Heights Elementary," you will get back 824 hits. But if you use the Net Shepherd family-friendly search engine, only three documents are returned. In other words, Net Shepherd blocks access to more than 99 percent of the material that would otherwise be available on AltaVista containing the search phrase "Arbor Heights Elementary."
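
The arithmetic behind that figure can be checked with a short Python sketch. The function name `blocked_percent` is our own label, introduced only for illustration; the two counts are the ones reported above.

```python
def blocked_percent(unfiltered_count: int, filtered_count: int) -> float:
    """Percentage of unfiltered results missing from the filtered results."""
    if unfiltered_count <= 0:
        raise ValueError("unfiltered_count must be positive")
    if not 0 <= filtered_count <= unfiltered_count:
        raise ValueError("filtered_count must be between 0 and unfiltered_count")
    return 100.0 * (unfiltered_count - filtered_count) / unfiltered_count


# "Arbor Heights Elementary": 824 AltaVista hits, 3 Family Search documents.
arbor_heights = blocked_percent(824, 3)
print(f"{arbor_heights:.1f} percent blocked")  # prints "99.6 percent blocked"
```

The same calculation applies to every pair of counts in the survey: the unfiltered AltaVista count is the denominator, and the documents missing from Family Search are the numerator.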

We found similar results with other searches. More than 96 percent of the material referring to "Providence School" is blocked by Family Search. Over 98 percent of the material referring to "Ralph Bunche School" is also blocked.

This seemed extraordinary to us. The blocking criterion deployed by Net Shepherd is, according to the company, whether a site is "inappropriate and/or objectionable to average user families." We looked at several of the pages that were returned with the unfiltered search engine but not with the filtered search engine. We could not find anything that an average user family would consider to be inappropriate or objectionable.

We also noticed that as the web sites became more popular, that is to say as more documents were returned, the percentage of materials available dropped. In our survey of school web sites, the range of materials blocked went from 86 percent to 99 percent, but once more than 250 documents were available from an AltaVista search, at least 94 percent of the material would always be blocked by Family Search. Once more than 500 documents were available from an AltaVista search, that number rose to 98 percent.

Survey of Charitable and Political Organizations

We selected 25 organizations representing national charities and groups across the political spectrum. Many of these organizations were established to provide services and assistance to children and parents. All have made important use of the Internet to provide timely and useful information on-line at little or no cost to families across the country.

The American Red Cross site (http://www.crossnet.org/), for example, provides an extraordinary collection of information about public health and medical resources. The American Red Cross has a special interest in families. It designated November "Child Safety and Protection Month." If you go to this web page [http://www.crossnet.org/healthtips/firstaid.html] you will find a special section devoted to "Health and Safety Tips: How to Protect Your Family with First Aid Training."

These resources and other similar materials are available if you conduct an AltaVista search for "American Red Cross." Almost 40,000 documents were returned with the search. But a search with Family Search for the same phrase produced only 77 hits. The search engine filter had blocked access to 99.8 percent of the documents concerning the "American Red Cross" that would otherwise be available on the Internet.

Similar results were found when we conducted searches for the "Child Welfare League of America," "UNICEF" and "United Way."

Political organizations are also subject to extensive filtering. More than 4,000 documents about the NAACP can be found by means of AltaVista, but Family Search seems to believe that only 15 documents on the Internet concerning the NAACP are appropriate for young people.

Again we found that as search phrases became more popular, that is to say that as more documents were returned in response to an unfiltered search request, Family Search was more likely to block a higher percentage of materials. In this category, the amount of blocked material ranged from 90 percent to 99 percent, but once more than a thousand documents would be available with the unfiltered search, we found that 99 percent of the material would be blocked by Family Search.
 

Survey of Educational, Artistic and Cultural Institutions

Many organizations use the Internet today to provide all types of valuable information for young people. We conducted searches for many well known kids' activities, such as "Disneyland," "National Zoo," and the "Boy Scouts of America."

The National Aquarium in Baltimore is one of the top attractions for young people in the mid-Atlantic region. The Aquarium has created an extensive web site [http://www.aqua.org/], filled with a lot of neat stuff. If you go to Think Tank, you can try to answer a daily question about aquatic life. In the Education section of the web site, titled "Wonder Leads to Understanding," you will learn more about special programs at the National Aquarium for young people. The Aquarium's resources are widely found across the Internet. An AltaVista search produced 2,134 responses. But the family-friendly search engine produced only 63 responses.

Intrigued by the tremendous discrepancy, we decided to visit every one of the first 200 web pages returned by AltaVista to see how it could be that, on average, 97 percent of the material would be considered objectionable to the average user family. We did find several speeches and papers that mentioned the National Aquarium as well as several events that were held at the National Aquarium. We also learned that the United States does not have the only National Aquarium. Others can be found in Australia and the Philippines. We learned that a few people take family pictures when they go to the National Aquarium and that people who work at the Aquarium mention it on their resumes. But we couldn't find any objectionable or inappropriate material.

Again, we found that as the sites became more popular, it was more difficult to find information through Family Search. For searches of information on the Internet on many of the most popular educational institutions in the United States for kids, Family Search routinely blocked 99 percent of the documents. "Yellowstone National Park" produced a blocking rate of 99.8 percent. The blocking rate for the "San Diego Zoo" was 99.6 percent.

One of the most peculiar results in the entire survey concerned our search for the "National Basketball Association." A straightforward search on AltaVista produced 18,018 hits. But when we tried Family Search, only two documents were provided. We have no idea what is in the remaining 18,016 documents that Family Search considers to be objectionable for the average family using the Internet.
 

Survey of Miscellaneous Concepts and Entities

For this last category, we considered the topics that students might be interested in learning more about as part of a school research paper or similar project. We tried to select concepts and entities from a range of areas appropriate for young people -- science, history, geography, government, religion, as well as famous people.

Consider, for example, a young student who is writing a research paper on "Thomas Edison," one of the greatest inventors of all time. If the student undertakes a search with AltaVista, 11,522 documents are returned. But if the student uses the Family Search site, only nine documents are produced. Similar results will be found with such search phrases as "Betsy Ross," "Islam," "Emily Dickinson," and "United States Supreme Court."

We recognize that young people also have concerns about sensitive topics such as eating disorders, puberty, and teen pregnancy. Parents' views on how best to handle such issues vary considerably from family to family. Not surprisingly, most of the documents available on the Internet about these topics are extensively blocked by Family Search. But what was surprising to us is that the blocking of these sensitive matters was not any greater than with such topics as "photosynthesis" (99.5 percent), "astronomy" (99.9 percent) or "Wolfgang Amadeus Mozart" (99.9 percent). In other words, it is just as difficult to get information about the "Constitution of the United States" -- actually, somewhat more so -- as it is to get information about "puberty" using a family-friendly search engine.

Even Dr. Seuss fares poorly with this family-friendly search engine. Only eight of the 2,638 references on the Internet relating to Dr. Seuss are made available by Family Search. And one of the eight documents that was produced by the search engine turned out to be a parody of a Dr. Seuss story using details from the murder of Nicole Brown Simpson.
 

LIMITATIONS OF SURVEY

We recognized in the course of the survey a number of limitations on our survey method. First, the figures that we provide regarding how much material the search engine blocks actually represent a percentage of the information blocked that would otherwise be available by means of the AltaVista search engine. There is material available on the Internet that is not located by AltaVista, but could be found by other locator services such as Yahoo! or Hotbot. If this factor were taken into account, the percentage of materials blocked by Family Search, expressed as a percentage of all the material available on the Internet containing the relevant search phrases, would necessarily increase.
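
This point can be made precise with a short calculation. The symbols are introduced here only for illustration: let F be the number of documents Family Search returns for a phrase, A the number AltaVista returns, and T the number of documents on the whole Internet containing the phrase. The calculation assumes that Family Search returns nothing beyond its F filtered AltaVista results and that T is at least as large as A.

```latex
\[
  1 - \frac{F}{T} \;\ge\; 1 - \frac{F}{A}
  \qquad \text{whenever } T \ge A > 0 \text{ and } F \ge 0 .
\]
```

Under those assumptions, the blocked share measured against the whole Internet is never smaller than the share we report, and it is strictly larger whenever T exceeds A and F is greater than zero.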

We also recognize that there is some ambiguity in search terms and that context is often necessary to establish meaning. We tried where possible to select search terms that would reduce the risk of ambiguity.

We did not attempt to review all of the filtering products currently available. For the reasons described above, and particularly the emphasis that filter proponents have placed on search engines that can perform this task, we believed it was appropriate to limit our study to the one search engine specifically designed to block access to "inappropriate material."

CONCLUSION

Our research showed that a family-friendly search engine, of the kind recommended by proponents of Internet rating schemes at the White House summit in July 1997, typically blocked access to 95-99 percent of the material available on the Internet that might be of interest to young people. We also found that as information on popular topics became more widely available on the Internet, the search engine was likely to block an even higher percentage. We further found that the search engine did not seem to restrict sensitive topics for young people any more than it restricted matters of general interest. Even with the very severe blocking criteria employed, we noted that some material which parents might consider to be objectionable was still provided by the family-friendly service.

Our review led us to conclude that proponents of filters and rating systems should think more carefully about whether this is a sensible approach. In the end, "family-friendly" filtering does not seem very friendly.


RECOMMENDATIONS

While it is true that there is material available on the Internet that some will find legitimately objectionable, it is also clear that in some cases the proposed solutions may be worse than the actual problem. Filtering programs that deny children access to a wide range of useful and appropriate materials ultimately diminish the educational value of the Internet.

We hope that additional research will be done on the impact other filtering programs may have on the ability of young people to obtain useful information on the Internet. Without such studies, it is not possible to say whether it is sensible to promote these programs.


RESOURCES

Internet Free Expression Alliance [http://www.ifea.net/] -- IFEA was established to protect the free flow of information on the Internet. It includes more than two dozen member organizations. Information is available from the IFEA web site about rating and filtering systems, including the views of the American Civil Liberties Union, the American Library Association, the Computer Professionals for Social Responsibility, the Electronic Frontier Foundation, the Electronic Privacy Information Center, the National Campaign for Free Expression, the National Coalition Against Censorship, and others.
 


ABOUT EPIC

The Electronic Privacy Information Center is a public interest research organization, based in Washington, DC.

Electronic Privacy Information Center
666 Pennsylvania Ave., SE Suite 301
Washington, DC 20003
+1 202 544 9240 (tel) +1 202 547 5482 (fax)
http://www.epic.org/