| Last spring the Institute for Social Research conducted the 1997 Canadian Election Study (CES) in which 100 or more thirty-minute telephone interviews were completed on each of the 36 days of the Canadian election campaign (April 4 to June 6). Respondents were asked, among other things, for which party they expected to vote in the coming election. As an experiment, and to parallel this study, the Institute decided to mount an Internet survey asking people to answer a set of basic voting items and an abbreviated set of demographic questions.
For more than two years different organizations have been conducting surveys on the World Wide Web. Some of these are product usage or satisfaction surveys, others are personality trait scoring instruments, while others are forms of public opinion polling. Our intention was to see how closely an Internet survey would parallel a more traditional survey of the general population. A hot link to the survey was placed on York University's home page. No other advertising, solicitation or promotion of the site was attempted.
The site was visited more than 1,300 times, and 695 completed surveys were recorded during the 32 days the survey was accessible on the Internet. York University students represented 54% of all visitors, clearly not a representative sample. Also, more than 91% of those who visited the site and 'voted' were from Ontario, 82% of whom were from Metropolitan Toronto.
Given that the preponderance of the Internet voters were from Metropolitan Toronto, and ignoring the biased sample and issues of representativeness, we proceeded to compare the voting intentions of the 465 people who indicated they lived in Metropolitan Toronto and 'voted' only once, with the 249 respondents from the CES telephone survey who lived in the same area.
Figures 1 and 2 show the voting intentions of these two groups. As one can see, while the size of the Liberal vote was virtually the same (around 32%), the NDP vote from the Internet was twice that from the telephone survey. The size of the 'don't know' group also differed considerably between the two samples: 29% of the CES telephone survey respondents did not know which party they were going to vote for, while only 19% of the Internet voters did not know.

To correct the Internet sample for representativeness we used 1991 census data supplied on the Public Use Microdata File (Individuals) from Statistics Canada. Using a three-way cross-tabulation, we applied post-stratification weights to the Internet data such that the number of individuals in each age, gender and education cohort would be the same as in the population. After applying these correction weights there were still large differences and the Internet survey results were no closer to the telephone survey data. For example, 48% of the respondents to the Internet survey were between 18 and 34 years of age although only 39% of the Canadian population is between these ages. The telephone survey produced a statistic of 38% in this same age cohort, a number almost identical to the population.
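The weighting step described above can be illustrated with a minimal sketch. It assumes each respondent is reduced to an (age group, gender, education) cell and that the population share of each cell is known from census data. The function names, the cell labels and all numbers below are hypothetical and are not taken from the study.

```python
from collections import Counter


def post_stratification_weights(sample_cells, population_shares):
    """Return a weight for every cell that occurs in the sample.

    weight = (population share of cell) / (sample share of cell)
    Cells with no respondents cannot be weighted and are simply absent.
    """
    n = len(sample_cells)
    counts = Counter(sample_cells)
    return {
        cell: population_shares[cell] / (count / n)
        for cell, count in counts.items()
    }


def weighted_share(sample_cells, weights, predicate):
    """Weighted proportion of respondents whose cell satisfies predicate."""
    total = sum(weights[cell] for cell in sample_cells)
    hits = sum(weights[cell] for cell in sample_cells if predicate(cell))
    return hits / total


# Hypothetical toy sample of 10 respondents, skewed toward young males.
sample = (
    [("18-34", "M", "HS")] * 6
    + [("18-34", "F", "HS")] * 2
    + [("35+", "M", "HS")] * 1
    + [("35+", "F", "HS")] * 1
)

# Hypothetical population shares for the same cells (they sum to 1).
population = {
    ("18-34", "M", "HS"): 0.2,
    ("18-34", "F", "HS"): 0.2,
    ("35+", "M", "HS"): 0.3,
    ("35+", "F", "HS"): 0.3,
}

weights = post_stratification_weights(sample, population)
female_share = weighted_share(sample, weights, lambda cell: cell[1] == "F")
young_share = weighted_share(sample, weights, lambda cell: cell[0] == "18-34")
```

In this toy case every population cell has at least one respondent, so the weighted demographic shares match the population exactly. Weighting of this kind only aligns the demographic margins; it does not by itself make the opinions of a self-selected sample resemble those of the population.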
To say this Internet survey was a form of public opinion polling would be very misleading - nothing could be further from the truth. We discovered this by looking at the people who 'voted' in our Internet survey, already knowing that half of them were students. Table 1 provides a profile of the average type of individual who visited the survey site.
| Table 1 |
| Comparisons of Demographic Items by Survey Type |
| | Telephone Survey (weighted) | Web Survey (unweighted) | Web Survey (post-stratified) | PUMF 1991 (Stats Canada) |
|---|---|---|---|---|
| Female | 54.1% | 35.7% | 52.6% | 51.1% |
| Age 18-34 | 37.7% | 66.1% | 48.2% | 39.0% |
| High School Ed. | 23.5% | 36.1% | 16.9% | 15.1% |
| Working (FT/PT) | 63.3% | 76.6% | 73.7% | 63.9% |
The most noticeable problem was the fact that only one third were women. (This predominance of males is quite common among Internet users and computer users in general, although the ratio is changing over time.) Also, two thirds were between 18 and 34 years old and more than one third had a high school education, figures far exceeding those in the actual population of Metropolitan Toronto.
As the telephone survey used a probability sample design, the demographic distributions almost exactly mirrored the census data except for education. (Over-representation of individuals who are more highly educated than the general population is well documented for surveys conducted in North America.) Using weights, as indicated above, to correct for the lack of a representative sample forced the Internet data to closely mirror the census data. (The age groups could not be corrected more precisely because several of the 50+ age cells in the 'age by education by gender' table were empty.)
In a final attempt to compare the two data collection methods with respect to voting intention, we looked at the single largest cohort in the Internet sample: males aged 18 to 34 who had a high school education and lived in Ontario. Figures 3A and 3B show the voting preferences for this single cohort of educated young males. Again, ignoring statistics and the standard errors (at least +/-10% for the telephone sub-sample), the voting patterns for one group are clearly different from the other.
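To give a sense of what an error of "at least +/-10%" implies about the size of the sub-sample, the sketch below uses the usual normal approximation for a proportion. It assumes the quoted figure is a 95% margin of error under simple random sampling, which the text does not state, and the function names are illustrative only.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation confidence interval
    for a proportion p estimated from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)


def sample_size_for_margin(margin, p=0.5, z=1.96):
    """Sample size at which the margin of error equals `margin`
    (p = 0.5 is the worst case)."""
    return (z / margin) ** 2 * p * (1 - p)


# Under these assumptions, a margin of 10 points corresponds to
# roughly 96 respondents; smaller sub-samples have wider margins.
n_at_ten_points = sample_size_for_margin(0.10)

# For comparison, the full Metropolitan Toronto telephone sample
# (249 respondents, as reported earlier in the article).
toronto_margin = margin_of_error(0.5, 249)
```

Under these assumptions the telephone sub-sample for this cohort would contain fewer than about a hundred respondents, which is why differences between the two cohort figures should be read cautiously.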

In conclusion, the samples from the two surveys are simply from different populations, something we knew intuitively and expected from the outset. An Internet survey is a survey of a self-selected sample; any representativeness of the general population is accidental and not the result of the application of sampling methodology. Weighting or post-stratifying the results of the Internet survey fails to yield comparable results. As a result, we doubt that web surveys can be used as a substitute for more traditional methods of surveying the general population. A reliance on an Internet survey could lead to a misunderstanding of public opinion. |