The following is a series of blog posts by Billy Almarinez, published in his Facebook notes. I am sharing this with you because it gives revealing insight into those SWS and Pulse Asia surveys. You can likewise access his notes directly on his Facebook account by clicking HERE.
1. Analysis of Pulse Asia and
Social Weather Stations Survey Methods
by Billy Almarinez
27 March 2010
Sample Size – Part 1
Not only have I been studying and teaching statistics as a college instructor, but I have also been using it as a scientist and researcher for quite some time now. However, it was only during the early morning of March 26, 2010 that I thought of scrutinizing SWS and Pulse Asia surveys from a statistical standpoint. Although SWS and Pulse Asia never reveal how they actually conduct their surveys (aside from indicating the questions asked, the number of respondents, and the margin of error they set), and although they usually argue that their methods are “tried and tested” ones, I think it would not hurt to take a look at how representative their survey results are of the entire population of registered voters, using another generally accepted and “tried and tested” method we use in statistics.
I’m talking about Slovin’s formula.
It is only from members of a sample (as respondents) that data would be obtained through a survey, since a census (or gathering data from the entire population) is not feasible for data gathering given a short span of time and limited resources. It is important, however, that the sample used be as representative of the population as possible, so that inferences derived from analysis of sample data may be more or less applicable to the whole population. This may be ensured by using appropriate sampling methods and using appropriate sample sizes.
In statistics, Slovin’s formula is a generally accepted way to determine the appropriate size of a sample, so as to ensure better representation of a population of known size. The formula may be expressed as follows:
n = N / (1 + Ne^2)
where n = sample size
N = population size
e = margin of error
Again, in the context of SWS and Pulse Asia surveys, these groups could argue that they use formulas other than Slovin’s formula in coming up with sample sizes of 2,100 and 1,800 respondents, respectively (I reviewed news clips from the ABS-CBN News web site and noted that these two figures are the sample sizes most commonly used by the two survey groups). However, if we use Slovin’s formula (which is, again, a generally accepted and commonly used method in statistics), rather alarming ideas may be derived (alarming, considering how some Filipinos base and defend their voting decisions on survey results).
Considering that both survey groups usually set the margin of error at plus or minus 2 percent (or 0.02), here is what Slovin’s formula says about the SWS and Pulse Asia sample sizes (anyone with a considerable aptitude in algebra may verify these):
- SWS’s survey over 2,100 respondents, with margin of error set at 0.02, assumes a population composed of only 13,125 individuals. In the context of election-related surveys, that would point to the survey results being possibly representative of a population of 13,125 registered voters nationwide.
- Pulse Asia’s survey over 1,800 respondents, with margin of error set at 0.02, assumes a population composed of only 6,429 individuals. In the context of election-related surveys, that would indicate that the survey results may be representative of a population of 6,429 registered voters nationwide.
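The two figures above come from inverting Slovin’s formula: solving n = N / (1 + N·e²) for N gives N = n / (1 − n·e²). A short sketch reproduces the arithmetic:

```python
# Inverting Slovin's formula: from n = N / (1 + N*e^2),
# solving for N gives N = n / (1 - n*e^2).
# The inversion is only meaningful while n*e^2 < 1.

def implied_population(n, e):
    """Population size implied by a sample of size n at margin of error e."""
    return n / (1 - n * e**2)

print(round(implied_population(2100, 0.02)))  # 13125 (SWS figure)
print(round(implied_population(1800, 0.02)))  # 6429  (Pulse Asia figure)
```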
Now let’s see…. Based on Slovin’s formula, the SWS sample size seems to assume that there are only 13,125 registered voters, while the Pulse Asia sample size seems to assume that there are only 6,429 registered voters nationwide. How many registered voters are there in the country? Can anyone provide the actual population size of registered voters? Is there anybody reading this who knows anyone from COMELEC? I’m sure there’s a definite figure.
Well, with or without the actual figures from COMELEC, I believe 13,125 and 6,429 are gross underestimations of the actual number of registered voters in the country.
I’m not trying to disprove SWS or Pulse Asia here. Again, it is highly likely that they use methods that do not include Slovin’s formula. However, here’s my point: before you believe that SWS and/or Pulse Asia survey results are what can actually be expected if elections were held then and there, think more than twice; from a statistical standpoint, it is also highly likely that the results do not really reflect what the entire Filipino electorate will ultimately decide.
And that is not yet considering the sampling method employed by these survey groups.
Hence, to the SWS and Pulse Asia survey frontrunners and their supporters, I suggest you not keep your hopes too high, or you may end up disappointed if the actual results of the elections do not reflect the trends in those survey results. And to survey tailenders and their supporters, there may actually be valid bases for not giving much credence to these survey results. Quoting Sen. Gordon: “The real ‘survey’ is on May 10, 2010.”
2. ANOTHER TAKE ON SWS AND PULSE ASIA
SURVEYS BY A STATISTICS INSTRUCTOR
By Billy Almarinez
27 March 2010
Sampling Method – Part 2
Last time, I attempted to discuss the questionable sample sizes being employed by SWS and Pulse Asia in their surveys. Here, I am going to give my take on the sampling method.
Again, as an instructor of college statistics and methods of research, I am inclined to question the methodology used by SWS and Pulse Asia in conducting their surveys. Why? Because it is actually dubious how these two survey firms come up with results almost every few weeks if they are truly using scientifically and statistically sound methodologies. The question is rooted mainly in their reluctance (for some reason or another) to disclose the details of the procedures they employ. Scientific technical reports, like those that present survey results, should include a detailed description of how sampling was carried out. Unfortunately, every time SWS and Pulse Asia release survey results, they provide only findings, without specifying in detail how they conducted the study. They indicate the sample size used, the margin of error they set, and the question given to respondents, but the sampling method and how they actually conducted the survey (i.e., how they distributed the survey questionnaires) seem to remain undisclosed to the general public.
Recently, news reports (mainly from ABS-CBN and GMA 7) on the most recent Pulse Asia survey results indicated that the survey firm used a “multistage random sampling method”. What did Pulse Asia mean by that? And how did they actually determine who the respondents would be? It is very easy for a researcher to say that he/she used or is going to use a random sampling method, but conducting one is actually not that simple. It is not as simple as going out in the street and handing a survey form to somebody the researcher meets “randomly”. Such an activity is not a probability (random) sampling method, in which all members of the population are supposed to have an equal probability of being selected into the sample. In the case of surveys conducted via true random sampling, all members of the population of registered voters (including me and you) should have an equal chance of being selected as a respondent.
How should sampling for an election-related survey (like the ones Pulse Asia and SWS supposedly conduct) be carried out in order for the results to be valid and reflective of the characteristics of the population? Here’s my take, and my attempt to discuss why it is not as simple and as easy as how Pulse Asia and SWS want us and gullible voters to believe:
- Given that the population size of registered voters is actually known, the general sample size should be determined using a tried-and-tested, statistically and generally accepted formula like Slovin’s formula, which I attempted discussing in my previous entry.
- Given the heterogeneity (i.e., differences) and at the same time homogeneity (i.e., similarities) in demographics inherent in Filipino voters, a multi-stage sampling method combining both cluster and stratified sampling should be employed. This further complicates the methodology, since registered voters can be subdivided into homogeneous groups in many ways. Sub-grouping can be done based on age range, economic status, occupation, and other demographic parameters. Complications further arise since the proportion of each stratum (homogeneous sub-group) and the proportion of elements in a cluster (which in this case is a geographic location) in the population should be reflected in the sample. For instance, if the sample size is determined to be 2,500 and 20 percent of the population is in Metro Manila, then 500 respondents should come from Metro Manila. This does not yet consider the percentage of each stratum identified (for instance, the youth sector). To make things less highfalutin: in short, sampling is not as simple as it seems.
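The proportional-allocation step described above can be sketched in a few lines. The regional shares below are invented for illustration (only the 20 percent Metro Manila figure comes from the example in the text); actual shares would have to come from the voters list.

```python
# A sketch of proportional allocation of a sample across clusters/strata.
# The shares below are hypothetical, except Metro Manila's 20% from the text.
# Note: round() can make the parts miss the total by a respondent or two
# for less convenient shares; these happen to sum exactly.

def allocate(total_n, shares):
    """Split total_n respondents in proportion to each group's share."""
    return {group: round(total_n * share) for group, share in shares.items()}

shares = {"Metro Manila": 0.20, "Rest of Luzon": 0.35,
          "Visayas": 0.20, "Mindanao": 0.25}
print(allocate(2500, shares))  # Metro Manila gets 500 respondents
```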
- Identification of respondents is another complicated aspect, especially if random sampling is to be employed. The researcher should first have a list of the names of all members of the population; in this case, a complete list of registered voters from the COMELEC is to be used. Elements of sub-samples (following cluster and stratified sampling) are to be identified from the voters list. Here, another complication exists in the fact that although the names in the list are usually arranged alphabetically and divided by precinct, they are not grouped according to demographic parameters like gender, age range, economic status, and others. Hence, the burden of grouping the names by gender, age range, economic status, or other demographic variables lies with the researcher, and that would require painstaking and time-consuming effort. The prospective respondents (elements of the sample) are then to be chosen randomly: first by assigning numbers to each member of the population in the voters list, and then choosing numbers via lottery (via fishbowl or tambiolo), by using a table of random numbers, or by systematic sampling (where every nth member is chosen, starting from a randomly chosen member).
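The two selection schemes just described can be sketched over a toy voters list. The list of placeholder names below stands in for the full COMELEC list, which a real survey would have to use; the seed and sizes are arbitrary.

```python
import random

# A sketch of respondent selection from a (toy) voters list.
# Real work would use the complete COMELEC list; these are placeholders.
voters = [f"Voter {i:05d}" for i in range(1, 10_001)]  # toy list of 10,000

rng = random.Random(2010)  # fixed seed so the draw is reproducible

# Simple random sampling: every voter has an equal chance of selection.
simple_sample = rng.sample(voters, k=100)

# Systematic sampling: a random starting point, then every nth voter.
n_step = len(voters) // 100       # here, every 100th voter
start = rng.randrange(n_step)     # random start within the first interval
systematic_sample = voters[start::n_step]

print(len(simple_sample), len(systematic_sample))
```

Either scheme gives every name on the list a known, equal chance of selection, which is exactly what handing forms to passersby cannot guarantee.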
- Another tricky part is the distribution of survey questionnaires to the respondents selected from the voters list. The researcher should exhaust all possible means of making sure a survey questionnaire is given to the specific name that has been selected as a respondent. Since the COMELEC voters list also contains addresses of the registered voters, distribution of the questionnaires can be done either by sending them through mail or courier or by conducting actual house visits. Complications may further arise if: (a) the voter selected as a respondent has already transferred residence but has not updated his/her address; (b) the selected respondent is unavailable during the time a house visit was conducted by the researcher; or (c) the selected respondent declines to send back an accomplished questionnaire that has been received by courier or mail. The third case is very common in the conduct of surveys via courier or mail, hence a researcher usually sets a sample size that is greater than the one determined via Slovin’s formula for contingency purposes.
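The contingency padding mentioned at the end of the previous point amounts to a one-line adjustment. The 60 percent response rate assumed below is purely illustrative; actual mail and courier response rates vary widely.

```python
import math

# A sketch of padding a required sample size for expected non-response.
# The 60% response rate is an assumption for illustration only.

def padded_sample_size(n_required, response_rate):
    """Number of questionnaires to send so about n_required come back."""
    return math.ceil(n_required / response_rate)

print(padded_sample_size(2500, 0.60))  # send ~4,167 to expect ~2,500 returns
```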
You may ask: what am I getting at with all the technical hullaballoo I just presented? Well, let me put it in the form of another list of points:
- Surveys conducted by asking or giving out questionnaires to random passersby, or by visiting random households, do not follow statistically and scientifically sound sampling protocols, even if researchers do so in many different locations. The burden of explaining whether or not this is the type of method employed by SWS or Pulse Asia lies with these survey firms, and unfortunately they decline to disclose specifics of their protocols.
- Respondents have to be selected via systematic or simple random sampling from the complete list of registered voters. Just asking or giving out questionnaires to random passersby or visiting households at random will not suffice, because doing so is not really a probability or random sampling method: not all members of the population of registered voters would have an equal chance of being selected as a respondent.
- Do SWS and Pulse Asia select their respondents randomly from the COMELEC list to ensure that the people they get data from are actually registered voters? This is highly doubtful, because I learned from one Facebook user that a household helper under his employ was once selected as a respondent of a survey by one of the two firms (I can’t remember which), when in fact she (the household helper) wasn’t even a registered voter. This implies that Pulse Asia and SWS may be choosing respondents who are not actually registered voters, further pointing to the possibility that they do not select their respondents from the COMELEC list of registered voters. It is possible that they are merely handing out questionnaires to random passersby or visiting random households without employing a true random or probability sampling protocol.
- Since the burden of answering the question in the previous item with clarity lies with Pulse Asia and SWS, and since they have not proven that they conduct their surveys properly with a statistically valid and scientifically sound methodology (given their reluctance to provide specifics on the methodology they carried out), it is only proper for them not to present their survey results as if these actually reflect the characteristics of the voting population. As in any scientific report, such as a thesis or a dissertation, the validity of findings must be established by also establishing the validity and reliability of the methodology employed in the study.
- Given that a properly conducted survey would entail a huge amount of effort and consume considerable time and resources, isn’t it very dubious how SWS and Pulse Asia seem to come up with results almost every month, whose trends vary only minimally? Take, for instance, the percentage ratings of Sen. Noynoy Aquino, Atty. Gibo Teodoro, Sen. Dick Gordon, Bro. Eddie Villanueva, Sen. Jamby Madrigal, Mr. Nick Perlas, and Coun. JC de los Reyes. Isn’t it suspicious how their ratings seem to have become almost static? Unless SWS and Pulse Asia are conducting their periodic surveys on the same respondents again and again, variations should be present in the results, but that is not what we are seeing, is it? Come to think of it, it is indeed much easier if surveys are conducted on the same people; much less hassle in sampling, and no need to go through all of the scientific and statistical hullaballoo I presented earlier, don’t you think? Pun intended, of course.

Given the points and arguments I have discussed herein, let me reiterate what I mentioned in my previous blog entry: To the SWS and Pulse Asia survey frontrunners and their supporters, I suggest you not keep your hopes too high, or you may end up disappointed if the actual results of the elections do not reflect the trends in those survey results. And to survey tailenders and their supporters, there may actually be valid bases for not giving much credence to these survey results.
Unfortunately, the alarming thing is that gullible Filipinos readily believe that Pulse Asia and SWS survey results are reflective of what the results of the election would be, and that gullible voters even tend to base their decisions on who to vote for on these surveys. The minds of the Filipino electorate are being conditioned to believe that what these surveys show will be the same results that can be expected in the May 10 elections. Survey frontrunners hail and tout Pulse Asia and SWS as highly credible and almost infallible, and they and their spin doctors already presume and arrogantly declare that the presidential election is going to be merely a two-man competition. Their supporters are being conditioned to think that the only way they could lose is if they are cheated out of victory, so that they would have a reason to vehemently protest an outcome that may differ from what they are expecting.
Again, quoting from Sen. Gordon, “The real ‘survey’ is on May 10, 2010” (Technically, it is not a survey but a census.) Do not be surprised should the results of the May 10 elections turn out to be different from what the results of Pulse Asia and SWS surveys indicate.
Comments and rebuttals are most welcome.