On GitHub: mclevey/research-methods
Dr. John McLevey
University of Waterloo
john.mclevey@uwaterloo.ca
Fall 2016, Knowledge Integration, University of Waterloo
methods.f2016.slack.com
Lecture Slides (Updated Continuously)
Methods
Design
Research Proposal (Oct. 13, 10%)
Presentation of Research Proposal (Oct. 13, 5%)
Empirical Research Papers (Ongoing, 30%)
Presentation of 1 Rule of Social Research (Dec. 7, 40%)
10 Comprehension Quizzes (Ongoing, 5%)
Engagement / Participation (Ongoing, 10%)
You will write a 2,500 word research proposal (don't waste words!) that presents the initial idea for your final empirical research paper. You may use any of the research methods introduced in this class.
Each group will give a formal 10-15 minute presentation of their research proposal to the class.
The main deliverable in this course is an empirical research paper. It should build directly on the research proposal submitted earlier in the course, but it is OK to deviate slightly as the project evolves. I expect the final papers to be about 7,000 words, and no more than 7,500 words under any circumstances.
I will drop your lowest grade.
integ120.f2016.slack.com
Laptops may be used in the classroom on the honors system. If I see Facebook, email, an IM client other than #slack, a newspaper story, a blog, or any other content not related to the class, I will remove 1 point from your participation grade on the spot. No exceptions.
You will need a laptop for all classes marked as "computing" in the syllabus. You will need a laptop, a tablet, or a phone for all classes where a quiz is scheduled.
R & NVivo
Please complete the "Get to know people and help John make this a good class" survey, which is active on LEARN until the coming Tuesday. This is to help us get to know one another, and to help me make this a better course.
I will also complete the survey and you can see my responses. I will discuss the data in class but I will not disclose names. I would like to make parts of your survey available to the rest of the class so that they can get to know you better, but I will only do so with your permission.
Babbie and Benaquisto pages 4-21, 31-33, 41-57
Babbie and Benaquisto Chapters 1 and 2
Selections from Babbie and Benaquisto: "Human Inquiry and Science" & "Paradigms, Theory, and Research"
Looking for Reality
The Foundations of Social Science
Some Dialectics of Social Research
Paradigms & Two Logical Systems
Deductive Theory Construction
Inductive Theory Construction
Linking Theory & Research
(Photo Credit Luke Peterson) flickr
Errors in Inquiry and some solutions
Logic & Observation
(Photo Credit James Cridland) flickr
A model or framework for observing and understanding, which shapes both what we see and how we see it. Paradigms are not "right" or "wrong," but theories can be.
Babbie and Benaquisto Ch. 2
(Photo Credit Yoppy) flickr
Deductive and Inductive
Babbie and Benaquisto Fig. 2-2 (p. 45)
Babbie and Benaquisto Fig. 2-4 (p. 50)
Reproduced from Walter Wallace (2012)
Inductive & Deductive Approaches
1. Specify the topic.
2. Specify who / what the theory will apply to.
3. Specify the major concepts and variables.
4. Find out what is known (propositions) about the relationships between those variables.
5. Reason logically from the propositions to your specific topic.
(Photo Credit Sunshinecity) flickr
...
(Photo Credit Sunshinecity) flickr
Firebaugh Chapter 1
Samantha Afonso!
Are you here?
There should be the possibility of surprise.
If you already know the answer, why do the research?
Ideally not this kind of surprise.
What are some examples of advocacy research? What makes advocacy research different from social research?
This is not surprising. Who cares?
What can you do?
(Photo Credit Evan Blaser) Flickr
"You don't need to eat the whole ox to know that it is tough."
very large populations do not require larger sample sizes than smaller populations do
we can make confident generalizations about a large population from a sample containing only a tiny fraction of that population
How cases are selected is more important than how many cases are selected.
Larger samples tend to be better than smaller samples, but that is because they are more likely to be representative. Size is much less important than representativeness.
A representative sample with just 100 people can be remarkably accurate. A non-representative (i.e. biased) sample of a million people can be remarkably inaccurate.
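A quick simulation makes the point concrete. This is only a sketch with made-up numbers: a small random sample recovers the population proportion well, while a much larger "convenience" sample drawn from one unrepresentative corner of the population misses badly.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the example is reproducible

# Hypothetical population of 100,000 people; 30% hold some attribute,
# and they are all concentrated at the "front" of the list
# (think: one neighbourhood, one street corner).
population = [1] * 30_000 + [0] * 70_000

# Representative sample: only 100 people, chosen at random.
random_sample = random.sample(population, 100)

# Biased sample: 20,000 people, but all from the front of the list.
convenience_sample = population[:20_000]

print(mean(random_sample))       # close to the true 0.30
print(mean(convenience_sample))  # exactly 1.0 -- wildly inaccurate
```

Two hundred times more cases, and a far worse estimate: how cases are selected matters more than how many.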
Select a sample that permits powerful contrasts for the effects of interest.
If you use stratified random sampling (more on this later), stratify on the explanatory variable. You can't explain a variable with a constant! This is especially important for small samples.
The same principles discussed above apply. Given smaller sample sizes, qualitative researchers must decide which comparisons are strategic and sample accordingly.
(Note: more on sampling in qualitative research when we get to the class on sampling.)
"In its most extreme version, empirical nihilism in the social sciences denies the possibility of discovering even regularities in human behavior. That position is obviously silly. Consider the life insurance industry..."
"The best response to empirical nihilism is to ignore it and the research."
Is Thinking Statistically in the bookstore yet?
Babbie and Benaquisto Chapter 4
Typically, all three are present.
e.g. What factors make attending university more or less likely for some group of people?
Isolate a few factors that provide a partial explanation across many cases. You want the greatest amount of explanation with the smallest number of variables.
When one variable changes, so does the other.
... There are few perfect correlations.
Spurious correlations: http://www.tylervigen.com/spurious-correlations
Not always clear in cross-sectional studies, but logic helps.
Possibility of explanation by a third variable?
E.g. individuals, groups, organizations, households, artifacts, etc.
You can't use an analysis of an ecological unit (e.g. ridings) to draw conclusions about individuals within those units (e.g. voters). Doing so is known as the ecological fallacy.
Sometimes you are limited by the data you have, in which case logic is your friend.
Babbie and Benaquisto (pp. )
Adapted from Joseph Leoni
Panel studies are the gold standard, but they do have unique problems. The most important is panel attrition. What happens if people drop out of the study? What if there is an underlying pattern in who drops out?
Babbie and Benaquisto Fig. 4-4 (p. 112)
Babbie and Benaquisto Ch. 5
"The Two Recruits: A day in the life of an economist and a sociologist at Statistics Canada"
Babbie and Benaquisto Chapter 5
How would you measure:
"Conceptions summarize collections of seemingly related observations and experiences."
"Concepts are constructs derived by mutual agreement from mental images (conceptions)" In this sense, concepts aren't "real." But they can still be measured.
"Conceptualization produces a specific agreed-upon meaning for a concept for the purpose of research. This process of specifying exact meaning involves describing the indicators we'll be using to measure our concept and the different aspects of the concept, called dimensions."
Conceptualize "love." Yes, I'm serious.
Rubin, Zick. 1970. "Measurement of Romantic Love." Journal of Personality and Social Psychology, 16:265-273.
Observations that we consider reflections of a concept we wish to study. In other words, indicators signal the presence or absence of the concept we are interested in.
Aspects of a concept. Religiosity, for example, might be specified in terms of a ritual dimension, a belief dimension, etc. Compassion might have dimensions of compassion for humans and compassion for animals, etc.
We might develop a list of 100 indicators for compassion and its various dimensions. We could then study all of them, or some subset of them. If all of the indicators represent, to some degree, the same concept, then they will behave the way the concept would behave if it were real and could be observed.
A nominal definition is one that is simply assigned to a term without any claim that the definition represents a "real" entity.
An operational definition specifies precisely how a concept will be measured.
Operationalization is the development of specific research procedures that will result in empirical observations representing those concepts in the real world.
To what extent are we willing to combine attributes into fairly gross categories? For example: income, age.
For research on attitudes and orientations, do you need to collect data on the full spectrum, or just part of it?
To what degree is the operationalization of variables precise?
"If you are not sure how much detail to pursue in a measure, get too much rather than too little. You can always combine precise measures into more general categories, but you cannot create more specific measures from general categories."
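The advice above is easy to demonstrate: exact ages can always be collapsed into grosser categories later, but categories cannot be turned back into exact ages. A quick sketch (the bracket boundaries are arbitrary, invented for illustration):

```python
# Precise measures: exact ages collected in a survey.
ages = [19, 23, 31, 34, 47, 52, 68]

def bracket(age):
    """Collapse an exact age into a grosser category."""
    if age < 25:
        return "18-24"
    elif age < 45:
        return "25-44"
    elif age < 65:
        return "45-64"
    return "65+"

print([bracket(a) for a in ages])
# From the brackets alone, there is no way back to 19, 23, 31, ...
```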
The attributes composing every variable must be:
Nominal: attributes have only the characteristic of being jointly exhaustive and mutually exclusive.
Ordinal: attributes can be rank-ordered on some dimension (e.g. high, medium, and low socioeconomic status).
Interval: attributes are rank-ordered and have equal distances between adjacent attributes.
Ratio: attributes have all the qualities of nominal, ordinal, and interval measures, and are based on a "true zero" point (e.g. age, income).
The level of measurement you use is determined by your research goals, and the inherent limitations of some variables. Generally speaking, try to measure at the highest level you can.
Precise measures are better than imprecise measures. Accurate measures are better than inaccurate measures.
Would a measurement method collect the same data each time in repeated observations of the same phenomenon?
Does a measure accurately reflect the concept it is intended to measure?
Take the same measurement more than once.
Split indicators into two groups and see if they classify people differently. Recall: interchangeability of indicators.
Use indicators that have proven to be reliable in previous research.
Clarity, specificity, training, and practice can help ensure reliability.
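The split-half check can be sketched in a few lines: score each respondent on two halves of an indicator battery and correlate the two sets of scores. The data below are made up for illustration; a high correlation suggests the two halves classify people similarly.

```python
from statistics import mean, stdev

def pearson_r(a, b):
    """Pearson correlation between two equal-length lists of scores."""
    a_bar, b_bar = mean(a), mean(b)
    cov = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b)) / (len(a) - 1)
    return cov / (stdev(a) * stdev(b))

# Hypothetical scores for 5 respondents on two halves of a 10-item scale.
half_a = [3, 8, 5, 9, 4]
half_b = [4, 9, 5, 10, 4]

r = pearson_r(half_a, half_b)
print(round(r, 3))  # close to 1: the halves rank people the same way
```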
Does it make sense without a lot of explanation?
The degree to which a measure relates to some external criterion.
The degree to which a measure relates to other variables as expected within a system of theoretical relationships.
The degree to which a measure covers the full range of meanings included within a concept.
In most cases, a good researcher should look to both colleagues and research participants as sources of agreement on the most useful meanings and measurements of important concepts. Sometimes one source will be more useful, but neither should be dismissed.
We want our measures to be both reliable and valid, but there is a tension between those two goals. Generally speaking, quantitative, nomothetic, structured techniques tend to be more reliable. Qualitative and idiographic approaches tend to be more valid.
Typically, we need multiple indicators to measure a variable adequately and validly. There are specific techniques for combining multiple indicators into single measures.
Note: We will discuss indexes and scales in more detail later in the course.
A composite measure that summarizes and rank-orders several specific observations to represent some more general dimension.
A composite measure composed of several items that have a logical or empirical structure among them.
There are many dimensions to the concept of gender equality. List at least five different dimensions and suggest how you might measure each. It's OK to use different research techniques for measuring the different dimensions.
Babbie and Benaquisto Ch. 6, "The Logic of Sampling"
There are two general types of samples: non-probability samples and probability samples.
Reliance on available subjects (convenience sampling): e.g. stopping people at a street corner. This is extremely risky because you have no control over the representativeness of the sample. Do not do this!
Purposive (judgmental) sampling: you select people based on your own knowledge of the population. Having a representative sample is not the goal.
Snowball sampling: members of a population are difficult to locate / identify. You find a few people, and then ask them to pass along information to people they know.
Quota sampling: you have a matrix describing key characteristics of the population. You sample people who share the characteristics of each cell in the matrix, trying to assign equal proportions of people who belong to different groups to your sample (e.g., if you know that 10% of all classics majors are female and international, then you select 10 female international students for a sample of 100 classics majors).
It can be hard to get up-to-date information about the characteristics of the population, and there tend to be high rates of sampling bias.
An informant is a member of a group who is willing to share what they know about the group. Informants are not the same as respondents, who are typically answering questions about themselves. Informants are often used in field research.
A probability sample is more likely to be representative of a population than a non-probability sample.
Probability theory enables us to estimate the parameters of a population from a representative sample.
"That quality of a sample of having the same distribution of characteristics as the population from which it was selected. By implication, descriptions and explanations derived from an analysis of the sample may be assumed to represent similar ones in the population. Representativeness is enhanced by probability sampling and provides for generalizability and the use of inferential statistics."
The summary description of a given variable in a population (e.g. mean income, mean age).
The summary description of a variable in a sample, used to estimate a population parameter.
The goal is to define a population, produce a sampling frame, and then sample elements (i.e. people) from the frame in a way that contains essentially the same variation that exists in the population. Random selection enhances the likelihood of achieving this.
Figure from the 2008 American edition of Fundamentals of Social Research.
The possibilities of sampling bias are endless, and not always obvious.
Examples?
Surveying university students about their alcohol consumption. What are some possible sources of sampling bias?
Among other things, random selection ensures that the procedure is not biased by the researcher.
Every element has an equal chance of being sampled independent of any other event in the selection process. EPSEM: Equal Probability of Selection Method.
Typically, random selection is done using computer programs that randomly select elements.
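The course labs use R, but the idea is the same in any language. A minimal sketch in Python with a hypothetical frame of 500 student IDs: `random.sample` gives every element an equal probability of selection without replacement, i.e. an EPSEM draw.

```python
import random

random.seed(1)  # fixed seed so the example is reproducible

# A hypothetical sampling frame of 500 student IDs.
frame = list(range(1, 501))

# An EPSEM draw: every ID has the same chance of selection,
# independent of every other draw.
sample = random.sample(frame, 25)

print(sorted(sample))
```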
Random selection also provides access to probability theory, which we can use to estimate population parameters, and to arrive at a judgment of how likely the estimates are to accurately reflect the actual parameters in the population.
A single sample selected from a population will give an estimate of the population parameter. Other samples would give the same or slightly different estimates. Probability theory tells us about the distribution of estimates that would be produced by a large number of such samples.
Let's look at a simple example.
Figure from the 2008 American edition of Fundamentals of Social Research.
Figure from the 2008 American edition of Fundamentals of Social Research.
There are 10 possible samples of 1, and 45 possible samples of 2 (10 choose 2). For the samples of 2, take every possible pair, compute the mean, and then plot it. We can already see that the estimates are starting to converge around the true mean.
Let's try some slightly larger sample sizes. Remember, we compute and plot the mean for every possible sample.
Sample Size 3: 10 choose 3 = 120 possible unique samples
Sample Size 4: 10 choose 4 = 210 possible unique samples
Sample Size 5: 10 choose 5 = 252 possible unique samples
Sample Size 6: 10 choose 6 = 210 possible unique samples
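Those counts are binomial coefficients, and for a population this small we can brute-force the entire sampling distribution. A sketch with a hypothetical 10-element population of ages: note that the average of all possible sample means is exactly the population mean, whatever the sample size.

```python
from itertools import combinations
from statistics import mean

# A hypothetical population of 10 ages; the true mean is 25.0.
population = [18, 19, 21, 22, 24, 25, 27, 30, 31, 33]

for k in (3, 4, 5, 6):
    # Every possible unique sample of size k, and its mean.
    sample_means = [mean(s) for s in combinations(population, k)]
    print(k, len(sample_means), round(mean(sample_means), 4))
```

Each line prints the sample size, the number of possible samples (120, 210, 252, 210), and the mean of all the sample means, which is always the population mean.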
If you take random samples over and over and over and over again, they will converge on the true value. The larger the random sample, the more accurate it is likely to be.
Your team has been contracted by the University of Waterloo to consult on a brand redesign. You need to survey the population of undergraduate students, graduate students, and professors to determine how they feel about the new university logo.
The variable we are interested in is attitudes towards the new logo. Respondents may either approve or disapprove.
As of 2014, there were 30,600 undergraduate students, 5,300 graduate students, and 1,139 full-time professors in 6 faculties.
Let's randomly sample 600.
More on how we could do this properly below!
There could be between 0 and 100% approval for the new logo. Let's assume that 50% approve and 50% disapprove. (Obviously, the research team doesn't actually know this.)
Imagine taking 3 different samples of substantial size. None is a perfect reflection of the UW community, but each comes close.
We have 3 different sample statistics. If we kept sampling, we would continue to get different estimates of the percentage of people in the UW community that approve of the new logo. Again, they would converge on the true value. As we continue to sample and plot, we find that some estimates overlap. We begin to see a normal curve.
Obviously, in real research we only collect one sample. Knowing what it would be like to select thousands of samples allows us to make assumptions about the one sample we do select and study.
If many independent random samples are selected from a population, the sample statistics provided by those samples will be distributed around the population parameter in a known way. We can see that most of the estimates fall close to 50%.
We can also use a formula to estimate how closely the sample statistics are clustered around the true value. The formula to estimate sampling error is:
$$ s = \sqrt{\frac{P \times Q}{n}} $$Where $s$ is the standard error. $P$ and $Q$ are the population parameters for the binomial ($P$ is approval and $Q$ is disapproval), and $n$ is the number of cases in each sample.
The standard error can tell us how the sample estimates are clustered around the population parameter. Because the standard error, in this case, is the standard deviation of the sampling distribution, we can determine confidence levels and confidence intervals.
"Whereas probability theory specifies that 68 percent of that fictitious large number of samples would produce estimates falling within one standard error of the parameter, we can turn the logic around and infer that any single random sample has a 68 percent chance of falling within that range."
"We express the accuracy of our sample statistics in terms of a level of confidence that the statistics fall within a specified interval from the parameter. For example, we may say we are 95% confident that our sample statistics are within plus or minus 5 percentage points of the population parameter. As the confidence interval is expanded for a given statistic, our confidence increases. For example, we may say that we are 99.9% confident that our statistic falls within three standard errors of the true value."
In real research, we don't actually know what the population parameter is. So we use our best guess (i.e. the sample estimate) for the formula.
Probability sampling is messier in reality than in theory.
Break into groups! Each group takes one type of design.
Babbie and Benaquisto Ch. 8, "Survey Research"
I will also include some lecture material from Groves et al. (2009) Survey Methodology. It is not necessary to do extra reading.
Thank you, Perd Hapley. :-)
From Parks and Recreation
What are the signs? What makes a survey bad?
There are two "inferential steps" in survey methodology
1. between the questions you ask and the thing you actually want to measure
2. between the sample of people you talk to and the larger population you care about
Errors are not mistakes; they are deviations / departures from desired outcomes or true values.
Errors of observation are when there are deviations between what you ask and what you actually want to measure.
Errors of non-observation are when the statistics you compute for your sample deviate from the population.
You move from the abstract to the concrete when you design surveys. "Without a good design, good survey statistics rarely result." You need forethought, planning, and careful execution.
Survey Lifecycle from a Design Perspective
Adapted from Groves et al. (2009) Survey Methodology.
Survey Design as a Process
Adapted from Groves et al. (2009) Survey Methodology.
Let's Focus on the Left Side: Measurement
Adapted from Groves et al. (2009) Survey Methodology.
We can represent all of this with nice compact notation. In most cases, capital letters stand for properties of population elements and are used when we are talking about measurement and when sampling the population is not an issue. If we are drawing inferences about a population by using a sample, capital letters are for population elements and lower case are for sample quantities. Subscripts indicate membership in subsets of the population (e.g. $_i$ for the $i$th person).
Recall Class on Conceptualization and Measurement
$\mu_i$ = value of a construct for the $i$th person in the population, $i$ = 1, 2, 3, 4 ... N
$Y_i$ = value of a measurement for the $i$th sample person
$y_i$ = value of the response to the application of the measurement
$y_{ip}$ = value of the response after editing and processing steps
We are trying to measure $\mu_i$ using $Y_i$, which will be imperfect due to measurement error. When we apply the measurement $Y_i$ (e.g. by asking a survey question), we actually obtain $y_i$. This is due to problems with administration. Finally, we try to mitigate these errors by making final edits, resulting in $y_{ip}$.
The measurement equals the true value plus some error term ($\epsilon_i$).
$Y_i = \mu_i + \epsilon_i$
The answers you provide on a survey are inherently variable. Given so many "trials," you might not provide the same answers. In theory, there could be an infinite number of trials! We can use another subscript $_t$ to denote the trial of the measurement. We will still use $_i$ to represent each element of the population (e.g. the person completing the survey).
$Y_{it} = \mu_{i} + \epsilon_{it}$
Q: Have you ever, even once, used any form of cocaine?
Survey respondents tend to under-report behaviors that they perceive as undesirable. Even if the answer is yes, they may answer no.
What if the discrepancy between responses and the true value is systematic?
If response deviations are systematic, then we have response bias, which will cause us to under-estimate or over-estimate population parameters. If they are not systematic, we have response variance, which leads to instability in the value of estimates over trials.
Let's Focus on the Right Side: Representation
Adapted from Groves et al. (2009) Survey Methodology.
There are people in the population that are not in our sampling frame, and there are people in our sampling frame that are not in our population.
:(
Coverage of a target population by a frame.
Adapted from Groves et al. (2009) Survey Methodology.
If there are some members of the sampling frame that are given no, or reduced, chance of inclusion, then we have sampling bias. They are systematically excluded. Sampling variance is not systematic and is due to random chance.
:(
$\bar{Y}$ = mean of the entire target population
$\bar{Y}_C$ = mean of the population on the sampling frame
$\bar{Y}_U$ = mean of the population not on the sampling frame
N = total number of members in the target population
C = total number of eligible members on the sampling frame
U = total number of eligible members not on the sampling frame
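With this notation, the coverage bias of a frame-based mean follows from the identity $\bar{Y} = (C\bar{Y}_C + U\bar{Y}_U)/N$:

$$ \bar{Y}_C - \bar{Y} = \frac{U}{N}\left(\bar{Y}_C - \bar{Y}_U\right) $$

The bias shrinks as the uncovered fraction $U/N$ shrinks, or as the covered and uncovered groups become more alike.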
If the values of our statistics computed on the respondent data differ from the values we would get if we computed statistics on the entire sample data, then we have non-response bias.
$\bar{y}_s$ = mean of the entire sample as selected
$\bar{y}_r$ = mean of the respondents within the $s$th sample
$\bar{y}_n$ = mean of the nonrespondents within the $s$th sample
$n_s$ = total number of sample members in the $s$th sample
$r_s$ = total number of respondents in the $s$th sample
$m_s$ = total number of nonrespondents in the $s$th sample
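With this notation, the nonresponse bias follows from the identity $\bar{y}_s = (r_s\bar{y}_r + m_s\bar{y}_n)/n_s$:

$$ \bar{y}_r - \bar{y}_s = \frac{m_s}{n_s}\left(\bar{y}_r - \bar{y}_n\right) $$

The respondent mean is biased by the nonresponse rate times the difference between respondents and nonrespondents.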
$s$th sample? Yup. Conceptually this is similar to the idea of trials. The sample we draw is one of many that we might possibly have drawn. It's one single realization.
We make postsurvey adjustments to mitigate the damage of the types of errors we just discussed. Sometimes we introduce new errors.
If a source of error is systematic, we call it bias. If it is not systematic, we call it variance. Most errors probably contain both biases and variances.
Here it is one last time.
Adapted from Groves et al. (2009) Survey Methodology.
A response rate is the number of people participating in a survey divided by the number of people selected in the sample, in the form of a percentage.
Low response rates are a danger sign, suggesting that the nonrespondents are likely to differ from the respondents in ways other than their willingness to participate in your survey.
50% is often considered acceptable
60% is often considered good
70% is often considered very good
These are only rough guides. They have no statistical basis, and a demonstrated lack of response bias is better than a high response rate.
Bram, Thinking Statistically, Ch. 1
A zombie example...
The zombies from iZombie, not The Walking Dead...
Liv Moore, iZombie
Your friend takes the drug. She starts to look and behave like a zombie. What is the probability that she is a zombie?
Remember, there are 100 people in our village. Twenty are actually zombies. :(
Our drug will identify zombies correctly 90% of the time. Unfortunately, it will also give false positives (i.e. incorrectly identify non-zombies as probable zombies) 30% of the time.
$ .9 \times 20 = 18 $ zombies
$ .3 \times 80 = 24 $ non-zombies
Now let's focus on the $ 18 + 24 = 42$ that are possible zombies.
Of the 42 that appear to be zombies, 18 actually are and 24 are not.
$$18/42 = 3/7 = 43\%$$
So, equally important, what is the probability that someone who has taken the drug but does not appear to be a zombie is actually a zombie?
20 zombies, 80 non-zombies
90% success for zombies ("sensitivity")
30% false positives (note: the specificity, the true negative rate, is therefore 70%)
Since 58 people did not appear to be zombies when given the drug, $2/58 = 1/29 = 3.4\%$ of the villagers continue to live as secret zombies.
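The whole zombie calculation fits in a few lines. A sketch, using the numbers from the slides:

```python
zombies, non_zombies = 20, 80
sensitivity = 0.90          # P(test positive | zombie)
false_positive_rate = 0.30  # P(test positive | not a zombie)

true_positives = sensitivity * zombies               # 18 zombies flagged
false_positives = false_positive_rate * non_zombies  # 24 non-zombies flagged

positives = true_positives + false_positives         # 42 flagged in total
negatives = (zombies + non_zombies) - positives      # 58 not flagged

# P(zombie | positive test): Bayes' theorem as simple counting.
print(true_positives / positives)  # 18/42 = 3/7, about 0.429

# P(zombie | negative test): the missed zombies among the 58.
missed = zombies - true_positives  # 2 zombies slip through
print(missed / negatives)          # 2/58, about 0.034
```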
$P(X)$
$P(X \mid Y)$
$X$ = Hypothesis $Y$ = Evidence
Bayes' insight was that the conditional probability depends on 4 different things:
$$ P(X \mid Y) = \frac{P(Y \mid X) \times P(X)}{P(Y)} $$
Now, what are the hypotheses and evidence from the zombie example?
Another example: What is the probability that Spike is a vampire?
Bayes is about updating beliefs when confronted with new evidence. This requires a prior belief to update from. If you get the prior probability wrong, your conclusions will be wrong even if the update was correct.
Another name for the prior is base rate.
The base rate fallacy is when you do not take account of the base rate -- i.e. the prior probability that something was true before new evidence was introduced.
"What is the probability that a person is zombifying given that the test came out positive, and given that the test is pretty good but it isn't perfect, and given that non-zombies greatly outnumber zombies in our current population?"
Now... what went wrong in the Sally Clark case?
In class activities.
See R scripts distributed for class lab sessions.
See R scripts distributed for class lab sessions.
With correlation, we measure the association between two quantitative variables. What if we want to predict one variable from another? Any particular outcome can be predicted by a combination of a model and some error.
$$ outcome_i = (model) + error_i $$
If there is a linear relationship between our response and explanatory variables, we can summarize the relationship between them with a straight line.
Why do criminal sentences vary? Could it be related to the number of prior convictions a person has?
We use the equation of a straight line. Let's start with just one explanatory variable, which makes this a simple regression:
$$ y = a + bx $$
If $b$ is positive, the value of y increases as x increases. If $b$ is negative, it decreases as x increases. If $b$ = 0, the value of y does not change with x.
We will fit a regression model to our data and use it to predict values for the response variable.
$$ Y_i = (b_0 + b_1 X_i) + \epsilon_i $$
$b_0$ and $b_1$ are regression coefficients.
Continuing with our example, if we want to predict the length of the sentence based on the number of prior convictions:
$$ Y_i = (b_0 + b_1 X_i) + \epsilon_i $$
or
$$ Y_i = (b_0 + b_1 Priors_i) + \epsilon_i $$
the length of an individual's sentence is a function of (1) a baseline amount given to all defendants, (2) an additional amount for each prior conviction, and (3) a residual value that is unique to each individual case.
There are many lines we could fit to describe the data. To find the line of best fit, we typically use a method called least squares. The method of least squares will go through, or get close to, as many of the points as possible.
We will have both positive and negative residuals, because there will be data points that fall both above and below our line of best fit. We square the differences before adding them up to prevent the positive residuals (points above the line) from canceling out the negative residuals (below the line).
If the squared differences are very big, the line does a poor job of representing the data. If the squared differences are small, it does a good job of representing the data.
The line of best fit is the one with the lowest Sum of Squared Differences ($SS$ for short, or $\sum residual^2$). The method of least squares selects the line with the smallest $SS$.
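A from-scratch sketch of least squares with one predictor. The sentencing numbers are hypothetical, invented for illustration; in the labs you would use R's `lm`, this just shows the arithmetic behind it.

```python
from statistics import mean

# Hypothetical data: prior convictions (x) and sentence length in years (y).
x = [0, 1, 2, 3, 4]
y = [2, 3, 5, 4, 6]

x_bar, y_bar = mean(x), mean(y)

# Slope: sum of cross-deviations over sum of squared x-deviations.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
     / sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar  # intercept: the line passes through (x_bar, y_bar)

# Residuals and the quantity least squares minimizes.
predictions = [b0 + b1 * xi for xi in x]
ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))

print(round(b0, 3), round(b1, 3), round(ss_residual, 3))  # 2.2 0.9 1.9
```

No other straight line through these points produces a smaller sum of squared residuals than 1.9.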
Goodness of Fit
We have the best line possible now. But what if it does a really bad job of actually fitting the data? To assess the goodness of fit:
These values will be reported in your R output.
Regression: $Y$ = 3 + 0.5($X$). The linear least squares regression is only a good summary of the relationship between $x$ and $y$ for the first dataset. In the second dataset, the relationship is non-linear. In the third dataset there is an outlier. In the fourth dataset, the least squares line chases the influential observation.
These assumptions can easily be wrong. We have to check whether the assumptions are reasonable.
It is dangerous to summarize a relationship between $x$ and $y$ beyond the range of the data.
A lurking variable is one that has an important effect on the relationship between $x$ and $y$ but is omitted from the analysis. This can lead to you missing a relationship that is present, or inducing a relationship that is not present.
The possible presence of lurking variables makes causal analysis more difficult in observational work. Multiple regression does better than simple regression. Experimental designs are best at mitigating lurking variables.
Why do criminal sentences vary? Was the sentence deserved? A moody judge? A long criminal record? A vicious crime? The defendant's race? The race of the victim? There could be many theories. What is the relative importance of each variable?
By extending regression analysis to 2 or more explanatory variables, we (1) reduce the size of the residuals and therefore account for more variation in the response variable, and (2) can hold these additional causes of the response variable constant statistically, resulting in a more accurate estimation of the effect of $x$ on $y$ because it is less likely that we will omit lurking variables.
We can extend the number of explanatory variables in multiple regression to $k$ variables, $x_1, x_2, ..., x_k$ for the regression equation:
$$ y = a + b_1 x_1 + b_2 x_2 + ... + b_k x_k + residual $$
Now we have a new coefficient for each explanatory variable we add! :) The outcome is predicted from the combination of all the variables multiplied by their respective coefficients and, of course, the residual term.
$b_1$ is the average change in $y$ for a one-unit increase in $x_1$ holding the other explanatory variables constant.
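A sketch of the same idea with two hypothetical predictors, solving the normal equations $(X^{\top}X)b = X^{\top}y$ directly (R's `lm` does this with better numerics, via a QR decomposition). The data are generated so that $y = 2 + 3x_1 - x_2$ exactly, which makes the answer easy to check.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]           # partial pivoting
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                    # back substitution
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Hypothetical data generated so that y = 2 + 3*x1 - 1*x2 exactly.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y  = [3, 7, 7, 11, 11, 15]

# Design matrix with an intercept column, then the normal equations.
X = [[1, a, b] for a, b in zip(x1, x2)]
XtX = [[sum(X[i][r] * X[i][c] for i in range(len(X))) for c in range(3)]
       for r in range(3)]
Xty = [sum(X[i][r] * y[i] for i in range(len(X))) for r in range(3)]

b0, b1, b2 = solve(XtX, Xty)
print(round(b0, 6), round(b1, 6), round(b2, 6))  # recovers 2, 3, -1
```

Each coefficient is the effect of its own variable with the other held constant: $b_1 = 3$ here even though $x_1$ and $x_2$ are correlated.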
Janice Aurini, University of Waterloo
Melanie Heath, McMaster University
Stephanie Howells, University of Guelph
All of our previous conversations about research design are relevant. What's different about today is the focus specifically on qualitative designs and methods.
Is your question researchable?