Introduction
This tutorial will be useful to social researchers and research philosophers. Its purpose is to encourage young researchers to reflect on the philosophical dimension of research and to motivate them to question whether they must follow a straitjacketed research method and the null hypothesis significance testing (NHST) technique. The philosophical dimension as used here refers to a questioning mind and not to the philosophical underpinnings of research (namely, ontology, epistemology, methodology, method, and axiology).
The article focuses on the hypothesis statement in the context of social scientific research and asks some fundamental questions that bear on the confusion and complications in stating a hypothesis and subsequent hypothesis testing. The questions are fundamental as formulating and testing hypotheses constitute the key “distinguishing characteristic of the scientific method” (1), (p. 22). In short, the questions are as follows: (1) What is a hypothesis? What are the different kinds of hypotheses? (2) What is an acceptable format for a research hypothesis? (3) Is the historically used mix of the classical Fisher’s approach (F-test) and the more structured Neyman and Pearson’s approach (suggesting significance level or Type I error, α; Type II error, β; and statistical power, 1-β) for NHST appropriate? [Refer to Cumming (2) for a comprehensive exposition of the controversy and the mix]. (4) Are there alternatives to the conventional approaches to hypothesis testing and interpretation that provide a more precise conclusion and help in handling the controversy?
The article organizes the contents into six sections, including this section on the introduction. The next three sections present the confusion and complication to elaborate on the questions raised in the above paragraph. The fifth section presents a social research framework, which integrates the discussion on the hypothesis statement and statistical testing. The final section concludes the discussion.
Different kinds of hypotheses
Most students writing a master’s dissertation, as well as many research scholars and researchers, state a null hypothesis upfront in their report. They are generally confused about, or do not have a clear answer to, the question, “Should I state the null hypothesis or the alternative hypothesis at the beginning of my report?” Unfortunately, since they do not get a clear answer to the question, they continue with what they think is appropriate. A lack of clarity on the distinction between the research hypothesis and statistical hypotheses complicates the issue further. This section presents different kinds of hypotheses and their statement to resolve the confusion.
What is a hypothesis? Are there different kinds of hypotheses? Researchers and scholars have conceptualized hypotheses in different ways. A hypothesis is: (1) a tentative proposition that suggests a solution to a problem or as an explanation of some phenomenon; a hypothesis relates theory to observation and observation to theory (3); (2) a conjectural statement of the relation between two or more variables; it is a relational proposition (4); (3) a formal statement of the expected relationship between an independent and dependent variable (5). The hypothesis is a tentative explanation that accounts for a set of effects and can be tested by further investigation.
As a prediction, a hypothesis is an educated guess as to how a scientific experiment will turn out. It is an educated guess because it is based on previous research, training, observation, and a review of the relevant research literature. Two basic definitions of a hypothesis form the root of the discussion on the different kinds of hypothesis statements below: (1) a hypothesis is a tentative answer to a research question; and (2) a hypothesis is a predictive statement related to the research question. This subsection focuses on the different kinds of hypotheses.
A review of statistical inference comprehensively summarizes different kinds of hypotheses, their statement, and testing (1), (pp. 22–33). The key concepts that the summary includes are as follows: (1) Direct and indirect statements. Direct statements of hypotheses are inferred from directly observable limited phenomena, e.g., “This rat is running.” An indirect statement is a hypothesis inferred through inductive inference, e.g., “All rats run under condition X.” Hypothesis testing relates an indirect statement to a “scientific [research] hypothesis” and a direct statement to “statistical hypotheses,” which relate to different entities:
“A statistical hypothesis is a statement about one or more parameters of population distributions; it refers to a situation that might be true … scientific hypotheses [or research hypotheses] refer to the phenomena of nature and men” (6), cited in Kirk (1), (p. 22).
(2) A research hypothesis is deduced from the existing literature and/or the real-world phenomena through the hypothetico-deductive approach. The truth or falsity of a research hypothesis cannot be directly verified. Probabilistic verification requires stating and testing a statistical hypothesis to draw an inference about the research hypothesis. There are two kinds of statistical hypotheses, namely, the null hypothesis and the alternative hypothesis. A null hypothesis is a statement of no effect of one variable on the other or no relationship between the variables. The research hypothesis is by default the alternative hypothesis. Statistical verification involves NHST under certain decision criteria. The section titled “The New Statistics for Hypothesis Testing” below summarizes the limitations of NHST and indicates an alternative to test the statistical significance of the hypothesis.
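As an illustration in symbols, consider a research hypothesis stating that a treatment affects some behavior. The notation is an assumption made here for exposition and is not taken from the cited sources: μ_T and μ_C denote the hypothetical population means of the treatment and control conditions. The pair of statistical hypotheses could then be written as:

```latex
\begin{aligned}
H_0 &: \mu_T - \mu_C = 0     && \text{(null hypothesis: no effect)} \\
H_1 &: \mu_T - \mu_C \neq 0  && \text{(alternative hypothesis: some effect)}
\end{aligned}
```

Both lines are statements about population parameters, whereas the research hypothesis itself is a statement about the phenomenon.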
The above summary suggests that a researcher must state the research hypothesis upfront. The need for a statistical hypothesis appears only at the data analysis stage though it can be stated in the section dealing with the research design.
A hypothesis stated as a prediction entails a future orientation. The next section, titled “Different Formats of Hypothesis Statement,” discusses the hypothesis as a predictive statement.
Different formats of hypothesis statement
There is generally a debate and confusion among academicians from different disciplines (social sciences, sciences, and engineering) about the format of the hypothesis statement. Basically, this means how the hypothesis is set out. Primarily, there are two formats under which various other formats can be accommodated: (i) “X is related to Y” and (ii) “X will be related to Y.” In this article, the former setting is referred to as the is format, and the latter as the will format. As discussed below, the is format posits a tentative answer to the research question under investigation. The alternative format predicts the expected relationship between the variables. This section is an attempt at analyzing the above confusion.
Hypothesis statement: issues and confusion
The following question came up for discussion during a meeting of the Research Degree Committee (RDC) for psychology at a university: Is it appropriate to state a hypothesis in the is format? The view emerged among the committee members that will/would is the “correct” format for the hypothesis statement; the committee did not consider the is format appropriate for doctoral work (probably for all research!). The “correct” format view had a serious implication for candidates whose doctoral research proposals were under scrutiny by the committee. The committee decided to ask candidates who had used the is format for hypothesis statements to submit a revision with the hypotheses stated in the will format. That decision meant that the candidates concerned lost up to another year before they could seriously start their doctoral research, as the RDC does not meet often. The RDC probably made some other observations and comments as well for a revised submission.
The present paper delves into the question of correct (acceptable) format to state a research hypothesis. Is the is format for stating a research hypothesis wrong (unscientific, unacceptable)? Is it conceptually appropriate to use the will format in all situations? Alternatively, are there situations in which the will format can be shown to be limited? This article shows that the answers to the above questions emerge as “no,” “no,” and “yes,” respectively.
What reason did the committee give to reject the is format? The committee thought that the is format suggests that the researcher already knows the answer to the research question to which the hypothesis is related. “Should there be a need to do the proposed research if the answer to the research question is already known?” The committee asked a genuine question, conditional to the committee’s thought, and had an answer in the negative. On the basis of the above logic, the committee’s decision cannot be faulted. The more basic question to ask is if the committee is right in assuming that the is format makes the hypothesis invalid or unacceptable. Does the is format imply that the researcher knows the answer before the research is completed?
This section focuses on two major objectives. The first objective is to show that the is format of the hypothesis statement is a correct and acceptable format. The will format can be a possibility at a certain level of hypothesis statement; the is format is not wrong, but not necessarily the only correct format. The second objective is to show that most of the above issues can be handled by considering the hypothesis statement at various levels. A major suggestion in this direction is to state hypotheses at two levels, namely, the conceptual level (research hypothesis discussed above) and the operational level (hypotheses related to the variables of the study).
Stated more clearly, the present article explores the appropriateness of a format for a particular hypothesis. This section presents a logical argument and other inputs to answer a stronger question, “Is the is format of the hypothesis statement the only appropriate format, all other formats being inappropriate or appropriate depending on the research objectives?” If the answer to the last question is positive, a rethinking is required to eliminate the confusion that the is/will format causes doctoral researchers and learners.
Existing formats for hypothesis statement
What are the situations/conditions that require a tentative answer or a prediction? Predictions entail future orientation. For example, a researcher who employs some treatment to participants in a study creates an empirical situation to establish which prediction is appropriate.
Treatments may include such conditions as an electric shock, food deprivation, or mood. The researcher might be interested in investigating how these conditions affect some behavior, for example, convulsion reaction of the body, rat’s activity on an activity wheel, and liking a product. The will/would format is appropriate to such situations—H1: Electric shock will cause convulsion; H2: If the rat is deprived of food for a longer duration, the rat will run faster on the activity wheel; H3: Individuals in a positive mood will show a more positive evaluation of a product as compared to the evaluation by individuals in a negative mood. However, these statements do not rule out the appropriateness of the is format for such situations—H4: Electric shock causes convulsion; H5: Longer periods of food deprivation increase the rat’s speed on the activity wheel; H6: An individual in a positive mood shows a more positive evaluation of a product as compared to the evaluation by a person in a negative mood.
Now to the analysis of a hypothesis as a tentative answer to a research question. There are some situations in which prediction is inappropriate. Antecedent conditions are an example of preexisting conditions. For example, demographic variables (gender, family rearing, education, etc.) and personality characteristics (extraversion, agreeableness, neuroticism, etc.) are antecedent conditions. These are the conditions that a participant brings to an experiment or a survey; the researcher has no way to change these conditions for a particular participant. Of course, in psychological research, these conditions are used as variables, not through direct manipulation, as is done for regular independent variables, but by selecting participants who have those characteristics. Why is prediction inappropriate in such conditions? The answer is that in such conditions a state of affairs already exists: X is a male; Y is high on agreeableness; Z has a university-level education.
Since antecedent conditions preexist (i.e., they exist before a study is conducted), relationships among those conditions must also preexist. Hence, the is format is appropriate to those relationships as well—H7: There is a positive relationship between socioeconomic status and extraversion.
Sample hypotheses
A screening of psychology journals indicates that researchers employ all possible statements for hypotheses. The following examples of hypothesis statements indicate how researchers employ different formats.
Chen et al. (7) investigated the following hypotheses in their research on “egocentric reciprocity and the role of friendship and anger.” Hypothesis 1: Reciprocity responses to negative exchange imbalance are more negative than reciprocity responses both to positive exchange imbalance and neutral exchange. Hypothesis 2: Egocentric reciprocity tendencies are more pronounced when interacting with strangers than with friends, such that negative reactions to negative imbalance exchanges are stronger for strangers than for friends, and strangers are less likely to differentiate between positive imbalance and neutral exchanges. Hypothesis 3a: Anger mediates the interaction effect of exchange imbalances and friendship on reciprocity responses. Hypothesis 3b: Indebtedness mediates the interaction effect of exchange imbalances and friendship on reciprocity responses.
In a study investigating “The impact of gender ideology on the performance of gender-congruent citizenship behaviors,” Clarke and Sulsky (8) proposed the following hypotheses: H1: Gender will predict civic virtue such that men will report performing more civic virtue than women. H2: Gender will predict helping such that women will report performing more help than men.
Huang et al. (9) investigated the impact of safety climate on job satisfaction, employee engagement, and turnover by employing the social exchange theory framework. They proposed the following hypotheses: Hypothesis 1a: Employee safety climate perceptions (both organizational-level and group-level safety climate) that are more positive will relate to higher levels of employee job satisfaction. Hypothesis 1b: Employee safety climate perceptions (both organizational-level and group-level safety climate) that are more positive will relate to higher levels of work engagement. Hypothesis 1c: More positive employee safety climate perceptions (both organizational-level and group-level safety climate) will relate to a lower turnover rate. Hypothesis 2a: Job satisfaction is hypothesized to mediate the relationship between safety climate (both organizational-level and group-level) and employee engagement. Hypothesis 2b: Job satisfaction is hypothesized to mediate the relationship between safety climate (both organizational-level and group-level) and employee turnover.
Edwards et al. (10) made the following predictions in their research that investigated the interaction between cognitive trait anxiety, stress, and effort: (1) Higher somatic trait anxiety would be associated with lower efficiency for those performing under the threat of electric shock, but this effect would be restricted to those reporting lower effort. (2) Higher cognitive trait anxiety would be associated with lower efficiency for those in the ego-threat condition, and this effect would be restricted to those reporting lower effort. (3) Performance effectiveness would be positively associated with effort but independent of somatic and cognitive anxiety and stress.
Some data on hypothesis formats
Several articles published in the top-ranking APA journals present evidence in favor of both formats. A screening of 15 issues of the Journal of Consumer Psychology (the journal of the Society for Consumer Psychology, published by Wiley), without any sampling, showed interesting patterns. Hypotheses stated in “is,” “will,” “should,” and “would” formats were counted. Table 1 presents the data.
Table 1. Number of Hypotheses with Different Formats Published in 15 Issues of the Journal of Consumer Psychology.
A brief description of the counting procedure is in order. Only those hypotheses were counted that were stated in block paragraphs and numbered (for example, H1, H2, etc.). Hypotheses embedded in the running text were not counted. If a hypothesis stated several parts, each part was counted as a hypothesis. Thus, H1A and H1B, or H1 (a and b), were counted as two hypotheses. The reason was that in some cases, the hypotheses were stated in mixed formats. For example, in one article, a hypothesis had four parts, of which one part used the format “will,” one part used the format “should,” and two parts used the format “is.” Such cases were few. One article stated 13 propositions, all in the is format, which were counted as hypotheses.
It can be inferred from Table 1 that about 72% of hypotheses used the will format (combining the will and the would statements) and about 28% used the is format (combining the is and the should statements). These data and the sample hypotheses presented in the section “Sample hypotheses” above support the view that researchers use both formats, is and will, in stating the hypotheses related to their research questions.
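The counting and grouping described above can be sketched in code. This is only an illustration under stated assumptions: the keyword rule in `hypothesis_format` is hypothetical (the article does not say how the format of each statement was judged), the function names are invented here, and the statements are sample hypotheses quoted earlier in this article, not the data behind Table 1.

```python
import re
from collections import Counter

# Modal verbs that mark a predictive statement; checked in this order.
MODALS = ("will", "would", "should")


def hypothesis_format(statement):
    """Classify a statement by the first listed modal it contains.

    Assumption: a statement with none of the modals is treated as the
    present-tense "is" format (e.g., "X causes Y", "X is related to Y").
    """
    words = re.findall(r"[a-z]+", statement.lower())
    for modal in MODALS:
        if modal in words:
            return modal
    return "is"


def grouped_shares(statements):
    """Count formats, then group will+would and is+should as in the text."""
    counts = Counter(hypothesis_format(s) for s in statements)
    total = sum(counts.values())
    will_group = counts["will"] + counts["would"]
    is_group = counts["is"] + counts["should"]
    return counts, will_group / total, is_group / total


examples = [
    "Electric shock will cause convulsion.",
    "Electric shock causes convulsion.",
    "Gender will predict helping such that women will report "
    "performing more help than men.",
    "Anger mediates the interaction effect of exchange imbalances "
    "and friendship on reciprocity responses.",
]
counts, will_share, is_share = grouped_shares(examples)
print(dict(counts), will_share, is_share)
```

Each part of a multi-part hypothesis would be passed as its own string, which mirrors the rule that H1A and H1B count as two hypotheses.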
The new statistics for hypothesis testing
The American Psychological Association clearly states that NHST is a starting point, and additional measures (confidence intervals, CI; effect size, ES; and extensive description) must be added to convey the most complete meaning of the results (11), (Section “3.7. Quantitative research standards: statistics and data analysis”). Cumming highlights the limitations and drawbacks of NHST with the help of several examples and illustrations and suggests:
“… we should shift emphasis as much as possible from NHST to estimation, based on effect size and confidence intervals. Effect sizes and confidence intervals provide more complete information than does NHST. Meta-analysis allows accumulation of evidence over a number of studies” (2), (p. ix).
The reporting of confidence intervals and effect size entered in APA publications right at the beginning of the twenty-first century: “In 2005, the Journal of Consulting and Clinical Psychology (JCCP) became the first American Psychological Association (APA) journal to require statistical measures of clinical significance, plus effect sizes (ESs) and associated confidence intervals (CIs), for primary outcomes” (12), cited in Odgaard and Fowler (13). The American Psychological Association now requires that the articles submitted to APA journals report CI and ES: “estimates of appropriate effect sizes and confidence intervals must be reported” (14), (p. 33, italics added).
The confidence interval at a given level of confidence (1 − α, generally chosen as 95% or 99%) is the range of values between the sample estimate of a statistic (M) minus and plus the largest likely estimation error, called the margin of error (MOE): M ± MOE. Here $\mathrm{MOE} = t_{\mathrm{critical}} \times SE$, where $t_{\mathrm{critical}}$ is the critical value of the test statistic at the given degrees of freedom and level of significance, and SE is the standard error of the statistic. MOE measures the precision of estimation. Thus, the shorter the CI, the higher the precision. For a sample mean, the 95% confidence interval is reported as follows: 95% CI $[M - t_{.95}(N-1) \times s/\sqrt{N},\ M + t_{.95}(N-1) \times s/\sqrt{N}]$, where N is the sample size, s is the standard deviation of the sample data, and $t_{.95}(N-1)$ is the critical value of t for a 95% CI with N − 1 degrees of freedom. Values in the confidence interval are plausible values of the population mean μ. If the CI includes the value specified by the null hypothesis (typically zero), the null hypothesis cannot be rejected; otherwise, it is rejected. CI also helps in detecting outliers.
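The interval for a sample mean, the mean minus and plus the margin of error, can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function name and the ten scores are hypothetical, and the critical value 2.262 (two-sided 95%, 9 degrees of freedom) is taken from a standard t table and passed in by hand so that the sketch needs only the standard library. Where SciPy is available, `scipy.stats.t.ppf(0.975, df)` would supply the same value.

```python
import math
import statistics


def mean_confidence_interval(sample, t_critical):
    """Return (lower, upper) limits of the CI for the mean: M +/- MOE."""
    n = len(sample)
    m = statistics.mean(sample)
    s = statistics.stdev(sample)   # sample SD (N - 1 in the denominator)
    se = s / math.sqrt(n)          # standard error of the mean
    moe = t_critical * se          # margin of error
    return m - moe, m + moe


# Hypothetical scores from N = 10 participants.
scores = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0, 5.0, 5.0, 7.0, 3.0]

# Two-sided 95% critical value of t for df = N - 1 = 9 (from a t table).
T_95_DF9 = 2.262

low, high = mean_confidence_interval(scores, T_95_DF9)
print(f"95% CI [{low:.3f}, {high:.3f}]")

# Decision rule for a null value of zero: reject only if zero lies outside.
null_value = 0.0
reject_null = not (low <= null_value <= high)
```

For these scores the mean is 5.0 and the interval runs from roughly 4.17 to 5.83, so a null value of zero lies outside it. A larger sample with the same spread would shrink the standard error and therefore the interval.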
Effect size is “any of various measures of the magnitude or meaningfulness of a relationship between two variables” (15), (p. 352). It is a value that indicates the strength of the relationship between two variables in a population, or a sample-based estimate of that quantity. One or more measures of effect size constitute “the primary product of a research inquiry” (16). For example, the ES in the analysis of variance is historically reported as a squared correlation ratio (η²) between the independent and dependent variables; the effect size based on the correlation coefficient is obtained as r². However, it has been suggested that the conventional formulae are erroneous and that more accurate alternatives should be used (17). In general, the ES of a correlation coefficient is interpreted as the proportion of variance in one variable explained by the other. It indicates the practical significance (based on the statistical analysis, which is not the same as the economic significance) of the statistic. The statistical significance of a statistic as revealed by the confidence interval should not be confused with the practical significance. Every study must report the practical significance of the findings.
Meta-analysis is a “quantitative technique for synthesizing the results of multiple studies of a phenomenon into a single result by combining the effect size estimates from each study into a single estimate of the combined effect size or into a distribution of effect sizes” (15), (p. 644). It “can produce strong evidence where at first sight there seems to be only weak evidence.” The publication manual requires reporting all variables employed in the study, whether used in the analysis or not and irrespective of their significance level: “even when a characteristic is not used in analysis of the data, reporting it may… prove useful in meta-analytic studies that incorporate the article’s results” (18), p. 30, cited in Cumming (2). Meta-analysis generally turns long CIs into short ones, and thus turns weak evidence into strong evidence.
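The interpretation of an effect size as a proportion of explained variance can be made concrete with the classical η² for a one-way design, computed as the between-groups sum of squares divided by the total sum of squares. The sketch below uses this conventional formula only as an illustration; the function name and the three small groups are hypothetical, and the more accurate alternatives mentioned in (17) are not implemented here.

```python
def eta_squared(groups):
    """Classical eta squared: SS_between / SS_total for a one-way design."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)

    # Variation of the group means around the grand mean, weighted by size.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of every score around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    return ss_between / ss_total


# Hypothetical scores under three levels of an independent variable.
groups = [
    [6.0, 7.0, 8.0],
    [4.0, 5.0, 6.0],
    [2.0, 3.0, 4.0],
]
eta2 = eta_squared(groups)
print(f"eta squared = {eta2:.2f}")
```

Here the between-groups sum of squares is 24 and the total is 30, so 80% of the variance in the scores is associated with group membership. That figure describes the magnitude of the effect and is reported alongside, not instead of, the confidence interval.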
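How combining studies shortens the confidence interval can be sketched with inverse-variance weighting under a fixed-effect model. This is one common pooling technique, chosen here as an assumption; the sources cited above do not prescribe it. The three (effect size, standard error) pairs are hypothetical, and a normal-approximation critical value is used instead of t for simplicity.

```python
import math
from statistics import NormalDist

# Two-sided 95% critical value from the standard normal distribution.
Z_95 = NormalDist().inv_cdf(0.975)


def confidence_interval(effect, se):
    """Normal-approximation 95% CI: effect +/- z * SE."""
    moe = Z_95 * se
    return effect - moe, effect + moe


def pool_fixed_effect(studies):
    """Inverse-variance weighted mean of the effects and its standard error."""
    weights = [1.0 / se ** 2 for _, se in studies]
    pooled = sum(w * es for w, (es, _) in zip(weights, studies)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical studies: (effect size estimate, standard error).
studies = [(0.30, 0.20), (0.25, 0.15), (0.40, 0.25)]

single_study_cis = [confidence_interval(es, se) for es, se in studies]
pooled_effect, pooled_se = pool_fixed_effect(studies)
pooled_low, pooled_high = confidence_interval(pooled_effect, pooled_se)

for (es, _), (lo, hi) in zip(studies, single_study_cis):
    print(f"study effect {es:.2f}: 95% CI [{lo:.2f}, {hi:.2f}]")
print(f"pooled effect {pooled_effect:.2f}: "
      f"95% CI [{pooled_low:.2f}, {pooled_high:.2f}]")
```

In this example every single-study interval includes zero, while the pooled interval is shorter and excludes it. Precise studies receive more weight because the weights are the reciprocals of the squared standard errors.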
The social research framework (SRF)
Some researchers make a distinction between hunches, hypotheses, and working hypotheses. “A ‘working hypothesis’ is little more than the common-sense procedure that people use routinely. Encountering certain facts, certain alternative explanations come to mind and we proceed to test them” (19), cited in Merton (20), (p. 61). “The investigator begins with a hunch or hypothesis, from this he draws various inferences and these, in turn, are subjected to empirical test which confirms or refutes the hypothesis” (20), (p. 176). “During latent learning the rat is building up a ‘condition’ in himself, which I have designated as a set of ‘hypotheses,’ and this condition—these hypotheses—do not then and there show in his behavior” (21), (p. 161).
This section presents a framework of the social research process (SRF, Figure 1) that is based on the above distinctions and integrates the different kinds and formats of the research hypotheses discussed in the sections above. The framework suggests that social research takes place in the real world, but research operations happen in the conceptual world. Below is a very brief description of the different components of the SRF and their interrelationships relevant to the focus of this article.
The primary objective of scientific research is to understand the real world. Since the real world is complex, the scientist focuses on a part of the real world, which is referred to as the conceptual world in Figure 1. The conceptual world comprises such elements as the extant concepts, constructs, theories, models, and research relevant to the discipline to which the scientist belongs. In an interdisciplinary context, the conceptual world appropriately includes the elements from the interfaces and the domains of the contributing disciplines. Owing to its contents, the conceptual world is primarily representational in nature and is a large storehouse of representational elements indicated above.
For a particular research investigation, the researcher actively creates the research world that encompasses the “doing” or the activity-related elements of research. Activities in the research world are intensely related to the representational elements of the conceptual world. In general, there is a strong interplay among the real world, the conceptual world, and the research world that the social researcher deals with. Broken arrows in Figure 1 indicate inflows to and outflows from a particular world.
The research world can be divided into two subspaces shown as the research operations space and the statistical operations space. Efficient researchers, particularly those involved in quantitative research, develop strong skills to move between these three worlds and the two subspaces.
The research operations space (ROS)
The ROS comprises concepts, constructs, methods, and analysis. Concepts are theoretical abstractions formed by generalizing particulars (for example, motivation, aggression, attitude, etc.). Constructs are deliberately and consciously invented or adopted for a special scientific purpose (for example, intelligence and personality). In some sense, constructs can be thought to be generalizations from concepts. For example, cognitive intelligence comprises such component concepts as numerical intelligence, verbal intelligence, mechanical intelligence, etc.
Conceptual and operational hypotheses
Figure 1 indicates a distinction between the conceptual and operational hypotheses. Conceptual hypotheses are initial hunches based on observations, facts, and theories.
Thus, conceptual hypotheses indicate research hypotheses. In this sense, they are tentative answers to research questions. Since facts and theories are relatively permanent, conceptual hypotheses are more appropriately stated in the is format. Operational hypotheses derive from conceptual hypotheses and make predictions about the relationships between observed variables. Below is an example to illustrate the difference between these two types of hypotheses.
If a company dealing in soft drinks is in the process of extending one of its brands, the extension may be congruent or incongruent with the existing brand. This leads to the concept of “brand extension incongruity.” What will be the attitude of the consumers toward the extended brand? The concept of “purchase intention” reflects the attitude toward the brand. The relationship between these two concepts can be stated as a conceptual hypothesis: “Brand extension incongruity is related to purchase intention.”
The measurement of concepts requires variables at different levels. The company can measure brand extension incongruity at different levels: congruent brand extension (developing another soft drink), moderately incongruent brand extension (developing snacks), and extremely incongruent brand extension (developing apparel). Similarly, purchase intention can be measured as the likelihood that a consumer will purchase the extended brand. A relationship between incongruity and purchase intention is then stated as an operational hypothesis stating the relationship between the variables: “Likelihood of purchasing a congruent brand extension will be more than for the moderately or extremely incongruent brand extensions.” In this form, the operational hypothesis is a predictive statement. In contrast, if some information is available in the literature about the relationship between these two variables (for example, a theory or empirical research), the is format is more appropriate: “Likelihood of purchase is related to the brand extension incongruity.” Is there a linear or non-linear relationship between the extension incongruity and purchase intention? The answer to this question depends on the theory that predicts the relationship [refer to (22)].
Conclusion
Social researchers need to distinguish research hypotheses from statistical hypotheses. Research hypotheses directly relate to real-world phenomena, whereas statistical hypotheses relate to population parameters. If no hypothesis testing procedure is required in a particular study, stating statistical hypotheses makes no sense, but stating a research hypothesis is still meaningful.
The null hypothesis testing approach historically used by psychologists has several limitations. Most journals (particularly the APA journals) require reporting confidence intervals and effect sizes. These two measures are useful as they reveal the significance of a hypothesis as in the conventional approach and provide additional information. For example, they indicate the precision of the estimates and the practical significance of the findings. Meta-analysis can reveal strong results where individual studies indicate only weak evidence.
Author contributions
The author confirms the following responsibilities: conceptualization of the study and literature survey to get relevant information appropriate to the theme of the manuscript.
Conflict of interest
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
1. Kirk RE. Experimental Design: Procedures for the Behavioral Sciences. Belmont, CA: Brooks/Cole (1968). p. 22–4.
2. Cumming G. Understanding the New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis. London: Routledge (2012).
3. Ary D, Jacobs LC, Razavieh A. Introduction to Research in Education. 6th ed. Belmont, CA: Wadsworth (2002).
5. Creswell JW. Research Design: Qualitative and Quantitative Approaches. Thousand Oaks, CA: Sage (1994).
7. Chen X, Eberly M, Bachrach D, Wu K, Qu Q. Egocentric reciprocity and the role of friendship and anger. J Soc Psychol. (2017) 157:720–35.
8. Clarke HM, Sulsky LM. The impact of gender ideology on the performance of gender-congruent citizenship behaviors. Hum Perform. (2017) 30:212–30.
9. Huang Y, Lee J, McFadden AC, Murphy LA, Robertson M, Cheung JH, et al. Beyond safety outcomes: an investigation of the impact of safety climate on job satisfaction, employee engagement and turnover using social exchange theory as the theoretical framework. Appl Ergon. (2016) 55:248–57.
10. Edwards MS, Edwards EJ, Lyvers M. Cognitive trait anxiety, stress and effort interact to predict inhibitory control. Cogn Emot. (2017) 31:671–86.
11. American Psychological Association. Publication Manual of the American Psychological Association. 7th ed. Washington, DC: APA (2020).
13. Odgaard EC, Fowler RL. Confidence intervals for effect sizes: compliance and clinical significance in the Journal of Consulting and Clinical Psychology. J Consult Clin Psychol. (2010) 78:287–9.
14. American Psychological Association. Publication Manual of the American Psychological Association. 6th ed. Washington, DC: APA (2009).
15. VandenBos GR. APA Dictionary of Psychology. 2nd ed. Washington, DC: American Psychological Association (2015).
17. Learner MD, Mikami AY. Correct effect size estimates for strength of association statistics: comment on Odgaard and Fowler (2010). J Consult Clin Psychol. (2013) 81:190–1.