Partial Least Squares Structural Equation Modelling's Use in Information Systems: An Updated Guideline of Practices in Exploratory Settings

The purpose of many studies in the field of Information Systems (IS) research is to analyse causal relationships between variables. Structural Equation Modelling (SEM) is a statistical technique for testing and estimating those causal relationships based on statistical data and qualitative causal assumptions. Partial Least Squares Structural Equation Modelling (PLS-SEM) is the SEM technique most used in IS research. It has been the subject of many reviews in either confirmatory or exploratory settings. However, it has recently been argued that PLS occupies the middle ground between exploratory and confirmatory settings. Thus, this paper proposes an updated guideline for the use of PLS-SEM in Information Systems research in exploratory settings while maintaining interpretability. A systematic literature review of 40 empirical and methodological studies published between 2012 and 2016 in the leading journal of the field guides future empirical work.


Introduction
The purpose of many studies in the field of Information Systems (IS) research is to analyse causal relationships between variables. Several techniques, such as regression and structural equation modelling (SEM), allow researchers to evaluate their models. SEM is a statistical technique for testing and estimating those causal relationships based on statistical data and qualitative causal assumptions (Urbach & Ahlemann, 2010). Contrary to first-generation statistical tools such as regression, SEM enables researchers to answer a set of interrelated research questions in a single, systematic, and comprehensive analysis by modelling the relationships between multiple independent and dependent constructs simultaneously. This capability for simultaneous analysis differs greatly from most first-generation regression models such as linear regression, LOGIT, ANOVA, and MANOVA.
Partial Least Squares Structural Equation Modelling (PLS-SEM) is the SEM technique most used in IS research. PLS is regarded as the most fully developed and general system (Jörg Henseler, Hubona, & Ash, 2016). The IS discipline was identified as the primary user of PLS (Evermann & Tate, 2014). Rönkkö et al. (2012) argue that the use of partial least squares path modelling as a tool for theory testing has been increasing since the late 1990s and that PLS is currently one of the most common quantitative data analysis methods in the top IS journals. However, they emphasise that reliance on the PLS method has possibly resulted in the production and publication of a large number of studies whose results are invalid. These criticisms have been addressed in the literature (J. Henseler et al., 2014).
The technique has been the subject of many reviews (Evermann & Tate, 2012; Jörg Henseler et al., 2016; Rouse & Corbitt, 2008; Urbach & Ahlemann, 2010). These reviews have resulted in guidelines for the use of PLS-SEM in IS research. Most of these guidelines focus on either explanatory (confirmatory) or exploratory research. For instance, Henseler et al. (2016) propose an updated guideline for the use of PLS in IS research in confirmatory settings. On the other hand, Urbach & Ahlemann (2010) offer a guideline for the use of the technique in exploratory contexts.
The literature identifies three purposes of research: exploratory, descriptive or explanatory (confirmatory). An exploratory study is a valuable means of finding out what is going on, looking for new insights, asking questions and evaluating phenomena in a new light (Saunders, Lewis, & Thornhill, 2009). Exploratory research goes with a predictive model (Evermann & Tate, 2014). The object of descriptive research is to portray an accurate profile of persons, events or situations (Saunders et al., 2009). Studies that establish causal relationships between variables may be termed explanatory research (Saunders et al., 2009). Explanatory research goes with the causal (confirmatory) model (Kante, Oboko, & Chepken, 2017).
Nevertheless, Evermann & Tate (2014) argue that causal and predictive modelling are not a strict duality; rather, there is a middle ground between the two extreme positions. Decision makers and others accept a predictive model more readily if it can be plausibly interpreted (Evermann & Tate, 2014). Further, they state that it may be simpler to determine the prediction boundaries, i.e. to determine in what situations the model will hold and in what situations it will break, when a plausible substantive interpretation is available. Users of predictive models have more trust in their results, especially for unexpected or counterintuitive predictions, when a plausible interpretation is possible (ibid.). In contrast to explanatory modelling, the plausible interpretations in this context do not entail rigorous formal statistical testing of all posited relationships and model constraints (Evermann & Tate, 2014).
Kabarak j. res. innov. Vol. 6 No. 1, pp 49-67 (2018)
PLS path modelling was developed to occupy this middle ground and to straddle the traditional divide between causal-explanatory and predictive modelling at the extremes. It aims to maintain interpretability while engaging in predictive modelling (Evermann & Tate, 2014). Therefore, the guidelines for exploratory research need to be reviewed with this middle ground in mind. That justifies the purpose of this paper.
This article aims to update the guideline for the use of PLS-SEM in Information Systems research in exploratory settings while maintaining interpretability. It updates the paper of Urbach & Ahlemann (2010), which mainly addresses exploratory settings.

Material and Methods
This section describes the methods that were used to conduct the study.
To perform the systematic literature review efficiently, criteria for inclusion in the dataset were defined. Table 1 provides the criteria.

Table 1: Criteria for inclusion/exclusion in the dataset
We drew papers from proceedings and journals. Management Information Systems Quarterly (MISQ), Information Systems Research (ISR), Journal of Management Information Systems (JMIS) and Journal of the Association for Information Systems (JAIS) were identified as the four leading journals in the field of IS (Evermann & Tate, 2010). This paper is restricted to MISQ, as it is recognised as the leading journal. We had 26 research papers from MISQ:
• Three papers in 2014: all of them were empirical studies.
• Six studies in 2015: one methodological paper and five empirical studies.
• One empirical study in 2016.
Regarding proceedings papers, we selected four papers from the conferences that were hosted or organised by the Association for Information Systems and its affiliated organisations. In total, the data set comprised 40 studies. From the data set, the following were extracted: 1) reason for choosing PLS-SEM, 2) research epistemology, 3) research approach, 4) research strategy, 5) model characteristics and 6) model evaluation.

Results and Discussion
This section presents the in-depth analyses of the papers.

The Reasons for choosing PLS
Urbach & Ahlemann (2010) argue that, overall, PLS can be an adequate alternative to covariance-based SEM (CBSEM) if the problem has the following characteristics:
• The phenomenon to be investigated is relatively new, and measurement models need to be newly developed.
• The structural equation model is complex, with a large number of latent variables (LVs) and indicator variables.
• Relationships between the indicators and LVs have to be modelled in different modes (i.e., formative and reflective measurement models).
• The conditions relating to sample size, independence, or normal distribution are not met.
CBSEM requires a large sample size, while PLS does not.
If the sample size is small, PLS is recommended in Information Systems research (Evermann & Tate, 2014) and in marketing research (Hair et al., 2011).
Table 2 gives an overview of the reasons that the studies in our dataset gave for choosing PLS.

2013; 2015
Analyse formative constructs | (Kankanhalli, Ye, & Teo, 2015) | 2015
Number of interaction terms | (Venkatesh, Thong, & Xu, 2012) | 2012
Mediated models | (Bartelt & Dennis, 2014) | 2014

None of the studies used the small sample size criterion to justify the use of PLS-SEM. Instead, each one gave another argument to justify its use of the technique. The use of small sample sizes for PLS-SEM is not recommended. For instance, Goodhue, Lewis, Thompson, Marcoulides, & Chin (2012) argue that, when determining the minimum sample size to obtain adequate power, researchers should use Cohen's approach (regardless of the technique to be used) and should not rely on the rule of 10 (or the rule of 5) for PLS (ibid.). In addition, Kline (2013) argues that a "typical" sample size in studies where SEM was used is about 200 cases. Moreover, Garson (2016), quoting Chin & Newsted (1999), argues that sample sizes equivalent to those commonly found in SEM (e.g., 150-200) are needed. Therefore, we conclude that a sample size of 200 or above is appropriate for using PLS-SEM.
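To illustrate why the rule of 10 and a power-based calculation can disagree, the following is a minimal sketch rather than the procedure used by any of the cited authors. It assumes that the effect of interest can be expressed as a simple correlation r and uses the Fisher z approximation for a two-sided test; the function names and the default values of alpha and power are illustrative assumptions.

```python
import math
from statistics import NormalDist


def rule_of_ten(max_incoming_paths: int) -> int:
    """Heuristic minimum n: ten cases per path pointing at the busiest construct."""
    return 10 * max_incoming_paths


def n_for_correlation(r: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """A priori sample size needed to detect a correlation r (two-sided test).

    Assumption: Fisher z approximation,
    n = ((z_{1-alpha/2} + z_{power}) / atanh(r))^2 + 3.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / math.atanh(abs(r))) ** 2 + 3)


if __name__ == "__main__":
    print("rule of 10, four incoming paths:", rule_of_ten(4))
    for effect in (0.1, 0.3, 0.5):  # Cohen's small, medium, large correlations
        print("r =", effect, "-> n =", n_for_correlation(effect))
```

Under these assumptions, a medium effect (r = 0.3) at 5% significance and 80% power requires 85 cases, more than twice the 40 cases that the rule of 10 would suggest for a construct with four incoming paths.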

Research Epistemology
Many philosophical positions characterise Information Systems research. Saunders et al. (2009) compare four research philosophies that can be applied in information management research (positivism, realism, interpretivism and pragmatism).
In Information Systems research, Urbach & Ahlemann (2010) argue that an investigation that applies SEM follows a positivist epistemological belief. Furthermore, they report that the positivist researcher does not intervene in the inquiry and thus plays a neutral role. Epistemologically, the positivist perspective is concerned with the empirical testability of theories (Urbach & Ahlemann, 2010). In other words, these theories are either confirmed or rejected. None of the papers that we reviewed addressed the philosophical point of view. Therefore, we follow Urbach & Ahlemann (2010), who argue that research that applies SEM (including PLS) follows a positivist epistemological belief.

Research Approach
The extent to which the researcher is clear about the theory at the beginning of his/her research raises an important question concerning the design of the research project (Saunders et al., 2009). That is, whether the research should use the deductive approach, in which the researcher develops a theory and hypothesis (or hypotheses) and designs a research strategy to test the hypothesis, or the inductive approach, in which he/she collects data and develops a theory as a result of the data analysis (ibid.). None of the papers that we reviewed reported their research approach. Nevertheless, the purpose of the empirical studies we reviewed was to gather data and test their hypotheses. That is a deductive approach, and thus we conclude that studies using PLS-SEM apply a deductive approach. This research approach was not covered by the guidelines of Urbach & Ahlemann (2010).

Research Strategy
Saunders et al. (2009) argue that the survey is a popular and common strategy in business and management research and is most frequently used to answer who, what, where, how much and how many questions. It therefore tends to be used for exploratory and descriptive research. Our data set reveals that PLS-SEM studies applied the survey as a strategy. That is consistent with the fact that these studies were mainly done in exploratory settings.

Outer Model
Measurement model specification requires the consideration of the nature of the relationship between constructs and measures. Latent variable measurement concerns the process of ensuring that local independence is satisfied for a selected set of observed variables or indicators, and this can be done via the use of a model such as a common factor model (Goodhue et al., 2012). There are two types of measurement models: reflective and formative (Figure 2) (Hreats, Becker, & Ringle, 2013). Formative and reflective are thus the two currently accepted ways of specifying the relationship between latent constructs and the observed variables that are causally related to them (Aguirre-Urreta & Marakas, 2012).
In reflective measures, changes in the construct are reflected in shifts in all of its indicators, and the direction of causality is from the construct to the indicators (Garson, 2016). Reflective indicators are assessed by their loadings, which are the simple correlations between the indicator and the construct (Hreats et al., 2013). Reflective models were reported by some of the reviewed empirical studies (e.g., Bartelt & Dennis, 2014; Fang et al., 2014).
In formative measures, the indicators do not reflect the underlying construct but are combined to form it, without any assumptions about the intercorrelation patterns among them (Garson, 2016). The direction of causality is from the indicators to the construct, and the weights of formative indicators represent the importance of each indicator in explaining the variance of the construct (Hreats et al., 2013). Reviewed empirical studies reported the use of formative models (Han et al., 2015; Jarvis, Mackenzie, & Podsakoff, 2012; Kankanhalli et al., 2015; Majchrzak et al., 2013; Marett, Otondo, & Taylor, 2013; Schmitz, Teng, & Webb, 2016; Setia et al., 2013; Venkatesh et al., 2012; Wang et al., 2013; Wu, Straub, & Liang, 2015).

Inner Model or Structural Model
The inner model (structural model) also has two types of variables: exogenous and endogenous (see Figure 1). A latent variable is exogenous if it is not an effect of any other latent variable in the model (there are no incoming arrows from other latent variables) (Garson, 2016). A latent variable is endogenous if it is an effect of at least one other latent variable (there is at least one incoming arrow from another latent variable) (Garson, 2016).
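The exogenous/endogenous distinction follows mechanically from the arrows of the structural model. The sketch below is only an illustration: the function name and the three-construct model are hypothetical and are not taken from any reviewed study.

```python
def classify_latent_variables(paths):
    """Split latent variables into exogenous and endogenous sets.

    `paths` is an iterable of (source, target) pairs, one per structural arrow.
    """
    paths = list(paths)
    all_lvs = {lv for path in paths for lv in path}
    endogenous = {target for _, target in paths}  # at least one incoming arrow
    exogenous = all_lvs - endogenous              # no incoming arrows
    return exogenous, endogenous


# Hypothetical structural model, used only for illustration.
example_paths = [
    ("Ease of use", "Usefulness"),
    ("Ease of use", "Intention"),
    ("Usefulness", "Intention"),
]
exogenous, endogenous = classify_latent_variables(example_paths)
print("Exogenous:", sorted(exogenous))
print("Endogenous:", sorted(endogenous))
```

In this hypothetical model, "Usefulness" is endogenous even though it also predicts "Intention", because the classification depends only on whether a construct receives at least one arrow.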
The inner model can also have other variables, such as moderating, mediating and control variables. A moderator is a qualitative (e.g., sex, race, class) or quantitative (e.g., level of reward) variable that affects the direction and/or strength of the relation between an independent or predictor variable and a dependent or criterion variable (Baron & Kenny, 1986). They further argue that the relationship between two variables changes as a function of the moderator variable. In other words, a moderator effect is an interaction effect. A mediator (or mediating variable) accounts for the relationship between the predictor and the criterion (Baron & Kenny, 1986). It is an intervening variable (Garson, 2016). An intervening variable (mediator) transmits the effect of an independent variable to a dependent variable (Chin, 2006). A control variable is a variable that is not the focus of a research study but whose impact on the dependent variable (DV) cannot be ignored; it is therefore included in the research model testing together with the independent variables (IVs) (Fung, 2015). It is called a control variable because it is kept "controlled", "monitored" or "constant" in order to observe whether it has minimal impact on the relationships between the independent variable and the dependent variable (Fung, 2015). Usually, the control variable is not included as part of a hypothesis statement.
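The difference between moderation and mediation can be made concrete with a few lines of arithmetic. This is a minimal sketch under two common textbook assumptions that are not spelled out in the sources cited above: moderation is represented by a product term in Y = b0 + b_x*X + b_m*M + b_xm*X*M, and the mediated (indirect) effect is the product of the two paths X -> M (a) and M -> Y (b). All names and coefficient values are hypothetical.

```python
def conditional_slope(b_x: float, b_xm: float, moderator: float) -> float:
    """Slope of Y on X at a given moderator value when the model contains
    the interaction term b_xm * X * M."""
    return b_x + b_xm * moderator


def indirect_effect(a: float, b: float) -> float:
    """Effect of X on Y transmitted through the mediator (X -a-> M -b-> Y)."""
    return a * b


def total_effect(direct: float, a: float, b: float) -> float:
    """Direct effect of X on Y plus the effect carried by the mediator."""
    return direct + indirect_effect(a, b)


if __name__ == "__main__":
    # Moderation: the X -> Y slope changes with the level of the moderator.
    for level in (-1.0, 0.0, 1.0):
        print("moderator =", level, "slope =", conditional_slope(0.3, 0.2, level))
    # Mediation: part of the effect of X reaches Y through M.
    print("indirect =", indirect_effect(0.5, 0.4))
    print("total =", total_effect(0.1, 0.5, 0.4))
```

A moderator therefore changes how strong the X to Y path is, whereas a mediator sits on the path and carries part of the effect.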

Model Evaluation
Model evaluation requires the assessment of two inter-related models: the measurement model (outer model) and the structural model (inner model).

Outer Model Fit Evaluation
a. Reflective outer model fit evaluation
The measurement model should be tested for at least internal consistency reliability, indicator reliability, convergent validity, and discriminant validity by applying standard decision rules from the IS research literature. Urbach & Ahlemann (2010) argued that the traditional criterion for assessing internal consistency reliability is Cronbach's alpha (CA), where a high alpha value assumes that the scores of all items with one construct have the same range and meaning (Cronbach 1951). However, Garson (2016) argued that composite reliability is a preferred alternative to Cronbach's alpha as a test of convergent validity in a reflective model. Compared to Cronbach's alpha, composite reliability may lead to higher estimates of true reliability. Regardless of which coefficient is used for assessing internal consistency, values above .700 are desirable for exploratory research (Urbach & Ahlemann, 2010).
Convergent validity entails the degree to which individual items reflecting a construct converge in comparison to items measuring different constructs. Urbach & Ahlemann (2010) argued that a commonly applied criterion of convergent validity is the average variance extracted (AVE) proposed by Fornell and Larcker (1981). It measures the percentage of variance captured by a construct by showing the ratio of the sum of the variance captured by the construct and the measurement variance (Gefen et al., 2000). An AVE value of at least .500 indicates that an LV is on average able to explain more than half of the variance of its indicators and, thus, demonstrates sufficient convergent validity (Garson, 2016; Urbach & Ahlemann, 2010).
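Both composite reliability and AVE can be computed directly from the standardised outer loadings of a reflective construct. The sketch below assumes standardised loadings, so that the error variance of each indicator is 1 minus its squared loading; the loadings are invented for illustration, and the .700 and .500 cut-offs are the ones quoted above.

```python
def composite_reliability(loadings):
    """(sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    sum_loadings = sum(loadings)
    error_variance = sum(1 - l ** 2 for l in loadings)  # assumes standardised loadings
    return sum_loadings ** 2 / (sum_loadings ** 2 + error_variance)


def average_variance_extracted(loadings):
    """Mean of the squared standardised loadings of a construct."""
    return sum(l ** 2 for l in loadings) / len(loadings)


if __name__ == "__main__":
    loadings = [0.8, 0.7, 0.9]  # hypothetical indicator loadings
    cr = composite_reliability(loadings)
    ave = average_variance_extracted(loadings)
    print("CR  =", round(cr, 3), "meets .700:", cr > 0.7)
    print("AVE =", round(ave, 3), "meets .500:", ave >= 0.5)
```

With these hypothetical loadings, the construct clears both thresholds; lowering one loading to around 0.4 shows how quickly the AVE falls towards the .500 boundary.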
Finally, discriminant validity involves the degree to which the measures of different constructs differ from one another. Whereas convergent validity tests whether a particular item measures the construct it is supposed to measure, discriminant validity tests whether the items do not unintentionally measure something else (Urbach & Ahlemann, 2010). In SEM using PLS, two measures of discriminant validity are commonly used: the cross-loading criterion and the Fornell-Larcker criterion (Urbach & Ahlemann, 2010). However, simulation studies demonstrated that a lack of discriminant validity is better detected by the heterotrait-monotrait ratio (HTMT) (Jörg Henseler, Ringle, & Sarstedt, 2014). Moreover, in Information Systems research, it was argued that discriminant validity should be assessed by the Heterotrait-Monotrait Ratio (HTMT) (Jörg Henseler et al., 2016). The ratio is the geometric mean of the heterotrait-heteromethod correlations (i.e., the correlations of indicators across constructs measuring different phenomena) divided by the average of the monotrait-heteromethod correlations (i.e., the correlations of indicators within the same construct) (Garson, 2016). Table 3 summarises the measurement model assessment.
Table 3. Reflective measurement model assessment

Internal consistency reliability
Composite reliability > 0.6: attempts to measure the sum of an LV's factor loadings relative to the sum of the factor loadings plus error variance. Leads to values between 0 (completely unreliable) and 1 (perfectly reliable) (Fang et al., 2014; Garson, 2016; Han et al., 2015b; Urbach & Ahlemann, 2010).

Discriminant validity
Cross-loadings: requires that the loadings of each indicator on its construct are higher than the cross-loadings on other constructs (Gefen et al., 2000; Urbach & Ahlemann, 2010; Wang et al., 2013).

Heterotrait-Monotrait Ratio (HTMT): the ratio is the geometric mean of the heterotrait-heteromethod correlations (i.e., the correlations of indicators across constructs measuring different phenomena) divided by the average of the monotrait-heteromethod correlations (i.e., the correlations of indicators within the same construct) (Garson, 2016).
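The HTMT criterion can be computed from the indicator correlation matrix alone. The following is a minimal sketch that assumes the way the ratio is usually operationalised: the mean of the between-construct indicator correlations divided by the geometric mean of the two constructs' average within-construct correlations. The function name, the index-based block layout and the correlation values are illustrative assumptions, and each construct is assumed to have at least two indicators.

```python
from itertools import combinations, product
from math import sqrt


def htmt(corr, block_a, block_b):
    """HTMT for two constructs.

    corr    : square indicator correlation matrix (list of lists)
    block_a : indices of the indicators of the first construct (two or more)
    block_b : indices of the indicators of the second construct (two or more)
    """
    def mean(values):
        return sum(values) / len(values)

    # Correlations between indicators of different constructs.
    hetero = [corr[i][j] for i, j in product(block_a, block_b)]
    # Correlations between indicators of the same construct.
    mono_a = [corr[i][j] for i, j in combinations(block_a, 2)]
    mono_b = [corr[i][j] for i, j in combinations(block_b, 2)]
    return mean(hetero) / sqrt(mean(mono_a) * mean(mono_b))


if __name__ == "__main__":
    # Hypothetical correlations: indicators 0-1 measure A, indicators 2-3 measure B.
    corr = [
        [1.0, 0.8, 0.4, 0.4],
        [0.8, 1.0, 0.4, 0.4],
        [0.4, 0.4, 1.0, 0.5],
        [0.4, 0.4, 0.5, 1.0],
    ]
    print("HTMT(A, B) =", round(htmt(corr, [0, 1], [2, 3]), 3))
```

The closer the ratio gets to 1, the harder it is to argue that the two sets of indicators measure distinct constructs.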

b. Formative outer model fit evaluation
The evaluation of formative measurement models needs a different approach from that applied to reflective models (Urbach & Ahlemann, 2010). Because the indicators represent different dimensions, the researcher would not expect the indicators to correlate highly, implying that composite reliability and Cronbach's alpha might not be high (Garson, 2016). Conventional validity assessments do not apply to formative measurement models, and the concepts of reliability and construct validity are not meaningful when employing such models. Whereas reliability becomes an irrelevant criterion for assessing formative measurement, the examination of validity becomes crucial (Diamantopoulos 2006). Accordingly, Urbach & Ahlemann (2010), quoting Henseler et al. (2009), argue that the validity of formative constructs should be assessed at two levels: the indicator level and the construct level.
To assess indicator validity, the researcher should monitor the significance of the indicator weights using bootstrapping (Garson, 2016; Urbach & Ahlemann, 2010; Venkatesh et al., 2012). Outer model weights are the focus in formative models, representing the paths from the constituent indicator variables to the composite factor (Garson, 2016). A significance level of at least .050 suggests that an indicator is relevant for the construction of the formative index and, thus, demonstrates a sufficient degree of validity (Urbach & Ahlemann, 2010). Weights vary from 0 to an absolute maximum lower than 1 (Garson, 2016). Also, the degree of multicollinearity among the formative indicators should be assessed by calculating the variance inflation factor (VIF). The VIF indicates how much of an indicator's variance is explained by the other indicators of the same construct (Garson, 2016; Jörg Henseler et al., 2016; Urbach & Ahlemann, 2010). That said, Urbach & Ahlemann (2010) report that values below the commonly accepted threshold of 10 indicate that multicollinearity is not an issue (Diamantopoulos and Siguaw 2006; Gujarati 2003).
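The VIF of each formative indicator can be read off the inverse of the correlation matrix of the indicators belonging to one construct: the j-th diagonal element of that inverse equals 1 / (1 - R²_j), where R²_j is the variance of indicator j explained by the other indicators. The sketch below uses a small pure-Python matrix inversion so that it is self-contained; the helper names and the correlation values are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: indicators are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(corr):
    """VIF of each indicator: diagonal of the inverse correlation matrix."""
    inverse = invert(corr)
    return [inverse[i][i] for i in range(len(corr))]


if __name__ == "__main__":
    # Hypothetical correlations among three formative indicators of one construct.
    corr = [
        [1.0, 0.6, 0.3],
        [0.6, 1.0, 0.2],
        [0.3, 0.2, 1.0],
    ]
    for index, vif in enumerate(variance_inflation_factors(corr)):
        print("indicator", index, "VIF =", round(vif, 3), "below 10:", vif < 10)
```

The check against 10 mirrors the threshold reported above; a stricter cut-off can be substituted without changing the computation.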
The first step in assessing construct validity could be a test for nomological validity (Urbach & Ahlemann, 2010). In this context, nomological validity means that, within a set of hypotheses, the formative construct behaves as expected. Accordingly, those relationships between the formative construct and the other constructs of the model which have been sufficiently referred to in prior literature should be robust and significant (Henseler et al. 2009; Peter 1981; Straub et al. 2004). Urbach & Ahlemann (2010) further propose assessing construct validity by checking discriminant validity. Correlations between formative and all other constructs of less than .700 indicate sufficient discriminant validity (Urbach & Ahlemann, 2010).
(Garson, 2016; Han et al., 2015b; Kankanhalli et al., 2015; Schmitz et al., 2016; Urbach & Ahlemann, 2010; Venkatesh et al., 2012)

Nomological validity
Means that, within a set of hypotheses, the formative construct behaves as expected.
Relationships between the formative construct and the other constructs of the model, which have been sufficiently referred to in prior literature, should be robust and significant (Urbach & Ahlemann, 2010; Wu et al., 2015).

Inter-construct correlations
If the correlations between the formative and all the other constructs are less than .700, the constructs differ sufficiently from one another (Urbach & Ahlemann, 2010).
Source: adapted from Urbach & Ahlemann (2010)

Inner model fit evaluation
Once the reliability and validity of the outer models are established, several steps need to be taken to evaluate the hypothesised relationships within the inner model. The assessment of the model's quality is based on its ability to predict the endogenous constructs. The following criteria facilitate this evaluation: the coefficient of determination (R²) (Urbach & Ahlemann, 2010), predictive relevance (Q²) (Evermann & Tate, 2014), and path coefficients (Garson, 2016).
Evermann & Tate (2012) argue that, while in traditional regression models the R² proportion of explained variance is an indicator of the predictive strength of the model, researchers have recently advocated the use of blindfolding for assessing the predictive strength of structural equation models (Chin, 2010; Ringle et al., 2012). Garson (2016) reports that blindfolding utilises a cross-validation strategy and reports cross-validated communality and cross-validated redundancy for constructs as well as indicators. He further argued that the purpose is to calculate cross-validated measures of model predictive accuracy (reliability), of which there are four: construct cross-validated redundancy, construct cross-validated communality, indicator cross-validated redundancy and indicator cross-validated communality.
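The core of the blindfolding statistic can be sketched in a few lines. This is a simplified illustration, not the full omission-distance procedure implemented in PLS software: it assumes that the omitted data points and their model-based predictions are already available, and that the trivial benchmark prediction is the mean of the observed values. Q² is then one minus the ratio of the squared prediction errors (SSE) to the squared errors of the benchmark (SSO). The function name and the data are hypothetical.

```python
def q_squared(observed, predicted):
    """Stone-Geisser-style Q^2 = 1 - SSE / SSO for omitted data points.

    Assumption: SSO is measured against the mean of the observed values.
    """
    mean_observed = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sso = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - sse / sso


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]    # hypothetical omitted indicator values
    predicted = [1.2, 1.9, 3.3, 3.8]   # hypothetical model-based predictions
    print("Q2 =", round(q_squared(observed, predicted), 3))
```

A value above zero means the model predicts the omitted points better than the benchmark; perfect predictions give 1, and predictions no better than the mean give 0.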
However, in IS research, Evermann & Tate (2012), quoting Chin (2010), recommend using redundancy-based blindfolding to assess the predictive relevance of one's "theoretical/structural model" and suggest that a value of Q² > 0.5 indicates a predictive model. R² is the measure of the proportion of the variance of the dependent variable about its mean that is explained by the independent variable(s) (Gefen et al., 2000). Urbach & Ahlemann (2010), quoting Chin (1998b), consider values of approximately .670 substantial, values around .333 average, and values of .190 and lower weak. Nevertheless, what counts as a meaningful value of R² depends on the field (Garson, 2016). Urbach & Ahlemann (2010) report that the R² should be above .100. The path coefficients should also be assessed: their significance tests and p values should be obtained using the bootstrapping technique.
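The bootstrapping step can be illustrated for the simplest possible case. The sketch below makes strong simplifying assumptions that should be stated plainly: construct scores are taken as already estimated, and the endogenous construct has a single predictor, so that the standardised path coefficient equals the Pearson correlation between the two score vectors. PLS software re-estimates the whole model in every resample; here only the path is re-estimated. All names and data are hypothetical.

```python
import random
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation, i.e. the standardised path with one predictor."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def bootstrap_path(x, y, n_boot=1000, seed=0):
    """Return (estimate, bootstrap standard error, t-value) for one path."""
    rng = random.Random(seed)
    n = len(x)
    estimate = pearson(x, y)
    draws = []
    while len(draws) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        xs, ys = [x[i] for i in idx], [y[i] for i in idx]
        if len(set(xs)) > 1 and len(set(ys)) > 1:   # skip degenerate resamples
            draws.append(pearson(xs, ys))
    standard_error = stdev(draws)
    return estimate, standard_error, estimate / standard_error


if __name__ == "__main__":
    # Hypothetical construct scores with a strong positive relationship.
    x = [float(v) for v in range(30)]
    y = [v + (1.0 if int(v) % 2 else -1.0) for v in x]
    estimate, se, t_value = bootstrap_path(x, y)
    print("path =", round(estimate, 3), "significant at 5%:", abs(t_value) > 1.96)
```

The t-value is compared here with 1.96, the two-sided 5% critical value of the normal distribution, which is itself an approximation for small samples.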
Finally, the model fit should be assessed. Henseler et al. (2016) argued that, currently, the only approximate model fit criterion implemented for PLS path modelling is the standardised root mean square residual (SRMR). They further claimed that, as can be derived from its name, the SRMR is the square root of the sum of the squared differences between the model-implied and the empirical correlation matrix, i.e. the Euclidean distance between the two matrices. By convention, a model has a good fit when the SRMR is less than .08 (Hu & Bentler, 1998). Some use the more lenient cut-off of less than .10 (Garson, 2016). Table 5 gives an overview of the assessment of formative models. Four papers report the use of indicator weights to assess indicator validity, while five report the VIF for the same purpose. Only one paper reports the formative construct validity assessment.
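The SRMR criterion described above can be sketched as follows. The code assumes the conventional root-mean-square form, in which the squared residuals between the empirical and the model-implied correlation matrices are averaged over the lower triangle (diagonal included) before the square root is taken; software packages may differ in exactly which elements they average over. The function name and the matrices are hypothetical.

```python
from math import sqrt


def srmr(empirical, implied):
    """Root mean square of the residual correlations (lower triangle incl. diagonal)."""
    p = len(empirical)
    squared_residuals = [
        (empirical[i][j] - implied[i][j]) ** 2
        for i in range(p)
        for j in range(i + 1)
    ]
    return sqrt(sum(squared_residuals) / len(squared_residuals))


if __name__ == "__main__":
    # Hypothetical empirical and model-implied correlation matrices.
    empirical = [
        [1.00, 0.50, 0.40],
        [0.50, 1.00, 0.30],
        [0.40, 0.30, 1.00],
    ]
    implied = [
        [1.00, 0.48, 0.35],
        [0.48, 1.00, 0.25],
        [0.35, 0.25, 1.00],
    ]
    value = srmr(empirical, implied)
    print("SRMR =", round(value, 4), "below .08:", value < 0.08)
```

The comparison with .08 mirrors the conventional cut-off quoted above; the more lenient .10 can be substituted in the same way.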

Conclusion
Partial Least Squares Structural Equation Modelling has been widely applied in the field of Information Systems, which is characterised as the primary user of the technique. Nevertheless, its use is subject to criticism. This review has updated the guidelines for the use of PLS-SEM in IS settings by integrating new criteria for assessing the measurement and structural models. Nevertheless, this update takes a non-technical point of view. Further inquiry could show how to report the results for the criteria provided here.

Acknowledgement
This material is based upon work supported by the United States Agency for International Development, as part of the Feed the Future initiative, under the CGIAR Fund, award number BFS-G-11-00002, and the predecessor fund the Food Security and Crisis Mitigation II grant, award number EEM-G-00-04-00013.
Figure 1. Inner vs Outer Model in a SEM Diagram. Source: Wong (2014)

Figure 3. Model distribution per year

Table 4. Formative measurement model assessment