4. RESEARCH METHODOLOGY
4.4 Data Analyses Techniques
⁶⁴ distribution (Hair et al., 2006). The majority of extant statistical techniques (e.g., correlations, regressions, t-tests, ANOVA) rest on the assumption that the data follow a normal distribution; that is, that the population from which the samples are taken is normally distributed (Altman & Bland, 1995; Pallant, 2007; Field, 2009). Two approaches exist for assessing normality:
(i) Visual methods: these include histograms, stem-and-leaf plots, box plots, P-P (probability-probability) plots, and Q-Q (quantile-quantile) plots (Field, 2009). While visual methods have been criticized as unreliable for assessing normality (Oztuna, Elhan, & Tuccar, 2006), they are still considered useful because, when data are presented visually, readers can judge the distributional assumption for themselves (Altman & Bland, 1996).
(ii) Normality tests: these are supplementary to visual/graphical methods (Elliot & Woodward, 2007) and include the D’Agostino skewness test, the Anscombe-Glynn kurtosis test, the Kolmogorov-Smirnov (K-S) test, the Lilliefors-corrected K-S test, and the Shapiro-Wilk test, amongst others (Oztuna et al., 2006; Elliot & Woodward, 2007; Peat & Barton, 2005). Although a variety of tests exists, a commonly employed univariate check for normality in the extant IS literature is the inspection of skewness and kurtosis values. Scholars differ, however, on the thresholds to apply. For instance, Stevens (2001) recommends thresholds of < 2 for skewness and < 7 for kurtosis; Hair, Babin, Money, & Samouel (2003) regard values of -1 to +1 for skewness and -3 to +3 for kurtosis as acceptable, whilst Azzalini (2005) recommends values of -2 to +2 for skewness and -3 to +3 for kurtosis. Given these varying opinions, the researcher situates the adopted values within the noted ranges: this study uses -2 to +2 for skewness and -3 to +3 for kurtosis.
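The skewness and kurtosis screening described above can be illustrated with a minimal sketch. It assumes the bias-adjusted sample estimators commonly reported by statistical packages and treats kurtosis as excess kurtosis (a normal distribution scores 0); which estimator the study's software actually applied is an assumption here. The function names and the example data are hypothetical and are not taken from the study.

```python
import math


def sample_skewness(values):
    """Bias-adjusted sample skewness (adjusted Fisher-Pearson estimator)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least 3 observations")
    mean = sum(values) / n
    # Sample standard deviation with the n - 1 denominator.
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if std == 0:
        raise ValueError("skewness is undefined for constant data")
    third = sum((v - mean) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * third / std ** 3


def sample_excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis (0 for a normal distribution)."""
    n = len(values)
    if n < 4:
        raise ValueError("kurtosis needs at least 4 observations")
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    if var == 0:
        raise ValueError("kurtosis is undefined for constant data")
    fourth = sum((v - mean) ** 4 for v in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth / var ** 2 - correction


def within_normality_ranges(values, skew_limit=2.0, kurt_limit=3.0):
    """Check the ranges adopted in the text: skewness within -2 to +2
    and kurtosis within -3 to +3 (interpreted here as excess kurtosis)."""
    skew = sample_skewness(values)
    kurt = sample_excess_kurtosis(values)
    return abs(skew) <= skew_limit and abs(kurt) <= kurt_limit


# Hypothetical item scores, for illustration only.
symmetric_scores = [1, 2, 3, 4, 5]
skewed_scores = [1, 1, 1, 1, 1, 1, 1, 1, 1, 100]
```

Under these assumptions, `within_normality_ranges(symmetric_scores)` passes the check, whereas `skewed_scores` fails it because its skewness exceeds +2.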
4.4.1 Motivation for selected statistical packages
For quantitative studies, researchers have varied data analysis needs and varied levels of statistical education and training. Accordingly, several statistical software packages (SAS, R, Stata, SPSS, Minitab, etc.) are available in the market. Most of these packages cater for basic statistical analysis; some, however, offer more advanced techniques. The selection of an appropriate statistical package therefore depends on a researcher’s ability to determine the relevant statistical technique for a given situation and to align the proposed statistical tests with what the software offers.
For this study’s data analysis requirements, three statistical packages were employed.
First, IBM SPSS Statistics (Statistical Package for the Social Sciences), version 22, was selected for the descriptive and exploratory section of the analysis chapter. SPSS is an advanced tool that is widely employed for analyzing a range⁶⁵ of statistical data.
Second, a PLS-SEM⁶⁶ approach was selected to validate the proposed research model of the study. More specifically, the selection of PLS-SEM was based on the objectives of the research (Gefen, Straub, & Rigdon, 2011) and on its capacity to assess a structural model and a measurement model concurrently (Vinzi, Trinchera, & Amato, 2010). In essence, PLS-SEM was selected because: (i) it gives optimum prediction accuracy owing to its prediction orientation (Fornell & Cha, 1994), which aligns with one of the objectives of this study: to identify the determinants of user continuance intention towards M-pesa. PLS-SEM caters for prediction by determining the portion of the variance in the dependent latent variable that is explained by the independent latent variables. (ii) It is apt where theoretical models are at an early stage of development (Chin & Newsted, 1999). The research model in this study is relatively new owing to its integration and its application in a previously unsurveyed context. (iii) It handles complex models with numerous structural model relationships (Hair et al., 2014b). This study hypothesizes 20 relationships, which can be regarded as numerous and complex.
⁶⁵ Some of the statistical analytic capabilities of SPSS include:
(i) Descriptive statistics: cross tabulations, frequencies, and descriptive ratio statistics.
(ii) Bivariate statistics: means, t-tests, ANOVA, correlations.
(iii) Predictions for the identification of groups: factor analysis and cluster analysis.
(iv) Reliability tests: Cronbach’s alpha.
(v) Predictions for numerical outcomes: regressions.
⁶⁶ PLS-SEM is a causal modelling technique applied to maximize the explained variance of a dependent latent construct (Hair, Ringle, & Sarstedt, 2011).
Third, ANN⁶⁷ was selected as a multi-stage approach to verify the results generated by PLS-SEM. In essence, the study’s research model is tested using PLS-SEM, and the results from SEM are input to the ANN. An ANN resembles the human brain in that it can acquire new information from a given situation during the training procedure. Haykin (2007) explains that the information obtained from training is stored in synaptic weights; then, based on the data presented to it, the training procedure adapts the synaptic weights of the ANN in an orderly way to achieve the intended goal. Further, an ANN is adaptive: it responds to structural changes in the data-generating process and can be retrained to accommodate changing conditions (Garson, 1998). As such, ANN is believed to outperform conventional techniques such as regression (Chiang, Zhang, & Zhou, 2006). Typically, SEM is applied in social science research to validate hypothesized relationships, but it is seldom integrated with artificial intelligence techniques (Hsu, Shih, Huang, Lin, & Lin, 2009). SEM is a linear model that often oversimplifies the intricacies of the relationships under analysis, such as those in ‘technology use’ phenomena (Venkatesh & Goyal, 2010). To overcome this shortcoming of SEM, ANN is employed in this study to identify intricate linear and non-linear relationships between the determinants of user continuance intention towards M-pesa. Further, ANN allows its accuracy to be improved: it evaluates its performance using mean-square error and regression, is designed to learn from the input data, and is thus able to generalize to circumstances not previously encountered by the network (Chan & Chong, 2012).
Last, while studies in other disciplines, such as economics (Choudhary & Haider, 2012) and marketing (Phillips, Davies, & Moutinho, 2015), have applied ANN in their investigations, only a few studies in the IS discipline have employed it (Shmueli & Koppius, 2010).
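The training idea described above for ANN (synaptic weights adapted step by step, with performance judged by mean-square error on a non-linear relationship) can be illustrated with a minimal sketch. This is not the network configuration used in the study: the single hidden layer of three tanh units, the learning rate, the number of epochs, and the toy data are all assumptions chosen for illustration.

```python
import math
import random


def forward(x, w1, b1, w2, b2):
    """Return the hidden activations and the network output for scalar x."""
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    output = sum(v * h for v, h in zip(w2, hidden)) + b2
    return hidden, output


def mean_squared_error(data, w1, b1, w2, b2):
    """Average squared difference between network outputs and targets."""
    total = sum((forward(x, w1, b1, w2, b2)[1] - y) ** 2 for x, y in data)
    return total / len(data)


def train(data, n_hidden=3, lr=0.05, epochs=2000, seed=0):
    """Full-batch gradient descent on the mean-square error.

    Returns the MSE before and after training.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]  # input -> hidden
    b1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]  # hidden -> output
    b2 = 0.0
    initial_mse = mean_squared_error(data, w1, b1, w2, b2)
    n = len(data)
    for _ in range(epochs):
        g_w1 = [0.0] * n_hidden
        g_b1 = [0.0] * n_hidden
        g_w2 = [0.0] * n_hidden
        g_b2 = 0.0
        for x, y in data:
            hidden, out = forward(x, w1, b1, w2, b2)
            err = 2.0 * (out - y) / n  # derivative of the MSE w.r.t. output
            g_b2 += err
            for j in range(n_hidden):
                g_w2[j] += err * hidden[j]
                # Back-propagate through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
                d_hidden = err * w2[j] * (1.0 - hidden[j] ** 2)
                g_w1[j] += d_hidden * x
                g_b1[j] += d_hidden
        # Adapt the weights against the gradient.
        for j in range(n_hidden):
            w1[j] -= lr * g_w1[j]
            b1[j] -= lr * g_b1[j]
            w2[j] -= lr * g_w2[j]
        b2 -= lr * g_b2
    final_mse = mean_squared_error(data, w1, b1, w2, b2)
    return initial_mse, final_mse


# Hypothetical non-linear relationship (y = x^2), for illustration only.
TOY_DATA = [(i / 10, (i / 10) ** 2) for i in range(-10, 11)]
```

Calling `train(TOY_DATA)` returns the mean-square error before and after training; a lower final value indicates that the adapted weights fit the non-linear pattern better than the random starting weights did.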
4.4.2 Method of Analysis
This study focuses on identifying underlying relationships between a criterion variable (continuance intention) and several predictor variables. The predictive nature of the study required that the data be submitted to statistical packages and an artificial intelligence tool (SPSS, PLS-SEM, and ANN) for descriptive analysis, structural and measurement model analysis, and predictive analysis. The data analysis procedures employed in this study are outlined next.
⁶⁷ An ANN is a computer-based statistical information processor that is modelled on the human brain (Haykin, 2007).
4.4.3 Analyses Procedures
The use of a paper-based survey instrument required that the data from the individual questionnaires be entered into an SPSS spreadsheet for analysis. Thereafter, the following steps were followed to provide an operational understanding of the research.