Monday, 21 July 2025

Basic Research Methods and Statistical Data Analysis | Book Publisher International

 

Research is an integral component of scientific enquiry and involves the objective investigation of phenomena. Statistical analyses provide an indispensable tool for the unbiased testing of scientific hypotheses. While qualitative research uses narratives, phenomenology, ethnographies, grounded theory and case studies in social or behavioural studies, quantitative approaches involve designed experiments and statistical analyses of instrument-based, performance, observational or attitude data. Valid statistical analyses rely on probability sampling to ensure random, unbiased collection of the data; common schemes include simple random sampling, systematic sampling, stratified random sampling and cluster sampling. A key concept in statistics is central tendency, reflected by the mean, median and mode of a data set, while the range, variance and standard deviation indicate the spread and variability of the data. The definition and classification of variables in a study is important because it determines the type of data being collected, the statistical models that are appropriate, and the statistical tests to be used.
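As a minimal sketch of these summary measures, Python's standard-library `statistics` module can compute them directly (the score values below are hypothetical, for illustration only):

```python
import statistics

# Hypothetical sample of exam scores (illustrative data only)
scores = [72, 85, 78, 90, 85, 66, 85, 78, 91, 70]

mean = statistics.mean(scores)          # arithmetic average
median = statistics.median(scores)      # middle value of the sorted data
mode = statistics.mode(scores)          # most frequent value
data_range = max(scores) - min(scores)  # spread between extremes
variance = statistics.variance(scores)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(scores)      # sample standard deviation

print(mean, median, mode, data_range)   # 80.0 81.5 85 25
```

Note that `statistics.variance` and `statistics.stdev` use the sample (n − 1) denominator; `pvariance` and `pstdev` would be used for a full population.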

 

Probability theory involves making predictions about the chances of the occurrence of events based on assumptions about the underlying probability process. Probability mass functions describe the possible outcomes and their probabilities for discrete random variables, while probability density functions summarise the information in probability distributions for continuous random variables. The binomial and Poisson distributions are examples of discrete probability distributions, whereas the t-distribution, normal distribution, Chi-square distribution and F-distribution are continuous distributions. Degrees of freedom indicate the number of values in a calculation that are "free to vary", and for a given source factor this is typically one less than the number of levels or categories in that factor. Exploratory data analysis, done before the actual statistical analyses, helps researchers to understand the nature of the data and to choose the best methods to analyse it. Four types of EDA are univariate non-graphical, univariate graphical, multivariate non-graphical and multivariate graphical techniques.
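The binomial and Poisson probability mass functions have simple closed forms, which can be sketched with the Python standard library alone (the coin-toss example is illustrative, not from the book):

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for a Binomial(n, p) random variable:
    C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with rate lam:
    e^(-lam) * lam^k / k!."""
    return exp(-lam) * lam**k / factorial(k)

# e.g. probability of exactly 3 heads in 10 fair coin tosses
print(binomial_pmf(3, 10, 0.5))   # comb(10, 3) / 2**10 = 0.1171875
```

Summing `binomial_pmf` over k = 0..n returns 1, as a probability mass function must; libraries such as SciPy provide the same distributions with many more features.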

 

Non-parametric tests are methods of analysing data that do not require the data to follow a particular distribution. They are generally used when the data do not meet the assumptions required for parametric tests such as the t-test or one-way analysis of variance. The Chi-square goodness-of-fit test evaluates whether observed frequencies are consistent with expected frequencies, the test statistic being approximated by a Chi-square distribution, and is commonly used for categorical data. The Chi-square test of independence determines whether two categorical variables are dependent upon each other or not. The Wilcoxon signed-rank test is used to compare two populations when the assumptions for the t-test do not hold; it can serve as a substitute for the paired t-test and employs both the magnitudes and signs of the differences between pairs of measurements, which are ranked and compared to a fixed value D0. The Kruskal-Wallis test is an extension of the Wilcoxon rank-sum test used to compare more than two populations, and is therefore an alternative to the one-way analysis of variance.
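The goodness-of-fit statistic itself is just the sum of (observed − expected)² / expected over the categories. A minimal sketch, using hypothetical die-roll counts:

```python
def chi_square_statistic(observed, expected):
    """Chi-square goodness-of-fit statistic: sum((O - E)^2 / E)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts: are 60 die rolls consistent with a fair die?
observed = [8, 12, 9, 11, 6, 14]
expected = [10] * 6   # 60 rolls / 6 faces under the fair-die hypothesis
stat = chi_square_statistic(observed, expected)
# compare stat against the critical value of a Chi-square
# distribution with 5 degrees of freedom (6 categories - 1)
print(stat)   # 4.2
```

In practice, a library routine such as `scipy.stats.chisquare` would also return the p-value rather than leaving the table lookup to the reader.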

 

True experiments require random assignment of treatments to subjects, and the tests used assume that the data to be analysed are continuous and follow a normal (Gaussian) distribution. Common designs and analyses in true experiments include the t-test, the completely randomised design, two-way analysis of variance, factorial experiments and split-plot or split-block designs. When a statistical test establishes a significant difference among the treatments, researchers may wish to determine further which treatments differ significantly from the others and which do not. Fisher's Least Significant Difference and Tukey's W procedure are two popular methods for conducting multiple means comparisons.
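As one concrete example of such a test, the pooled two-sample t statistic can be sketched in a few lines (the yield data and treatment names below are hypothetical, and equal variances are assumed):

```python
import statistics

def two_sample_t(sample_a, sample_b):
    """Pooled two-sample t statistic (assumes equal variances)."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = statistics.variance(sample_a), statistics.variance(sample_b)
    # pooled variance weights each sample variance by its degrees of freedom
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = (pooled * (1 / na + 1 / nb)) ** 0.5
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / se

# Hypothetical yields under two randomly assigned treatments
control = [4.8, 5.1, 5.0, 4.7, 5.2]
treated = [5.5, 5.9, 5.4, 5.8, 5.6]
t = two_sample_t(treated, control)
# |t| is then compared with the critical t value for
# na + nb - 2 = 8 degrees of freedom at the chosen significance level
```

The same statistic underlies Fisher's LSD, which applies pairwise t-tests (with a pooled error term) only after the overall analysis of variance is significant.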

 

Regression analysis is done to establish the relationship between a dependent variable and one or more independent variables. Simple linear regression is used when a dependent variable is related to a single independent variable. The least-squares method fits a straight line to the data set by minimising the sum of squares of the errors of prediction. When a straight line does not adequately represent the relationship between the dependent and independent variable, non-linear regression models may be used; these include exponential, power or polynomial equations fitted to the data set. Multiple regression entails using a polynomial model relating a dependent variable to a set of quantitative independent variables.
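For simple linear regression, the least-squares slope and intercept have closed-form solutions, sketched below with hypothetical data chosen to lie exactly on a known line:

```python
def least_squares_fit(x, y):
    """Slope and intercept minimising the sum of squared prediction errors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # slope = sum of cross-deviations / sum of squared x-deviations
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx   # the fitted line passes through (mx, my)
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
slope, intercept = least_squares_fit(x, y)
print(slope, intercept)   # 2.0 1.0
```

With noisy real data the residuals would be non-zero, and polynomial or multiple-regression fits would add further terms to the model rather than change the least-squares principle.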

 

Author(s) Details

Roshan Man Bajracharya
Kathmandu University, Nepal.

Please see the book here:- https://doi.org/10.9734/bpi/mono/978-81-990309-3-0
