Correlation Analysis and Correlation Matrix

In almost all studies performed in industrialized countries, the residential water demand function is specified as a single equation linking tap water use (the dependent variable) to water price and a vector of demand shifters (household socio-economic characteristics, housing features, climatic variables, etc.) that control for heterogeneity of preferences and other variables affecting water demand (Agthe and Billings, 1987). Most of the models employed in residential water demand studies, in both developed and developing countries, are regression models. They typically take the form Q = f(P, Z), where P is the price variable and Z is a set of demand shifters such as income, household demographics and other characteristics such as weather variables (Arbuès et al., 2004). The current chapter presents the proposed methodology for assessing the relationships between WCP and the other parameters using statistical techniques. Statistics starts from the idea that, given a large set of data, the aim is to analyze that set in terms of the relationships between its individual points.
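As an illustration of the general form Q = f(P, Z), the sketch below fits one possible specification by ordinary least squares. The log-log form, the single shifter (income), the coefficient values and all data are assumptions made for this example only; they are not taken from the cited studies or from this thesis. The observations are generated without noise from the assumed coefficients, so the fit should simply recover them.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: regressors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical coefficients: intercept, price elasticity, income elasticity.
TRUE_BETA = [4.0, -0.3, 0.2]

# Synthetic observations (made up for illustration, not study data).
prices = [1.0, 1.2, 1.5, 1.8, 2.0, 2.5]
incomes = [20.0, 35.0, 25.0, 50.0, 30.0, 45.0]

# Assumed specification: ln Q = b0 + b1 * ln P + b2 * ln Z.
rows = [[1.0, math.log(p), math.log(z)] for p, z in zip(prices, incomes)]
log_q = [sum(b * v for b, v in zip(TRUE_BETA, row)) for row in rows]

beta_hat = fit_ols(rows, log_q)
```

With real survey data the dependent variable would carry noise, and the estimated coefficients would come with standard errors that this sketch does not compute.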

General Methodology

This chapter gives the elementary mathematical background required to understand the statistical analyses that follow. These analyses explore the effect of several parameters on water use. The first part covers the mathematical techniques used in the investigation, explaining why each technique may be used and what its result tells us about the data. The second part gives an overview of the software tools used during the analysis. In more detail, multivariate analysis will be conducted to see whether there is any predictive relationship between water use and the other factors. The results are examined in the next chapter of this thesis. A wide variety of machine learning algorithms exists nowadays.

First, to identify which variables are related and to study the strength of the relationship between these continuous variables, Correlation Analysis is used, followed by a Correlation Matrix, which is used to compute the correlation coefficients between variables. The second step is the ANOVA test; an overview of the use of ANOVA in statistics will be presented. The purpose of this test is to assess any differences in WCP between houses according to each independent variable. In the third step, Factor Analysis (FA) is used, followed by Principal Components Analysis (PCA) in the fourth step. PCA uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. The fifth step is Cluster Analysis (CA), which divides data into homogeneous and distinct groups (clusters). FA, PCA and CA were used because they summarize data so that relationships and patterns can be easily interpreted and understood. The two final steps are Artificial Neural Networks (ANNs) and the Adaptive Neuro-Fuzzy Inference System (ANFIS). These two techniques were used to confirm or reject the results of the previous tests, i.e., to assess the main determinants of household water use. Figure 4.1 presents the methodology proposed in this thesis, which attempts to better understand the relationship between household water use and indoor or outdoor factors.
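The orthogonal transformation behind PCA can be sketched for the simplest case of two variables, where the principal axes follow from a single rotation angle. The function name, the variable names and the values below are hypothetical and serve only as an illustration; they are not the software or the data used in this thesis. In practice the variables would usually be standardized first, which this sketch omits.

```python
import math


def covariance(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def pca_2d(x, y):
    """Rotate two centred variables onto their principal axes.

    Returns the scores on the first and second principal components.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sxx, syy, sxy = covariance(x, x), covariance(y, y), covariance(x, y)
    # Angle of the axis of largest variance of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    # Orthogonal (rotation) transformation of the centred data.
    pc1 = [c * a + s * b for a, b in zip(dx, dy)]
    pc2 = [-s * a + c * b for a, b in zip(dx, dy)]
    return pc1, pc2


# Made-up, positively correlated household variables (illustration only).
occupants = [1.0, 2.0, 2.0, 3.0, 4.0, 5.0]
garden_area = [10.0, 40.0, 25.0, 60.0, 55.0, 90.0]

pc1, pc2 = pca_2d(occupants, garden_area)
```

The two returned score vectors have zero sample covariance, the first carries the larger variance, and the total variance of the original variables is preserved by the rotation.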

Correlation Analysis and Correlation Matrix

Correlation analysis is the statistical technique used to study the closeness of the relationship between two or more variables. Two variables are correlated when a movement in one is accompanied by a movement in the other. The purpose of such analysis is to find out whether a change in the independent variable is accompanied by a change in the dependent variable. Hence, correlation analysis expresses the degree of relationship between these variables in a single figure. Four types of correlation are commonly measured in statistics: Pearson correlation, Kendall rank correlation, Spearman correlation and point-biserial correlation. This research focuses only on the Pearson correlation because it is the most widely used. The Pearson correlation is a mathematical method in which a numerical expression is used to calculate the degree and direction of the relationship between linearly related variables. The coefficient is popularly known as the Pearsonian coefficient of correlation and is denoted by “r”. If the relationship between two variables X and Y is to be ascertained, then the following formula is used.
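The standard textbook definition of the coefficient for n paired observations (x_i, y_i) with means x̄ and ȳ is r = Σ(x_i − x̄)(y_i − ȳ) / sqrt(Σ(x_i − x̄)² · Σ(y_i − ȳ)²), which always lies between −1 and +1. A minimal sketch of this definition, and of the correlation matrix built from it, is given below. The function names, the variable names and the values are hypothetical and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    cross = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cross / math.sqrt(ss_x * ss_y)


def correlation_matrix(columns):
    """Pairwise Pearson coefficients for a dict of name -> list of values."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}


# Illustrative, made-up values (not data from the study).
data = {
    "water_use": [110.0, 135.0, 150.0, 170.0, 205.0],
    "household_size": [2.0, 3.0, 3.0, 4.0, 5.0],
    "price": [1.9, 1.7, 1.6, 1.4, 1.2],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, [round(value, 3) for value in row.values()])
```

The matrix is symmetric with ones on the diagonal; the sign of each off-diagonal entry gives the direction of the linear relationship and its magnitude gives the strength.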
