In this section we explore the concept of correlation (especially using Pearson’s correlation coefficient) and how to perform one-sample and two-sample hypothesis testing, especially to determine whether the correlation in a population is zero (which, for bivariate normally distributed data, means the variables are independent) or whether the correlations in two populations are equal. We briefly explore alternative measures of correlation, namely Spearman’s rho and Kendall’s tau, as well as the relationship between the t-test and chi-square test for independence and the correlation between dichotomous variables.

Topics:

- Basic Concepts
- Scatter Diagrams
- One Sample Hypothesis Testing
- Two Sample Hypothesis Testing
- Multiple Correlation
- Spearman’s Rank Correlation
- Kendall’s Tau Correlation
- Relationship with t-test
- Relationship with Chi-square Test for Independence
- Resampling for Correlation Testing
- Real Statistics Correlation Data Analysis Tool

hello sir,

i really hope u will help me with this problem

i have 19 questions that use likert scale 1-4 (1 never, 2 rarely, 3sometime,4 always)

between this 19 questions i only choose 6 questions that i can say positive (e.g question 1: do you use seat belt?) to indicate positive practice in driving so do the rest of the question. Moreover this questionnaire doesn’t have total score.

so now, how can i analyze this data?

my research question is

1) there is a significant difference in good practice between genders

2) there is a significant difference in good practice between years of driving (1: 1-2 years, 2: 3-4 years, 3: 5-6 years, 4: 7 years and above)

Farah,

For the first research question, I understand that you want to determine whether there is a significant difference between the good practice scores for males and females. The typical test used in this case is the two-sample t test for independent samples, or the Mann-Whitney test if the data is not normally distributed. See the webpages http://www.real-statistics.com/students-t-distribution/two-sample-t-test-equal-variances/, http://www.real-statistics.com/students-t-distribution/two-sample-t-test-uequal-variances/ and http://www.real-statistics.com/non-parametric-tests/mann-whitney-test/.

This test can also be accomplished using the correlation coefficient as described in the webpage http://www.real-statistics.com/correlation/dichotomous-variables-t-test/
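As a rough illustration of that equivalence, here is a minimal Python sketch. The scores are made-up numbers, not data from the questionnaire above, and the helper names `pearson_r` and `pooled_t` are hypothetical. It codes gender as 0/1, computes the correlation with the scores, and checks that t = r·sqrt(df / (1 − r²)) with df = n − 2 matches the equal-variance two-sample t statistic.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pooled_t(g1, g2):
    """Two-sample t statistic assuming equal variances (pooled variance)."""
    n1, n2 = len(g1), len(g2)
    m1, m2 = sum(g1) / n1, sum(g2) / n2
    ss1 = sum((v - m1) ** 2 for v in g1)
    ss2 = sum((v - m2) ** 2 for v in g2)
    sp2 = (ss1 + ss2) / (n1 + n2 - 2)          # pooled variance
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

# Hypothetical good-practice scores, for illustration only
males = [14, 17, 15, 19, 16, 18]
females = [20, 18, 21, 17, 22, 19, 23]

scores = males + females
codes = [1] * len(males) + [0] * len(females)  # male = 1, female = 0

r = pearson_r(codes, scores)
df = len(scores) - 2
t_from_r = r * math.sqrt(df / (1 - r * r))     # t statistic derived from r
t_direct = pooled_t(males, females)            # t statistic computed directly

print(r, t_from_r, t_direct)
```

The two t values agree up to rounding, so testing whether the correlation is zero gives the same conclusion as the equal-variance t test. This does not carry over to the unequal-variance version of the t test.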

For the second question you could use one-way ANOVA or chi-square testing of independence. See the webpage http://www.real-statistics.com/chi-square-and-f-distributions/independence-testing/ for information about independence testing.

This test can also be accomplished using the correlation coefficient as described in the webpage http://www.real-statistics.com/correlation/dichotomous-variables-chi-square-independence-testing/.

Charles

Thank you very much…

You help me alot…:)

Hi! I really need your help.

I want to know the appropriate statistical analysis used to test my hypotheses.

Here are my hypotheses:

Ho1: There is no significant relationship (independent) between business profile of the SMEs, to the level of awareness on climate change and related business risks.

Ha1: There is a significant relationship (dependent) business profile of the SMEs, and the level of awareness on climate change and related business risks.

Ho2: There is no significant relationship (independent) between the level of awareness on climate change and the related business risk, and the adaptive measures employed by the SMEs.

Ha2: There is a significant relationship (dependent) between the level of awareness on climate change and the related business risk, and the adaptive measures employed by the SMEs.

The content of my questionnaire

I. Business profiles composed of:

Type of Ownership: Sole Proprietorship, Partnership, Corporation

Number of Years operating: 0-10 years, 11-20 years, 21-30 years, 30 years above

Number of employees: 0-10 employees, 11- 50 employees, 51- 250 employees

Initial Capitalization: 0-3,000,000, 3,000,001-15,000,000, 15,000,001-100,000,000

II. Level of Awareness about Climate Change

10 questions answerable by Aware (Rating scale: 1) and Unaware (Rating scale:0)

III.Level of Awareness about Business Risk associated with Climate Change

A total of 21 questions (7 risk: financial, logistics, legal and regulatory, market, people, operational and physical….3 questions each risk)

And still answerable by Aware (Rating scale: 1) and Unaware (Rating scale:0)

IV. Adaptive measure

A total of 21 statements—>adaptive measures (7 aspects: financial, logistics, legal and regulatory, market, people, operational and physical….3 statements each aspect)

Answerable by Adapted (Rating scale: 1) and Not Adapted (Rating scale:0)

Please help me. Thank you.

Based on a very quick and preliminary review of what you wrote, my first thought is to use MANOVA. The business profile is the independent variable, and the level of awareness and business risk are the dependent variables.

Charles

Thank you, Sir.

Hello,

I really hope you can help me solve this problem,

I have calculated the correlation of returns between 10 stock sectors using Excel.

As a result, the correlation between the Manufacturing and Miscellaneous sectors is around 87%. I want to create a range of correlations between 75% and 95% and see how it affects the sectors’ mean and standard deviation. Can I use a data table for that? Can you explain how to create the data table?

Please help me.

Thank You

Hi Natalie,

I need more information before I can answer your question.

Charles

Hello Charles,

I’m having a problem analyzing my data. We polled 5 experts and asked them to rank 6 tests for 82 scenarios. They were asked to rank the tests in order of how likely they were to use that test given a particular scenario. My issue is that one expert gave one test the same rank across all scenarios. When using the correlation function from Excel’s data analysis package, this “constant” gives a #DIV/0! error. I’m trying to see how the experts overall responses correlate. Do they agree for the most part? Is there a different statistical test I can use to find my answer? My statistics skills are not very strong and I’m becoming lost in the details. Any help is greatly appreciated.

Thank you,

Danielle

Danielle,

Yes, the correlation coefficient will be undefined if all the elements in one data set are the same. Generally you can use measures such as Cohen’s kappa, but this too will give disappointing results (kappa will be zero no matter what elements are in the other data set).
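To see why both measures break down, here is a minimal Python sketch with made-up ranks (not Danielle's data); the helper names `pearson_r` and `cohen_kappa` are hypothetical. A constant sample has zero variance, so the denominator of the correlation coefficient is zero. For kappa, the observed agreement equals the agreement expected by chance, so the numerator is zero.

```python
import math
from collections import Counter

def pearson_r(x, y):
    """Pearson's r, or None when either sample is constant (zero variance)."""
    if len(set(x)) == 1 or len(set(y)) == 1:
        return None                      # denominator would be zero
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters; None if chance agreement is 1."""
    n = len(a)
    po = sum(1 for u, v in zip(a, b) if u == v) / n          # observed agreement
    ca, cb = Counter(a), Counter(b)
    pe = sum((ca[k] / n) * (cb[k] / n) for k in set(a) | set(b))  # chance agreement
    if pe == 1:
        return None
    return (po - pe) / (1 - pe)

# Hypothetical ranks: expert A gives the same rank in every scenario
expert_a = [3, 3, 3, 3, 3, 3]
expert_b = [1, 3, 2, 3, 5, 4]

r = pearson_r(expert_a, expert_b)        # undefined, reported here as None
kappa = cohen_kappa(expert_a, expert_b)  # observed and chance agreement coincide

print(r, kappa)
```

Returning None for the undefined case is only a design choice for this sketch; Excel reports the same situation as #DIV/0!.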

Charles

Charles,

I appreciate your quick reply. Do you have a recommendation for analyzing the data in another manner? Or will the results always be disappointing because all of the elements in one array are the same?

Thank you,

Danielle

Danielle,

I don’t have another recommendation for you. I would guess that all the results will be disappointing because all the elements in one sample are the same.

Charles

Sir pls which method of analysis and statistical tool will i use to analyze “relationship between parental variables and academic achievement of secondary schools”.

Sorry but I would need more information to answer your question.

Charles

Pls i have a problem on split half test reliability, i don’t know how to compute for the “r” in the formula 2r/(1+r)

r is the correlation coefficient between the data in the two halves. Once you split the data in half (into ranges R1 and R2) you can use Excel’s CORREL(R1, R2) function to calculate r. See webpage Split Half Methodology for more details.
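For readers working outside Excel, the same calculation can be sketched in Python. This is only an illustration under stated assumptions: the item scores are made up, the split into odd-numbered and even-numbered items is one common choice among several, and the helper names `pearson_r` and `spearman_brown` are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient (the role CORREL plays in Excel)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_brown(r):
    """Split-half reliability from the half-test correlation: 2r / (1 + r)."""
    return 2 * r / (1 + r)

# Hypothetical data: one row per respondent, one column per questionnaire item
responses = [
    [3, 4, 3, 4, 2, 3],
    [1, 2, 1, 1, 2, 1],
    [4, 4, 3, 4, 4, 3],
    [2, 2, 3, 2, 1, 2],
    [3, 3, 4, 3, 3, 4],
]

# Total score on each half for every respondent (odd vs. even item positions)
half1 = [sum(row[0::2]) for row in responses]
half2 = [sum(row[1::2]) for row in responses]

r = pearson_r(half1, half2)
reliability = spearman_brown(r)

print(r, reliability)
```

Here `half1` and `half2` play the role of the ranges R1 and R2 in the CORREL call.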

Charles

hello…i want to ask a specific method for my case…my objective is to assess relationship between socio demographic of visitors and attitude of visitors…the attitude for visitors used likert scale which from 1 to 5…(1.strongly agree …….5. strongly disagree.) but i do not know how to used my data to do the test…whether i would use correlation or other method…tq

It really depends on what you mean by “assess relationship”. It sounds like you want the correlation coefficient as described on the referenced webpage.

Charles

Sir, How to compute the correlation of gender to level of awareness (poor, average and good)? Do I need to assign female as 1 and male as 2? I have 100 respondents and 86 answered the gender profile and 4 respondents left it blank.

Ezin,

Yes, you could code female as 1 and male as 2.
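The particular numbers used for the two genders do not matter for the size of the correlation. The following Python sketch illustrates this with made-up responses, assuming awareness is coded poor = 1, average = 2, good = 3 (my assumption, not something stated in the question) and that respondents who left gender blank are dropped. The helper name `pearson_r` is hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Assumed coding of the awareness levels
AWARENESS = {"poor": 1, "average": 2, "good": 3}

# Hypothetical (gender, awareness) responses; None marks a blank gender
raw = [
    ("F", "good"), ("M", "poor"), ("F", "average"), (None, "good"),
    ("M", "average"), ("F", "good"), ("M", "poor"), ("F", "average"),
    (None, "poor"), ("M", "good"),
]

# Keep only respondents who answered the gender question
complete = [(g, a) for g, a in raw if g is not None]
awareness = [AWARENESS[a] for _, a in complete]

gender_12 = [1 if g == "F" else 2 for g, _ in complete]   # female = 1, male = 2
gender_01 = [0 if g == "F" else 1 for g, _ in complete]   # female = 0, male = 1

r_12 = pearson_r(gender_12, awareness)
r_01 = pearson_r(gender_01, awareness)

print(len(complete), r_12, r_01)
```

Both codings give the same coefficient; swapping which gender gets the larger code only flips its sign.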

Charles

Hi charles,

Just wanna ask u wht method should i use if my research is about determining the awareness of eclampsia among women Age 21 to 45?

Jamie,

You need to supply more information before I am able to answer your question. In particular, what are you trying to demonstrate?

Charles

Sir,

I would appreciate your help, am carrying out a research on impact of effective OHSMS on work performance. Can I do a correlation analysis on the following data I got from my questionnaire(I used Likert scale) SA(69), A(46), U(6),D(16), SD(3).

Sorry, but I don’t know what OHSMS stands for and you haven’t provided enough detail for me to answer your question.

Charles