calvincewalloh12/DataInternshipCodeChallengeMay2018

 
 

Data Internship Code Challenge

Task 1

If you copy and paste a set of steps more than 3 times, it’s time to write a what?

Write a function that removes the redundancy.
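As a minimal sketch of that idea, the Python snippet below replaces a repeated strip-and-convert step with one function. The name `clean_amount` and the sample values are hypothetical and chosen only for illustration.

```python
def clean_amount(raw):
    """Turn a raw string such as ' 1,200 ' into a float."""
    return float(raw.strip().replace(",", ""))

# Hypothetical raw values; the same cleaning step is reused instead of copy-pasted.
sms_costs = [clean_amount(v) for v in [" 1,200 ", "350", "2,000.50"]]
voice_costs = [clean_amount(v) for v in ["75", " 1,050"]]
ussd_costs = [clean_amount(v) for v in ["10.5 "]]

print(sms_costs, voice_costs, ussd_costs)
```

If the cleaning rule changes later, it only needs to be changed in one place.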

Task 2

Given a dataset on any one of Africa’s Talking products (Voice, SMS, Payments and USSD), discuss the steps you would take to analyse the data to reach a conclusion.

## Step i

Manipulate the data by creating a pivot table in Excel or another statistical package such as Stata or SPSS.

A pivot table helps us filter and sort the data by different variables and calculate the mean and standard deviation of the data.

## Step ii

Check for outliers, trends, correlations and variations. This helps focus the analysis on the original questions and any other objectives.
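The first two steps can also be sketched in code instead of a spreadsheet. The Python example below groups records by one variable, computes the mean and standard deviation per group (the core of a pivot table), and then flags outliers with the common 1.5 × IQR rule. The records, the `pivot` and `iqr_outliers` helpers, and the choice of the IQR rule are all assumptions made for illustration; they are not taken from a real Africa’s Talking dataset.

```python
from collections import defaultdict
from statistics import mean, quantiles, stdev

# Hypothetical SMS records: (country code, delivery time in seconds).
records = [
    ("KE", 2.0), ("KE", 2.5), ("KE", 3.0), ("KE", 2.5), ("KE", 2.0),
    ("UG", 3.0), ("UG", 2.5), ("UG", 3.5), ("UG", 2.0), ("UG", 30.0),
]

def pivot(rows):
    """Group values by key and summarise each group, like a small pivot table."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {
        key: {"count": len(vals), "mean": mean(vals), "stdev": stdev(vals)}
        for key, vals in groups.items()
    }

def iqr_outliers(values):
    """Return values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - spread or v > q3 + spread]

table = pivot(records)
outliers = iqr_outliers([value for _, value in records])

for country, stats in table.items():
    print(country, stats)
print("outliers:", outliers)
```

In this made-up data the single 30-second delivery inflates the mean and standard deviation of its group, which is the kind of effect the outlier check is meant to surface before drawing conclusions.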

## Step iii

Interpret the results by either rejecting or failing to reject the hypothesis. In our interpretation we should ask ourselves: How does the data defend us against objections? How does the data answer the original question? Are there any limitations in our conclusion?
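To make the reject / fail-to-reject decision concrete, here is a minimal Python sketch. It uses an exact permutation test on the difference in group means, which is only one possible test and not necessarily the one intended above. The two samples, the function name `permutation_p_value`, and the 0.05 significance level are assumptions.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(a, b):
    """Two-sided exact permutation test on the difference in means."""
    pooled = a + b
    observed = abs(mean(a) - mean(b))
    extreme = total = 0
    # Try every way of splitting the pooled values into groups of the original sizes.
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point noise in the comparison.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

ALPHA = 0.05  # assumed significance level

# Hypothetical delivery times (seconds) for two routes.
route_a = [2.0, 2.5, 3.0, 2.5, 2.0]
route_b = [4.0, 4.5, 5.0, 4.5, 5.5]

p_value = permutation_p_value(route_a, route_b)
decision = "reject" if p_value < ALPHA else "fail to reject"
print(f"p = {p_value:.4f}: {decision} the hypothesis of equal means")
```

A small p-value only says the difference is unlikely under the null hypothesis; the questions about objections and limitations still have to be answered separately.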

Task 3

Give an example explaining how K-means clustering works.

K-means clustering is an exploratory data analysis technique that partitions a dataset into K clusters, assigning each point to the cluster with the nearest centroid.

Example: apply K-means clustering to the following dataset to form two clusters.

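x: 10, 15, 13, 16

y: 8, 12, 10, 14

This gives the points $P_1 = (10, 8)$, $P_2 = (15, 12)$, $P_3 = (13, 10)$ and $P_4 = (16, 14)$, with $K = 2$.

The Euclidean distance between two points $(x, y)$ and $(a, b)$ is

$$d\big((x, y), (a, b)\big) = \sqrt{(x - a)^2 + (y - b)^2}$$

Take $P_1$ and $P_2$ as the initial centroids of cluster one and cluster two. The distance between them is $\sqrt{(10 - 15)^2 + (8 - 12)^2} = \sqrt{41} \approx 6.403$.

For $P_3 = (13, 10)$:

Euclidean distance from cluster one $= \sqrt{(10 - 13)^2 + (8 - 10)^2} = \sqrt{13} \approx 3.606$

Euclidean distance from cluster two $= \sqrt{(15 - 13)^2 + (12 - 10)^2} = \sqrt{8} \approx 2.828$

$P_3$ is closer to cluster two, so it joins cluster two. The updated centroid of cluster two is $\left(\frac{15 + 13}{2}, \frac{12 + 10}{2}\right) = (14, 11)$.

For $P_4 = (16, 14)$:

Euclidean distance from cluster one $= \sqrt{(16 - 10)^2 + (14 - 8)^2} = \sqrt{72} \approx 8.485$

Euclidean distance from cluster two $= \sqrt{(16 - 14)^2 + (14 - 11)^2} = \sqrt{13} \approx 3.606$

$P_4$ is also closer to cluster two. The final clusters are $C_1 = \{(10, 8)\}$ and $C_2 = \{(15, 12), (13, 10), (16, 14)\}$.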
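The assign-then-update loop behind this example can be sketched in plain Python. This is a minimal version of the standard batch (Lloyd-style) iteration, assuming the first two points are used as the starting centroids; the function name `kmeans` and the `max_iter` parameter are illustrative choices.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain batch K-means; returns (centroids, clusters)."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [math.dist(p, c) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:  # nothing moved, so the loop has converged
            break
        centroids = new_centroids
    return centroids, clusters

points = [(10, 8), (15, 12), (13, 10), (16, 14)]

# Assumed initialisation: the first two points.
centroids, clusters = kmeans(points, centroids=points[:2])
print(clusters)

# A different initialisation can converge to a different grouping.
alt_centroids, alt_clusters = kmeans(points, centroids=[(10, 8), (16, 14)])
print(alt_clusters)
```

Starting from (10, 8) and (15, 12), this sketch leaves (10, 8) on its own and groups the other three points together. Starting from (10, 8) and (16, 14) instead, it pairs (10, 8) with (13, 10) and (15, 12) with (16, 14), which shows how sensitive K-means is to the choice of initial centroids.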

Task 4

Given a gigabyte of weather data, how would you go about calculating the mean temperature of a particular place and plotting a graph to show the change in daily temperature?
Using MS Excel, create a pivot table for the data and filter it to obtain the data for the place of interest. Using the mean function, calculate the mean temperature for that place. Finally, using a pivot chart, plot a bar graph of daily temperature against time.
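A gigabyte of readings may exceed the 1,048,576-row limit of an Excel worksheet, so the same filter, mean and plot steps can also be sketched as a streaming pass in Python. The column names `place`, `date` and `temperature`, the ISO date format, the sample rows and both function names are assumptions about the file layout. The plotting function relies on the third-party matplotlib library and is not called here.

```python
import csv
import io
from collections import defaultdict

def summarise_place(rows, place):
    """Stream CSV rows and return (overall mean, {date: daily mean}) for one place."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(rows):  # reads one row at a time, not the whole file
        if row["place"] != place:
            continue
        sums[row["date"]] += float(row["temperature"])
        counts[row["date"]] += 1
    if not sums:
        return None, {}
    overall = sum(sums.values()) / sum(counts.values())
    daily = {day: sums[day] / counts[day] for day in sorted(sums)}
    return overall, daily

def plot_daily(daily, place):
    """Bar graph of daily mean temperature against time (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it
    days = list(daily)
    plt.bar(days, [daily[d] for d in days])
    plt.xlabel("Date")
    plt.ylabel("Mean temperature")
    plt.title(f"Daily temperature in {place}")
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.show()

# Hypothetical sample standing in for the real file.
sample = io.StringIO(
    "place,date,temperature\n"
    "Nairobi,2018-05-01,18.0\n"
    "Nairobi,2018-05-01,22.0\n"
    "Mombasa,2018-05-01,30.0\n"
    "Nairobi,2018-05-02,19.0\n"
    "Nairobi,2018-05-02,23.0\n"
)
overall_mean, daily_means = summarise_place(sample, "Nairobi")
print(overall_mean, daily_means)
```

For the real data, an open file object can be passed in place of `sample`, for example `open(path, newline="")`, where `path` is whatever the weather file is called.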
