Building the Kenyan County Wellness Index


After the new Constitution of Kenya created 47 counties in August 2010, there was a need to rank the counties for purposes of resource allocation. The Commission on Revenue Allocation (CRA) was mandated to construct the revenue-sharing formula but fell short of what was expected. Before allocating resources, it is prudent to first rank counties by development (or wellness), rather than letting the poverty index stand in for a ‘development index’ as the CRA did. As part of Doban Africa, we therefore undertook to construct a composite index that combines development indicators such as the poverty index, electricity connections, population, the percentage of the population that is educated, water supply and the number of families with solar power, among others. The technical details are shown below.

Raw Data


We took a data mining approach and collated data from the Kenyan government open data portal that could serve as development indicators. We then used WEKA (Waikato Environment for Knowledge Analysis) data mining software to sift through the data. Out of 15 input fields, 5 were the strongest predictors of the output. The correlation coefficient measures the degree of correlation between the actual values and the values estimated by the model. The algorithm chosen to construct the index is M5 Prime (M5P), which here produces a model that is a linear function: a weighted sum of the input variables plus a constant. The first step grows a regression tree from the training data and then fits a linear model (using linear regression) at each node of the tree. The second step simplifies the tree by dropping attributes from the linear models, and pruning nodes, wherever doing so does not increase the estimated error.


Index = -0.9459 * Elec + 0.3537 * Solar - 0.0171 * Pop Den - 0.2441 * Pri Ed - 0.2653 * Infra + 0.1172 * Ed + 76.6523
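To make the formula concrete, here is a minimal Python sketch that evaluates it for one county. The coefficients and the constant are copied from the model above. The dictionary keys, the function name `county_index` and the sample input values are assumptions made for illustration only; the post does not state the units of each variable, and the sample values are not real county data.

```python
# Coefficients copied from the model above. The keys are hypothetical
# short names for the six variables (Elec, Solar, Pop Den, Pri Ed, Infra, Ed).
COEFFICIENTS = {
    "elec": -0.9459,
    "solar": 0.3537,
    "pop_den": -0.0171,
    "pri_ed": -0.2441,
    "infra": -0.2653,
    "ed": 0.1172,
}
INTERCEPT = 76.6523


def county_index(values):
    """Return the weighted sum of the six inputs plus the constant term."""
    return INTERCEPT + sum(
        weight * values[name] for name, weight in COEFFICIENTS.items()
    )


# Illustrative inputs only, not figures for any actual county.
sample = {
    "elec": 20.0,
    "solar": 5.0,
    "pop_den": 100.0,
    "pri_ed": 60.0,
    "infra": 10.0,
    "ed": 15.0,
}
print(round(county_index(sample), 4))
```

Keeping the weights in one dictionary means the formula can be updated in a single place if the model is retrained.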



Correlation coefficient                  0.8645
Mean absolute error                      7.0004
Root mean squared error                  9.0576
Relative absolute error                 48.8543 %
Root relative squared error             50.6268 %
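For readers unfamiliar with these five figures, the sketch below computes them from a list of actual values and a list of model estimates, using the standard textbook definitions. The function name `regression_metrics` and the result keys are assumptions for this sketch. The relative errors here compare the model against always predicting the mean of the actual values; WEKA's own calculation may differ in detail (for example in which mean it uses as the baseline), so treat this as an explanation of the measures rather than a reproduction of the numbers above.

```python
import math


def regression_metrics(actual, predicted):
    """Compute five common regression error measures (textbook definitions)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n

    # Total absolute and squared error of the model.
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))

    # The same totals for a baseline that always predicts the mean.
    base_abs = sum(abs(a - mean_a) for a in actual)
    base_sq = sum((a - mean_a) ** 2 for a in actual)

    # Pearson correlation between actual and predicted values.
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)

    return {
        "correlation": cov / math.sqrt(base_sq * var_p),
        "mae": abs_err / n,
        "rmse": math.sqrt(sq_err / n),
        "rae_pct": 100 * abs_err / base_abs,
        "rrse_pct": 100 * math.sqrt(sq_err / base_sq),
    }
```

A relative absolute error near 49 % therefore means the model's total absolute error is roughly half that of the mean-only baseline.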


Feeding the latest figures for these variables into the model produces the following county rankings.

  1. Nairobi
  2. Mombasa
  3. Kiambu
  4. Kajiado
  5. Nakuru
  6. Uasin Gishu
  7. Nyeri
  8. Kirinyaga
  9. Embu
  10. Kilifi
  11. Machakos
  12. Lamu
  13. Taita Taveta
  14. Laikipia
  15. Muranga
  16. Kisumu
  17. Meru
  18. Kericho
  19. Isiolo
  20. Nyandarua
  21. Trans Nzoia
  22. Garissa
  23. Vihiga
  24. Kisii
  25. Kwale
  26. Tharaka Nithi
  27. Narok
  28. Nyamira
  29. Migori
  30. Kakamega
  31. Busia
  32. Bungoma
  33. Makueni
  34. Bomet
  35. Nandi
  36. Elgeyo Marakwet
  37. Kitui
  38. Siaya
  39. Baringo
  40. Homabay
  41. Wajir
  42. Tana River
  43. Marsabit
  44. West Pokot
  45. Samburu
  46. Mandera
  47. Turkana
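Turning index values into a ranking like the one above is a simple sort. The sketch below assumes the index has already been computed for each county. The post does not say whether a lower or a higher index value ranks first, so the direction is left as a parameter; the default of lower-is-better is only a guess, based on the negative electricity coefficient and Nairobi's position at the top. The function name `rank_counties` and the placeholder names and scores are hypothetical.

```python
def rank_counties(index_by_county, lower_is_better=True):
    """Return county names ordered from best to worst by index value."""
    ordered = sorted(
        index_by_county.items(),
        key=lambda item: item[1],
        reverse=not lower_is_better,
    )
    return [name for name, _ in ordered]


# Placeholder names and scores, not real results.
scores = {"County A": 41.2, "County B": 18.7, "County C": 63.5}
for position, name in enumerate(rank_counties(scores), start=1):
    print(f"{position}. {name}")
```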


Do you feel it is a better indicator?

Addendum: 25-09-2015



Comments

  1. This is the type of stuff I like. Not aware of M5P, but would you get the same results by running a principal component analysis in STATA or something?
