Data Analysis

Breaking Safaricom Scratch Card Code

Almost two years ago, Safaricom Ltd extended the scratch card code from 12 digits to 16 in-order to increase the computational time required to break the code thereby making them more secure. However, system theory acknowledges that systems expose their weaknesses at points of change. I set to find out if the move to higher dimensionality introduced a weakness in the scratch card hidden reload number. To begin the analysis I formulated the following assumptions to guide me in the process.

  • The grouping of the hidden reload number into four digits does not reveal the mechanics of the number generator.
  • Increasing the hidden reload number by a factor of four digits provides more data for statistical analysis.
  • The hidden reload numbers are separated into groups of four digits only for the purposes of ease of reading.
  • The hidden reload number represents a 16 digit number generated by a random number generator.

With the assumptions in place I set to curate the data set, my collection of scratch cards came in handy (448 in number). In understanding each digit has relevance with its position, I created a data set with 16 variables each holding the positional value of  the digits as shown below with an additional column of sum of the digits.


Now, here is where everything get’s interesting, mapping the sum of digits produces a near perfect normal distribution as shown below. According to the Central Limit Theorem, the sum of n independent and identically distributed random variables tend to be normally distributed as n becomes sufficiently large. In layman language, it simply means we have proved that the digits are indeed randomly generated which confirms my third and fourth assumptions.

Normal Distribution

Next, I asked the question, what if within the digits there is a pair that is  linearly or otherwise dependent. So, I set my favorite software WEKA to find any rules within the data in a process known as association mining.  Running the apriori algorithm with default settings produced results shown below:

WEKA Results

From the results I knew I was onto something, there is a relation between the third and sixth digit with a confidence interval of 1 (meaning the rule always works). To better understand the relation I loaded the dataset to R statistical analysis software and used the plot() function to visually inspect the relation between the two variables. The diagram below made me go Bazinga! It is a linear equation.

Three vs Six

If X > 0, Y=X-1, otherwise Y = 9. Simply put, if the third number in the scratch card is greater than 0 then the sixth number is the third number minus one, but if the third number is 0 then the sixth number is 9. Pick up any card and test the formula, in a cryptanalytic sense, I’ve broken part of the code used to generate the hidden reload number of Safaricom scratch cards.

scratch card

Download the dataset here



  1. Just join the channel
    And share our techniques I know how to make safaricom promotional bundles and I will share the trick there join it for more


  2. I think u nid to understand a formula used those two numbers are just to make u lose ur concentration coz the xmcode is very simple


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s