Breaking Safaricom Scratch Card Code


Almost two years ago, Safaricom Ltd extended the scratch card code from 12 digits to 16 in order to increase the computational time required to break the code, thereby making the cards more secure. However, systems theory holds that systems expose their weaknesses at points of change. I set out to find out whether the move to higher dimensionality introduced a weakness in the scratch card's hidden reload number. To begin the analysis, I formulated the following assumptions to guide the process.

  • The grouping of the hidden reload number into four digits does not reveal the mechanics of the number generator.
  • Increasing the hidden reload number by four digits provides more data for statistical analysis.
  • The hidden reload numbers are separated into groups of four digits only for the purposes of ease of reading.
  • The hidden reload number represents a 16 digit number generated by a random number generator.

With the assumptions in place, I set out to curate the data set; my collection of 448 scratch cards came in handy. Since the relevance of each digit depends on its position, I created a data set with 16 variables, each holding the digit at one position, plus an additional column for the sum of the digits, as shown below.

Safaricom
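To make the layout concrete, here is a minimal Python sketch of how one reload number could be turned into such a row. The function name `to_row` and the column names `d1` to `d16` and `digit_sum` are my own illustrative choices, not the headers of the actual data set, and the example number is made up.

```python
def to_row(code: str) -> dict:
    """Split one 16-digit reload number into positional columns plus a digit sum."""
    # Keep only the digits, so the spaces between the groups of four are ignored.
    digits = [int(ch) for ch in code if ch.isdigit()]
    if len(digits) != 16:
        raise ValueError(f"expected 16 digits, got {len(digits)}")
    # Hypothetical column names: d1 is the first digit, d16 the last.
    row = {f"d{i}": d for i, d in enumerate(digits, start=1)}
    row["digit_sum"] = sum(digits)
    return row


# Made-up number for illustration only, not a real card.
example = to_row("1234 5678 9012 3456")
print(example["d3"], example["d6"], example["digit_sum"])
```

Applying the same function to every card gives one row per card, which is the shape a tool such as WEKA or R expects.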

Now, here is where everything gets interesting: plotting the sum of digits produces a near-perfect normal distribution, as shown below. According to the Central Limit Theorem, the sum of n independent and identically distributed random variables tends to be normally distributed as n becomes sufficiently large. In layman's language, the result is what we would expect if the digits are randomly generated, which supports my third and fourth assumptions.

Normal Distribution
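As a rough check on what the Central Limit Theorem predicts here, the sketch below assumes each of the 16 digits is uniform on 0 to 9 and independent of the others. Under that assumption the digit sum should centre on 72 with a standard deviation of about 11.5. The simulation uses Python's own random generator with an arbitrary seed; it is a stand-in, not the card data. A bell-shaped sum is what such digits would produce, though on its own it cannot rule out a dependence between two particular positions.

```python
import random
import statistics

N_DIGITS = 16

# Mean and variance of a single digit drawn uniformly from 0..9 (an assumption).
DIGIT_MEAN = sum(range(10)) / 10
DIGIT_VAR = sum((d - DIGIT_MEAN) ** 2 for d in range(10)) / 10

# For independent digits, means and variances simply add up.
expected_mean = N_DIGITS * DIGIT_MEAN
expected_sd = (N_DIGITS * DIGIT_VAR) ** 0.5


def simulate_sums(n_cards: int, seed: int = 0) -> list:
    """Digit sums of n_cards simulated 16-digit numbers."""
    rng = random.Random(seed)
    return [sum(rng.randrange(10) for _ in range(N_DIGITS)) for _ in range(n_cards)]


sums = simulate_sums(448)  # same sample size as the card collection
print("theory:    ", expected_mean, round(expected_sd, 2))
print("simulation:", round(statistics.mean(sums), 2), round(statistics.stdev(sums), 2))
```

Comparing the mean and spread of the real `digit_sum` column against these two theoretical values is a quick sanity check before reading too much into the shape of the histogram.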

Next, I asked: what if there is a pair of digits that is linearly or otherwise dependent? So I set my favorite software, WEKA, to find any rules within the data, a process known as association mining. Running the Apriori algorithm with default settings produced the results shown below:

WEKA Results
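For readers without WEKA, the idea behind a rule with confidence 1 can be sketched with a brute-force check in Python. This is not the Apriori algorithm, only a simplified pairwise scan: for every ordered pair of positions it asks whether the digit at the first position always determines the digit at the second. The rows below are synthetic stand-in data with one dependency planted on purpose, so the names and numbers are assumptions for illustration.

```python
import random
from itertools import permutations


def deterministic_pairs(rows, n_positions=16):
    """Return (i, j, mapping) for each ordered pair of 1-based positions where
    the digit at position i always determines the digit at position j."""
    found = []
    for i, j in permutations(range(n_positions), 2):
        mapping = {}
        for digits in rows:
            # setdefault stores the first y seen for this x and returns the stored value.
            if mapping.setdefault(digits[i], digits[j]) != digits[j]:
                break  # the same x led to two different y values, so no rule
        else:
            found.append((i + 1, j + 1, mapping))
    return found


# Synthetic stand-in for the card data: random digits with one planted dependency.
rng = random.Random(1)
rows = []
for _ in range(448):
    digits = [rng.randrange(10) for _ in range(16)]
    digits[5] = (digits[2] - 1) % 10  # planted: digit 6 depends on digit 3
    rows.append(digits)

for i, j, mapping in deterministic_pairs(rows):
    print(f"digit {i} -> digit {j}:", dict(sorted(mapping.items())))
```

With a few hundred rows, a pair of unrelated positions is overwhelmingly unlikely to pass this check by chance, which is why a confidence of 1 on real data is worth a closer look.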

From the results I knew I was onto something: there is a relation between the third and sixth digits with a confidence of 1 (meaning the rule always holds in the data set). To better understand the relation, I loaded the data set into the R statistical analysis software and used the plot() function to visually inspect the relation between the two variables. The diagram below made me go Bazinga! It is a linear equation.

Three vs Six

If X > 0, Y = X - 1; otherwise Y = 9. Simply put, if the third digit on the scratch card is greater than 0, then the sixth digit is the third digit minus one, but if the third digit is 0, then the sixth digit is 9. Pick up any card and test the formula. In a cryptanalytic sense, I’ve broken part of the code used to generate the hidden reload number of Safaricom scratch cards.

scratch card
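The rule can be written compactly as Y = (X - 1) mod 10, which covers the wrap-around from 0 to 9 in one expression. Here is a small Python sketch for testing it against a card; the function names are my own and the two numbers are made up, not real cards.

```python
def predicted_sixth(third: int) -> int:
    """Sixth digit predicted from the third: one less, wrapping 0 around to 9."""
    return (third - 1) % 10


def rule_holds(code: str) -> bool:
    """Check the third/sixth digit rule on one 16-digit reload number."""
    digits = [int(ch) for ch in code if ch.isdigit()]  # ignore the spaces
    if len(digits) != 16:
        raise ValueError(f"expected 16 digits, got {len(digits)}")
    return digits[5] == predicted_sixth(digits[2])


# Made-up numbers for illustration only.
print(rule_holds("1204 5978 9012 3456"))  # third digit 0, sixth digit 9
print(rule_holds("1234 5678 9012 3456"))  # third digit 3, sixth digit 6
```

Running `rule_holds` over every row of the data set and counting the failures is a direct way to confirm the confidence reported by WEKA.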

Download the dataset here https://www.dropbox.com/s/dvkpoq35u9bmy2t/Hidden.csv?dl=0


No Comments

  1. Pingback: Breaking Safaricom Scratch Card Code | RAZOR THE INFORMATICIAN

  2. for the first time ever i’ve seen someone who went to school and made full use of the knowledge he acquired there,
    the analysis was great and precise

  3. hey dude…there is something called a check digit. i hear they are related in some way, and that digit is generated in order to form the final series of a scratch card..

    • That’s a good idea. From my understanding, check digits are only used in digital communication to verify that the bits sent are the bits received. I think it is worth exploring whether they are in use in cryptography.

      • Not just in communication but check digits are used to check that a generated number is valid, e.g a credit/debit card number (see Luhn Algorithm). However, in this particular case it’s only useful if you know the algorithm used to generate the numbers.

    • You have to deploy data mining skills. Load the data set into statistical analysis software such as R, WEKA, Matlab, Excel, etc. and start playing around with formulas and algorithms.

  4. Can’t even remember when I signed up here, seeing as my email ad is here, but I suppose a geek will be a geek. No offense.

    That said, you’re quite the hack. How about we make you rich to boot? Herewith: crack the whole nine yards of the 16 digit code then take it to safaricom-they’ll buy your loyalty with a perky job for sure. Decline and opt for a consultancy-money’s better, plus no tie-downs. Register a firm to that end now, if you haven’t already.

    If they don’t listen, look to the West via the net-don’t bother with the local arms of IT multinationals, for their staff will not let you outshine them to their bosses.

    You can’t beat getting paid to work your love.

    • Thank you for the suggestions, I suppose you registered on the blog a while ago and perhaps forgot about it.

      That said, when I get to crack the 16 digits I’ll apply for a Nobel prize instead of paying Safaricom a visit. The underlying cryptic code is in use by numerous multinationals, and its prize is worth more than what Safaricom can offer.

      • Yeah, by all means break the whole 16 digits code but watch out, they kill pipo, the majority of very bright pipo are actually dead, good brains kills (or rather gets u killed). The fellow who wrote the 1st Mpesa software was killed soon afta it caught on

  5. Very impressive. very meticulously done and seeing as I am allergic to statistics this is a nice read. Blog subscribed pap.

    I am interested to know who manufactures the cards for safaricom. is it an inhouse thing or contracted?

  6. You know what, i’ll take your work a step further. I have been working on fraud detection algorithms especially for mobile banking. I think I can play devil’s advocate and see how well to apply Benford’s Law to try and catch “un-natural numbers” in the sequence. I have a python implementation now in it’s beta phase.

  7. I do not know the A from Z on what you guys are speaking here about numbers, word is my thing. Let me tell the story when it happens, and this is amazing. Keep it up, and the best of luck.

  8. Interesting!

    Away from internal relationships between the digits, what if there is a correlation between the value of the scratchcard and one/a set of digits?

  9. Interesting, though the correlation between the value of the scratchcards, serial number and expiry date of the cards should have been included for a deeper analysis.

  10. interesting analysis. good job. suppose you group the data according to the package say 20s, 50s, 100s etc and analyse separately?

  11. Great work dude. I thought of doing a similar thing but when I actually got down to work I become suddenly very lazy or as I justified it ‘very busy’. So props for putting in the effort and time.

  12. interesting. if the numbers are randomly generated then what you need to know is the algorithm of the random number generator and the seed number. Random by definition implies ‘lacking any pattern’. I would thus be surprised if you found any meaningful or predictive pattern. But keep at it

  13. impresive work buddy, i like it honestly, its a sincere application of statistical knowledge and modelling outside school….i am glad you like many apreciate that R gui is the best statistical package available

    • Thank you. After fiddling with various statistical software, R provided the best alternative given its programmable nature. I have however adopted a work style of using various software in one project, since each has its own strengths and weaknesses.

  14. Fascinating! I have some exposure to Monte-Carlo simulation using MATLAB, and applying my ‘skills’ on such a project would be interesting… are you willing to share your database of hidden numbers?

    • Thank you. I have provided a link at the bottom of the post where you can download the dataset. If you need additional data, drop your e-mail on the Get in Touch section and I’ll send it over.

  15. That was an awesome read man. Well, i am doing my final year in my electrical & electronics engineering and i must say it’s so sad that, very few graduate engineers opt to develop their passion or even apply their skill to get the real meaning of the underlying facts. What i am getting from the above excerpt is that you chose to be an hobbyist (allow me to use this term lightly) in statistical works. This is quite exceptional and need i say i’m impressed!! I think people need to put their minds and skills to task more intensively, no offence!

    • Thank you Mwongera, it is always a personal choice what to pursue, either as a hobby or a career. I hope you make a great engineer and contribute to open knowledge.

  16. Don’t think you’re right:

    Frequency Distribution:
    1st number: 43 45 46 38 43 41 55 39 57 40
    2nd number: 44 37 48 53 38 33 56 52 46 40
    3rd number: 44 54 53 47 39 36 43 39 53 39
    4th number: 44 47 54 47 62 44 48 33 33 35
    5th number: 60 50 38 31 42 50 44 36 45 51
    and so on…

    you will notice it’s approximately equal NOT unequal the way you’ve presented in pairs. You are fitting data to a hypothesis… You have no evidence that pairs works. It could be tuples or 4-groups. I’d need a reason to believe your grouping is based on anything but a hunch.

    Oh, and also, the space you’re “cracking” is 10^16 = 10,000,000,000,000,000. Given the size of the space the probability of collision on a single digit is: .0000000000000001. So it’s likely they’re just generating randomly as it’s cheap to do so and the probability of error is low. You are also using the parity check codes as part of your distribution. Given it’s functionally dependent on the other numbers this is not right.

    • Hi Slim,

      Perhaps you misunderstood the blog. The frequency distribution wasn’t done on individual digits but on the sum of the digits, which appears as the last column in the attached image.

      You can test the pair with any scratch card you pick up. I formulated a hypothesis and went ahead to test it, and it turned out to be true.

      From the onset of the blog post I mentioned that my intention is to find a weakness in the system, not to break the whole state space.

  17. Huh! I like your reasoning, but I bet you could have given more statistical details, like margin of error for this hypothesis…
    I want to try the same thing in SAS and see the output. Though I have the basics of R, I don’t prefer it for data manipulation & management, but I acknowledge its power in graphical presentation over SAS, and that it is open source whereas SAS is damn expensive.

    • Hi Ba,

      I abstracted the details so as not to bog everyone down with statistical jargon. Drop your e-mail on the Get In Touch page and I’ll provide you with more details of the analysis.

      Thank you.

  18. Gerishom Andalia

    That’s amazing data analysis dude! Think you will have trouble justifying your hypothesis when you get to the sixteenth digit. However, you are too generous with your findings though.

    • Thank you. A hypothesis can always be disproved; we use them to guide analysis, and the process can be iterated or changed if the desired results are not obtained.

      They say when you operate from a place of abundance you never mind sharing 😉

  19. Why can’t you employ a number theorist. I’ve heard of one who made safaricom increase their codes from 12 to 16 digits

  20. Seronei Kimutai

    #Blackorwa i am a form four leave, i have try these many times only to succeed in one to swap some amount hunging like 100000 and am on a move to do more am to analyse 1000 used cards!!

  21. Well done BlackOrwa. Could you email me a method of competence to the system. An upgrade measure of your own. In fact, I would be glad to look into this much further as the prospects of it look really lucrative. With the right connections you could be top level management in any of the top four mobile telephony systems.

    • Hi Keith,

      Thank you for the vote of confidence. As you have read on the blog, the analysis focusses on the weakness of the system rather than its strengths. A separate analysis would be required to test that, but I can share what I think about it; drop a message on the contact section and we’ll talk.

      Thanks,

  22. Orwa, I rate you highly; you truly don’t leave things to chance. Stumbled upon Weka last year, pretty neat Machine Learning stuff. Will be working on an optimisation solver for a specific industry this year, hoping I could clue you in as a contractor?

  23. Orwa, this is really interesting….the main question is are the other 14 numbers random or is there a pattern?

  24. hey, nice work. was wondering if you have used Monte Carlo simulation, factoring in the serial numbers of the scratch card and assigning them a variable. you could, say, run a permutation sequence of all the digits and in theory you could generate the next batch of scratch cards in production… anyway just a thought.

  25. hey Orwa,
    i am still in high skul but am coming up with something ,what i would like to know is the software u are using …….please e-mail me if possible

  26. kudos man. I like it. I’ve always read my scratch card and said, there is a huge relationship between the numbers, they literary are not just random! I once took 7digits of a 250 card an my seed in generating a random 16-digit code using r function rnorm() and got 100 codes. piloted a scatter plot and normalized the graph. picked randomly three outliers and two were valid. anyway, that was guts leading me then. I say do not fear a thing, if were not to get in, they should build it better!! never run q(yes)

  27. hi Chris. am doing some statistics and could kindly request for the kind of software you used . am impressed for that commendable statistical analysis you did. I would like to collaborate

  28. jst thinkin if yu add the series of 4 digits and get their difference from the two scratch card i have it generates a parten thou a complex one confirm this from your data

  29. This is amazingly amazing, it’s encouraging to see a bunch of us youths coming up with such facts, the future is bright technologically. Safaricom, keep sleeping just like that and watch me and Blackorwa bring you down. Blackorwa, email me at derrickmageto@rocketmail.com/derryk.em@gmail.com. Thanks in advance

  30. hey Orwa i just figured out another one…,the second digit gives u the fourth digit.when the second digit is less than 5 add five to it but wen its greater than5 subtract five from it.wen the second digit is five definately the fourth digit is 8.test with all scratch cards n u will proof mi right.

  31. hey guys check it out, it’s so easy to know the pattern but it won’t help. the airtime system consists of two systems: the series on scratch cards are usually made offline, then they are loaded to a system after packing. only the loaded airtime works, and the system is FIFO, so after loading you with the airtime it deletes the history. so unless you hack the two systems you are wasting time guys

  32. i dont know if it means sth or its normal_find the average of each card-find the average of all those averages-that average equals that of the average of the averages of each of the digits(eg all the first digits)…i hope someone gets me..though vaguely

  33. Lenny Walter

    Dude your so cool…now i know of the relationship between two digits…what of the others???? I’m tired of scratching the cards

  34. sir,,could you tell me which course should i undertake and in which university …..would like to speak the same language as you…..”weka,,R,,Sas,, blablabla…otherwise it won’t be amiss saying .u r great nigga..

  35. cool kid. there is only one question. when i look at the equation i wonder how if you reverse the equation given a random digit, the result is always fascinating. so why not split the numbers and see the magic. Safaricom wishes we assume the number is one but actually there is more than one generator whose a logarithm is quit simple.

  36. Pingback: Airtel Scratch Card Analysis | BLACKORWA

  37. PAUL MUITI Jnr

    Am nearly gettin 3 other digits and i guess it won’t take me 3 yrs before i break the codes…You can join me into this new orgnization am creating (M.U.I.T.I) Members United In Technology Innovation…follow me in fb #aul Muiti,whatsapp +254701729321,Email paulmuiti9@gmail.com and u will learn much more.

  38. I really like it and have a great interest in computer security, and ethical hacking, but please guys don’t take this for granted. Do it only for educational purpose, not anything else.

  39. u really challenged me and i went on to find other more relationship between the digits and guess what am almost there…i jus need a couple of hours to reveal the whole thing!

  40. That was big mental exploration….am an undergraduate in applied statistics and by at most 2years I will make this my main case study.

  41. hello….I think I got it….after several months of analysis.I finaly found out the alogarithm behind generating the codes.

  42. Hello guys, I am currently doing my postgraduate studies in Applied Statistics and have recently been wondering about the whole concept behind the randomization of the numbers used in credit cards as well as scratch cards. It is very awesome to find other people such as myself who are trying to break the code. I implore you to continue analyzing the whole concept, and as I join in the research may we work towards success.

  43. Pingback: INTERVIEW:FINDING DEEP STRUCTURES IN DATA WITH CHRIS ORWA - Data Science Africa

  44. Ok Thank you..But i tried once and i could not make so i dont know why but may be i didnt get the theory well..May be you can explain to me Abit sir if you are willing to..i mean the third and sixth theory or may be another theory if there is

  45. Rodney sitienei

    I have now discovered the sequence of the first eight digits of the scratch card.
    step 1
    take cards that are attached and of the same denomination.
    step 2
    compare digits 1,3 &6
    step 3
    compare digits 2 , 4 & 7
    step 4
    compare digits 5&8

  46. Since you posted this, they have acquisitioned an algorithm from a German Company and other safeguards such as regional 16- digit codes, minimising the number of attempts and only activating the cards once it has reached the destination. Attempting to bruteforce their security network is hard man. The Chinese tried and were caught. The data set is invalid at the moment.

  47. Pingback: How To Generate Scratch Card Numbers | Information

  48. Haha Crazy Geeks in Kenya and yet they say Africans have no knowledge people like you should stand up and show what you can do… A flying car should originate from Kenya I see we have the right people.

  49. Just join the channel telegram.me/androidbrothers
    And share our techniques I know how to make safaricom promotional bundles and I will share the trick there join it for more

  50. I think u nid to understand a formula used those two numbers are just to make u lose ur concentration coz the xmcode is very simple

      • 1. Discovered that the first digit is never equal to zero. rand()%9 gives 0 to 8, and 1+rand()%9 gives 1 to 9; therefore the first digit is 1+rand()%9 so that the answer is never zero.
        2. Third digit is always = 0-8, and formula is rand()%9 , for the difference between third and sixth has to be 1 or 9 . In other words,Probability of occurrence of 9 in the formula is 0. Mathematically, in examples;
        a. 360/9=40 remainder 0.
        b. 361/9=40 remainder 1.
        c.362/9=40 remainder 2 ……………….369/9=41 remainder 0,
        …….remainder 9 does not occur in any case.
        and therefore,formula for sixth digit has to be 1+rand()%9 for probability of 9 to be more than not occurring in the sixth digit……………..more info later..

  51. Hello I have ever dreamt of dealing with such people like you please add me to your watsapp group 0797826775
