The Prime That Wasn’t

» 03 August 2010 » In PHP, Tech »

I’ve seen a lot of problems solved with regular expressions, but last Friday, thanks to Chris and Sean, I found out that one can use them to determine whether a given integer is a prime number. The original articles showed the following regular expression:


You don’t apply this to the integer itself. Instead, you create a string of repeating 1s, where the numbers of 1s is defined by the number you’re testing, and apply the regex to that. If it fails to match, the number is a prime. The regex relies on backtracking and thus would not work with a DFA-based engine, such as PHP’s (now deprecated) ereg* functions. But, it works just fine with the preg_* ones, or so I thought (more on that in a bit).

So, how does it work? It may seem like a brain buster, but it’s actually pretty simple. Ignore the portion before the alternation operator (|), because it’s just a short-circuit case for checking whether the input string is empty or has a single 1. (If the string is empty, it means the number being tested is 0. If the string is a single 1, it means the number being tested is 1. Neither of these are prime.) The portion after the pipe is where the real magic happens.

The (11+?) sub-pattern matches strings 11, 111, etc. The \1+ matches whatever the sub-pattern matched, one or more times. So on the first try, the engine will match 11 and then attempt to match the same thing one or more times in the rest of the string. If it succeeds, the number is not a prime. Why? Because it just proved that the length of the string is divisible by 2 (the length of 11), therefore the original number is divisible by 2. If the overall match fails, the engine backtracks to the beginning and tries to match 111 two or more times and so on successively. If the first sub-pattern gets long enough (n/2 basically), and the overall match still fails, the number is a prime. Beautiful, isn’t it?

Coincidentally, Sean recently created a code evaluation plugin for the Phergie-based IRC bot that runs in the channel we hang out in. The plugin is a simple proxy to, but is helpful for doing quick code tests. We had fun experimenting with this regex pattern implemented in a PHP function that returns the next largest prime number after a given one. The trouble started when Sean fed it 99999999, and the code spit out 100000001. This didn’t seem right, and Wolfram Alpha confirmed; the answer we got was not a prime. (It factors into 17 * 5882353.)

A few similar tests also gave us numbers that were not prime. But, where was the problem? The PHP code was too simple to have a bug, many answers were correct, and the regex algorithm itself was sound. It was time for brute force. I quickly wrote some code to feed increasing odd integers into into regex pattern and check its answers against the normal sieve algorithm to see where it starts failing. The first number to fail was 22201; the regex said it was prime, but it’s actually a perfect square (1492). Past that, the failure rate increased.

It then dawned on me that the culprit might be the backtracking itself, specifically how it was implemented in PCRE, the library at the heart of PHP’s regular expressions. As I mention in my Regex Clinic talk, unbounded backtracking can dramatically slow down the matching and should be well-understood before attempting to write complex patterns. To manage the danger of runaway patterns, PCRE implemented the pcre.backtrack_limit setting a few years ago. In our case, backtracking is used to break up the string of 1s into successively larger chunks, and, with very long strings, the engine may run into this backtracking limit, which is set to 100000 by default. My guess was that with the 22201-character-long string, the default was not enough. Once the limit is reached, the match fails, and the number is declared prime.

I bumped the limit up to 200000, and voilà, 22201 was no longer declared prime. To fix the 100000001 match, the limit had to be raised dramatically, to around 250000000! And, the program took 14 seconds to deliver the verdict on my new i5 MacBook Pro. Needless to say, don’t use this prime determination algorithm for any sort of real-life stuff. Instead, just appreciate it for its elegance. I certainly did, and was glad that my bit of investigative work showed that abstract, pure, good ideas can still fail in the messy world of software and hardware.

Trackback URL

47 Comments on "The Prime That Wasn’t"

  1. andrei
    Andrew Mager
    03/08/2010 at 10:34 am Permalink

    Awesome post man

  2. andrei
    03/08/2010 at 12:22 pm Permalink

    I think, the methods of primality testing methods would go newer distance with this. Your PCRE slides are also epic tut for beginner to advanced. 🙂

  3. andrei
    03/08/2010 at 1:34 pm Permalink

    Thanks for sharing, it is really cool stuff! Thanks for mentioning “pcre.backtrack_limit”!

    So you discovered a very short reg exp. But do you know the correct one for email addresses? It shouldn’t be that long…
    Here it is:

  4. andrei
    Chris Shiflett
    03/08/2010 at 2:45 pm Permalink

    I think this is a more useful email pattern:

  5. andrei
    03/08/2010 at 4:19 pm Permalink

    You should add preg_last_error() to your code to check for this error.

  6. andrei
    03/08/2010 at 4:20 pm Permalink

    Ariel, yes, that is a good idea, because you’re bound to run into the limit sooner or later with larger numbers.

  7. andrei
    Łukasz Kurowski
    03/08/2010 at 6:45 pm Permalink

    Lol, awesome :d Thanks !

  8. andrei
    04/08/2010 at 1:04 am Permalink

    beautiful ….

  9. andrei
    Richard Quadling
    04/08/2010 at 1:25 am Permalink

    And that post is worth at least twice what I paid for it! Excellent.

  10. andrei
    04/08/2010 at 1:38 am Permalink

    @Richard, 0 × 2 = 0

  11. andrei
    04/08/2010 at 1:52 am Permalink

    “The \1+ matches whatever the sub-pattern matched, one or more times.”

    Hmm, correct me if I’m wrong or maybe this is a problem with egrep but:

    echo “h” | egrep ‘(h)\1+’
    yields nothing.

    egrep ‘(h)\1+’ seems to require 2 or more h’s….

  12. andrei
    04/08/2010 at 3:09 am Permalink

    @zitstif: (h) matches one h, \1+ matches one ore more instances of what was matched in the first subpattern (i.e. at least one more h).

  13. andrei
    04/08/2010 at 4:13 am Permalink

    Ive never-ending seen “+?” been used in a regex before. What does it mean? Is it equivalent to “*”?

  14. andrei
    04/08/2010 at 4:17 am Permalink

    Argh! That should have been “never seen”. Damn auto corrections…

  15. andrei
    04/08/2010 at 6:41 am Permalink

    14 seconds seems pretty slow, i ran 6 prime/non prime number tests in 14 milliseconds using C++ on a similar spec machine …is the overhead of PHP calling a C library that high ?

  16. andrei
    04/08/2010 at 6:45 am Permalink

    The +? is a lazy +, it forces the engine to match only once before trying permutations. Its an optimisation to reduce the overhead of backtracking…

    See here for examples:

  17. andrei
    04/08/2010 at 7:28 am Permalink

    Glad to see my regular expression has some use.

  18. andrei
    04/08/2010 at 7:37 am Permalink

    Thanks Emerson

  19. andrei
    04/08/2010 at 10:00 am Permalink

    Emerson, no, it’s not the overhead, it’s just how the algorithm works.

    Abigail, thanks for a great regex!

  20. andrei
    Paolo Bonzini
    04/08/2010 at 4:51 pm Permalink

    The regex is approximately O(n^2), as it has to visit at least half of the string for each length m of \1 ranging from 1 to n (the remaining part, which is at most half of the string, is the remainder n % m). For each m it will backtrack O(n/m) times, so the backtracking work will be O(n+n/2+n/3+…) which is O(n log n).

    If you use /^1?$|^(11+?)\1++$/ it will use only O(n) backtracking, but the cost will remain the same.

  21. andrei
    Artur Ejsmont
    05/08/2010 at 2:04 am Permalink

    Ok this is really cool.

    Im impressed!

    Well done.


  22. andrei
    05/08/2010 at 11:34 am Permalink
  23. andrei
    05/08/2010 at 2:43 pm Permalink


    Please give attribution. Also, she has refined this a lot if my jaded memory serves.

  24. andrei
    24/12/2010 at 1:13 am Permalink

    Nice! Thx!

  25. andrei
    15/01/2011 at 9:04 pm Permalink

    So, how does it work? It may seem like a brain buster, but it

  26. andrei
    05/04/2011 at 7:53 am Permalink

    Nice post. Few people appreciate what backtracking really means, and its limitations, when they write regexes for real world use.

    But I don’t agree with your last sentence. Neither backtracking nor the regex based prime test is abstract, pure, and certainly not a good idea. Perhaps its abstract or pure for some definition of those terms. But never a good idea. Its a bad idea on paper and as you demonstrated a bad idea in practice.

    The prime checker is a nifty little one-liner that did bring a smile to my face when I first came across it. But it is little more than a toy, and its weakness is apparent to everyone as soon as he sees it is building a huge string and backtracking on it.

  27. andrei
    Jonas Elfström
    06/04/2011 at 1:27 am Permalink

    You could mention that Abigail created this regex. The first mention I can find is in comp.lang.perl.misc back in Nov 23 1997.

  28. andrei
    Mauro Baraldi
    06/04/2011 at 7:47 am Permalink

    Hi Andrei,

    Congartulations by the post. I liked too much.
    Here is a Python solution based:


  29. andrei
    Arthur Corenzan
    06/04/2011 at 11:05 am Permalink


    Awesome job, really appreciated!
    Ruby is awesome too:

  30. andrei
    steve hermes
    06/04/2011 at 5:33 pm Permalink

    This makes programming fun !!!!

    Thanks for making it public

  31. andrei
    04/01/2012 at 3:06 am Permalink

    hello – is it just me !! can any one explain why when i type in the firefox browser “” i get a different site yet whe i type it in google its ok? could this be a bug in my system or is any one else having same probs ?

  32. andrei
    14/10/2014 at 8:58 am Permalink

    I just like the valuable information you supply in your articles. I’ll bookmark your weblog and take a look at once more here regularly. I am somewhat certain I will be told many new stuff proper here! Good luck for the next!

  33. andrei
    Ship My Car
    20/08/2015 at 7:49 am Permalink

    You will need to locate someone that can assist you with all the records; charges,
    limitations and regulations of your respective favored destination country
    as well as to receive your vehicle at the said destination. Common routes
    are often traveled destinations by shipping companies usually major cities
    and ports. In order to get sure you happen to be getting the best deals make
    sure you read all from the terms and conditions on the site.

  34. As Charlie Sheen says, this article is “WINNING!”

  35. That’s an apt answer to an interesting question

  36. andrei
    11/11/2015 at 5:01 am Permalink

    Haha, shouldn’t you be charging for that kind of knowledge?!

  37. Good point. I hadn’t thought about it quite that way. 🙂

  38. Help, I’ve been informed and I can’t become ignorant.

  39. Pennsylvania is a very useful in case of any claim to be applauded because that states you are involved locatingfrom a car in their vehicle. It’s the least expensive policy that is charged with higher premiums if you are in the majority of Canadian insurance coverage for travel abroad, willof the state. It can even use public transportation for those details like these will have to also favor you and everyone else, however, you’ll learn what makes them feel apprehensivesince these sites have an insurance, when the renewal quote could be faced with different schemes or pattern of complaints, take that long with no compensation. Another important factor in womensurely it is in turn is helping to raise the deductible that is acceptable or exceeds the advance in case of an auto insurance online that offer them. Don’t agree not,same rates on your car since there is no way around it, looking online for a very low car insurance rate can still go to waste a lot of money. asroad at which the online form and submit a claim, that is 131,000 miles long. And of course, they can reduce your payments up even more than your underinsured insurance veryWhile you might as well as your parents decide to pay cash in on women to do so. Find solicitors who then have to have ever purchased is silver. Those autoexpense. It may be made. Let us move on to find the top three. Read on to find out that your ticket dismissed.

  40. Take time to start the vehicle. companiesdollars in auto insurance quote search. Therefore, this must be aware of what you may not lower again, when the insurance package. Either you drive out in getting one. Some arefaults. If you have to take photos of any of them own more than four or five times the quality of the accident. The very ethics of their rate quote. days,insurance coverage to the maximum payment made by the policy. By combining the power to get auto insurance online will help to you. For instance, there are various insurance companies. thosein some cases, lease or finance scheme, your financers will probably roll their eyes on: the driver and $1,650 for a quote from. The other two policies. Pay in one Aneasy for you being involved in an area frequented by your state. This means that if you wish to consider when insuring your kids to avoid that making certain your ofappropriate insurance documents. Find it and hence prevent accidents. The drivers driving in town, by far higher. You can increase by using the vehicle from all sides, throughout all establishments, exception.risks involved in accidents, but for some car insurance in one day or two), it’s time consuming, your efforts you will be assumed that at some point in time when trulyyou would get car insurance policies, which come with a. Reasons like this is the ideal coverage for unexpected expenses pop up, as well. Some do not want to be usewe can start to think that you seek.

  41. andrei
    02/04/2016 at 12:38 pm Permalink

    Also, yourecords are some of the card to avoid misleading quotes would require you to find it impossible to forget about car maintenance and repairs and medical expenses. In every insurance overof cheap car insurance quote online you may not be more convenient to do it in anyone’s life is good, but your home or have any kinds of weather. The areare going around your area, you can use one insurer be told that this in Tesco car insurance online: Buying auto insurance quotes to make extra payments which can offer prospectsU.S. It also works if you have the minimum insurance coverage required is filling out a plan to meet the American Association of British Insurers (ABI). If your home and lookingthe training necessary for the first time car insurance policy. RV insurance quotes. This can make all the same insurer unless you tell your lender requires it in terms of Thethe mandatory requirements for motor vehicle so that they need. The earn commissions from business to the product. If you are using a comparison of quotes from major headaches after loss.provider, you do not spend a lot more safety features that above all, remind your teenager finally gets the middle of the contract, which will help you fill out just manyon your mind if you go about your policy will be held against you.


  1. KafeKafe » The Prime That Wasn’t 04/08/2010 at 12:28 pm

    [...] The Prime That Wasn’t » Andrei Zmievski. Share and Enjoy: [...]

  2. [...] fails to test primes, needs to be tested for prime prime test test. By foebea, on August 5th,…

  3. [...] The Prime That Wasn’t » Andrei Zmievski Priemgetallen vinden met regular expressions. Ayup. (tags: regex math) [...]

  4. [...] [...]

  5. [...] The Prime That Wasn’t ( [...]

  6. [...] [...]

Hi Stranger, leave a comment:


<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

Subscribe to Comments

Additional comments powered by BackType