24 July 2013

A classical problem is to generate all the primes below a limit \(N\). For example this could be used to efficiently compute the prime decomposition of a given integer. A very simple way to do that is to use an Erathostenes Sieve.

Naive Erathostenes Sieve

The classical version of the algorithm works as follows:

  • Create a boolean array \(p\) of size \(N\) and initialize it to true. \(p_n\) represents whether integer \(n\) is prime.
  • For every integer \(n\) between \(2\) and \(N-1\), if \(p_n\) then set \(p_{mn}\) to false for any integer \(m\) greater than \(1\) and lower than \(\lfloor N/n \rfloor\).

Complexity of this algorithm is \(N/2 + N/3 + ... + N/p\). As the prime density tends to \(\log(N)/N\) where \(N\) goes to infinity, this algorithm has complexity \(O(N\log(\log(N)))\).

A straightforward imperative ocaml implementation of this using the bigarray module is given below. On my ChromeBook, it computes the sum of all primes below \(10^8\) in 3.9 seconds, not to far from the C equivalent which runs in 3.3s on the same machine (ocaml is compiled in native code, C is compiled with gcc -O3).

let erathostene max_n =
  let open Bigarray in
  let p = Array1.create char c_layout max_n in
  for n = 2 to max_n - 1 do Array1.set p n 't' done;
  for n = 2 to max_n - 1 do
    if Array1.get p n = 't' then (
      let m_times_n = ref (n + n) in
      while !m_times_n < max_n do
        Array1.set p !m_times_n 'f';
        m_times_n := !m_times_n + n;
      done)
  done

There are two aspects that we may want to optimize: speed and memory consumption. For memory, we use \(N\) bytes. A straightforward way to improve this is to use an array of bits, and also not to handle even numbers. This way, memory consumption would be down by a factor 16. Let us see how we can improve speed.

Optimized Erathostenes Sieve

As for memory, a simple first optimization consists in not handling even numbers. Another straightforward improvement is that instead of setting all \(p_{mn}\) to false for \(m > 1\), we could do it only for \(m \geq n\) (this is because integers \(mn\) for \(m < n\) have already been set to false in the step corresponding to the lowest prime factor of \(m\)). Using this version to sum primes up to \(10^8\) runs in 1.2s using the following ocaml code, so three times faster than the naive approach. The same code in C runs just below a second, still on the same machine.

let erathostene max_n =
  let open Bigarray in
  let size = (max_n - 3) / 2 in
  let p = Array1.create char c_layout (size + 1) in
  for n = 0 to size do Array1.set p n 't' done;
  for n = 0 to size do
    if Array1.get p n = 't' then (
      let nn = 2 * n + 3 in
      let m_times_n = ref (2 * n * n + 6 * n + 3) in
      while !m_times_n < size do
        Array1.set p !m_times_n 'f';
        m_times_n := !m_times_n + nn;
      done)
  done

In order to understand the code above, note that \(p_n\) is true now denotes that integer \(2n+3\) is prime.