# Intro and Problem Statement

Moving on, let me introduce Project Euler problem # 16:

Power digit sum

215 = 32768 and the sum of its digits is 3 + 2 + 7 + 6 + 8 = 26.

What is the sum of the digits of the number 21000?

# The Approach

I’ll spare you the details of my initial approach, and leave you to the “About This Site” section if you are curious. However, I will describe as best as I can the approach which led me to the correct answer. Hopefully I’ve stated enough times on this site that I would never be able to settle with the trivial and lazy “one liner library” approach, as the purpose of Project Euler is not to teach one how to use libraries and functions which others have created, but to pique one’s curiosity to try and understand the mathematics and underlying concepts behind a problem. Thus I refrained from using Java’s very convenient BigInteger library and decided to construct my own from scratch. A nice challenge indeed. Now folks, here’s my attempt at explaining how I was able to solve this problem.

# The Solution in a Nutshell

So the idea here was basically to first take in some user input consisting of two ints, one for the base and one for the exponent (I wanted to test the algorithm for any arbitrary base and exponent at the time). The next step would be to calculate the number of digits in the result and create an array of ints (or any integer primitive data type. The ideal primitive would be byte, since you would only need the numbers from 1-9 in each cell. Moreover, this would consume the least memory) with the same length as the number of digits to be expected in the final result. After instantiating the array, a value of one would be assigned to the last cell in the array. Now here’s where the tricky part comes, implementing your own BigInteger operations. So in this particular scenario, all you would need is a simple single digit multiplication algorithm that treated the array and its cells as a whole unit (each cell consists of one digit, read from left to right, just like on paper). Essentially, this would involve having carryover and product variables which would store the result of a a multiplication between the base and a single digit (one cell in the array) as well as the carryover to the next multiplication. The individual operations would involve arithmetic multiplications and taking division remainders. After performing the exponentiation, the next step in the process would be to find the sum of the elements in the array and print the answer. This would be one of the easiest steps in the process, as it could be done trivially with a for loop and a counter variable.

# Calculating the Number of Digits in xn

So let’s start off with the calculation of the number of digits in the exponentiation. The derivation for the formula that will be used is as follows:

$\text {Let } x \text{ be the number of decimal digits in} \ 2^n \text{.}$

${\ 10^{x-1}=2^n}$

${\ \log 10^{x-1}=\log 2^n}$

${\ (x-1)\log 10=n\log 2}$

${\ x\log 10=n\log 2+\log 10}$

${\ x=\frac{n\log 2}{\log 10}+1}$

${\ x=n\log 2+1}$

$\ \text {This holds true for any base a, } \therefore$

${\ x=n\log a+1} \text {, however x will never be an integer unless}$ ${\ a \text{ is a positive integer power of } 10 \text {, } \therefore}$

$\ x=\left \lfloor {n\log a+1} \right \rfloor$

# Getting User Input and Translating Math into Java

Now we can start talking Java. Putting together the first bit, the program asks for user input (base x and exponent n), then calculates the number of digits in xn. Next, the array holding the product is declared and is instantiated with 1 in its last index. Then we declare our digit product and carryover variables. We will also need something to keep track of how many zeroes are scanned in the calculation loop with a counter variable to avoid scanning the entire array at each iteration, as there will be many zeroes to sift through. Finally we need our index variable which will keep track of the current index of the array the program is operating on.

private static Scanner in;

public static void main(String[] args) {

// ---GET BASE AND EXPONENT FROM SCANNER INPUT---

System.out.println("Enter a positive integer base of an exponent to calculate (from one to ten)");
System.out.println();
in = new Scanner(System.in);
int x = in.nextInt();
System.out.println();
System.out.println("Enter a positive power of " + x + " to calculate");
System.out.println();
double n = in.nextDouble();
double digits;

// ---CALCULATION FOR NUMBER OF DIGITS---

digits = Math.floor(n * Math.log10(x) + 1);

// ---VARIABLE DECLARATION FOR EXPONENTIATION ALGORITHM---

int[] pow = new int[(int) digits];
pow[(int) digits - 1] = 1;
int digprod = 0;
int carryover = 0;
int zerocount = 0;
int i = (int) digits - 1;
}


# The Exponentiation Algorithm

Next comes the real deal, the multiplication algorithm loop which will carry out the exponentiation. Once it is finished, the answer will be stored in our int array. Unfortunately if I were to explain every single line of code in this snippet, this blog post would become a novella. Instead, I’ve annotated this section with brief comments.

// ---EXPONENTIATION ALGORITHM FOR CALCULATING X^N---

// loop executes n (exponent) number of times
for (int a = 0; a < n; a++) {

// Check if 100 leading zeroes have been counted and if the
// index variable has reached the
// beginning of the array. If both are true, perform the next iteration

while (zerocount < 100 && i >= 0) {

// Get the product of the base and the digit
// stored in the current index.

digprod = pow[i] * x;

// Get the mod 10 of the product and add it to carryover.
// Take the mod 10 of the whole expression.
// This will ensure that no matter how large the product is
// (it will always be between 0 - 81), only a single digit
// will be stored in each cell.

pow[i] = (digprod % 10 + carryover) % 10;

// The carryover will simply be the tens digit of the product plus
// the previous carryover.

carryover = (digprod + carryover) / 10;

// Check if all that remains in the array are leading zeroes.
// Zerocount will increment by 1 every time 4 consecutive
// zeroes are found. This only applies if the base is not equal
// to ten.

if (i > 2 && pow[i] == 0 && pow[i - 1] == 0 && pow[i - 2] == 0
&& pow[i - 3] == 0 && x != 10) {

zerocount++;

}

// Shift the array index down by one for the next iteration.

i--;
}

// Reset the values of the carryover, zerocount and the index variable
// for the subsequent multiplication.

carryover = 0;
zerocount = 0;
i = (int) digits - 1;

}


# Icing the Cake

Finally, we write our little summation loop and print out the result. I’ve included a little extra code which will count the number of occurrences of each digit in the answer as well.

// ---CALCULATE SUM OF DIGITS---
public static int sum(int[] a) {
int summation = 0;

for (int q = 0; q < a.length; q++) {
summation += a[q];
}

return summation;
}

// ---PRINT OUTPUT - THE EXPONENTIATION RESULT, THE SUM OF THE DIGITS,
// THE NUMBER OF DIGITS AND THE OCCURRENCES OF EACH DIGIT---
System.out.println();
System.out.println(x + " to the power of " + (int) n + " is: ");
System.out.println();

int charcount = 0;
int dignums[] = new int[10];

for (int c = 0; c < pow.length; c++) {
System.out.print(pow[c]);
charcount++;
if (charcount == 180) {
System.out.println();
charcount = 0;
}
if (pow[c] >= 0 && pow[c] <= 9)
dignums[pow[c]]++;
}

System.out.println();
System.out.println();
System.out.println("The number of digits in the result is: "
+ pow.length);
System.out.println();
System.out.println("The sum of the digits of the result is: "
+ sum(pow));
System.out.println();
System.out.println("# Zeroes: " + dignums[0]);
System.out.println("# Ones: " + dignums[1]);
System.out.println("# Twos: " + dignums[2]);
System.out.println("# Threes: " + dignums[3]);
System.out.println("# Fours: " + dignums[4]);
System.out.println("# Fives: " + dignums[5]);
System.out.println("# Sixes: " + dignums[6]);
System.out.println("# Sevens: " + dignums[7]);
System.out.println("# Eights: " + dignums[8]);
System.out.println("# Nines: " + dignums[9]);


# Piecing Together the Solution

Putting it all together, we have:

package projectEuler;

import java.util.Scanner;

public class PowerDigitSum {

private static Scanner in;

// ---CALCULATE SUM OF DIGITS---

public static int sum(int[] a) {
int summation = 0;

for (int q = 0; q < a.length; q++) {
summation += a[q];
}

return summation;
}

public static void main(String[] args) {

// ---GET BASE AND EXPONENT FROM SCANNER INPUT---

System.out
.println("Enter a positive integer base of an exponent to calculate (from one to ten)");
System.out.println();
in = new Scanner(System.in);
int x = in.nextInt();
System.out.println();
System.out.println("Enter a positive power of " + x + " to calculate");
System.out.println();
double n = in.nextDouble();
double digits;

// ---CALCULATION FOR NUMBER OF DIGITS---

digits = Math.floor(n * Math.log10(x) + 1);

// ---VARIABLE DECLARATION FOR EXPONENTIATION ALGORITHM---

int[] pow = new int[(int) digits];
pow[(int) digits - 1] = 1;
int digprod = 0;
int carryover = 0;
int zerocount = 0;
int i = (int) digits - 1;

// ---EXPONENTIATION ALGORITHM FOR CALCULATING X^N---

for (int a = 0; a < n; a++) {

while (zerocount < 100 && i >= 0) {

digprod = pow[i] * x;

pow[i] = (digprod % 10 + carryover) % 10;
carryover = (digprod + carryover) / 10;

if (i > 2 && pow[i] == 0 && pow[i - 1] == 0 && pow[i - 2] == 0
&& pow[i - 3] == 0 && x != 10) {

zerocount++;

}

i--;
}

carryover = 0;
zerocount = 0;
i = (int) digits - 1;

}

// ---PRINT OUTPUT - THE EXPONENTIATION RESULT, THE SUM OF THE DIGITS
// THE NUMBER OF DIGITS AND THE OCCURRENCES OF EACH DIGIT---

System.out.println();
System.out.println(x + " to the power of " + (int) n + " is: ");
System.out.println();

int charcount = 0;
int dignums[] = new int[10];

for (int c = 0; c < pow.length; c++) {
System.out.print(pow[c]);
charcount++;
if (charcount == 180) {
System.out.println();
charcount = 0;
}

if (pow[c] >= 0 && pow[c] <= 9)
dignums[pow[c]]++;
}

System.out.println();
System.out.println();
System.out.println("The number of digits in the result is: "
+ pow.length);
System.out.println();
System.out.println("The sum of the digits of the result is: "
+ sum(pow));
System.out.println();
System.out.println("# Zeroes: " + dignums[0]);
System.out.println("# Ones: " + dignums[1]);
System.out.println("# Twos: " + dignums[2]);
System.out.println("# Threes: " + dignums[3]);
System.out.println("# Fours: " + dignums[4]);
System.out.println("# Fives: " + dignums[5]);
System.out.println("# Sixes: " + dignums[6]);
System.out.println("# Sevens: " + dignums[7]);
System.out.println("# Eights: " + dignums[8]);
System.out.println("# Nines: " + dignums[9]);

}

}


# Runtime Complexity

My attempt at evaluating the complexity of the algorithm for calculating the nth power of x: The complexity is in polynomial time and is: $\ O(nx-100n)$, as the nested while loop within the for loop executes until there are 100 leading zeros found after the last digit has been multiplied. Therefore the complexity of the nested while loop is approximately $\ O(x-100)$. The complexity of the for loop is $\ O(n)$, since it executes n times. I plan on improving this algorithm in the near future and implementing exponentiation by squaring or binary splitting to reduce the complexity significantly.

# Wrapping Up

In summary, this program should take in user input, namely a base from 1 to 10 and an exponent (less than 231-1), perform the exponentiation and return the answer including the sum of the digits and the occurrences of each digit. I hope you enjoyed this post!

## 7 thoughts on “PE #16”

1. OK, nice.
So far 😉

Working with decimal number safes you the hassle of base conversion, admitted, but it is extremely slow if you use only one digit at time. With 32 bit signed integers you can use any power of 10 lower than 2^15 without bit-juggling. The first such power is 4 and it is 9 for 64 bit respectively.
It is a bit more work to extract the individual digits but not much and if you use the smaller base (10^4) you can even make a table with ten thousand entries such that you can get the digit-sum of the individual big-digits in O(1).

I would also part the multiplication from the exponentiation.

The exponentiation of x^n as is needs O(n) multiplications. It can be done with O(log n) multiplications by way of multiplication by squaring.
Pseudocode (assuming WordPress’ bbcode works in comments, too) for a recursive version:

function pow(integer x, integer n)
if n == 1
return x
if (n%2) * 2 == n
return pow(x^2, n/2)
else
return x * pow(x, n-1)


See also the Wikipedia entry for this method which has more pseudocode and talks a lot about the different variations of it.

BTW: squaring can be done faster than multiplying because you have a lot of similar small products. It is faster only by a constant but it gets quite significant for larger numbers!

And while you are at it: why not implement a complete BigInteger library? You may take a look at Tom St Denis’ book which is freely available[1] on the net, just search for a file named tommath.pdf. It has all the necessary things in it to build one. Some are amiss, like FFT multiplication, fast division, logarithms, and roots (it is meant as a basis for cryptography, no need for such things in crypto) but it is a very good start; highly recommended!

Another thing I saw: you are using only one data type for your integers, so the whole bunch of

  if (pow[c] == x)
dignums[x]++;


can be cut down to the single line

    dignums[pow[c]]++;


It might be good style to check if the integer at pow[c] is inside the range [0-9].
“Will never happen! Never!” is one of the most frequently used last words of programmers! 😉

[1] it has been put by the author into Public Domain explicitly.

Like

• Hi Cristoph,

Working with decimal number safes you the hassle of base conversion, admitted, but it is extremely slow if you use only one digit at time.

Agreed. What exactly do you mean by using one digit at a time? Are you referring to the fact that my algorithm performs the calculation by operating on a single index in the array at any given time? Is there a way to do this using multithreading or is there an even simpler way?

With 32 bit signed integers you can use any power of 10 lower than 2^15 without bit-juggling. The first such power is 4 and it is 9 for 64 bit respectively.
It is a bit more work to extract the individual digits but not much and if you use the smaller base (10^4) you can even make a table with ten thousand entries such that you can get the digit-sum of the individual big-digits in O(1).

OK, I’m a little confused here. Why are you referring to 215 as an upper bound? In Java the max value of a signed int is 231-1. So do you mean take the highest power of 10 below the max value of a signed int/long? So essentially what you’re saying is to work in a larger base system? I thought of this earlier, and this is something I might do in the future.

The exponentiation of x^n as is needs O(n) multiplications. It can be done with O(log n) multiplications by way of multiplication by squaring.

Great idea, I will consider implementing this. Just out of curiosity, why stop at squaring? Why not extend to cubing, or any power x? 😀

 if ((n%2) * 2 == n)


Is this condition supposed to check if n is even? It doesn’t really make sense, so do you mean

 if ((n%2)  == 0)


?

OK I get the idea, basically it shrinks down the number of multiplications to make. Pretty cool. I’ve checked out the Wikipedia page and get the gist of it.

And while you are at it: why not implement a complete BigInteger library?

That is my ultimate goal. This is probably going to be the topic of my first journal article that I get published. Fascinating stuff. 🙂

Another thing I saw: you are using only one data type for your integers, so the whole bunch of

if (pow[c] == x)
dignums[x]++;


can be cut down to the single line

dignums[pow[c]]++;


Thanks for that catch! Although I don’t actually use this bit of code in my app, it’s a sensible thing to change. I’ll do it right away.

Like

2. Hi,

(Thsi is the comment I had problems with, so let’s see if it gets published twice)

What exactly do you mean by using one digit at a time?

The inner loop computes only one decimal digit per round.

Is there a way to do this using multithreading or is there an even simpler way?

Yes, you can use multiple threads to do it but the overhead is way too large. It is more profitable to keep as much data as possible in the L1 cache to speed up the multiplication of native integers. That means that it is better to make the individual limbs (that’s how the big-digits get called in big integer libraries) as big as possible. It is quite common to use the largest native integer for that task.

So essentially what you’re saying is to work in a larger base system?

Yes.

As said above it is common to use the largest native integer for it but that needs a base conversion for the output and some bit-juggling for multiplying because the magnitude of the result is obviously larger than the biggest native integer.
A way to avoid the conversion is to use base 10 limbs and a way to avoid the bit-juggling is to use only half of the maximum such that the result of a multiplication is always smaller than the native integer.
Hence my suggestion to use 10^4 for 32 bit native integers (2 * 4 = 8 and the largest possible power of ten is 9 for 32 bit) and 10^9 for 64 bit. Java has 64 bit integers (I don’t know how they handle it on 32 bit machines and how fast/slow that is) so I would propose to use 32-bit integers for the limbs, 10^9 as the decimal size, and 64 bit integers for the intermediate results.

One large disadvantage: all of the descriptions of algorithms for arbitrary precision assume binary bases, so you have to work it out all on your own. Not the worst thing to do for a student but long division will get a bit tricky to get right and sufficiently fast. Feel free to ask if you get stuck.

Just out of curiosity, why stop at squaring? Why not extend to cubing, or any power x?

Not any power of x, it must be a positive integer larger than one.

You stop at squaring because the overhead of cubing or more would be too large. But nobody keeps you from giving it a try. Modern hardware might support it in some way I do not know. Have fun digging through thousands of pages of opcode description for the myriads of modern CPUs (http://ref.x86asm.net/ might be a start andhttp://www.agner.org/optimize/instruction_tables.pdf has a list of latencies,http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html has all the opcodes for Intel cpus and http://developer.amd.com/resources/documentation-articles/developer-guides-manuals/ does the same for AMD) and GPUs (somewhere at http://developer.amd.com/ for AMD (e.g.: http://developer.amd.com/wordpress/media/2012/10/R700-Family_Instruction_Set_Architecture.pdf but there are most probably newer ones) http://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#instruction-set-ref fro NVIDIA and so on). It’s a wide subject.
I did it a couple of years ago to see if I can use the then new 128 bit SSE registers to compare two MD5 sums (yes, a long time ago 😉 ) for equality.
Doing it the old way was faster btw. The md5 sum consisted of four 32 bit integers in a row and the probability that the first integers of two such structures differ is already very high. The 128 bit were also not really 128 bit because the data bus was still 64 bit wide at most and the xmm registers were floating point registers. It was quite a waste of time but interesting nevertheless.

A quick search for an already published paper was not successful so your paper might get published in a well-known journal if you found a way to do it.

It doesn’t really make sense

Your judgement is correct and it was of course the fault of Signore Copypasta, as always! 😉
But don’t ask me how it got there and in that form, I honestly cannot imagine.
What the…?
Oh, my… *sigh*

That [big integer library] is my ultimate goal. This is probably going to be the topic of my first journal article that I get published.

Good luck with that, it’s hard these days.
But I think you mean the school internal journal you mentioned once?

Like

• Yes, as it turns out, Akismet automatically flagged your two comments as spam as it contained a tad too many links. 😉

Anyways, I’ve changed those parameters so that the issue shouldn’t be as frequent.

Like

• The inner loop computes only one decimal digit per round.

OK good, so this is what I thought you meant. Quite a good point indeed.

That means that it is better to make the individual limbs

Good luck with that, it’s hard these days.
But I think you mean the school internal journal you mentioned once?

Yeah, it’s my school’s journal. It is considered a research journal for experimental science, so it is worth a shot. Not sure how strict the procedure is and what the formalities are, but I’ve browsed through a few publications and it seems there are none on the subject of computer science. They mostly contain articles regarding medicine, biology, chemistry, and even mathematical proofs (I would’ve never thought!). I think they’ll be open to the idea. 🙂

So you mention the term “limbs” quite often, but what exactly is a limb in this context? Could you point me to a resource explaining this concept? I have my own mental idea of what you’re saying and this is what I’m thinking:

Instantiate a two-dimensional array, which will be an array of 32 bit integer limbs, with each limb holding a maximum possible value of 109. But then where I’m confused is, how can you operate on all limbs at once? Once you do perform the multiplications on each of the limbs, what do you do when the maximum value is reached, and you can’t multiply any further? Perhaps I’m not thinking on the right track with this whole “limbs” concept.

Like

3. Hi,

Not sure how strict the procedure is and what the formalities are

Probably not as strict regarding the themes (I don’t think they expect original work, just something outside of the standard curriculum) but I hope that the procedures and formalities do not differ much.

Could you point me to a resource explaining this [limbs] concept?

Yes.
*points to himself*

Instantiate a two-dimensional array, which will be an array of 32 bit integer limbs, with each limb holding a maximum possible value of 109.

It is actually a one-dimensional array (zero dimensions make a point, one dimension is a line, two dimensions is a plane, three dimensions is a body, four dimensions include the time that body lasts, five dimensions look a bit like mice in the three dimensional projections…but I digress 😉 ) but otherwise correct, yes.
“limb” is just an expression for “single digit of the base x”.

But then where I’m confused is, how can you operate on all limbs at once?

You don’t. You work with one after the other. You can slice it up and give every CPU its own slice to work with but that is normally not very useful, because it is hard to implement, the overhead is just to large and there are better methods to speed up multiplication which are also better suited for multi-threading.

Once you do perform the multiplications on each of the limbs, what do you do when the maximum value is reached, and you can’t multiply any further?

That’s called a carry.

I guess you are thinking way too complicated?

The concept is very simple, just like the form of long multiplication you learned in primary school with the single difference that the digits are larger. That’s all. (and actually wrong, because you learned something different in primary school, you applied a shortcut but nobody told you that it is a shortcut)

Digits are the numbers from zero to nine in a number based on the number ten. Base ten is quite common these days but in some under-developed countries you may find a measurement system with base 12 (the foot and inch system used in your southern neighbourhood, y’know? 😉 ).
And for multiplying numbers that consist of several digits you multiply two digits and when the result x is larger than 10 you slice the result r = x%10 and c = floor(x/10) where r is the rest and c is the carry.
The only difference between digits and limbs is the size.
You can take the above equations and replace the “10” with “10^9” and you have what you need, so in general with $x$ the number, a positive integer $b$ the base, $c$ the carry and $r$ the rest.

${\displaystyle{ r = x - \left\lfloor\frac{x}{b}\right\rfloor \cdot b }}$

${\displaystyle{ c = \left\lfloor\frac{x}{b}\right\rfloor }}$

Another difference is that the carry $c$ can be larger than one.

And you have to manage the memory yourself, so don’t forget that the magnitude of every multiplication is up to twice as large as the individual factors (e.g.: 9*9 = 18, 99*99 = 9801 and so on.), so if the product of two of the largest possible limbs does not fit into the same data type as the limb you need a bigger one. For example: you need a 64 bit large data type to hold the product of two 32 bit large numbers. You don’t need to but avoiding it needs some sophisticated bit-juggling. It is not that complicated but nothing I would recommend for a start and if you keep it modular you can replace it later without a problem, it will be only at one place (actually two places, with the other one being squaring). if you want to try it nevertheless: take both 32-bit values as two digit numbers (low and high) and do long multiplication with them.
Assuming 32 bit integers a and b and written for legibility.
Caveat: the following pseudocode might contain typographical and other errors but does not contain any checks and balances, please control before use.

function low(x)
return x & 0xFFFF

function high(x)
return x >>> 16

function multiply (a,b, carry=0, rest=0)
high_a = high(a)
low_a = low(a)
high_b = high(b)
low_b = low(b)
base = 16

temp, t0, t1, t2, t3 = 0;

comment: were doing simple long multiplication here, do not get distracted by the view!

temp = low_a * low_b
t0 = low(temp)

temp = high_a * low_b + high(temp)
t1 = low(temp)
t2 = high(temp)

temp = t1 + low_a * high_b
t1 = low(temp);

temp = t2 + high_a * high_b + high(temp)
t2 = low(temp)
t3 = high(temp)

carry = t3 << base | t2
rest   = t1 << base | t0


Obvious problem: that is for binary based numbers not for decimal numbers but the difference is only superfluous.
Bit shift right for binary numbers is the same as dividing (dropping the rest) by a power of ten for decimal numbers (the difference between “>>” and “>>>” in Java and JavaScript is between signed and unsigned shifting).
Bit shift left is the same as multiplying with a power of ten
Oring (the vertical bar “|”) is the same as adding.

But as said at the start: keep it simple and just multiply and use a larger data type as the intermediate result.

Ah, found a rather good (YMMV, of course!) explanation of the squaring algorithm here: http://www.cygnus-software.com/docs/html/64bitcalcs.htm

Another hint if you want to implement a full big-integer library, although it’s one of the YMMV tips:
You do all computation on unsigned big-integers and handle the sign apart. In my JavaScript implementation I used a JavaScript object containing the array, the length of the array (can differ from the actual length) and the sign. You can put it all in one large array (length and sign need only one integer) but that makes it unnecessary complicated and the overhead of a full object is negligible in contrast to the many operations the array inside of the object gets treated with.

Like