Monday, June 29, 2009

A C programmer in Android world : discovering Java allocations

I'm mainly, in my day world, a C / C++ programmer.
So I'm used with this language and sometimes, I feel like I would prefer to use another, more modern language !

So I was quite happy to try Java. I saw Java as a kind of modern, cleaner C++, with some nice features, and a more improved object model.

But for the 'WordProspector' game, in the process of taking the database out, and replacing it with 'something' else, more lighter (see here ), I found a serious limitation of this Java object model...


The issue :
As 'WordProspector' is a word game, it is shipped with a word dictionary.
At first, I stored the dictionary as a SQL database, as it was simpler for me at that time.
but when I found how memory costly it is, I decided to switch to another way to store it.

So I came up with a kind of tree with a letter for each node.
In my effort to compress the dictionary as much as possible, I found a way to store each node as a 8 bits integer plus a 16 bits integer, so it was 3 bytes.
So I had a Node class, with simply a byte and a short.
But I had a big number of node instances in my tree : something like 300 000 nodes.

So when I loaded the tree, I started by allocating all the nodes.
Allocating the nodes with java was something like :




NodeArray = new Node[nbNodes];

for ( int i = 0; i <>
NodeArray[i] = new Node();




When I tried it on the emulator... It took several minutes... Then the program crashed...
Too much allocations of this size !

I was amazed that I just couldn't easily create this really simple array !!

Then I discovered that allocations of simple type arrays didn't demand to allocate each element !

So finally, I get rid of my nice Node class, save the whole thing as a byte array, and interpret, on the fly, the byte array as a byte plus a short.
With that, the allocation is as fast as it could be :


NodeArray = new byte[nbNodes * 3];



Conclusion:
With this allocation issue, I had to get rid of my Node class, I just have a big array of bytes.
Acces to member are much more complicated, the code that was once so simple is now much more complex.
Add on this issue the fact that what I really wanted was an unsigned short, and not a short.
My code is now full of bits manipulation to create and interpret some bytes as unsigned short, or as bytes, depending of the situation.


With C/C++, I would never have this problem, and I would have a clean object code !

4 comments:

viroos said...

I think using ArrayList instead of arrays is good idea.

AndroidBlogger said...

Well, I don't know.
I was under the impression that an arrayList is even more complex than a simple Array.
So using a ArrayList for a high performance array of 300 000 nodes would be a bad idea.
Am I wrong ?

viroos said...

Yes it is more complex but it allows allocate all needed memory at once.

AndroidBlogger said...

Yes, indeed.
Actually, blame my lacking knowledge of Java, the term 'ArrayList' made me think it is a kind of linked list.
I've just checked, and found I was completely wrong, as it is more like a STL vector ( for C++ programmers ).
So indeed, the cost of using a ArrayList would only the call of the empty constructor on all of the elements.
I have no idea how heavy that is...
When I have some time, I will give it a try, as it would result in a much cleaner code !

Thanks for your comment !!