Wednesday, May 27, 2009

How to ship an application with a pre baked database...

And why you should probably not do it !


1) Why I tried that
When I first started to program 'Word Prospector', my word game on Android, I already had a PC game of this kind, written with my Python / C++ 2D engine.
The dictionary part, in particular, was in C. It was brute force : I just compared the input string with every word in my dictionary ( 300 000 words ! ), and it was fast enough !

When I tried to do the same thing on the Android emulator ( the real device hadn't shipped yet here in France ), it was horribly slow ! Actually, the worst part was loading the big array of words. Launching the game on the emulator took something like 20 minutes !!! Even if the real device could do better, I couldn't take the risk !
So my first thought was to use native code for the dictionary. And the SQL database ( SQLite ) is native code.
I tried it, and the results were good : no loading time, and no noticeable delay when checking whether a word was in the dictionary...

So how to do that ?

2) Create the database on the device ( with the emulator )
The first thing to do, in order to have a pre baked database, is to create it !
So that is what I did.
I launched the game, loaded my whole dictionary, and inserted every word in my database. It took something like 20 minutes or more, but after that, I had a complete database, somewhere on my emulator.

3) Get the database from the device to the Pc
That step is quite trivial, but only once you know how to do it !
The simplest way to do it is through Eclipse :
* Launch the emulator from within Eclipse.
* Open the DDMS perspective ( Window / Open Perspective / Other... / DDMS ).
* Look for the database file : it's located in the data/data/YourApplicationPackageName/databases/ directory, and it's named with the name you gave the database.
* Click on this file, then click on the little icon on the upper right of the DDMS window called 'Pull a file from the device' ( see the screen shot below ).
* Give the directory on your PC you want to copy the file to.
* Wait till it's done. As my file was really big, this step sometimes fired a timeout before it could properly finish. But doing it again was enough to get the file.

( Screen shot : DDMS, 'Get a File From Device' )

Note that purists can also do it from the command line.
Something like 'adb pull NameOnDevice NameOnPC'


4) Splitting the file
NOTE : THIS STEP IS NOT MANDATORY !!!
See the "Edit2" point at the end of the post for an explanation !!!!!
This file was pretty big, about 5 MB.
A prebaked database will be a raw file in the resources.
But in the resources, files are limited to 1 MB, so in order to provide a prebaked database, I had to split my database.
Here is the PC side Java method that did it :


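import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;

// Cuts InputFileName into parts of at most MaxPartSize bytes,
// written as OutputFileName1, OutputFileName2, ...
private static void CutFilesInSizeParts(String InputFileName, String OutputFileName, int MaxPartSize)
{
    try
    {
        // Read the whole file in memory ( fine for a few MB on a PC ).
        // available() and a single read() are not guaranteed to cover the whole file.
        byte[] buffer = Files.readAllBytes( new File(InputFileName).toPath() );
        int len = buffer.length;
        // Round up, so that no empty last part is written when len is a multiple of MaxPartSize :
        int nbPart = ( len + MaxPartSize - 1 ) / MaxPartSize;
        int CurPos = 0;
        for ( int i = 0; i < nbPart; i++ )
        {
            int PartLen = Math.min( MaxPartSize, len - CurPos );
            String outRealFileName = OutputFileName + (i+1);
            // try-with-resources closes each part, even if write() fails :
            try ( FileOutputStream fos = new FileOutputStream(outRealFileName) )
            {
                fos.write(buffer, CurPos, PartLen);
            }
            CurPos += PartLen;
        }
    }
    catch( IOException e)
    {
        System.out.println("issue : " + e.getMessage());
    }
}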



I could then copy these generated parts of my database into the res/raw directory of my Word Prospector project.
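Before shipping the parts, it may be worth checking on the PC that they really rebuild the original file. Here is a minimal sketch of such a check. The names PartsChecker and partsMatchOriginal are hypothetical, and it assumes the parts are named with the output prefix followed by 1, 2, 3... as the splitting method does.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;

final class PartsChecker
{
    // Returns true if partPrefix1, partPrefix2, ... ( nbPart files ), appended in order,
    // are byte for byte identical to the original file.
    static boolean partsMatchOriginal( String originalFileName, String partPrefix, int nbPart ) throws IOException
    {
        ByteArrayOutputStream rebuilt = new ByteArrayOutputStream();
        for ( int i = 1; i <= nbPart; i++ )
        {
            rebuilt.write( Files.readAllBytes( new File( partPrefix + i ).toPath() ) );
        }
        byte[] original = Files.readAllBytes( new File( originalFileName ).toPath() );
        return Arrays.equals( original, rebuilt.toByteArray() );
    }
}
```

Everything is held in memory, which is fine for a few MB on a PC, but it is not meant to run on the device.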

5) Last step : creating the database file in your program
Some notes :
* We need to know the directory where to create the database. The getDatabasePath method is our friend here, but it needs the database name as an argument, and the directory has to exist before we can write into it ! So we will always start by using a SQLHelper that will create the database if it's not already present.
* We only need to recreate the database the first time the application is launched. After that, the file is created, everything is OK. So we can run a request on the database to see whether or not it is populated, and only copy the files if it is not.
* Creating the whole database file is just appending all the parts, in order, into the file !

So here is the code :


import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import android.widget.Toast;

public void CreateFromRawDbFiles()
{
    // Creation of an empty database if needed, with SQL Helper :
    CreateMinimum();
    // Check for emptiness ( try a request on the database ) :
    if ( !bIsDatabaseInstallationNeeded( ))
        return;
    // If empty, overwrite the file from resources :
    // Get file dir :
    String DBFileName = mOwner.getDatabasePath( MY_DATABASE_NAME ).toString();
    ConstructNewFileFromressources( DBFileName );
}

public void ConstructNewFileFromressources( String DBFile )
{
    //Log.v("DataBase Installation", "Before creating files");
    // The parts, in the order they were cut :
    int ResourceList[] = new int[] {
        R.raw.my_dico_db1,
        R.raw.my_dico_db2,
        R.raw.my_dico_db3,
        R.raw.my_dico_db4,
        R.raw.my_dico_db5,
        R.raw.my_dico_db6
    };
    try
    {
        FileOutputStream Fos = new FileOutputStream( DBFile );
        try
        {
            byte[] buffer = new byte[8192];
            for ( int FileId : ResourceList )
            {
                InputStream inputFile = mOwner.getResources().openRawResource(FileId);
                try
                {
                    // Copy until the end of the stream, writing only the bytes actually read
                    // ( available() and a single read() may not cover the whole part ) :
                    int len;
                    while ( ( len = inputFile.read(buffer) ) != -1 )
                        Fos.write( buffer, 0, len );
                }
                finally
                {
                    inputFile.close();
                }
            }
        }
        finally
        {
            Fos.close();
        }
    }
    catch( IOException e)
    {
        // Any failure stops the copy, instead of appending a half read part :
        Toast.makeText( mOwner,"IO Error Reading/writing File",Toast.LENGTH_SHORT).show();
    }
    //Log.v("DataBase Installation", "End of creating files");
}






And that's all !!
Creating the database file was really fast, and only happens on the first launch, so everything was quite perfect...

BUT...

6) Conclusion : was it really a good idea ?
The first issue with this technique is : how solid is it ?
It looks like a hack, and I felt a little insecure about it.

But the real, terrible issue was somewhere else : the application size !
The database is really present twice with this technique : once in the .apk ( hopefully in a compressed form ), and the second time, as the real database file.

In the Word Prospector case, I finally had a game that weighed 2 MB on the market, but once fully installed was 7 MB !!! ( and for the french version it was even worse : the installed version was 9 MB ) !
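This cost can be estimated before publishing. Here is a rough sketch, under assumptions of mine : the .apk stores the database deflate-compressed ( java.util.zip.Deflater is only a stand-in for whatever the packaging tool really does ), and the installed footprint is that compressed copy plus the uncompressed database file. FootprintEstimate and its methods are hypothetical names.

```java
import java.util.zip.Deflater;

final class FootprintEstimate
{
    // Size of data once deflate-compressed.
    static long compressedSize( byte[] data )
    {
        Deflater deflater = new Deflater( Deflater.DEFAULT_COMPRESSION );
        deflater.setInput( data );
        deflater.finish();
        byte[] chunk = new byte[8192];
        long total = 0;
        while ( !deflater.finished() )
        {
            total += deflater.deflate( chunk );
        }
        deflater.end();
        return total;
    }

    // The database is counted twice : once compressed in the .apk, once as the real file.
    static long installedFootprint( byte[] database )
    {
        return compressedSize( database ) + database.length;
    }
}
```

Feeding it the bytes of the database file pulled from the emulator gives an idea of what the database alone will cost once installed.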

People are not yet used to checking the size of an application, so it took some time before someone complained on the market, but it finally happened ! And even I, now that I have the device, found it really unpleasant !

Finally, I developed an algorithm to store my dictionary as a compact letter - tree, and got rid of this database !
Word Prospector is now about 700 KB !
And the french version, 'Chasseur de mots', is about 500 KB ( for some reason, the french dictionary is much more compression friendly ).
It took me two evenings to create and implement this tree, I really should have done this in the first version !!
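The compact tree itself is not shown in this post. As an illustration of the general idea only, here is a minimal sketch of a plain letter tree ( a trie ), assuming words made of the lowercase letters a to z. LetterTree and its methods are hypothetical names. Words sharing a prefix share the same nodes, and a lookup depends only on the length of the word, not on the size of the dictionary.

```java
final class LetterTree
{
    private static final class Node
    {
        final Node[] children = new Node[26]; // one slot per letter a..z
        boolean isWord;                       // true if a word ends on this node
    }

    private final Node root = new Node();

    // Adds a word made of lowercase letters a..z.
    void insert( String word )
    {
        Node node = root;
        for ( int i = 0; i < word.length(); i++ )
        {
            int index = indexOf( word.charAt( i ) );
            if ( node.children[index] == null )
                node.children[index] = new Node();
            node = node.children[index];
        }
        node.isWord = true;
    }

    // Returns true only if the exact word was inserted ( a mere prefix does not count ).
    boolean contains( String word )
    {
        Node node = root;
        for ( int i = 0; i < word.length() && node != null; i++ )
        {
            char c = word.charAt( i );
            if ( c < 'a' || c > 'z' )
                return false;
            node = node.children[c - 'a'];
        }
        return node != null && node.isWord;
    }

    private static int indexOf( char c )
    {
        if ( c < 'a' || c > 'z' )
            throw new IllegalArgumentException( "only a..z supported : " + c );
        return c - 'a';
    }
}
```

With one array of 26 references per node, this sketch is not compact at all in memory. A really compact version would pack the nodes into a byte array, which is not attempted here.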

The real conclusion is : if you want to ship your application with a prebaked database, you can still do it. But beware of the weight of your application. If your database is big, think twice about it, and try another way...

As an android user, my experience is : the bigger the application, the sooner I will get rid of it !!


EDIT : I read somewhere about a good practice to avoid having the database twice in your installed program : don't include it in your .apk, but post it somewhere on a website.
This way, the program will start by downloading it ( with a nice dialog box to explain it is only for the first time ), and it will then be only present in the database folder, so everything will be OK !
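As a minimal sketch of that idea ( not tried on a device, and with hypothetical names : DatabaseDownloader, downloadIfMissing ), the first launch could fetch the file as follows. It downloads into a temporary file and renames it at the end, so that an interrupted download is not mistaken for a complete database. The URL is whatever address you host the file at, and the destination directory is assumed to exist already. On Android, the application also needs the INTERNET permission, and the download should not run on the UI thread.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;

final class DatabaseDownloader
{
    // Downloads source into dest unless dest already exists.
    // Returns true if a download took place.
    static boolean downloadIfMissing( URL source, File dest ) throws IOException
    {
        if ( dest.exists() )
            return false; // not the first launch : nothing to do

        File tmp = new File( dest.getPath() + ".part" );
        try ( InputStream in = source.openStream();
              OutputStream out = new FileOutputStream( tmp ) )
        {
            byte[] buffer = new byte[8192];
            int len;
            while ( ( len = in.read( buffer ) ) != -1 )
                out.write( buffer, 0, len );
        }
        if ( !tmp.renameTo( dest ) )
            throw new IOException( "could not rename " + tmp + " to " + dest );
        return true;
    }
}
```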

EDIT2 :
Actually, the point concerning the split of the database ( and the construction from several resource files ) is not necessary :
I was putting the database file in the res/raw folder, and there is a limit of 1 MB per file for every file in the resources.
But you can also put this file in the assets folder, where this size limitation no longer applies !!!
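With a single file, the copy gets simpler. Here is a minimal sketch, with hypothetical names ( SingleFileCopy, copyTo ). The helper is plain Java and takes any InputStream. On Android, that stream would come from the AssetManager, as noted in the comment, and "my_dico.db" is a made-up asset name.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class SingleFileCopy
{
    // Copies everything from in to dest, closes in, and returns the number of bytes written.
    // On Android, the stream could be obtained with ( assumed asset name ) :
    //     InputStream in = context.getAssets().open( "my_dico.db" );
    static long copyTo( InputStream in, File dest ) throws IOException
    {
        long total = 0;
        try ( OutputStream out = new FileOutputStream( dest ) )
        {
            byte[] buffer = new byte[8192];
            int len;
            while ( ( len = in.read( buffer ) ) != -1 )
            {
                out.write( buffer, 0, len );
                total += len;
            }
        }
        finally
        {
            in.close();
        }
        return total;
    }
}
```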

7 comments:

Brandon Konkle said...

Great advice! Thanks for sharing your experience, it is greatly appreciated. I'm just getting started on Android development, but the project I have in mind will have some built-in data. This will definitely help. Thanks!

AndroidBlogger said...

Glad to hear you find it helpful !

Thanks for your comment and good luck for your development !

Anonymous said...

Hi
I'm coding my own Chinese dictionary for my Chinese night class. Some traditional approaches for storing a word list for searching, such as ListView or SQLite, are inefficient ( too slow, or they take up a lot of the device's memory ). The word list is about 20.000 items. Could you please tell me about your algorithm, or give me some advice on how to proceed ?
Thanks in advance!

Name: Hoang
Email: chelskyboy@gmail.com

Unknown said...

nice article. any chance you would share your final algorithm solution?

Android said...

hi, could you show me the full source code???

I have the same problem, but still don't know which way to start...

Anonymous said...

Interesting!
Though this article is old and not the optimal solution, I'm still impressed by your approach!

Anonymous said...

I read this in 2016. Can you please mail me about your algorithm?
It seems like it's the solution to my current problem. Thanks.. I hope you get this.
destruct@tuta.io