Renovating our foundations: new database schema
For a long time now we have been planning to extend the amount of information we store in the database for the next version, 0.10, which will run on KDE4. As the database is the central storage and digikam is built around it, such a move involves deep structural changes. We prepared, discussed on the mailing list what we want and what we need, wrote documentation, and then sat down to code.
So over the last few weeks I have been implementing the new database schema, and tonight I merged my commits.
As this is the first time that our ideas have found their way into real code that you can build and use from SVN (although you should not use digikam trunk SVN for production currently), it is time to give some insight.
I will give a short overview of digikam's history, what we have achieved so far, and what we intend to do with all the information that we now have at hand.
In version 0.7, digikam dumped the old configuration file and started to use a real database, SQLite2. Version 0.8 brought a major update to SQLite3. The 0.9 generation focussed on different areas, but it has since become clear that we need more from our database than the implementation could provide.
This spring I prepared the basics, such as moving all db related code to a common place (removing a lot of duplicate code) and using the Qt SQL module.
In digikam 0.9 we store a comment, a date and a rating for an image (where the rating does not have a real place in the db schema). This was all right at the time, but now we have to ask: What is the language of the comment? How do we store multiple comments? Is the date the creation date or the digitization date? The point is: there is so much more information now accessible in the metadata of an image, and we want to store it in our own highly specialized way.
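To make those questions concrete, here is a minimal sketch of how a schema could answer them: one row per comment language, and separate columns for the two kinds of date. The table and column names (`Images`, `ImageComments`, `ImageInformation` and so on) are made up for this illustration and are not meant to be the actual digikam schema. The sketch uses Python's built-in sqlite3 module purely for brevity.

```python
import sqlite3

# Hypothetical table and column names, chosen for illustration only.
SCHEMA = """
CREATE TABLE Images (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE ImageComments (
    id       INTEGER PRIMARY KEY,
    imageid  INTEGER NOT NULL REFERENCES Images(id),
    language TEXT NOT NULL,      -- e.g. 'en', 'de'
    comment  TEXT NOT NULL,
    UNIQUE (imageid, language)   -- at most one comment per language
);
CREATE TABLE ImageInformation (
    imageid          INTEGER PRIMARY KEY REFERENCES Images(id),
    rating           INTEGER,
    creationDate     TEXT,       -- when the picture was taken
    digitizationDate TEXT        -- when it was scanned or digitized
);
"""


def open_sketch_db():
    """Create an in-memory database with the illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def comments_for(conn, image_id):
    """Return all comments of one image as a {language: comment} dict."""
    rows = conn.execute(
        "SELECT language, comment FROM ImageComments "
        "WHERE imageid = ? ORDER BY language",
        (image_id,),
    )
    return dict(rows.fetchall())
```

With this layout, adding a German comment next to an English one is just a second row in `ImageComments`, and the rating gets a column of its own.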
The new database schema now offers the room to breathe that we need. There is a wealth of new possibilities. For example, now that GPS coordinates are stored in the db, we can offer a search for "pictures that were taken near this one". Search is an important topic here: give me all images with a focal length of 100-150mm where the flash was used.
All these features still need to be implemented, but the basis is there.
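As a rough idea of what such searches could look like once the metadata sits in the database, here is a small sketch. Everything in it is an assumption made for illustration: the `ImageMetadata` table, its columns, and the function names are hypothetical, and the "near this one" search simply computes great-circle distances with the haversine formula in application code, which is only one of several possible approaches.

```python
import math
import sqlite3


def open_metadata_db():
    """In-memory database with a hypothetical metadata table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE ImageMetadata (
               imageid     INTEGER PRIMARY KEY,
               focalLength REAL,     -- millimetres
               flashFired  INTEGER,  -- 1 if the flash was used, else 0
               latitude    REAL,     -- degrees, NULL if unknown
               longitude   REAL      -- degrees, NULL if unknown
           )"""
    )
    return conn


def find_by_focal_length_and_flash(conn, min_mm, max_mm):
    """Ids of images shot with flash within a focal length range."""
    rows = conn.execute(
        "SELECT imageid FROM ImageMetadata "
        "WHERE focalLength BETWEEN ? AND ? AND flashFired = 1 "
        "ORDER BY imageid",
        (min_mm, max_mm),
    )
    return [row[0] for row in rows]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def find_near(conn, image_id, radius_km):
    """Ids of other images taken within radius_km of the given image."""
    origin = conn.execute(
        "SELECT latitude, longitude FROM ImageMetadata WHERE imageid = ?",
        (image_id,),
    ).fetchone()
    if origin is None or None in origin:
        return []  # unknown image, or image without coordinates
    rows = conn.execute(
        "SELECT imageid, latitude, longitude FROM ImageMetadata "
        "WHERE imageid != ? AND latitude IS NOT NULL "
        "AND longitude IS NOT NULL ORDER BY imageid",
        (image_id,),
    )
    return [
        other_id
        for other_id, lat, lon in rows
        if haversine_km(origin[0], origin[1], lat, lon) <= radius_km
    ]
```

The focal length query is answered entirely by the database, while the distance filter runs over all rows with coordinates. For a large collection one would probably narrow the candidates first, for example with a bounding box on latitude and longitude.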
There is even more possible now:
- similarity searching using a Haar algorithm
- multiple paths to store your pictures; previously, digikam was restricted to one path
- use Solid to detect whether a removable medium is present
- file tracking by contents
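To illustrate the last item, file tracking by contents: one possible way to recognize a file that has been renamed or moved is to remember a hash of its contents and look that hash up again later. The sketch below assumes a SHA-256 hash over the whole file; this is an assumption for the sake of the example, not a statement about how digikam identifies files, and the function names are made up.

```python
import hashlib
import os


def content_hash(path, chunk_size=65536):
    """SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_moved_files(known, root):
    """Map old paths to new paths for files that were moved below root.

    known maps a content hash to the path last recorded for that content.
    A file counts as moved only if its contents are known, it sits at a
    different path now, and the recorded path no longer exists (otherwise
    it is treated as a copy and ignored here).
    """
    moved = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            old_path = known.get(content_hash(path))
            if (old_path is not None and old_path != path
                    and not os.path.exists(old_path)):
                moved[old_path] = path
    return moved
```

Because the lookup key is derived from the contents, comments and ratings recorded for a picture could follow it when it is renamed outside the application. Hashing every file in full is expensive for large collections, so a real implementation would likely hash only part of the file or combine the hash with size and modification date.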
Expect more blog entries on technical aspects in the future.
One question that might arise is how all this relates to Nepomuk and Strigi. The current situation is that we plan to do some integration work with Nepomuk at some point, but we don't want to drop our own database, as it is an integral part of digikam and is highly specialized in one single field of knowledge: photo metadata.
Be warned: If you use current SVN, you won't see much of all this glory so far. But you know, the backend is there...