Extract tags out of digikam

I've always wanted to tag my image collection, but it's big and tagging it by hand is a pain. Today I found out that digikam has reached the point where it works as a nice interface for categorising images, so I now have something cool for the tagging work.

Now, I'd like to play with the tags using my debtags toolchain. Digikam stores data in a SQLite3 database, so it's easy to convert it into the text format used by tagcoll. Here's the script to do it:

#!/usr/bin/ruby

# gettags - Extract a tagged collection out of the digikam database
#
# Copyright (C) 2006  Enrico Zini <enrico@debian.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

# Usage: gettags [path/to/digikam3.db]
# If the database file is not given, use the one in the current directory

require 'sqlite3'

db = SQLite3::Database.new( ARGV[0] || 'digikam3.db' )

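# Build a table of id => complete tag name.
# Note: a child tag gets its full name only if its parent row has already
# been seen; otherwise it falls back to its bare name.
tagByID = {}
db.execute( "select id, pid, name from Tags" ) do |row|
        if row[1] != 0 and tagByID[row[1]] then
                tagByID[row[0]] = tagByID[row[1]] + '::' + row[2]
        else
                tagByID[row[0]] = row[2]
        end
end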

# Why does it work with instant queries but not with precompiled queries?
#db.execute("select tagid from ImageTags where imageid = ?", 36887) { |r| print r[0], "\n" }
#gettags = db.prepare("select tagid from ImageTags where imageid = ?")
#gettags.execute(36887) { |r| print r[0], "\n" }

lastname = nil
lastdir = nil
ids = []
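# Sort the rows so that all the tags of one image are adjacent, and compare
# the directory as well, so same-named files in different albums stay apart
db.execute( "select i.name, t.tagid, a.url from Images i, ImageTags t, Albums a where i.id = t.imageid and i.dirid = a.id order by a.url, i.name" ) do |row|
        if row[0] != lastname or row[2] != lastdir then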
                if lastname != nil then
                        print lastdir, "/", lastname, ": ",
                              ids.collect { |id| tagByID[id] }.join(', '), "\n"
                end
                lastname = row[0]
                lastdir = row[2]
                ids = []
        end
        ids << row[1]
end
if ! ids.empty? then
        print lastdir, "/", lastname, ": ", ids.collect { |id| tagByID[id] }.join(', '), "\n"
end

exit 0

I didn't understand why a precompiled query gives me a row object that cannot be indexed, while an instant query works fine (see the comment in the script).

I'm still on the learning side of Ruby, so I welcome people telling me of better ways to write this script, and I'll be glad to update this entry with what I receive.
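
As one possible variation, here is a sketch that does not depend on the order in which rows come back from the database: tag names are resolved by walking up the parent chain on demand, and tags are grouped per file in a hash. It works on plain arrays shaped like the rows of the two queries, so it can be tried without a digikam database. The sample data and the names `sample_tag_rows`, `full_tag_names` and `tagcoll_lines` are hypothetical, and it assumes the tag tree has no cycles.

```ruby
# Hypothetical rows in the shape (id, pid, name) of the Tags query;
# the child is listed before its parents on purpose
def sample_tag_rows
  [
    [3, 2, 'Taipei'],
    [1, 0, 'Places'],
    [2, 1, 'Taiwan'],
  ]
end

# Build id => full tag name, resolving parents recursively and caching
# each result (assumes there are no cycles in the tag tree)
def full_tag_names(rows)
  by_id = {}
  rows.each { |id, pid, name| by_id[id] = [pid, name] }
  names = {}
  resolve = lambda do |id|
    names[id] ||= begin
      pid, name = by_id[id]
      parent = by_id.key?(pid) ? resolve.call(pid) : nil
      parent ? "#{parent}::#{name}" : name
    end
  end
  by_id.each_key { |id| resolve.call(id) }
  names
end

# Turn rows shaped like (image name, tag id, album url) into tagcoll lines,
# grouping by full path so the rows need not arrive sorted
def tagcoll_lines(image_rows, names)
  tags_by_path = Hash.new { |hash, key| hash[key] = [] }
  image_rows.each do |name, tagid, dir|
    tags_by_path["#{dir}/#{name}"] << names[tagid]
  end
  tags_by_path.map { |path, tags| "#{path}: #{tags.join(', ')}" }
end

names = full_tag_names(sample_tag_rows)
rows = [['a.jpg', 3, '/2005'], ['b.jpg', 1, '/2005'], ['a.jpg', 1, '/2005']]
puts tagcoll_lines(rows, names)
```

The trade-off is that this keeps the whole collection in memory before printing, while the script above streams its output one image at a time.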

With the little script done, I can now plug my image collection into tagcoll and have a bit of fun::

    ./gettags | tagcoll related /2005/12-15-03-Taiwan-Newcamera/dsci0051.jpg
    ./gettags | tagcoll hierarchy
    ./gettags | tagcoll implications
    ./gettags | tagcoll findspecials

Cute! And this is another little step towards connecting pieces of the debtags toolchain to the web, which is something I'd like to explore for the web interface of the central database, for my blog and for my picture gallery.