Benedikt Meurer JavaScript Engine Hacker and Programming Language Enthusiast.

Caching MIME information

Today, Red Hat's Matthias Clasen came up with the idea of caching the MIME information provided by the Shared MIME database (the list of MIME type aliases, subclass information, the glob patterns, the magic patterns and the XML namespaces) in an mmap-able file. With his approach, there'll be one cache file (mime.cache) per MIME directory. On most Linux systems, for example, these will be

  • /usr/local/share/mime/mime.cache
  • /usr/share/mime/mime.cache
  • ~/.local/share/mime/mime.cache
while on FreeBSD, for example, these will be
  • /usr/X11R6/share/mime/mime.cache
  • /usr/local/share/mime/mime.cache
  • ~/.local/share/mime/mime.cache
(both with default $XDG_DATA_DIRS settings). He also provides patches for both xdgmime (the MIME implementation used by gtk+ and gnome-vfs) and update-mime-database.
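To make the lookup locations a bit more concrete, here is a minimal sketch in Python of how the candidate mime.cache paths could be derived from the XDG variables. Note that mime_cache_paths is a hypothetical helper of mine, not part of xdgmime or the patches, and the fallback values are the defaults from the XDG Base Directory specification. The user directory comes first because it takes precedence over the system directories.

```python
import posixpath


def mime_cache_paths(environ):
    """Return candidate mime.cache paths, highest precedence first.

    Hypothetical helper for illustration only; `environ` is a plain
    dict standing in for the process environment.
    """
    home = environ.get("HOME") or posixpath.expanduser("~")
    # Defaults as given by the XDG Base Directory specification.
    data_home = environ.get("XDG_DATA_HOME") or posixpath.join(home, ".local", "share")
    data_dirs = environ.get("XDG_DATA_DIRS") or "/usr/local/share/:/usr/share/"

    # The user directory is searched before the system directories.
    dirs = [data_home] + [d for d in data_dirs.split(":") if d]
    # Strip trailing slashes so the joined paths look uniform.
    return [posixpath.join(d.rstrip("/") or "/", "mime", "mime.cache") for d in dirs]


if __name__ == "__main__":
    for path in mime_cache_paths({"HOME": "/home/user"}):
        print(path)
```

Passing something like {"HOME": "/home/user", "XDG_DATA_DIRS": "/usr/X11R6/share:/usr/local/share"} (an assumed value, chosen to match the FreeBSD list above) yields the FreeBSD locations instead.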

The general idea is good, and the implementation looks good too. We should consider using the cache for Thunar as well. My major concerns at the moment are the poor data locality when there are many mime.cache files (which should only happen if the admin is on crack) and the possible performance cost of the big-endian to little-endian conversions (we need a benchmark here to decide whether this conversion is really worth another thought). If we adopted this idea for Thunar, we could further reduce startup time, as the process would no longer need to parse the MIME database first, and it would reduce the memory overhead, though this is less critical in the case of Thunar, since all windows (and the desktop background) run in the same process space and thereby share the MIME database.
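To illustrate what the byte order conversion means in practice, here is a small sketch that maps a file into memory and reads big-endian 32-bit values straight out of the mapping, without parsing anything up front. The file layout (a count followed by that many integers) is a toy format I made up for this example. It is not the proposed mime.cache layout, and write_toy_cache and read_toy_cache are hypothetical names.

```python
import mmap
import os
import struct
import tempfile


def write_toy_cache(path, values):
    """Write a toy cache: a big-endian uint32 count, then the values."""
    with open(path, "wb") as f:
        f.write(struct.pack(">I", len(values)))
        for value in values:
            f.write(struct.pack(">I", value))


def read_toy_cache(path):
    """Map the file read-only and decode the values directly from the mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # ">I" decodes a big-endian uint32 regardless of the host byte
            # order; on a little-endian machine this is the conversion step.
            (count,) = struct.unpack_from(">I", mm, 0)
            return [struct.unpack_from(">I", mm, 4 + 4 * i)[0] for i in range(count)]


with tempfile.TemporaryDirectory() as tmp:
    toy_path = os.path.join(tmp, "toy.cache")
    write_toy_cache(toy_path, [1, 258, 0xDEADBEEF])
    values = read_toy_cache(toy_path)
```

On a little-endian machine every such read involves a byte swap, which is the cost the benchmark would have to measure. A Python sketch like this says nothing about how expensive the swap is in the C implementation.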

Anyway, it's nice to see some useful activity on the xdg mailing list again, after all those "discuss it to death" threads about D-VFS and gconf.

Edit: Additional notes can be found in this mail.