the wood : davidrobins.net

MythVideo upgrade: better metadata

News, Technical, Media · Friday November 27, 2009 @ 23:30 EST

I recently upgraded the MythTV box to the current Gentoo ebuilds, including MythTV 0.22 itself. I did this to get some new features of MythVideo, including improved metadata storage and lookup for videos and TV shows. The new themes look nice too, for the most part; sometimes text is too small, and since I don't have cover art/posters for most of my movies/shows there are a lot of very big images of question marks (or blanks). But that will be remedied. The included scripts in /usr/share/mythtv/mythvideo/scripts (mainly tmdb.pl and ttvdb.py), which look up movies and episodes using themoviedb.org and TheTVDB.com respectively, are a reasonably good place to start, but they don't conform too well to my naming conventions. Fortunately they are both in languages I know extremely well: open source means you can make things better, and fairly easily (even if it's just a local benefit); it's a knowledge economy.

Having posters for most movies makes the MythVideo browsing UI pretty nice. By default, the shortcut key W invokes one of the two scripts above (configurable, of course, so I could fix the scripts or replace them with my own), which pulls metadata from one of those two free sites (there are other scripts that use IMDb or other sites as sources). The scripts have a well-defined interface for how information is communicated between them and MythTV.

For instance, it would be nice to be able to select the poster used for a movie visually. Is it possible to pop up a UI from a script within an existing X session? I don't see any reason why not; and indeed, the PyGTK tutorial's hello world script works great. (I went with GTK because Tk and Motif and wx and the older ones look butt-ugly and have severely limited functionality.) It can even be done from an ssh session as long as it's prefaced by export DISPLAY=:0.0 (as appropriate). (It's lousy that it segfaults if DISPLAY isn't set, though.)
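Since an unset DISPLAY is what causes the crash, one way to guard against it is to set a fallback before launching any GUI helper. The following is only a sketch: gui_environment and run_picker are hypothetical names, poster_picker.py stands in for whatever PyGTK script would show the posters, and :0.0 is the display mentioned above, which may differ on other setups.

```python
import os
import subprocess
import sys


def gui_environment(environ=None, default_display=":0.0"):
    """Return a copy of the environment with DISPLAY guaranteed to be set.

    An existing DISPLAY value is kept; the fallback is only used when the
    variable is missing or empty (e.g. in a plain ssh session).
    """
    env = dict(os.environ if environ is None else environ)
    if not env.get("DISPLAY"):
        env["DISPLAY"] = default_display
    return env


def run_picker(script="poster_picker.py"):
    """Run a (hypothetical) GUI script in a child process.

    Running it as a separate process means that a crash in the GUI toolkit
    does not take the calling metadata script down with it.
    """
    return subprocess.call([sys.executable, script], env=gui_environment())
```

Calling run_picker() from an ssh session would then have the same effect as prefacing the command with export DISPLAY=:0.0 by hand.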

So what's wrong with the tmdb.pl and ttvdb.py scripts? Why not use them and be happy? Perhaps I can, but these are the hurdles:
  1. They don't understand my naming convention. For example, I group some movies into directories and remove redundancies, e.g. Harry Potter/1 The Philosopher's Stone (the 1 keeps the order correct; there isn't a sort override field in the videometadata database). The script tries to find a movie called "1 The Philosopher's Stone" rather than keeping the directory and stripping the numeric prefix. If I used the full names (Harry Potter and …) they'd still be out of order (alpha-sorted). Possibly fixable by modifying the script itself, but then I'm doomed when I upgrade.

  2. It picks the first movie poster available; I'd like to be able to select one (see above regarding PyGTK). E.g., the first poster for Transformers: Revenge of the Fallen is pretty dark and gloomy and uninformative.

  3. When the database is rescanned, old metadata is destroyed and has to be re-fetched from scratch, meaning unnecessary and slow network traffic, and items with multiple matches have to be user-selected again. Granted, one fix is to just not run "Scan for changes", although I don't know what MythTV will do to the database in future. But this can be overcome by automated database backups.

  4. I want to tie it in with my DVD watch list.
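The naming problem in item 1 could be handled before the lookup script ever sees the name, which would avoid patching tmdb.pl itself. Here is a minimal sketch, assuming that a one- or two-digit prefix inside a directory is always an ordering prefix and that the directory name is the part of the title that was removed as redundant. The function name search_title and the list of extensions are made up for illustration.

```python
import posixpath
import re

# Extensions to strip; an assumed list, not taken from MythVideo's settings.
VIDEO_EXTENSIONS = {".avi", ".mkv", ".mp4", ".mpg", ".iso"}

# One or two digits followed by whitespace, e.g. the "1 " in
# "1 The Philosopher's Stone". Limited to two digits so that a title
# such as "2001 A Space Odyssey" is left alone.
ORDER_PREFIX = re.compile(r"^\d{1,2}\s+")


def search_title(relative_path):
    """Turn a path below the video directory into a title to search for."""
    directory, filename = posixpath.split(relative_path)
    stem, ext = posixpath.splitext(filename)
    if ext.lower() not in VIDEO_EXTENSIONS:
        # Not a known extension, so the dot is part of the title.
        stem = filename
    match = ORDER_PREFIX.match(stem)
    if not directory or match is None:
        # Ungrouped or unnumbered movie: the name is already the title.
        return stem
    group = posixpath.basename(directory)
    return group + " " + stem[match.end():]


print(search_title("Harry Potter/1 The Philosopher's Stone"))
# prints: Harry Potter The Philosopher's Stone
```

The result still lacks the "and the" of the real title, so it is only useful as a search query; whether the lookup sites match it well enough would need testing.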

There's a useful new utility called Jamu (Just Another Metadata Updater—really, those names aren't clever any more, people, although at least it provides a recognizable and unique program name). It can do lookups for various metadata and artwork, so I'll certainly make use of it in my own utilities.

Aside from the MythTV stuff there was the usual collection of blocking package issues which eventually got sorted. The Myth box needed 430 packages; then I decided to emerge -uDNv world on the server too, which also had its share of issues. For a while I had some corruption on torrents created by one machine on a (Samba) shared directory on another. When I upgraded both to kernel 2.6.31 the problem went away; I'd narrowed it down to Samba (vs. LVM or something else) being the culprit, and there were fixes for CIFS corruption between my old (2.6.28) kernel and this one.

Speaking of database backups above, I borrowed a script from another machine and started automatic backups of the MythTV database. I suppose I can just use the videometadata table if I can be confident that it won't randomly disappear, or if it does, that I can get it back with minimal hassle. (I never actually need to rescan for new content, since I already have programs that monitor my video directory with inotify and add videometadata entries appropriately, and a non-destructive scanner.)
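A backup of just the videometadata table can be sketched as follows. This is not the borrowed script, only an illustration: it assumes the default MythTV database name mythconverg, MySQL credentials supplied through ~/.my.cnf, and a hypothetical backup directory /var/backups/mythtv.

```python
import datetime
import subprocess


def backup_command(when, database="mythconverg", table="videometadata",
                   backup_dir="/var/backups/mythtv"):
    """Build a mysqldump command line for one table.

    Returns (argv, outfile); the output name carries a timestamp so that
    older dumps are never overwritten.
    """
    stamp = when.strftime("%Y%m%d-%H%M%S")
    outfile = "{}/{}-{}-{}.sql".format(backup_dir, database, table, stamp)
    argv = ["mysqldump", "--result-file=" + outfile, database, table]
    return argv, outfile


def run_backup():
    """Dump the table now; raises CalledProcessError if mysqldump fails."""
    argv, outfile = backup_command(datetime.datetime.now())
    subprocess.check_call(argv)
    return outfile
```

Run from cron, run_backup() would leave one dated .sql file per invocation, and restoring the table is then a matter of feeding the chosen file back to the mysql client.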

Given the current cost of HD space (e.g. an external USB 1.5T drive is a little over $100 today at Newegg.com), it's almost as cheap to record data (show episodes, or whole DVDs) on a (potential chain of) USB HDs (possibly linked as a logical volume, or symlinked from a common directory) as it would be to use DVD±Rs or Blu-Ray disks (given their stage of development: 8x and 12x burners are rare and expensive, especially external ones, burn speeds are slow, and penetration is low compared to DVDs). Perhaps Bob is onto something, at the current price point.

I also installed MythMusic, but need a tiny utility to set ID3 tags properly (just something simple that uses the directory and filename); Mutagen looks like a promising Python ID3 library for that. When the TV suspends or powers down it also turns off the music (since it goes through the TV to the amp), so I increased the blanking time in xorg.conf.
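Such a tagging utility might look roughly like this. The directory layout (Artist/Album/NN Title.mp3) is purely an assumption for the example, as are the function names; the Mutagen calls used are mutagen.File with easy=True, add_tags and save, and Mutagen is only imported when tags are actually written.

```python
import posixpath
import re

# Optional leading track number, e.g. "07 Mr. Blue Sky" or "07 - Mr. Blue Sky".
TRACK_PATTERN = re.compile(r"^(?P<track>\d+)[\s._-]+(?P<title>.+)$")


def tags_from_path(relative_path):
    """Derive tags from an assumed Artist/Album/NN Title.ext layout."""
    directory, filename = posixpath.split(relative_path)
    stem, _ext = posixpath.splitext(filename)
    tags = {}
    parts = [part for part in directory.split("/") if part]
    if parts:
        tags["album"] = parts[-1]
    if len(parts) >= 2:
        tags["artist"] = parts[-2]
    match = TRACK_PATTERN.match(stem)
    if match:
        tags["tracknumber"] = str(int(match.group("track")))
        tags["title"] = match.group("title")
    else:
        tags["title"] = stem
    return tags


def write_tags(music_root, relative_path):
    """Write the derived tags to the file using Mutagen's "easy" interface."""
    import mutagen  # third-party; imported here so the parser works without it

    audio = mutagen.File(posixpath.join(music_root, relative_path), easy=True)
    if audio is None:
        raise ValueError("unsupported file type: " + relative_path)
    if audio.tags is None:
        audio.add_tags()  # file had no tag block yet
    for key, value in tags_from_path(relative_path).items():
        audio[key] = value
    audio.save()
```

Keeping the path parsing separate from the writing makes it easy to dry-run the parser over the whole music directory and eyeball the results before touching any files.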

Useful modules: IMDb lookup in Perl (IMDB::Film; tested, works) and Python (imdbpy); and a PostgreSQL module for Python 3 (which is what I'm using, although modules can be hard to find since it's relatively new).