Piperka blog

Archive thumbnails

I implemented a new feature for Piperka: archive thumbnails. So far, there's only one comic I've enabled it for: Pepper & Carrot. The page number count has been linkified and clicking it will open two dialog windows, one with a listing of archive pages and a second one with thumbnails. Thumbnails would work better with a comic with a fixed page size but this is all I have for now.

The thumbnails are generated with Selenium, which renders each page the way a regular browser would (it does indeed drive one behind the scenes) and saves a screenshot. The screenshot is then scaled down and compressed, both to save space and to allow showing the whole archive in a single view. The small size also makes sure that the thumbnails can't actually be used to read the comic.
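For the curious, here is a minimal sketch of that kind of pipeline. It is not Piperka's actual code. It assumes Python with the selenium and Pillow packages and a local Firefox with geckodriver, and the function names, window size, thumbnail width and JPEG quality are all made up for illustration.

```python
import io


def scaled_size(width, height, max_width=160):
    """Return (width, height) shrunk to fit max_width, keeping the aspect ratio."""
    # Never scale up: images already narrower than max_width keep their size.
    scale = min(1.0, max_width / width)
    return max(1, round(width * scale)), max(1, round(height * scale))


def save_thumbnail(url, out_path, max_width=160):
    """Render url in headless Firefox and save a small JPEG of the screenshot."""
    # Imported here so that scaled_size() works without these packages installed.
    from PIL import Image
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")
    driver = webdriver.Firefox(options=options)
    try:
        # Hypothetical viewport; a real run would tune this per comic.
        driver.set_window_size(1280, 1024)
        driver.get(url)
        png = driver.get_screenshot_as_png()
    finally:
        # Always shut the browser down, even if the page failed to load.
        driver.quit()

    # JPEG has no alpha channel, so convert before saving.
    image = Image.open(io.BytesIO(png)).convert("RGB")
    image = image.resize(scaled_size(image.width, image.height, max_width))
    image.save(out_path, "JPEG", quality=60)
```

Calling something like `save_thumbnail("https://example.com/comic/1", "1.jpg")` would write one thumbnail. Starting a fresh browser per page is the simplest thing to write but slow, so a run over a whole archive would more likely reuse one driver.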

I got a bit enthused about this feature while planning and implementing it, but now I'm a bit uncertain about how to proceed. I certainly would like to go ahead and download and compress thumbnails for all the 2.2 million archive pages indexed on Piperka. I think I would have a pretty good case for fair use with what I'm doing, as my use is transformative and it doesn't subtract from the content's original intended use, that is, reading. But I'm subject to Finnish and EU copyright laws and practices, not US ones, and fair use isn't a recognized concept over here.

I generally like living in this part of the world, but the EU's increasing copyright maximalism doesn't make me feel like singing Ode to Joy. I would expect that most authors wouldn't mind me generating thumbnails. It's not hard to find most of their comics copied on archive.org, and they have the originals in full size. It's a nice idea that I'd ask all the authors, but at this scale and with me doing it alone it's more a matter of "can't" than "won't". Many of them likely wouldn't respond even if they were fine with my use. Some might even be more annoyed to have me contact them at all and would rather have me do whatever I do without bothering them.

Regardless of copyrights, I'd be certain to drop the thumbnails for a comic on request. I couldn't be running Piperka without web comic artists' goodwill and that's not codified in any law. I'd just like it if I could assume that I had a better default position with fair use. If you're an author and would like to have thumbnails generated for your comic, then feel free to drop me an email. I just won't get very far with this feature if I make it opt-in and wait for authors to contact me.

I'd love to hear your opinions about this feature. Especially if you're an author.

Even without thumbnails, the archive dialog can now be opened for all comics on the info pages. I haven't stored the titles for any of the archive pages, so the text shown for each page is just a part of its raw archive URL. It's a bit crude but it works. The same dialog was available on Reader all along and I did plan to add it to the info page, but I never got back to it until now.
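To illustrate what "a part of the raw archive URL" can look like as a label, here is a small sketch in Python. It assumes that every page URL of a comic shares a fixed prefix and suffix around a per-page part. That layout, the function name and the example URLs are assumptions for illustration, not a description of how Piperka stores its data.

```python
def page_label(url, prefix, suffix=""):
    """Return the per-page part of url, or the whole url if it doesn't fit."""
    if url.startswith(prefix) and url.endswith(suffix):
        end = len(url) - len(suffix)
        # Guard against the prefix and suffix overlapping in a short url.
        if end >= len(prefix):
            return url[len(prefix):end]
    # Fall back to the full URL rather than showing an empty or wrong label.
    return url


label = page_label(
    "https://example.com/comic/042.html",
    prefix="https://example.com/comic/",
    suffix=".html",
)
```

Here `label` ends up as `"042"`, which is crude as a title but enough to tell pages apart in a list.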

My next development goal is to add more automation and better reporting to crawl issue detection. With the recent crawler update I have much more data available on its actions in an easily processable form, and I would do well to have an interface for reviewing it. I should also add an extra periodic run to check on the health of comics that have been quiet for a long time. It would tell me plenty if downloading an old page with a known following page and parsing it failed to find that following page.
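The health check idea can be sketched roughly as follows, in Python with only the standard library. The real crawler has its own parsers, so the generic "does any link on the page point at the known next page" test below is a stand-in, and all the names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to(html, page_url, expected_next_url):
    """Return True if some link in html resolves to expected_next_url."""
    collector = LinkCollector()
    collector.feed(html)
    # Resolve relative links against the page they were found on.
    return any(urljoin(page_url, href) == expected_next_url
               for href in collector.hrefs)


def comic_looks_healthy(page_url, expected_next_url):
    """Download an old page and check that it still leads to the known next one."""
    with urlopen(page_url, timeout=30) as response:
        html = response.read().decode("utf-8", errors="replace")
    return links_to(html, page_url, expected_next_url)
```

If `comic_looks_healthy` returns False for a page that used to parse fine, either the site layout has changed or the page has moved, and the comic is worth a manual look. A download error raises an exception instead, which a periodic run would want to catch and report separately.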

Thu, 15 Nov 2018 15:10:51 UTC