Omnimaga

General Discussion => Technology and Development => Computer Projects and Ideas => Topic started by: TC01 on November 20, 2010, 05:21:03 pm

Title: Python script to download all xkcds
Post by: TC01 on November 20, 2010, 05:21:03 pm
Well, actually, this is a Python library for accessing xkcd.

It contains five scripts, one of which downloads every xkcd onto your computer (I haven't actually downloaded all ~850 yet, though). Two others open a random or the latest xkcd in your web browser, and the last two download a random or the latest xkcd to your computer.

Plus, you can use the functions available in the library for doing other things.

You need Python 2.x (2.7 is the latest version) to install: you can get it here (http://www.python.org/download/).

As of version 1.1, you no longer need the feedparser module. I've modified the library to use xkcd's JSON interface, which is good for two reasons: the RSS feed sometimes has things in it that aren't comics, which broke the old implementation, and it removes the only third-party dependency.

The download of xkcd.zip includes seven files (minus the readme, which has all this text):

-dowallxkcd.py: script to download all xkcds
-dowlastxkcd.py: script to download latest xkcd
-dowrandxkcd.py: script to download random xkcd
-lastxkcd.py: script to open the latest xkcd in browser
-randxkcd.py: script to open a random xkcd in browser
-xkcd.py: the library itself
-xkcd-1.0.zip: a zipped Python package - if you know how to install Python packages just install this

To install manually on Windows, you'd put xkcd.py in C:\Python27\lib\site-packages (wherever Python is installed, then \lib\site-packages). The scripts can go anywhere.

To install manually on Linux... it goes to the same lib/site-packages folder, but I'm not sure where this is- either /lib/python/ or /usr/lib/python, probably.
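
If you're not sure where site-packages lives on your system, Python can report it itself. This is only a minimal sketch using the standard sysconfig module (present in Python 2.7 and 3.x); site_packages_dir is a name made up for this example, and the path it returns depends entirely on your install.

```python
import sysconfig


def site_packages_dir():
    # "purelib" is the directory where pure-Python modules
    # (such as a single xkcd.py file) are normally installed.
    return sysconfig.get_paths()["purelib"]
```

Calling print(site_packages_dir()) from the interpreter you plan to use shows the folder to copy xkcd.py into.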
Title: Re: Python script to download all xkcds
Post by: yunhua98 on November 20, 2010, 05:23:26 pm
Nice!  /me will download once he frees up some disk space.  :P
Title: Re: Python script to download all xkcds
Post by: DJ Omnimaga on November 20, 2010, 05:33:59 pm
Nice, I hope if this is popular that this doesn't overload their servers, though. O.o (eg: if someone decided to post the script on 4chan or another ultra-busy community site)
Title: Re: Python script to download all xkcds
Post by: TC01 on November 21, 2010, 06:54:08 pm
Reuploaded with a readme (that I realized I omitted previously)- it contains everything in the first post as well as some legal disclaimers.
Title: Re: Python script to download all xkcds
Post by: DJ Omnimaga on November 21, 2010, 07:24:55 pm
Ah cool! :D
Title: Re: Python script to download all xkcds
Post by: ruler501 on January 17, 2011, 01:27:33 pm
Very nice, I will have to use this. Is there any way to change the website it accesses and download from others?
Title: Re: Python script to download all xkcds
Post by: SirCmpwn on January 17, 2011, 02:02:07 pm
Does this handle this (http://www.xkcd.com/404)?
Title: Re: Python script to download all xkcds
Post by: DJ Omnimaga on January 17, 2011, 02:14:54 pm
Lol SirCmpwn XD

It would be funny if the script actually found a comic hidden there. :P
Title: Re: Python script to download all xkcds
Post by: jnesselr on January 17, 2011, 03:03:36 pm
Nope, none hidden there.  But there is a comic hidden deep within xkcd.
Title: Re: Python script to download all xkcds
Post by: Builderboy on January 17, 2011, 03:21:20 pm
There is? o.O I bet this script would fail for some of the newer interactive comics ;)
Title: Re: Python script to download all xkcds
Post by: SirCmpwn on January 17, 2011, 03:26:42 pm
graphmastur, you will now share all knowledge you have on the subject, or face my wrath.
Title: Re: Python script to download all xkcds
Post by: jnesselr on January 17, 2011, 05:04:28 pm
lol, figure it out. I'll give you a hint.  You must see the source for uni.xkcd.com, and your scouter must be broken, since obviously the konami code is broken as well.
Title: Re: Python script to download all xkcds
Post by: SirCmpwn on January 17, 2011, 05:07:13 pm
Oh, I've seen the unixkcd source before, but I'm not able to see it from my phone.  Could you do me the favor of putting the beautified code in a spoiler, seeing as it may be quite a while before I can see it?
Title: Re: Python script to download all xkcds
Post by: jnesselr on January 17, 2011, 05:37:32 pm
sure:
Spoiler For Spoiler:
THE GAME (https://github.com/chromakode/xkcdfools/raw/master/src/over9000.png)
Click if you dare. :D
Title: Re: Python script to download all xkcds
Post by: TC01 on January 17, 2011, 05:41:05 pm
To answer the original question(s). :P

It does handle 404: all Internet access runs in a try-except block, so when the request for 404 fails, the script simply skips over it.
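
For anyone curious what that pattern looks like, here is a rough sketch of the idea. It is not the library's actual code: fetch_or_none, download_all, and the opener/fetch parameters are hypothetical names for this example, and the import fallback is only there so it runs on both Python 2 and 3.

```python
try:
    from urllib.request import urlopen      # Python 3
    from urllib.error import URLError
except ImportError:
    from urllib2 import urlopen, URLError   # Python 2


def fetch_or_none(url, opener=urlopen):
    """Return the response body for url, or None if the request fails."""
    try:
        response = opener(url)
        try:
            return response.read()
        finally:
            response.close()
    except (URLError, IOError):
        # HTTPError (such as the 404 from xkcd.com/404) subclasses URLError.
        return None


def download_all(last, fetch=fetch_or_none):
    """Collect page bodies for comics 1..last, skipping any that fail."""
    pages = {}
    for number in range(1, last + 1):
        body = fetch("http://xkcd.com/%d/" % number)
        if body is None:
            continue  # e.g. number 404, which has no comic
        pages[number] = body
    return pages
```

Because a failure is turned into None rather than an exception, one missing page never stops the loop.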

It can't be changed to use other websites, because it depends on the way xkcd works: the page source saying "direct link to comic", the RSS feed, and the page numbers for comics, such as "xkcd.com/843".
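
The numbered pages are what make something like a "random xkcd" script simple. A small sketch of that idea, assuming you already know the latest comic number; random_comic_url and its rng parameter are made up for illustration and are not the library's functions.

```python
import random


def random_comic_url(latest, rng=random):
    """Pick a comic number in 1..latest, avoiding 404, and build its URL."""
    number = rng.randint(1, latest)
    while number == 404:
        # 404 is the one number with no comic behind it, so draw again.
        number = rng.randint(1, latest)
    return "http://xkcd.com/%d/" % number
```

Passing the result to webbrowser.open() from the standard library would open it in your default browser.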
Title: Re: Python script to download all xkcds
Post by: TC01 on February 06, 2011, 12:34:17 pm
I've released a 1.1 update to this library (and replaced the attachment in the first post).

The only change this makes: it now uses xkcd's built-in JSON interface rather than the RSS feed to get the latest comic number. This is good for two reasons:

1. It removes the third-party dependency (you now only need Python 2.x)
2. It stops the script from breaking when the RSS feed contains things other than comics (as it occasionally does).
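
Roughly, getting the latest comic number from the JSON interface could look like the sketch below. This is an illustration rather than the code in xkcd.py: it assumes the JSON document for the latest comic is served at xkcd.com/info.0.json with the number under the "num" key, and parse_latest_number, latest_comic_number, and the fetch hook are hypothetical names.

```python
import json

try:
    from urllib.request import urlopen   # Python 3
except ImportError:
    from urllib2 import urlopen          # Python 2

# Assumed location of the JSON document describing the latest comic.
LATEST_URL = "http://xkcd.com/info.0.json"


def parse_latest_number(raw):
    """Extract the comic number from the JSON text for the latest comic."""
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    return int(json.loads(raw)["num"])


def latest_comic_number(fetch=None):
    """Return the newest comic number; fetch is a hook for offline testing."""
    if fetch is None:
        fetch = lambda url: urlopen(url).read()
    return parse_latest_number(fetch(LATEST_URL))
```

Only the standard library is involved, which is the point of dropping the feed parser.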
Title: Re: Python script to download all xkcds
Post by: DJ Omnimaga on February 07, 2011, 11:45:22 pm
Cool to hear. Is it less intensive on their server, too? Also by different stuff do you mean for example XKCD #404?
Title: Re: Python script to download all xkcds
Post by: TC01 on February 09, 2011, 04:37:44 pm
Well, I'm not sure whether it's more or less intensive on their server. I don't know how the cost of serving the JSON compares with serving the RSS feed.

As for "different stuff"- no, not like 404, but sometimes they post information about their site store or other miscellaneous news in the RSS as opposed to new comics.