Okay, so I know everyone’s written some kind of web scraper in their time, but
I’m still proud of my own take on the subject.
Recently, I saw an interesting post on Reddit that provided some source code for
scraping images based on the
similarity of their names. This might be useful for downloading all of the
images in a gallery (for example, if you want to keep something posted on Imgur)
without having to follow all of the links by hand. However, there was quite a
lot of it missing, it relied on some random site, and it was bad at handling all
kinds of common use cases: for example, thumbnails.
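
To make the name-similarity idea concrete, here is a minimal sketch. It is not the code from the Reddit post or from galleryscraper.py; the function names, the example URLs, and the 0.8 threshold are all assumptions chosen for illustration, and the comparison uses the standard library's `difflib`.

```python
from difflib import SequenceMatcher
from posixpath import basename
from urllib.parse import urlparse


def filename(url):
    """Return the last path component of a URL, e.g. 'photo_001.jpg'."""
    return basename(urlparse(url).path)


def similar_images(seed_url, candidate_urls, threshold=0.8):
    """Keep the candidates whose file name resembles the seed's file name.

    The threshold is an arbitrary illustrative value, not one taken from
    the original scraper.
    """
    seed = filename(seed_url)
    return [
        url
        for url in candidate_urls
        if SequenceMatcher(None, seed, filename(url)).ratio() >= threshold
    ]


if __name__ == "__main__":
    links = [
        "http://example.com/gallery/photo_002.jpg",
        "http://example.com/gallery/photo_003.jpg",
        "http://example.com/static/logo.png",
    ]
    for url in similar_images("http://example.com/gallery/photo_001.jpg", links):
        print(url)
```

A purely name-based filter like this would also pick up thumbnails when they share the naming pattern of the full-size images, which is one of the cases mentioned above that needs separate handling.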
So I rewrote the script. And expanded it. Massively. The current usage output is:

    Usage:
        galleryscraper.py URL DIR [--threads N --log-level N -q -s]
        galleryscraper.py -h | --help | --version

    Options:
        --threads N            the number of threads to use [default: 4]
        -V, --log-level N      the level of info logged to the console, which can be
                               one of INFO, DEBUG, or WARNING [default: INFO]
        -s, --skip-duplicates  ignore files that have been downloaded already
        -q, --quiet            suppress output to console
        -v, --version          show program's version number and exit
        -h, --help             show this help message and exit

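
For readers who want to see how an interface like this could be wired up, the following is a rough sketch using the standard library's `argparse`. The excerpt does not say how galleryscraper.py actually parses its arguments, so `build_parser`, the destination names, and the placeholder version string are assumptions; only the flags and defaults come from the usage text.

```python
import argparse


def build_parser():
    """Build a parser mirroring the usage text (a sketch, not the real script)."""
    parser = argparse.ArgumentParser(prog="galleryscraper.py")
    parser.add_argument("url", metavar="URL", help="page to scrape")
    parser.add_argument("directory", metavar="DIR", help="where to save images")
    parser.add_argument("--threads", type=int, default=4, metavar="N",
                        help="the number of threads to use")
    parser.add_argument("-V", "--log-level", default="INFO", metavar="N",
                        choices=["INFO", "DEBUG", "WARNING"],
                        help="the level of info logged to the console")
    parser.add_argument("-s", "--skip-duplicates", action="store_true",
                        help="ignore files that have been downloaded already")
    parser.add_argument("-q", "--quiet", action="store_true",
                        help="suppress output to console")
    # The version string here is a placeholder, not the script's real version.
    parser.add_argument("-v", "--version", action="version",
                        version="%(prog)s (version placeholder)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Running it as `galleryscraper.py http://example.com/gallery out --threads 8 -s` would yield `threads=8` and `skip_duplicates=True`, with the remaining options left at their defaults.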
Continue Reading →
I recently tried to get an IRC server working on an installation of Arch Linux,
and it failed to work “out of the box”. Below are the steps I took to get it
working, including the work I did to get an SSL connection up and running.
Continue Reading →
This post is an inquiry into some of the drawbacks of using the random()
function to generate all your random values, and it discusses the circumstances
in which the Beta distribution might prove a compelling alternative. It is
written with procedural games in mind; there is no expectation that the reader
know what a ‘distribution’ is, and all of the code examples are written in Python.
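
As a small taste of the difference, here is a sketch that is not taken from the post itself. It compares `random()` with the standard library's `betavariate()`; the helper names and the shape parameters (alpha=2, beta=5) are assumptions picked only to show a skewed result.

```python
import random


def uniform_stat(rng):
    """Every value between 0 and 1 is equally likely."""
    return rng.random()


def beta_stat(rng, alpha=2.0, beta=5.0):
    """Values still fall between 0 and 1, but cluster according to the shape.

    With alpha < beta, low values are more common than high ones.
    """
    return rng.betavariate(alpha, beta)


def mean_of(sampler, n=10000, seed=0):
    """Average n samples drawn with a seeded generator, for repeatability."""
    rng = random.Random(seed)
    return sum(sampler(rng) for _ in range(n)) / n


if __name__ == "__main__":
    print("uniform mean:", mean_of(uniform_stat))
    print("beta mean:   ", mean_of(beta_stat))
```

The uniform samples average out near 0.5, while a Beta distribution's mean is alpha / (alpha + beta), so these parameters pull the average down towards 2/7. In a game, that could mean most generated values are modest while the occasional high one still turns up.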
Continue Reading →
![NoisyPy Perlin Demo](./images/noisypy-perlin.png)
I’ve gotten a good chunk of the Libnoise wrapper working. And despite several
sessions of head-against-wall bugs, my SIP and distutils setup seems to be
working nicely. Some of the C++ code in noiseutils.h (which comes with the
Libnoise examples) is quite outdated, so I wrote several of my own output
functions in pure C++. This also gave me the opportunity to write exporters to
OpenGL textures, which seem to work seamlessly with Pyglet. I should have a
whole directory of examples up and running shortly, drawing on the standard
Libnoise examples and adding demonstrations of OpenGL textures.
Continue Reading →
Note: this post can be read as a series of terminal commands if you’re not
interested in the intervening explanations.
Recently, I started writing a Python extension that wraps the procedural
generation library libnoise using SIP-generated code, and encountered the
inevitable segfaults. It didn’t take long to establish that the error was
somewhere in my C++ code, but I couldn’t figure out where exactly it was. A
couple of searches online revealed that the way to go about finding these
things was to use a proper debugger, and because the GNU Debugger, `gdb`,
supports Python, that seemed to be the way to go.
Getting it working was not, however, particularly straightforward. gdb supports
debugging Python code as of version 7. Unfortunately, Apple’s version of gdb is
currently sitting at 6.3. So all I had to do was get a newer version, right?
Wrong.
Continue Reading →