Unconstant Conjunction latest posts

Parsing tape data with Python

Recently I had to use an older dataset that hadn’t been nicely sanitized into something that Stata or Pandas could understand. In particular, I was working with the NHANES I Epidemiologic Follow-up Study, which has data available only in the original tape format from the 1980s and early 1990s. An individual record from one of the smaller files looks like this (scroll right for the full effect):

  9220809 511 12 0112 996102410232091442411 1 2 9 486 21400520 230031142750215185031486 0 03 42750486 051850

Every number might be a single variable or part of a multi-column variable.

Continue Reading →
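As a rough illustration of the general idea (not the code from the full post), a fixed-width record can be cut into variables by column position. The field names, column ranges, and sample record below are made up for the example; they are not taken from the NHEFS codebook.

```python
from typing import NamedTuple, Optional


class Field(NamedTuple):
    name: str
    start: int  # first column, 1-based
    end: int    # last column, 1-based and inclusive


# Hypothetical layout, for illustration only.
LAYOUT = [
    Field("sample_id", 1, 5),
    Field("sex", 6, 6),
    Field("age", 7, 8),
    Field("weight", 9, 12),
]


def parse_record(line: str, layout: list) -> dict:
    """Slice one fixed-width line into a dict of integer fields.

    Columns that are entirely blank are returned as None.
    """
    record = {}
    for field in layout:
        raw = line[field.start - 1:field.end].strip()
        value: Optional[int] = int(raw) if raw else None
        record[field.name] = value
    return record


# A made-up 12-column record matching the layout above.
print(parse_record("12345234 715", LAYOUT))
```

Keeping the layout as data, separate from the slicing logic, means a different file only needs a different list of fields.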

Moving to Pelican

I recently moved the site over to Pelican. Although I liked Octopress, it wasn’t working exactly like I wanted it to, and I’m not comfortable enough working with Ruby to modify it. Since Python has become my go-to language for most things these days, it made sense to move to what seems to be the most popular static site generator written in that language. For posterity’s sake, the following is a guide based on the steps I took to get it working. I’ve also included most of my configuration files and a short program for creating new posts.

Continue Reading →

Writing a Web Scraper

Okay, so I know everyone’s written some kind of web scraper in their time, but I’m still proud of myself for my own take on the subject.

Recently, I saw an interesting post on Reddit that provided some source code for scraping images based on the similarity of their names. This might be useful for downloading all of the images in a gallery (for example, if you want to keep something posted on Imgur) without having to follow all of the links by hand. However, quite a lot of the code was missing, it relied on some random site, and it handled many common cases badly (thumbnails, for example).

So I rewrote it. And expanded it. Massively. The current usage output is:

  galleryscraper.py URL DIR [--threads N --log-level N -q -s]
  galleryscraper.py -h | --help | --version

      --threads N        the number of threads to use [default: 4]
  -V, --log-level N      the level of info logged to the console, which can be
                         one of INFO, DEBUG, or WARNING [default: INFO]
  -s, --skip-duplicates  ignore files that have been downloaded already
  -q, --quiet            suppress output to console
  -v, --version          show program's version number and exit
  -h, --help             show this help message and exit
Continue Reading →

Using More Interesting Random Values in Procedural Content

This post is an inquiry into some of the drawbacks of using the random() function to generate all your random values, and it discusses the circumstances in which the Beta distribution might prove a compelling alternative. It is written with procedural games in mind. There is no expectation that the reader know what a ‘distribution’ is, and all of the code examples are written in Python.
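To give a flavour of the difference, here is a minimal sketch using only the standard library. The shape parameters (2 and 5) and the helper names are arbitrary choices for this illustration, not values taken from the full post.

```python
import random


def uniform_value(rng: random.Random) -> float:
    """Every value in [0, 1) is equally likely."""
    return rng.random()


def beta_value(rng: random.Random, alpha: float = 2.0, beta: float = 5.0) -> float:
    """Values still fall in [0, 1], but with alpha < beta they cluster
    toward the low end, and high values become rare rather than impossible."""
    return rng.betavariate(alpha, beta)


def mean(values: list) -> float:
    return sum(values) / len(values)


rng = random.Random(42)  # seeded so repeated runs give the same numbers
uniform_samples = [uniform_value(rng) for _ in range(10_000)]
beta_samples = [beta_value(rng) for _ in range(10_000)]

# The uniform mean sits near 0.5; the Beta(2, 5) mean sits near 2/7.
print(round(mean(uniform_samples), 2), round(mean(beta_samples), 2))
```

Swapping the two shape parameters mirrors the distribution, so most values land near 1 instead of near 0.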

Continue Reading →

An Update on the Libnoise Wrapper

NoisyPy Perlin Demo

I’ve gotten a good chunk of the Libnoise wrapper working. And despite several sessions of head-against-wall bugs, my SIP and distutils setup seems to be working nicely. Some of the C++ code in noiseutils.h (which comes with the Libnoise examples) is quite outdated, so I wrote several of my own output functions in pure C++. This also gave me the opportunity to write exporters to OpenGL textures, which seem to work seamlessly with Pyglet. I should have a whole directory of examples up and running shortly, drawing from the standard Libnoise examples as well as demonstrations of OpenGL textures.

Continue Reading →