So, I wanted to work on Julia’s sampling problem and, at the same time, add some functionality to iCrawl by making better use of the HTML parser included in Python’s standard library. For even a slightly experienced programmer, this would be child’s play, methinks. But since I have never adequately grasped the whole idea of Object-Oriented Programming, I feel a bit like I’m flying blind. I have no problem talking about classes and self in the social context, but when it comes to Python classes and references to self, I feel like an idiot every time. Basically, I end up cutting and pasting cookbook examples and praying they will work. Someday, I’ll reserve some time to go through the appropriate chapters in something like How to Think Like a Computer Scientist, and maybe that will help. Maybe.
In the end, though, the error (isn’t it always?) was just a dumb mistake, having nothing to do with OOP. When the crawler read in the config file, it converted everything to lower case for ease of processing. This would probably work about 95% of the time, since most folks don’t have caps in their URLs. Unfortunately, here it made all the difference: URL paths can be case-sensitive, so a lowercased URL can point somewhere else entirely. And it only took me a few hours of banging my head against the monitor to figure it out.
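
For what it’s worth, here is a minimal sketch of one way to avoid this kind of bug. It is not iCrawl’s actual code: `normalize_url` is a hypothetical helper, and the example URL is made up. The idea is to lowercase only the parts of a URL that are case-insensitive (the scheme and the host) and leave the path, query, and fragment exactly as written in the config file.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Lowercase only the scheme and host of a URL (hypothetical helper).

    Assumes the URL carries no user:password@ credentials, since
    lowercasing the whole network location would alter them too.
    """
    parts = urlsplit(url.strip())
    return urlunsplit((
        parts.scheme.lower(),   # scheme is case-insensitive
        parts.netloc.lower(),   # host is case-insensitive
        parts.path,             # path may be case-sensitive: keep as-is
        parts.query,            # query may be case-sensitive: keep as-is
        parts.fragment,
    ))


if __name__ == "__main__":
    # Made-up URL for illustration only.
    print(normalize_url("HTTP://Example.COM/Users/Julia?Page=2"))
    # prints: http://example.com/Users/Julia?Page=2
```

Calling `.lower()` on the whole line would have turned `/Users/Julia` into `/users/julia`, which many servers treat as a different (and possibly nonexistent) page.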