robinturner | Question for any LJ hackers or mathematicians out there

Well, this is puzzling. I'm working on a quick and dirty Perl hack for downloading my journal (comments and all). It goes like this:

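#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;

# Base URL; the itemid is appended for each request.
my $head = "http://www.livejournal.com/talkread.bml?journal=solri&itemid=";
for my $count (579 .. 599) {
    my $url = $head . $count;
    my $content = get($url);    # get() returns undef if the request fails
    next unless defined $content;
    print "$content \n";
}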

Of course the last bit will be changed to append to a file, rather than fill the terminal with HTML. The problem with this method is that most itemids aren't used (so you download zillions of error pages), and I can't see a pattern for the ones which are used. I mean, can anyone see anything meaningful in this sequence?

76946
77116
77555
77741

OK, the numbers get bigger, but that's not much help. Of course I could include a search string for "No such entry" and not print that to the file, but I'd still waste time downloading a few hundred error messages for each journal entry.
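The "No such entry" filter could be sketched roughly like this. The names is_error_page, collect_entries and $fetch are made up for illustration, and the marker text is only what the error page is assumed to contain; the real wording may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the page is missing or looks like an error page.
# The marker text "No such entry" is an assumption about the page wording.
sub is_error_page {
    my ($html) = @_;
    return !defined($html) || $html =~ /No such entry/;
}

# Hypothetical helper: try every itemid from $first to $last and keep only
# the pages that are not error pages. $fetch is any code reference that
# takes a URL and returns HTML, or undef on failure.
sub collect_entries {
    my ($fetch, $head, $first, $last) = @_;
    my %pages;
    for my $id ($first .. $last) {
        my $html = $fetch->($head . $id);
        next if is_error_page($html);
        $pages{$id} = $html;
    }
    return \%pages;    # hash reference: itemid => HTML
}
```

With LWP::Simple loaded, passing \&LWP::Simple::get as $fetch would plug this into the loop above, and the returned pages could then be appended to a file. It still downloads every error page, so it only saves disk space, not time.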


From: solri.livejournal.com

Oh yes, so it does. This is worth studying, since I'm largely doing this exercise to improve my Perl skills, which were never up to much and are now rustier than my car.
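One way to study the sequence from the post, as a sketch only: split each itemid into a quotient and remainder on division by 256. The helper name split_itemid is made up here, and reading the quotient as a running entry number is an assumption rather than anything confirmed in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: split an itemid into (quotient, remainder) modulo 256.
# Treating the quotient as a running entry number is an assumption.
sub split_itemid {
    my ($itemid) = @_;
    return (int($itemid / 256), $itemid % 256);
}

# The four itemids quoted in the post.
for my $itemid (76946, 77116, 77555, 77741) {
    my ($entry, $extra) = split_itemid($itemid);
    print "$itemid = $entry * 256 + $extra\n";
}
```

For these four numbers the quotients come out as 300, 301, 302 and 303, with remainders 146, 60, 243 and 173. The quotients are consecutive while the remainders show no obvious order, so at most this narrows each entry down to a block of 256 candidate itemids.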

If I can get this written, then I can progress to the fun bit, which is to convert the downloaded entries to LaTeX. Of course I could just use latex2html, but then I'd still have to write something to strip out the stuff I don't want, and besides, the HTML involved is so basic it shouldn't be too hard to convert.

Nice userpic, BTW.


Robin Turner

