robinturner: (Default)
[personal profile] robinturner
Well, this is puzzling. I'm working on a quick and dirty Perl hack for downloading my journal (comments and all). It goes like this:

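#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;

# Base URL of the comment-view page; the itemid is appended below.
my $head = "http://www.livejournal.com/talkread.bml?journal=solri&itemid=";
for my $count (579 .. 599) {
    my $url = $head . $count;
    my $content = get($url);    # get() returns undef on failure
    next unless defined $content;
    print "$content\n";
}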

Of course the last bit will be changed to append to a file, rather than fill the terminal with HTML. The problem with this method is that most itemids aren't used (so you download zillions of error pages), and I can't see a pattern for the ones that are used. I mean, can anyone see anything meaningful in this sequence?

76946
77116
77555
77741

OK, the numbers get bigger, but that's not much help. Of course I could include a search string for "No such entry" and not print that to the file, but I'd still waste time downloading a few hundred error messages for each journal entry.
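The filtering idea could be sketched roughly like this. It is only a sketch, assuming the error page really does contain the literal text "No such entry"; the helper names is_error_page and save_entry are made up for illustration, and the fetching is left out so the helpers can be tried without touching the network.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if the page is missing or looks like an
# error page. Assumes the error page contains the literal text
# "No such entry".
sub is_error_page {
    my ($content) = @_;
    return 1 unless defined $content;    # a failed download gives undef
    return $content =~ /No such entry/ ? 1 : 0;
}

# Hypothetical helper: append a page to $file unless it is an error page.
# Returns 1 if the page was written, 0 if it was skipped.
sub save_entry {
    my ($file, $content) = @_;
    return 0 if is_error_page($content);
    open(my $out, '>>', $file) or die "Cannot open $file: $!";
    print {$out} $content, "\n";
    close($out) or die "Cannot close $file: $!";
    return 1;
}
```

Inside the download loop this would be called as something like save_entry("journal.html", get($url)), where the file name is just a placeholder. It keeps the error pages out of the file, but it does nothing about the time wasted fetching them.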

Date: 2002-12-06 03:28 pm (UTC)
From: [identity profile] solri.livejournal.com
LJ::Simple does the opposite of what I want; i.e., it uploads rather than downloads. But I'm browsing through the code for useful hints. I guess I'll just have to learn those damn LJ protocols. Sigh.

Date: 2002-12-06 03:40 pm (UTC)
From: [identity profile] thedward.livejournal.com
It does both; look here.

Date: 2002-12-06 04:24 pm (UTC)
From: [identity profile] solri.livejournal.com
Oh yes, so it does. This is worth studying, since I'm largely doing this exercise to improve my Perl skills, which were never up to much and are now rustier than my car.

If I can get this written, then I can progress to the fun bit, which is to convert the downloaded entries to LaTeX. Of course I could just use latex2html, but then I'd still have to write something to strip out the stuff I don't want, and besides, the HTML involved is so basic it shouldn't be too hard to convert.
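As a rough idea of what the conversion step might look like, here is a minimal sketch. It assumes the entries use only a handful of simple tags (bold, italic, line breaks, paragraphs) and it uses regular expressions, which would be too brittle for general HTML. The function name html_to_latex is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: convert a fragment of very simple HTML to LaTeX.
# Handles only <b>, <strong>, <i>, <em>, <br> and <p>; any other tag
# is dropped. Backslash, ~ and ^ in the text are NOT escaped, and only
# the entities &lt;, &gt; and &amp; are decoded.
sub html_to_latex {
    my ($html) = @_;
    my $tex = $html;

    # Escape LaTeX special characters first, before any LaTeX
    # commands are introduced into the string.
    $tex =~ s/([#\$%_\{\}])/\\$1/g;

    # Map the simple formatting tags onto LaTeX commands.
    $tex =~ s!<(b|strong)>(.*?)</\1>!\\textbf{$2}!gis;
    $tex =~ s!<(i|em)>(.*?)</\1>!\\emph{$2}!gis;
    $tex =~ s!<br\s*/?>!\\\\\n!gi;       # line break becomes \\ plus newline
    $tex =~ s!<p\b[^>]*>!\n\n!gi;        # paragraph becomes a blank line

    # Drop whatever tags are left (links, closing </p>, and so on).
    $tex =~ s/<[^>]+>//g;

    # Decode the few entities this sketch knows about.
    $tex =~ s/&lt;/</g;
    $tex =~ s/&gt;/>/g;
    $tex =~ s/&amp;/\\&/g;

    return $tex;
}
```

Something like print html_to_latex($entry) would then give text that can be pasted into a LaTeX document body. Stripping out the unwanted page furniture around each entry would still need its own step.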

Nice userpic, BTW.

Dang!

Date: 2002-12-06 04:42 pm (UTC)
From: [identity profile] solri.livejournal.com
getentries.pl: Failed to get entries - LJ request failed: Client error: Protocol version mismatch: Cannot display/edit a Unicode post with a non-Unicode client.

Re: Dang!

Date: 2002-12-07 02:23 pm (UTC)
From: [identity profile] solri.livejournal.com
Hah, scotched that one - just needed to comment out a few lines. Just need to sort out the weird timestamp thing and I'm there.
