Author Topic: How to backup or download this?????  (Read 2595 times)

Offline ID4Topic starter

  • Jr. Member
  • **
  • Join Date: Oct 2002
  • Posts: 74
    • http://www.franciscoguerrero.com
How to backup or download this?????
« on: March 30, 2007, 01:19:25 PM »
http://web.archive.org/web/20040415065133/www.nethkin.com/bmori/amiga/dos1.html

Any ideas? I tried web download software, but had no luck :-(
...
 

Offline motorollin

  • Hero Member
  • *****
  • Join Date: Nov 2005
  • Posts: 8669
Re: How to backup or download this?????
« Reply #1 on: March 30, 2007, 01:24:55 PM »
I just tried it with SurfOffline 2.0 beta, and got loads of "Forbidden" errors. Maybe they recognise bot activity and block it? If so, you might have to download the pages manually and, where the image tags use absolute URLs, edit them to point to local paths.

--
moto
Code:
10  IT'S THE FINAL COUNTDOWN
20  FOR C = 1 TO 2
30     DA-NA-NAAAA-NAAAA DA-NA-NA-NA-NAAAA
40     DA-NA-NAAAA-NAAAA DA-NA-NA-NA-NA-NA-NAAAAA
50  NEXT C
60  NA-NA-NAAAA
70  NA-NA NA-NA-NA-NA-NAAAA NAAA-NAAAAAAAAAAA
80  GOTO 10
 

Offline Colani1200

  • Hero Member
  • *****
  • Join Date: Jul 2006
  • Posts: 707
Re: How to backup or download this?????
« Reply #2 on: March 30, 2007, 01:39:32 PM »
You might want to try wget, maybe with the option --user-agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)" to fake the user agent string.  ;-)
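For anyone scripting this rather than using wget, faking the user agent is just a matter of setting one request header. A minimal Python sketch using only the standard library; the names `build_request` and `fetch` are made up for illustration, and whether archive.org actually accepts the request still depends on its own policies:

```python
import urllib.request

# User agent string quoted in the post above.
USER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
# Archived page from the original question.
URL = ("http://web.archive.org/web/20040415065133/"
       "www.nethkin.com/bmori/amiga/dos1.html")

def build_request(url, user_agent=USER_AGENT):
    """Return a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

def fetch(url):
    """Download one page with the faked user agent (performs network I/O)."""
    with urllib.request.urlopen(build_request(url), timeout=30) as resp:
        return resp.read()

# Building the request does not touch the network; only fetch() does.
req = build_request(URL)
print(req.get_header("User-agent"))
```

Calling `fetch(URL)` would perform the actual download; it is left out of the demo line so the sketch can be run offline.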
 

Offline motorollin

  • Hero Member
  • *****
  • Join Date: Nov 2005
  • Posts: 8669
Re: How to backup or download this?????
« Reply #3 on: March 30, 2007, 01:41:41 PM »
I think I have spotted the problem. In the source code of dos1.html there is a tag <BASE HREF="http://www.nethkin.com/bmori/amiga/dos1.html">. When I look in the SurfOffline log file, I see that it is trying to download, for example, http://www.nethkin.com/bmori/amiga/ados7.gif, when the file is actually located at http://web.archive.org/web/20010619122216/www.nethkin.com/bmori/amiga/ados7.gif. I think you would need web spider software that ignores the BASE tag.
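What "ignoring the BASE tag" amounts to can be sketched in a few lines of Python (standard library only). The names `strip_base_tag` and `resolve` are hypothetical, the regex is a rough match for a BASE element rather than a real HTML parser, and the snapshot timestamp is simply the one from the page URL in this thread:

```python
import re
from urllib.parse import urljoin

# Archived page from the original post; relative links are resolved against it.
PAGE_URL = ("http://web.archive.org/web/20040415065133/"
            "www.nethkin.com/bmori/amiga/dos1.html")

# Rough pattern for a <BASE ...> element (assumption: good enough for this page).
BASE_TAG = re.compile(r"<base\b[^>]*>", re.IGNORECASE)

def strip_base_tag(html):
    """Remove any BASE element so it cannot redirect relative links."""
    return BASE_TAG.sub("", html)

def resolve(link, page_url=PAGE_URL):
    """Resolve a link against the archived page URL, not the BASE HREF."""
    return urljoin(page_url, link)

print(resolve("ados7.gif"))
```

With the BASE element out of the way, a relative link such as ados7.gif resolves under web.archive.org instead of www.nethkin.com. The image itself may live under a different snapshot timestamp, as the log excerpt above shows.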

 

Offline blobrana

  • Hero Member
  • *****
  • Join Date: Mar 2002
  • Posts: 4743
    • http://mysite.wanadoo-members.co.uk/blobrana/home.html
Re: How to backup or download this?????
« Reply #4 on: March 30, 2007, 02:26:42 PM »
Hum,
just view source code, and use screen capture to rip the images (if needed)
[img=http://img339.imageshack.us/img339/4072/image3nw7.th.gif]

Offline James

  • Full Member
  • ***
  • Join Date: Mar 2007
  • Posts: 150
Re: How to backup or download this?????
« Reply #5 on: March 30, 2007, 03:41:52 PM »
Have you emailed the author? He's already sharing all the info for free, I don't see why he wouldn't agree to give you a way to back it up for personal use.
 

Offline Piru

  • ' union select name,pwd--
  • Hero Member
  • *****
  • Join Date: Aug 2002
  • Posts: 6946
    • http://www.iki.fi/sintonen/
Re: How to backup or download this?????
« Reply #6 on: March 30, 2007, 04:07:01 PM »
This will allow you to get at least some of the files:

wget --user-agent "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.0.3705)" http://web.archive.org/web/20040415065133/www.nethkin.com/bmori/amiga/dos1.html --output-document - | perl -p -e 's/\/\/www.nethkin.com/\/\/web.archive.org\/web\/20040415065133\/www.nethkin.com/g' | wget --input-file - --force-html --user-agent "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.0.3705)" --convert-links --force-directories --no-host-directories --cut-dirs 3 --wait 20 --random-wait

The pages will appear in the bmori/amiga/ directory (and its subdirectories).

Note that archive.org has a robots.txt file that, if followed, prohibits apps from recursively grabbing content. In this case I've added "--wait 20 --random-wait" to make the leeching less disruptive. Downloading takes longer, but shouldn't piss off the archive.org admins.

I know this is far from a perfect solution, but at least it works somewhat (without the need to download everything by hand).
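The perl step in the middle of that pipeline is a plain text substitution: every "//www.nethkin.com" becomes its archived equivalent, so the second wget requests the files from archive.org rather than from the original host. A minimal Python sketch of just that step; the function name `point_at_archive` is hypothetical, and it uses a literal string replace, whereas the perl pattern leaves the dots unescaped:

```python
# Same search and replacement strings as the perl one-liner above.
ORIGINAL = "//www.nethkin.com"
ARCHIVED = "//web.archive.org/web/20040415065133/www.nethkin.com"

def point_at_archive(html):
    """Rewrite links to the original host so they point at the archived copy."""
    return html.replace(ORIGINAL, ARCHIVED)

# Illustrative input only; the real page source is not reproduced here.
sample = '<img src="http://www.nethkin.com/bmori/amiga/ados7.gif">'
print(point_at_archive(sample))
```

Text that does not mention the original host passes through unchanged, so the function can be applied to a whole page at once.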
 

Offline AmiKit

Re: How to backup or download this?????
« Reply #7 on: March 30, 2007, 07:07:34 PM »
or you might want to try HTTrack.