Author Topic: Word documents!! Talk about b.l.o.a.t!  (Read 1908 times)

vortexau (Topic starter)

  • Hero Member
  • Join Date: Feb 2002
  • Posts: 1341
  • http://home.swiftdsl.com.au/~vortexau
Word documents!! Talk about b.l.o.a.t!
« on: June 05, 2003, 03:42:46 PM »
I was exchanging emails with a Training Agency and they sent me an application form as a PDF.

I replied to them that I was using neither x86 nor Mac, and that I didn't think that I could decode their form. I said that I could handle plain text (ASCII), RTF, and WordWorth files ..... maybe even plain Postscript as I have an old version of PageStream.

WELL ... I did find "pdftotext", found that it output plain text, and re-formatted the copy in WordWorth!

THE NEXT day they sent me an email with, they said, a "Word" file! The actual attached file ended in ".doc"!

Well, I imported it into WordWorth to see what would happen, and got a heap of gibberish 150 times BIGGER than the plain-text file! The 'original' readable part was near the start, followed by a heap of gibberish (remember, 150 times bigger). At the very end was:
Quote
      È   Ê   6   8           3   5           Ò   Ô   Ö   ñ   ó           ï   m                                    
        h   h   þ                 h   h   þ                             h   h   þ  ;  X       ÿ  Body Text 2 Body Text 3 Body
Text 2 Body Text Indent 2    Body Text Balloon Text  
         
         
                     D                 D   H          ÿ   F                 F                 F                                      
        À  ù              Ð   ú              Ð   û              Ð   Ð  ü              ý         þ          ÿ                    
   x                            Ð   Ð               Ð   Ð  0ý                                 Þ                É#  (/    ¨0    ¾0    
  ¨0  °     ¨0  ³ 8
  Tms Rmn     ` Symbol    Helv    Times New Roman    Arial    Wingdings       Tahoma    Times New Roman CE  
 Times New Roman Cyr    Times New Roman Greek    Times New Roman Tur    Times New Roman (Hebrew)    
Times New Roman (Arabic)    Times New Roman Baltic    Times New Roman (Vietnamese)    Arial CE    Arial Cyr    
Arial Greek    Arial Tur    Arial (Hebrew)    Arial (Arabic)    Arial Baltic    Arial (Vietnamese)    Tahoma CE
  Tahoma Cyr    Tahoma Greek
  Tahoma Tur    Tahoma (Hebrew)    Tahoma (Arabic)    Tahoma Baltic    Tahoma (Vietnamese)    Tahoma (Thai)  '  
T   m   u'   '   '   '  º'  »'  )/     ÿ   9 ÿ   9 ÿ  "         Ð         Q«u Q«u N«u                   (    0    SPOT-ON JOBS
PROJECT  
Ken Houliston Colin  
:roll:
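
For anyone who wants to fish the readable bits (style names, font names, the actual text) out of a file like that without importing it into a word processor, here is a minimal Python sketch along the lines of the Unix `strings` tool. It assumes the readable parts are stored as plain single-byte ASCII, which may not hold for every ".doc" file. The names `printable_runs` and `doc_strings` are made up for illustration.

```python
import re

def printable_runs(data, min_len=4):
    """Return runs of printable ASCII that are at least min_len bytes long."""
    # 0x20-0x7e covers space through tilde; everything else ends a run.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

def doc_strings(path, min_len=4):
    """Read a binary file (hypothetical path) and list its readable runs."""
    with open(path, "rb") as fh:
        return printable_runs(fh.read(), min_len)
```

Calling `doc_strings("form.doc")` (the filename is only an example) would list whatever readable fragments the file holds and skip the binary padding between them.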
-vortexau; who's still waiting! (-for AmigaOS4! ;-) )
savage Ami bridge parody
 

vortexau (Topic starter)
Re: Word documents!! Talk about b.l.o.a.t!
« Reply #1 on: June 05, 2003, 04:12:50 PM »
odin asked:
Quote
Uhm....APDF? Or didn't that digest the PDF?


From the "ReadMeFirst" with Apdf:
Quote

Apdf is a PDF document viewer based on Derek B. Noonburg's xpdf 0.90.
The Amiga part was written by Emmanuel Lesueur. This distribution also
contains a PDF plugin for Voyager 3, Olivier Wagner's web browser.

To use it, you need the following:

- AmigaOS 3.0
- a 68020 CPU or better
- MUI 3.8
- gzip 1.2.4 or something equivalent to the unix 'uncompress' command.

And for the plugin:

- Voyager 3.1



To install:
-----------

Get:
  - the Apdf_common.lha archive;
  - the processor specific archive suitable for your system.
  - the Apdf_fonts.lha archive if you have neither Ghostscript
fonts, nor the Acrobat Reader 4 fonts, nor the standard 14 base
Postscript fonts.
-snip-
For encryption support, see http://elesueur.free.fr/Apdf

(etc)

Well, I couldn't use THAT as I don't run MUI!

What I DID use was pdftotext, which is included with xpdf:
Quote

xpdf
====
and: pdftops, pdftotext

version 0.7 (beta)
97-may-28

The xpdf, pdftops, and pdftotext software and documentation are
copyright 1996, 1997 Derek B. Noonburg.

Email: derekn@ece.cmu.edu
WWW: http://www.contrib.andrew.cmu.edu/usr/dn0o/xpdf/xpdf.html
-SNIP-
To generate a plain text file, run pdftotext:

  pdftotext file.pdf

So I used THIS to produce a readable version from their PDF.
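
If you have a pile of PDFs to convert, the same step can be scripted. This is a rough Python sketch, assuming a pdftotext binary is on the PATH and that it accepts an explicit output filename as a second argument (true of later xpdf releases; I have not checked 0.7). The helper names `build_command` and `pdf_to_text` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path

def build_command(pdf_path):
    """Build the pdftotext command line, writing <name>.txt next to the PDF."""
    txt_path = Path(pdf_path).with_suffix(".txt")
    return ["pdftotext", str(pdf_path), str(txt_path)]

def pdf_to_text(pdf_path):
    """Run pdftotext on one file and return the path of the text output."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    cmd = build_command(pdf_path)
    # check=True raises CalledProcessError if pdftotext exits with an error.
    subprocess.run(cmd, check=True)
    return cmd[-1]
```

Keeping the command construction separate from the call makes it easy to see exactly what would be run before running it.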

What I was calling folks' attention to was the HUGE size of the attachment they sent me the NEXT day ..... the Word file ending in ".doc"!