Amiga.org

Amiga computer related discussion => Amiga Software Issues and Discussion => Topic started by: vortexau on June 05, 2003, 03:42:46 PM

Title: Word documents!! Talk about b.l.o.a.t!
Post by: vortexau on June 05, 2003, 03:42:46 PM
I was exchanging emails with a Training Agency and they sent me an application form as a PDF.

I replied to them that I was using neither x86 nor Mac, and that I didn't think that I could decode their form. I said that I could handle plain text (ASCII), RTF, and WordWorth files ..... maybe even plain Postscript as I have an old version of PageStream.

WELL ... I did find "pdftotext"; found that it output plain text, and re-formatted the copy in WordWorth!

THE NEXT day they sent me an email with, they said, a "Word" file! The actual attached file ended in ".doc"!

Well, I imported it into WordWorth to see what would happen, and got a heap of gibberish 150 times BIGGER than the plain-text file! The 'original' readable part was near the start and, after a heap of gibberish (remember, 150 times bigger), this was at the very end:
Quote
      È   Ê   6   8           3   5           Ò   Ô   Ö   ñ   ó           ï   m                                    
        h   h   þ                 h   h   þ                             h   h   þ  ;  X       ÿ  Body Text 2 Body Text 3 Body
Text 2 Body Text Indent 2    Body Text Balloon Text  
         
         
                     D                 D   H          ÿ   F                 F                 F                                      
        À  ù              Ð   ú              Ð   û              Ð   Ð  ü              ý         þ          ÿ                    
   x                            Ð   Ð               Ð   Ð  0ý                                 Þ                É#  (/    ¨0    ¾0    
  ¨0  °     ¨0  ³ 8
  Tms Rmn     ` Symbol    Helv    Times New Roman    Arial    Wingdings       Tahoma    Times New Roman CE  
 Times New Roman Cyr    Times New Roman Greek    Times New Roman Tur    Times New Roman (Hebrew)    
Times New Roman (Arabic)    Times New Roman Baltic    Times New Roman (Vietnamese)    Arial CE    Arial Cyr    
Arial Greek    Arial Tur    Arial (Hebrew)    Arial (Arabic)    Arial Baltic    Arial (Vietnamese)    Tahoma CE
  Tahoma Cyr    Tahoma Greek
  Tahoma Tur    Tahoma (Hebrew)    Tahoma (Arabic)    Tahoma Baltic    Tahoma (Vietnamese)    Tahoma (Thai)  '  
T   m   u'   '   '   '  º'  »'  )/     ÿ   9 ÿ   9 ÿ  "         Ð         Q«u Q«u N«u                   (    0    SPOT-ON JOBS
PROJECT  
Ken Houliston Colin  
:roll:
Title: Re: Word documents!! Talk about b.l.o.a.t!
Post by: odin on June 05, 2003, 03:47:09 PM
Uhm....APDF? Or didn't that digest the PDF?
Title: Re: Word documents!! Talk about b.l.o.a.t!
Post by: filson on June 05, 2003, 03:51:40 PM
it's quite well known that ms has some very fishy file formats.
and .doc is one of their "best" :)

not even the guys at openoffice.org can keep up with the import functions. but then again, you should look at some of the bugs reported for Wine for linux. they found several parts of the win32 api that were functionally wrong compared to the docs, and several chunks of code that were just floating without any relation to anything. in the final binary!
 :-D  :-D
Title: Re: Word documents!! Talk about b.l.o.a.t!
Post by: vortexau on June 05, 2003, 04:12:50 PM
odin asked;
Quote
Uhm....APDF? Or didn't that digest the PDF?


From the "ReadMeFirst" with Apdf:
Quote

Apdf is a PDF document viewer based on Derek B. Noonburg's xpdf 0.90.
The Amiga part was written by Emmanuel Lesueur. This distribution also
contains a PDF plugin for Voyager 3, Olivier Wagner's web browser.

To use it, you need the following:

- AmigaOS 3.0
- a 68020 CPU or better
- MUI 3.8
- gzip 1.2.4 or something equivalent to the unix 'uncompress' command.

And for the plugin:

- Voyager 3.1



To install:
-----------

Get:
  - the Apdf_common.lha archive;
  - the processor specific archive suitable for your system.
  - the Apdf_fonts.lha archive if you have neither Ghostscript
fonts, nor the Acrobat Reader 4 fonts, nor the standard 14 base
Postscript fonts.
-snip-
For encryption support, see http://elesueur.free.fr/Apdf

(etc)

Well, I couldn't use THAT as I don't run MUI!

What I DID use was pdftotext which is an inclusion with xpdf:
Quote

xpdf
====
and: pdftops, pdftotext

version 0.7 (beta)
97-may-28

The xpdf, pdftops, and pdftotext software and documentation are
copyright 1996, 1997 Derek B. Noonburg.

Email: derekn@ece.cmu.edu
WWW: http://www.contrib.andrew.cmu.edu/usr/dn0o/xpdf/xpdf.html
-SNIP-
To generate a plain text file, run pdftotext:

  pdftotext file.pdf

So I used THIS to produce a readable version from their PDF.

What I was calling to folks' attention was the HUGE size of the attachment they sent me the NEXT day ..... the Word file ending in ".doc"!
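
The readable bits in a dump like the one above (style names, font names) can be pulled out of a binary file by scanning for runs of printable characters. Here is a minimal sketch in Python, assuming the text of interest is stored as plain 8-bit ASCII. The function name `printable_runs` and the sample bytes are made up for illustration. This is not a real .doc parser, and a Word file may store some of its text in a 16-bit encoding that this scan would miss.

```python
import re


def printable_runs(data, min_len=4):
    """Return the runs of printable ASCII in `data` that are at least
    `min_len` bytes long, much like the Unix `strings` tool does."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


if __name__ == "__main__":
    # Hypothetical sample: binary padding around two readable names.
    sample = b"\x00\x01Body Text 2\x00\xff\xfeTimes New Roman\x00ab\x00"
    for run in printable_runs(sample):
        print(run)
```

To try it on a real attachment, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes in. Comparing `len()` of those bytes with the size of the plain-text version gives the size ratio being complained about here.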
Title: Re: Word documents!! Talk about b.l.o.a.t!
Post by: KennyR on June 05, 2003, 04:15:45 PM
The Word file format is a heap of crap, and includes lots of stuff in the file that just doesn't need to be there. Just like all Microsoft stuff it is bloated, badly thought-out and kludged from beginning to end.
Title: Re: Word documents!! Talk about b.l.o.a.t!
Post by: lorddef on June 05, 2003, 04:30:11 PM
I tried to use apdf, but all I got in the archive was apdf.module, so I put it in my plug-ins folder in Voyager, but nothing: the plugin manager doesn't see it.

I noticed that the vflash plugin has other files, that I guess apdf needs equivalents of.

How do you get apdf to work?
Title: Re: Word documents!! Talk about b.l.o.a.t!
Post by: cecilia on June 05, 2003, 04:53:30 PM
you must not have much experience with windows files! :-o

seriously, try these doc datatypes: DocDatatypes (http://internetpages.bravepages.com/docdatatypes/)

it loads a silly word doc without the crap!