18.333 value of PDF

From: Humanist Discussion Group (by way of Willard McCarty <willard.mccarty_at_kcl.ac.uk>)
Date: Thu, 4 Nov 2004 08:26:57 +0000

               Humanist Discussion Group, Vol. 18, No. 333.
       Centre for Computing in the Humanities, King's College London
                     Submit to: humanist_at_princeton.edu

   [1] From: Alan Sondheim <sondheim_at_panix.com> (16)
         Subject: Re: 18.329 value of PDF

   [2] From: Norman Gray <norman_at_astro.gla.ac.uk> (70)
         Subject: Re: 18.326 value of PDF

   [3] From: Norman Hinton <hinton_at_springnet1.com> (17)
         Subject: Re: 18.329 value of PDF

         Date: Thu, 04 Nov 2004 07:55:56 +0000
         From: Alan Sondheim <sondheim_at_panix.com>
         Subject: Re: 18.329 value of PDF

One of the books I'm currently reviewing is O'Reilly's PDF Hacks, "100
Industrial-Strength Tips & Tools," by Sid Steward (this year). The book is
terrific for anyone working with PDF, and shows the advantages of the
medium. Unfortunately, PDF isn't transparent, and as many have pointed out,
it's chunky/clunky. If you get into it, however, it becomes highly
flexible. I recommend the book.

Some hack titles - Copy Data from PDF Pages; Browse One PDF in Multiple
Windows; Speed Up Acrobat Startup; Generate Document Keywords; Maximize PDF
Portability; Get and Set PDF Metadata - and so forth.

This should be required reading, I think, for anyone using the format.

- Alan

recent http://www.asondheim.org/ WVU 2004 projects
http://www.as.wvu.edu/clcold/sondheim/files/ recent related to WVU
Trace projects http://trace.ntu.ac.uk/writers/sondheim/index.htm partial
mirror at http://www.anu.edu.au/english/internet_txt

         Date: Thu, 04 Nov 2004 08:10:22 +0000
         From: Norman Gray <norman_at_astro.gla.ac.uk>
         Subject: Re: 18.326 value of PDF


The flurry of remarks about PDF illustrates two things: first, PDF is good
at the things it was designed for, and rather poor at the things it wasn't;
second, it's quite easy to misuse it and produce anti-social PDF files.

If you want to produce portable, printable, formatted files, PDF is hard to
beat. It seems to me, however, that there are a variety of ways to
frustrate this portability, which others have alluded to. My suggestions
for preserving it (and other random remarks in this rather techie thread)
include:

      * Embed fonts properly, and only when necessary. There's a variety of
ways you can do this, but if you stick to the Standard 35 PostScript fonts
you don't need to embed any fonts at all. This shouldn't cramp your style
unless you're a graphic designer, and if you're that, then get a book and
learn how to do this properly.

      * If possible, don't embed fonts at all. As well as causing problems
directly, embedding fonts pushes the file size way up. And don't use
bitmap fonts unless you've absolutely no alternative.

      * If your document is likely to be read on-screen, design the layout
with that in mind. That means a larger font size, a rather square `paper
size', and make sure you generate bookmarks and active
cross-references. With a little work here, screen-based reading of PDF
files can be perfectly comfortable. Of course, such a version is hard to
read on paper, so offer a screen-readable and a printable version (and you
might as well offer standard and US paper sizes while you're at it). You
are generating this PDF from XML or LaTeX source, aren't you?

      * Offer your source files as well, if that's appropriate, so that folk
can use the structure there, or otherwise manipulate the content in ways
they desire.

      * I haven't actually tried this myself, but I'd think that if you
disable ligatures and kerning in the program that generates the PDF, then
the result would be both easier to cut and paste, and search, and probably
available to screen readers, too.

      * Think about graphics. If you have big graphics you'll have big PDF
files. Do they have to be that big? Simple graphics can look perfectly OK
at lowish resolution. If you want the PDF to be read on-screen, then 72dpi
graphics are plenty. If you can use PostScript vector graphics for your
images, they'll take up very little space indeed.

      * If you want to produce a document which is editable, or preserves
structure, or which is just meant to be read quickly and discarded
(i.e., an email message...), then don't use PDF, since this is not what it's for.

      * Did I mention generating it from XML or LaTeX? Distilling
Postscript is the other way I know of, but I'm sure there are other tools
which can be used.
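On the ligature point above: where the generating program cannot be changed, one possible workaround is to clean up the text after it has been copied out of the PDF. The following is a minimal sketch in Python, assuming the extracted text contains Unicode presentation-form ligatures such as U+FB01; the function name expand_ligatures is illustrative only, and this does nothing for PDFs whose fonts map glyphs to the wrong characters.

```python
import unicodedata


def expand_ligatures(text: str) -> str:
    """Replace ligature code points (e.g. U+FB01) with their plain letters."""
    # NFKC applies Unicode compatibility mappings, which split ligature
    # code points into their component letters. Note that it also folds
    # other compatibility characters (superscripts, full-width forms, ...).
    return unicodedata.normalize("NFKC", text)


# Sample text containing the ffi, fi and fl ligatures.
sample = "The o\ufb03ce \ufb01led the \ufb02yer"
print(expand_ligatures(sample))
```

After this step a search for "office" or "filed" matches the text, which it would not do against the ligature code points.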

(I'm echoing several other messages here)
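To put rough numbers on the graphics advice above, here is a small arithmetic sketch in Python. The 6 x 4 inch figure, the 300dpi comparison value and the 3 bytes per pixel (uncompressed 24-bit colour) are assumptions chosen for illustration; real PDF files compress their images, so actual sizes will be smaller, but the ratio between resolutions is the point.

```python
from typing import Tuple


def pixel_size(width_in: float, height_in: float, dpi: int) -> Tuple[int, int]:
    """Pixel dimensions of an image shown at the given physical size."""
    return round(width_in * dpi), round(height_in * dpi)


def raw_bytes(width_in: float, height_in: float, dpi: int,
              bytes_per_pixel: int = 3) -> int:
    """Uncompressed size, assuming 3 bytes per pixel (24-bit colour)."""
    w, h = pixel_size(width_in, height_in, dpi)
    return w * h * bytes_per_pixel


# Compare a screen resolution with a typical print resolution.
for dpi in (72, 300):
    w, h = pixel_size(6, 4, dpi)
    print(f"{dpi:>3} dpi: {w} x {h} px, {raw_bytes(6, 4, dpi):,} bytes uncompressed")
```

Size grows with the square of the resolution, so the 300dpi version of the same figure carries (300/72)^2, or roughly 17 times, as much pixel data as the 72dpi one.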

A PDF file without graphics and using only the Standard 35 fonts will
probably come in at a factor of a few times the size of the corresponding
text file. Yes, really -- it's a compact format.
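
The size claim can be tried out with a hand-built file. What follows is a minimal sketch in Python, not how any real PDF generator works: it writes a one-page, uncompressed PDF that refers to Helvetica (one of the standard fonts) by name and embeds nothing. The function name minimal_pdf, the A4 page size and the sample text are all assumptions made for the example, and there is no line wrapping or page breaking.

```python
def minimal_pdf(lines):
    """Build a one-page PDF that names Helvetica but embeds no font data."""
    def esc(s):
        # Backslashes and parentheses must be escaped in PDF string literals.
        return s.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

    # Page content: 11pt text, 14pt leading, starting near the top left.
    ops = ["BT", "/F1 11 Tf", "14 TL", "72 770 Td"]
    ops += ["(%s) Tj T*" % esc(line) for line in lines]
    ops.append("ET")
    # Characters outside Latin-1 are replaced by "?" in this sketch.
    stream = "\n".join(ops).encode("latin-1", "replace")

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 595 842] "
        b"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object, for the xref table
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


# Compare the PDF with the plain text it carries.
text = "\n".join("Line %d of a plain, unadorned report." % i for i in range(1, 41))
pdf = minimal_pdf(text.split("\n"))
print(len(text.encode("latin-1")), len(pdf), round(len(pdf) / len(text), 2))
```

The printed figures are the text size, the PDF size and their ratio. Files produced by real tools carry more structure (metadata, bookmarks, several pages), so their ratio will differ from this bare-bones case.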

Finally, what are the alternatives? Word? Of course
not. Postscript? Still pretty good, but it has most of the gotchas of PDF
and fewer folk have readers. HTML? That's generally happily portable and
cut-and-pasteable, and you can probably generate it from the same source
from which you generate your PDF. XML? Massively better for some
purposes, but hopeless if you simply want to read the
document. Graphics files? Possibly best if what you're offering is
facsimiles, but too much hassle otherwise. Plain text? It'll never go out
of fashion, but it's not exactly easy on the eye.

All the best,


Norman Gray  :  Physics & Astronomy, Glasgow University, UK
http://www.astro.gla.ac.uk/users/norman/  :  www.starlink.ac.uk

         Date: Thu, 04 Nov 2004 08:11:44 +0000
         From: Norman Hinton <hinton_at_springnet1.com>
         Subject: Re: 18.329 value of PDF

For me, the best thing about PDF is that it reproduces the type faces and
layouts of the books that were scanned in.  I think this is absolutely
necessary for citations and other such references....one needs to be able
to find the quote I use whether on-line or in a printed book.  Anything
else would be bad for scholarly purposes.  (I know editors at journals who
won't even allow citations of a printed version unless they refer to the
original, no matter how hard it may be to get hold of that, and no matter
how easy it may be to find a later reprinting of the piece, or inclusion in
an anthology.  Same reason, I guess, though I find carrying the notion that
far to be a bit weird.)

So the idea that I can play with the text and reproduce it in a font of my
choosing or in different layouts is irrelevant and not useful....

My pet peeves about PDF are only two:  it changes screens on me
automatically at the end of a display instead of waiting for me to tell it
what I want to do, and the page numbers in PDF don't accord with the page
numbers in the original, which is most annoying, especially when the
material has been scanned and cannot be searched.

Received on Thu Nov 04 2004 - 03:45:36 EST
